Using the default judge (NetflixACAJudge), how does Kayenta decide whether a canary's metric passes or fails?

Quote from the documentation:

The primary metric comparison algorithm (classifier) in Kayenta uses a nonparametric statistical test to check for a significant difference between the canary and baseline metrics.

How large does a difference have to be to count as significant? Is there a percentage threshold for it?
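To make the question concrete, here is a minimal sketch of how a nonparametric comparison of this kind could work. It assumes a two-sided Mann-Whitney U test with a normal approximation and no tie correction. The function names (`mann_whitney_u`, `classify`), the `alpha=0.05` cutoff, and the Pass/High/Low labels are illustrative assumptions, not Kayenta's actual code or defaults.

```python
import math


def mann_whitney_u(canary, baseline):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction).

    Returns (z, p_value). A positive z means the canary values tend to
    rank higher than the baseline values.
    """
    n1, n2 = len(canary), len(baseline)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in canary] + [(v, 1) for v in baseline])

    # Assign 1-based ranks, giving tied values the average of their ranks.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    # U statistic for the canary group, from its rank sum.
    canary_rank_sum = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = canary_rank_sum - n1 * (n1 + 1) / 2

    # Under the null hypothesis, U is approximately normal.
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p_value


def classify(canary, baseline, alpha=0.05):
    """Label a metric as Pass, High, or Low (alpha is an assumed cutoff)."""
    z, p_value = mann_whitney_u(canary, baseline)
    if p_value >= alpha:
        return "Pass"  # no statistically significant difference detected
    return "High" if z > 0 else "Low"


baseline = [float(x) for x in range(1, 21)]
print(classify(baseline, baseline))                     # same distribution
print(classify([x + 100 for x in baseline], baseline))  # canary clearly higher
```

In a test like this, "significant" is defined by a p-value cutoff (a confidence level) rather than by a fixed percentage difference between the two means, which is why it is unclear whether a percentage applies at all.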

