The T-Test measures whether the difference in averages between two groups is statistically significant. T-Tests are typically used when sample sizes are small and the population variance is unknown and has to be estimated from the samples themselves (which can often be the case for protected class consumers).
An output of a T-Test is a p-value: the probability of observing a difference in averages at least as large as the one measured if there were no real difference between the groups. Common thresholds for significance are p-values of 0.05, 0.01, or 0.001.
A p-value less than 0.05 (often used as a standard threshold) means that, if the two groups truly had the same average, a difference this large would arise from random chance less than 5% of the time. Thus, the result is considered statistically significant, indicating potential issues in fairness between the two groups. Stricter thresholds (like 0.01 or 0.001) demand even stronger evidence that a difference in outcomes between groups is not random.
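To make the p-value interpretation concrete, here is a minimal sketch of a two-sample T-Test using only the Python standard library. It rests on several assumptions that do not come from the text above: it uses Welch's variant (which does not assume equal variances), the function names `welch_t_test` and `two_sided_p` are hypothetical, the scores for `group_a` and `group_b` are made-up illustrative data, and the p-value is approximated by numerically integrating the Student's t density rather than by an exact routine.

```python
import math
from statistics import mean, variance


def two_sided_p(t, df, steps=2000):
    """Approximate two-sided p-value for Student's t via Simpson's rule."""
    # Log of the normalising constant of the t density with df degrees of freedom.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def welch_t_test(a, b):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    n_a, n_b = len(a), len(b)
    # Squared standard error of each group's mean (sample variance / n).
    se_a = variance(a) / n_a
    se_b = variance(b) / n_b
    t = (mean(a) - mean(b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df, two_sided_p(t, df)


# Hypothetical outcome scores for two small groups (illustrative only).
group_a = [72, 68, 75, 71, 69, 74, 70, 73]
group_b = [65, 63, 70, 66, 62, 68, 64, 67]

t_stat, dof, p_value = welch_t_test(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.1f}, p = {p_value:.5f}")
for threshold in (0.05, 0.01, 0.001):
    verdict = "significant" if p_value < threshold else "not significant"
    print(f"threshold {threshold}: {verdict}")
```

In practice a statistics library would normally be used instead of hand-rolled integration; for example, SciPy's `scipy.stats.ttest_ind` with `equal_var=False` performs the same Welch test.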
Here is a comparison of how the Z-Test and T-Test might measure distributions:
Although there are no concrete fairness thresholds, regulators may find: