Last Updated: May 29, 2026
A model can look strong overall and still fail specific groups. Aggregate metrics hide these gaps because majority groups dominate the average.
In production, fairness failures usually surface quietly. There is no crash or error spike, only error rates, approval rates, exposure, or quality of experience that differ from one group to the next.
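To make the point about averages concrete, here is a minimal sketch of slicing a metric by group. The data is synthetic and invented purely for illustration, and the function name `accuracy_by_group` and the group labels "A" and "B" are assumptions, not part of any library.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return (overall_accuracy, {group: accuracy}) for parallel lists."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    overall = sum(correct.values()) / sum(total.values())
    per_group = {g: correct[g] / total[g] for g in total}
    return overall, per_group


# Synthetic, illustrative data: 90 examples from majority group "A"
# and 10 from minority group "B". Every true label is 1.
groups = ["A"] * 90 + ["B"] * 10
y_true = [1] * 100
# The hypothetical model is right on 88 of 90 "A" examples
# but only 5 of 10 "B" examples.
y_pred = [1] * 88 + [0] * 2 + [1] * 5 + [0] * 5

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
print(f"overall: {overall:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"group {group}: {acc:.2f}")
```

With these made-up numbers the overall accuracy is 0.93, while accuracy for group "B" is 0.50. The headline figure looks healthy only because group "A" supplies 90 of the 100 examples.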
This chapter covers how bias enters ML systems, how to measure fairness, and what practical mitigation looks like.
Note: This chapter is about fairness bias in ML systems, not the statistical bias in the bias-variance tradeoff. Statistical bias is the systematic error a model makes because its assumptions are too restrictive. Fairness bias is systematic disadvantage or unequal treatment across groups.