Method predicts bias in face recognition models using unlabeled data
Eliminating the need for annotation makes bias testing much more practical.
In recent years, algorithmic bias has become a central topic of research across AI disciplines. Interest in the topic ballooned after a 2018 investigation of bias in face recognition software — where bias was defined as differential performance on subjects from different demographic groups.
The natural way to test a face recognition model for bias would be to feed it lots of images featuring subjects from different groups and see how it performs. But that requires data annotated to indicate subjects’ identity across images, and identity annotation is extremely costly — especially at the scale required to conclusively evaluate a face recognition model.
At this year’s European Conference on Computer Vision (ECCV), my colleagues and I presented a new method for assessing bias in face recognition systems that does not require data with identity annotations. Although the method only estimates a model’s performance on data from different demographic groups, our experiments indicate that those estimates are accurate enough to detect the differences in performance indicative of bias.
This result — the ability to predict the relative performance of a face identification model without test data annotated to indicate facial identity — was surprising, and it suggests an evaluation paradigm that should make it much more practical for creators of face recognition software to test their models for bias.
Besides its cost effectiveness, our method also has the advantage that it can be adapted on the fly to new demographic groups. It does require some means of identifying subjects who belong to those groups — such as image metadata from self-reporting — but it doesn’t require identity labels.
To evaluate our approach, we trained face recognition models on datasets from which particular demographic data had been withheld, to intentionally introduce bias. In all cases, our method was able to identify differential performance on the withheld demographic groups.
We also compared our approach to Bayesian calibration, a baseline method for predicting a machine learning model’s outputs. Our method outperformed Bayesian calibration across the board, sometimes by a large margin — a result that is all the more notable because Bayesian calibration requires some annotated data for bootstrapping, whereas our method relies entirely on unannotated data.
From annotated training data, face recognition models typically learn to produce vector representations — embeddings — of input images and measure their distance from each other in the embedding space. Any embeddings whose distance falls below some threshold are classified as representing the same person.
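That decision rule can be sketched in a few lines. This is a minimal illustration, with random vectors standing in for a real model’s embeddings and an arbitrary threshold of 1.1 (not a value from our system):

```python
import numpy as np

def verify(emb_a: np.ndarray, emb_b: np.ndarray, threshold: float = 1.1) -> bool:
    """Verification decision rule: classify a pair as the same person
    when the distance between their embeddings falls below the threshold."""
    return float(np.linalg.norm(emb_a - emb_b)) < threshold

def normalize(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

# Simulate embeddings: two noisy views of one identity, plus a stranger.
rng = np.random.default_rng(0)
identity = rng.normal(size=128)
view_a = normalize(identity + 0.05 * rng.normal(size=128))
view_b = normalize(identity + 0.05 * rng.normal(size=128))
stranger = normalize(rng.normal(size=128))

print(verify(view_a, view_b))    # same identity: small distance, match
print(verify(view_a, stranger))  # different identity: large distance, no match
```

In high dimensions, embeddings of unrelated random vectors sit far apart, while noisy views of the same vector stay close, which is why a single distance threshold can separate the two cases.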
We assume that the distances between matching pairs (images of the same person) follow one distribution, and the distances between nonmatching pairs follow a different distribution. The goal of our method is to learn the parameters of those two distributions.
Empirically, we found that the score distributions tend to be slightly skewed, so we modeled them using two-piece distributions. A two-piece distribution is split at the mode — the most commonly occurring value — and the halves on either side of the mode have different scale parameters.
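As a sketch, here is what such a density looks like, assuming the common two-piece (split) normal form, with a shared mode and separate left and right scale parameters; the exact parametric family used in the paper may differ:

```python
import numpy as np

def two_piece_normal_pdf(x, mode, sigma_left, sigma_right):
    """Density of a two-piece (split) normal: Gaussian with scale
    sigma_left below the mode and sigma_right above it, rescaled so
    the density is continuous at the mode and integrates to one."""
    norm_const = np.sqrt(2 / np.pi) / (sigma_left + sigma_right)
    sigma = np.where(x < mode, sigma_left, sigma_right)
    return norm_const * np.exp(-((x - mode) ** 2) / (2 * sigma ** 2))

# Sanity check: the density should peak at the mode and integrate to ~1.
xs = np.linspace(-5, 5, 20001)
pdf = two_piece_normal_pdf(xs, mode=0.5, sigma_left=0.3, sigma_right=0.8)
area = pdf.sum() * (xs[1] - xs[0])
print(f"peak at {xs[np.argmax(pdf)]:.3f}, total area {area:.4f}")
```

Because sigma_right > sigma_left here, the right tail is heavier than the left, capturing exactly the kind of skew we observed in the score distributions.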
To evaluate a trained face recognition model, we feed it pairs of images that are annotated with demographic information but not with identity information. The face verification pairings are randomized: some are matches, and some are not, but we don’t know which are which.
From the resulting scores, our method fits a pair of distributions, one for matches and one for non-matches, and from the separation between those distributions, we can predict the model’s accuracy. We repeat this process for each demographic class in the dataset and compare the results.
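The pipeline can be illustrated end to end on synthetic scores. This sketch substitutes a plain two-Gaussian EM fit for the paper’s two-piece distributions, so it shows the shape of the method rather than the exact algorithm:

```python
import numpy as np
from math import erf, sqrt

# Synthetic unlabeled pair distances: a mix of genuine (close) and
# impostor (far) pairs.  The labels are used only to generate the data;
# the fitting step below never sees them.
rng = np.random.default_rng(1)
scores = np.concatenate([rng.normal(0.6, 0.12, 400),    # genuine pairs
                         rng.normal(1.3, 0.15, 1600)])  # impostor pairs

# Two-component Gaussian mixture fit via expectation-maximization.
mu = np.array([scores.min(), scores.max()])
sigma = np.full(2, scores.std())
weight = np.array([0.5, 0.5])
for _ in range(200):
    # E-step: responsibility of each component for each score.
    lik = weight * np.exp(-((scores[:, None] - mu) ** 2) / (2 * sigma**2)) / sigma
    resp = lik / lik.sum(axis=1, keepdims=True)
    # M-step: refit weights, means, and scales from responsibilities.
    nk = resp.sum(axis=0)
    weight = nk / len(scores)
    mu = (resp * scores[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (scores[:, None] - mu) ** 2).sum(axis=0) / nk)

# Predicted verification accuracy at a threshold between the two modes:
# genuine pairs should fall below it, impostor pairs above it.
Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF
t = mu.mean()
accuracy = (weight[0] * Phi((t - mu[0]) / sigma[0])
            + weight[1] * (1 - Phi((t - mu[1]) / sigma[1])))
print(f"predicted accuracy: {accuracy:.3f}")
```

Running this once per demographic group and comparing the predicted accuracies is what surfaces differential performance: no identity labels are involved at any step.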
Based on hierarchical clustering of the test samples, we can compute error bounds for our accuracy estimates, and our experiments show that even accounting for error, our approach can still provide a clear signal of disparity. We hope that this methodology will help AI practitioners working on face recognition or similar biometric tasks to ensure the fairness of their models.
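The clustering-based bounds are specific to our paper, but the general idea of attaching uncertainty to a statistic computed from unlabeled scores can be sketched with a plain percentile bootstrap — a simpler stand-in, not the method from the paper:

```python
import numpy as np

def bootstrap_ci(scores, estimator, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for any scalar statistic
    of a score sample (a generic stand-in for clustering-based bounds)."""
    rng = np.random.default_rng(seed)
    stats = [estimator(rng.choice(scores, size=len(scores), replace=True))
             for _ in range(n_boot)]
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Example: uncertainty on the mean pairwise distance of a score sample.
sample = np.random.default_rng(2).normal(1.0, 0.2, 500)
lo, hi = bootstrap_ci(sample, np.mean)
print(f"95% CI: [{lo:.3f}, {hi:.3f}]")
```

If the accuracy intervals estimated for two demographic groups do not overlap, the disparity between them is unlikely to be an artifact of sampling noise.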