By now, there have been many publicized instances of machine learning models that appear to discriminate on the basis of a protected attribute, such as race or gender. To mention two: COMPAS, a model created by Northpointe (now Equivant) to measure the recidivism risk of criminal defendants across the U.S., came under scrutiny because it categorized black defendants as high risk disproportionately often, even after controlling for whether they actually recidivated. More recently, Amazon revealed that a model it had built to screen job applicants tended to favor men over women, which led the company to scrap the model.
To be clear, I’m not claiming that this happened because someone at Northpointe or Amazon was a malicious racist or sexist. Rather, the issue is that it’s hard to ensure that a model is free of discriminatory biases; even if we don’t give the model access to the protected attribute directly, the model may learn to discriminate anyway by using a proxy that is correlated with the protected attribute.
How statistical notions of fairness can go wrong
There has been a lot of research in the field of fair machine learning, and in the process many different definitions of fairness have come up. One of the simplest and the most commonly used definitions is demographic parity, which states that the positive classification rate must be the same regardless of the protected attribute. For example, in the context of job applications, if 10% of the white applicants are ultimately hired, then 10% of the black applicants should be hired as well.
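To make the definition concrete, here is a minimal sketch of how one might check demographic parity on a set of hiring decisions. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def positive_rates(decisions, groups):
    """Return the fraction of positive decisions (1 = hired) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += decision
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(decisions, groups):
    """Largest difference in positive rate between any two groups."""
    rates = positive_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())


# Toy data (made up): 20 applicants per group, 2 hired in each, i.e. 10%.
groups = ["white"] * 20 + ["black"] * 20
decisions = [1, 1] + [0] * 18 + [1, 1] + [0] * 18

rates = positive_rates(decisions, groups)
gap = demographic_parity_gap(decisions, groups)
print(rates, gap)
```

A gap of zero means the decisions satisfy demographic parity exactly; in practice one would usually tolerate some small nonzero gap.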
But just because a hiring process satisfies demographic parity does not mean that the hiring process is free of discrimination. For example, suppose that an employer uses a flawed model that results in the hiring of 10% of the white applicants and only 5% of the black applicants. Realizing the flaw in the model, the employer then attempts to remove the racial imbalance by hiring an additional 5% of the black applicants. Although the two racial groups are now hired at the same rate, there could still be individual applicants who were wrongly rejected by the flawed model and are not among those hired later. This example is not just hypothetical; the state government of Connecticut did something very similar in the 1970s, and the U.S. Supreme Court decided (Connecticut v. Teal, 1982) for the above reason that demographic parity is not a complete defense to claims of discrimination.
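The scenario above can be sketched with a small toy example. All of the data here is made up: the applicant records, the `qualified` flag, and the rule that the employer tops up hires in list order are assumptions for illustration, not a description of any real hiring process.

```python
def hire_rate(applicants, hired, group):
    """Fraction of applicants in `group` whose id is in the `hired` set."""
    members = [a for a in applicants if a["group"] == group]
    return sum(a["id"] in hired for a in members) / len(members)


# 20 applicants per group; two in each group are actually qualified.
applicants = (
    [{"id": f"w{i}", "group": "white", "qualified": i < 2} for i in range(20)]
    + [{"id": f"b{i}", "group": "black", "qualified": i in (0, 7)} for i in range(20)]
)

# Hypothetical flawed model: hires both qualified white applicants (10%)
# but only one of the two qualified black applicants (5%).
hired = {"w0", "w1", "b0"}

# The employer's patch: hire more black applicants until the rates match,
# picking them in list order without checking who was wrongly rejected.
target = hire_rate(applicants, hired, "white")
for a in applicants:
    if hire_rate(applicants, hired, "black") >= target:
        break
    if a["group"] == "black" and a["id"] not in hired:
        hired.add(a["id"])

white_rate = hire_rate(applicants, hired, "white")
black_rate = hire_rate(applicants, hired, "black")

# Qualified applicants who are still not hired after the patch.
still_harmed = [a["id"] for a in applicants if a["qualified"] and a["id"] not in hired]
print(white_rate, black_rate, still_harmed)
```

After the patch both groups are hired at the same rate, yet the qualified applicant `b7` is still rejected. Demographic parity only looks at group-level rates, so it cannot see this.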
Proxy-based notion of fairness
So instead of considering the statistical effect of the proxy on the model as a whole, we choose to identify the cause of the discriminatory behavior directly. Previously, Datta et al. formalized the definition of a proxy as any component of the model that is:
- sufficiently associated with the protected attribute, and
- sufficiently influential on the final output of the model.
This is a high-level summary of the definition. To apply it in practice, we have to define what a component is and specify how association and influence are measured.
Our paper at NeurIPS 2018 does exactly that for the setting of linear regression. We then improve on the proxy detection procedure proposed by Datta et al.[5:1], which is essentially a brute-force algorithm that enumerates all components of the model. In the setting of linear regression, we found a nice structure that lets us detect proxies efficiently by solving a convex optimization problem (specifically, a second-order cone program). This is useful in practice because it allows a model trainer to learn which part of the model is discriminatory and take steps to fix that part.
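To give a feel for the two criteria in the linear regression setting, here is a minimal sketch that scores a single candidate component. It rests on several assumptions that should be checked against the paper: a component is taken to be a rescaled sum of the model's terms (each coefficient multiplied by a weight in [0, 1]), association is measured by squared Pearson correlation with the protected attribute, and influence by the ratio of the component's variance to the variance of the model output. The thresholds `eps` and `delta` and all function names are hypothetical. This only evaluates one given component; it is not the optimization procedure from the paper.

```python
def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Population covariance; covariance(xs, xs) is the variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def association(component, protected):
    """Assumed measure: squared Pearson correlation, between 0 and 1."""
    var_c = covariance(component, component)
    var_z = covariance(protected, protected)
    if var_c == 0 or var_z == 0:
        return 0.0
    return covariance(component, protected) ** 2 / (var_c * var_z)


def influence(component, output):
    """Assumed measure: variance of the component relative to the output."""
    return covariance(component, component) / covariance(output, output)


def component_values(rows, coefs, alphas):
    """Evaluate sum_i alpha_i * coef_i * x_i on every row."""
    return [sum(a * b * x for a, b, x in zip(alphas, coefs, row)) for row in rows]


def is_proxy(component, protected, output, eps, delta):
    """A component is flagged if it is both associated and influential."""
    return (association(component, protected) >= eps
            and influence(component, output) >= delta)


# Toy data (made up): x1 tracks the protected attribute z, x2 does not.
rows = [(1, 0), (1, 1), (0, 0), (0, 1)]
z = [1, 1, 0, 0]
coefs = (2.0, 1.0)

output = component_values(rows, coefs, (1, 1))      # the full model
first_term = component_values(rows, coefs, (1, 0))  # only the x1 term
second_term = component_values(rows, coefs, (0, 1)) # only the x2 term

print(is_proxy(first_term, z, output, eps=0.5, delta=0.5))
print(is_proxy(second_term, z, output, eps=0.5, delta=0.5))
```

In this toy model the `x1` term is both strongly associated with `z` and responsible for most of the output variance, so it is flagged, while the `x2` term is not. Checking candidates one at a time like this is what becomes expensive when there are many possible components.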
Justified use of a proxy
One final point is that not all proxies are bad. To see why, let’s go back to the example of job applications. If the job requires heavy lifting, a model that is aware of this requirement will use the applicant’s weightlifting ability to help make its predictions. Although weightlifting ability is likely to be a proxy for gender, this proxy is considered justified in this context because it is relevant to this specific prediction task.
So in general, we assume that once a proxy has been identified, a human domain expert will look at it and decide whether it should be allowed to remain in the model. But humans, unlike computers, are limited in how much work they can review, so we want to focus their attention on the proxies that we are really unsure about. In the above example, we are fairly sure that the use of weightlifting ability is justified, so we want to avoid flagging proxies that are largely based on it. In the NeurIPS paper[6:1], we formalize this intuition by designating an input attribute as exempt, and we show how to modify our proxy detection algorithm to avoid proxies that are driven by the exempt attribute.
I hope that this post clarified the motivation for our work on proxy use, and I encourage you to read the paper[6:2] for more technical details.