But in the biggest-ever study of real-world mortgage data, economists Laura Blattner at Stanford University and Scott Nelson at the University of Chicago show that differences in mortgage approval between minority and majority groups are not just down to bias, but to the fact that minority and low-income groups have less data in their credit histories.
This means that when this data is used to calculate a credit score, and that score is used to predict loan default, the prediction will be less precise. It is this lack of precision that leads to inequality, not just bias.
The implications are stark: fairer algorithms won’t fix the problem.
“It’s a really striking result,” says Ashesh Rambachan, who studies machine learning and economics at Harvard University but was not involved in the study. Bias and patchy credit records have been hot issues for some time, but this is the first large-scale experiment that looks at the loan applications of millions of real people.
Credit scores squeeze a range of socio-economic data, such as employment history, financial records, and purchasing habits, into a single number. As well as deciding loan applications, credit scores are now used to make many life-changing decisions, including decisions about insurance, hiring, and housing.
To work out why minority and majority groups were treated differently by mortgage lenders, Blattner and Nelson collected credit reports for 50 million anonymized US consumers, and tied each of those consumers to their socio-economic details taken from a marketing dataset, their property deeds and mortgage transactions, and data about the mortgage lenders who provided them with loans.
One reason this is the first study of its kind is that these datasets are proprietary and not publicly available to researchers. “We went to a credit bureau and basically had to pay them a lot of money to do this,” says Blattner.
They then experimented with different predictive algorithms to show that credit scores were not simply biased but “noisy,” a statistical term for data that can’t be used to make accurate predictions. Take a minority applicant with a credit score of 620. In a biased system, we might expect this score to always overstate that applicant’s risk, so that a more accurate score would be, for example, 625. In theory, this bias could then be accounted for via some form of algorithmic affirmative action, such as lowering the threshold for approval for minority applications.
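The difference between a biased score and a noisy one can be illustrated with a small Python sketch. This is not the researchers' method, only a toy model under stated assumptions: the approval cutoff of 620, the range of scores, and the 30-point spread of the noise are all invented for illustration, and noise is modeled here as random error that differs from applicant to applicant. The 5-point bias follows the 620-versus-625 example above.

```python
import random

CUTOFF = 620  # hypothetical approval cutoff, chosen only for illustration
BIAS = -5     # reported score sits 5 points below the accurate one (620 vs. 625)


def wrong_decisions(accurate, reported, threshold):
    """Count applicants whose decision at `threshold` on the reported score
    differs from the decision at CUTOFF on the accurate score."""
    return sum(
        (r >= threshold) != (a >= CUTOFF)
        for a, r in zip(accurate, reported)
    )


rng = random.Random(0)  # fixed seed so the sketch is reproducible
accurate = [rng.randint(500, 750) for _ in range(10_000)]  # made-up applicants

# Biased scores: every score is off by the same, known amount.
biased = [a + BIAS for a in accurate]
# Noisy scores (assumed model): each score is off by a different, random amount.
noisy = [a + rng.gauss(0, 30) for a in accurate]

# With bias, lowering the threshold by the size of the bias undoes the error.
biased_uncorrected = wrong_decisions(accurate, biased, CUTOFF)
biased_corrected = wrong_decisions(accurate, biased, CUTOFF + BIAS)

# With noise, try every threshold near the cutoff and keep the best one.
noisy_best = min(wrong_decisions(accurate, noisy, t) for t in range(560, 681))

print("biased, original threshold:", biased_uncorrected)
print("biased, lowered threshold: ", biased_corrected)
print("noisy, best threshold:     ", noisy_best)
```

In this toy model, lowering the threshold by 5 points removes every wrong decision caused by the bias, because each score is off by the same amount. No single threshold does the same for the noisy scores, since the error is different for every applicant.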