A new book reveals the hidden biases that can creep into Data Science models
Yesterday (September 6, 2016) saw the launch of a new book by Cathy O'Neil with the provocative title Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. O'Neil holds a Ph.D. in math from Harvard and was a tenure-track math professor until 2007, when she quit academia to join Wall Street. That fledgling second career came to an end just a year later with the Financial Crisis, after which O'Neil again changed careers and became a data scientist. She runs a blog in which she identifies and debunks misuses of mathematical modelling.
O'Neil identifies three characteristics of the misused models that she calls "Weapons of Math Destruction":
- The model's reasoning is opaque, even to the entities that use it.
- The model's outputs can cause harm to the people whose behaviour is being modelled, such as being fired, or being refused a job or admission to a university.
- The model is deployed on a large scale.
O'Neil's book provides many concrete examples of WMDs. Here are just two:
- Credit scores used to be based on people's actual history of taking loans and paying them back. People can view their credit histories and request corrections when they see a mistake. But some institutions are starting to judge a person's credit-worthiness using so-called "e-scores", which use data elements such as postal codes, telephone area codes and web-surfing history to predict who is likely to pay back a loan. Such a model will notice that people who live in affluent areas tend to pay back loans more often than people from poor areas, hardly a surprising conclusion. But that means credit-worthiness is judged not on what people have themselves done in the past but on the neighbourhoods in which they live. This kind of profiling means that disadvantaged people remain at a disadvantage, having a harder time getting credit and paying higher interest when they do (a toy sketch of this proxy effect appears after this list).
- Employers are now making hiring decisions by looking at the past history of their own employees and seeing which traits best predict success. If there are already hidden biases in a company's workplace, such as a preponderance of males in a particular job, this kind of modelling will only perpetuate those biases. A female candidate will not "look like" these previously successful employees, so she will not be hired. The workforce, and hence the training data, will remain predominantly male, which will only reinforce the model's gender bias going forward, making it ever less likely that a female candidate will be hired (a second sketch after this list illustrates the feedback loop).
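For readers who want to see the mechanism rather than just the description, here is a minimal sketch of the proxy-variable problem behind e-scores. Everything in it is assumed for illustration: the data is synthetic, the feature names (an "affluent postal code" flag, an individual "reliability" score) are invented, and scikit-learn's logistic regression stands in for whatever proprietary model a lender might actually use.

```python
# Toy illustration of the proxy-variable problem; all data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# 1 = affluent postal code, 0 = poorer postal code (hypothetical split)
affluent = rng.integers(0, 2, size=n)

# Individual reliability is what actually drives repayment in this toy world...
reliability = rng.normal(size=n)
# ...but average repayment rates still differ between neighbourhoods.
repaid = (reliability + 0.8 * affluent + rng.normal(size=n) > 0).astype(int)

# The "e-score" model is only given the neighbourhood proxy,
# not the applicant's own behaviour.
X = affluent.reshape(-1, 1)
model = LogisticRegression().fit(X, repaid)
scores = model.predict_proba(X)[:, 1]

print("mean e-score, affluent postal codes:", scores[affluent == 1].mean().round(3))
print("mean e-score, poorer postal codes:  ", scores[affluent == 0].mean().round(3))
# Because the model never sees individual behaviour, two equally reliable
# applicants get different scores purely because of where they live.
```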
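The hiring example can be sketched the same way. Again, this is only an illustration under assumed conditions: the historical "success" labels are generated with a deliberate gender skew to mimic a biased workplace, the gender flag stands in for gender-correlated traits a résumé-screening model might pick up, and none of this comes from the book itself.

```python
# Toy illustration of a hiring model trained on biased history; synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Historical employees: 90% male, and the "success" label itself reflects a
# biased workplace: being male boosted the odds of being rated successful.
n_hist = 2_000
male = (rng.random(n_hist) < 0.9).astype(int)
skill = rng.normal(size=n_hist)
success = (skill + 1.0 * male + rng.normal(size=n_hist) > 0).astype(int)

# Train on gender (standing in for gender-correlated traits) plus a skill signal.
X_hist = np.column_stack([male, skill])
model = LogisticRegression().fit(X_hist, success)

# New applicant pool: 50/50 gender split, identical skill distribution.
n_app = 1_000
app_male = (rng.random(n_app) < 0.5).astype(int)
app_skill = rng.normal(size=n_app)
scores = model.predict_proba(np.column_stack([app_male, app_skill]))[:, 1]

# "Hire" the top-scoring 10% and look at the gender mix of the new hires.
hired = np.argsort(scores)[-(n_app // 10):]
print("share of men among new hires:", app_male[hired].mean().round(2))
```

If the company then retrains on this new, still male-heavy workforce, the learned weight on the gender-correlated features only grows, which is exactly the self-reinforcing loop the paragraph above describes.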
O'Neil's book has drawn a great deal of attention from mainstream media and several high-profile blogs. Here are just a few places where you can read more about it:
- A piece in the UK newspaper The Guardian.
- A piece in Time magazine.
- A piece in New York magazine.