
CHAPTER IV - Perils of Predictive Analytics in Criminal Justice

The unacknowledged biases of AI systems are perhaps most pointedly evident in the criminal justice system. Here AI is often used to make data-based generalizations about people, which then inform policing on the street, criminal sentencing, parole decisions and much else.

It is all part of “the coded gaze,” as Joy Buolamwini of the Algorithmic Justice League puts it. The coded gaze is about the power to create and use AI technologies to evaluate other people and make decisions about their fates. Structural inequalities such as historic racial, ethnic and gender prejudices, and social and wealth inequalities feed into this process, with the result that AI amplifies past inequities. For example, said Buolamwini, facial recognition software may rely on whatever datasets are most easily accessible, which may mean that ethnic minorities whose faces are under-represented in those datasets are excluded from the system in the first place.i

New York Times reporter Adam Liptak in 2017 asked Chief Justice John Roberts, Jr., “Can you foresee a day when smart machines, driven with artificial intelligence, will assist with courtroom fact-finding or, more controversially even, judicial decision-making?” Roberts’ answer stunned everyone: “It’s a day that’s here, and it’s putting a significant strain on how the judiciary goes about doing things.”

Court observers speculated that Roberts was talking about the case of a Wisconsin prisoner sentenced to six years in prison based in part on a software program called COMPAS, produced by Northpointe, Inc. A prosecutor used the program to persuade a trial judge that the prisoner showed “a high risk of violence, high risk of recidivism, high pretrial risk.” Meanwhile, the news organization ProPublica published a 2016 article about COMPAS’s secret algorithms that concluded that black defendants in Broward County, Florida, “were far more likely than white defendants to be incorrectly judged to be at a higher risk of recidivism.”
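
ProPublica’s conclusion rests on a comparison of error rates across groups: among defendants who did not go on to reoffend, black defendants were flagged as high risk far more often than white defendants. The sketch below shows what that kind of audit looks like in code; it uses entirely made-up data and a made-up scoring rule, not the COMPAS model or the Broward County records.

```python
# A minimal, illustrative audit of a risk-scoring tool. The data and the
# scoring rule below are made up; this is not the COMPAS model or ProPublica's
# code. The check amounts to comparing false positive rates across groups:
# how often people who did NOT reoffend were nonetheless labeled "high risk".
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
group = rng.choice(["A", "B"], size=n)              # demographic group
reoffended = rng.random(n) < 0.35                   # ground-truth outcome
# Hypothetical scores that skew higher for group B, independent of outcome.
score = rng.normal(4.5, 2.0, n) + 1.5 * reoffended + 1.0 * (group == "B")
high_risk = score >= 6.0                            # the tool's cutoff

for g in ("A", "B"):
    did_not_reoffend = (group == g) & ~reoffended
    fpr = high_risk[did_not_reoffend].mean()        # share wrongly flagged
    print(f"group {g}: false positive rate = {fpr:.1%}")
```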

Joshua Browder, Founder and CEO of DoNotPay, a chatbot “robot lawyer” that helps people with their legal issues, reported that “about 20% of all cases in the law now have automatic sentencing programs using AI technology that applies some sort of sentencing recommendation to the judge. There’s been this huge backlash because none of the algorithms are open despite sentencing being part of a public process.” (Browder had created his chatbot to turn the tables on the formulaic and unfair nature of law: it enabled motorists to appeal parking tickets, and over 175,000 of them succeeded in getting their tickets thrown out, saving an estimated $5 million.)

For the most part, said one participant, judges are not unhappy to be able to shift responsibility for verdicts to AI systems and their supposedly more rigorous risk assessment scores. This poses serious risks to the morality and legitimacy of the system, said Father Eric Salobir of OPTIC: when closed, proprietary algorithms become deeply implicated in the justice system, “we move from a justice of consequences to a justice of correlation,” undermining the integrity of justice and of the process itself. Explainability is a critical element of the judicial system.ii

Secret algorithms rendering risk assessments are not confined to criminal sentencing. They are also being used to try to predict criminal activity, which in turn is influencing policing priorities. “More and more police departments are adopting AI systems to predict where crime will happen,” said David Copps, Founder and CEO of Brainspace, a Dallas-based firm that does “investigative analytics.” Copps was once shown a video about an AI system in which police arrested a “suspicious” person in a neighborhood that was supposedly more crime-prone. It turned out that there was an arrest warrant out on the person, enabling the police to make an arrest. Copps said: “All I could think is that that person didn’t do anything. He had not committed a crime.” Even though predictions may be unreliable, said Copps, the danger is that “intention creates reality. It almost creates a false-positive.”

Kate Crawford cited a multi-year RAND study of predictive policing in Chicago. “It showed that the system was completely ineffective at reducing crime,” she said. “Absolutely net zero. It did have one significant net impact, however — increasing the harassment of people on the ‘heat list’, according to RAND.” Crawford also cited a research paper about an “automatic criminality detector” that purports to use machine learning to discriminate between criminals and non-criminals based on an analysis of people’s faces. The system analyzed the faces of 1,586 real persons, nearly half of whom were convicted criminals, and concluded that the AI could detect “structural features for predicting criminality, such as lip curvature, eye inner corner distance, and the so-called nose-mouth angle.”

Crawford said that the authors of the piece, responding to great criticism, claimed that any resemblance to the [discredited] use of physiognomy and phrenology on their part was purely accidental because, in their view, machine learning is neutral. This is one of Crawford’s central concerns about AI technologies: they are often portrayed as “neutral tools.” Some users presume that by pushing lots of (potentially skewed) data through an opaque deep learning algorithm, the results are somehow neutral and reliable. “We need to think a lot more about what design specs are built into a system, and who has the power to shape those specifications,” Crawford said. Commissioner Terrell McSweeny of the Federal Trade Commission (speaking for herself and not the FTC) agreed: “These are completely opaque systems that are held by a very few powerful entities, and we have almost no access to determine when bias is even occurring. We are relying completely on companies’ own internal testing and control. One must question whether that is sustainable or desirable.”

Algorithmic Accountability
The defense of AI risk assessment and prediction systems is that they provide a more factual basis for decision-making and can therefore actually reduce biases and mistakes. “If AI is used to supplement human decisions in complex circumstances, shedding light on very, very complex situations with more variables than the human brain can hold, I think the technology opens up the possibility for more fairness,” said Jean-François Gagné of Element AI. He cited progress on Generative Adversarial Networks, or GANs, which attempt to extract and explain biases in algorithms and datasets.
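
Gagné’s reference to GANs points to a broader family of adversarial techniques for surfacing and reducing bias. One adjacent approach, often called adversarial debiasing, trains a predictor alongside an adversary that tries to recover a protected attribute from the predictor’s output; the predictor is penalized whenever the adversary succeeds. The sketch below is a generic illustration of that idea in PyTorch, using synthetic data and arbitrary model sizes; it is not a description of Element AI’s work.

```python
# A generic sketch of GAN-style "adversarial debiasing" (an assumption for
# illustration, not any participant's actual system): a predictor learns the
# task while an adversary tries to recover a protected attribute from the
# predictor's score. Penalizing the predictor when the adversary succeeds
# pushes the score toward independence from that attribute.
import torch
import torch.nn as nn

torch.manual_seed(0)
n = 1000
X = torch.randn(n, 8)                                   # synthetic features
a = (torch.rand(n) < 0.5).float()                       # protected attribute
y = ((X[:, 0] + 0.8 * a + 0.3 * torch.randn(n)) > 0).float()  # label leaks `a`

predictor = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
adversary = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))
opt_p = torch.optim.Adam(predictor.parameters(), lr=1e-2)
opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-2)
bce = nn.BCEWithLogitsLoss()
lam = 1.0                                               # weight of the fairness penalty

for step in range(200):
    # 1) Train the adversary to guess `a` from the (detached) risk score.
    opt_a.zero_grad()
    adv_loss = bce(adversary(predictor(X).detach()).squeeze(1), a)
    adv_loss.backward()
    opt_a.step()

    # 2) Train the predictor: fit the label, minus whatever the adversary
    #    can still recover about `a` from the score.
    opt_p.zero_grad()
    score = predictor(X)
    loss = bce(score.squeeze(1), y) - lam * bce(adversary(score).squeeze(1), a)
    loss.backward()
    opt_p.step()
```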

FTC Commissioner Terrell McSweeny is troubled by this kind of automated justice, however: “You are eliminating human accountability and the role that humans play in sitting in judgment on other human beings.” That’s what the mandatory minimum federal drug sentencing guidelines for judges attempted to do, but failed, she said. “We have lost an entire generation of mostly men of color to a system that was meant to be weeding out bias in individual judges,” said McSweeny. “When you eliminate human discretion in decision making, you can potentially cause much greater problems.”

Another factor at play here is our varying levels of tolerance for error, she said. With self-driving cars, there is likely to be very little tolerance for mistakes. This does not appear to be the case with AI-assisted systems for assessing job performance and meting out criminal sentences.

A print cartoon about this theme has a defendant asking a computer, “How can you judge me?” The computer replies, “You wouldn’t understand.” In that sardonic humor lies a deeper philosophical debate about the need for transparency and accountability in jurisprudence. “I would argue that as we move into the AI realm, there is a new problem, the opacity of decisions,” said Marc Rotenberg of the Electronic Privacy Information Center. “Data can be enormously helpful in helping us to extract bias or reveal bias, but when we embed a rule, we are encoding whatever normative value we think produces necessary outcomes.”
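
Rotenberg’s point about embedded rules is visible in even a trivial piece of code. In the hypothetical pretrial rule below, the weights and the cutoff are not derived from any neutral source; each number is a normative judgment frozen into software, and a defendant who asks “why?” gets little more explanation than the cartoon’s computer offers.

```python
# An entirely hypothetical pretrial-release rule, for illustration only.
# Neither the weights nor the cutoff comes from a neutral source: each number
# is a normative judgment about how much a prior record or a missed court
# date should count, frozen into software.
def release_recommendation(prior_felonies: int, failures_to_appear: int) -> str:
    risk = 2.0 * prior_felonies + 1.5 * failures_to_appear  # encoded value judgment
    return "detain" if risk >= 4.0 else "release"           # encoded value judgment

print(release_recommendation(prior_felonies=1, failures_to_appear=1))  # release
print(release_recommendation(prior_felonies=2, failures_to_appear=0))  # detain
```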

This dynamic was dramatically on display in a federal case brought by the Houston Federation of Teachers Local 2415 against the Houston school district in April 2014. The teachers claimed that a proprietary algorithm that measures teacher performance based on student test scores could violate their civil rights because they were given no due process to examine and challenge the evidence used to fire them. The system gave teachers a raw performance rating relative to the state average, with no recognition that some school districts may have larger populations, more disadvantaged cohorts of students, or other relevant differences. In May 2017, a federal district court ruled in Houston Federation of Teachers et al. v. Houston Independent School District that the teachers had legitimate arguments that the assessment program may violate their Fourteenth Amendment due process protections.iii
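
The substance of the teachers’ complaint is easier to see with a toy comparison. The district’s model is proprietary, so the sketch below uses invented data and deliberately simple formulas; it only contrasts a raw rating pegged to the state average with one adjusted for a contextual variable such as the share of disadvantaged students.

```python
# A toy comparison, with invented data (the district's actual model is
# proprietary): a raw rating pegged to the state average versus a rating
# adjusted for one contextual variable, the share of disadvantaged students
# in a teacher's school.
import numpy as np

rng = np.random.default_rng(1)
n_teachers = 500
pct_disadvantaged = rng.uniform(0, 1, n_teachers)        # school context
true_effect = rng.normal(0, 1, n_teachers)               # what we would like to measure
# Observed test-score growth reflects both the teacher and the context.
growth = true_effect - 2.0 * pct_disadvantaged + rng.normal(0, 0.5, n_teachers)

# Raw rating: distance from the state average, context ignored.
raw_rating = (growth - growth.mean()) / growth.std()

# Adjusted rating: residual after regressing growth on the context variable.
slope, intercept = np.polyfit(pct_disadvantaged, growth, 1)
adjusted_rating = growth - (slope * pct_disadvantaged + intercept)

# Teachers in high-poverty schools look systematically worse under the raw score.
high_poverty = pct_disadvantaged > 0.75
print("raw rating, high-poverty schools:     ", round(raw_rating[high_poverty].mean(), 2))
print("adjusted rating, high-poverty schools:", round(adjusted_rating[high_poverty].mean(), 2))
```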

There is a similar lack of transparency involving the use of body cameras that an AI company, Axon (formerly Taser), is giving away to police departments across the US. Axon has added real-time facial recognition software to the cameras, which means that the video feeds are being held by a private company whose data models are unavailable to public authorities. “Currently, there are no public means of accountability,” said Kate Crawford, adding, “There should be a much higher bar for how these devices work, how they are being used, and how they are being audited. Ultimately, we need a commitment to forms of due process when high-stakes decision-making is involved, such as criminal justice, welfare and education.”

Now AI systems are being built into the inner workings of government itself, said Crawford, referring to Palantir, a company that has a very large-scale contract with the Trump administration to build a new machine-learning platform for the US Immigration and Customs Enforcement agency. “What we’ve learned about this platform is that it has enormously sensitive data. It is bringing together datasets from many different databases that have not been combined before. It could indeed be an engine for very large levels of deportation in this country. The question that I would put to us is, What sort of procedural due process rights would anyone have against a system like this?”

Crawford recommended reading Hannah Arendt’s classic book, The Origins of Totalitarianism (1951), which she sees as presaging the surveillance-and-control ambitions of the Palantir platform. Arendt writes:

Now the police dreams that one would look at a gigantic map on the office wall, and that should suffice at any moment to establish who is related to whom, and in what degree of intimacy. This dream is not unrealizable, although its technical execution is difficult. If this map really did exist, not even memory would stand in the way of the totalitarian claim to domination. Such a map might make it possible to obliterate people without any trace, as if they had never existed at all.

As Crawford concluded: “Arendt’s fear was that the reason totalitarianism failed in the 20th century was because it simply didn’t have access to sufficiently powerful technologies.”

Can Insurgent Data Systems Neutralize Bias?
If powerful players can use algorithmic bias to affect or control people’s lives, the question arises: Can countervailing AI or data systems be built to act as correctives? Could open-source artificial intelligence neutralize or overcome AI bias by providing the transparency and diversity of perspectives that are needed?

J. Nathan Matias, a postdoctoral researcher at Princeton and an Aspen Institute Guest Scholar, believes that “there is an opportunity to use databases and AI systems to rethink, overturn or optimize an unjust system.” He cited projects by online feminist groups that have amassed data to protect themselves against online harassment, for example, and informal citizen communities that are using machine-learning systems to try to ensure due process in court settings. Some intriguing innovations in citizen-based data projects include:

  • The Algorithmic Justice League, founded by Joy Buolamwini, a student at MIT Media Lab, is a collective that is dedicated to “highlighting algorithmic bias through media, art and science; providing space for people to voice their concerns and experiences with the ‘coded gaze’; and developing practices for accountability during the design, development and deployment of coded systems.” The project has sponsored research initiatives, reports, videos and art-driven protests, among other strategies to draw attention to the “coded gaze.”
  • Three enterprising data activists created an app called White Collar Crime Risk Zonesiv that “uses machine learning to predict where financial crimes are most likely to occur throughout the US.” When a user enters a zip code into the search box, the app produces a map showing pushpins indicating where a documented crime occurred, the approximate “crime severity” (in US dollars), and the “top risk likelihoods” in that area (such as “breach of fiduciary duty” and “employment discrimination based on age”).
  • A project at the MIT Media Lab is building a new sort of human-machine interface that explicitly tries to identify and model bias as a way to root it out, said Joi Ito. The interface does this by interacting with a person to identify and highlight his or her own biases while also revealing how the machine itself is making judgments. The idea is to make visible the algorithmic assumptions of the machine and the biases of the human decision maker. The tool itself is positioned as augmented intelligence, not a substitute for human decision-making.
  • Gliimpse is a personal health data platform that lets people aggregate their health data from dozens (or more) data sources, and then manage and personalize how that data is used. A startup acquired by Apple in 2017, Gliimpse helps individuals take charge of their data in assessing their personal health, making healthcare decisions and sharing the data with trusted third parties.

Is Open Source AI Possible?
If the goal is to increase transparency and control of data for users, one of the most obvious strategic approaches is open source development. But is it really possible to create an “open AI” and could it be effective?

Naveen Rao of Intel said that “AI as a research and development field is actually one of the most open ones around,” noting that methodologies and code are generally published and not held as trade secrets. However, this may not be significant enough, replied Joy Buolamwini of MIT Media Lab, “because these are data-centric technologies. If you have access to the models and learn the predictions, but you don’t have access to the data, you’re missing half of the picture. The data itself needs to be part of any process to increase transparency within AI.”

For Jean-François Gagné of Element AI, the idea of open AI “kind of misses the point. When you think about it, AI is transforming the way we code, so we no longer prescribe and encapsulate insights into the code. We now build models that are tools, and then train them. So providing access to methods just doesn’t move the needle at all — and it’s not a matter of not having access to datasets. It’s much broader than that, much more fundamental,” he stressed.

What makes AI different from open source software, Gagné said, is that AI tools using enormous pools of data can self-learn and improve their methodologies faster and better than anyone else. Eventually, the tools will “reach a point of escape velocity” that insulates the big AI players from competition and the AI tools themselves from scrutiny. “The volume of search queries and other data that Google can access is so big that its algorithms can perform way better than others,” said Gagné. The AI tools controlled by the big players are becoming more powerful for another reason, he said: they can convert what used to be considered “noise” — meaningless data-points buried within unfathomably vast datasets — into usable information. Meanwhile, said Gagné, “the social contract” that purportedly applies to this transfer of data is a relic from another time.

Kate Crawford argued that the problems go beyond access to AI methodologies and data, to the ability to access “infrastructure of scale.” Only a few companies — like Amazon Web Services, Google, Baidu and Microsoft — currently have the types of massively parallel computing infrastructures to produce the computations at competitive speeds and efficiencies, she said. “It’s a matter of data plus infrastructure plus large amounts of capital,” said Crawford. “There is already a profound concentration of power in the AI industry that has left behind large swaths of the world. We are really talking about a tiny group of global players doing this.”

Jean-François Gagné added, “Startups are being totally crushed by a Google or Amazon, leveraging an absolutely unfair amount of datasets and algorithms to lower the price-point and be so much more efficient than anyone else, not to mention [saving money by] reporting tax in different countries. As we think about AI, these phenomena are going to get amplified. For a lot of countries that are out of this race, this is going to be a huge issue,” he predicted. “It is almost literally impossible to compete.”v


i For more on bias in face recognition technologies, see a report by Clare Garvie, Alvaro Bedoya and Jonathan Frankle, “The Perpetual Line-Up: Unregulated Police Face Recognition in America,” Georgetown Law Center on Privacy & Technology (October 18, 2016). Available online: https://www.perpetuallineup.org.
ii The need for AI systems to “explain themselves” so that human institutions can justifiably rely on them has given rise to a new field of research, “Explainable A.I.,” or X.A.I. See Cliff Kuang. “Can A.I. Be Taught to Explain Itself?” The New York Times. (November 21, 2017). Available online: https://www.nytimes.com/2017/11/21/magazine/can-ai-be-taught-to-explain-itself.html.
iii Houston Federation of Teachers Local 2415 et al. v. Houston Independent School District, No. 14-1189 (S.D. Tex. 2017).
iv White Collar Crime Risk Zones. Available online: https://thenewinquiry.com/white-collar-crime-risk-zones. The methodology for the project is laid out in a white paper by Brian Clifton, Sam Lavigne and Francis Tseng in “Abolish,” The New Inquiry Magazine, Vol. 59 (April 27, 2017). Available online: https://whitecollar.thenewinquiry.com/static/whitepaper.pdf.
v Or as a New York Times headline put it, “If a wildly creative company with an app used by 178 million people every day can still be crushed by Facebook, how is anyone supposed to succeed?” The article, about Snap, the parent company of Snapchat, was written by Kevin Roose, The New York Times (November 16, 2017). Available online: https://www.nytimes.com/2017/11/15/business/snapchats-new-test-grow-like-facebook-without-the-baggage.html.
