When a machine decision does you wrong, here’s what we should do

by Aziz Huq

Photo of a person holding a phone showing messages with exam results next to a school uniform badge.

Photo by Andy Buchanan/AFP/Getty

Aziz Huq is the Frank and Bernice J Greenberg Professor of Law at the University of Chicago. His books include How to Save a Constitutional Democracy (2018), co-authored with Tom Ginsburg, and The Collapse of Constitutional Remedies (2021).

Edited by Sam Haselby

When a machine does you wrong, what should you do - litigate or regulate? Why AI decision-making should be fairer, by design

States and firms with whom we routinely interact are turning to machine-run, data-driven prediction tools to rank us, and then assign or deny us goods. For example, in 2020, the 175,000 students taking the International Baccalaureate (IB) exams for college learned that their final tests had been called off due to the COVID-19 pandemic. Instead, IB announced last July, it would estimate grades using coursework, ‘significant data analysis from previous exam sessions, individual school data and subject data’. When these synthetic grades were published, outrage erupted. Thousands signed a petition complaining that scores were lower than expected. Students and parents had no means to appeal the predictive elements of grades, even though this data-driven prediction was uniquely controversial. In the United Kingdom, a similar switch to data-driven grading for A-level exams, used for entrance to university, prompted cries of racial bias and lawsuit threats.

As the IB and A-level controversies illustrate, a shift from human to machine decision-making can be fraught. It raises the troubling prospect of a future turning upon a mechanical process from which one’s voice is excluded. Worries about racial and gender disparities persisting even in sophisticated machine-learning tools compound these concerns: what if a machine isn’t merely indifferent but actively hostile because of class, complexion or gender identity?

Or what if an algorithmic instrument is used as a malign instrument of state power? In 2011, the US state of Michigan entered a multimillion-dollar contract to replace its computer system for handling unemployment claims. Under the new ‘MiDAS’ system implemented in October 2013, the number of claims tagged as fraudulent suddenly spiralled. Because Michigan law imposes large financial penalties on unemployment fraud, the state agency’s revenues exploded from $3 million to $69 million. A subsequent investigation found that MiDAS was flagging fraud with an algorithmic predictive tool: out of 40,195 claims algorithmically tagged as fraudulent between 2013 and 2015 (when MiDAS was decommissioned), roughly 85 per cent were false.

In both the grading and the benefits cases, a potent – even instinctual – response is to demand a human appeal from the machine as a safeguard. The Toronto Declaration – launched in 2018 by Amnesty International and Access Now – called for artificial-intelligence decisional tools to be appended with an ‘accessible and effective appeal and judicial review’ mechanism. Unlike machines, humans can be capable of nuanced, contextualising judgment. They’re capable of responding to new arguments and information, updating their views in ways that a merely mechanical process cannot.

Yet, as powerful as these grounds might seem, the resort to a human appeal implicates technical, social and moral difficulties that are obscure at first blush. Without for a moment presuming that machine-driven decision tools are unproblematic – they’re not – the idea of creating an appeal right to a human decision-maker needs to be closely scrutinised. That ‘right’ isn’t as unambiguous as it first seems. It can be implemented in quite divergent ways. Implemented carelessly, it could exacerbate the distributional and dignitary harms associated with wayward machine decision-making.

A right to a human appeal from a machine decision, such as the IB grade prediction or the MiDAS fraud label, can be understood in two different ways. It could first be translated into an individual’s right to challenge a decision in their unique case. I suspect that most people have this in mind when they think of a human appeal from a machine decision: You got my facts wrong, and you owe it to me as a person to correctly rank and treat me based on who I am and what I have, in fact, done.

Superficially alluring, this version of an appeal right leads to troubling outcomes. To begin with, there’s a substantial body of empirical work showing that appending human review even to a simple algorithmic tool tends to generate more, not fewer, mistakes. This finding, rightly ascribed to a 1954 paper by the psychologist Paul Meehl, was tendered in a historical context of crude statistical tools competing against putatively sophisticated clinical judgments. More than a half-century later, with machine prediction considerably refined, it still stands up well.

What of cases such as MiDAS that have staggeringly high error rates? In the same period as MiDAS was operational, human decision-makers working with the MiDAS system had a roughly 44 per cent false fraud claim rate. This is far better than MiDAS itself, but on its own not much better than flipping a coin. The real question in the case of MiDAS is whether any cost-effective fraud-detection system exists that’s both sensitive and specific.
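To make the terms concrete, here is a rough sketch of how the share of false flags differs from sensitivity and specificity. Only the 40,195 flagged claims and the roughly 85 per cent false share come from the investigation cited above. The counts of honest claims left alone and of fraud that went undetected are invented purely for illustration, since the source does not report them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    true_pos: int   # fraudulent claims flagged as fraud
    false_pos: int  # honest claims flagged as fraud
    true_neg: int   # honest claims left alone
    false_neg: int  # fraudulent claims the tool missed


def sensitivity(c: ConfusionCounts) -> float:
    """Share of real fraud that the tool catches."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Share of honest claims that the tool leaves alone."""
    return c.true_neg / (c.true_neg + c.false_pos)


def false_flag_share(c: ConfusionCounts) -> float:
    """Share of fraud flags that land on honest claims."""
    return c.false_pos / (c.false_pos + c.true_pos)


# Figures reported in the text.
FLAGGED = 40_195
FALSE_SHARE = 0.85
false_flags = round(FLAGGED * FALSE_SHARE)

# HYPOTHETICAL figures, not from the source: the number of honest claims
# never flagged and the number of fraudulent claims that were missed.
ASSUMED_TRUE_NEG = 400_000
ASSUMED_FALSE_NEG = 2_000

counts = ConfusionCounts(
    true_pos=FLAGGED - false_flags,
    false_pos=false_flags,
    true_neg=ASSUMED_TRUE_NEG,
    false_neg=ASSUMED_FALSE_NEG,
)

print(f"false flags:      {false_flags}")
print(f"false-flag share: {false_flag_share(counts):.2f}")
print(f"sensitivity:      {sensitivity(counts):.2f}")
print(f"specificity:      {specificity(counts):.2f}")
```

Under these assumed counts, a tool can leave most honest claimants alone and still be wrong about the large majority of the people it accuses, which is why the share of false flags and the two accuracy measures need to be read together.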

Moreover, an individualised right to human appellate review is likely to have unwelcome distributive effects. For example, different families dismayed by an exam prediction are unlikely to be similarly situated in their resources or sophistication. Some will be more capable of appealing than others. Without some well-intentioned advocacy group’s intervention, it’s likely that socioeconomic status and financial resources will correlate to the propensity to appeal. There’s no reason to assume that appeals will be made only in cases when the machine errs – or that a representative sample of errors will be appealed. Indeed, the appellate right’s effect might well be to cast concentrated error costs upon disadvantaged groups and communities.

In the educational context, this approach will limit, not expand, intergenerational mobility. Assume, for example, that A-level predictions were, as alleged, biased in favour of wealthier schools with fewer minority students. A system of individualised appeals would likely not only enable a disproportionate number of wealthier students to challenge deflated grades, but also leave in place the overrepresentation of wealthier, whiter students in the pool of initially higher grades. Hence, the permission to appeal individual grades will strongly tend toward more, not less, regressive outcomes than a pure machine decision – even if the latter is error-prone.

But there is another way of understanding the same appeals right: this is a complaint, not that I have been wrongly classified, but instead that the algorithm ranking me is characterised by a systematic failure of capacity or function. This isn’t so much a claim to a correct decision as a claim to be treated by a well-calibrated instrument. It’s this version of the appellate right that we should embrace.

Any decisional mechanism, whether human- or machine-operated, will generate errors. An individualised appeals mechanism might reduce the volume of errors. But it might also increase it. Imagine if it’s largely wealthy parents of students (rightly) receiving low grades who lodge objections: they might secure (false) upward corrections from flawed human decision-makers. The net error rate would rise. Therefore, an effective review mechanism needs to focus not solely on discrete cases but on the overall performance of the predictive tool, and its capacity for improvement. This programmatic and systemic right of appeal would tee up the question of whether the algorithm had been designed to produce a high error rate – as is plausibly the case with MiDAS. It would also consider how errors are distributed, and determine whether vulnerable populations are subject to disproportionate burdens. Finally, it would demand that an algorithm’s designer justify design choices in light of the best available technology in the field.
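The arithmetic behind that worry can be sketched with a toy audit. Every number below is invented: twenty students in two hypothetical groups, a binary outcome (so that overturning a decision simply flips it from right to wrong or wrong to right), and appeals lodged only by the wealthier group. It is an illustration of the reasoning, not a model of any real grading system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    group: str             # hypothetical label: "wealthier" or "poorer"
    machine_correct: bool  # was the machine's decision right?
    appealed: bool         # did the family lodge an appeal?
    overturned: bool       # did the human reviewer change the outcome?


def final_correct(case: Case) -> bool:
    # Assumes a binary outcome: an overturned decision is flipped, which
    # fixes a machine error but spoils a decision that was right.
    if case.appealed and case.overturned:
        return not case.machine_correct
    return case.machine_correct


def error_rate(cases: list[Case], after_appeals: bool) -> float:
    wrong = sum(
        1 for c in cases
        if not (final_correct(c) if after_appeals else c.machine_correct)
    )
    return wrong / len(cases)


def machine_error_rate_by_group(cases: list[Case]) -> dict[str, float]:
    # The systemic audit: how are the machine's own errors distributed?
    by_group: dict[str, list[Case]] = defaultdict(list)
    for c in cases:
        by_group[c.group].append(c)
    return {g: error_rate(cs, after_appeals=False) for g, cs in by_group.items()}


# HYPOTHETICAL data: only the wealthier group appeals, and most of its
# appeals challenge decisions that were in fact correct.
cases = (
    [Case("wealthier", True, False, False)] * 5
    + [Case("wealthier", True, True, True)] * 3    # false upward corrections
    + [Case("wealthier", False, True, True)] * 1   # genuine error fixed
    + [Case("wealthier", False, False, False)] * 1
    + [Case("poorer", True, False, False)] * 7
    + [Case("poorer", False, False, False)] * 3    # errors never appealed
)

print("error rate before appeals:", error_rate(cases, after_appeals=False))
print("error rate after appeals: ", error_rate(cases, after_appeals=True))
print("machine errors by group:  ", machine_error_rate_by_group(cases))
```

In this invented population the appeals fix one mistake and create three, so the overall error rate rises, while the machine's heavier error burden on the poorer group is left untouched. Looking at the tool's aggregate and per-group performance, rather than at whichever cases happen to be appealed, is what surfaces both facts.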

The right to a well-calibrated instrument is best enforced via a mandatory audit mechanism or ombudsman, and not via individual lawsuits. The imperfect and biased incentives of the tool’s human subjects mean that individual complaints provide a partial and potentially distorted picture. Regulation, rather than litigation, will be necessary to promote fairness in machine decisions.

For the most profound moral questions raised by human-to-machine transitions are structural and not individual in character. They concern how private and public systems reproduce malign hierarchies and deny rightful opportunities. Designed badly, a right to an appeal exacerbates those problems. Done well, it is a chance to mitigate them – reaping gains from technology for all rather than only some.
