Pluralistic: 02 Dec 2021

Today's links

A rear view of a beefy cop with a prominent sidearm, leaning indolently on a phrenology bust whose eye has been replaced with the glowing red eye of HAL9000 from 2001: A Space Odyssey. One of the labelled sections of the phrenology bust has been replaced with the logo for Predpol.

Massive Predpol leak confirms that it drives racist policing

When you or I seek out evidence to back up our existing beliefs and ignore the evidence that shows we're wrong, it's called "confirmation bias." It's a well-understood phenomenon that none of us are immune to, and thoughtful people put a lot of effort into countering it in themselves.

But confirmation bias isn't always an unconscious process. Consultancies like McKinsey have grown to multibillion-dollar titans by offering powerful people confirmation bias as a service: pay them enough and they'll produce a fancy report saying whatever you want to do is the best thing you possibly could do.

A sizable fraction of the machine learning bubble is driven by this phenomenon. Pay a machine learning company enough money and they'll produce a statistical model that proves that whatever terrible thing you're doing is empirical and objective and true. After all, "Math doesn't lie."

The best term I've heard for this is "empirical facewash." I learned that term from Patrick Ball, in a presentation on the Human Rights Data Analysis Group's outstanding study of racial bias in predictive policing tools.

(Sidenote: HRDAG just won the prestigious Rafto Prize, a major international award for human rights work)

(And on that note, HRDAG is a shoestring operation that turns our tax-deductible donations into reliable statistical accounts of human rights abuses that are critical to truth and reconciliation, human rights tribunals, and trials for crimes against humanity. I am an annual donor and you should consider them in your giving, too)

Here's how Ball described that predictive policing research: everybody knows that cops have a racist policing problem, but we don't all agree on what that problem is. You and I might think that the problem is that cops make racially motivated arrests, while the cops and their apologists think the problem is that we think the cops make racially motivated arrests.

By feeding crime data into a machine learning model, and then asking it to predict where crime will take place based on past patterns of crime data, cops can get an "objective" picture of where to concentrate their policing activities.

But this has major problems. First, it presumes that crime stats are objective – that everyone reports crime at the same rate, and that the police investigate suspects at the same rate. In other words, this starts from the presumption that there is no racial bias in crime statistics – and then uses that presumption to prove that there is no racial bias in crime statistics!

Second, this presumes that undetected crimes are correlated with detected ones. In other words, if the cops detect a lot of crime in a poor neighborhood – and not in a rich neighborhood – then all the undetected crimes are also in those poor neighborhoods.

Finally, this presumes that every crime is a crime! In other words, it presumes that there are no de facto crimes like "driving while brown" or "walking your dog while black." Some of the "crimes" in the crime stats aren't actually crimes – rather, they're pretextual stops that turn into plea deals after bullying prosecutors threaten a long prison sentence.
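To make the feedback loop concrete, here is a toy sketch. It is not Predpol's algorithm and not HRDAG's model. The function name, the numbers, and the "send every patrol to the top hotspot" rule are all assumptions for illustration. Two neighborhoods have identical true crime, but one starts with more recorded crime because it was policed more heavily in the past.

```python
from typing import List


def simulate_feedback(
    initial_recorded: List[float],
    true_crime: List[float],
    rounds: int,
    detection_rate: float = 0.5,
) -> List[float]:
    """Toy model (hypothetical): patrols follow recorded crime, and
    crime is only recorded where patrols are sent."""
    recorded = [float(x) for x in initial_recorded]
    for _ in range(rounds):
        # "Prediction": the area with the most recorded crime is the hotspot.
        hotspot = max(range(len(recorded)), key=lambda i: recorded[i])
        # All patrols go to the hotspot, so only its crime gets detected.
        recorded[hotspot] += detection_rate * true_crime[hotspot]
    return recorded


# Same true crime in both places; only the historical record differs.
recorded = simulate_feedback([60, 40], [100, 100], rounds=10)
print(recorded)  # [560.0, 40.0]
```

Under these assumptions, the neighborhood that started with 60% of recorded crime ends up with more than 93% of it after ten rounds, even though nothing about the underlying crime ever differed. The model never gets a chance to learn it was wrong, because nobody is looking in the other neighborhood.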

HRDAG's work crystallized the critique of machine learning as a tool for correcting systemic bias, and it has been my touchstone for understanding other bias-reinforcing/bias-accelerating machine learning scandals. It's a critical adjunct to such foundational texts as Cathy O'Neil's "Weapons of Math Destruction" and Virginia Eubanks's "Automating Inequality."

But you don't need to look to outside sources for evidence that predictive policing reinforces and accelerates racial bias. The founders of Predpol – the leading predictive policing tool – came to the same conclusion in 2018, but decided not to do anything about it.

That may sound shocking, but really, it's par for the course with Predpol, which rebranded itself as Geolitica in order to distance itself from a string of scandals and bad publicity.

From the start, Predpol has wrapped its operations in secrecy, pressuring police forces to hide their use of the service from city officials and residents. Back in 2018, a security researcher provided me with a list of cities that seemed to have secretly procured Predpol services.

In 2019, Motherboard's Caroline Haskins used that report to extract even more information about the cops' secret deals with Predpol.

All this started because my source was able to learn that these cities were experimenting with Predpol's digital phrenology due to basic cybersecurity errors the company had made.

Predpol's cybersecurity has not improved since. A team of reporters from Gizmodo and The Markup just published a blockbuster report on Predpol's role in biased policing, using the largest Predpol leak in history. Data regarding 5.9 million Predpol predictions was left on an unsecured server!

The Markup/Gizmodo team used that dataset to conduct a massive study on racial bias in predictive policing. As American University's Andrew Ferguson put it, "No one has done the work you guys are doing, which is looking at the data." This is "striking because people have been paying hundreds of thousands of dollars for this technology for a decade."

Ferguson's point – that public millions have been poured into an experimental technology without any external validation – is important. After all, it's unlikely that cops and Predpol keep this stuff a secret from us because they know we'll love it and they don't want to ruin the pleasant surprise.

On the other hand, it makes perfect sense if Predpol is really selling empirical facewash – that is, confirmation bias as a service.

That certainly seems to be the case based on the analysis published today. Police who rely on Predpol do less patrolling in white and affluent neighborhoods (these are pretty much the same neighborhoods in most of the USA, of course). But when it comes to communities of color and poor communities, Predpol predictions send cops flooding in: "A few neighborhoods in our data were the subject of more than 11,000 predictions."

When Predpol sends cops into your neighborhood, arrests shoot up (you find crime where you look for it), as does use of force. This has knock-on effects – for example, the reporters tell the story of Brianna Hernandez, who was evicted, along with her two young children, from low-income housing. Her partner was stopped in his car while dropping off some money for her. He had an old court injunction barring him from being on the premises because of a crime he committed 14 years earlier, while he was a minor. The housing complex had a policy of evicting tenants who associated with people who committed crimes, and it had been the target of a flood of Predpol predictions.

Brianna doesn't know if she and her children were made homeless because of a Predpol prediction, thanks to the secrecy Predpol and its customers hide behind.

Robert McCorquodale – the Calcasieu Parish, LA sheriff's attorney who handles public records requests – refused to confirm whether they used Predpol, despite the fact that the data leak clearly confirmed they did. He cited "public safety and officer safety" and speculated that if criminals knew Predpol was in use, they'd be able to outwit it: "I feel this is not a public record."

Unsurprisingly, police reform advocates in six of the cities where Predpol was in use didn't know about it: "Even those involved in government-organized social justice committees said they didn’t have a clue about it."

All this secrecy helps hide the fact that a) Predpol is expensive and b) it doesn't work. But many police departments are wising up. LAPD was Predpol's first big reference customer – and they stopped using it in 2020, citing financial constraints and a damning Inspector General report.

It's not just LA. Santa Cruz – the birthplace of Predpol – also fired the company last year.

There's a limited pool of mathematicians who can produce the kind of convincing confirmation bias as a service that Predpol sells, and it's shrinking. 1,400 mathematicians have signed an open letter "begging their colleagues not to collaborate on research with law enforcement, specifically singling out Predpol."

(Image: Science Museum London, CC BY 4.0; Cryteria, CC BY 3.0; modified)

This day in history

#20yrsago What is Dean Kamen's "IT"?

#5yrsago British politicians exempt themselves from warrantless spying under the Snoopers Charter

#5yrsago Bernie Sanders: Trump just used your taxes to reward Carrier for offshoring American jobs

#1yrago Nalo Hopkinson, Science Fiction Grand Master

Colophon

Today's top sources: Julia Angwin

Currently writing:

  • Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. Yesterday's progress: 523 words (44604 words total).

  • A short story for MIT Tech Review's 12 Tomorrows PLANNING

  • A Little Brother short story about remote invigilation. PLANNING

  • A Little Brother short story about DIY insulin. PLANNING

  • Spill, a Little Brother short story about pipeline protests. SECOND DRAFT COMPLETE

  • A nonfiction book about excessive buyer-power in the arts, co-written with Rebecca Giblin, "The Shakedown." FINAL EDITS

  • A post-GND utopian novel, "The Lost Cause." FINISHED

  • A cyberpunk noir thriller novel, "Red Team Blues." FINISHED

Currently reading: Analogia by George Dyson.

Latest podcast: Jam To-Day

Upcoming books:

  • The Shakedown, with Rebecca Giblin, nonfiction/business/politics, Beacon Press 2022

This work licensed under a Creative Commons Attribution 4.0 license. That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to

Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution.

How to get Pluralistic:

Blog (no ads, tracking, or data-collection):

Newsletter (no ads, tracking, or data-collection):

Mastodon (no ads, tracking, or data-collection):

Medium (no ads, paywalled):

(Latest Medium column: "Give Me Slack.")

Twitter (mass-scale, unrestricted, third-party surveillance and advertising):

Tumblr (mass-scale, unrestricted, third-party surveillance and advertising):

"When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla