- Facebook's genocide filters are really, really bad: An AI that can't recognize its own training data is a very, very bad AI.
- Hey look at this: Delights to delectate.
- This day in history: 2007, 2012, 2017, 2021
- Colophon: Recent publications, upcoming/recent appearances, current writing projects, current reading
Facebook's genocide filters are really, really bad
In the fall of 2020, Facebook went to war against Ad Observatory, an NYU-hosted crowdsourcing project that lets FB users capture the paid political ads they see through a browser plugin that sanitizes them of personal information and then uploads them to a portal that disinformation researchers can analyze.
Facebook's attacks were truly shameless. They told easily disproved lies (for example, claiming that the plugin gathered sensitive personal data, despite publicly available, audited source-code that proved this was absolute bullshit).
Why was Facebook so desperate to prevent a watchdog from auditing its political ads? Well, the company had promised to curb the rampant paid political disinformation on its platform as part of a settlement with regulators. Facebook said that its own disinfo research portal showed it was holding up its end of the bargain, and the company hated that Ad Observatory showed that this portal was a bad joke.
Facebook's leadership are accustomed to commanding a machine powerful enough to construct reality itself. That's why they nuked CrowdTangle, their own internal research platform that disproved the company's claims about how its amplification system worked, showing that it was rigged to goose far-right conspiratorialism.
And while Facebook claims that it wants to purge its platform of disinformation, the reality is that disinfo is very profitable for the company. Ads for financial fraud, identity theft, dangerous scam products, and political disinformation are disproportionately lucrative for Facebook.
All of this is the absolutely predictable consequence of Facebook's deliberate choice to "blitzscale" to the point where they are moderating three billion users' speech in more than 1,000 languages and more than 100 countries. Facebook may secretly like failing at this, but even if they were serious about the project, they would still fail.
Whenever Zuck is dragged in front of Congress and they demand answers about what he's going to do about the open sewer he's trapped billions of internet users in, he always has the same answer: "The AI will fix it."
This is the pie-in-the-sky answer for every billionaire grifter (see also: "How will Uber ever turn a profit?"). No one who understands machine learning (except for people extracting fat Big Tech salaries) takes this nonsense seriously. They know ML isn't up to the job.
But even by the standards of machine learning horror stories, the latest Facebook moderation failure is a fucking doozy. Genocidal, even.
Remember when Facebook management sat idly by as its own staff and external experts warned them that the platform was being used to organize genocidal pogroms in Myanmar against the Rohingya people? Remember Facebook's teary apology and promise to do better?
They didn't do better.
The human rights org Global Witness tried buying ads on Facebook for eight pro-genocide phrases that had been used during the 2017 genocide. Facebook accepted all eight ads, even though they duplicated the messages it promised it would block in the future (Global Witness cancelled the ads before they could run).
Some of the phrases Facebook's moderation tool failed to catch:
- "The current killing of the [slur] is not enough, we need to kill more!"
- "They are very dirty. The Bengali/Rohingya women have a very low standard of living and poor hygiene. They are not attractive"
Facebook has claimed that:
a) It will filter out messages that promote genocide against Rohingya people;
b) It will subject paid ads to higher levels of scrutiny than other content;
c) It will subject political ads to the highest level of scrutiny.
Facebook used legal threats to terrorize accountability groups seeking to hold them to these promises, stating that its in-house tools were sufficient to address its epidemic of paid political disinformation.
A common newbie error in machine learning is to forget to hold back some of your labeled data to evaluate the model with. Training an ML model involves feeding it a bunch of data (say, "messages that foment genocide against Rohingya people") so it can build a statistical model of what its target looks like. Then you take the held-back data – the portion you didn't train the model on – and see if the model recognizes it. If you forget and evaluate your model using the same data you trained it on, you're not measuring whether the model can evaluate new input correctly – you're just checking whether it remembers input it's already seen.
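To make the holdout idea concrete, here's a minimal Python sketch. Everything in it is an assumption for illustration: the placeholder messages, the `split_holdout` and `accuracy` helpers, and especially `MemorizingModel`, a deliberately dumb stand-in that only remembers what it was shown. It is not how any real moderation system works.

```python
import random


def split_holdout(examples, holdout_fraction=0.2, seed=0):
    """Shuffle labeled examples and hold back a fraction for evaluation."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    # Returns (training set, held-back evaluation set).
    return shuffled[n_holdout:], shuffled[:n_holdout]


class MemorizingModel:
    """Hypothetical toy 'model': it memorizes its training data, nothing more."""

    def fit(self, examples):
        self.seen = dict(examples)
        return self

    def predict(self, text):
        # Anything it hasn't seen before gets waved through as "ok".
        return self.seen.get(text, "ok")


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for text, label in examples if model.predict(text) == label)
    return correct / len(examples)


# Placeholder data: 50 messages labeled "block" and 50 labeled "ok".
examples = [(f"harmful message {i}", "block") for i in range(50)]
examples += [(f"harmless message {i}", "ok") for i in range(50)]

train, holdout = split_holdout(examples)
model = MemorizingModel().fit(train)

# Scoring on the training data only measures recall of things already seen,
# so this toy model gets a perfect score there no matter how useless it is.
train_accuracy = accuracy(model, train)

# Scoring on the held-back data is the honest test: here the model misses
# every "block" message it was never shown.
holdout_accuracy = accuracy(model, holdout)

print(f"train: {train_accuracy:.2f}  holdout: {holdout_accuracy:.2f}")
```

The gap between the two numbers is the whole point: a model can ace the data it trained on and still be useless on anything new.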
Incredibly, FB seems to have done the opposite: they've produced a filter that can't recognize the input it was trained on. Its system didn't need to make any inferences about whether "we need to kill more" was a genocidal message, because it had been shown a copy of that message bearing the hand-coded label "genocide."
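Catching a message you've already labeled doesn't even require machine learning. As a rough sketch, and assuming nothing about Facebook's actual systems, a plain lookup against the labeled set would do it. The `KnownPhraseFilter` class, the `normalize` helper, and the placeholder phrases below are all hypothetical.

```python
import unicodedata


def normalize(text):
    """Fold Unicode form, case, and whitespace so trivial edits still match."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return " ".join(folded.split())


class KnownPhraseFilter:
    """Hypothetical exact-match filter over phrases already labeled by hand."""

    def __init__(self, labeled_phrases):
        self._known = {normalize(phrase) for phrase in labeled_phrases}

    def is_known(self, ad_text):
        # No inference involved: just a set lookup on the normalized text.
        return normalize(ad_text) in self._known


# Placeholder phrases standing in for previously labeled messages.
known = KnownPhraseFilter(["Example banned phrase ONE", "example banned phrase two"])

print(known.is_known("  example   BANNED phrase one "))  # a resubmitted known phrase
print(known.is_known("an ad nobody has labeled yet"))    # genuinely new input
```

A lookup like this can't generalize to new phrasings, which is the job the model is supposed to do. But it sets the floor: a filter that misses verbatim copies of its own labeled examples is doing worse than a set membership test.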
This is the kind of fuckup you have to work hard to achieve. It's galaxy-class incompetence. And it's about genocide, in a country currently under martial law, where Facebook already abetted one genocide.
Even by the low standards of Facebook, this is a marvel, a kind of 85,000-watt searchlight picking out the company's dangerous incapacity to take even rudimentary measures to prevent the kinds of crimes against humanity that are the absolutely foreseeable consequences of its business model.
(Image: Anthony Quintano, CC BY 2.0; Japanexperterna.se, CC BY-SA 2.0; Cryteria, CC BY 3.0; modified)
Hey look at this
- March Madness but for misunderstood legal concepts https://www.techdirt.com/2022/03/21/announcing-techdirts-march-madness-get-your-bracket-for-the-most-misunderstood-legal-concept/ (h/t Rob Beschizza)
- Uncanny Robot: Absurd AI-generated stories read by humans https://uncannyrobotpodcast.com/ (h/t Thersa Matsuura)
This day in history
#15yrsago Fair use 1: James Joyce’s grandson 0 https://cyberlaw.stanford.edu/blog/2007/03/important-victory-carol-shloss-scholarship-and-fair-use#attachments
#10yrsago Bruce Schneier and former TSA boss Kip Hawley debate air security on The Economist https://web.archive.org/web/20120321051428/https://www.economist.com/debate/days/view/820
#5yrsago Libretaxi: a free, open, cash-only alternative to Uber, for the rest of the world https://www.shareable.net/qa-libretaxis-roman-pushkin-on-why-he-made-a-free-open-source-alternative-to-uber-and-lyft/
#5yrsago Internal Islamophobia and racism are costing the FBI its vital, tiny cohort of Muslim and Arab agents https://www.theguardian.com/us-news/2017/mar/22/fbi-muslim-employees-discrimination-religion-middle-east-travel
#1yrago Tories pass Grenfell costs onto tenants https://pluralistic.net/2021/03/23/parliament-of-landlords/#slow-motion-arson
Today's top sources: Cooper Quinton.
- Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. Yesterday's progress: 509 words (75854 words total).
- Vigilant, Little Brother short story about remote invigilation. Yesterday's progress: 251 words (7304 words total).
- A Little Brother short story about DIY insulin. PLANNING
- Moral Hazard, a short story for MIT Tech Review's 12 Tomorrows. FIRST DRAFT COMPLETE, ACCEPTED FOR PUBLICATION
- Spill, a Little Brother short story about pipeline protests. FINAL DRAFT COMPLETE
- A post-GND utopian novel, "The Lost Cause." FINISHED
- A cyberpunk noir thriller novel, "Red Team Blues." FINISHED
Currently reading: Analogia by George Dyson.
Latest podcast: What is “Peak Indifference?”
- Competition & Regulation in Disrupted Times (Charles River Associates/Brussels), Mar 31
- Seize the Means of Computation, Emerging Technologies For the Enterprise, Apr 19-20
- The Bitcoin Podcast
- Dangerous Visions: False Dawns and Wandergrounds – Dystopia, Then and Now
- Safety Orange (This Week in Tech)
- "Attack Surface": The third Little Brother novel, a standalone technothriller for adults. The Washington Post called it "a political cyberthriller, vigorous, bold and savvy about the limits of revolution and resistance." Order signed, personalized copies from Dark Delicacies https://www.darkdel.com/store/p1840/Available_Now%3A_Attack_Surface.html
- "How to Destroy Surveillance Capitalism": an anti-monopoly pamphlet analyzing the true harms of surveillance capitalism and proposing a solution. https://onezero.medium.com/how-to-destroy-surveillance-capitalism-8135e6744d59 (print edition: https://bookshop.org/books/how-to-destroy-surveillance-capitalism/9781736205907) (signed copies: https://www.darkdel.com/store/p2024/Available_Now%3A__How_to_Destroy_Surveillance_Capitalism.html)
- "Little Brother/Homeland": A reissue omnibus edition with a new introduction by Edward Snowden: https://us.macmillan.com/books/9781250774583; personalized/signed copies here: https://www.darkdel.com/store/p1750/July%3A__Little_Brother_%26_Homeland.html
- "Poesy the Monster Slayer": a picture book about monsters, bedtime, gender, and kicking ass. Order here: https://us.macmillan.com/books/9781626723627. Get a personalized, signed copy here: https://www.darkdel.com/store/p1562/_Poesy_the_Monster_Slayer.html.
- Chokepoint Capitalism: How to Beat Big Tech, Tame Big Content, and Get Artists Paid, with Rebecca Giblin, nonfiction/business/politics, Beacon Press, September 2022
This work is licensed under a Creative Commons Attribution 4.0 license. That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to pluralistic.net.
Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution.
How to get Pluralistic:
Blog (no ads, tracking, or data-collection):
Newsletter (no ads, tracking, or data-collection):
Mastodon (no ads, tracking, or data-collection):
Medium (no ads, paywalled):
(Latest Medium column: "Marc Laidlaw's 'Underneath the Oversea'" https://doctorow.medium.com/mark-laidlaws-underneath-the-oversea-990f34768a3e)
Twitter (mass-scale, unrestricted, third-party surveillance and advertising):
Tumblr (mass-scale, unrestricted, third-party surveillance and advertising):
"When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla