By Gordon Hull
Not long ago, Google summarily dumped Timnit Gebru, one of its lead AI researchers and one of the few Black women working in AI. Her coauthor Emily Bender has now posted the paper (to be presented this spring) that apparently caused all the trouble. It should be required reading for anybody who cares about the details of how AI and data systems can perpetuate racism, or who cares more generally about the social implications of brute-force approaches to AI. Bender and Gebru take up a common approach to natural-language processing (NLP), which involves an AI system learning how to anticipate what speech is likely to follow a given unit of speech. If, for example, I say “Hello, how are,” the system learns by studying a dataset of existing phrases and text snippets that the next word is likely to be “you,” but almost certainly will not be “ice cream.” How good the computer gets at this game is going to be substantially determined by the quantity and quality of its training data, i.e., the text that it examines.
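To make the prediction game concrete, here is a minimal sketch in Python – a toy trigram counter over a made-up corpus, not the massive neural models the paper is about, but the same basic objective of learning likely continuations from training text. The corpus, variable names, and function here are purely illustrative.

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word follows each two-word
# context in a tiny, made-up training corpus.
corpus = [
    "hello how are you today",
    "hello how are you doing",
    "how are you feeling",
    "i like ice cream",
]

next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(len(words) - 2):
        context = (words[i], words[i + 1])
        next_word_counts[context][words[i + 2]] += 1

def predict_next(w1, w2):
    """Return the most frequent continuation of the context (w1, w2)."""
    counts = next_word_counts[(w1, w2)]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("how", "are"))  # -> 'you', never 'ice' or 'cream'
```

A large neural language model does something far more sophisticated than counting, but the point carries over: whatever continuations dominate the training text are the continuations the system learns to reproduce.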
Bender and Gebru outline the social costs of one approach to this problem, which is basically brute force. As processing power increases, it’s possible to train computers with larger and larger datasets, and the use of larger datasets reliably improves system performance. But should we be doing that? Bender and Gebru detail several kinds of problems. The first is environmental justice: all that processing power uses a lot of energy. Although some of it may come from carbon-neutral sources, the net climate cost is significant. Worse, the NLP systems being produced don’t benefit the people who will suffer the most from climate change. As they memorably put it:
“Is it fair or just to ask, for example, that the residents of the Maldives (likely to be underwater by 2100) or the 800,000 people in Sudan affected by drastic floods, pay the environmental price of training and deploying ever larger English LMs, when similar large-scale models aren’t being produced for Dhivehi or Sudanese Arabic?”
These global inequities are reproduced locally, as:
“most language technology is built to serve the needs of those who already have the most privilege in society. Consider, for example, who is likely to both have the financial resources to purchase a Google Home, Amazon Alexa or an Apple device with Siri installed and comfortably speak a variety of a language which they are prepared to handle.”
Spoiler: it’s probably not the flood victims in Sudan, nor members of minority communities in the U.S., even though the latter collectively suffer disproportionately from climate change.
The second kind of problem is one of social disparity and bias. Here, the problem Bender and Gebru outline requires a somewhat closer look. Early enthusiasts of data science point to the size of datasets as a virtue: if you are worried about whether n is large enough to prove anything, then “n=everything” appears to be a way forward. Certainly the scale can be jaw-dropping, as for example the (randomized, controlled!) study that looked at the diffusion of emotion through Facebook, with n=689,000. Machine learning algorithms also learn from big datasets, as for example the one that learned to diagnose depression from a set of nearly 44,000 Instagram photos. More is better! Except. One needs to ask where all this training data comes from, and how it’s been curated. If the Facebook emotional-contagion study avoids this problem to a degree – after all, it was studying social media (and, just to be clear, it did not involve a machine-learning algorithm) – work that uses online data to make inferences about the entire world needs more careful scrutiny. Bender and Gebru identify problems specific to NLP at several levels.
First, not everybody is online, so models of language that are trained on Internet speech tend to learn the speech of younger, whiter and more affluent people. But even taking that into account, there are problems. Much of the data comes from fora like Reddit, which skews it even younger and more male. Or it comes from Twitter, where nobody knows how tweets are selected for the partial firehoses, where abuse of women and minorities is rampant, and where there are documented cases of victims of online harassment being suspended while their harassers are allowed to continue posting. Thus:
“The net result is that a limited set of subpopulations can continue to easily add data, sharing their thoughts and developing platforms that are inclusive of their worldviews; this systemic pattern in turn worsens diversity and inclusion within Internet-based communication, creating a feedback loop that lessens the impact of data from underrepresented populations.”
Because the datasets rely on big, readily available sources like Wikipedia, Reddit and Twitter, the contributions of minorities – who tend to gravitate to friendlier places – are also less likely to be included. And all of this is before the filtering and curating algorithms get to the data: they often filter out sex words, which tends to suppress LGBTQ sites (aside: this is an ongoing problem), and they also filter out speech that tries to reclaim various slurs. Bender and Gebru conclude:
“At each step, from initial participation in Internet fora, to continued presence there, to the collection and finally the filtering of training data, current practice privileges the hegemonic viewpoint. In accepting large amounts of web text as ‘representative’ of ‘all’ of humanity we risk perpetuating dominant viewpoints, increasing power imbalances, and further reifying inequality.”
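To see how crude that filtering step can be, here is a hypothetical sketch of the kind of keyword-based cleanup described above. The blocklist entries and documents are invented placeholders, not drawn from the paper or any real pipeline.

```python
# Hypothetical blocklist filter: any document containing a listed token is
# dropped wholesale, regardless of who wrote it or whether the term is being
# reclaimed, quoted, or discussed.
BLOCKLIST = {"slur_a", "slur_b", "explicit_term"}  # placeholder tokens

def keep_document(text: str) -> bool:
    """Keep a document only if it contains no blocklisted token."""
    return BLOCKLIST.isdisjoint(text.lower().split())

documents = [
    "a news story containing none of the listed terms",
    "a community post reclaiming slur_a as an in-group identity term",
]
kept = [doc for doc in documents if keep_document(doc)]
# Only the first document survives; the reclaiming post is silently discarded.
```

A filter like this has no notion of speaker or context, so a post reclaiming a slur is discarded just as readily as abuse is, which is one more way the voices of the communities most affected end up underrepresented in the training data.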
The remedy is not to add more and more data, because that generates more of the same, worsening the problem. Adding data also tends to bury the problem deeper, because of “documentation debt,” the difficulty of explaining where an ever-larger set of data came from and how it was selected and curated. The better strategy is to be intentional about curation:
“We instead propose practices that actively seek to include communities underrepresented on the Internet. For instance, one can take inspiration from movements to decolonize education by moving towards oral histories due to the overrepresentation of colonial views in text, and curate training datasets through a thoughtful process of deciding what to put in, rather than aiming solely for scale and trying haphazardly to weed out, post-hoc, flotsam deemed ‘dangerous’, ‘unintelligible’, or ‘otherwise bad.’”
Finally, there’s the stochastic parrots problem. Big language models don’t produce intentional speech. Their output represents the system getting very, very good at guessing what words and phrases come next. As they get better and better at that, they sound more and more fluent – tending to obscure the fact that they are not intentional agents. Nobody would think the old ELIZA program was a real therapist. But the output of these language models can be disarmingly realistic:
Well I did ask @keithfrankish and... I think I might have triggered the first artificial existential crisis! https://t.co/ObGObTQ6Tv pic.twitter.com/ZHKkmlHZQy
— Raphaël Millière (@raphamilliere) July 26, 2020
Not only that, as Dylan Wittkower points out, the interfaces of systems like Amazon’s Alexa all but require that we treat them as if they have intentional states. So there’s a tendency to treat this parroting as if it’s speech. And that’s a problem, because the parrot only learned to talk from dubious sources. There are real-world impacts:
“An LM that has been trained on such data will pick up these kinds of problematic associations. If such an LM produces text that is put into the world for people to interpret (flagged as produced by an ‘AI’ or otherwise), what risks follow? In the first instance, we foresee that LMs producing text will reproduce and even amplify the biases in their input. Thus the risk is that people disseminate text generated by LMs, meaning more text in the world that reinforces and propagates stereotypes and problematic associations, both to humans who encounter the text and to future LMs trained on training sets that ingested the previous generation LM’s output. Humans who encounter this text may themselves be subjects of those stereotypes and associations or not. Either way, harms ensue: readers subject to the stereotypes may experience the psychological harms of microaggressions and stereotype threat. Other readers may be introduced to stereotypes or have ones they already carry reinforced, leading them to engage in discrimination (consciously or not), which in turn leads to harms of subjugation, denigration, belittlement, loss of opportunity and others on the part of those discriminated against.”
And these risks are all the more frightening if somebody uses the system maliciously:
“McGuffie and Newhouse show how GPT-3 could be used to generate text in the persona of a conspiracy theorist, which in turn could be used to populate extremist recruitment message boards. This would give such groups a cheap way to boost recruitment by making human targets feel like they were among many like-minded people. If the LMs are deployed in this way to recruit more people to extremist causes, then harms, in the first instance, befall the people so recruited and (likely more severely) to others as a result of violence carried out by the extremists.”
There’s more in the paper, and it’s all carefully documented. It’s a must-read, and more work like it is urgent. As Bender and Gebru conclude, “work on synthetic human behavior is a bright line in ethical AI development, where downstream effects need to be understood and modeled in order to block foreseeable harm to society and different social groups. Thus what is also needed is scholarship on the benefits, harms, and risks of mimicking humans and thoughtful design of target tasks grounded in use cases sufficiently concrete to allow collaborative design with affected communities.”
Even if Google isn’t interested in hearing about it, much less doing it.