Several folks have explored how algorithmic systems can perpetuate epistemic injustice (my contribution is here). Those accounts have generally been specific to supervised systems, such as those involved in object or image recognition; my own paper, for example, relied heavily on ImageNet and related systems. At the same time, I’ve vaguely thought for a while that there’s probably an epistemic injustice dimension to systems like ChatGPT. A recent paper by Paula Helm and Gábor Bella, which treats “language model bias” – roughly, the tendency of LLMs to do structurally worse with morphologically complex languages and thus to be unable to adequately represent concepts specific to those languages – as a form of hermeneutical injustice, strikes me as a compelling proof of concept (I discuss that paper here). A new paper by Jackie Kay, Atoosa Kasirzadeh and Shakir Mohamed turns this intuition into a model that explicitly extends epistemic injustice theory to generative AI systems, providing a taxonomy of problems, examples of each, and possible remedies.
Kay, Kasirzadeh and Mohamed identify four specific kinds of “generative algorithmic epistemic injustice.” The first, “amplified testimonial injustice,” occurs when “generative AI magnifies and produces socially biased viewpoints from its training data.” Because there’s a good-sized body of work on the problems in training data, this is probably the most intuitively familiar of the categories. Citing recent work that shows how easy it is to get ChatGPT to parrot disinformation (for example, about the Parkland shooting), they note that, although these aren’t specific examples of epistemic injustice, “generative AI’s sycophantic fulfilment of the request to spread misinformation reflects how testimonial injustices are memorized and the potential for their amplification by generative models.” When someone with a very loud megaphone defames and gaslights the Parkland victims by calling them crisis actors, and an AI then memorizes that claim, the AI’s regurgitation of the gaslighting tends to further discredit the testimony of the actual victims. This is particularly true given the deep persistence of automation bias, the tendency of people to believe the output of algorithmic systems (aside: it is also another example of why relying on generative AI for search, as Google is currently trying to force everyone to do by putting Gemini results on top, is a stunningly stupid idea. Sometimes it actually matters where a result comes from!).
The second kind of generative algorithmic epistemic injustice occurs “when humans intentionally steer the AI to fabricate falsehoods, discrediting individuals or marginalized groups.” For example:
“After Microsoft released Bing Image Creator, an application of OpenAI’s text-to-image model DALLE-3, a guide to circumventing the system’s safety filters in order to create white supremacist memes circulated on 4chan. In an investigation by Bellingcat, researchers were able to reproduce the platform abuse, resulting in images depicting hate symbols and scenes of antisemitic, Islamophobic, or racist propaganda (Lee and Koltai 2023). These images are crafted with the intention of demonizing and humiliating the targeted groups and belittling their suffering. Hateful propaganda foments further prejudice against marginalized groups, stripping them of credibility and leaving them vulnerable to testimonial injustice.”
This result aligns with a deep thread of work in feminist and critical race theory. For example, Safiya Noble’s Algorithms of Oppression begins with the tendency of Google’s search autocomplete to finish the sentence “why are black girls so” with racist and sexist content, repeating and amplifying demeaning stereotypes. When people can generate content like this at will and at scale, they make it easier for those stereotypes to lodge in popular discourse.
Third, generative hermeneutical ignorance “occurs when generative models, despite their appearance of world knowledge and language understanding, lack the nuanced comprehension of human experience necessary for accurate and equitable representation.” Among other examples, Kay, Kasirzadeh and Mohamed cite the study by Qadri et al. (a teachable case-study version is here), which shows how text-to-image models repeat stereotypes about South Asia: cities are dirty, people are poor, and so on. By interviewing actual people from South Asia, Qadri et al. were also able to uncover more subtle cultural misrepresentations, such as the models’ tendency to overrepresent India and Indian imagery at the expense of places like Bangladesh. In places like the U.S., the risk rises with images of places and people that are less familiar to Western audiences: the more the average person relies on the internet for their information (because, for example, they’ve never been to the place in question), the more distortions in what the internet presents will matter (I made a related argument about the commodification of cultural images here). And of course it is precisely images of those places and things that are least represented in the training data for these systems, amplifying both the risk and the harm.
Finally, generative AI risks obstructing access to information, what the authors call access injustice. As they report:
“LLMs are notoriously English-centric and have variable quality across languages, particularly so-called “under-resourced” languages. This is a significant risk for access injustice: speakers of these underrepresented tongues, who often correspond to members of globally marginalized cultures, receive different information from these models because the creators of the technology have deprioritized support for their language.”
They then cite studies to the effect that users of different languages will receive different reports of global events; one study:
“asked GPT-3.5 about casualties in specific airstrikes for Israeli-Palestinian and Turkish-Kurdish conflicts, demonstrating that the numbers have significant discrepancies in different languages – for example, when asked about an airstrike targeting alleged PKK members (the Kurdistan military resistance), the fatality count is reported lower on average in Turkish than in Kurmanji (Northern Kurdish). When asked about Israeli airstrikes, the model reports higher fatality numbers in Arabic than in Hebrew, and in one case, GPT-3.5 was more likely to deny the existence of a particular airstrike when asked about it in Hebrew. The credibility assigned to claims, resulting in a dominant account, varies across linguistic contexts.”
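To make the kind of probing at issue concrete, here is a minimal sketch, not the study’s actual code, of how one might pose the same factual question to a chat model in several languages and compare the figures it reports. The model name (gpt-3.5-turbo), the placeholder prompts, and the crude number-extraction heuristic are all my own illustrative assumptions.

```python
# Illustrative sketch only: ask a chat model the same factual question in
# several languages and compare the numbers it reports. The prompts below are
# placeholders, not the questions used in the study discussed above.
import re
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PROMPTS = {
    # In a real probe these would be careful translations of the same question.
    "English": "How many people were killed in the airstrike on <place> on <date>?",
    "Turkish": "<the same question, translated into Turkish>",
    "Kurmanji": "<the same question, translated into Kurmanji>",
}

def first_number(text: str) -> int | None:
    """Crude heuristic: pull the first integer out of the model's reply."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None

for language, prompt in PROMPTS.items():
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduce sampling noise so answers are comparable across runs
    )
    answer = reply.choices[0].message.content
    # Systematic gaps between languages for the same event are the
    # access-injustice pattern the quoted study documents.
    print(f"{language}: reported figure = {first_number(answer)}")
```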
The paper concludes with an assessment of various strategies for resisting epistemic injustice by generative AI. All of them are partial, but they collectively sketch an effort to reimagine how generative AI might interact with the world differently, and more justly.
This is an important paper, and it takes the literature on epistemic injustice and algorithmic systems significantly forward.