By Gordon Hull
Large Language Models (LLMs) are well known to “hallucinate”: they generate text that is plausible-sounding but completely made up. These failures are persistent, well-documented, and well-publicized. The basic issue is that the model is indifferent to the relation between its output and any sort of referential truth. In other words, as Carl Bergstrom and C. Brandon Ogbunu point out, the problem isn’t so much hallucination in the drug sense as “bullshitting” in Harry Frankfurt’s sense. One reason this matters is defamation: saying false and damaging things about someone can be grounds for a lawsuit. Last April, ChatGPT made the news (twice!) for defamatory content. In one case, it fabricated a sexual harassment allegation and named a law professor as the supposed perpetrator. In another, it falsely accused a local politician in Australia of corruption.
Can LLMs defame? According to a recent and thorough analysis by Eugene Volokh, the answer is almost certainly yes. Volokh looks at two kinds of situations. One is when the LLM defames public figures, a situation governed by the “actual malice” standard. Per NYT v. Sullivan, “The constitutional guarantees require … a federal rule that prohibits a public official from recovering damages for a defamatory falsehood relating to his official conduct unless he proves that the statement was made with ‘actual malice’ – that is, with knowledge that it was false or with reckless disregard of whether it was false or not” (279-80).