By Gordon Hull
There’s an emerging literature on Large Language Models (LLMs, like ChatGPT) that basically argues that they undermine a bunch of our existing assumptions about how language works. As I argued in a paper a year and a half ago, there’s an underlying Cartesianism in a lot of our reflections on AI, which relies on a mind/body distinction (people have minds, other things don’t) and then takes language use as sufficient evidence that one’s interlocutor has a mind. As I argued there, part of what makes LLMs so alarming is that they clearly do not possess a mind, but they do use language. So they’re the first examples we have of artifacts that can use language; language use is no longer sufficient to indicate mindedness. In that paper, I drew the implication that we need to abandon our Cartesianism about AI (caring whether it “has a mind”) and become more Hobbesian (thinking about the sociopolitical and regulatory implications of language-producing artifacts). Treating LLMs as the origin points of speech has real risks, including making the human labor that produces them invisible and making it harder to impose liability, since machines can’t meet a standard scienter requirement for assigning tort liability.
Here I want to take up a somewhat different thread, one that I started exploring a while ago under the general topic of iterability in language models. This thread takes the literature on language models seriously; where I want to go with it is to talk about an under-discussed latent Platonism in how we tend to approach language (and language models). I’ll start with the literature, which divides into two strands: a Wittgensteinian one and a Derridean one.
1. The Wittgensteinian Rejection of Cartesian AI
Lydia Liu makes the case for a direct Wittgensteinian influence on the development of ML, via the Cambridge researcher Margaret Masterman. I only ran into this work recently, so on the somewhat hubristic assumption that other folks in philosophy also don’t know it, I’ll offer a basic summary here (in my defense: Liu says that “the news of AI researchers’ longtime engagement with Wittgenstein has been slow to arrive.” She then adds that “the truth is that Wittgenstein’s philosophy of language is so closely bound up with the semantic networks of the computer from the mid-1950s down to the present that we can no longer turn a blind eye to its embodiment in the AI machine” (Witt., 427)).
Liu’s thesis is that “Wittgenstein inspired a group of researchers called Cambridge Language Research Unit (CLRU) in Britain to launch one of the first programs in machine translation, information retrieval, mechanical abstracting, and so on in the 1950s, all of which are now claimed for AI and cognitive science” (Witt., 428). The doxographic evidence supporting it is solid. Masterman was a student of Wittgenstein, and he apparently dictated the Blue Book lectures to her. A year after Wittgenstein’s Philosophical Investigations appeared, Masterman published a paper, “Words,” that asks what the meaning of a word is, and then dives headfirst into a recognizably Wittgensteinian narrative. Masterman focuses on the problem of words that have the same written inscription (“ward”) but multiple meanings (hospital ward, ward off, etc.). Are these instances of a single word with multiple senses, or are they multiple words with a single sign? Liu notes:
“That conundrum is by no means an idle issue. Masterman’s research group at CLRU learned it the hard way within a couple of years when they embarked on the computational research on machine translation and information retrieval. The undecidability of word in either determination—a single word with multiple senses or multiple words unified by a single sign—became an endless source of frustration and challenge for them, trumping all other difficulties. The OED turned out to be the least helpful template when the researchers ran into technical difficulties while mining data from pedestrian language use or when they found themselves overwhelmed by the ubiquity of word-concept entanglement in the machine” (Witt., 435-6).
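To see why this undecidability bites computationally, it may help to sketch the distributional move that (as I’ll note at the end) descends from this line of work. The toy Python example below is entirely my illustration, not anything CLRU actually ran: it represents each occurrence of “ward” by a bag of its neighboring words and compares occurrences by cosine similarity. The corpus, the window size, and the function names are all invented for the sketch.

```python
# Toy sketch of the distributional idea (my illustration, not CLRU's method).
# Each occurrence of "ward" is represented by a bag of nearby words, and
# occurrences are compared by cosine similarity. The corpus is invented.
from collections import Counter
from math import sqrt

sentences = [
    "the nurse walked through the hospital ward at night",  # hospital sense
    "the doctors in the hospital ward made the rounds",     # hospital sense
    "garlic is said to ward off vampires",                  # ward-off sense
    "the charm was meant to ward off evil spirits",         # ward-off sense
]

def context_vector(sentence, target="ward", window=3):
    """Bag of words within `window` positions of the first occurrence of target."""
    words = sentence.split()
    i = words.index(target)
    return Counter(words[max(0, i - window):i] + words[i + 1:i + 1 + window])

def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    nu = sqrt(sum(n * n for n in u.values()))
    nv = sqrt(sum(n * n for n in v.values()))
    return dot / (nu * nv)

vecs = [context_vector(s) for s in sentences]
for i in range(len(vecs)):
    for j in range(i + 1, len(vecs)):
        print(i, j, round(cosine(vecs[i], vecs[j]), 2))
# Same-sense pairs (0,1) and (2,3) score well above the cross-sense pairs,
# which here share no context words at all.
```

On this picture, “ward” in the hospital sentences and “ward” in the apotropaic ones fall into two distributional neighborhoods that never touch, and nothing in the machine settles whether that is one word with two senses or two words under one sign, which is precisely the undecidability Liu describes.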
The resulting research agenda was completely different from the more familiar one pursued in the U.S. Liu cites Chomsky’s emphasis on syntactic structures; it seems to me that one might equally point to the entire research program that led to what now gets called “good old-fashioned AI,” which was top-down, rule-driven, and not so successful once you got past the limited domain of finite, rule-governed systems. It was this approach to AI that Hubert Dreyfus spent much of his career opposing, on phenomenological grounds (see here and his retrospective here). The approach also provoked Searle’s (in)famous “Chinese Room” argument, which purported to show that a computer could not understand language.
Liu has a lot to say about the weirdness of Searle’s argument in another paper. There, she notes that the thought experiment trades on a strange metaphysical separation (concretized by a wall with a mail slot) between the minds of a native Chinese speaker and a native English speaker. Liu notes:
“This picture of the world … is so fundamental to Searle’s argument about understanding that without it the philosopher would not be able to authorize the place of semantic legibility where understanding does take place. And where does it take place? By his intuition, it takes place inside the mind of a native speaker, English or Chinese, not outside it or between minds, much less in the digital computer. That intuition, however, begs the question: Why should native speakers of a language have any bearing on the fact of understanding in the Chinese Room or outside it?” (Turing, 15)
Yet if you were to imagine someone bilingual in the room, or someone who speaks some Chinese but not natively, “it follows that the segregation of mental spaces would instantly crumble” (Turing, 15). As a result, “it becomes clear that the source of Searle’s trouble with AI is not the computer or the robot but a monolingual nativism that structures the philosopher’s relationship to the world” (Turing, 16). It is precisely this sort of separation that the Wittgenstein-inspired translational projects pursued in Masterman’s lab tried to move beyond.
For Liu, Masterman pursues an agenda that moves beyond even Derridean efforts, and she argues that “Masterman is the first modern philosopher to push the critique of Western metaphysics beyond what is possible by the measure of alphabetical writing, and, unlike deconstruction, her translingual philosophical innovation refuses to stay within the bounds of self-critique” (Witt., 444). Or, as she remarks in the other paper, Quine, Derrida, and Saussure have started to theorize all of this, “but they have not theorized it with any degree of satisfaction with respect to the actually existing language data” (Turing, 19).
Masterman’s strategy also moves beyond Wittgenstein:
“What she did was turn the cognitive limitations of the computer—that is, the challenges involved in the programming of the MT machine to distinguish amongst ward, ward, and ward as different words or as different senses of one word—into a distinct advantage to achieve greater philosophical clarity about the entanglement of word and concept in human languages.” (Witt., 437)
Whereas Derrida goes after logocentrism via a critique of Plato and liminal concepts like différance, Masterman uses classical Chinese to undermine Western assumptions about what a “word” is in the first place. This requires rethinking what Chinese ideographic writing is doing. Rather than seeing it as a prior stage to a more sophisticated alphabetic writing (this is Hegel’s view), “ideographic writing operates on combinatory logic, not propositional logic. The difference between combinatory logic and propositional logic carries tremendous importance for her. To investigate the logical forms in classical Chinese is to look for the rules of combination of ideographic clues or visual hints, and this is conceivable only when one ceases to think of ideographs in classical Chinese as pictorial representations of objects or icons by resemblance” (Witt., 439). In other words, Masterman’s move is a fundamental rethinking of how writing/text is organized, based on the study of a non-Western writing system.
Next time I’ll look at Masterman’s strategy itself, which is the forerunner of the distributional account of language behind LLMs.