• Last time, I set up a question about Foucault’s anti-humanism.  His comments in Order of Things are famous, and the recent publication of a 1954-5 lecture course he delivered at Lille as La question anthropologique offers a chance to think about the evolution of his thought on the subject.  One clue that something is different is that Ludwig Feuerbach, one of the “Young Hegelians” in Marx’s early-career circle, figures prominently in the 1950s version but not the one ten years later, even though Feuerbach’s name was prominently associated with objectionable humanism by Foucault’s teacher Althusser at the time Order of Things appeared.

    I want to approach the questions this poses not by asking where Feuerbach went – I don’t really have any evidence on that either way (yet?) – but by asking where Feuerbach came from in the 1950s.  Recent scholarship offers some really interesting work on that question.  If one were to ask where Foucault got the idea of anti-humanism, Heidegger would be an obvious starting point.  As Arianna Sforzini suggests in her introduction to La question anthropologique, “Foucault is in agreement with the observation formulated by Heidegger from 1929: ‘anthropology today is no longer, and hasn’t for a long time, just been the title of a discipline.’” (235; the Heidegger reference is to his Kant and the Problem of Metaphysics, p. 147 in the English. Original: GA 5, 209).

    We know that Foucault had read a lot of Heidegger.  Jean-Baptiste Vuillerod’s recent La naissance de l’anti-hégélianisme, about which much more later, reports that “we find in the Foucault archives hundreds of pages of notes taken on Heidegger, which he read in German.”  In box 33a-0, for example, “we find long commentaries, translations and paraphrases of the following texts:” What Is Called Thinking?, Letter on Humanism, “Who Is Nietzsche’s Zarathustra?,” “Building Dwelling Thinking,” “Nietzsche’s Word: ‘God Is Dead’,” “Overcoming Metaphysics,” “The Age of the World Picture,” “Anaximander’s Language,” and “a series of citations on the principal Heideggerian concepts.”

    (more…)
  • Foucault published Madness and Civilization in 1961; before that, there was relatively little published work, and his early-career work of the 1950s has been neglected until quite recently.  Some of it is starting to appear, in particular work that he did at the University of Lille: two manuscripts, one on Binswanger and Existential Analysis and one on Phenomenology and Psychology, and a course on Anthropology.

    The Anthropology course, La question anthropologique, is of obvious interest because it can help to provide some backstory to Foucault’s anti-anthropology chapter in Order of Things, in which he ties anthropology to humanism as a historical moment whose time is passing.  As he writes there, “man is neither the oldest nor the most constant problem that has been posed for human knowledge” and was made possible only by larger epistemic arrangements.  The dissolution of that episteme would famously lead to the disappearance of the problem:

    “If those arrangements were to disappear as they appeared, if some event of which we can at the moment do no more than sense the possibility – without knowing either what its form will be or what it promises – were to cause them to crumble, as the ground of Classical thought did, at the end of the eighteenth century, then one can certainly wager that man would be erased, like a face drawn in sand at the edge of the sea” (423).

    That was 1966.  The Anthropology course consists of lectures Foucault gave in late 1954 and early 1955 at Lille.  Broadly, as Arianna Sforzini writes in the introduction to the lectures,

    (more…)
    Last time, I looked at Derrida’s Gift of Death to understand the logic of sacrifice there.  Briefly, the decision to do one thing involves sacrificing all of the other things one could do.  So when I choose to feed this cat, I sacrifice all the other cats.  My ethics are impeccable, but the decision to prefer one cat over all the others is one that cannot be ultimately justified.  This is the lesson Derrida takes from Kierkegaard’s Abraham.  I then suggested that Derrida thinks a similar logic works in language, with evidence from passages where he suggests that speaking here and now in a certain language (French, in his case and examples) involves not speaking in other ways and other languages.  As he says in Grammatology, the justification of a particular discourse is only possible on historical grounds, not absolute ones.

    What does any of this have to do with language models?  A viable chatbot does a lot more than next-token prediction.  I’ve talked a lot about the various normative decisions that go into making models work – everything from de-toxifying training data to all of the efforts (of which RLHF is perhaps the best-known) to massage the outputs into something a person would find palatable.  The models also make a significant break with the English language in that they operate on tokens, not words: the very architecture of the model involves a strategic process of winnowing the range of iterability (for more: one, two, three).  Here I want to look at something different, something analogous to the sense of “decision” in Derrida.
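
    To make the token/word point concrete, here is a minimal sketch of my own (not from the post) of what a tokenizer does to a sentence.  It assumes the open-source tiktoken library and its “cl100k_base” encoding, neither of which the post mentions; any BPE-style tokenizer would illustrate the same thing.

        # A sketch, not the post's method.  Assumes the `tiktoken` package is installed.
        # The point: the model's vocabulary consists of sub-word pieces, so ordinary
        # words are carved up before the model ever sees them.
        import tiktoken

        enc = tiktoken.get_encoding("cl100k_base")

        text = "Iterability is winnowed at the tokenizer."
        token_ids = enc.encode(text)  # a list of integer token ids

        # Decode each id separately to see the pieces the model actually operates on.
        pieces = [enc.decode([tid]) for tid in token_ids]
        print(pieces)
        # Possible output (exact splits depend on the encoding), something like:
        # ['Iter', 'ability', ' is', ' winn', 'owed', ' at', ' the', ' token', 'izer', '.']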

    (more…)
  • There’s starting to be a good bit of productive “continental” work on Large Language Models (LLMs) like ChatGPT.  In particular, there’s emerging work that takes on LLMs from the point of view of language.  I’ve said a lot about the usefulness of Derrida for understanding LLMs, generally through the lens of Derrida’s discussion of Platonism.  For skeptics, there’s now also a new paper by David Gunkel that makes a succinct case using Derrida’s différance.  For those who prefer structuralism to post-structuralism, there’s Leif Weatherby’s Language Machines (Weatherby dismisses Derrida’s utility; I offer the outlines of a response here).  For those who prefer Wittgenstein, Lydia Liu has some really interesting work and evidence of a direct influence of Wittgenstein on the development of language computation at Cambridge.  Here I want to continue the general exploration by taking it in a direction that I’m pretty sure is new: the way that Derrida understands decision and sacrificial logic.  The setup is a little long, and goes by way of the Binding of Isaac.  So bear with me.

    In the relatively late Gift of Death (1992), Derrida responds to Kierkegaard’s telling of the binding of Isaac.  To recall, in the Biblical story, God “tests” Abraham by instructing him to take his only son Isaac and sacrifice him at the top of Mount Moriah.  Abraham obliges without question; an angel intervenes at the last moment to save Isaac.  Abraham passes the test and is promised offspring “as numerous as the stars of heaven and as the sand that is on the seashore” because he obeyed the command.  Kierkegaard’s text is presented in the voice of one Johannes de Silentio, who claims not to be a philosopher and to be rendered speechless by Abraham’s faith.  Speaking of the authorial voices in his early texts, Kierkegaard suggests that they allow “the educative effect of companionship with an ideality which imposes distance” (CUP, 552).  Silentio suggests fairly early on that “Abraham was the greatest of all, great by that power whose strength is powerlessness, great by that wisdom whose secret is foolishness, great by that hope whose form is madness, great by the love that is hatred to oneself” (16-17).  There is a central paradox to Abraham: his greatness requires that he explicitly intend to do what is obviously unethical.  Hence Silentio’s unwillingness to explain Abraham in (Hegelian) conceptual terms.   Derrida explains the paradox this way:

    (more…)
  • No, the quote isn’t a new marketing slogan for OpenAI.  I’m actually referring to a budding issue in patent law.  The Patent Act says that “whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title” (35 U.S.C. §101).  Although this is very broad, Supreme Court precedent says that it exempts abstract ideas, laws of nature, and natural phenomena.

    As I argued in my IP book (from which I’m lifting some of the discussion below), the rise of the information economy has made understanding these exemptions quite difficult.  In an industrial setting, all of these patentable things tended to occur in discrete objects that could then be claimed in a patent.  As Dan Burk notes, “products, at least to the extent that they constitute objects, are inherent in the concept of process …. Making and using entail some type of object: some thing is made, and some thing is used.  In classic industrial setting, the substrates of the process were fairly apparent, and extant in what is now §101; machines and materials visibly interacted as inputs generating outputs” (527).

    With the rise of “immaterial” goods and a post-Fordist economy, however, it is increasingly difficult to point to discrete things either at the level of product or process, and the ability to characterize immaterial goods informatically suggests that they could be understood as either thing or process.  Burk argues that the Supreme Court cases on §101 are therefore more about drawing judicial limits on what patents can cover.  As he puts it, “excluding conceptual inventions from patent eligibility pushes exclusivity further downstream to the stage of finished products, requiring narrower claims on concrete implementations, rather than allowing conceptual patents early in the development of a technology” (535).  Still, the devil lies in the details of how to make this work.

    (more…)
  • Leif Weatherby does not care for Derrida.  At least, in Language Machines (see here for a synopsis/initial take on this important book) he suggests that Derrida’s (mis)reading of Saussure is a significant part of “how the humanities lost language, allowing both cognitive science and NLP to update analytical and technological approaches that literary theory rarely engaged” (73).  In particular, Derrida’s move to the critique of metaphysics and his tendency to lump pretty much everything together under that umbrella risks abstraction – it’s a proposal that “itself floats above the fray” (73).  This gets to the same place Chomsky did, albeit by a different route:

    “By sweeping structuralism’s focus on a concrete object to one side in the name of opposition to metaphysics, poststructuralism fumbled the object itself. Where Chomsky avoids external language by excluding it from science, Derrida finds the law not in cognition but rather at a level of abstraction about culture that ends up having the same effect: a lack of a link between the ‘conditions of im/possibility’ and the expressions so conditioned” (73).

    The Derridean critique, in other words, is so abstract that “it is simply not clear that we need Derrida’s revision of structuralism to proceed with a concrete analysis of computational language” (73).  Worse, post-structuralism in its Derridean version doesn’t have much to say about how language “interfaces with other sign-systems … primarily because it has never taken other sign-systems particularly seriously, perhaps especially mathematics” (73).

    There’s a lot going on here, and I’m certainly not in a position to defend Derrida’s level of abstraction.  After all, I lean Foucauldian.  In what follows, I want to say something about the abstraction problem, and then something about why I think Derrida nevertheless has something to offer.

    (more…)
  • Last time, I talked about Leif Weatherby’s fantastic Language Machines (for my initial synopsis and thoughts on the book, see here) and his identification of a Kantian problematic behind what he calls the syntax view of language, which is prominently associated with Chomsky.  Although Chomsky called his book Cartesian Linguistics, Weatherby thinks the better reference is to Kant.  I think this makes a lot of sense, and it helps (this was the trajectory last time) to understand why structuralist, post-structuralist and Wittgensteinian work seems to have real traction when applied to language models.

    Here I want to step back a little and note part of what motivates the Kantian account, because I think it shows the political stakes of Kantianism.  On a standard epistemological reading, Kant was awakened from his dogmatic slumber by Humean empiricism: causality demands necessity, and empiricism can’t get you there (see B123-4).  I have no quarrel with the epistemological reading, but it’s worth noting that the language of the First Critique is also full of juridical terminology.  For example, we need to “institute a tribunal which will assure to reason its lawful claims, and dismiss all groundless pretensions, not by despotic decrees, but in accordance with its own eternal and unalterable laws” (A xiii).  As David Lachterman showed, this kind of language is all over the First Critique and is critical to the project of disciplining reason.  In starting the Deduction, Kant distinguishes a question of right from a question of fact and applies the distinction to our use of the categories:

    (more…)
  • In Language Machines (see here), Leif Weatherby argues that what he calls the “syntax” view of language, which is most closely associated with Chomsky, is better viewed as a Kantian system than a Cartesian one:

    “Syntax, universal grammar, principles and parameters, and the more recent ‘minimalist program’ with its key idea of ‘merge’ – all these are attempts to isolate and formalize the ability to use language as a distinctively human operation shared neither by animals nor by machines. For this reason, I think that his linguistics is more Kantian than Cartesian. Chomskyan linguistics is the search for the categories of a transcendental logic as it exists extensively, to find the rules that we impose on sound or paper …. The search for the rules of that knowledge in the empirical order is futile, Kant argued, and Chomsky’s argument against statistics has its analog here, not in Descartes or in Humboldt” (46-7).

    Chomsky’s aversion to empiricism (in this Kantian sense) comes “at the cost of defining” language “not as actually spoken languages but as the formal production unit – in the brain or some computational formalism – that achieves the fit between knowing and saying, the internal and external aspects of the linguistic act” (51).  On the Chomskyan argument, it is not possible to bootstrap from semantics to syntax; the cost is explaining “how the deep structure of syntax actually imposes form on specific languages, like English or Lao” (51).

    (more…)
    Regular readers of this space will know that I think large language models are deeply fascinating, in addition to being a little scary (depending on their use).  I also think that we can get some traction on both of those things by way of post-structuralist language theory, or at least, by way of Derrida.  I was thus very happy to finally read Leif Weatherby’s Language Machines: Cultural AI and the End of Remainder Humanism, which came out earlier this year.  Weatherby’s thesis is, in brief, that the structuralists were right about language, and that we need to see this to have any hope of understanding language models and directing them to good use.  I’ll hopefully have more to say about various parts of the book later, but for now I want to offer a high-level outline.

    Weatherby begins by arguing that “nothing less than the problem of meaning, in a holistic sense, surfaces when language is algorithmically reproducible,” such that “this problem can be addressed only if linguistics is extended to include poetics … reversing the assumption that reference is the primary function of language, grasping it rather as an internally structured web of signs” (2).  This is because “the new AI is constituted in and conditioned by language, but not as a grammar or a set of rules.  Taking in vast swaths of real language in use, these algorithms rely on language in extenso: culture, as a machine” (5).

    (more…)
  • The preprint is freshly posted on SSRN; the paper is forthcoming in a volume on Privacy Resignation (aka privacy cynicism). In it I argue that privacy resignation is usefully understood as an adaptive preference. Here is the abstract:

    Adaptive preferences are preferences that change because of the (un)availability of what someone desires. The concept has had considerable uptake in the literature on human development, where it is used to understand how socially marginalized people come to accept their marginal status. Here, I apply the framework to privacy resignation in two ways. On a substantive interpretation, adaptive preferences indicate a normative problem. In the case of privacy, the problem lies with substantive autonomy and the importance of privacy to a number of core human capabilities. On a formal interpretation, adaptive preferences are irrational because they involve changing one’s assessment of something without it having itself changed. Here I argue that this sort of preference – “privacy is unavailable, therefore it is bad” – is a goal of the data industry, which wants to change social norms against privacy to serve its own purposes and to deflect critical thinking away from its practices.