By Gordon Hull
Last time, I suggested that a recent paper by Mala Chatterjee and Jeanne Fromer is very helpful in disentangling what is at stake in Facebook’s critique of Illinois’ Biometric Information Privacy Act (BIPA). Recall that BIPA requires consent before collecting biometric identifiers, and a group of folks sued FB over photo-tagging. Among FB’s defenses is the claim that its software doesn’t depend on human facial features; rather it “learns for itself what distinguishes different faces and then improves itself based on its successes and failures, using unknown criteria that have yielded successful outputs in the past.” (In re Facebook Biometric Info. Privacy Litig., 2018 U.S. Dist. LEXIS 810448, p. 8). Chatterjee and Fromer apply the phenomenal/functional distinction from philosophy of mind to the question of how mental state requirements in law apply to AI, with an extended case study of liability for copyright infringement. Basically, there’s an ambiguity buried in the mental state requirements, and we need to decide – probably on a case-by-case basis – whether the law’s objective is better served by a phenomenal or a functional account of the mental state in question.
In applying the distinction, I suggested that we assume for the sake of argument that the software does not do the same thing that an embodied human being does when they identify a face. In other words, I was suggesting that we accept arguendo that the software in question does not achieve the same phenomenal state as one of us does when we recognize a face. I also said that I think that assumption, while clearly correct in a literal sense, may not be able to do as much work as it needs to. Here’s why.
It should be fairly clear that the experience of recognizing Pierre in a café is not identical between different people, or probably even for the same person at different times. For that to be true, the molecular structure and electrical activity in their respective brains would have to be identical, which isn’t going to be the case. It’s also not clear that we don’t “learn[] for [ourselves] what distinguishes different faces and then improve[] [ourselves] based on [our] successes and failures, using unknown criteria that have yielded successful outputs in the past,” just like FB. After all, if you ask me why I recognize somebody, I will produce some criteria – but if it’s somebody I know, it’s not as though I consciously apply those criteria as a rule. Neither the FB system nor I use the old-fashioned “AI” of an ELIZA program. It would therefore at least require some argument to say that I recognize the face by means of those criteria, rather than offering them as a post hoc explanation. Indeed, recognition does not appear to be a “conscious” process in the relevant sense at all. So that can’t be the issue.
I’m sure there are other such examples that could be adduced – my only point is that the real question is when the phenomenal state is close enough to count, and how FB might draw the line in such a way as to make its distinction work. In other words, FB’s defense needs to point to something more than a functional state – getting the face right – but something less than the pure replication of the phenomenal state of a person who recognizes a face. The algorithm clearly does something different – but how do we know that’s enough to matter?
FB might reply that it uses metadata, but that can’t be right, at least not without some further argument: the use of metadata is basically the application of contextual information. Knowledge of context is an enormous part of how we identify people, as anyone who has had trouble recognizing someone in a different context will know. So too, I can recognize a politician on a stage where I perhaps wouldn’t recognize them if I ran into them at the grocery store. We even give ourselves credit for recognizing people purely on the basis of metadata: “oh, that must be Professor X speaking at the podium,” I say, knowing only that I am where Professor X is supposed to give the keynote, during the time scheduled for the address.
All of this can be extended easily enough to artifacts. In their original article articulating the extended mind hypothesis (EMH), Andy Clark and David Chalmers suggest that “Epistemic actions alter the world so as to aid and augment cognitive processes such as recognition and search” (8). These actions then deserve epistemic credit:
“If, as we confront some task, a part of the world functions as a process which, were it done in the head, we would have no hesitation in recognizing as part of the cognitive process, then that part of the world is (so we claim) part of the cognitive process. Cognitive processes ain’t (all) in the head!” (8).
Obviously now is not the time to fully debate this hypothesis; I rather want to assume that something like it is sufficiently plausible in this context, as a way of getting at the proper phenomenal description of “recognition.” Although the most famous example from the paper is about Otto the Alzheimer’s patient (more about him in a moment), it’s worth noting that the lead example is about varying degrees of reliance on a computer screen to rotate shapes, Tetris-style. EMH is about cognitive systems: “All the components in the system play an active causal role, and they jointly govern behaviour in the same sort of way that cognition usually does. If we remove the external component the system’s behavioural competence will drop, just as it would if we removed part of its brain.” (8-9). The more general idea behind extended mind (particularly as developed in Clark’s Natural Born Cyborgs) is that certain kinds of technical devices and artifacts should be considered as part of our “minds.” It’s arbitrary to draw a line at the brain (there’s good evidence that gesturing does cognitive work), and equally arbitrary to draw it at the edges of the body (fiddling with Scrabble tiles on a tray is part of thinking about what to play (9-10)).
For an easy example, if I am used to wearing a wristwatch, and you ask me if I know what time it is, I’ll first say ‘yes’ and then look at the watch. Somewhat closer to the present case, if I am at a meeting where everybody wears nametags, I’ll read those nametags, and the tags could very well be the trigger that causes me to recognize someone. It’s not just that I’ll figure out who somebody I’ve not met is – it’s that the nametag will help me recall someone I only met once or twice before. Indeed, the tag can trigger me to remember other aspects about a person, such as other conferences I saw them at. It might also trigger me to remember other times I saw that nametag – and through that mechanism cause me to better recognize the person. The baseline would be my own memory. But that process can be slow, and can require additional inputs (“hi, how have you been since last week?” – OK, right, who did I see last week?), and it’s not clear how it works anyway: I might recognize someone because they always wear the same blazer to conferences.
Clark and Chalmers famously compare Inga and Otto. Inga wants to go to the museum, so she remembers where it is, and goes there. Now Otto:
“Otto suffers from Alzheimer’s disease, and like many Alzheimer’s patients, he relies on information in the environment to help structure his life. Otto carries a notebook around with him everywhere he goes. When he learns new information, he writes it down. When he needs some old information, he looks it up. For Otto, his notebook plays the role usually played by a biological memory. Today, Otto hears about the exhibition at the Museum of Modern Art, and decides to go see it. He consults the notebook, which says that the museum is on 53rd Street, so he walks to 53rd Street and goes into the museum” (12-13).
On their argument, Inga believed the museum was on 53rd Street, even though she had to “look it up” in her memory. It seems reasonable to conclude that Otto believed the museum was there too, even though his “memory” wasn’t part of his biological process. This territory is very close to facial recognition. Consider Otto 2.0: Otto has trouble associating names and faces. But he keeps pictures of his friends on his phone, and those pictures are all labeled with their names. If Otto sees someone across the room whom he’s seen before, he checks his phone in order to know who that person is. If that’s the case, Otto is recognizing faces, just like his friend Inga, who recognizes them using her biological memory.
All of this seems to me to establish at least a burden of argument on Facebook: why do we say that the black-box algorithm that “learns for itself what distinguishes different faces and then improves itself based on its successes and failures, using unknown criteria that have yielded successful outputs in the past” is doing something sufficiently different from what human systems do, without making up something ad hoc and arbitrary? Clark and Chalmers conclude with four factors to help distinguish Otto from what they characterize as more fanciful cases, such as the sleep-deprived villagers in One Hundred Years of Solitude who label everything, or the question of whether the information in my address book is part of my memory:
“First, the notebook is a constant in Otto’s life – in cases where the information in the notebook would be relevant, he will rarely take action without consulting it. Second, the information in the notebook is directly available without difficulty. Third, upon retrieving information from the notebook he automatically endorses it. Fourth, the information in the notebook has been consciously endorsed at some point in the past, and indeed is there as a consequence of this endorsement. The status of the fourth feature as a criterion for belief is arguable (perhaps one can acquire beliefs through subliminal perception, or through memory tampering?), but the first three features certainly play a crucial role” (12).
These are like the fair use factors: you put them together to help decide, in a non-arbitrary way, what’s going on. At the risk of moving too quickly, let’s apply them to the FB system and see if it sounds like it’s doing what Otto 2.0 or even Inga does when they recognize faces. FB won’t tag someone without consulting the information it has available to it. That information is ready-to-hand. It is assumed to be reliable. And it is there on the basis of known past instances of identification.
In short: even if we want something more than a functional account of recognition, FB has to make some arguments. One suspects that there’s an unjustified biologism behind its claim that its software does something different.