In philosophy (when we are broad-minded) we tend to think of Turing Tests as "kinds of behavioural tests for the presence of mind, or thought, or intelligence in putatively minded entities." In a recent (2009) article [that I discussed in a reading group with Morgan Bennett, a PhD student at UCSB] ("Understanding in economics: Gray-box models," in H.W. de Regt, S. Leonelli & K. Eigner (eds.), Scientific Understanding: Philosophical Perspectives, Pittsburgh: University of Pittsburgh Press, pp. 210-229), the Dutch philosopher-historian of economics, Marcel Boumans (a former colleague in the ill-fated Amsterdam History and Methodology of Economics group), stretches the Turing Test concept to mean: "Reports based on the output of the quantitative model and on measurements of the real system are presented to a team of experts. When they are not able to distinguish between the model output and the system output, the model is said to be valid." This is then glossed by Boumans as "an observer (interrogator or expert) has to decide whether a distinction can be made between a computer output and output from the “real” world" (218-9). Let's call this kind of reasoning a "System Turing Test." (Boumans cites work using the System Turing Test in systems engineering.)
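To fix ideas, here is a minimal toy sketch of such a test (my own illustration, not anything in Boumans or the systems-engineering literature he cites). A statistical two-sample test stands in for the team of experts: it tries to distinguish model output from system output, and the model "passes" when the two are statistically indistinguishable. All the numbers and the AR(1) stand-in for the "real system" are placeholders.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

def ar1_series(phi, sigma, n, rng):
    """A simple autoregressive series, used here only as a stand-in output."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + sigma * rng.standard_normal()
    return x

# "Measurements of the real system" (here just another simulated series).
system_output = ar1_series(phi=0.90, sigma=1.0, n=500, rng=rng)

# Output of the quantitative model we want to validate.
model_output = ar1_series(phi=0.85, sigma=1.1, n=500, rng=rng)

# The mechanical "interrogator": can the two outputs be told apart?
result = ks_2samp(system_output, model_output)
print(f"KS statistic = {result.statistic:.3f}, p = {result.pvalue:.3f}")
print("Model passes" if result.pvalue > 0.05 else "Model fails")
```

A human panel is of course doing something richer than a Kolmogorov-Smirnov test, but the logic is the same: validity is cashed out as indistinguishability of outputs.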
Boumans points out that arguments that rely on something like the System Turing Test were used to justify two important practices in economics: (1) the use of "artificial, abstract, patently 'unreal'" models (219; Boumans is quoting Nobel Laureate Lucas), and (2) the rise of simulations of artificial economies (220ff; represented by the work of Kydland and Prescott, who also won Nobels for it). The first is a variant on famous arguments by Milton Friedman. (I have written on these elsewhere and here.) The second practice is not restricted to economics; I believe climate change research also engages in it. What makes the work of Kydland and Prescott interesting is that their simulations were explicitly not designed to predict, but rather (in a passage quoted by Boumans) to measure “What is the quantitative nature of fluctuations induced by technology shocks?” So the simulation gets used to reveal a property of the model (which, in turn, is supposed to reflect the real system).
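To see what "measuring a property of the model" amounts to, here is another toy sketch (again my own, and emphatically not Kydland and Prescott's model): an artificial economy driven by an AR(1) technology-shock process, simulated not to forecast anything but simply to read off how large the fluctuations induced by the shocks are. The parameter values are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_economy(rho=0.95, sigma_eps=0.007, alpha=0.36, periods=1000):
    """A toy artificial economy: log-technology follows an AR(1) process,
    and log-output responds to technology with some persistence."""
    z = np.zeros(periods)  # log technology
    y = np.zeros(periods)  # log output
    for t in range(1, periods):
        z[t] = rho * z[t - 1] + sigma_eps * rng.standard_normal()
        y[t] = alpha * y[t - 1] + z[t]
    return y

# The question asked of the simulation is quantitative, not predictive:
# how large are the fluctuations induced by the technology shocks?
y = simulate_economy()
print(f"std of simulated log-output: {np.std(y):.4f}")
```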
Now, when years ago (through a reading group with Boumans) I encountered the methodology of Kydland and Prescott, I had a kind of gut reaction against their strategy of testing the model on the same data with which it was calibrated. (This practice is widespread in data-rich, theory-poor sciences that rely on simulation.) "They can't double-count!" (There is a rich philosophy of science literature on this issue, with famous papers by Worrall and Mayo.) A few months ago I heard a great paper ("Can Data in Climate Science be Used for Calibration and Confirmation?") by Charlotte Werndl that has a neat Bayesian argument showing that double-counting can be justified on probabilistic grounds. (Note how rare it is that I say nice things about Bayesian arguments!) I re-interpret her argument as follows: sometimes we need to double-count in order to reveal a difference that makes a difference in the data (one that, absent double-counting, would have remained invisible to us). Let's assume for the sake of argument that Werndl's argument works and that I have reinterpreted it accurately (I am not sure about the latter).
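For what it is worth, here is one way such a Bayesian argument could go; this is my reconstruction of the general shape of the point, not necessarily Werndl's own formulation. Even if the data D are used to calibrate the free parameters of a model, D can still confirm that model over a rival, because the Bayesian comparison turns on marginal likelihoods in which the parameters are integrated out rather than fixed at their calibrated values.

```latex
% A sketch of the Bayes-factor form of the double-counting argument.
% D   : the data used both to calibrate and (it is claimed) to confirm
% M_i : rival models, each with free parameters \theta_i
\[
  \frac{P(M_1 \mid D)}{P(M_2 \mid D)}
  \;=\;
  \frac{P(M_1)}{P(M_2)}
  \cdot
  \frac{\int P(D \mid \theta_1, M_1)\, P(\theta_1 \mid M_1)\, d\theta_1}
       {\int P(D \mid \theta_2, M_2)\, P(\theta_2 \mid M_2)\, d\theta_2}
\]
% If the marginal likelihood of D under M_1 exceeds that under M_2, then D
% raises the probability of M_1 relative to M_2 -- even though D is also the
% data used to pick out the best-fitting \theta_1. Calibration fixes a
% parameter value; confirmation turns on the integrals, so the same data can
% do both jobs.
```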
Now let me grant that there are conditions under which double-counting will be successful. But given that the Bayesian argument relies on probabilistic reasoning, there are surely also going to be (perhaps less likely) conditions under which it is not successful. That is to say, even under the best circumstances, sometimes our simulations are inadequate. The System Turing Test defers the judgment on that fact to the expert, who decides whether the simulated output matches the underlying system. There is no further fact of the matter available. (That's why we're simulating.) Of course, the expert who has come to think about the world through the simulation may well develop overconfidence in the model (or incentives to favor it); something like this problem is not so uncommon in economics (as even economists have started to recognize). When the simulated universe knows no surprises, the experts' judgment will be cultivated to act like a machine's.