This is a guest post by Mitchell Aboulafia. It's a bit long, but it's worth reading in its entirety. I don't necessarily endorse any particular part of it, but I think it makes some excellent points, especially about lack of transparency. This feature in particular is, in my view, not tolerable given the enormous influence the report has; an influence that is aided, in no small part, by the very cooperation of the group of evaluators that the letter is addressed to.
Mitchell's post comes, in its entirety, after the break:
An Open Letter to Prospective Evaluators for the 2014-2015 Philosophical Gourmet Report--An Update*
A debate about the PGR and program rankings is now underway in philosophy circles. Although there has been no organized public forum, this conversation is taking place on blogs, Facebook, and other media. It has been spontaneous, coming on the heels of criticism of Brian Leiter’s behavior. The good news, at least to this observer, is that there seems to be consensus on at least one point: the PGR is a flawed instrument. But there the agreement ends. There is wide disagreement about the extent and nature of the PGR’s problems. Some argue that it is so flawed that it’s time to retire it, permanently, while others believe that its problems are minimal and that the PGR should be available for 2014-2015 and the years ahead. Between these extremes are other views and other options, including a preference for suspension of the PGR for the present in order to address its problems.
Since we have no hard data, it’s difficult to say what proportion of the philosophical community holds what position. We can say that the debate has generated an enormous amount of interest. There are many philosophers who think that we should not proceed with the PGR this year. Leiter himself conducted two polls, advertised on his site, regarding whether there should be a 2014-2015 PGR. The first set of results started coming in heavily against proceeding, but Leiter thought there might be something fishy about those results. So, he offered a new poll through Condorcet Internet Voting Service, which for public polls uses IP addresses to eliminate multiple votes from the same person or location. When Leiter closed the poll on September 24, 3,424 people had voted. The question: “Should we proceed with the 2014 PGR?” The results: 2,104 No votes, 1,320 Yes votes.** That’s a little over 61% in opposition, for a poll that Leiter himself conducted and advertised on his site, and which took place before the current debate on the PGR itself gained momentum. (See Archive of the Meltdown. It was, for example, in early October that word came that the University of Sheffield decided not to participate.) It’s reasonable to suppose that Leiter hoped that he would receive support for the PGR when he did the poll. When he didn’t, he dropped it. This speaks to one of the PGR’s problems, namely, that its administration is private and very closely held, with very little input of any kind from the broader profession. (Those who believe that the PGR is Leiter’s personal property, and that philosophers have no business telling him how it should be run because it belongs to him, need read no further.)
Of course, the internet polls Leiter initiated don’t settle the question of how the profession as a whole would vote, but I think we can say they clearly indicate a good deal of interest in the question of whether the PGR should proceed. From the poll as well as the anecdotal evidence, we can assume that there are a lot of people out there who would prefer that the 2014-2015 PGR rankings not go forward.
Let's assume that I am correct that there is a general consensus that the PGR is flawed in some fashion. Even setting aside the anecdotal evidence, this is a reasonable assumption, because, after all, there are no perfect evaluation instruments. This is a point often made by some of the PGR's defenders. As for the other extreme, I don't think anyone who has been following this controversy over the last month would doubt that there are voices that see the PGR as fatally flawed.
If we grant that this range of views exists about the PGR, then a reasonable question for evaluators is the following: How flawed would the PGR have to be for you to decide against participating this year, that is, what kinds of problems must it have? Note that the question is whether you as an evaluator should endorse the current product as it stands by participating in this year’s survey. I am not raising here the question of whether problems with the PGR can or cannot eventually be resolved.
Let me offer a list, by no means a comprehensive one (and certainly not in rank order), of ten problems that have been raised about the PGR. It is by no means necessary to agree with all of these criticisms to take a pass on the PGR this year. Just a few, perhaps even one, will do.
The Philosophical Gourmet Report . . .
- lacks definitive criteria for the evaluation of departments, for example, what sorts of accomplishments by philosophers should weigh more heavily than others.
- lacks public and clearly defined guidelines for its evaluators, for example, evaluators can choose to rank based on cursory impressions of departments or do a great deal of homework regarding departments, or anything and everything in between.
- places evaluators in a position of weighing and using criteria differently, creating a survey that not only permits the comparison, in effect, of apples to oranges, but also leaves the selection of particular fruits to the evaluators. (The PGR’s “Methods and Criteria” page speaks of “different philosophies of evaluation.”)
- has too few evaluators, selected by too few people, to adequately evaluate many, if not most, of the specializations in philosophy.
- imposes its own idiosyncratic cultural preferences on a profession devoted to critical and communal inquiry as the means to truth. The PGR has not sought a public debate on its organization and procedures, in spite of serious concerns about bias, distortion, and unwarranted marginalization. It is moving ahead with this year’s PGR in spite of obvious concerns of many in the philosophical community and a NO vote from Leiter’s own poll regarding the question of proceeding with the 2014-2015 PGR.
- promotes a halo effect with respect to departments that adversely affects the job prospects of candidates, that is, reputational pedigrees often trump candidates’ actual accomplishments.
- is functionally indifferent to the undesirability of the halo effect. Leiter is candid about the fact that its evaluators have been influenced by a halo effect in the past--"As one respondent put it a few years ago: 'surprisingly tough to say what I think, without the institutional halo effect front loaded.' " However, the PGR's response, namely, to withhold the names of universities from reviewers, fails to address the halo effect that is created by the presence of a "star" in a particular department. The “star” effect also renders any claim to anonymity of departments indefensible. (Evaluators often know where so-and-so "star" is currently teaching, or know other members of a department’s faculty.)
- does not have an independent Board with real oversight.
- fosters a parochial view of the discipline, under the guise of neutrality. In Bharath Vallabha’s words, “[i]nstead of having different ecosystems where some value Princeton as the best department, and others value Notre Dame as the best, and yet others value Vanderbilt as the best, and so on, PGR fosters the idea that there is one over-arching sense of what are the best departments, and those are the ones which are ranked by PGR and in particular the ones which are at the top of those rankings.”
- does not have a sufficiently diverse pool of evaluators, especially when we consider not only their present institutions, but where the evaluators received their graduate degrees. (See, for example, Bharath Vallabha’s “The Function of the Philosophical Gourmet Report:” “The surveys don't aim to capture what a broad range of philosophers - those associated with both ranked and unranked schools - think. Rather, they capture what people who have passed through, or are affiliated with, ranked programs think is the ordering of the ranked programs.”)
I am convinced that rankings in general do more harm than good. A comprehensive informational website with a sophisticated search engine would be my preference, one that would allow individuals or departments to generate customized rankings based on selected criteria. But I will not try to convince you of this here. Instead, I want to run some numbers by you, prospective evaluators, and ask that you consider them, in light of the problems with the PGR, before deciding to fill out this year’s survey. Many invited evaluators do not fill out the survey when they receive it, and there are good reasons for you to take a pass on it this year. Here’s why:
According to Leiter, he is currently working from a list of 560 nominees to serve as evaluators for the 2014-2015 PGR. During the last go-around in 2011, 271 philosophers filled out the part of the survey dealing with overall rankings, and “over 300 philosophers participated in either the overall or specialty rankings, often both.” Leiter claims that in 2011 the on-line survey was sent to approximately 500 philosophers. So, many philosophers decided NOT to fill it out even after receiving it. (Also notice that from the information he provided we don’t know how many filled out the portion of the survey dealing with specializations. All we know for certain is that it must have been at least 30, that is, 271 + 30 = 301, just over 300.)
Three hundred may appear to be a reasonable number of evaluators, but the total number of participants obscures crucial details, and one doesn’t need any sophisticated form of statistical analysis to see how problematic these are. If you look at the thirty-three specializations that are evaluated in the PGR, slightly more than 60% have twenty or fewer evaluators. That’s right, twenty or fewer. Think about this for a moment: twenty or fewer philosophers, in one case as few as three, are responsible for ranking 60% of the specializations found in the PGR, what many consider to be the most important feature of the PGR.
But it is actually worse than this. There are certain areas that have many fewer evaluators than other areas. For example, the PGR lists nine specializations under the History of Philosophy rubric. Six of the nine have twenty or fewer evaluators. One of the specializations, American Pragmatism, has only seven. The only general category to have the majority of specializations with more than twenty evaluators is “Metaphysics and Epistemology.” Five of its seven specialties have more than twenty. But none of the others–Philosophy of Science and Mathematics, Value Theory, and the History of Philosophy—have a majority of specializations with more than twenty evaluators. In the three specializations outside of these rubrics we find this: eleven evaluators for Feminism, three for Chinese, and four for Philosophy of Race.
The case of Chinese Philosophy is noteworthy. In 2009 Manyul Im, a supporter of Leiter’s rankings at the time, posted a blog about the rankings of Chinese Philosophy, and let Leiter know that he was going to publish the post. Professor Im had two basic complaints. For Chinese Philosophy, according to Im, the rankings were misleading about the differences in quality between programs, and the pool of evaluators was too small and their backgrounds were too similar. He argued against ranking programs in Chinese Philosophy and instead recommended “an informative list of viable programs.” The 2011 PGR did reduce the number of ranked groups in Chinese, from four to two, presumably to help address the concern about reading too much into the differences between groups. However, the PGR still uses the same three evaluators, two of whom Im pointed out were students of the same person, while all three share a similar “interpretive paradigm.” Here’s an instance of the arbitrariness of the PGR as a ranking system: in the same year, 2011, the Philosophy of Race, with four evaluators, one more than Chinese, received the whole megillah, and has five ranked groups. Go figure.
But you don’t have to take my word about the small number of evaluators for the specializations. Here’s what Leiter says on the 2011 PGR site.
Because of the relatively small number of raters in each specialization, students are urged not to assign much weight at all to small differences (e.g., being in Group 2 versus Group 3). More evaluators in the pool might well have resulted in changes of .5 in rounded mean in either direction; this is especially likely where the median score is either above or below the norm for the grouping (emphasis in the original).
Obviously, urging students “not to assign much weight at all to small differences” does not address the issue. No weight should be assigned to specializations ranked by so few people. This is not rocket science. This is common sense. You can’t evaluate the quality of specializations that have so many facets with so few people, who themselves were selected by another small group of people, the Advisory Board, which clearly favors certain specializations given the distribution of evaluators. (This is especially true when there hasn’t been a public discussion about what should constitute standards for rankings of specializations in philosophy.) Leiter’s advice makes it appear that one should take the specialty rankings seriously if only we refrain from assigning too much weight to small differences. But if this is right, why didn’t Leiter take Professor Im’s advice and not have any rankings in Chinese Philosophy in 2011? Two groups with three reviewers is absurd by Leiter’s own logic, but there it is. My hunch is that, as silly as the rankings of Chinese Philosophy into two groups may seem, had Leiter not done it this way, the door would have opened for those in other specializations to argue for Im’s suggestion of “an informative list of viable programs.” But this is anathema to the rankings mentality of the PGR. Everything—everything—is independently better or worse than something else, in PGR-land. Once you start to rank, it’s rankings all the way down. The PGR will use rankings even when it won’t show the scores, “due to the small number of evaluators.”
I honestly don’t know how one could fill out the survey this year in good faith knowing that so few people are participating in ranking so many specializations. When you fill out the survey you are making a statement. You are providing your expertise to support this enterprise. The fact that you might be an evaluator in M & E, with more evaluators than the other areas, doesn’t absolve you of responsibility as a participant. At minimum, you are tacitly endorsing the whole project.
Ah, you say, but perhaps this year’s crop of evaluators will be more balanced. However, the way that the PGR is structured undermines this hope. The evaluators are nominated by the Advisory Board, which has roughly fifty members. Most of the same people are on the Board this time around as last time. But here’s the kicker: Leiter asks those leaving the Board to suggest a replacement. The obvious move for a Board member here would be to nominate a replacement in his or her own area, probably from his or her own circle of experts. In Leiter’s words, “Board members nominate evaluators in their areas of expertise, vote on various policy issues (including which faculties to add to the surveys), serve as evaluators themselves and, when they step down, suggest replacements.” So there is no reason to believe that the makeup of the pool of evaluators will have markedly changed since the last go-round.
The 2014-2015 PGR will be in place for at least the next two years, maybe longer given the difficulties it faces. There are a lot of young people who will be influenced by it. If you have been invited to serve as an evaluator for 2014-2015, please consider taking a pass on filling out the survey. If enough of you do so, the PGR will at minimum have to reform its methodology.
Given the recent and continuing publicity surrounding the PGR, it’s also important to consider how it may be used against the interests of philosophers. Faculty members from other disciplines are already discussing its flaws on the web. This will only increase as the dispute within philosophy about the PGR intensifies. The cat is out of the bag. Those who try to use this flawed ranking system will soon be challenged by savvy chairs in other departments, either directly or behind closed doors in discussions with deans and provosts. We should try to avoid the embarrassment of having people outside of philosophy, especially those who are familiar with survey methodologies and related data collection, discover our support for such a compromised ranking system. Taking a pass on this year’s PGR is not only the right thing to do, it is prudent. But only you, the invited evaluators, get to decide whether the PGR is too flawed to endorse by filling out your surveys. The rest of us have no say whatsoever in this decision.
1) In this post I purposely sought to keep the statistics as simple and straightforward as possible in order to raise basic questions about imbalances and sampling size in the current PGR. Gregory Wheeler has a nice series on some of the more in-depth statistical work in “Choice & Inference.” See the series and concluding piece, “Two Reasons for Abolishing the PGR.”
2) If there is available public content regarding changes to the PGR that I have missed, I’d be grateful for a pointer in that direction. As far as I know, no fundamental change is taking place in the 2014-2015 PGR.
3) I have calculated the number of evaluators in the different categories as best I could from the information available to me. Any errors, to the best of my knowledge, are small enough that the case I make here stands in any event.
*A portion of this post originally appeared on UP@NIGHT, October 19, 2014, in a series of pieces dealing with rankings and the Philosophical Gourmet Report.
** Leiter’s words announcing the second poll: “So here's a different poll service, which in the past has done better at blocking strategic voting. Here it is. Rank "Yes" #1 if you want the 2014 PGR to go forward; rank "No" #1 if you do not want it to go forward. We'll see how the two come out.” I can’t find anything on his blog about the poll after this. If there is a public response Leiter made to the negative outcome, I would like to hear about it.