Brian Leiter criticizes the new Google Scholar Metrics, which uses h-index and various similar measures to assess journals. He writes that "since it doesn't control for frequency of publication, or the size of each volume, its results are worthless." Some of my friends on Facebook are wondering why he's saying this, so I'll try to offer a helpful toy example here.
Consider two journals: Philosophical Quality, which publishes 25 papers a year, all of which are good and well-cited; and the Journal of Universal Acceptance, which publishes 25 equally good and well-cited papers a year as well as 975 bad papers that nobody ever cites. Google gives both journals the same score on all its metrics. The h-index is the largest number h such that the journal has published h papers with at least h citations each, so tacking extra uncited papers onto a journal doesn't change it; JUA's additional bad papers make no impact on the h-index (or on Google's other measures defined in terms of it, like h-median or h-core). But if you're looking at someone's CV and they've got a paper in one of these journals, you should be more impressed by a Phil Quality paper than a JUA paper. The Phil Quality paper is likely to be good, while the JUA paper is likely to be bad. Still, Google will see them as equal.
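If you want to see the arithmetic, here's a quick sketch in Python. The citation counts are made up (I've given every good paper 30 citations), and this is my own toy calculation rather than anything Google publishes, but it shows why the 975 uncited papers drop out entirely.

```python
# Toy h-index calculation for the two hypothetical journals above.
# Citation counts are invented for illustration; this is not Google's code.

def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

phil_quality = [30] * 25        # 25 good papers, 30 citations each
jua = [30] * 25 + [0] * 975     # the same 25 good papers plus 975 uncited ones

print(h_index(phil_quality))    # 25
print(h_index(jua))             # 25 -- the uncited papers never count
```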
Synthese and Phil Studies seem to be benefitting from this phenomenon, as they publish lots more papers than other journals and get higher rankings in the Google metrics. They're good journals! But they're definitely not #1 and #2, which is how Google has them. Meanwhile, the consensus #1 journal in Brian's polls, Philosophical Review, publishes only about 15 papers a year and ranks 17th on Google. (I'll admit that I'm kind of rooting for Phil Review, as my paper on the Humean Theory of Motivation came out there. And now it's part of their h-core! Hopefully that hasn't biased me into giving a bad argument.)
This is what happens when you rank journals in the same way you rank the output of individuals. If two people have the same number of papers that get cited a lot, but one has a lot of other papers that nobody cares about, I wouldn't say the person with more papers has worse research output. (Depending on the situation, I might see that person as an equally good researcher with an additional weird hobby.) But if a journal accepts a lot of papers that nobody cares about, it makes publishing there less prestigious.
This isn't to say that journals should accept fewer papers -- in fact, I think they should accept more, given that a lot of good philosophy is going unpublished these days for lack of journal space. There's plenty of good stuff out there that's taking forever to find a home. The lack of publications is holding back debates, and a serious backlog is developing. Phil Review could probably publish three times as many papers with no reduction in average quality. But my point is just that we shouldn't take a measurement that's indifferent to low average quality and use it as if it measured average quality.
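To make that contrast concrete with the same made-up numbers as before: a simple average-quality measure (mean citations per paper, my own stand-in, not one of Google's metrics) does notice the difference between the two toy journals.

```python
# Same invented citation counts as above; mean citations per paper as a
# stand-in for an average-quality measure.

phil_quality = [30] * 25
jua = [30] * 25 + [0] * 975

print(sum(phil_quality) / len(phil_quality))  # 30.0
print(sum(jua) / len(jua))                    # 0.75
```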
I don't think Google Scholar Metrics is totally worthless, though. If you're comparing journals with the same number of papers, it could be helpful. Also, it might be useful if you care about something other than the average quality of a paper in a particular journal. Maybe you're a librarian and you're trying to figure out what a journal subscription is worth. Phil Quality isn't any better than JUA from that standpoint -- either way, you get 25 good papers! So maybe Google Scholar Metrics would help librarians. And maybe someone can come up with a clever mathy fix for h-index that corrects for the effects of accepting more papers. (Update: I see that Kate Devitt is working on something like that.) But for figuring out how to score a job candidate's CV, Google hasn't yet given us anything to rely on.