Findings and implications from data mining the IMC review process

By: Robert Beverly, Mark Allman

Appears in: CCR January 2013

Abstract: The computer science research paper review process is largely human and time-intensive. More worrisome, review processes are frequently questioned and often non-transparent. This work advocates applying computer science methods and tools to the computer science review process. As an initial exploration, we data mine the submissions, bids, reviews, and decisions from a recent top-tier computer networking conference. We empirically test several common hypotheses, including the existence of readability, citation, call-for-paper adherence, and topical bias. From our findings, we hypothesize review process methods to improve fairness, efficiency, and transparency.
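
To make the readability-bias test concrete, the sketch below shows one way such a check could look. It is not the authors' method: the corpus is made up, and the choice of Flesch reading ease and the crude syllable heuristic are assumptions for illustration. It scores hypothetical submission texts and compares the average score of the accepted and rejected groups.

    # Illustration only: compare a readability score between hypothetical
    # "accepted" and "rejected" submission texts. The corpus below is made up;
    # the paper's actual data and readability metric are not reproduced here.
    import re

    def count_syllables(word):
        # Crude estimate: count runs of consecutive vowels.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def flesch_reading_ease(text):
        # Flesch reading ease:
        #   206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z]+", text)
        syllables = sum(count_syllables(w) for w in words)
        return (206.835
                - 1.015 * (len(words) / len(sentences))
                - 84.6 * (syllables / len(words)))

    # Hypothetical stand-ins for submission text, keyed by decision.
    corpus = {
        "accepted": ["We measure DNS resolver behavior from many vantage points."],
        "rejected": ["Utilizing heterogeneous methodological instrumentation "
                     "paradigms, we holistically characterize the ecosystem."],
    }

    for decision, texts in corpus.items():
        scores = [flesch_reading_ease(t) for t in texts]
        print(decision, round(sum(scores) / len(scores), 1))

A real analysis would of course use the full submission texts and test whether any difference between the groups is statistically significant rather than comparing raw averages.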

Public Review By: Sharad Agarwal

The debate on how to improve the conference paper review process rages on. This highly competitive, manual, and lengthy process can have a big impact on the dissemination of new ideas, and on author morale and careers. The goal of this paper is to encourage our community to analyze data on the review process, both during and after it, to help expose and/or correct biases (or lack thereof).

This paper analyzes review data from the ACM Internet Measurement Conference 2010. The authors find no bias with respect to readability or reviewer bidding scores. However, they find a topic bias and a citation bias, neither of which I find surprising and both of which are likely benign.

We have to treat the findings with care. This paper uses only one conference's data. The cause of any bias (or lack of bias) has not been uncovered, though that is not a stated goal of the paper. The paper is far from comprehensive in exploring all possible biases, and individual analyses can be improved; for example, language sophistication is probably not the best fit for technical papers.

I expect this paper will generate discussion in the ACM SIGCOMM community, and I hope there will be follow-on work by TPC chairs of other conferences and workshops. At the very least, we can use objective metrics to help novice authors better understand what the bar is for different venues. We can take solace in knowing that no immediate cause for alarm has been identified in this paper.