Towards a Model of Computer Systems Research
Tom Anderson
University of Washington

P2P vs. Systems Research

P2P                          Systems Research
No centralized control       No centralized control
Emergent behavior            Emergent behavior
Heavy-tailed distributions   Heavy-tailed distributions?
Incentives matter            Incentives matter?
Randomness helps             Randomness hurts?

This talk:
  • Explain systems research using tools from P2P systems research
  • Suggest some mechanisms to better align author and conference incentives

Mean Score + StdDev: NSDI 08

Mean Score + StdDev: OSDI 06

Mean Score + StdDev: SOSP 07

Randomness is Fundamental?

Little consensus as to what constitutes merit
  − Importance of problem?
  − Creativity of solution?
  − Completeness of evaluation?
  − Effectiveness of presentation?
  − All of the above?
Large numbers of submissions make consistency hard to achieve
  − Small PC: huge workload, burnout, lack of attention to detail
  − Large PC: lower workload, less consistency

SIGCOMM 06 Experiment

Manage randomness explicitly
  − Large PC, split between “light” and “heavy”
  − Light + heavy PC: bin into accept, marginal, reject
      • With as few reviews as possible
      • Add reviews for papers with high variance
      • Add reviews for papers at the margin
Program committee meeting (just heavy PC)
  − Pre-accept half the papers
  − Pre-select 2x to discuss
  − Each paper under discussion read by at least 5 from heavy PC
  − Result: success disaster
      • Little basis for discriminating between papers at the boundary

Two Models of Distribution of Merit

Citation Distribution for SOSP

Incentives for Marginal Effort

With unit merit and no noise:
  − Impulse function at accept threshold
With unit merit and noise, single conference:
  − Gaussian function at accept threshold
With unit merit, high noise, and multiple conferences:
  − Peak incentive well below accept threshold
  − Repeated attempts without improving paper
We’d like effort to reflect the underlying merit of the idea
  − Good ideas are pursued, even after publication
  − Mediocre ideas are published, and the author quickly moves on

A Modest Suggestion

Reward, like merit, should be a continuous function
Publish rank and error bars for every paper accepted at a conference
  − Computed automatically from individual PC rankings
  − Post hoc (benefit from perspectives of all reviewers)
After some time has elapsed, re-rank
  − Encourage continued effort on good ideas
  − Like a test-of-time award, but applied to all published papers

Afternoon Discussion Topics

Double-blind vs. single-blind reviews
Should authors disclose previous reviews of the same paper?
Are author rebuttals useful?
When should "open reviews" be used?
Should we review the reviewers?
CS-wide citation reporting and indexing
Travel reduction
Decoupling publication from presentation
How do we quantify the merit of a conference?
Do PCs tend to favor PC-authored papers?
How random are PC decisions?
How big is the rejected-paper tumbleweed?

Afternoon Discussion Topics (continued)

Is there a correlation between PC size and conference impact?
Does overlapping membership between PCs decrease diversity?
Is there a correlation between number of papers accepted and quality?
Do overall scores predict what gets accepted?
What do authors like and dislike about reviews?
How to handle suspected author misbehavior
How to handle suspected reviewer misbehavior
When, why, and how to shepherd
Reviews of review-management software
Proposals for new or improved review-management features
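
Backup: Sketch of the Marginal-Effort Incentive

The "Incentives for Marginal Effort" slide can be illustrated with a small numerical sketch. The model below is an assumption made for illustration, not taken from the talk: a paper with true merit m receives a review score m plus Gaussian noise, it is accepted when the score reaches the threshold, and an author may resubmit the unchanged paper to several conferences with independent noise. The function names and parameters are hypothetical. Under these assumptions, the incentive for marginal effort is the derivative of the probability of eventual acceptance with respect to merit.

```python
import math


def accept_prob(merit, threshold=0.0, noise=1.0):
    """P(merit + Gaussian noise >= threshold) for a single submission (assumed model)."""
    z = (merit - threshold) / noise
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def eventual_accept_prob(merit, attempts, threshold=0.0, noise=1.0):
    """P(accepted at least once) over independent resubmissions of the same paper."""
    p = accept_prob(merit, threshold, noise)
    return 1.0 - (1.0 - p) ** attempts


def marginal_incentive(merit, attempts, threshold=0.0, noise=1.0, eps=1e-4):
    """Central-difference derivative of eventual acceptance with respect to merit."""
    hi = eventual_accept_prob(merit + eps, attempts, threshold, noise)
    lo = eventual_accept_prob(merit - eps, attempts, threshold, noise)
    return (hi - lo) / (2.0 * eps)


def peak_merit(attempts, threshold=0.0, noise=1.0):
    """Merit level (on a grid of +/- 4 noise units) where marginal incentive is largest."""
    grid = [threshold + noise * (i / 100.0) for i in range(-400, 401)]
    return max(grid, key=lambda m: marginal_incentive(m, attempts, threshold, noise))


if __name__ == "__main__":
    for attempts in (1, 3, 5):
        for noise in (1.0, 2.0):
            peak = peak_merit(attempts, noise=noise)
            print(f"attempts={attempts} noise={noise}: peak incentive at merit {peak:+.2f}")
```

With a single attempt, the incentive curve in this sketch is a Gaussian centered on the accept threshold. With more attempts, and with more noise, the peak moves below the threshold, which matches the slide's point that the strongest incentive can be to resubmit rather than to improve the paper.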