Banal
Because Format Checking is So Trite

Geoffrey M. Voelker
University of California, San Diego
Workshop on Organizing Workshops, Conferences, and Symposia for Computer Systems (WOWCS’08)
April 15, 2008

This Talk is Not Very Interesting
  Banal is a format checker for PDF documents
    Deduces how a document was formatted
    Optionally compares it with a specification
  Intended for conference management systems
    Now being used in HotCRP and EDAS
  Seemed timely to document its genesis and implementation

Why?
  Preserving reviewer anonymity
  Assisting conference management tasks
    Ensuring anonymity rules
    Possibly helping do initial assignments by mining the bibliography
  Fairness
    Acrobat JavaScript that calls home when the PDF is loaded
    Everyone else obeyed the rules…
  Time
    Already enough time spent on reviewing
    Frustrated that abuse meant taking even more of my time

How?
  Convert PDF to XML (with pdftohtml)
  Track the locations of all segments of text, essentially forming bounding boxes
  Compute margins, columns, body font, etc.
  Heuristics for page numbers, headers, footers, etc.

Where?
  A handful of SIGOPS/SIGCOMM conferences
    OSDI’06, SIGCOMM’07, SIGCOMM’08
  Eddie Kohler has integrated it into HotCRP
  Henning Schulzrinne also integrated Banal with EDAS
  Since 2006, used for over 800 events

So?
  What are our community goals for having formatting requirements?
    Evil: Annoying trifles that negatively impact our ability to communicate our results and ideas?
    Helpful: Reflect practicalities of publishing costs and community time?
  Not surprisingly, I’m in the practical camp
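
The steps on the "How?" slide (convert to XML, collect bounding boxes, compute margins and body font) might be sketched roughly as follows. This is not Banal's actual code. It is a minimal illustration that assumes the XML produced by `pdftohtml -xml` has `page` elements with `width` and `height`, `fontspec` elements with `id` and `size`, and `text` elements with `top`, `left`, `width`, `height`, and `font` attributes. The function name `analyze`, the sample document, and the "most characters wins" rule for the body font are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample in the shape `pdftohtml -xml` is assumed to emit.
SAMPLE_XML = """<pdf2xml>
<page number="1" top="0" left="0" height="1100" width="850">
<fontspec id="0" size="10" family="Times" color="#000000"/>
<fontspec id="1" size="14" family="Times" color="#000000"/>
<text top="100" left="100" width="400" height="20" font="1">A Title</text>
<text top="140" left="100" width="650" height="12" font="0">First body line of text here.</text>
<text top="156" left="100" width="650" height="12" font="0">Second body line of text here.</text>
</page>
</pdf2xml>"""


def analyze(xml_text):
    """Return per-page margins and the dominant (body) font size."""
    root = ET.fromstring(xml_text)

    # Collect font sizes document-wide, since a page may reuse a
    # fontspec declared on an earlier page.
    font_sizes = {f.get("id"): float(f.get("size")) for f in root.iter("fontspec")}

    chars_per_size = Counter()
    pages = []
    for page in root.iter("page"):
        page_w = float(page.get("width"))
        page_h = float(page.get("height"))
        boxes = []
        for t in page.iter("text"):
            left, top = float(t.get("left")), float(t.get("top"))
            width, height = float(t.get("width")), float(t.get("height"))
            # Bounding box as (x0, y0, x1, y1) in the XML's own units.
            boxes.append((left, top, left + width, top + height))
            # Weight each font size by the number of characters set in it.
            size = font_sizes.get(t.get("font"))
            if size is not None:
                chars_per_size[size] += len("".join(t.itertext()))
        if not boxes:
            continue  # blank page: nothing to measure
        pages.append({
            "left": min(b[0] for b in boxes),
            "top": min(b[1] for b in boxes),
            "right": page_w - max(b[2] for b in boxes),
            "bottom": page_h - max(b[3] for b in boxes),
        })

    body = chars_per_size.most_common(1)[0][0] if chars_per_size else None
    return {"pages": pages, "body_font_size": body}


if __name__ == "__main__":
    print(analyze(SAMPLE_XML))
```

A real checker would also need the heuristics the slide mentions: as written, a page number or running header counts as ordinary text and would shrink the measured margins. Comparing the result with a specification could then be a matter of checking each measured value against a configured minimum.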