iOpener Workbench: Tools for Rapid Understanding of Scientific Literature
Cody Dunne, Ben Shneiderman, Bonnie Dorr & Judith Klavans
{cdunne, ben, bonnie}@cs.umd.edu, jklavans@umd.edu
27th Annual Human-Computer Interaction Lab Symposium
May 27-28, 2010, College Park, MD

iOpener Workbench Contribution
• Infrastructure for rapidly summarizing scientific endeavor
  – Integrate statistics, visualization, reference management, and automatic summarization
  – Multiple coordinated views

Use Cases
• Learn about new fields
• Understand how communities form
• Analyze citation patterns within communities
• Easily explore & export all papers in a community

What we integrate
• Potent network analysis tool – SocialAction
  – Citation network statistics & visualization
  – Automatic community detection & visualization
• Reference & document management – JabRef
  – Powerful reference manager with extensive features for search, grouping, review, annotation, and export
• Document view with citation linking & highlighting
• Automatically generated summaries
  – Citation text, keywords, abstracts

What can you do with a graph?
• Statistics, lists, and text are helpful, but
• Visualizations show unexpected trends, clusters, gaps, outliers
• Data cleaning & verification
• “Information visualization answers questions you didn't know you had” – Ben S.
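
Example sketch – Citation statistics rank table
The citation network statistics and rank lists mentioned above can be illustrated with the simplest such statistic, the number of times each paper is cited (its in-degree in the citation network). This is a minimal sketch under stated assumptions, not the workbench's or SocialAction's code: the function name citation_rank_table, the (citing, cited) pair format, and the paper ids are all hypothetical.

```python
from collections import Counter


def citation_rank_table(citations, top=3):
    """Rank papers by how often they are cited (in-degree).

    `citations` is an iterable of (citing_paper, cited_paper) pairs.
    This input format is an assumption made for the sketch.
    """
    in_degree = Counter(cited for _citing, cited in citations)
    # Sort by citation count (descending), then by paper id for a stable order.
    ranked = sorted(in_degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top]


# Hypothetical toy data: P4 cites P1, P2, and P3; P3 cites P1 and P2; P2 cites P1.
citations = [
    ("P4", "P1"), ("P3", "P1"), ("P2", "P1"),
    ("P4", "P2"), ("P3", "P2"),
    ("P4", "P3"),
]
table = citation_rank_table(citations)
for paper, count in table:
    print(f"{paper}\t{count}")
```

A table like this is a list view of the data; the point of the slide above is that a visualization of the same network can expose clusters, gaps, and outliers that such a list does not.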

Importance of Survey Articles
• Rapidly expanding disciplines
• Large volume of scientific publications
• Increasing cross-disciplinary research
• Need for accurate surveys of previous work
  – Short summaries
  – In-depth historical notes
• Multiple users
  – Scientists
  – Students & Educators
  – Government decision makers

iOPENER
• NSF Info Integration & Informatics program
• Information Organization for PENning Expositions on Research

Components
• Bibliometric lexical link mining
• Automatic summarization techniques
• Visualization tools for structure and content

Ongoing Work
• Increase preprocessing of citation texts to vastly improve comprehension of Trimmer summaries
• Preliminary case studies with UMD student domain experts
  – Dependency parsing subset of the ACL Anthology Network (AAN)

Coming Soon
• Multi-dimensional in-depth long-term case studies
  – Longitudinal case studies with domain experts using their data
  – Close participant observation
• Software & generated surveys publicly available and presented to academia and wider audiences

iOpener Workbench
• Infrastructure to aid rapid summarization of scientific literature
• Integrates
  – Statistics
  – Visualization
  – Reference management
  – Automatic summarization

iOpener Workbench: Tools for Rapid Understanding of Scientific Literature
Cody Dunne, Ben Shneiderman, Bonnie Dorr & Judith Klavans
{cdunne, ben, bonnie}@cs.umd.edu, jklavans@umd.edu
tangra.si.umich.edu/clair/iopener
This work has been partially supported by NSF grant "iOPENER: A Flexible Framework to Support Rapid Learning in Unfamiliar Research Domains", jointly awarded to UMD and UMich as IIS 0705832.

[Figure panels: Network Analysis | Reference Manager | Document & Citation View | Summarization]

Features – Network analysis
• SocialAction (Perer, Shneiderman)
• Citation network visualization
  – Force-directed placement (by linkages)
• Scatterplots of paper attributes & statistics
• Statistics rank tables
• Categorical and numerical range coloring
• Automatic community detection
  – Newman '04 fast heuristic

Features – Reference Manager
• Search by field with simple regex
  – abstract|keywords=nonprojective and year = 2008
• Grouping: automatic, search results, manual
• DOI/URL, fulltext (annotated PDF, plain text)
• Metadata, abstracts
• User-generated reviews
• BibTeX, Word, OpenOffice integration
• HTML, EndNote export

Features – Document view
• Citation links
• Highlighting

Features – Summarization
• Automatically generated summaries
  – Citation text, keywords, abstracts
• Working to substantially improve coherence & relevance
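
Example sketch – Community detection
The "Newman '04 fast heuristic" listed under the network analysis features is assumed here to mean greedy agglomerative modularity optimization: start with every paper in its own community and repeatedly merge the pair of linked communities with the largest modularity gain, ΔQ = 2(e_ij − a_i a_j). The sketch below is a small pure-Python illustration under that assumption, not SocialAction's implementation. The function name newman_fast_communities and the toy edge list are hypothetical, citation links are treated as undirected, the graph is assumed to have no self-loops, and merging stops at the first step that would not raise modularity.

```python
from collections import defaultdict


def newman_fast_communities(edges):
    """Greedy modularity merging on an undirected graph without self-loops.

    `edges` is a list of (u, v) pairs; node labels must be mutually comparable.
    Returns a list of sets of nodes, one set per detected community.
    """
    if not edges:
        return []
    m = len(edges)
    nodes = sorted({n for edge in edges for n in edge})
    # e[i][j]: fraction of edge ends that connect community i to community j.
    e = {n: defaultdict(float) for n in nodes}
    # a[i]: fraction of all edge ends attached to community i.
    a = defaultdict(float)
    for u, v in edges:
        e[u][v] += 1 / (2 * m)
        e[v][u] += 1 / (2 * m)
        a[u] += 1 / (2 * m)
        a[v] += 1 / (2 * m)
    members = {n: {n} for n in nodes}

    while True:
        # Find the connected pair of communities with the largest gain.
        best = None
        for i in e:
            for j, e_ij in e[i].items():
                if i < j:
                    dq = 2 * (e_ij - a[i] * a[j])
                    if best is None or dq > best[0]:
                        best = (dq, i, j)
        if best is None or best[0] <= 0:
            break  # no merge would increase modularity
        _, i, j = best
        # Merge community j into community i, rewiring j's neighbours to i.
        for k, w in e[j].items():
            if k == i:
                continue  # links between i and j become internal to i
            e[i][k] += w
            e[k][i] += w
            del e[k][j]
        e[i].pop(j, None)
        del e[j]
        a[i] += a[j]
        del a[j]
        members[i] |= members.pop(j)
    return list(members.values())


# Hypothetical toy network: two triangles joined by a single link (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
communities = newman_fast_communities(edges)
print(sorted(sorted(c) for c in communities))
```

On this toy network the sketch separates the two triangles. Each pass rescans all linked community pairs, which is fine for an illustration but slower than the bookkeeping a production tool would use on a full citation network.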