Inside Laboratory 2.0
Jeremy Frey
University of Southampton
The impact and influence of Web 2.0-based Services on e-Research

“The internet wasn't created for mockery! It was created so scientists from different universities could share datasets....”
Simpson, H. The Simpsons (2005), Eds. Groening, M., Brooks, J.L. & Simon, S., Series 16, Episode 8, Original air date (US) 06-Feb-2005.
http://www.tvtome.com/tvtome/servlet/GuidePageServlet/showid-146/epid-346864/

In a move to preserve civilization for future generations, the world’s leaders announced Monday their decision to have the accumulated knowledge of all areas of human endeavor written down. “Once this information is transferred to a permanent record on the printed page,” explained British Prime Minister John Major via satellite, “the children of tomorrow will be able to access our knowledge by visually scanning or ‘reading’ the information similar to the way you see me speaking on your television.”

Data Generation in Chemistry
Characterisation
Synthesis

Data Deluge
40 years ago a PhD student would determine about 3 crystal structures for their thesis – this can now be easily achieved in a day
[Figure labels: 0.5 million, 35 million, 2.5 million, ‘Few thousand’]
The primary cause is the current data publication process, which is tied to journal articles and peer review

Data Publication & Information Loss
Spectroscopic analysis is often performed to ensure a reaction is proceeding according to plan – as a result <5% are published (via a process with heavy information loss)
http://www.theonion.com/content/node/28104

The Solution
Intellect & Interpretation (Journal article, report, etc.)
Underlying data (Institutional data repository)

Core of self-describing data
• Store of data that can be viewed and manipulated in different ways
• User interfaces to suit user and occasion

He is charged with expressing contempt for meta-data

Late 20th Century Labs
Experience – Grey Literature
People, Equipment, Lab Records, Environment, Literature,
Computation

Late 20th Century Labs
Experience – Grey Literature
[Diagram labels: Lab Records, People, Equipment, Environment, Computation, Literature; ‘Computer’ repeated seven times]

21st Century Labs
Experience – Grey Literature
[Diagram labels: Lab Records, People, Environment, Computation, ELN, Blog, MyExpt, Equipment, Literature; ‘Computer’ repeated seven times]
Blog, Wiki, Repositories, User interfaces, Time Line, Planning, Information, Data + Metadata, Virtual World, Social Networks, Google Wave, “MyExperiment”

Faraday’s laboratory notebooks are also remarkable in the amount of detail that they give about the design and setting up of experiments, interspersed with comments about their outcome and thoughts of a more philosophical kind. All are couched in plain language, with many vivid phrases of delightful spontaneity….
Peter Day, ‘The Philosopher’s Tree: A Selection of Michael Faraday’s Writings’

Observations are never collected on note pads, filter paper or other temporary paper for later transfer into a notebook.
If you are caught using the “scrap of paper” technique, your improperly recorded data may be confiscated by your TA.

Literature
[Diagram labels: Formal Literature; Grey Literature; Databases; Repositories; Lab Blog; ELN; Experiments (repeated)]

Middleware
• Middleware is the connecting software between separated components
• Message brokering system used
• Increases scalability and interoperability
• Without middleware
• With middleware
[Figure label: meta]

Analysis & Discussion: Blogging Experiments
A repository…
• Allows one to put, store and get
• Provides search and browse
• DOES NOT provide presentation and discussion functions essential to working up a scientific study
• ‘Geographically distributed collaborative research’
• Open or private
• A useful approach for sharing ‘failed’ experiments?
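The middleware and repository slides above can be illustrated with a minimal sketch. It is not the system used in the project: the class names `MessageBroker` and `Repository`, the topic name `measurement`, and the message fields are all hypothetical, chosen only to show how a message broker decouples an instrument from the services that consume its data, and how a repository's put, get and search differ from the discussion layer a blog provides.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class MessageBroker:
    """Toy in-memory broker (hypothetical): parties share only topic names."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Dict[str, Any]) -> int:
        """Deliver the message to every handler on the topic; return the count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


class Repository:
    """Toy repository (hypothetical) offering put, get and search only."""

    def __init__(self) -> None:
        self._items: Dict[str, Dict[str, Any]] = {}

    def put(self, message: Dict[str, Any]) -> None:
        self._items[message["id"]] = message

    def get(self, item_id: str) -> Dict[str, Any]:
        return self._items[item_id]

    def search(self, **criteria: Any) -> List[Dict[str, Any]]:
        # Return every stored item whose fields match all given criteria.
        return [
            item for item in self._items.values()
            if all(item.get(key) == value for key, value in criteria.items())
        ]


def direct_links(producers: int, consumers: int) -> int:
    """Connections needed without middleware: every producer to every consumer."""
    return producers * consumers


def brokered_links(producers: int, consumers: int) -> int:
    """Connections needed with middleware: each party connects to the broker once."""
    return producers + consumers


broker = MessageBroker()
repository = Repository()
blog_posts: List[str] = []

# The repository stores the data; the blog adds the presentation layer.
broker.subscribe("measurement", repository.put)
broker.subscribe(
    "measurement",
    lambda m: blog_posts.append(f"New result {m['id']} from {m['instrument']}"),
)

delivered = broker.publish(
    "measurement",
    {"id": "exp-001", "instrument": "spectrometer", "value": 1.23},
)
```

In this sketch the instrument never references the repository or the blog directly, so a new consumer can be added with one `subscribe` call. The two link-counting helpers show the scalability point: with 5 producers and 4 consumers, direct wiring needs 20 connections while a broker needs 9.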
http://chemtools.chem.soton.ac.uk/projects/blog/

Blogging Innovations
[Diagram labels: Machines, Sensors, Tools, Time Line View, Journal publication, Test, Data Management, Grant Applications, Analysis / Conference reports, RESULTS!, Online]

blog3 Stack
[Diagram labels: User Directory, Cell, Blog Management, Simple Experiment Ontology, Link with OREChem]

Validation
• Increasing the value of data
• How to bring all the necessary information together to enable appropriate validation
• Increasingly difficult & expensive to achieve
Need provenance and context, otherwise just a collection of items

oreChem – The Chemical Semantic Web
• At-source capture of chemistry data
• Chemical structure search
• Compound object authoring
• Retrospective harvesting of chemistry data
• Reuse through common ORE data model
• Semantic authoring
• Virtualized triple storage

• University of Cambridge
• Cornell University
• Indiana University
• Penn State University
• University of Queensland
• University of Southampton

[Diagram: Mash-up (reuse); Semantic Graph (storage); Data (capture); labels: experiments, text, measurements, documents, data, scientists, molecules]

The Circle of Attribution
[Diagram labels: Store, Link]

Confusion of Digital Reality
http://www.theonion.com/content/node/29147
Awareness of symbols and meanings in user interfaces
We need a Semiotic Web, or at least a better understanding of the semiotics of the web

Thanks
• RC UK, EPSRC, JISC for funding
• Colleagues and Students from the Schools of Chemistry, Electronics & Computer Science, Mathematics
• IBM, Microsoft
• www.combechem.org
• www.ecrystals.soton.ac.uk
• chemtools.chem.soton.ac.uk