Community Data Evaluation using a Semantically Enhanced Modelling Process

Mohammed Haji 1, Peter Dew 1, Chris Martin 1,2
• 1 School of Computing, University of Leeds
• 2 School of Chemistry, University of Leeds
e-mail: mhh@comp.leeds.ac.uk

Content
• Community Data Evaluation using a Semantically Enhanced Modelling Process
• Capturing Provenance and Data
• Current practices and the Electronic Lab Notebook
• Evaluation
• Conclusion

Community Data Evaluation
• Progress in many scientific communities depends on complementary experimental and theoretical development.
• These communities require high quality data to evaluate findings.
  – Our primary community is the Atmospheric Community.
• The Motivation
  – Study how to transition from today's ad-hoc process practices
  – Sustainable process of
    • Gathering, community evaluation and sharing of data & models between scientists
    • Minimising changes to the proven working practices of the scientist
    • Operating within world-wide co-laboratories

Capturing Provenance Data
• Provenance is captured in three forms: pre-hoc (before the experiment), inline (during the experiment execution) and post-hoc (after the experiment).
• Broadly speaking, there are two categories of approach for capturing provenance data in e-Science projects:
  • System oriented: these are usually tightly coupled with the workflow paradigm and seek to capture provenance automatically.
  • User oriented: these adopt key practices from the scientific approach and use domain-specific scientific terminologies.
• In this research we seek to develop a user-oriented approach and reconcile it with the system orientation to automate process provenance capture, specifically capturing inline annotation.

Current Evaluation Processes for the MCM
[Diagram labels: Data sources; Links to experimental data; Evaluation Work Group; Community Database; Lab-Book Capture of the Model Development Provenance; Inputs to the modelling process: benchmark data, model parameter sets, etc.]
Envisioned Evaluation Processes
[Diagram labels: Data sources; Links to experimental data and provenance generation processes; SeMEEP; Community Evaluation; Subjective; Community Semantic Database; Laboratory Archive; Workgroup database; Semantic-enabled ELN; ELN Capture of the Model Development Provenance; Model Development; Model Execution; Analysis; Inputs to the modelling process: benchmark data, model parameter sets, etc.]

ELN Process
[Diagram labels: Scientist’s Personal ELN Archive; Planning the Scientific Process (Modelling Plan, Ontology); Scientific Process (Mechanism Editing, Model Execution, Model Output Analysis); Automatic Metadata Capture (Mechanism version n-1, Mechanism version n, Compare to generate metadata, Capture Metadata at run time); User Annotation; Metadata Storage]

ELN Screenshots
• Prompts displayed when changing the chemical mechanism:
  – Editing a reaction
  – Adding a new reaction

Evaluation Methodology
• In-depth interviews with members of the atmospheric chemistry model group at Leeds, covering:
  – Demonstration of the prototype
  – User testing of the prototype
  – Discussion of scenarios involving the use of the prototype
• Analysis
  – Interviews recorded and transcribed
  – Analysed using techniques from grounded theory

Evaluation
• Barriers to adoption:
  – Effort required at modelling time for provenance capture
    • “[in] your lab book you can write down what ever you want [but with an ELN] it is going to take time to go through the different protocol steps”.
  – When asked if they would use an ELN requiring a similar amount of user input to the prototype, the response was positive:
    • “Yeah, I think it would be a good thing.
I don’t think it is too much extra … work.”
  – Rather than viewing the prompts for user annotation as an interruption to their normal work, the user recognised the value of being prompted:
    • “is a good way to do it because otherwise you won’t [record the provenance].”

Conclusion
• Outlined the Community Data Evaluation using a Semantically Enhanced Modelling Process and the ELN.
• The work is focused on a user-oriented approach using domain-specific scientific terminologies.
• Showed the community evaluation vision.
• Discussed the ELN evaluation method.
• Future work
  – Carry out further investigation into the atmospheric chemistry community.
  – Look into other communities that would benefit from this work, such as Geomagnetism.

Acknowledgement: Peter Jimack, David Allen and Mike Pilling
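
Illustrative sketch: the ELN process slide labels one step "Compare to generate metadata", in which mechanism version n-1 is compared with version n and the scientist is prompted to annotate each change. The slides do not give the ELN implementation, so the Python below is only a guess at the general shape of that step. The `ChangeRecord` type, the `compare_mechanisms` function, the dictionary representation of a mechanism and the placeholder reactions and rate labels are all assumptions made for illustration, not part of the prototype.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ChangeRecord:
    """One inline provenance entry produced by comparing two mechanism versions."""
    reaction: str             # reaction identifier (hypothetical format)
    change: str               # "added", "removed" or "edited"
    old_rate: Optional[str]   # rate label in version n-1, if the reaction existed
    new_rate: Optional[str]   # rate label in version n, if the reaction exists
    annotation: str = ""      # would be filled in by the scientist when prompted


def compare_mechanisms(previous: Dict[str, str],
                       current: Dict[str, str]) -> List[ChangeRecord]:
    """Return one record per reaction that differs between the two versions."""
    records: List[ChangeRecord] = []
    # Visit every reaction that appears in either version, in a stable order.
    for reaction in sorted(previous.keys() | current.keys()):
        old = previous.get(reaction)
        new = current.get(reaction)
        if old is None:
            records.append(ChangeRecord(reaction, "added", None, new))
        elif new is None:
            records.append(ChangeRecord(reaction, "removed", old, None))
        elif old != new:
            records.append(ChangeRecord(reaction, "edited", old, new))
    return records


# Placeholder mechanisms; the reactions and rate labels are invented.
version_n_minus_1 = {"A + B -> C": "k1", "C -> D": "k2"}
version_n = {"A + B -> C": "k1_revised", "D + E -> F": "k3"}

for record in compare_mechanisms(version_n_minus_1, version_n):
    # Each record is a point where an ELN could prompt for a user annotation.
    print(record.change, record.reaction)
```

Unchanged reactions produce no record, so in a design like this the scientist would only be prompted where the mechanism actually changed, which keeps the annotation effort at modelling time small.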