Environment from the Molecular Level: An e-science project for modelling the atomistic processes involved in environmental issues (funded by NERC)

Molecular Environmental Issues
• Radioactive waste disposal
• Pollution: molecules and atoms on mineral surfaces
• Crystal growth and scale inhibition
• Crystal dissolution and weathering
Rocks and Mineral Structures

The “Grand Challenge”
Requires scientists to work together in teams - a Virtual Organisation
[Figure with three axes.
Contaminant: organic molecules, metallic elements, halogens.
Adsorbing surface: sulphides, oxides/hydroxides, phosphates, carbonates, aluminosilicates, clays and micas, natural organic matter.
Level of theory: large empirical models, linear-scaling quantum mechanics, Quantum Monte Carlo.]

Design
Approach taken:
– Over approximately 3 years we have run many workshops, tutorials and prototyping sessions with developers and users, teaching users what e-Science can “do for them”, including security.
– Planned to integrate some tools which had already been developed or prototyped at CCLRC, UCL and Reading.
– Cooperation between CCLRC and NIEeS in Cambridge.
• A service-oriented approach is used for certain aspects: Grid, data management, user interfaces, metadata management.
• Workflow was found to be important to users, e.g. for combinatorial studies.
• Several iterations of the software have enabled some usability issues to be addressed.
• Originally envisaged an “Integrated Portal Architecture” linking HPCPortal, DataPortal and visualisation services.
• We thought we knew what users would like, but they actually preferred a simpler, incremental approach; workflow scripting was preferred to a single portal. There are now several separate tools in use.
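
As an illustration of why workflow scripting suits combinatorial studies, here is a minimal sketch that generates a Condor DAGMan input file with one job per contaminant/surface combination and a final collection job. The axis values, the submit file names (simulate.sub, collect.sub) and the function name build_dag are assumptions made for this example; they are not taken from the e-Minerals scripts.

```python
from itertools import product

# Hypothetical axes for a combinatorial study; the values are illustrative only.
CONTAMINANTS = ["organic", "metal", "halogen"]
SURFACES = ["carbonate", "clay"]


def build_dag(contaminants, surfaces, submit_file="simulate.sub"):
    """Return DAGMan text with one job per (contaminant, surface) pair
    and a final 'collect' job that runs after all of them."""
    lines = []
    job_names = []
    for i, (contaminant, surface) in enumerate(product(contaminants, surfaces)):
        name = f"run{i}"
        job_names.append(name)
        # JOB declares a node; VARS passes per-node macros to the submit file.
        lines.append(f"JOB {name} {submit_file}")
        lines.append(f'VARS {name} contaminant="{contaminant}" surface="{surface}"')
    # The collection node (hypothetical collect.sub) depends on every run.
    lines.append("JOB collect collect.sub")
    lines.append(f"PARENT {' '.join(job_names)} CHILD collect")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_dag(CONTAMINANTS, SURFACES), end="")
```

The submit file would read the macros as $(contaminant) and $(surface), so adding a value to either list extends the whole study without editing any job by hand.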

E-Minerals Portal Technical Strategy
• Technology considerations:
– Considered: Globus GT2, SRB, Harness, CCF, Portal, Web services, visualisation tools
– Various tool sets were tried and the users “voted with their feet”
– Used: Globus, Condor, SRB, AG, MAST, RCommands, Metadata Editor, workflow scripts, Web services, XML/RDF/OWL for data interoperability.
• Infrastructure
– The e-Minerals “mini-Grid” was a great success, based on earlier work at Daresbury and Manchester on Grid evaluation. The mini-Grid focuses the resources of the e-Minerals VO and includes large campus Condor pools and parallel computers, using Globus, Condor and GSI. Data is managed using SRB.
• Collaboration tools
– Access Grid, MAST, Wiki

Integrated Portal Architecture
Generic portal design using Globus and Web Services:
[Diagram labels: GSI, Data Systems, DataPortal, Web Services, GridFTP, HPCPortal, Visualisation, HPC Systems, Globus]
Working with GGF Grid Computing Environments Research Group

Development Issues
• Constraints and other issues:
– Project divided from the outset into:
• development team;
• application team;
• science team.
– All teams work together and collaborate on papers
– Tools written in C to integrate with existing “heritage” applications, e.g. from the Collaborative Computational Projects (CCPs)
– Other interoperability issues addressed using Web services, e.g. gSOAP (client) + AXIS (server), XML-based data models and Semantic Grid technologies (RDF + OWL)
– Constraints: short-term goals, no prior experience of e-Science, new technology must not disrupt current work.
– High requirements on computing resources for simulation studies
• This led to a focus on workflows for repeated calculations, data management for storing and retrieving results, and semantic Web technologies for data interoperability between codes

Evaluation
• Papers presented at All Hands 2005 included:
– E-Science Usability: the e-Minerals Experience (paper 425)
– The e-Minerals Project: Developing the Concept of the Virtual Organisation to support Collaborative Work on Molecular-scale Environmental Simulations (paper 518)
• User engagement and evaluation:
– Looked at the Usability Task Force metrics.
– Our approach did not readily map onto them, but there are overlaps.
– Key: understand the science users, their needs, and their natural ways of working.
– Good and bad points are summarised on the next slides.

Lessons Learnt
What was usable?
– Keep it simple – use effective lightweight tools for the job
– Condor and Globus – Condor job scripts were accepted readily. Condor-G and DAGMan are now used. RSL is also embedded in scripts.
– SRB – required little training and was found to be useful; SCommands are used in scripts.
– Resource management – a Globus-based resource-monitoring tool was developed (in the Portal). A meta-scheduler is being developed.
– Security – GSI proved “easy for users to work with”. The Portal uses MyProxy to ensure pervasive access. Certificates were not a problem – we offered training from Day 1.
– Collaboration tools – desktop use of AG enables ad hoc meetings, together with MAST (Multi-cast Application Sharing Tool). Wiki and Instant Messaging are also used.
– Semantic technologies – CML was initially used with XSLT and SVG. This is now extended in the AgentX toolkit.

Lessons Learnt
What was not usable?
– Client tools * – installation has caused difficulties, e.g. Globus. Initially used “submit machines”.
Solutions investigated include:
• Portal – hides the complexity behind a Web interface; the user doesn’t install anything;
• Web service interfaces – for Condor (Chapman et al.), GROWL for Globus and SRB (Allan et al.);
• BPEL interface – work at UCL/OMII – plug-in for Eclipse.
– Firewall issues – for both users and infrastructure – changes to rules lead to instability. Portal and Web services solve this problem for users.
– Meta-data – tools are available, but automatic harvesting is required to avoid mistakes. RCommands were developed to improve this and can be linked into the workflow scripts.
* A recent workshop, “Lightweight Grid Computing”, was held 2-3/5/06 at Losehill Hall. Attendees came from GROWL, RealityGrid, Imperial College, e-Minerals, e-CCP… A transcript of the discussions on usability issues is available, giving more detailed information.

Future Plans
Current and future development plans:
– New tools are being developed; for instance, the meta-data editor and RCommands were recently added to the suite.
– AgentX data-interoperability tools have been added from e-CCP, extending the use of CML. Such work is now timely and illustrates how existing large codes, e.g. Siesta and GULP from CCP5, can be integrated easily with visualisation tools.
– Development staff also work on other projects and with other developers. E-Minerals tools are now being evaluated in other areas, e.g. Integrative Biology and e-CCP. There are key synergies and critical mass, with sharing of experiences and code/services.
– Full integration via a portal interface was not initially wanted, and also could not be achieved at the start of the project as the technology was not adequate (we tried PHP, now have JSR-168). This is now being re-visited as it provides a good solution to many of the problems highlighted.
– Portlet-based tools from the NGS Portal can be re-used; this has already been done for Integrative Biology and other projects. They can be combined with Wiki etc.
Blatant advert: Portals and Portlets 2006 http://www.nesc.ac.uk/esi/events/686/

The following slides show more details of some of the tools.

AgentX Framework - Overview
[Diagram: Ontology → Mappings → Data. The ontology concepts MOLECULE, ATOM and xCoordinate are each linked through a locator (“Mol_frag_id”, “Atom_frag_id”, “xCoor_frag_id”) to the data:
O 0.000 0.000 0.000
H 0.000 0.757 0.587
H 0.000 -0.757 0.587]
• Specify how to locate data (XML, CML, XLink) with a particular meaning
• Applications can use tools (AgentX library) that work with the specification to obtain information
• Classes and properties of entities are specified in an ontology (OWL, RDF/XML)
• Mappings (RDF/XML) associate classes and properties with fragment identifiers (XPointer)
• Fragment identifiers can be used to locate logical collections (classes) and data items (properties)

AgentX Framework - Example
DL_POLY3 (CCP5) integrated with CCP1 GUI
[Diagram labels: CONTROL, CONFIG.xml, DL_POLY3, REVCON.xml, Mappings, AgentX, CCP1 GUI; AgentX core with Fortran wrapper and Python wrapper, Standard Ontology, Standard Mappings]
- Core library written in C
- Wrappers for Python, Perl and Fortran
- Hides the complexities of dealing with XML
- Simple API
- Enables straightforward exchange of information

RCommands
• RCommands are shell tools and associated Web services for metadata manipulation
• The primary use case for RCommands is within the e-Minerals workflow, i.e. to allow automatic insertion of metadata as a post-processing action

Function Domain              RCommand
Authentication / Session     Rinit, Rexit, Rpasswd
Entity Operations            Rls, Rcreate, Rrm
Parameter Operations         Rannotate
Permissions                  Rchmod
(not stated)                 Rsearch

RCommands Service-based Arch
[Diagram: Client Side: RCommands, BPEL Engine, gSOAP. SOAP between client and server. Server Side: Axis, RCommand Server Code, JDBC, Relational Database. Link into workflows.]

Subset of Schema
• Title
• Description
• Notes
• Start / End Dates
• Originator
• Name
• Description
Name Value Pairs
• Name
• URI

University of Reading
Royal Institution
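
The mapping idea on the AgentX overview slide (concepts from an ontology associated with locators that find the matching data in an XML document) can be illustrated with a minimal sketch. This is not the AgentX API: the dictionary stands in for the RDF/XML mappings, ElementTree path expressions stand in for XPointer fragment identifiers, and the XML element and attribute names, the MAPPINGS table and the functions find_values and atom_table are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative data document. The element and attribute names are invented
# for this sketch; they are not CML or any AgentX format.
DATA = """
<molecule id="water">
  <atom element="O" x="0.000" y="0.000" z="0.000"/>
  <atom element="H" x="0.000" y="0.757" z="0.587"/>
  <atom element="H" x="0.000" y="-0.757" z="0.587"/>
</molecule>
"""

# Hypothetical mappings: concept name -> (path to elements, attribute or None).
# A different XML layout would need only a different table, not new code.
MAPPINGS = {
    "Atom": ("atom", None),
    "elementType": ("atom", "element"),
    "xCoordinate": ("atom", "x"),
    "yCoordinate": ("atom", "y"),
    "zCoordinate": ("atom", "z"),
}


def find_values(root, concept, mappings=MAPPINGS):
    """Return the elements (for a class) or attribute values (for a property)
    that the mapping associates with the given concept."""
    path, attribute = mappings[concept]
    elements = root.findall(path)
    if attribute is None:
        return elements
    return [element.get(attribute) for element in elements]


def atom_table(root):
    """Collect (symbol, x, y, z) tuples using only concept names."""
    symbols = find_values(root, "elementType")
    xs = find_values(root, "xCoordinate")
    ys = find_values(root, "yCoordinate")
    zs = find_values(root, "zCoordinate")
    return [(s, float(x), float(y), float(z))
            for s, x, y, z in zip(symbols, xs, ys, zs)]


if __name__ == "__main__":
    root = ET.fromstring(DATA.strip())
    for row in atom_table(root):
        print(row)
```

The calling code asks for concepts such as xCoordinate and never touches the document layout directly, which is the separation the slide describes between applications, mappings and data.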