EPSRC e-Science Pilot Projects

Comb-e-Chem
Structure-Property Mapping: Combinatorial Chemistry and the Grid

The synthesis of new compounds by combinatorial methods provides major opportunities for the generation of large volumes of new chemical knowledge. An extensive range of primary data needs to be accumulated and integrated, and relationships modelled, so that maximum knowledge can be derived.

The goal of the Comb-e-Chem project is to develop an e-science testbed that integrates existing structure and property data sources within a grid-based information- and knowledge-sharing environment. The service-based grid computing infrastructure extends to devices in the laboratory and involves enriched streams (including multimedia and live metadata), full support for provenance and innovative techniques for automation throughout the environment.

Comb-e-Chem Objectives:
• to support new data collection, including process as well as product data, based on integration with electronic lab and e-logbook facilities
• to integrate data generation on demand via grid-based quantum and simulation modelling to augment the experimental data
• to develop interfaces that provide a unified view of these resources, with transparent access to data retrieval, online modelling, and design of experiments to populate new regions of scientific interest
• to provide shared, secure access to these resources in a collaborative e-science environment.

People: Dr Jeremy Frey, Prof. Dave de Roure, Dr. Jonathan Essex, Prof. Mike Hursthouse, Dr. Mike Luck, Dr. Luc Moreau, Prof. Alan Welsh, Prof. Sue Lewis, Dr. Mike Surridge, Dr. Ian Meacham, Prof. Guy Orpen
www.e-science.soton.ac.uk/projects.html

A collaboration between: University of Southampton; University of Bristol; Roche Discovery Welwyn; Pfizer; IBM UK Ltd; Cambridge Crystallographic Data Centre; Southampton Combinatorial Centre of Excellence

DAME: Distributed Aircraft Maintenance Environment

The aim of this project is to demonstrate the use of the high-performance computing grid to support diagnosis using geographically distributed information and analysis. The practical, real-world demonstration will help Rolls-Royce with aero engine maintenance decisions, and will help other sectors, such as medicine and manufacturing, to improve their diagnostic processes.

The Universities of York, Sheffield, Leeds and Oxford have joined forces with Rolls-Royce, its information system partner Data Systems & Solutions, and Cybula Limited to meet this challenge.

The project will deliver:
• a proof of concept demonstrator for the Grid
• a generic distributed diagnostics test-bed
• an aero gas turbine application demonstrator for the maintenance of aircraft engines
• techniques for distributed data mining and diagnostics
• an evaluation of the existing US grid networks Globus and SRB for this task

DAME will build on the grid infrastructure, known as the White Rose Computational Grid, currently under construction by Leeds, Sheffield and York universities at a cost of £2.8 million.

The essential themes of this project are real-time intelligent feature extraction, high-performance pattern-matching, intelligent data mining and decision support techniques, where expertise and software tools are distributed across the grid. The sheer size of the databases and the need for distributed access to the data make this a particularly challenging problem for the grid.

People: Prof. Jim Austin, Prof. John McDermid, Prof. Andy Wellings, Prof. Lionel Tarassenko, Prof. Peter Fleming, Prof. Peter Dew, Dr. Alison Mckay, Dr. Haydn Thompson, Dr. Karim Djemame
www.cs.york.ac.uk/dame

A collaboration between: Rolls-Royce plc; Data Systems & Solutions; Cybula Limited

GEODISE: Grid Enabled Optimisation and DesIgn Search for Engineering

GEODISE will provide grid-based seamless access to an intelligent knowledge repository, a state-of-the-art collection of optimisation and search tools, industrial strength analysis codes, and distributed computing and data resources.

Engineering design search and optimisation is the process whereby engineering modelling and analysis are exploited to yield improved designs. In the next 2-5 years intelligent search tools will become a vital component of all engineering design systems and will steer the user through the process of setting up, executing and post-processing design search and optimisation activities. Such systems typically require large-scale distributed simulations to be coupled with tools to describe and modify designs using information from a knowledge base. These tools are usually physically distributed and under the control of multiple elements in the supply chain. Whilst evaluation of a single design may require the analysis of gigabytes of data, improving the process of design can require the assimilation of terabytes of distributed data. Achieving the latter goal will lead to the development of intelligent search tools.

GEODISE will focus on the use of computational fluid dynamics (CFD). This application is relevant to its existing industrial partners (BAE Systems, Rolls-Royce and Fluent) and will leverage expertise from, e.g., the Advanced Knowledge Technologies IRC (Soton), the BAE/RR UTP for Design (Soton) and the RR UTC for CFD (Oxford).

People: Prof. Simon Cox, Prof. Andy Keane, Prof. Carole Goble, Prof. Nigel Shadbolt, Prof. Mike Giles

www.geodise.org/

A collaboration between: University of Southampton; University of Oxford; University of Manchester; Rolls-Royce plc; BAE Systems plc; Fluent Europe Ltd; Intel Corp (UK); Microsoft Ltd; Epistemics Ltd; Compusys plc; Condor

myGrid: An e-Biologist’s Workbench

Led by the University of Manchester, myGrid is a consortium of five universities, the EMBL-EBI at Hinxton and eight commercial partners. The team is divided into end-users and technology/service providers.

myGrid aims to deliver a personalised collaborative problem-solving platform for an e-Scientist working in a distributed environment, such that they can construct long-lived in silico experiments, find and adapt others', publish their own view on public repositories, and be better informed as to the provenance and the currency of the tools and data directly relevant to them. The focus is on data-intensive post-genomic functional analysis.

myGrid will develop an extensible open platform for data and tools interoperability built using a mix of four technologies: the Grid, Web Services, the Semantic Web and an agent software engineering paradigm. Key functional features include: data integration, process workflow, personalisation, provenance, change notification and view management, and collaborative sharing of process flows and resources. Non-functional requirements include security and fault tolerance. The ultimate goal is to improve both the quality of information in repositories and the way repositories are used.

The appropriateness of the infrastructure will be shown in two ways:
• for the e-Scientists: by a workbench and two applications
  - Model organism gene expression analysis
  - GPCR fingerprints database annotation
• for developers: by the dissemination of a “myGrid-in-a-box” developers kit
  - the specification of services
  - service descriptors
  - APIs and message protocols
  - implemented pilot services and the assimilation of example existing Life Science integration platforms.

The project approach is incremental and evolutionary, based on a series of prototypes and using open standards and open source. An exploratory “pre-prototype” to validate use-case acquisition and identify core services has just been completed. The final results will be disseminated on a rolling programme starting in June 2003. All software will be available as Open Source.

People: Prof. Carole Goble, Dr. Paul Watson, Prof. Tom Rodden, Dr. Luc Moreau, Dr. Rob Gazaiskaus, Dr. Alan Robinson

www.mygrid.org.uk/

A collaboration between: University of Manchester; University of Newcastle; University of Nottingham; University of Southampton; University of Sheffield; European Bioinformatics Institute; AstraZeneca; GlaxoSmithKline; Merck KGaA; Sun Microsystems; Network Inference; Epistemics Ltd; GeneticXchange; IBM UK Limited

The RealityGrid
A Tool for Investigating Condensed Matter and Materials

RealityGrid will construct a Grid test-bed to enable the realistic modelling and simulation of complex condensed matter systems at the meso and nanoscale levels, as well as the discovery of new materials. High performance computing and visualisation are critical to this test-bed: they provide a synthetic environment for modelling to be compared and integrated with the reality provided by experimental data. RealityGrid will provide Grid hardware and middleware that will allow these to be coupled in an environment optimised for scientific discovery.

The project involves active collaboration with industry: Advanced Visual Systems, Silicon Graphics Inc and Fujitsu on the underpinning computational issues, and Schlumberger and the Edward Jenner Institute for Vaccine Research on end-user scientific applications at the conjunction of modelling, simulation, informatics and experimental research.

People: Prof. Peter Coveney, Dr. John Brooke, Prof. John Darlington, Prof. Roy Kalawasky, Prof. Adrian Sutton, Prof. John Gurd, Prof. Michael Cates

www.realitygrid.org

A collaboration between: Queen Mary, University of London; University of Manchester; University of Edinburgh; Imperial College of Science, Technology and Medicine; Loughborough University; University of Oxford; Schlumberger Cambridge Research Ltd.; The Edward Jenner Institute for Vaccine Research; Silicon Graphics Inc; Advanced Visual Systems Ltd.; Fujitsu Ltd; Computation for Science

Discovery Net: An e-Science Testbed for High Throughput Informatics

The DNet project aims to design, develop and implement an advanced infrastructure to support real-time processing, interpretation, integration, visualisation and mining of massive amounts of time-critical data generated by high throughput devices. The project will maximize the benefit of testing EPSRC-funded infrastructure and will cover new devices and technologies, including biochips in biology, high throughput screening technology in biochemistry and combinatorial chemistry, and high throughput sensors in energy and environmental science, remote sensing and geology. Application studies include analysis of Protein Folding Chips and SNP Chips using LFII technology, protein-based fluorescent micro array data, air sensing data, renewable energy data, and geohazard prediction data.

The development program of DNet will focus on the design and implementation of four important components: grid infrastructure, data engineering, information structuring and knowledge discovery. Each component provides mechanisms to deal with the issues of high throughput informatics.

Apart from delivering a practical distributed discovery platform, DNet will focus on the establishment of a set of standards for representing and communicating high throughput information for integrated research. Such standards will be promoted by establishing international collaborations in DNet research and integrating DNet with data grid activities and related distributed data analysis research in the USA.

People: Dr. Yike Guo, Prof. John Darlington, Dr. Daniel Ruekert, Dr Tony Cass, Dr. John Hassard, Prof. Bob Spence, Dr. Jian Liu, Dr. Moustafa Ghanem

A collaboration between: Imperial College of Science, Technology and Medicine; Inforsense Ltd; DeltaDOT Ltd; RVCo Inc