NeSC and eSI
Dave Berry, Research Manager
PRISM Forum, 28th April 2005

NeSC and eSI roles
• NeSC – the national centre
  – International gateway to UK e-Science
  – UK and EU Training
  – Standardisation work
• eSI – the international research institute
  – Conferences and workshops
  – Research visitors
• e-Science research
  – Digital Curation Centre
  – Middleware (OGSA-DAI, SunDCG, edikt, …)
  – Science applications (GridQTL, BRIDGES, …)
• Industrial projects
  – Engage industry
  – Stimulate the uptake of e-Science technology

UK e-Science Budget (2001-2006)
Total: £213M + £100M via JISC

Research council | Amount | Share
MRC              | £21.1M | 10%
EPSRC            | £77.7M | 37%
BBSRC            | £18M   | 8%
NERC             | £15M   | 7%
PPARC            | £57.6M | 27%
CLRC             | £10M   | 5%
ESRC             | £13.6M | 6%

EPSRC Breakdown (staff costs only; Grid resources, computers and network funded separately)

Category | Amount | Share
Applied  | £35M   | 45%
HPC      | £11.5M | 15%
Core     | £31.2M | 40%

+ Industrial Contributions £25M
Source: Science Budget 2003/4 – 2005/6, DTI(OST)

The e-Science Centres
• Globus Alliance
• National Centre for e-Social Science
• Open Middleware Infrastructure Institute
• e-Science Institute
• Grid Operations Support Centre
• Digital Curation Centre
• CeSC (Cambridge)
• EGEE, ChinaGrid
• National Institute for Environmental e-Science

National Grid Service
• NGS core nodes
  – data nodes at RAL and Manchester
  – compute nodes at Oxford and Leeds
  – free at point of use
  – apply through NGS web site
  – to do: project or VO-based application and registration
  – all access is through digital X.509 certificates from UK e-Science CA or recognized peer
• National HPC services
  – HPCx (EPCC)
  – CSAR (Manchester)
  – Must apply separately to research councils

LCG-2/EGEE-0 Status 08-11-2004
[Map of sites]
Total: 87 sites, 8784 CPUs, 3 PByte

The Primary Requirement …
Building people grids
Enabling People to Work Together on Challenging Projects: Science, Engineering & Medicine

Recent meetings
• PRISM forum
• EPSRC e-Science review
• UK Hacklatt Workshop (Lattice QCD)
• UK Globus week
• Introduction to the NGS
• The Accessibility e-Olympics
• 5th Annual Dependability IRC workshop
• Gridsphere and portlets workshop
• Introduction to the Edinburgh Mouse Atlas and EMAGE Gene Expression Database

NeSC Training Team
Enabling, facilitating and delivering quality training in the UK and internationally
• Formed in April 2004
• Grown from initial two staff to:
  – five dedicated trainers
  – two developers
  – one dissemination officer
• Funded by EU and UK
• Supports EGEE and the NGS

Example: GGF Summer School
The NeSC training team made a central contribution to planning, organising and presenting the 2004 GGF Summer School. The event was attended by 84 selected advanced international students. Other presenters included Carl Kesselman (Globus) and Miron Livny (Condor).

First 6 months
In the first 6 months from its inception the training team directly delivered:
• 5 training events in the UK (35 trainees, advanced)
  – At NeSC and the University of Stafford
  – For JISC and EGEE
• 6 training events in Europe (183 trainees, introductory/advanced)
  – At CERN, FZK Karlsruhe, CNB Madrid, Lithuania and Italy
  – For EGEE
• Also dissemination presentations to introduce people to the concept of the grid

Coordination of training in Europe
The NeSC training team is the leading partner for the training activity in the EGEE project.
We coordinate and provide quality assurance for training with 22 partner institutions in 13 countries.
[Chart: attendance by event, 14.1.04 to 12.4.05, by course type (Induction, Developer, Advanced, Workshops); vertical axis 0 to 1400]
[Chart: Training Overall Feedback (score)]
[Chart labels: "Total attendance at courses", vertical axis "No of students"; feedback chart horizontal axis "Date"]

Geographical distribution of EGEE courses

Current training topics
EGEE training produces courses based on commitments in the execution plan and on training requests received.
• Induction
• LCG2 installation
• LCG2 APIs
• Design
• UML
• Related project support
• DILIGENT
• VO specific training
• GATE
• Biomed
• gLite preparation
• Web Services
• WSDL
• WSRF
• GT4
[Photo: Biomed Application Developers Course, Madrid]

Summer 2004: OGSA specification informational document
[Diagram labels: Cataloging, Provisioning, VO Mgmt, Integration, Policy Mgmt, Access, Context Services, Information Services, Data Services, Application Mgmt, Workflow Mgmt, Workload Mgmt, Execution Planning, Job Mgmt, Execution Mgmt Services, Reservation, Configuration, Deployment, Provisioning, Resource Mgmt Services, Troubleshooting, Infrastructure Services, Self Mgmt Services, Security Services, Heterogeneity Mgmt, Authentication, Optimization, Authorization, Service Level Attainment, Integrity, Boundary Traversal, QoS Mgmt, Event Mgmt, Discovery, Logging, WSRF, WSN, WSDM, Naming]

Data Services design team
• Informal domain expert groups within OGSA
• May include co-chairs of other WG/RGs
• Output is included in OGSA specification
[Diagram: DAIS-WG, OGSA Data Service Design team, GSM-WG, GFS-WG, OGSA-WG, ByteIO WG; telecons, F2F meetings]

eSI Workshops
• Space for real work
• Crossing communities
• Creativity: new strategies and solutions
• Written reports
• Scientific Data Mining, Integration and Visualisation
• Grid Information Systems
• Portals and Portlets
• Virtual Observatory as a Data
Grid
• Imaging, Medical Analysis and Grid Environments
• Open Issues in Grid Scheduling
• Data Provenance & Annotation
• e-Science Workflow Services
• GeoSciences & Scottish Bioinformatics Forum
Suggestions always welcome!

Attendance from different countries
[Chart: attendance per quarter, Year 1/Q2 to Year 3/Q4, by origin: AC.UK, UK (Other), Europe (non UK), North America, Australasia, Other; vertical axis 0 to 900]

eSI Industrial Involvement
133 delegates from 64 companies, including not only:
  IBM, Microsoft, Oracle, Sun, Hewlett-Packard, …
but also:
  Apple, AstraZeneca, BAE, Cisco, Honeywell, Motorola, Organon, Pfizer, Siemens, …

eSI Research Visitors
• Collaborate with UK research and development
• Engage in and develop eSI event programme
• Build bridges with your community
• Visit for anywhere between one week and six months
• Link up with regional e-Science centres

Becoming a research visitor
• Establish a collaboration with NeSC
  – Pre-established mutual interests
  – We encourage diversity of disciplines
  – Complementary experience, knowledge and skills
  – We can help match interests and develop a plan
• Visitors already engaged in relevant R&D
  – This is not a training opportunity
• Our support depends on the length and value of visit
  – Typically covers travel and/or local living costs
• Application via our web site

NeSC Website Statistics
www.nesc.ac.uk
[Chart: successful hits and volume (GB) per quarter, Year 1/Q1 to Year 3/Q4, by region: Africa, America - North, America - Other, Asia, Europe - UK, Europe - Non UK, Middle East, Oceania - Pacific, Unknown]

NeSC Website
National e-Science Centre http://www.nesc.ac.uk/
• Mission, Background, Foundation
• Locations, Staff, Resources, Projects
• Register interest, Mailing lists, NeSCForge
• Regional associations and Collaborations
• News, Notices
• Presentations and
Lectures http://www.nesc.ac.uk/presentations/
e-Science Institute http://www.nesc.ac.uk/esi/
• Mission, Events (Future and Past)
• Register for Events, Visitor Programme
UK e-Science
• Map and Index of Centres http://www.nesc.ac.uk/centres/
• Technical Papers http://www.nesc.ac.uk/technical_papers/
• Index of >100 Projects http://www.nesc.ac.uk/projects/
• Task Forces http://www.nesc.ac.uk/teams/
General Information
• Glossary, Bibliography, Who's who
• E-Science job vacancies

Digital Curation Centre
• Actions needed to maintain and utilise digital data and research results over entire life-cycle
  – For current and future generations of users
• Digital Preservation
  – Long-run technological/legal accessibility and usability
• Data curation in science
  – Maintenance of body of trusted data to represent current state of knowledge in area of research
• Research in tools and technologies
  – Integration, annotation, provenance, metadata, security…..
Trusted Repositories of Knowledge
• The Maori entrusted their knowledge to people, trained to be the repositories, who could:
  – receive information with the utmost accuracy
  – store information with integrity beyond doubt
  – retrieve the information without amendment
  – apply appropriate judgement in the use of the information
  – pass on the information appropriately
Whatarangi Winiata (2002), Repositories of Röpü Tuku Iho: A Contribution to the Survival of Mäori as a People, Wellington: Library & Information Association of New Zealand Aotearoa Annual Conference, 17-20 November 2002
Special thanks to Professors Derek Law & Seamus Ross

[Diagram labels: communities of practice: users; curation organisations; community support & outreach; Collaborative Associates Network of Data Organisations; services; management & coordination; research; research collaborators; development; testbeds & tools; Industry; standards bodies]

Data exchange on the Web
[Diagram labels: Web, DTD, XML, Q, XML view, DB1, DB2]
• All members of a community (industry) agree on a DTD and then exchange data w.r.t. it: e-commerce, health-care, ...
• XML Publishing: mapping relational data to XML conforming to the predefined DTD

Archiving (preserving) databases
• How do you preserve something that changes every hour or minute?
• Important for the scientific record – someone might have cited your data at time t.
• Current practice
  – Create versions (how often?)
  – Log changes
  – Use diffs
  – Do nothing (common!)

100 days of OMIM
• Uncompressed: archive size is
  – ≤ 1.01 times diff repository size
  – ≤ 1.04 times size of largest version
• Compressed: archive size is between 0.94 and 1 times compressed diff repository size
• gzip: Unix compression tool
• XMill: XML compression tool
[Chart: size (bytes × 10^6) over 100 days of OMIM; legend: archive, inc diff, version, compressed inc diff, compressed archive; curve labels: gzip(inc diff), XMill(archive)]

The OGSA-DAI Project
Powered by ….
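
Aside on the "Archiving (preserving) databases" slide: the "use diffs" practice listed there can be sketched in a few lines of Python. This is only an illustration, not the method behind the OMIM measurements. It keeps the first version plus one incremental delta per later version and rebuilds the version at time t by replaying deltas. The names `make_delta`, `apply_delta`, `DiffRepository` and `compressed_size`, and the sample records, are hypothetical. The merged "archive" that the slide compares against diffs is a different technique and is not implemented here.

```python
import difflib
import gzip


def make_delta(old, new):
    """Describe how to rebuild `new` from `old` (both lists of lines)."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: store only the index range into the old version.
            delta.append(("copy", i1, i2))
        else:
            # replace / insert / delete: store the new lines (possibly none).
            delta.append(("put", new[j1:j2]))
    return delta


def apply_delta(old, delta):
    """Rebuild the newer version from the older one and a delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class DiffRepository:
    """First version stored in full, later versions as incremental deltas."""

    def __init__(self, first_version):
        self.first = list(first_version)
        self.deltas = []
        self._latest = list(first_version)

    def commit(self, new_version):
        self.deltas.append(make_delta(self._latest, new_version))
        self._latest = list(new_version)

    def checkout(self, t):
        """Return version t (0 is the first version) by replaying t deltas."""
        lines = list(self.first)
        for delta in self.deltas[:t]:
            lines = apply_delta(lines, delta)
        return lines


def compressed_size(lines):
    """Size in bytes of the gzip-compressed text, for rough comparisons."""
    return len(gzip.compress("\n".join(lines).encode("utf-8")))


# Made-up records standing in for daily snapshots of a curated database.
versions = [
    ["id=1 gene=A", "id=2 gene=B"],
    ["id=1 gene=A", "id=2 gene=B2", "id=3 gene=C"],
    ["id=2 gene=B2", "id=3 gene=C"],
]
repo = DiffRepository(versions[0])
for version in versions[1:]:
    repo.commit(version)
```

Calling `repo.checkout(t)` returns the database as it stood at version t, which is what a citation "at time t" needs. `compressed_size` can be applied to any version to compare storage costs in the spirit of the gzip figures on the slide.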
Funded by the Grid Core Programme
• OGSA-DAI
  – £3 million, 18 months, from Feb 2002
  – Three major releases, three interim releases
• DAIT (DAI-Two)
  – Keep the OGSA-DAI brand name
  – £1.5 million, 24 months, from Oct 2003
  – Four major releases

OGSA-DAI Downloads by country
792 registered users @ 23/8/04

BRIDGES
[Diagram labels: CFG Virtual Organisation; Publically Curated Data: Ensembl, OMIM, SWISS-PROT, MGI, HUGO, RGD, …; private data at Glasgow, Edinburgh, Oxford, Leicester, London, Netherlands; VO Authorisation; Blast; Synteny Grid Service; DATA HUB; OGSA-DAI; Information Integrator; database engine 1; database engine 2; ODD-Genes registry]

GridQTL: High performance QTL analysis via the Grid
• Execute QTL analyses on grid computing resources
• Describe parallel computation requirements
• Automatic task-level decomposition of analysis requests
• Schedule, monitor and re-start decomposed tasks
• Provide a secure and private data space for each researcher
• Synchronise application input and output
• Enable analysis re-start from intermediate results
• Be a robust public service
[Diagram: GridQTL Portal; Data Mgr; Analysis Portlet; Analysis 1 to Analysis 5; Meta Sched; UK e-Science Grid or NGS Resources]

Virtual Observatories
• Observations made across entire electromagnetic spectrum
[Images: ROSAT ~keV, DSS Optical, 2MASS 2µ, IRAS 25µ, IRAS 100µ, GB 6cm, NVSS 20cm, WENSS 92cm]
⇒ e.g.
different views of a local galaxy
• Need all of them to understand physics fully
• Databases are located throughout the world
Peter Clarke

VOTES
Virtual Organisations for Trials and Epidemiological Studies
• 3 year MRC (£2.9M) funded project
• Plans to develop Grid infrastructure to address key components of clinical trial/observational study
  – Recruitment of potentially eligible participants
  – Data collection during the study
  – Study administration and coordination
• Involves Glasgow, Oxford, Leicester, Nottingham, Manchester
[Diagram: Clinical Virtual Organisation Framework, used to realise CVO-1 (e.g. for data collection) and CVO-2 (e.g. for recruitment); labels: LeiNott, GLA, OX, IMP, Transfer Grid, GPs, clinical trial data sets, disease registries, hospital databases]

Scottish Bioinformatics Research Network
• Funded (£2.4M) by Scottish Enterprise, Scottish Higher Education Funding Council, Scottish Executive Environment and Rural Affairs Department
• Involves Glasgow, Dundee, Edinburgh, Scottish Bioinformatics Forum
• Aim to provide bioinformatics infrastructure for Scottish health, agriculture and industry
  – Infrastructure support at Dundee, Edinburgh and Glasgow to support first-rate research in bioinformatics at each academic institute
  – Infrastructure support at three institutes, to support inter-institutional sharing of compute and data resources through application of Grid computing
  – Outreach and training activities mediated by the Scottish Bioinformatics Forum

Genetics and Healthcare Initiative
• Funded by Health Department and Department for Enterprise and Lifelong Learning
• Involves Glasgow, Dundee, Edinburgh, Aberdeen
• Genetics as applied to healthcare
  – first two years emphasis on providing a platform for research into the genetic basis of common complex diseases in Scotland (mental health, cardiovascular, …)
• Plan to establish 15,000 family-based intensively-phenotyped cohort recruited from the East and West of Scotland
• Basis for neutralising heritable (genetic) risk factors in disease surveillance,
treatment optimisation, avoidance of adverse drug events and prediction of response to therapy, health care planning and drug discovery, …

DyVOSE
Dynamic Virtual Organisations for e-Science Education (DyVOSE) project
• Exploring advanced authorisation infrastructures for security
  – … in Grid Computing Module as part of advanced MSc at Glasgow
  – Provide insight into rolling Grid out to the masses!
[Diagram labels: ScotGrid; GU Condor pool; other (known!) Grid resources; Education VO policies; PERMIS based authorisation checks; authorisation decisions]

Edikt
[Diagram labels: Standards; Requirements analysis; Technology matchmaking; E-Science Apps; CS Research; Edikt project; Gap filling; Rigorous engineering; Commercial SW components and skills]
• Grid Services for e-Science Data Management
• The team: 8 professional software engineers, support staff, project manager, commercialisation manager, architect, and SAB
• SHEFC funded research and development grant
  – 3 years funding: May 2002 – 2005
  – +3 years funding upon successful project and review

ELDAS – Data Access Service
• Another (partial) implementation of the GGF WS-DAI specifications
• Implemented using Enterprise Java Beans
• Data Access Components interface to distinct DBMSs
• Accessible as a grid data service or a web data service
[Diagram labels: Grid User1; Grid User2; Grid Proxy; Web User1; Web Servlet; ELDAS; Java Framework; EJB - DAS; DAC (one per database); Xindice DB; MySQL DB; DB2 DB; Oracle 9i DB]

BinX – accessing legacy binary data simulations
The Problem:
• Many binary data files
• Applications must "know" the data format
• Binary data formats are machine-specific
The Solution:
• Write a "stand-aside" format description in XML
• Provide a library to
  – Interpret the description
  – Provide file access across different machines
• Build higher-level services
[Diagram: binary data files; BinX file describes binary file structure; BinX Library; e-Science Application]

Mammography
• A prototype of a national database of mammographic images in support of the UK breast screening programme
• Mammograms have different appearances, depending on image settings and acquisition systems
[Figure labels: Standard Mammo Format; Temporal mammography; Computer Aided Detection; 3D View]

FirstDIG
• Data mining with the First Transport Group, UK
• Example: "When buses are more than 10 minutes late there is an 82% chance that revenue drops by at least 10%"
• "The results of this exercise will revolutionise the way we do things in the bus industry.", Darren Unwin, Divisional Manager, First South Yorkshire.
[Diagram: Client Application and Data Mining Application connected to several OGSA-DAI services]

INWA
[Diagram labels: EPCC, UK: user@edinburgh, TOG, Grid Engine, Data Browser, OGSA-DAI (Bank data), OGSA-DAI (Telco data), OGSA-DAI (UK Property); Curtin, Australia: user@australia, TOG, Grid Engine, Data Browser, OGSA-DAI (Australian property); Bank; Telco]

Mont Blanc Tunnel Fire

TIME (min) | Events | Consequences
0  | Fire detected | Emergency assessment too slow! Lack of co-ordination b/w 2 sides; too many vehicles enter tunnel
10 | 1st Decision: traffic stopped |
15 | 1st Response: fans turned on in wrong direction! | Enhancement of smoke and fire
20 | French Fire Br. |
25 | Italian Fire Br. | Intervention made difficult by poor initial response

39 dead!
Asif Usmani

Mont Blanc Tunnel Fire & FireGrid
TIME (min) and events:
• x-: Pre-emergency response planning; case-based training
• 0+: Fire detected; sensors channel info to C&C
• ~1: 1st Decision: traffic stopped; early forecasts
• 20: Fire Brigades
Modes and actions:
• Use in 'Design' Mode: many scenarios generated
• 'Emergency Response' Mode: select pre-designed scenario matches; escalation (alert experts, commandeer resources); sensor driven simulations initialised; C&C tasks emergency responders
Consequences:
• Co-ordination
• Preparedness
• Better emergency planning
• Better emergency assessment
• Emergency magnitude minimised
• Effective intervention
Lives saved!
Asif Usmani

FireGrid Technologies
[Diagram labels: 1000s of sensors & gateway processing; Emergency Responders; KBS and Planning; Super-real-time simulation (HPC); Grid; Maps, models, scenarios; application]
Asif Usmani

Inter-Enterprise Computing Network (IECnet)
• DTI Knowledge Transfer Network
• 3 years from 1st February 2005
• Exploiting the use of Grid computing technologies in UK industry
• Lead partner: Intellect UK
• Project manager: Ian Osborne
• E-Science partner: NeSC
• Technical lead: Dave Berry

IECnet Objectives
1. Establish wide understanding of the potential of Inter-Enterprise Computing
2. Accelerate the recognition of requirements and issues for Inter-Enterprise Computing
3. Prepare the UK ICT industry, users and government for Inter-Enterprise Computing
4. Follow through the e-Science Core Programme vision in which demanding scientific research stimulates significant advances in Grid technology and the results are transferred to UK industry, healthcare and government.

IECnet Advisory Council
• Provides strategic overview and insight into the project's operation
• Representatives from:
  – Industry Suppliers: HP, IBM, Intel, Sun, Oracle
  – Target Industry Users: Comms, Security, Pharma, Finance, e-Gov/NHS, Engineering, SMEs, Venture Capital
  – e-Science Experts
  – dti Oversight: Anne Trefethen, dti/EPSRC

Edinburgh M.Sc.
in e-Science
Bob Mann
rgm@roe.ac.uk
www.ph.ed.ac.uk/postgraduate/degrees/msc_escience.html