Research and NeSC Applications
Prof Richard Sinnott
Technical Director, National e-Science Centre
r.sinnott@nesc.gla.ac.uk
26th October 2006

The Context
• There are many Grids
• There are many ways to build Grids
• There are many different middleware stacks competing in this space
• People say Grid in grants and then build web services because Grid middleware is too hard
• There are many agendas
  – big business, academic, …
• There are many moving targets
  – changing middleware, changing standards, changing sciences, resources/questions/funding streams…
• There is a lot of hype
• There is a lot of money available
• There are lots of projects and big scientific challenges
• There is an urgent need to build user communities
• There needs to be much more research pull than middleware push
  – … there are many more things that could go here!

Data Grids for High Energy Physics
[Figure: tiered data Grid rooted at CERN; recoverable labels listed below]
• LCG/gLite middleware (large scale data management, large scale compute resource management, resource broking…!!!)
• 1 TIPS is approximately 25,000 SpecInt95 equivalents
• There is a “bunch crossing” every 25 nsecs
• There are 100 “triggers” per second
• Each triggered event is ~1 MByte in size
• Online System: ~PBytes/sec in, ~100 MBytes/sec out to the Offline Processor Farm
• Offline Processor Farm: ~20 TIPS, ~100 MBytes/sec out to Tier 0
• Tier 0: CERN Computer Centre, ~622 Mbits/sec links to Tier 1
• Tier 1: France Regional Centre, Italy Regional Centre, FermiLab (~4 TIPS), ~622 Mbits/sec links to Tier 2
• Tier 2: Caltech and four further Tier2 Centres (~1 TIPS each), ~622 Mbits/sec links onward
• Physicists work on analysis “channels”.
Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
[Remaining figure labels:]
• Tier 1 also includes the Germany Regional Centre
• Institutes: ~0.25 TIPS each, holding a physics data cache, ~1 MBytes/sec links onward
• Tier 4: Physicist workstations

Challenges of NanoCMOS Design
[Figure; surviving label: Statistical]
• OMII-UK middleware (workflows, security, 3D data management, resource management, + …)

The e-Health Future…
[Figure labels: Nucleotide structures, Gene expressions, Protein Structures, Protein-protein interaction (pathways), Tissues, Physiology, Organisms, Populations]
• Globus/WS- middleware (fine grained security, data access/integration, exponential data growth, keep it simple!)

NeSC Research…
• Most NeSC Glasgow research is on security and ease of use across various application domains
• NeSC Edinburgh focus is on middleware development, especially Grid data access/integration (OGSA-DAI, DAIT, OMII-UK, eDIKT), high performance networking, data curation ….

Ease of Use
• (…and setting the scene for some of the later demonstrations)
• For Grids/e-Research to be truly successful
  – they have to be made as seamless to access and use as the internet
    • Forget training, education for some (most?) users!
  – they have to be based on research pull and not middleware push
  – experiences in various projects have shown that users don’t like digital certificates
    • The majority most certainly won’t jump through hoops to get on the Grid

Single Sign-On
• X.509 certificate based PKI common to many Grid efforts (including UK)
  – Step 1.
    • Get a certificate
  – Step 2.
    • Get your DN registered at places you expect to use
  – Step 3.
    • Read the manuals (Globus, gLite, …) for how to submit/run a job

Step 1
• In UK e-Science community X.509 PKI based on centralised CA with direct single hierarchy to users
  – Typical scenario for getting Grid certificate
[Figure: User, RA and CA exchanging the numbered steps below]
  1. Request certificate (www.grid-support.ac.uk/ca)
  2. Check details of request
  3. Ok?
  4. Download and install certificate in browser
  5. Download and install CRL
  6. Export certificate to various formats, e.g. as Grid certificate
     $> openssl pkcs12 -in cert.p12 -clcerts -nokeys -out usercert.pem
• !!!! This is off-putting for end users!!!
• Typically not available on Windows!!!
• Root access? Local sys-admin?

But…
• Identity management issues
  – Certificate Revocation Lists
  – When revoked? By whom? How timely?
• Strong passwords for private keys
  – Users write them down, share them, forget them
• Privilege Management
  – Numerous domains where users never get access to a local account to “do stuff”
• User classification
  – Tinkerers vs much larger e-Research Community
    • they want services to point their browser at, and to point and click to run things on the Grid
  – I don’t want an account on a cluster to compile/run code, I’m a biologist who wants to run BLAST on a free National Grid resource

As a result…
• ~3500 UK e-Science certs
  – 1000 for Manchester cluster
• But over 3 Million Athens accounts in UK HE/FE
• Iceberg is not to scale!!!!

How Can we Improve Things?
• We don’t want each domain reinventing its own security solutions
• Best to exploit local authentication
  – Sites know best if users are still at the institution and are best placed to state what their privileges are/should be

Introducing Shibboleth
• Shibboleth (http://shibboleth.internet2.edu)
  Definition: Shibboleth [Hebrew for an ear of corn, or a stream or flood]
  1. A word which was made the criterion by which to distinguish the Ephraimites from the Gileadites. The Ephraimites, not being able to pronounce sh, called the word sibboleth. See Judges xii.
  2. Hence, the criterion, test, or watchword of a party; a party cry or pet phrase.
• Shibboleth will replace Athens as access mgt system across UK academia
  – Federations based on trust
    • or more accurately trust but verify
    • numerous international federations exist: MAMS, SWITCH, HAKA, SDSS…

Typical Shibboleth Scenario
[Figure: User, W.A.Y.F., Federation, Identity Provider (Home Institution, LDAP, AuthN) and Service provider (Grid resource / portal); only the step labels below survive]
  1. User points browser at Grid resource/portal (or non-Grid resource)
  5. User accesses resource

It’s a start, but…
• Benefit from local authentication but really want finer grained control…
  – I know you have authenticated, but I need to know that you have sufficient/correct privileges to access my VO resources
  – can also return various other information needed to support authorisation decisions

Authorization Technologies
• Various technologies for authorization including
  – PERMIS
    • PrivilEge and Role Management Infrastructure Standards Validation
    • http://www.permis.org
  – Community Authorisation Service
    • http://www.globus.org/security/CAS/
  – AKENTI
    • http://www-itg.lbl.gov/security/akenti
  – CARDEA
    • http://www.nas.nasa.gov/Research/Reports/Techreports/2003/nas-03020-abstract.html
  – VOMS
    • http://hep-project-grid-scg.web.cern.ch/hep-project-grid-scg/voms.html
• At NeSC we have been working extensively with PERMIS

Role Based Access Controls
• Basic idea is to define:
  – roles applicable to specific VO
    • roles often hierarchical
      – Role X ≥ Role Y ≥ Role Z
      – A manager can do everything an employee can do (and more), and an employee can do everything a trainee can do (and more)
  – actions allowed/not allowed for VO members
  – resources comprising VO infrastructure (computers, data resources etc)
• A policy then consists of sets of these rules
  • { Role x Action x Target }
    – Can user with VO role X invoke service Y on resource Z?
• Policy itself can be represented in many ways, e.g. XML, XACML, …
• Tools available for policy editing, associating users with roles, signing policies etc
  – Policies stored as attribute certificates in LDAP server
    • (New tools/wizards presented at OGF18 Washington)

Finer Grained Shibboleth Scenario
[Figure: User, W.A.Y.F., Federation, Identity Provider (Home Institution, LDAP, AuthN) and Service provider (Shib Frontend, Grid Portal); only the step labels below survive]
  1. User points browser at Grid resource/portal
  5. Pass authentication info and attributes to authZ function
  6. Make final AuthZ decision

Ok, but…
• I can do authorisation but I want single sign-on to lots of distributed resources across different organisations (aka Virtual Organisations in Grid speak)
  – Browser keeps session information, so other resources can be accessed without signing in again
    • Provided authorisation information is valid for different service providers
  – Each service provider completely autonomous
    • Can configure attribute release/attribute acceptance policies per identity provider/service provider

NeSC Applications

BRIDGES Project
• More later

GEMEPS Project
• More later

VOTES Project
• More later

DyVOSE Project
• Dynamic Virtual Organisations for e-Science Education (DyVOSE) project
  – Two year project (£289k) started 1st May 2004 funded by JISC
  – Exploring advanced authorisation infrastructures for security
    • … in Grid Computing Module as part of advanced MSc at Glasgow
  – providing insight into rolling Grid out to the masses!
[Figure labels: ScotGrid, GU Condor pool, Other (known!) Grid resources, Education VO policies, PERMIS based authorisation checks, Authorisation decisions]

Putting the “Dy” in DyVOSE
• Dynamic PMI Case Study
[Figure labels: Glasgow and Edinburgh sites, each with an LDAP server; Glasgow SoA using Glasgow DIS to issue Edin. roles; Edinburgh SoA using Glasgow DIS to issue Edin. roles; ACs created for Edin. roles; Glasgow Education VO policies; Edinburgh Education VO policies; PERMIS based Authorisation checks/decisions; Grid BLAST Service Implemented by Students; Grid BLAST Data Service; data input; Grid-data Client; Protein/nucleotide data returned based on student team role; Nucleotide + Protein Sequence DB]

Security Related Projects
• GLASS
  – JISC funded started March 2006
    • Exploring early adoption of Shibboleth
      – Working with Computer Services directly
    • Scenarios based upon teaching and access to NHS resources/data
      – Includes brain trauma (interest to neuro-folk/CARMEN?)
    • Builds upon university wide unified account management system being rolled out (based on Novell nSure technology)
• ESP-Grid
  – JISC/Oxford University funded
    • Developed demonstrator to show how Grid resources can be accessed and used via Shibboleth technology
• Grid Security Report
  – JISC/JCSR funded
    • Focus on Grid security practices, middleware and outlook
• Grid meets Geographical Information Systems
  – JISC funded with focus on Shibboleth access to GIS data resources

Grid Enabled Occupational Data Environment (GEODE)
• GEODE
  – Funded by ESRC, led by University of Stirling
    • Two year project aiming to develop Grid enabled portal for occupational data
      – includes integration of various existing classification schemes
      – More later!

Grid Enabling Biomedical Pathway Simulator
• To extend software from DTI funded BPS project to benefit from the Grid
  – Biochemical differential equation solver
  – Parameter searches
  – Security aspects important

Scottish Bioinformatics Research Network
• Four year proposal (£2.4M) started February 2006
  – Funded by Scottish Enterprise, Scottish Higher Education Funding Council, Scottish Executive Environment and Rural Affairs Department
• Involves Glasgow, Dundee, Edinburgh, Scottish Bioinformatics Forum
  – Aim to provide bioinformatics infrastructure for Scottish health, agriculture and industry
    • Infrastructure support at Dundee, Edinburgh and Glasgow to support first-rate research in bioinformatics at each academic institute
    • Infrastructure support at three institutes, to support inter-institutional sharing of compute and data resources through application of Grid computing
    • Outreach and training activities mediated by the Scottish Bioinformatics Forum

Scottish Family Health Study
• Five (2+3) year proposal (£4.6M) started January 2006
  – Funded by Health Department and Department for Enterprise and Lifelong Learning
• Involves Glasgow, Dundee, Edinburgh, Aberdeen
  – focus on genetics as applied to healthcare
  – first two years emphasis on providing a platform for research into the genetic basis of common complex diseases in Scotland
    » Mental health, cardiovascular, …
    » Plan to establish 15,000 family-based intensively-phenotyped cohort recruited from the East and West of Scotland
  – basis for neutralising heritable (genetic) risk factors in disease surveillance, treatment optimisation, avoidance of adverse drug events and prediction of response to therapy, health care planning and drug discovery, …

Meeting the Design Challenge of nanoCMOS Electronics
• £5.3M EPSRC Pilot
  – kicks off next week
• 4-year project with lots of international visibility

International Technology Roadmap for Semiconductors (2005 edition)
Year                   2005  2010  2015  2020
MPU Half Pitch (nm)      90    45    25    14
MPU Gate Length (nm)     32    18    10     6

[Figure: device images (Toshiba 04; scale labels 230 nm and 25 nm)]
[Figure: Device diversification. 90nm: HP, LOP, LSTP; 45nm: UTB SOI; 32nm: Double gate. Other labels: Bulk MOSFET, FD SOI, UTB SOI, FinFET, LSTP, LOP, HP(MPU), Standard Single Set, Stat. Sets]

Current Efforts
• AHRC Grant proposals
  – Performance Arts
  – Scottish Language and Literature
• OMII proposals
  – Visualisation service
• Scottish Enterprise
  – Production level clinical e-Infrastructure for Scotland
• Wellcome Trust
  – Grid based biomedical visualisation infrastructure
• EPSRC
  – Grid based brain trauma co-ordination with China
    • Links to CARMEN
  – Construction Industry and Grids
• JISC
  – MANY bids on-going in e-Infrastructure, e-Repositories, … areas
• And of course the Scottish Grid Service…

Opportunities
• There are more opportunities than can be followed up
• All funding councils, DTI, JISC, Europe FW7, international calls
  – How long for…?
  – Often difficult to get the first grant…?
  – More than happy to work with folk…?
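
Backup: Role Based Access Controls as code
Looking back at the Role Based Access Controls slide, the { Role x Action x Target } idea can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: the role names follow the manager/employee/trainee example from that slide, while the actions, targets and the functions `inherited_roles` and `is_permitted` are hypothetical and are not part of PERMIS or any other middleware named in this talk.

```python
from dataclasses import dataclass

# Assumed hierarchy from the slide example: manager >= employee >= trainee.
# Each role maps to the role directly beneath it (None for the lowest role).
ROLE_BELOW = {"manager": "employee", "employee": "trainee", "trainee": None}


def inherited_roles(role):
    """Return the role plus every role beneath it; unknown roles give []."""
    roles = []
    while role in ROLE_BELOW:
        roles.append(role)
        role = ROLE_BELOW[role]
    return roles


@dataclass(frozen=True)
class Rule:
    """One policy entry: a role may perform an action on a target."""
    role: str
    action: str
    target: str


# Hypothetical VO policy: a set of (role, action, target) rules.
POLICY = {
    Rule("trainee", "query", "blast-service"),
    Rule("employee", "submit-job", "condor-pool"),
    Rule("manager", "edit-policy", "vo-ldap"),
}


def is_permitted(role, action, target, policy=POLICY):
    """Can a user with VO role `role` invoke `action` on `target`?

    Access is granted if the role itself, or any role beneath it in the
    hierarchy, has a matching rule. Everything else is denied.
    """
    return any(Rule(r, action, target) in policy for r in inherited_roles(role))


if __name__ == "__main__":
    print(is_permitted("manager", "query", "blast-service"))
    print(is_permitted("trainee", "submit-job", "condor-pool"))
```

• The check is deny-by-default: a role that is not in the hierarchy inherits nothing and is refused
• A real deployment would hold the policy as signed attribute certificates in LDAP, as the slide describes, rather than as an in-memory set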