NeSC and eSI Dave Berry, Research Manager PRISM Forum, 28 April 2005

NeSC and eSI roles
NeSC – the national centre
International gateway to UK e-Science
UK and EU Training
Standardisation work
eSI – the international research institute
Conferences and workshops
Research visitors
e-Science research
Digital Curation Centre
Middleware (OGSA-DAI, SunDCG, edikt, …)
Science applications (GridQTL, BRIDGES, …)
Industrial projects
Engage industry
Stimulate the uptake of e-Science technology
UK e-Science Budget
(2001-2006)
Total: £213M + £100M via JISC
MRC (£21.1M) 10%
EPSRC (£77.7M) 37%
BBSRC (£18M) 8%
NERC (£15M) 7%
PPARC (£57.6M) 27%
CLRC (£10M) 5%
ESRC (£13.6M) 6%
EPSRC Breakdown
Applied (£35M) 45%
HPC (£11.5M) 15%
Core (£31.2M) 40%
Staff costs only; Grid Resources, Computers & Network funded separately
+ Industrial Contributions £25M
Source: Science Budget 2003/4 – 2005/6, DTI(OST)
The e-Science Centres
Globus Alliance
National Centre for e-Social Science
Open Middleware Infrastructure Institute
e-Science Institute
Grid Operations Support Centre
Digital Curation Centre
CeSC (Cambridge)
EGEE, ChinaGrid
National Institute for Environmental e-Science
National Grid Service
NGS core nodes
data nodes at RAL and Manchester
compute nodes at Oxford and Leeds
free at point of use
apply through NGS web site
to do: project or VO-based application and registration
all access is through digital X.509 certificates
from UK e-Science CA or recognized peer
National HPC services
HPCx (EPCC)
CSAR (Manchester)
Must apply separately to research councils
[Map: LCG-2/EGEE-0 Status, 08-11-2004. Total: 87 sites, 8784 CPUs, 33 PByte]
The Primary Requirement …
Building people grids
Enabling People to Work Together on Challenging Projects: Science, Engineering & Medicine
Recent meetings
PRISM forum
EPSRC e-Science review
UK Hacklatt Workshop (Lattice QCD)
UK Globus week
Introduction to the NGS
The Accessibility e-Olympics
5th Annual Dependability IRC workshop
Gridsphere and portlets workshop
Introduction to the Edinburgh Mouse Atlas and EMAGE Gene Expression Database
NeSC Training Team
Enabling, facilitating and delivering quality training in the UK and internationally
Formed in April 2004
Grown from initial two staff to:
five dedicated trainers
two developers
one dissemination officer
Funded by EU and UK
Supports EGEE and the NGS
Example: GGF Summer School
The NeSC training team made a central contribution to planning, organising and presenting the 2004 GGF Summer School.
The event was attended by 84 selected advanced international students.
Other presenters included Carl Kesselman (Globus) and Miron Livny (Condor)
First 6 months
In the first 6 months from its inception the training team directly delivered:
5 training events in the UK (35 trainees, advanced)
– At NeSC and the University of Stafford
– For JISC and EGEE
6 training events in Europe (183 trainees, introductory/advanced)
– At CERN, FZK Karlsruhe, CNB Madrid, Lithuania and Italy
– For EGEE
Also dissemination presentations to introduce people to the concept of the grid
Coordination of training in Europe
The NeSC training team is the leading partner for the training activity in the EGEE project.
We coordinate and provide quality assurance for training with 22 partner institutions in 13 countries.
[Chart: Total attendance at courses. No of students (0 to 1400) per event, 14.1.04 to 12.4.05; legend: Induction, Workshops, Advanced, Developer]
[Chart: Training Overall Feedback. Score (1.00 to 6.00) by date]
Geographical distribution of EGEE courses
Current training topics
EGEE training produces courses based on commitments in the
execution plan and training requests which have been received
Induction
LCG2 installation
LCG2 APIs
Design
UML
Related project support
DILIGENT
VO specific training
GATE
Biomed
gLite preparation
Web Services
WSDL
WSRF
GT4
Biomed Application Developers Course, Madrid
Summer 2004: OGSA specification informational document
[Diagram: OGSA capabilities. Service groups: Infrastructure Services, Execution Mgmt Services, Data Services, Resource Mgmt Services, Security Services, Self Mgmt Services, Information Services, Context Services. Other labels: Cataloging, Provisioning, VO Mgmt, Integration, Policy Mgmt, Access, Application Mgmt, Workflow Mgmt, Workload Mgmt, Execution Planning, Job Mgmt, Reservation, Configuration, Deployment, Troubleshooting, Heterogeneity Mgmt, Authentication, Authorization, Integrity, Boundary Traversal, Optimization, Service Level Attainment, QoS Mgmt, Event Mgmt, Discovery, Logging, Naming. Underlying specifications: WSRF, WSN, WSDM]
Data Services design team
Informal domain expert groups within OGSA
May include co-chairs of other WG/RGs
Output is included in OGSA specification
DAIS-WG
OGSA Data Service
Design team
GSM-WG
GFS-WG
OGSA-WG
Telecons, F2F meetings
ByteIO WG
eSI Workshops
Space for real work
Crossing communities
Creativity: new strategies and solutions
Written reports
Scientific Data Mining, Integration and Visualisation
Grid Information Systems
Portals and Portlets
Virtual Observatory as a Data Grid
Imaging, Medical Analysis and Grid Environments
Open Issues in Grid Scheduling
Data Provenance & Annotation
e-Science Workflow Services
GeoSciences & Scottish Bioinformatics Forum
Suggestions always welcome!
Attendance from different countries
[Chart: attendance (0 to 900) per quarter, Year 1/Q2 to Year 3/Q4, by origin: AC.UK, UK (Other), Europe (non UK), North America, Australasia, Other]
eSI Industrial Involvement
133 delegates from 64 companies
including not only:
IBM, Microsoft, Oracle, Sun, Hewlett-Packard, …
but also:
Apple, AstraZeneca, BAE, Cisco, Honeywell, Motorola, Organon, Pfizer, Siemens, …
eSI Research Visitors
Collaborate with UK research and
development
Engage in and develop eSI event
programme
Build bridges with your community
Visit for anywhere between one week
and six months
Link up with regional e-Science
centres
Becoming a research visitor
Establish a collaboration with NeSC
Pre-established mutual interests
We encourage diversity of disciplines
Complementary experience, knowledge and skills
We can help match interests and develop a plan
Visitors already engaged in relevant R&D
This is not a training opportunity
Our support depends on the length and
value of visit
Typically covers travel and/or local living costs
Application via our web site
NeSC Website Statistics
www.nesc.ac.uk
[Chart: successful hits (HITS) and volume (GB) per quarter, Year 1/Q1 to Year 3/Q4, by region: Africa, America - North, America - Other, Asia, Europe - UK, Europe - non UK, Middle East, Oceania - Pacific, Unknown]
NeSC Website
National e-Science Centre
http://www.nesc.ac.uk/
Mission, Background, Foundation
Locations, Staff, Resources, Projects
Register interest, Mailing lists, NeSCForge
Regional associations and Collaborations
News, Notices
Presentations and Lectures
http://www.nesc.ac.uk/presentations/
e-Science Institute
http://www.nesc.ac.uk/esi/
Mission, Events (Future and Past)
Register for Events, Visitor Programme
UK e-Science
Map and Index of Centres: http://www.nesc.ac.uk/centres/
Technical Papers: http://www.nesc.ac.uk/technical_papers/
Index of >100 Projects: http://www.nesc.ac.uk/projects/
Task Forces: http://www.nesc.ac.uk/teams/
General Information
Glossary, Bibliography, Who’s who
E-Science job vacancies
Digital Curation Centre
• Actions needed to maintain and utilise digital data and research results over entire life-cycle
– For current and future generations of users
• Digital Preservation
– Long-run technological/legal accessibility and usability
• Data curation in science
– Maintenance of body of trusted data to represent current state of knowledge in area of research
• Research in tools and technologies
– Integration, annotation, provenance, metadata, security, …
Trusted Repositories of Knowledge
• The Maori entrusted their knowledge to people, trained to be the repositories, who could:
– receive information with the utmost accuracy
– store information with integrity beyond doubt
– retrieve the information without amendment
– apply appropriate judgement in the use of the information
– pass on the information appropriately
Whatarangi Winiata, (2002), Repositories of Röpü Tuku Iho: A Contribution to the
Survival of Mäori as a People, Wellington: Library & Information Association of New
Zealand Aotearoa Annual Conference, 17-20 November 2002
Special thanks to Professors Derek Law & Seamus Ross
[Diagram: Digital Curation Centre activities and partners. Labels: communities of practice: users; curation organisations; community support & outreach; Collaborative Associates Network of Data Organisations; services; management & coordination; research; research collaborators; development; testbeds & tools; Industry; standards bodies]
Data exchange on the Web
[Diagram: databases DB1 and DB2 exchange XML over the Web against a shared DTD; Q: XML view]
All members of a community (industry) agree on a DTD and then exchange data w.r.t. it: e-commerce, health-care, ...
XML Publishing: mapping relational data to XML conforming to the predefined DTD
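The XML publishing step can be sketched in a few lines. This is only an illustration under stated assumptions: the `visit` table, its columns and the DTD in the docstring are hypothetical, and the standard library used here builds the XML but does not validate it against a DTD.

```python
import sqlite3
import xml.etree.ElementTree as ET

def publish(conn):
    """Map rows of the (hypothetical) 'visit' table to an agreed XML layout.

    Assumed DTD, for illustration only:
      <!ELEMENT visits (visit*)>
      <!ELEMENT visit (patient, ward)>
    """
    root = ET.Element("visits")
    rows = conn.execute("SELECT patient, ward FROM visit ORDER BY patient")
    for patient, ward in rows:
        visit = ET.SubElement(root, "visit")
        ET.SubElement(visit, "patient").text = patient
        ET.SubElement(visit, "ward").text = ward
    return ET.tostring(root, encoding="unicode")

# A tiny in-memory relational source standing in for DB1 or DB2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visit (patient TEXT, ward TEXT)")
conn.executemany("INSERT INTO visit VALUES (?, ?)", [("p1", "A"), ("p2", "B")])
xml_view = publish(conn)
print(xml_view)
```

Each partner would write its own mapping from its private schema to the shared element structure, so only the XML view has to be agreed between them.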
Archiving (preserving) databases
How do you preserve something that
changes every hour or minute?
Important for the scientific record –
someone might have cited your data at
time t.
Current practice
Create versions (how often?)
Log changes
Use diffs
Do nothing (common!)
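As an illustration of the "log changes" option, the sketch below keeps a time-stamped change log per record and rebuilds the database as it stood at a time t. The class and record names are hypothetical, and this is not the archiving technique whose sizes are compared on the following slide.

```python
import bisect

class ChangeLogArchive:
    """Keep every value a record has had, stamped with the time it changed."""

    def __init__(self):
        self._times = {}   # key -> sorted list of change times
        self._values = {}  # key -> values, parallel to self._times[key]

    def record(self, key, time, value):
        """Log a change; value=None marks a deletion."""
        times = self._times.setdefault(key, [])
        values = self._values.setdefault(key, [])
        if times and time < times[-1]:
            raise ValueError("changes must be logged in time order")
        if values and values[-1] == value:
            return  # an unchanged record adds nothing to the log
        times.append(time)
        values.append(value)

    def snapshot(self, time):
        """Reconstruct the database as it stood at the given time."""
        state = {}
        for key, times in self._times.items():
            i = bisect.bisect_right(times, time)
            if i and self._values[key][i - 1] is not None:
                state[key] = self._values[key][i - 1]
        return state

archive = ChangeLogArchive()
archive.record("gene-1", 1, "description v1")
archive.record("gene-2", 1, "other entry")
archive.record("gene-1", 5, "description v2")
archive.record("gene-2", 7, None)  # entry deleted at time 7
print(archive.snapshot(6))
```

A citation made at time t can then be resolved with `archive.snapshot(t)`, whatever has happened to the data since.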
Uncompressed
Archive size is ≤ 1.01 times diff repository size
and ≤ 1.04 times size of largest version
Compressed
Archive size is between 0.94 and 1 times compressed diff repository size
gzip - unix compression tool
XMill - XML compression tool
[Chart: size (bytes) × 10^6 over 100 days of OMIM. Legend: archive, inc diff, version, compressed inc diff (gzip(inc diff)), compressed archive (XMill(archive))]
The OGSA-DAI Project
Powered by ….
Funded by the Grid Core Programme
OGSA-DAI
£3 million, 18 months, from Feb 2002
Three major releases, three interim
releases
DAIT (DAI-Two)
Keep the OGSA-DAI brand name
£1.5 million, 24 months,
from Oct 2003
Four major releases
OGSA-DAI Downloads by country
792 registered users @ 23/8/04
BRIDGES
[Diagram: CFG Virtual Organisation with VO Authorisation; sites Glasgow, Edinburgh, Oxford, Leicester, Netherlands and London, each holding private data; a DATA HUB built on OGSA-DAI and Information Integrator; Publicly Curated Data: Ensembl, OMIM, SWISS-PROT, MGI, HUGO, RGD, …; Synteny Grid Service; blast]
[Diagram labels: ODD-Genes, registry, database engine 1, database engine 2]
GridQTL: High performance QTL
analysis via the Grid
Execute QTL analyses on grid computing resources
Describe parallel computation requirements
Automatic task-level decomposition of analysis requests
Schedule, monitor and re-start decomposed tasks
Provide a secure and private data space for each researcher
Synchronise application input and output
Enable analysis re-start from intermediate results
Be a robust public service
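The decomposition and re-start requirements above can be illustrated with a small sketch. It is not the GridQTL implementation: the function names are hypothetical, a local thread pool stands in for grid resources, and `fake_analysis` is a stand-in for a real QTL computation that simulates one node failure.

```python
from concurrent.futures import ThreadPoolExecutor

def decompose(positions, chunk_size):
    """Split one analysis request into independent tasks (chunks of positions)."""
    return [positions[i:i + chunk_size] for i in range(0, len(positions), chunk_size)]

def run_with_restart(task_fn, tasks, max_attempts=3, workers=4):
    """Run every task, re-submitting failed ones until they succeed or attempts run out."""
    results = {}
    pending = list(range(len(tasks)))
    for _attempt in range(max_attempts):
        if not pending:
            break
        # Leaving the 'with' block waits for all submitted tasks to finish.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = {i: pool.submit(task_fn, tasks[i]) for i in pending}
        failed = []
        for i, future in futures.items():
            if future.exception() is None:
                results[i] = future.result()
            else:
                failed.append(i)
        pending = failed
    if pending:
        raise RuntimeError(f"tasks {pending} failed after {max_attempts} attempts")
    return [results[i] for i in range(len(tasks))]

seen = set()

def fake_analysis(chunk):
    """Stand-in for a QTL computation; fails on its first call for one chunk."""
    key = tuple(chunk)
    if key == (4, 5, 6, 7) and key not in seen:
        seen.add(key)
        raise IOError("simulated node failure")
    return sum(chunk)

tasks = decompose(list(range(10)), 4)
partial_sums = run_with_restart(fake_analysis, tasks)
print(partial_sums)
```

Only the failed chunk is re-submitted, so results from tasks that already completed are kept as intermediate results.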
GridQTL Portal
[Diagram: GridQTL Portal with Analysis Portlet and Data Mgr; Analysis 1 to Analysis 5 pass through a Meta Sched to UK e-Science Grid or NGS Resources]
Virtual Observatories
Observations made across entire electromagnetic spectrum
ROSAT ~keV, DSS Optical, 2MASS 2µ, IRAS 25µ, IRAS 100µ, GB 6cm, NVSS 20cm, WENSS 92cm
⇒ e.g. different views of a local galaxy
Need all of them to understand physics fully
Databases are located throughout the world
Peter Clarke
VOTES
Virtual Organisations for Trials and Epidemiological
Studies
3 year MRC (£2.9M) funded project
Plans to develop Grid infrastructure to address key components of clinical trial/observational study
– Recruitment of potentially eligible participants
– Data collection during the study
– Study administration and coordination
– Involves Glasgow, Oxford, Leicester, Nottingham, Manchester
Clinical Virtual Organisation Framework
[Diagram: the framework is used to realise CVO-1 (e.g. for data collection) and CVO-2 (e.g. for recruitment). Labels: GLA, OX, IMP, LeiNott, Transfer Grid, GPs, clinical trial data sets, disease registries, hospital databases]
Scottish Bioinformatics Research
Network
Funded (£2.4M) by Scottish Enterprise,
Scottish Higher Education Funding Council,
Scottish Executive Environment and Rural
Affairs Department
Involves Glasgow, Dundee, Edinburgh, Scottish
Bioinformatics Forum
Aim to provide bioinformatics infrastructure
for Scottish health, agriculture and industry
Infrastructure support at Dundee, Edinburgh and
Glasgow to support first-rate research in
bioinformatics at each academic institute
Infrastructure support at three institutes, to support
inter-institutional sharing of compute and data
resources through application of Grid computing
Outreach and training activities mediated by the
Scottish Bioinformatics Forum
Genetics and Healthcare Initiative
Funded by Health Department and Department
for Enterprise and Lifelong Learning
Involves Glasgow, Dundee, Edinburgh, Aberdeen
Genetics as applied to healthcare
First two years emphasis on providing a platform for research into the genetic basis of common complex diseases in Scotland
– Mental health, cardiovascular, …
Plan to establish 15,000 family-based intensively-phenotyped cohort recruited from the East and West of Scotland
Basis for neutralising heritable (genetic) risk factors in
disease surveillance, treatment optimisation, avoidance
of adverse drug events and prediction of response to
therapy, health care planning and drug discovery, …
DyVOSE
Dynamic Virtual Organisations for e-Science Education (DyVOSE) project
Exploring advanced authorisation infrastructures for security
… in Grid Computing Module as part of advanced MSc at Glasgow
– Provide insight into rolling Grid out to the masses!
[Diagram labels: ScotGrid, GU Condor pool, other (known!) Grid resources, Education VO policies, PERMIS-based authorisation checks, authorisation decisions]
Edikt
[Diagram: the Edikt project as gap filling. Labels: Standards, Requirements analysis, Technology matchmaking, E-Science Apps, CS Research, Grid Services for e-Science Data Management, Rigorous engineering, Commercial SW components and skills]
The team: 8 professional software engineers, support staff, project manager, commercialisation manager, architect, and SAB
SHEFC funded research and development grant
3 years funding: May 2002 – 2005
+3 years funding upon successful project and review
ELDAS – Data Access Service
Another (partial) implementation of the GGF WS-DAI specifications
[Diagram labels: Grid User1, Grid User2, Grid Proxy, Web User1, Web Servlet, ELDAS, Java Framework, EJB - DAS, and a DAC each for Xindice DB, MySQL DB, DB2 DB and Oracle 9i DB]
Implemented using Enterprise Java Beans
Data Access Components interface to distinct DBMSs
Accessible as a grid data service or a web data service
BinX – accessing legacy binary data
The Problem:
Many binary data files
Applications must “know” the data format
Binary data formats are machine-specific
The Solution:
Write a “stand-aside” format description in XML
Provide a library to
– Interpret the description
– Provide file access across different machines
Build higher-level services
[Diagram: simulations produce binary data files; a BinX file describes binary file structure; the BinX Library sits between these files and the e-Science Application]
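The "stand-aside" idea can be illustrated with a short sketch. The XML vocabulary below is invented for this example and is not the real BinX schema; the field names and the `build_reader` helper are likewise hypothetical. The description states the byte order, so a record written on one machine decodes the same way on any other.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical description format, NOT the real BinX schema.
DESCRIPTION = """
<dataset byteOrder="bigEndian">
  <field name="step" type="int32"/>
  <field name="time" type="float64"/>
  <field name="energy" type="float64"/>
</dataset>
"""

TYPE_CODES = {"int32": "i", "float64": "d"}
ORDER_CODES = {"bigEndian": ">", "littleEndian": "<"}

def build_reader(description_xml):
    """Turn the XML description into a function that decodes one record."""
    root = ET.fromstring(description_xml)
    fields = root.findall("field")
    names = [f.get("name") for f in fields]
    fmt = ORDER_CODES[root.get("byteOrder")] + "".join(
        TYPE_CODES[f.get("type")] for f in fields)
    record = struct.Struct(fmt)

    def read(data, offset=0):
        return dict(zip(names, record.unpack_from(data, offset)))

    read.size = record.size  # bytes per record, useful for stepping through a file
    return read

# A big-endian record as another machine might have written it.
raw = struct.pack(">idd", 7, 0.5, -1.25)
read_record = build_reader(DESCRIPTION)
decoded = read_record(raw)
print(decoded)
```

The application never hard-codes the layout: changing the description is enough to read a file with different fields or byte order.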
Mammography
A prototype of a national database of
mammographic images in support of the UK
breast screening programme
Standard Mammo Format
Mammograms have different
appearances, depending on image
settings and acquisition systems
Temporal
mammography
Computer
Aided
Detection
3D View
FirstDIG
Data mining with the First Transport Group, UK
Example: “When buses are more than 10 minutes late there is an 82% chance that revenue drops by at least 10%”
“The results of this exercise will revolutionise the way we do things in the bus industry.” Darren Unwin, Divisional Manager, First South Yorkshire.
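A rule of this kind is a conditional frequency: among journeys matching the condition, the fraction that also show the outcome. The sketch below shows the calculation only; the journey records are made up for illustration and are not FirstDIG data, and the function name is hypothetical.

```python
def rule_confidence(records, antecedent, consequent):
    """Fraction of records matching the antecedent that also match the consequent."""
    matching = [r for r in records if antecedent(r)]
    if not matching:
        return None  # the rule never fires, so no estimate is possible
    return sum(1 for r in matching if consequent(r)) / len(matching)

# Made-up (minutes_late, revenue_change_percent) pairs, for illustration only.
journeys = [(12, -15), (15, -11), (11, -3), (20, -25),
            (2, 1), (0, 0), (5, -12), (14, -10)]

confidence = rule_confidence(
    journeys,
    antecedent=lambda r: r[0] > 10,     # more than 10 minutes late
    consequent=lambda r: r[1] <= -10,   # revenue drops by at least 10%
)
print(confidence)
```

On this toy data four of the five late journeys show the revenue drop, so the rule's confidence is 0.8.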
[Diagram: a Data Mining Application, shown as an OGSA-DAI Client Application, with several OGSA-DAI services]
INWA
[Diagram: sites EPCC, UK (user@edinburgh) and Curtin, Australia (user@australia), each with TOG, Grid Engine and a Data Browser; OGSA-DAI services for Bank data, Telco data, UK Property and Australian property]
Mont Blanc Tunnel Fire
TIME (min) | Events | Consequences
0 | Fire detected | Emergency assessment too slow! Lack of co-ordination b/w 2 sides; too many vehicles enter tunnel
10 | 1st Decision: traffic stopped |
15 | 1st Response: fans turned on in wrong direction! | Enhancement of smoke and fire
20 | French Fire Br. |
25 | Italian Fire Br. | Intervention made difficult by poor initial response
39 dead!
Asif Usmani
Mont Blanc Tunnel Fire & FireGrid
TIME (min) | Events
x- | Pre-emergency response planning; case-based training
0+ | Fire detected; sensors channel info to C&C
~1 | 1st Decision: traffic stopped; early forecasts
20 | Fire Brigades
Modes: use in ‘Design’ Mode; ‘Emergency Response’ Mode
Consequences: many scenarios generated; co-ordination; preparedness; select pre-designed scenario matches; better emergency planning; escalation: alert experts, commandeer resources; better emergency assessment; sensor driven simulations initialised; emergency magnitude minimised; effective intervention; C&C tasks emergency responders
Lives saved!
Asif Usmani
FireGrid Technologies
[Diagram labels: 1000s of sensors & gateway processing; Grid; super-real-time simulation (HPC); maps, models, scenarios; KBS and planning; emergency responders; application]
Asif Usmani
Inter-Enterprise Computing
Network (IECnet)
DTI Knowledge Transfer Network
3 years from 1st February 2005
Exploiting the use of Grid computing
technologies in UK industry
Lead partner: Intellect UK
Project manager: Ian Osborne
E-Science partner: NeSC
Technical lead: Dave Berry
IECnet Objectives
1. Establish wide understanding of the potential of Inter-Enterprise Computing
2. Accelerate the recognition of requirements and issues for Inter-Enterprise Computing
3. Prepare the UK ICT industry, users and government for Inter-Enterprise Computing
4. Follow through the e-Science Core Programme vision in which demanding scientific research stimulates significant advances in Grid technology and the results are transferred to UK industry, healthcare and government.
IECnet Advisory Council
Provides strategic overview and insight into the project's operation
Representatives from:
Industry Suppliers
– HP, IBM, Intel, Sun, Oracle
Target Industry Users
– Comms, Security, Pharma, Finance, e-Gov/NHS, Engineering, SMEs, Venture Capital
e-Science Experts
DTI Oversight
– Anne Trefethen, DTI/EPSRC
Edinburgh M.Sc. in e-Science
Bob Mann
rgm@roe.ac.uk
www.ph.ed.ac.uk/postgraduate/degrees/msc_escience.html