- ePrints Soton

Drug Design & Delivery:
The role of e-Science
Jeremy Frey
School of Chemistry
University of Southampton, UK
Comb-e-Chem
Sept 2004
[Title-slide image labels: X-ray single, Mol, Raman, STM, Ocean, Monolayer]
e-Science
• ‘e-Science is about global collaboration in key
areas of science, and the next generation of
infrastructure that will enable it.’
• ‘e-Science will change the dynamic of the way
science is undertaken.’
John Taylor, DG of UK OST
• ‘[The Grid] intends to make access to computing
power, scientific data repositories and
experimental facilities as easy as the Web makes
access to information.’
Tony Blair, 2002
Jeremy G. Frey
The UK e-Science
Challenge
• £120M over a 3 Year Programme to
create the next generation IT
infrastructure to support e-Science and
Business
• Essential that UK plays a leading role in
Global Grid development with the USA
and EU
• Phase 1: Started roll out of plan for Grid
Research, Development and Support of
e-Science Pilot Projects
UK e-Science Grid
Edinburgh
Glasgow
DL
Belfast
Newcastle
Manchester
Cambridge
Oxford
Cardiff
RAL
London
Southampton
Hinxton
National e-Science Centre (NeSC)
• NeSC is in Edinburgh
• Provides Courses & Meetings
• Also has some funding for fellowships to
visit NeSC
The Collaboratory Concept
• In 1989, William Wulf, then with the U.S.
National Science Foundation, defined a
collaboratory as
"a center without walls, in which the nation's
researchers can perform their research without
regard to geographical location, interacting with
colleagues, accessing instrumentation, sharing
data and computational resources, and accessing
information in digital libraries."
The Current “Client – Server ad hoc” model
HPC
Experiment
Analysis
Storage
HPC
Scientist
Experiment
Computing
Storage
Analysis
HPC
The Future
The Grid Model - Information Utilities
Scientist
MIDDLEWARE
Experiment
Analysis
Computing
Storage
Storage
Analysis
Experiment
Computing
Computing
Storage
Access Grid
• Full multi-site video conferencing over the
IP network
• Many sites now in the UK all running the
same system
• The system originated in the USA, so there are
also sites there.
Access Grid nodes
Access Grid
The Grid
• Grid is needed because
– Volume of data (real time data,
images, video)
– Scale of computation (analysis,
simulation)
– Complexity of process (automation)
– Variable demands on computation
– Provenance (audit trails, timestamps,
process)
Comb-e-Chem Partners
• IBM
• IT Innovation
• NCS
• CCDC
• ECS
• Chemistry
• Stats
• Combi Centre
• Pfizer
• Bristol
• Chemistry
• GSK
• Southampton
• AZ
• IUPAC
• RSC
CombeChem People & Places
AZ
GSK
Pfizer
IBM
People
• Chemistry (Southampton & Bristol)
– Mike Hursthouse, Chris Frampton, Jon Essex, Jeremy Frey, Guy
Orpen, Stephan Christensen, Thomas Gelbrich, Sam Peppe,
Hongchen Fu, Graham Tizard, Suzanna Ward, Lefteris Danos
• National Crystallography Service (NCS)
– Simon Coles, Mark Light, Ann Bingham
• Electronics and Computer Science (Southampton)
– Dave De Roure, Luc Moreau, Mike Luck, Hugo Mills, Graham
Smith, Simon Miles, Nicky Harding, Gareth Hughes, monica
Schraefel, Terry Payne
• IT Innovation (Southampton)
– Mike Surridge, Ken Meacham, Steve Taylor, Daren Marvin
• Statistics (Southampton)
– Alan Welsh, Sue Lewis, Ralph Manson, Dave Woods
• Rutherford Appleton Laboratory
Design
Plan
Goal
Synthesis
Dissemination
All steps must be Grid Aware
Structure
Prediction
Properties
Modelling
Analysis &
Correlation
I will illustrate the application of e-Science to some of these
stages using examples from the Comb-e-Chem Project
Salt Selection
Design
Plan
Descriptors
Goal
Smart Lab
Semantic Grid
Synthesis
Combinatorial
Chemistry
Dissemination
All steps must be Grid Aware
Structure
Prediction
Crystallography
Simulations
Properties
Non-linear optical effects
Publication@Source
Modelling
Analysis &
Correlation
Structural Similarities
With examples…
The Comb-e-Chem Project
• The exponential world of Combinatorial
Synthesis and High throughput analysis
meets the exponentially growing power
of computing
• Funding
EPSRC, IBM, GSK, AZ, Southampton
The Comb-e-Chem Vision
Structure + Properties
Knowledge + Prediction
Structures
DB
Automation & Remote
interaction
Properties
DB
Simulation and
calculation
Co-Laboratory
Interaction between
users & “Dark Labs”
Properties
Models
Design
Experiment
Automation
Structures
Analysis
All about Automation
• Design
• Synthesis
• Measurement
Experiments
Information & Knowledge
• Analysis
• Databases
• Agents
Goal
Knowledge
Literature
Plan &
COSHH
Report
Smart Laboratory
Information
Integration
Digital Model
Analysis
Synthesis
Goal
Knowledge
Literature
Plan &
COSHH
not just one laboratory
but many co-laboratories
working together
Digital Model
Report
Information
Integration
Analysis
Synthesis
Smart Laboratory
Making best use of the Plan COSHH
Smart Lab
http://smarttea.org
Smart Help
http://smarttea.org
Laboratory Context
COSHH
Plan
Record
Annotation
Guide
Experimenters
Digital Context
Chemistry Starts in the Lab
URI
URI
URI
Lab
Lab
Lab
URI
NCS
Raw data
URI
URI
URI
Database
Publication
Structure
URI
Semantic Grid Project
• Inference based on the semantics
• Importance of Ontology
• But problem of contradictions even within
a domain
• This is not an avoidable issue
XML
But need more general descriptions for services
RDF – Resource Description Framework
DAML-S (for describing services)
Interface
Simulation
program
Gaussian
ab initio
program
XML wrapper
XML wrapper
Personal
Agent
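As a rough illustration of the XML-wrapper idea on this slide, the sketch below wraps named results from a simulation program in an XML envelope so that an agent can read them without knowing the program's native output format. The element names (`calculation`, `program`, `result`), the helpers `wrap_result` and `read_results`, and the example energy value are all hypothetical; none of them is taken from Comb-e-Chem or from Gaussian itself.

```python
import xml.etree.ElementTree as ET


def wrap_result(program: str, method: str, values: dict) -> str:
    """Wrap named numeric results from a program in a small XML envelope."""
    root = ET.Element("calculation")
    ET.SubElement(root, "program", {"method": method}).text = program
    for name, value in values.items():
        # Each result carries its name as an attribute and its value as text.
        ET.SubElement(root, "result", {"name": name}).text = repr(value)
    return ET.tostring(root, encoding="unicode")


def read_results(xml_text: str) -> dict:
    """What an agent might do: recover the named results from the envelope."""
    root = ET.fromstring(xml_text)
    return {r.get("name"): float(r.text) for r in root.findall("result")}


# Hypothetical, illustrative value only; not a real calculation result.
wrapped = wrap_result("Gaussian", "ab initio", {"energy_hartree": -76.0})
print(wrapped)
print(read_results(wrapped))
```

The same envelope could be produced by a wrapper around any other program, which is the point of the diagram: the agent depends on the wrapper's vocabulary, not on each program's output format.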
Databases
• Databases will become the key method of
handling all data
• Metadata must be generated at inception
and added as data traverses the workflow
• Version control, audit and backup handled
at the database level.
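One way to picture "metadata generated at inception and added as data traverses the workflow" is a record that stamps itself when created and appends a new entry at every later step. This is only a minimal in-memory sketch: the `Record` class, the `stamp` method, and the instrument and software names are hypothetical, and a real system would keep these entries in the database rather than in a Python list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Record:
    """A data item whose metadata grows as it moves through a workflow."""
    payload: str
    metadata: list = field(default_factory=list)

    def __post_init__(self):
        # Metadata starts at inception: the record is stamped on creation.
        self.stamp("created")

    def stamp(self, step: str, **details) -> "Record":
        # Append rather than overwrite, so earlier entries form an audit trail.
        entry = {"step": step,
                 "time": datetime.now(timezone.utc).isoformat(),
                 **details}
        self.metadata.append(entry)
        return self


rec = Record("raw diffraction images")
rec.stamp("collected", instrument="hypothetical-diffractometer-1")
rec.stamp("processed", software="hypothetical-reducer 0.1")
print([m["step"] for m in rec.metadata])
```

Because entries are only ever appended, the list doubles as a simple timestamped audit trail for the item.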
Talk
• The UK e-Science Programme
• The Comb-e-Chem Project
• “Smart Lab”
• NCS Grid Service
• Structure Analysis Services
• Dissemination & Publication
Centralised remote equipment, multiple users,
few experts
Users
Users
Users
Data & control links
Experiment
Experiment
Expert
Access Grid links
Remote (Dark) Laboratory
• Model for the National Crystallography
Service (NCS)
Expert
Expert is the central
resource in short supply
Users
Users
Users
Experiment
Experiment
Experiment
Access grid
& control links
Manufacturer
Support Service
Local link
“External” link
Raman Project
•Model for Combinatorial
Smart Labs
Synthesis
Sample
NCS
Archive
Raw images
Processed diffraction
pattern
Structure
CCDC
Validation
Database
CIF
metadata
Automated structure
determination
Journal
Archiving of Data
RAW DATA:
• Automatic archiving and retrieval with Atlas Datastore
(RAL)
• Development of schema for retrieval of crystallographic
metadata from relational databases (ISIS Data analysis
group)
• Storage Resource Broker (SRB): Uniform access interface
to different types of storage devices
RESULTS DATA:
• Automatic deposition of CIF data with CCDC Grid-enabled pre-deposition database
Data Trail
• Drill down through the analysis path
• Look at increasingly raw data
• Often large expansion in quantity and
variety at each stage
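The drill-down described here can be sketched as a chain of stages, each pointing back to the rawer stage it was derived from. The `Stage` class, the `drill_down` function, and the `urn:example:` identifiers are hypothetical; the stage names follow the crystallography path shown earlier in the talk (raw images, processed diffraction pattern, structure).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stage:
    """One stage in an analysis path, linked to the rawer stage it came from."""
    name: str
    uri: str
    derived_from: Optional["Stage"] = None


def drill_down(stage: Stage) -> list:
    """Follow derived_from links from a result back to the rawest data."""
    trail = []
    current = stage
    while current is not None:
        trail.append(current.name)
        current = current.derived_from
    return trail


raw = Stage("raw images", "urn:example:raw")
pattern = Stage("processed diffraction pattern", "urn:example:pattern", raw)
structure = Stage("structure", "urn:example:structure", pattern)
print(drill_down(structure))
```

In practice each stage would usually fan out to several rawer items rather than one, which is where the expansion in quantity and variety comes from; the single link here keeps the sketch short.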
Publication@Source
• Must be able to track back to the original
data
• Primary reason is to allow new analysis in
the future by other researchers.
• In a university environment this may be
viewed as a public responsibility; in a
business environment, as ensuring maximum
value from investment.
• This has implications for provenance
and even fraud!
Journals:
Publication @ source
Database
Journal
Journal
Paper
Materials
Laboratory Data
Multimedia
“Full” record
Publication Chain
Bibliography
Student
Journal
Professional
Body Archive
Institution
Laboratory
e-Bank Project
• Link Comb-e-Chem and other semantic
grid science projects to the e-print system
at Southampton
• Provide dissemination and provenance
Changing the way we work
E-Lab:
X-Ray
Crystallography
Samples
Quantum
Mechanical
Analysis
Data
Provenance
Authorship/
Submission
Samples
Laboratory
Processes
Laboratory
Processes
Structures
DB
E-Lab:
Combinatorial
Synthesis
Properties
Prediction
Data Mining,
QSAR, etc
Laboratory
Processes
Properties
DB
Design of
Experiment
Data Streaming
Visualisation
Agent Assistant
E-Lab:
Properties
Measurement