How EPCC can help with computationally demanding code and large data sets

MVM Research Symposium,
3rd November 2008
Eilidh Grant
Applications Consultant, EPCC
egrant1@epcc.ac.uk
+44 131 650 5115
Overview
• What is EPCC?
• Sample projects
– Modelling gene regulatory networks
– Storing large amounts of microscopy data
• How to get involved
How EPCC can help – 3rd November 2008
What is EPCC?
• High performance computing
• Develop software for all areas of science and technology
• 80 expert staff
• 2 national UK supercomputers - HECToR and HPCx
Biology projects
EPCC is involved in the following projects:
• Genetic markers of colorectal cancer
• Circadian rhythm of Arabidopsis thaliana
• Gene regulatory networks
• Parallel R for the statistical analysis of microarray data
• A data grid for cell biology
• Evolution of virulence in bird flu
Gene regulatory networks
• The aim is to statistically analyse various biological data
(microarrays, proteomics) to produce a Bayesian network
model of the underlying gene regulatory network.
• The existing code can only handle small networks of genes on a
desktop computer.
• EPCC are investigating how to get this code, originally
written in MATLAB, to run on HPCx.
Gene regulatory networks
CSBE Research Fellow Marco Grzegorczyk said: "The
implementation of the Metropolis-coupled Markov chain Monte
Carlo algorithms [on HPCx] will allow us to infer more
interesting networks with much more genes."
– Centre for Systems Biology at Edinburgh
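As a rough illustration of the Metropolis-coupled MCMC idea named in the quote, here is a minimal Python sketch. It is not the CSBE MATLAB code or its HPCx port: the toy bimodal `log_target`, the temperature ladder and the function name `mc3` are all assumptions made for illustration, and a real application would replace the scalar state with a network structure and its posterior score.

```python
import math
import random


def log_target(x):
    # Toy bimodal log-density with modes at -3 and +3 (an assumption,
    # standing in for the posterior score of a network structure).
    a = -0.5 * (x + 3.0) ** 2
    b = -0.5 * (x - 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m)) + math.log(0.5)


def accept(rng, log_ratio):
    # Standard Metropolis rule: accept with probability min(1, exp(log_ratio)).
    return rng.random() < math.exp(min(0.0, log_ratio))


def mc3(n_steps, temperatures, step_size=1.0, seed=0):
    """Return samples from the cold chain (temperatures[0] should be 1.0)."""
    rng = random.Random(seed)
    states = [0.0] * len(temperatures)
    samples = []
    for _ in range(n_steps):
        # Within-chain update: chain i targets target(x) ** (1 / T_i).
        # These updates are independent, so they could run in parallel.
        for i, temp in enumerate(temperatures):
            proposal = states[i] + rng.gauss(0.0, step_size)
            log_ratio = (log_target(proposal) - log_target(states[i])) / temp
            if accept(rng, log_ratio):
                states[i] = proposal
        # Coupling step: propose swapping the states of two adjacent chains.
        i = rng.randrange(len(temperatures) - 1)
        j = i + 1
        log_swap = (log_target(states[j]) - log_target(states[i])) * (
            1.0 / temperatures[i] - 1.0 / temperatures[j]
        )
        if accept(rng, log_swap):
            states[i], states[j] = states[j], states[i]
        samples.append(states[0])
    return samples


if __name__ == "__main__":
    draws = mc3(5000, temperatures=[1.0, 2.0, 4.0, 8.0])
    print(len(draws), min(draws), max(draws))
```

The heated chains move between modes more freely and pass good states down to the cold chain through swaps. Because the chains only interact at the swap step, the approach maps naturally onto a machine with many processors.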
Data grid
• Aim is to help scientists to share their data:
– in a simple, secure and effective manner.
• Microscope image data from cell biology:
– though any (experimental) data could be relevant.
• Working with researchers in Oxford and Edinburgh:
– Davis and Finnegan groups studying early development of the fruit fly
Drosophila melanogaster.
Data grid
• EPCC have developed a technology called
DiGS (Distributed Grid Storage).
• The data grid software provides:
– multi-Terabyte, distributed storage capability.
– automated data replication and backup function.
– periodic validation and consistency checking of data.
• It supports annotation of experiments with scientific metadata:
– assisting with data provenance.
– allowing fast and efficient searches across large volumes of data.
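To make the idea of consistency checking across replicas concrete, here is a minimal Python sketch. It does not show how DiGS itself works: the function names `sha256_of` and `find_inconsistent_replicas`, the use of SHA-256, and the majority-vote rule are all assumptions chosen for illustration.

```python
import hashlib
from collections import Counter


def sha256_of(path, chunk_size=1 << 20):
    # Hash the file in chunks so large image files need not fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_inconsistent_replicas(replica_paths):
    """Return the replicas whose checksum differs from the most common one.

    Hypothetical policy: the checksum shared by most replicas is treated
    as correct, and any replica that disagrees is flagged for repair.
    """
    checksums = {str(path): sha256_of(path) for path in replica_paths}
    majority, _count = Counter(checksums.values()).most_common(1)[0]
    return sorted(p for p, c in checksums.items() if c != majority)
```

A periodic job could call `find_inconsistent_replicas` on the copies of each file and replace any flagged copy from a good replica.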
How to get involved
• Speak to us
• Collaborative research through a joint funding bid
• HECToR - dCSE grants are available for 6-12 months of effort
and tens of thousands of CPU hours.
• MSc projects or co-supervision of a PhD
• HPCx - for the next year there will be time available on HPCx
with some developer support.
Contact Details
• Please get in touch to discuss any projects that could benefit
from working with EPCC
• http://www.epcc.ed.ac.uk
• Contact EPCC at EPCC-Support@ed.ac.uk