GridQTL: A Grid Portal for QTL Mapping of Compute Intensive Datasets

People
University of Edinburgh
Institute of Evolutionary Biology
National eScience Centre
Sara Knott
Ian White
George Seaton
Jules Hernandez
Jean-Alain Grunchec
Dave Berry
John Allen
Roslin Institute & R(D)SVS
Chris Haley
Dirk-Jan de Koning
Burak Karacaören
Wenhua Wei
Suzanne Rowe
What is GridQTL?
• 5-year Research Council-funded programme
• AIM: To develop a Grid-based platform for the
analysis of multiple traits with high-density
markers under complex genetic models
• Develop the Grid platform
• Develop the genetic analysis
• Implement analysis on platform
• And keep it free!
History: QTL Express
• User-friendly web-based software to
map QTL in outbred populations
• Fast robust analysis of structured outbred
populations
• Variance component estimation for general
pedigrees
• http://qtl.cap.ed.ac.uk
• Java servlet platform
New developments
• 100,000s of marker genotypes per
individual (SNPs, AFLPs)
• 10,000s of phenotypes, e.g. through
microarray technology
• 1000s of individuals
New Models
• Epistasis
• Combined linkage and linkage
disequilibrium mapping (LDLA)
• Expression QTL (eQTL)
• Genome-wide SNP association scans
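To give a rough sense of the workload implied by these data sizes and models, the short Python sketch below counts tests under illustrative assumptions: exactly 100,000 markers and 10,000 phenotypes (the slides give only orders of magnitude), one test per marker-trait combination for an association scan, and one test per unordered marker pair for a two-locus epistasis scan of a single trait. The function names are hypothetical and are not part of GridQTL.

```python
from math import comb

def single_marker_tests(n_markers: int, n_traits: int) -> int:
    """One test per (marker, trait) combination."""
    return n_markers * n_traits

def pairwise_epistasis_tests(n_markers: int) -> int:
    """One test per unordered pair of distinct markers, for a single trait."""
    return comb(n_markers, 2)

# Assumed round numbers; the slides state only "100,000s" and "10,000s".
N_MARKERS = 100_000
N_TRAITS = 10_000

print(single_marker_tests(N_MARKERS, N_TRAITS))  # 1000000000
print(pairwise_epistasis_tests(N_MARKERS))       # 4999950000
```

Even before any model fitting cost is considered, the pairwise count grows quadratically with the number of markers, which is one reason such scans are candidates for distributed execution.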
New requirements
• New analyses take many CPU hours
• Need to store data and intermediate results
• One QTL study, many analyses
• Batch submission and data retrieval
• Solution: distributed computing => Grid
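The batch-submission requirement can be illustrated with a minimal sketch, assuming a genome scan whose test positions can be analysed independently of one another. It splits a list of positions into fixed-size batches, each of which could then be submitted as a separate job. The helper `make_batches` and the batch size are hypothetical, and no real Grid middleware API is used here.

```python
from typing import List, Sequence

def make_batches(positions: Sequence[int], batch_size: int) -> List[List[int]]:
    """Split scan positions into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        list(positions[start:start + batch_size])
        for start in range(0, len(positions), batch_size)
    ]

# Ten hypothetical test positions, grouped four per job.
positions = list(range(10))
batches = make_batches(positions, 4)
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Because each batch depends only on its own positions, the results can be retrieved and concatenated in any order once all jobs finish.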
Grid
Distributed Computing
• Loosely coupled
• Heterogeneous
• Single administration
Grid Computing
• Large scale
• Cross-organizational
• Geographical distribution
• Distributed management
Cluster
• Tightly coupled
• Homogeneous
• Cooperative working
Source: Hiro Kishimoto, GGF17 Keynote, May 2006
Development of QTL analyses
• QTL mapping methods for multiple traits
• QTL mapping methods for high-density
multiple trait data
• Methods to detect two-QTL epistatic
interactions
• Variance components methods to map QTL
by linkage disequilibrium in general
pedigrees
GridQTL Analysis portfolio
• Experimental crosses
• Epistasis, Multi-trait, eQTL
• General pedigrees
• VC, LDLA, LD
• Sib-pairs, Half-sibs, single large full-sib
family
• Opportunistic additions
Development of Grid Platform
• Transform existing QTL Express Java
Servlet platform to be Grid compatible
• Implemented using GridSphere
(www.gridsphere.org)
• Ultimately, GridQTL as analysis component
for integrative biology
GridQTL Portal
[Figure: GridQTL Portal architecture, showing the Analysis Portlet, Data Manager (Data Mgr), Analyses 1 to 5, Meta Scheduler (Meta Sched), and the UK e-Science Grid or NGS resources.]
Implementation Challenges
• QTL software developed in many
programming languages
• Transform stand-alone scientific
applications to Grid applications
• Licensing of commercial components
(software and libraries)
• Saturation and robustness of UK Grid
Where are we now?
• QTL Express migrated to GridSphere
• Want to test-drive GridQTL?
• E-mail Sara Knott for login details
• http://www.GridQTL.org.uk
• Contacts:
• Sara Knott
s.knott@ed.ac.uk
• Mail us your suggestions
Acknowledgements
• Peter Visscher & Denise Ecklund
• Scientific Advisory Board:
• Dr. Peter Visscher, QIMR, Brisbane
• Dr. Stephen McGough, Imperial College, London
• Dr. Robert Stevens, University of Manchester
• Dr. Lon Cardon, University of Oxford
• Dr. Mike Goddard, University of Melbourne
• Dr. Orjan Carlborg, Uppsala University
• Dr. Robin Thompson, Rothamsted Research