e-HTPX
www.e-htpx.ac.uk
An e-science resource for high throughput protein crystallography

Need for High Throughput Protein Crystallography
Biochemistry provides the only experimental basis for causal understanding of biological mechanisms (Sydney Brenner, Nobel Prize 2002)
Protein structures can be interpreted in (bio)chemical terms
Experiments are then carried out to test hypotheses about the biochemical (and cellular) function of the protein
Complex systems with a large number of proteins - high throughput required for complete description

Overview of Protein Crystallography
Target Selection, Protein Production, Crystallisation, Data Collection, Phasing, Protein Structure, Structure analysis, Deposition

Top Level Aim of e-HTPX
Link the various stages into a single, all-encompassing interface from which users can initiate, plan, direct and document their experiment, either locally or remotely from a desktop computer.
Deliver this ready for Diamond.

Key Components
Access to
• Instruments
• Databases
• Computing Facilities
• Real samples transferred
• Security
• Safety
• Automation
Tests many e-science procedures

Resistance to Automation
Luddites, when faced with the use of machines (operated by less-skilled labour) to drive down their wages and to produce inferior goods, turned to wrecking the offensive machines and terrorizing the offending owners in order to preserve their wages, their jobs, and their trades.

Deposition of Data in PDB
REMARK 200 EXPERIMENTAL DETAILS
REMARK 200 EXPERIMENT TYPE : X-RAY DIFFRACTION
REMARK 200 DATE OF DATA COLLECTION : NULL
REMARK 200 TEMPERATURE (KELVIN) : NULL
REMARK 200 PH : NULL
REMARK 200 NUMBER OF CRYSTALS USED : NULL
REMARK 200
REMARK 200 SYNCHROTRON (Y/N) : Y
REMARK 200 RADIATION SOURCE : NULL
REMARK 200 BEAMLINE : NULL
REMARK 200 X-RAY GENERATOR MODEL : NULL
REMARK 200 MONOCHROMATIC OR LAUE (M/L) : M
REMARK 200 WAVELENGTH OR RANGE (A) : NULL
REMARK 200 MONOCHROMATOR : NULL
REMARK 200 OPTICS : NULL
REMARK 200
REMARK 200 DETECTOR TYPE : NULL
REMARK 200 DETECTOR MANUFACTURER : NULL
REMARK 200 INTENSITY-INTEGRATION SOFTWARE : NULL
REMARK 200 DATA SCALING SOFTWARE : NULL
REMARK 200
REMARK 200 NUMBER OF UNIQUE REFLECTIONS : NULL
REMARK 200 RESOLUTION RANGE HIGH (A) : NULL
REMARK 200 RESOLUTION RANGE LOW (A) : NULL
REMARK 200 REJECTION CRITERIA (SIGMA(I)) : NULL
REMARK 200
REMARK 200 OVERALL.
REMARK 200 COMPLETENESS FOR RANGE (%) : NULL

The Data Model
Stages: Protein production, Crystallisation, Data Collection, Phasing, Refinement, Analysis, Deposition
UML, Code Generation, DATA MODEL (classes)
MAPPING: MMCIF, XML/SQL/mmCIF, Dictionary

Properties of data model
Must be
• readily extensible
• relatively easy to maintain
• relatively easy to understand
• able to cope with realities of projects
• mappable to established formats
A single standard defining an agreed method for representation and structure of the data, using UML for the primary description

Crystallisation Facilities Web Site
Session View
Click on a well …
Have I got Crystals?

Remote Submission of Crystals to Synchrotron
Database on Beamline, e.g. ISPyB
Sample Changer and goniometry on SRS
Automatic Data Collection with DNA
e-HTPX contribution
Feedback of Results

Submitting Jobs via the Grid e.g.
Service for Multiple Model Molecular Replacement Search
Grid Portal accessing facilities at the EBI, OPPF, YSBL, SRS or ESRF
The Grid portal also allows users to securely store, upload, download and move large volumes of data between Grid hosts
Non e-HTPX arrows in green

User Issues
Users more used to a hands-on approach
Hurdle of obtaining Grid Certificates (but allows single sign-on)
Security and reliability

Developer Issues
• Transfer from Globus Toolkit 2 to Globus Toolkit 4
• Use or otherwise of workflow tools for parts of the process
• Support of e-HTPX after grant ends
• Scientists required to test procedures and give feedback

Achievements so Far
Comprehensive Data Model Developed
Specific Services Developed
• Crystallisation
• Automation of X-ray Data Collection
• Structure Solution by Molecular Replacement
Portals to access Services
"Internal" User testing in progress
• Useful feedback being obtained

e-HTPX Grant Holders
BM14 ESRF: Martin Walsh
Cambridge: Randy Read
Cardiff: Omer Rana
Daresbury: Rob Allan, Greg Diakun, Martin Guest, Colin Nave, Miroslav Papiz, Martyn Winn
EBI: Kim Henrick
Oxford: Robert Esnouf, David Stuart
York: Kevin Cowtan
1.4M from BBSRC + 127K from DTI
4 year Project
www.e-htpx.ac.uk

e-HTPX developers
Research Associates employed (plus others)
Ronan Keegan, David Meredith, Graeme Winter, Michael Gleaves, CLRC Daresbury Laboratory
Chris Mayo, (Jonathan Diprose), The Wellcome Trust Centre for Human Genetics, Oxford
Ludovic Launer, MRC France, c/o ESRF, Grenoble
Joel Fillon, Oleg Dolmanov (Anne Pajon, John Ionides), European Bioinformatics Institute, Cambridge
Paul Young, York Structural Biology Laboratory

e-HTPX SAC
Rod Hubbard (chair)
Simon Phillips
David Brown
Peter Kuhn
Omer Rana
Randy Read
Christian Cambillau
Charlotte Capener (BBSRC Observer)

Adoption of e-Science
"the workshop of the weaver was a rural cottage, from which when he tired of sedentary labour he could sally forth into his little garden, and with the spade or the hoe tend to
its culinary productions."
The Industrial Revolution moved people from their homes to factories
Will e-Science reverse the trend?
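
Illustrative sketch: data model to REMARK 200
The deposition slide shows a REMARK 200 block in which most fields are NULL, and the data model slide asks for classes that are mappable to established formats. The Python sketch below shows one way a data-collection record could be rendered as REMARK 200 style lines, with missing values reported before deposition. It is an assumption-laden illustration, not the e-HTPX data model: the class `DataCollection`, its attribute names, and the helper functions are hypothetical, the wavelength value is made up, and the spacing is readable rather than PDB column-exact. Only the field labels and the beamline name BM14 come from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCollection:
    """Hypothetical slice of a data-collection record (not the e-HTPX model)."""
    temperature_kelvin: Optional[float] = None
    ph: Optional[float] = None
    synchrotron: Optional[bool] = None
    beamline: Optional[str] = None
    wavelength_angstrom: Optional[float] = None
    detector_type: Optional[str] = None


# (attribute name, REMARK 200 label) pairs; labels copied from the slide.
REMARK_200_FIELDS = [
    ("temperature_kelvin", "TEMPERATURE (KELVIN)"),
    ("ph", "PH"),
    ("synchrotron", "SYNCHROTRON (Y/N)"),
    ("beamline", "BEAMLINE"),
    ("wavelength_angstrom", "WAVELENGTH OR RANGE (A)"),
    ("detector_type", "DETECTOR TYPE"),
]


def format_value(value) -> str:
    """Render a value the way the slide does: NULL when missing, Y/N for flags."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "Y" if value else "N"
    return str(value)


def remark_200_lines(record: DataCollection) -> List[str]:
    """Map the record onto REMARK 200 style lines (illustrative spacing only)."""
    return [
        f"REMARK 200  {label:<31}: {format_value(getattr(record, attr))}"
        for attr, label in REMARK_200_FIELDS
    ]


def missing_fields(record: DataCollection) -> List[str]:
    """List the labels that would still be deposited as NULL."""
    return [
        label
        for attr, label in REMARK_200_FIELDS
        if getattr(record, attr) is None
    ]


if __name__ == "__main__":
    # Example values: BM14 appears in the slides; the wavelength is invented.
    record = DataCollection(synchrotron=True, beamline="BM14",
                            wavelength_angstrom=0.979)
    print("\n".join(remark_200_lines(record)))
    print("Still NULL:", ", ".join(missing_fields(record)))
```

If experimental metadata were captured automatically at each stage, a check like `missing_fields` could flag gaps while the information is still easy to recover, rather than at deposition time.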