Informatics for the International Mouse Phenotyping Consortium

Andrew Blake (1), Terrence Meehan (2), Steve Brown (1), Paul Flicek (2), Ann-Marie Mallon (1), Helen Parkinson (2), William Skarnes (3) and members of the IMPC consortium
(1) MRC Harwell, Harwell, UK, (2) EMBL-EBI, Hinxton, UK, (3) Wellcome Trust Sanger Institute, Hinxton, UK

THE IMPC WEB PORTAL

Overview
The International Mouse Phenotyping Consortium (IMPC) is:
• Generating a knockout mouse strain for almost every protein-coding gene in the next decade
• Characterizing each strain using a standardized, broad-based phenotyping pipeline
• Supported by the MPI2 consortium, consisting of experts from EMBL-EBI, MRC Harwell, and the Wellcome Trust Sanger Institute

Key IMPC Resources
• IMPReSS- Standardized protocols
• iMitS

The MPI2 consortium is:
• Centralizing, analyzing, and integrating all phenotype data generated by IMPC partners
• Standardizing phenotype protocols to allow comparison of data across centers
• Using dedicated ‘data wranglers’ who work with IMPC partners to ensure proper transfer and quality control of data
• Developing a statistical analysis pipeline that will automate genotype-to-phenotype associations
• Annotating all data with biomedical ontologies to enhance data integration
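As a rough illustration of what an automated genotype-to-phenotype association can look like, the sketch below compares one measured parameter between knockout and wild-type animals with a two-sided permutation test. This is not the consortium's pipeline: the function name, the choice of test, and the body-weight values are all assumptions made up for the example.

```python
import random
from statistics import mean


def permutation_p_value(knockout, wild_type, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Hypothetical helper for illustration; not part of any IMPC software.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = abs(mean(knockout) - mean(wild_type))
    pooled = list(knockout) + list(wild_type)
    n_ko = len(knockout)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign genotype labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_ko]) - mean(pooled[n_ko:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up body-weight measurements in grams, for demonstration only.
knockout_weights = [21.1, 20.4, 19.8, 20.9, 20.2, 19.5]
wild_type_weights = [24.8, 25.3, 24.1, 25.9, 24.6, 25.0]

p = permutation_p_value(knockout_weights, wild_type_weights)
print(f"p-value: {p:.4f}")
```

A small p-value would flag the parameter as a candidate phenotype for that strain. A production pipeline would also need to account for factors such as sex, batch, and multiple testing, which this sketch ignores.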
iMitS
• The tracking interface used by IMPC partners
• Production centers register intent to create a knockout strain
• Metrics available to centers and funding agencies

The Many Roles of the IMPC Informatics Team
[Figure: workflow diagram. Panel labels: Mouse production centers; Standardizing Procedures & Tracking; LIMS; Troubleshooting & Quality Control; Analysis; IMPC databases; Statistical Pipeline & Data Integration; Disseminate]
The MPI2 consortium coordinates production of mice, harmonizes protocols and data export across centers, associates genes with phenotypes using an automated statistical pipeline, and performs data integration to gain new insights into human disease.

• QC tools are available to data providers
• Unusual data patterns are highlighted as part of quality control
• Data wranglers will sign off on data after it passes QC

mousephenotype.org
Registration of Interest
Contact us: www.mousephenotype.org/contact-us

Automated Statistical Analysis- A statistical package in the widely used R environment is being developed to automate genotype-to-phenotype associations.
User driven design- We test all of our online components with biomedical researchers to improve the utility and ease of use of the IMPC web portal.

Conclusion
• The MPI2 consortium is involved in all aspects of the IMPC pipeline, from tracking mouse production to data integration and dissemination
• An automated statistical pipeline is being developed for genotype-to-phenotype associations
• The IMPC web portal is under a rapid development cycle, with features being added after user testing
For the latest see:

The IMPC home page provides access to IMPC online resources
• Users can now register interest in genes
• Will receive updates on knockout strain production

Beta Site
• New features are tested on a beta site
• Includes a gene page and a phenotype page
• Data visualization will be made by an …
beta.mousephenotype.org

Comprehensive Search Features
• Data is indexed using ontologies
• Allows users to drill down to the data they want

Data from the first knockout mouse lines phenotyped as part of the IMPC project will be available for analysis soon.

IMPReSS
• The web portal for standardized phenotyping protocols
• Protocols based on conferences, calls, and an online forum
• IMPReSS will provide mappings between measurements and phenotypes

IMPC Pipeline
[Figure labels: QC Tools; Export]
Online forum- A forum focused on production and phenotyping protocols for the IMPC community, with over a hundred users.
Ontology Use- Data is annotated with widely used ontologies, including MP, MA, and EMAPA, to aid data integration efforts.
Data Wranglers- Dedicated personnel who work with mouse centers to ensure data is uploaded correctly and is of high quality.

The community is invited to sign up to receive strain production updates and to evaluate the presentation of data as we develop the IMPC web portal: www.mousephenotype.org

References
1) Mallon AM, et al. Accessing data from the International Mouse Phenotyping Consortium: state of the art and future plans. Mamm Genome. 2012 Oct;23(9-10):641-. PMID: 22991088.
2) Brown SD, Moore MW. The International Mouse Phenotyping Consortium: past and future perspectives on mouse phenotyping. Mamm Genome. 2012 Oct;23(9-10):632-40. PMID: 22940749.

Acknowledgements
We would like to acknowledge members of the MPI2 consortium, including: Julian Atienza-Herrero (1), Chao-Kung Chen (2), Armida Di Fenza (1), Richard Easty (3), Simon Greenaway (1), Alan Horne (3), Vivek Iyer (3), Natasha Karp (3), Gautier Koscielny (2), Jeremy C. Mason (2), David Melvin (3), Hugh Morgan (1), Asfand Qazi (3), Mathew Redden (1), Ahmad Retha (1), Luis Santos (1), Duncan Sneddon (1), Jonathan W.G. Warren (2), Henrik Westerberg (1), Robert Wilson (3), Gagarine Yaikhom (1)

Funding
This resource is developed and maintained by the MPI2 consortium.
Funding is provided by the NIH Common Fund Mechanism: U54 HG006370, Mouse Phenotyping Informatics Infrastructure (MPI2).
The iMitS resource was initially developed by the I-DCC, supported by the European Union (Project number: 223592).