The Open Microscopy Environment: Informatics & Quantitative Analysis for Biological Microscopy, HCAs & Image Repositories

Jason Swedlow
Wellcome Trust Centre for Gene Regulation & Expression, College of Life Sciences, University of Dundee, Scotland
Glencoe Software, Inc, Seattle, WA, USA

The Image Problem
• A pretty picture?
• A measurement?
(Figure labels: Organelles, Cells, Lead Discovery, Target Validation, Physiology, Dynamics, Pathology, In Vivo)

Current Imaging Workflow Paradigm
No Standards
Experiment? Treatment? Image? Analytics? Annotations?

Maybe it’s not boring…

What is complex data (for us)?
• Our focus is local: what happens in a single lab
• Experiment-driven data acquisition: output is a “result”, not shared data; most data is garbage
• Data generation is from (multiple) proprietary platforms
• A modern biomedical lab is an enterprise data generator: 8 data collectors, 2–20 different acquisition platforms, 10–1000 samples per acquisition, 0.1–20 GB/sample
• Heterogeneous training in data processing, curation & retention
• Experimental applications & models evolve on the timescale of months (at least); so must all useful informatics tools
• Tools that enable discovery are used, regardless of pain and misery involved. Tools that don’t are discarded.

Complex Data: Mitotic spindle orientation in neurogenesis
Wilcock et al (2007)
55 4D images; ~5 GB each
Max intensity projection. 30 sections @ 1.5 µm; 7 min timelapse, 38 hours

Complex Data: Mitotic Kinetochore Dynamics
GFP-CENP-A HeLa Stable Cell Line
MBL Kinetochore Consortium: Danuser, McAinsh, Meraldi, Swedlow Labs
Fully validated cell handling & perturbation
Defined imaging protocol
Correct temporal sampling (7.5 s per 3D stack)
Fully automated processing pipeline
Jaqaman et al (2010)

Complex Data: Mitotic Kinetochore Dynamics

Condition               Number of cells   Aligned sister pairs
Control                             293                  11962
Control (synchronised)               36                   1293
Nocodazole                           34                   1259
Taxol                                19                    520
Fixed                                10                    400
Hec1 siRNA                           17                    584
Nuf2 siRNA                           18                    587
MCAK siRNA                           71                   3281
KIF18A siRNA                         24                    631
Cenp-E siRNA                         68                   1951
Cenp-E siRNA + taxol                 72                   2345
CAPD2 siRNA                          33                    710
Separase siRNA                       41                   1466
Bub1 siRNA                           31                   1161
Mad1 siRNA                           47                   1532
Mad2 siRNA                           16                    301
BubR1 siRNA                          10                    244
MG132                                36                   1194
Total                               876                  31421

Current Imaging Workflow Paradigm
No Standards
Experiment? Treatment? Image? Analytics? Annotations?

Towards Image Informatics

Reqs for Image Informatics?
• Interoperability
• Metadata
• Interfaces

OME: Brief Overview
• Founded 2000 by J. Swedlow, P. Sorger, I. Goldberg
• Funded by Wellcome Trust, BBSRC, EPSRC, NIH
• Partnered with other image informatics efforts (LOCI, UCSB, VU, NCBO, EPCC, EBI…) and many commercial imaging companies
• Currently, Dundee, LOCI and Baltimore are main development sites
• Open source (LGPL & GPL), nightly release
• Initial goal: provide image informatics infrastructure for biological microscopy
• Current goal: development and integration of usable, effective tools for image data management, visualization and analysis

OME: What We Do
• OME Data Model
• Open, Exchangeable file formats
• Open Image Data Management Software

The OME Data Model: http://www.ome-xml.org

OME: What We Do
• OME Data Model
• Open, Exchangeable file formats: OME-XML, OME-TIFF, Bio-Formats
• Open Image Data Management Software

OME Bio-Formats: Proprietary File Conversion

OME: What We Do
• OME Data Model
• Open, Exchangeable file formats
• Open Image Data Management Software
(See movies at http://openmicroscopy.org/videos/)

The OMERO Platform

OMERO: Tools & Workflow

OMERO.insight: Data Viz and Analysis

OMERO.insight: Data Viz, Manage & Analyze

OMERO ScreenPlateWell: High Content Assays

OMERO.tables: HCS Analysis Results

Defining, Storage & Interfaces for ROIs
GroEL data and box-files from http://ami.scripps.edu/experiment/index.php

The OMERO Platform

ImageJ Integration with OMERO

Matlab as an OMERO Client
Calling OMERO.Blitz API from Matlab:
• All calls go through a Java API called blitzGateway
• Matlab wrappers for the calls
• Allows access to most API calls and provides helper methods for the most common operations.
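
The client layering on this slide (one gateway, thin wrappers per call, helpers for common operations) can be sketched in a few lines of Python. This is an illustration of the pattern only, not the OMERO API: `FakeGateway`, `ImageRecord`, `list_images`, `get_image` and `total_planes` are hypothetical names, the method strings are made up, and an in-memory dictionary stands in for a server connection.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical stand-in for an image held by a server."""
    image_id: int
    name: str
    size_z: int  # optical sections per stack
    size_t: int  # timepoints


class FakeGateway:
    """Hypothetical single entry point: every call is routed through call().

    A real client would forward the request to a server; here a dict
    plays that role so the sketch runs on its own.
    """

    def __init__(self, images: Iterable[ImageRecord]) -> None:
        self._images: Dict[int, ImageRecord] = {im.image_id: im for im in images}
        self.calls: List[str] = []  # record of raw calls, to show the routing

    def call(self, method: str, *args):
        self.calls.append(method)
        if method == "listImages":
            return sorted(self._images.values(), key=lambda im: im.image_id)
        if method == "getImage":
            return self._images[args[0]]
        raise ValueError(f"unknown method: {method}")


# Thin wrappers: one function per raw gateway call.
def list_images(gw: FakeGateway) -> List[ImageRecord]:
    return gw.call("listImages")


def get_image(gw: FakeGateway, image_id: int) -> ImageRecord:
    return gw.call("getImage", image_id)


# Helper for a common operation, built only on the wrappers.
def total_planes(gw: FakeGateway) -> int:
    """Count Z sections times timepoints across all images."""
    return sum(im.size_z * im.size_t for im in list_images(gw))


if __name__ == "__main__":
    # Illustrative values only, not data from the talk.
    gw = FakeGateway([ImageRecord(1, "a", 30, 10), ImageRecord(2, "b", 20, 5)])
    print(total_planes(gw))
    print(get_image(gw, 2).name)
```

Keeping the helpers on top of the wrappers, rather than on the gateway itself, means a change in the underlying call only has to be absorbed in one place.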

CellProfiler Integration with OMERO
OMERO data
CellProfiler output in OMERO

OMERO Tools: Simple Resources for Workflow

OMERO ScreenPlateWell: Help from Outside

OME in Use
Starts per IP address

Application    Unique IPs
EDITOR                292
BIO-FORMATS         12007
IMPORTER             1157
INSIGHT              1825
SERVER               1398
WEB                   398

OME: Warts & All
• Funded through 2011
• Committed to delivery & open development
• If users reject it, we reject it too.
• Technically, very ambitious (cf. “data deduplication”)
• Very collaborative, especially with groups committed to production

Towards Image Repositories

JCB DataViewer: Access to Original Image Data

http://openmicroscopy.org

OME: Current Team
• Dundee (Swedlow Lab): Chris Allan, Jean-Marie Burel, Brian Loranger, Donald MacDonald, Will Moore, Andrew Patterson, Aleksandra Tarkowska
• Dundee (Usable Image): Catriona Macaulay, David Sloan, Paula Forbes, Xinyi Jiang, Scott Loynton
• Univ Wisconsin, Madison (LOCI): Kevin Eliceiri, Curtis Rueden
• Baltimore: Ilya Goldberg, Josiah Johnston, Tomasz Macura
• Glencoe Software: Josh Moore, Melissa Linkert, Carlos Neves

Potential conflict: JRS is a Co-Founder of Glencoe Software