Martin Stöter
HT - Technology Development Studio (TDS), the HC-Screening Unit at the MPI-CBG
stoeter@mpi-cbg.de
KNIME workshop, February 27th 2015, Berlin

Screen Mining with KNIME
A user-friendly framework for high-throughput/high-content data analysis

Outline
- Introduction to High-Content Screening (HCS) data and the HCS Tools nodes
- Hands-on session: HCS Tools
- Introduction to the Scripting Integration nodes
- Hands-on session: Scripting Integration

Martin Stöter, MPI-CBG, Dresden, Germany

Technology Development Studio (TDS)
MPI-CBG, Dresden, Germany
- Screening facility for academic laboratories
- Provides full service for automation and cell-based screens, RNAi and chemical screens
- Equipment: liquid handling robots, drop dispensers, plate washers, plate readers, High-Content Screening platforms

Data Analysis is a Bottleneck in HCS!

Complex experiments
- Lots of data (too much for Excel)
- Fancy data analysis / mining

Scientists
- Many scientists, but few data analysts
- Sometimes different languages

Data analysis is often a bottleneck!
[Diagram: Data analyst + … + HCS Tools]

High-Content Screening (HCS) data

Data generation
- Cells (RNAi, compounds)
- Microscopy -> images
- Image analysis
- Cell features/parameters -> well data

Tasks/problems
- Read data from various sources: SQL database, XML, Excel, various .csv …
- Screening-specific statistics
- Screening-specific utilities
- Data mining, visualization

[Figure: 384-well plate layout (rows A-P, columns 1-24); wells are annotated "DMSO", "no AB", or with the values 10, 3, 1, 0.3, 0.1 and 0.001]

HCS Tools for KNIME

Data Import
- Image analysis readers (Opera, Operetta, MotionTracking)
- Plate readers (Envision, GeniusPro, MSD SectorImager)
- Other (Example Data, Generic XML)

Normalization
- Percent-of-control (POC), normalized percent inhibition (NPI)
- Z-score, B-score
- Vector length normalization (clustering)
- Optional: robust statistics (median + MAD)
- Select wells to normalize (controls, samples)

Quality Control
- Z-prime factor (Z'), multivariate Z', SSMD
- CV (coefficient of variation)
- Optional: robust statistics (median + MAD)
- Select wells to normalize (controls, samples)

HCS Tools for KNIME (continued)

Utilities
- Handle barcodes, wells and row letters
- Join Layout from Excel (well annotation, meta data)

Visualization
- Plate Heatmap Viewer (NEW)
- Plate Viewer (old)
- Dose Response (dependent on R!)
Advanced Statistics
- Binning Analysis (NEW)

Data Manipulation
- Range Filter, Splitter
- Outlier Removal

HCS Tools: Standardized Data Format
- Enforce standardization of the data format
- Different reader nodes shape a common data structure
- Lower the entry barrier for new users
- Standard columns: "barcode", "plateRow", "plateColumn", param1, param2, …
-> Eases the use of the other HCS Tools nodes

HCS Tools: Expand well
Standardization of the well coordinates:
- "plateRow" and "plateColumn" as integer values mirror the well position matrix (instead of a well string)
- Some nodes select these columns by default (Join Layout, Plate Heatmap Viewer)
- Compatible with the 96-, 384- and 1536-well formats
- Plate Row Converter (letter to integer and vice versa), e.g. for generating a well string

HCS Tools: Barcode Standard
Regular expression for interpretation of the barcode (NEW):
- Standardized table structure -> connection to our TDS compound database
- (?<libplatenumber>[0-9]{3})(?<projectcode>[A-z]{2})(?<date>[0-9]{6})(?<replicate>[A-z]{1})
- Configurable in Preferences -> KNIME -> HCA Tools
- Multiple barcodes / regular expressions possible
- … work in progress…

HCS Tools: Annotate Experiment
- Excel is the tool for experiment documentation and assay development
- The Join Layout node is an Excel reader for a defined spreadsheet
- Plate format with multiple well attributes (1 plate layout -> 1 column in KNIME)
  - Title of the layout starts in cell C5
  - Two empty rows between the layouts

HCS Tools: Normalization
- To compare data from different plates, days or runs, the data must be normalized per plate
- Selectable reference well population per plate
- Percent-of-control (POC), normalized percent inhibition (NPI), Z-score
- Robust statistics (median & MAD instead of mean & SD), with a statistics table as second output

HCS Tools: Quality Control (QC)
- Quality control statistics measure the assay performance
- Selectable (multiple) reference well populations per plate
- Z-prime factor (Z'), multivariate Z', strictly standardized mean
  difference (SSMD), coefficient of variation (CV)
- Robust statistics (median & MAD instead of mean & SD)

HCS Tools: Binning Analysis
- Reference: "CellProfiler and KNIME: open source tools for high content screening." Methods in Molecular Biology (Clifton, N.J.) 2013, 986, pp. 105-22
- Binning analysis describes changes in distributions
- Great tool for moving from cell data to well data (instead of just taking the mean per well)

HCS Tools: Plate Viewer (old)
- 179 plates x 384 wells = ~70,000 data points, times x parameters

HCS Tools: Plate Heatmap Viewer
- Visualization of screening campaigns with meta data
- Easy to visually find patterns, drifts, errors…

- 10 x 384-well plates
- 3 replicates
- ~10,000 data points
- Raw data
- Meta data from barcode
- Normalized data
- Different readout
- Meta data from layout
- Browsing a single plate
- Viewing the well data
- Display of images
- … more

New features:
- KNIME colors
- HiLite support
- Representation of images
- Many different configurations, e.g. color scale…

HCS Tools: the demo
Ok… now let's go to the workflow and see the nodes…
The data set: CellProfiler image data (pre-cleaned as a .table file for technical reasons)
- 10 x 384-well plates in 3 replicates with 3 images per well

Acknowledgements
Software development:
- Antje Niederlein
- Felix Meyerhofer (past)
- Holger Brandl (past)
TDS team (MPI-CBG)
HCS Tools
KNIME: Michael Berthold and the KNIME team
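
Supplement: per-plate normalization and QC statistics
The Normalization and Quality Control slides name POC, NPI, Z-score and the Z-prime factor without spelling them out. The Python sketch below shows one common textbook form of each, with the optional robust variant (median and scaled MAD instead of mean and SD). It is not the HCS Tools implementation. The function names, the sign convention chosen for NPI, the MAD scaling constant and the example control values are all assumptions made for illustration.

```python
from statistics import mean, median, stdev

# Assumption: the usual constant that makes the MAD comparable to the SD
# for normally distributed data.
MAD_SCALE = 1.4826


def center_spread(values, robust=False):
    """Return (mean, SD), or (median, scaled MAD) when robust=True."""
    if robust:
        med = median(values)
        mad = median([abs(v - med) for v in values])
        return med, MAD_SCALE * mad
    return mean(values), stdev(values)


def poc(value, controls, robust=False):
    """Percent-of-control: the value as a percentage of the control center."""
    center, _ = center_spread(controls, robust)
    return 100.0 * value / center


def npi(value, neg_controls, pos_controls, robust=False):
    """Normalized percent inhibition.

    Assumed convention: 0 % at the negative control, 100 % at the positive one.
    """
    neg, _ = center_spread(neg_controls, robust)
    pos, _ = center_spread(pos_controls, robust)
    return 100.0 * (neg - value) / (neg - pos)


def z_score(value, reference, robust=False):
    """Distance from the reference center in units of the reference spread."""
    center, spread = center_spread(reference, robust)
    return (value - center) / spread


def z_prime(neg_controls, pos_controls, robust=False):
    """Z-prime factor: 1 - 3 * (spread_neg + spread_pos) / |center_neg - center_pos|."""
    neg, neg_spread = center_spread(neg_controls, robust)
    pos, pos_spread = center_spread(pos_controls, robust)
    return 1.0 - 3.0 * (neg_spread + pos_spread) / abs(neg - pos)


# Hypothetical control wells of a single plate (made-up numbers)
neg = [100.0, 102.0, 98.0, 101.0, 99.0]
pos = [10.0, 12.0, 8.0, 11.0, 9.0]

print(poc(50.0, neg))        # 50.0
print(npi(55.0, neg, pos))   # 50.0
print(z_score(50.0, neg))
print(z_prime(neg, pos))
print(z_prime(neg, pos, robust=True))
```

All functions work on the wells of one plate at a time, matching the per-plate normalization described on the slides. Passing robust=True switches every statistic to median and MAD, which is less sensitive to outlier wells.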