workshop_HCS-tools - MPI-CBG

Martin Stöter
HT - Technology Development Studio (TDS),
the HC-Screening Unit at the MPI-CBG
stoeter@mpi-cbg.de
KNIME workshop
February 27th 2015, Berlin
Screen Mining with KNIME
A user-friendly framework
for high throughput/content data analysis
Outline
- Introduction to High-Content Screening
(HCS) data and the HCS Tools nodes
- Hands-on session: HCS Tools
- Introduction to Scripting Integration nodes
- Hands-on session: Scripting Integration
Martin Stöter, MPI-CBG, Dresden, Germany
Technology Development Studio (TDS)
MPI-CBG, Dresden, Germany
Screening facility for academic
laboratories
Provide full service for automation
and cell-based screens, RNAi and
chemical screens
Equipment: liquid handling robots, drop
dispensers, plate washers, plate readers,
High Content Screening platforms
Data Analysis is a Bottleneck in HCS!
Complex Experiments
Lots of data (too much for Excel)
Fancy data analysis / mining
Scientists
Many scientists, but few data analysts
Sometimes different languages
Data analysis is often a bottleneck!
[Diagram: Data analyst + … + HCS Tools]
High-Content Screening (HCS) data
Data generation
- Cells (RNAi, compounds)
- Microscopy -> images
- Image analysis
- Cell features/parameters -> well data
Tasks/problems
- Read data from various sources
SQL database, XML, Excel, various .csv …
- Screening specific statistics
- Screening specific utilities
- Data mining, visualization
[Figure: example 384-well plate layout (rows A-P, columns 1-24) with a concentration series (10, 3, 1, 0.3, 0.1, 0.001), DMSO control wells and "no AB" wells]
HCS Tools for KNIME
Data Import
Image Analysis Readers
(Opera, Operetta, MotionTracking)
Plate Readers
(Envision, GeniusPro, MSD SectorImager)
Other (Example Data, Generic XML)
Normalization
Percent-of-control (POC), Normalized percent inhibition (NPI)
Z-score, B-score
Vector Length Normalization (clustering)
Optional: robust statistics (Median + MAD)
Select wells to normalize (controls, samples)
Quality Control
Z-prime factor (Z'), multivariate Z', SSMD
CV (coefficient of variation)
Optional: robust statistics (Median + MAD)
Select wells to normalize (controls, samples)
HCS Tools for KNIME
Utilities
Handle barcodes, wells and row letters
Join Layout from Excel
(well annotation, meta data)
Visualization
Plate Heatmap Viewer (NEW)
Plate Viewer (old)
Dose Response (dependent on R!)
Advanced Statistics
BinningAnalysis (NEW)
Data Manipulation
Range Filter, Splitter
Outlier Removal
HCS Tools: Standardized Data Format
- Enforce standardization of data format
- Different reader nodes to shape a common data structure
- Lower the knowledge entry barrier for new users
“barcode”, “plateRow”, “plateColumn”, param1, param2, …
-> Eases the use of the other HCS Tools nodes
HCS Tools: Expand well
Standardization of the well coordinates:
- “plateRow” and “plateColumn” as integer values reflect the well position matrix (instead of a well string)
- Some nodes select these columns as default (Join Layout, Plate Heatmap Viewer)
- Compatible with 96, 384 and 1536 well format
- Plate Row Converter (letter to integer and
vice versa), e.g. for generating a well string
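The row conversion described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the node's implementation: the function names are hypothetical, and the two-letter rows ("AA" to "AF") for 1536-well plates follow a common labelling convention that is assumed here.

```python
import string


def row_letter_to_index(letter: str) -> int:
    """Convert a plate row label to a 1-based integer ("A" -> 1, "P" -> 16).

    Assumption: rows beyond "Z" are labelled "AA", "AB", ... as is common
    for 1536-well plates (32 rows, "A" to "AF").
    """
    label = letter.strip().upper()
    if not label:
        raise ValueError("empty row label")
    index = 0
    for ch in label:
        if ch not in string.ascii_uppercase:
            raise ValueError(f"invalid row label: {letter!r}")
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def row_index_to_letter(index: int) -> str:
    """Convert a 1-based row integer back to its label (16 -> "P", 27 -> "AA")."""
    if index < 1:
        raise ValueError("row index must be >= 1")
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def well_string(plate_row: int, plate_column: int) -> str:
    """Build a well string such as "B03" from integer coordinates."""
    return f"{row_index_to_letter(plate_row)}{plate_column:02d}"
```

For example, `well_string(2, 3)` gives "B03", and `row_letter_to_index("P")` gives 16, the last row of a 384-well plate.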
HCS Tools: Barcode Standard
Regular expression for interpretation of barcode (NEW):
- Standardized table structure -> connection to our TDS compound database
- (?<libplatenumber>[0-9]{3})(?<projectcode>[A-z]{2})(?<date>[0-9]{6})(?<replicate>[A-z]{1})
- Configurable in Preferences -> KNIME -> HCA Tools
- Multiple barcodes / regular expressions possible
- … work in progress…
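As an illustration of how such a barcode pattern splits a barcode into meta data, here is a minimal Python sketch. Assumptions: Python writes named groups as `(?P<name>...)` rather than `(?<name>...)`; the letter classes are written as `[A-Za-z]` because `[A-z]` would also match a few punctuation characters; the barcode "001AB150227A" is made up for the example.

```python
import re

# Same group names as on the slide, translated to Python's named-group syntax.
# Assumption: [A-Za-z] is used instead of [A-z] to match letters only.
BARCODE_PATTERN = re.compile(
    r"(?P<libplatenumber>[0-9]{3})"
    r"(?P<projectcode>[A-Za-z]{2})"
    r"(?P<date>[0-9]{6})"
    r"(?P<replicate>[A-Za-z]{1})"
)


def parse_barcode(barcode: str):
    """Return the barcode fields as a dict, or None if the barcode does not match."""
    match = BARCODE_PATTERN.fullmatch(barcode)
    return match.groupdict() if match else None


# Hypothetical barcode: library plate 001, project "AB", date 150227, replicate "A".
fields = parse_barcode("001AB150227A")
```

Each named group becomes one meta data column, which is what makes the standardized table structure possible.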
HCS Tools: Annotate Experiment
Excel is the tool for experiment documentation and assay development
The Join Layout node is an Excel reader for a defined spreadsheet format
Plate format with multiple well attributes (1 plate layout -> 1 column in KNIME)
- Title of the layout starts in cell C5
- Two empty rows between layouts
HCS Tools: Normalization
To compare data from different plates, days or runs, the data
must be normalized per plate
Selectable reference well population per plate
Percent-of-control (POC), Normalized percent inhibition (NPI), Z-score
Robust statistics (median & MAD instead of mean & SD)
with statistics table as second output
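The three normalizations can be sketched as follows. This is a minimal illustration using common textbook definitions, not the code of the HCS Tools nodes. Assumptions: NPI is scaled so that the negative control maps to 0 and the positive control to 100 (sign conventions differ between tools), and the MAD is multiplied by 1.4826 so that it is comparable to a standard deviation. In practice the reference wells would be selected per plate, e.g. by grouping on "barcode".

```python
from statistics import mean, median, stdev

# Assumption: scale factor that makes the MAD comparable to the SD for normal data.
MAD_SCALE = 1.4826


def center(values, robust=False):
    """Median (robust) or mean of the reference wells."""
    return median(values) if robust else mean(values)


def spread(values, robust=False):
    """Scaled MAD (robust) or sample standard deviation of the reference wells."""
    if robust:
        med = median(values)
        return MAD_SCALE * median([abs(v - med) for v in values])
    return stdev(values)


def poc(value, controls, robust=False):
    """Percent of control: the control center maps to 100."""
    return 100.0 * value / center(controls, robust)


def npi(value, neg_controls, pos_controls, robust=False):
    """Normalized percent inhibition: negative control -> 0, positive control -> 100."""
    neg = center(neg_controls, robust)
    pos = center(pos_controls, robust)
    return 100.0 * (neg - value) / (neg - pos)


def z_score(value, reference, robust=False):
    """Distance from the reference center in units of the reference spread."""
    return (value - center(reference, robust)) / spread(reference, robust)
```

With reference wells `[90, 100, 110]`, a well value of 120 has a z-score of 2.0 (mean 100, SD 10); with `robust=True` the median and scaled MAD are used instead.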
HCS Tools: Quality Control (QC)
Quality control statistics measure assay performance
Selectable (multiple) reference well population per plate
Z-prime factor (Z’), multivariate Z’, strictly standardized
mean difference (SSMD), coefficient of variation (CV)
Robust statistics (median & mad instead of mean & sd)
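The univariate QC measures can be written down compactly. The sketch below uses the commonly cited definitions (Z' = 1 - 3(SD_pos + SD_neg) / |mean_pos - mean_neg|; SSMD for two independent groups; CV in percent) and is not taken from the HCS Tools source. The multivariate Z' is not covered, and the 1.4826 MAD scale factor is an assumption.

```python
from math import sqrt
from statistics import mean, median, stdev

# Assumption: scale factor that makes the MAD comparable to the SD for normal data.
MAD_SCALE = 1.4826


def center(values, robust=False):
    """Median (robust) or mean."""
    return median(values) if robust else mean(values)


def spread(values, robust=False):
    """Scaled MAD (robust) or sample standard deviation."""
    if robust:
        med = median(values)
        return MAD_SCALE * median([abs(v - med) for v in values])
    return stdev(values)


def z_prime(pos, neg, robust=False):
    """Z-prime factor: separation of the positive and negative control wells."""
    window = abs(center(pos, robust) - center(neg, robust))
    return 1.0 - 3.0 * (spread(pos, robust) + spread(neg, robust)) / window


def ssmd(pos, neg, robust=False):
    """Strictly standardized mean difference, assuming independent control groups."""
    difference = center(pos, robust) - center(neg, robust)
    return difference / sqrt(spread(pos, robust) ** 2 + spread(neg, robust) ** 2)


def cv_percent(values, robust=False):
    """Coefficient of variation in percent."""
    return 100.0 * spread(values, robust) / center(values, robust)
```

For positive controls `[90, 100, 110]` and negative controls `[8, 10, 12]`, this gives Z' = 0.6 and a CV of 10 % for the positive controls.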
HCS Tools: Binning Analysis
"CellProfiler and KNIME: open source tools for high content screening." Methods in Molecular Biology (Clifton, N.J.) 2013, 986, pp. 105-22
Binning analysis describes changes in distributions
Great tool for moving from cell to well data (instead of
just taking mean per well)
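One way to picture a binning analysis is sketched below. It is an assumed scheme for illustration, not necessarily the algorithm of the Binning Analysis node: bin edges are taken as quantiles of the single-cell values of a reference population (e.g. control wells), and each well is then described by the fraction of its cells that fall into each bin. Function names and the number of bins are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def bin_edges_from_reference(reference_cells, n_bins=4):
    """Return the n_bins - 1 quantile cut points of the reference cell values."""
    return quantiles(reference_cells, n=n_bins, method="inclusive")


def bin_fractions(cells, edges):
    """Fraction of cells per bin; a value equal to an edge goes to the upper bin."""
    if not cells:
        raise ValueError("well contains no cells")
    counts = [0] * (len(edges) + 1)
    for value in cells:
        counts[bisect_right(edges, value)] += 1
    return [count / len(cells) for count in counts]


# Made-up single-cell values: the reference population and one treated well.
reference = list(range(1, 101))
edges = bin_edges_from_reference(reference, n_bins=4)
reference_profile = bin_fractions(reference, edges)
treated_profile = bin_fractions([80, 85, 90, 95], edges)
```

The reference population fills all bins equally by construction, so a well whose cells pile up in one bin (here the top bin) indicates a shift in the distribution that a per-well mean alone would describe less completely.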
HCS Tools: Plate Viewer (old)
179 plates x 384 wells = ~70,000 data points, times the number of parameters
HCS Tools: Plate Heatmap Viewer
Visualization of screening campaigns with meta data
Easy to spot patterns, drifts and errors visually…
- 10 x 384-well plates
- 3 replicates
- ~10,000 data points
- Raw data
- Meta data from barcode
- Normalized data
- Different readout
- Meta data from layout
- Browsing single plate
- Viewing the well data
- Display of images
- … more
New features:
- KNIME Colors
- HiLite support
- representation of images
- many different configurations, e.g. color scale…
HCS Tools: the demo
Ok… now let’s go to the workflow and see the nodes…
The data set: CellProfiler Image data (pre-cleaned up as a .table due to
technical reasons)
- 10 x 384-well plates in 3 replicates with 3 images per well
Acknowledgements
Software Development
Antje Niederlein
Felix Meyenhofer (past)
Holger Brandl (past)
TDS team (MPI-CBG)
KNIME
Michael Berthold and the KNIME team