Presentation - Sage Bionetworks

The Broad-DREAM Gene Essentiality
Prediction Challenge: Motivation, Data,
Scoring, and Results
Mehmet Gönen
Department of Biomedical Engineering
Oregon Health & Science University
[Figure: genome-wide RNAi screens in a large number of cell lines (dependencies), alongside extensive molecular characterization of cell lines (gene expression, copy number, mutational analysis, pathway activation).]
• Project Achilles is a large collaboration between the Broad Institute
and the Dana-Farber Cancer Institute to identify and catalog
genetic vulnerabilities across hundreds of molecularly
characterized cancer cell lines.
– The Broad-DREAM Gene Essentiality Prediction Challenge
is designed to help address this goal, specifically by looking for
models that can predict these vulnerabilities from biomarkers.
• Current public datasets Achilles v2.0 and v2.4.3 contain 102 and
216 cell lines, respectively.
– http://www.broadinstitute.org/achilles
– Querying genes for essentiality values across cell lines
– Downloading data at the shRNA or gene level
– Forwarding data to GENE-E or GenePattern module PARIS
shRNA Loss of Function Screens
Overview
Infect a cell line with the shRNA library
shRNA Library
Cancer Cell Line:
Each cell in the
population gets one
single shRNA
Measure Essentiality based on
proportion of shRNAs present at the
end of the experiment compared to
the start
Readout is Next Generation
Sequencing (previously array-based
hybridization) of shRNA barcode
sequences to get counts per shRNA,
per cell line
Red and Green in this example are depleted during the experiment and therefore have a
negative effect on viability (are ‘essential’ to viability). Orange is neutral to viability. Blue and
Yellow are enriched during the experiment and therefore have a positive effect on viability.
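As a rough illustration of the readout described above, the following Python sketch turns per-shRNA barcode counts at the start and end of a screen into log2 fold changes. It is only a sketch under stated assumptions: the function name, the pseudocount, and the counts are made up for illustration and are not taken from the Achilles pipeline.

```python
import math


def log2_fold_changes(start_counts, end_counts, pseudocount=1.0):
    """Return log2(end fraction / start fraction) for every shRNA.

    Counts are converted to fractions of the total so that differences in
    sequencing depth are not mistaken for enrichment or depletion.
    The pseudocount is an assumption of this sketch; it avoids log2(0).
    """
    n = len(start_counts)
    start_total = sum(start_counts.values()) + pseudocount * n
    end_total = sum(end_counts.get(s, 0) for s in start_counts) + pseudocount * n
    changes = {}
    for shrna, start in start_counts.items():
        start_frac = (start + pseudocount) / start_total
        end_frac = (end_counts.get(shrna, 0) + pseudocount) / end_total
        changes[shrna] = math.log2(end_frac / start_frac)
    return changes


# Made-up barcode counts that mirror the colours in the slide's example.
start = {"red": 100, "green": 100, "orange": 100, "blue": 100, "yellow": 100}
end = {"red": 20, "green": 40, "orange": 100, "blue": 160, "yellow": 180}
changes = log2_fold_changes(start, end)
```

With these illustrative counts, red and green come out negative (depleted), orange stays at zero, and blue and yellow come out positive (enriched).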
Assay
Development
Screening
Raw data QC
Final Data
Deconvolution
Quality control GenePattern modules
FilterLowshRNAs, FPmatching, shRNAremoveOverlap, ReplicatesQC, removeSamples
Remove undesirable shRNAs and cell line replicates:
• shRNAs that are low in the initial DNA pool or that overlap in sequence
• Replicate samples that fail QC
Data processing GenePattern modules
shRNAfoldChange, NormLines, shRNAcollapseReps, shRNAmapGenes
• Calculates fold change values between initial DNA pool values and
final cell line values
• Normalizes cell lines to the same scale (quantile, ZMAD, PMAD)
• Collapses replicate cell line samples to a single value per shRNA
• Maps shRNAs to gene symbols based on a mapping file
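The normalization and replicate-collapsing steps listed above can be sketched in Python as follows. This is not the GenePattern code: it assumes that ZMAD means centering each cell line by its median and scaling by its median absolute deviation (whether NormLines applies an additional scaling constant is not stated here), and it assumes replicates are collapsed by a simple mean. Both function names are hypothetical.

```python
from statistics import mean, median


def zmad(values):
    """Robust z-score: (value - median) / median absolute deviation.

    Assumed reading of "ZMAD"; no extra scaling constant is applied.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; values cannot be scaled")
    return [(v - med) / mad for v in values]


def collapse_replicates(replicates):
    """Collapse replicate samples to one value per shRNA.

    `replicates` is a list of equally long score lists (one per replicate).
    Averaging is an assumption of this sketch.
    """
    return [mean(column) for column in zip(*replicates)]


normalized = zmad([1, 2, 3, 4, 5])
collapsed = collapse_replicates([[1, 2], [3, 4]])
```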
.gct file of shRNA scores per cell line → ATARiS or DEMETER → .gct file of gene scores per cell line
http://genepattern.broadinstitute.org/gp/
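For readers who want to inspect the .gct outputs outside GenePattern, here is a minimal Python reader for the GCT v1.2 text layout (a `#1.2` version line, a line with the row and column counts, a header line starting with Name and Description, then one tab-separated row per shRNA or gene). The function name and the example gene and cell line names are made up, and the sketch assumes there are no missing values.

```python
def parse_gct(text):
    """Parse GCT v1.2 text into (sample_names, {row_name: {sample: value}})."""
    lines = text.strip("\n").split("\n")
    if lines[0].strip() != "#1.2":
        raise ValueError("expected a '#1.2' version line")
    n_rows, n_cols = (int(v) for v in lines[1].split("\t")[:2])
    header = lines[2].split("\t")
    samples = header[2:]  # the first two columns are Name and Description
    if len(samples) != n_cols:
        raise ValueError("column count does not match the header")
    table = {}
    for line in lines[3:3 + n_rows]:
        fields = line.split("\t")
        # Assumes every cell holds a number; real files may contain gaps.
        table[fields[0]] = {s: float(v) for s, v in zip(samples, fields[2:])}
    if len(table) != n_rows:
        raise ValueError("row count does not match the header")
    return samples, table


# Made-up example content with hypothetical gene and cell line names.
example = (
    "#1.2\n"
    "2\t3\n"
    "Name\tDescription\tLINE_A\tLINE_B\tLINE_C\n"
    "GENE1\tna\t-1.5\t0.2\t0.0\n"
    "GENE2\tna\t0.3\t-2.1\t0.4\n"
)
samples, scores = parse_gct(example)
```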
RNAi Data is Challenging
• shRNAs targeting the same gene produce different effects
• Sources of variability
– Varying suppression levels
– Off-target effects
– Noise
• No experimental way to fully assess shRNA performance
– Especially off-target suppression
• Naïve averaging of the values will hurt the signal
[Figure: viability data for one gene in one sample; shRNA-1 to shRNA-5 plotted on an axis from -3 to 2, annotated "Cell death".]
DEMETER infers gene dependencies by explicitly
modeling both gene- and seed-effects
[Diagram: each shRNA readout (shRNA #1 to shRNA #10, […]) is linked with a weight α to the gene it targets (Gene A, Gene B, […]) and with a weight β to its seed sequence (Seed seq #1 to Seed seq #10, […]). The gene effects are the DEMETER solutions; the seed effects are modeled alongside them.]
$$\mathrm{shRNA}_i = \mu_i + \alpha_i \times \mathrm{Gene}(i) + \beta_i \times \mathrm{Seed}(i) + \varepsilon, \qquad 0 \le \alpha_i, \beta_i \le 1$$
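To make the additive model on this slide concrete, the Python sketch below evaluates only its forward direction: given a gene effect and a seed effect, it returns the expected shRNA readout. It does not fit the model and is not DEMETER itself; the function name, the Gaussian noise term, and the numbers are assumptions for illustration.

```python
import random


def simulate_shrna_readout(mu, alpha, gene_effect, beta, seed_effect,
                           noise_sd=0.0, rng=None):
    """Forward model: mu + alpha * gene effect + beta * seed effect + noise.

    alpha and beta must lie in [0, 1], as on the slide.
    Gaussian noise is an assumption of this sketch.
    """
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("alpha and beta must be between 0 and 1")
    rng = rng or random.Random(0)
    noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
    return mu + alpha * gene_effect + beta * seed_effect + noise


# Made-up values: a strongly essential gene, a mild seed effect, no noise.
readout = simulate_shrna_readout(mu=0.1, alpha=0.5, gene_effect=-2.0,
                                 beta=0.25, seed_effect=-0.4)
```

Here the readout is 0.1 + 0.5 × (-2.0) + 0.25 × (-0.4) = -1.0, so half of the gene effect and a quarter of the seed effect reach the measured value.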
Challenge Data Release
• Phase 1 (89 cell lines): from June 2 to August 11
– 45 in training set (fully available)
– 22 in leaderboard set (hidden gene essentiality)
– 22 in test set (fully hidden)
• Phase 2 (149 cell lines): from August 11 to September 1
– 66 in training set (fully available)
– 33 in leaderboard set (hidden gene essentiality)
– 50 in test set (fully hidden)
• Phase 3 (149 cell lines): from September 1 to September 28
– 105 in training set (fully available)
– 44 in test set (hidden gene essentiality)
Sub-challenge 1 Question
• Build a model that best predicts the gene essentiality scores of
thousands of genes, using the molecular characteristics/features of
the cancer cell lines
[Diagram: 23288 copy number features, 18960 gene expression features, and 1667 mutation features are used to predict 14738 gene essentiality scores.]
Sub-challenge 1 Scoring
• We use Spearman’s rank correlation coefficient to evaluate the
performance
[Diagram: for each gene i, the measured and predicted gene essentiality scores are compared, giving correlations ρ1, …, ρi, …, ρ14738.]
• Overall score is the mean of 14738 correlation values
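The scoring rule above can be sketched in plain Python as follows. This is not the organizers' scoring code: the function names and the toy data are made up, ties receive average ranks, and returning 0 for a constant vector is an assumption of this sketch.

```python
from statistics import mean


def average_ranks(values):
    """Rank ascending (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant vector scores 0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def overall_score(measured, predicted):
    """Mean of the per-gene correlations; inputs map gene -> scores per cell line."""
    return mean(spearman(measured[g], predicted[g]) for g in measured)


# Made-up scores for two hypothetical genes across four cell lines.
measured = {"GENE_A": [1, 2, 3, 4], "GENE_B": [4, 3, 2, 1]}
predicted = {"GENE_A": [10, 20, 30, 40], "GENE_B": [1, 2, 3, 4]}
score = overall_score(measured, predicted)
```

In this toy case one gene is predicted perfectly (ρ = 1) and the other in exactly reversed order (ρ = -1), so the overall score is 0.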
Sub-challenge 2 Question
• Identify the most predictive features for each gene essentiality score
of a prioritized list of 2647 genes
[Diagram: for a single gene essentiality score, select K1 of the 23288 copy number features, K2 of the 18960 gene expression features, and K3 of the 1667 mutation features, with K1 + K2 + K3 ≤ 10.]
Sub-challenge 3 Question
• Identify the most predictive features for all gene essentialities of a
prioritized list of 2647 genes
[Diagram: for all 2647 gene essentiality scores together, select K1 of the 23288 copy number features, K2 of the 18960 gene expression features, and K3 of the 1667 mutation features, with K1 + K2 + K3 ≤ 100.]
Sub-challenges 2 & 3 Scoring
• We use Spearman’s rank correlation coefficient to evaluate the
performance
[Diagram: for each gene i, the measured and predicted gene essentiality scores are compared, giving correlations ρ1, …, ρi, …, ρ2647.]
• Overall score is the mean of 2647 correlation values
Challenge Overview
• 300 registered participants
• 2896 leaderboard submissions
– 1621 in sub-challenge 1
– 585 in sub-challenge 2
– 690 in sub-challenge 3
• 48 teams with final submissions
– 21 in sub-challenge 1
– 13 in sub-challenge 2
– 14 in sub-challenge 3
• 22 unique teams with final submissions
• 52 topics in the discussion forum
• 233 posts in the discussion forum
Overall Scores in Sub-challenge 1
[Bar chart: overall score (axis 0.00 to 0.25) for teams 1 to 21; an annotation marks the teams separated from the rest (next slide).]
Overall Comparison in Sub-challenge 1
Team 1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19   20   21
1    0.5  0.43 0.47 0.17 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
2    0.43 0.5  0.46 0.22 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3    0.47 0.46 0.5  0.19 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
4    0.17 0.22 0.19 0.5  0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
5    0    0    0    0    0.5  0.5  0.49 0.41 0.06 0.01 0    0    0    0    0    0    0    0    0    0    0
6    0    0    0    0    0.5  0.5  0.49 0.41 0.06 0.01 0    0    0    0    0    0    0    0    0    0    0
7    0    0    0    0    0.49 0.49 0.5  0.43 0.06 0.01 0    0    0    0    0    0    0    0    0    0    0
8    0    0    0    0    0.41 0.41 0.43 0.5  0.09 0.01 0    0    0    0    0    0    0    0    0    0    0
9    0    0    0    0    0.06 0.06 0.06 0.09 0.5  0.21 0.02 0    0    0    0    0    0    0    0    0    0
10   0    0    0    0    0.01 0.01 0.01 0.01 0.21 0.5  0.1  0    0    0    0    0    0    0    0    0    0
11   0    0    0    0    0    0    0    0    0.02 0.1  0.5  0.01 0    0    0    0    0    0    0    0    0
12   0    0    0    0    0    0    0    0    0    0    0.01 0.5  0.1  0    0    0    0    0    0    0    0
13   0    0    0    0    0    0    0    0    0    0    0    0.1  0.5  0.02 0    0    0    0    0    0    0
14   0    0    0    0    0    0    0    0    0    0    0    0    0.02 0.5  0    0    0    0    0    0    0
15   0    0    0    0    0    0    0    0    0    0    0    0    0    0    0.5  0.22 0    0    0    0    0
16   0    0    0    0    0    0    0    0    0    0    0    0    0    0    0.22 0.5  0    0    0    0    0
17   0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0.5  0    0    0    0
18   0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0.5  0    0    0
19   0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0.5  0    0
20   0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0.5  0
21   0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0.5
• Wilcoxon signed rank
test over 14738
correlation scores for
each team pair
• p-values are displayed
• Blue: significantly
different
• White: no significant
difference
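A pairwise comparison of this kind can be sketched in plain Python as below. How the p-values on the slide were computed is not stated, so this is only an illustration under assumptions: it uses a one-sided normal approximation to the Wilcoxon signed rank statistic, drops zero differences, applies no tie or continuity correction, and returns 0.5 when two score vectors are identical. The function names and the data are made up.

```python
import math


def average_ranks(values):
    """Rank ascending (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def wilcoxon_one_sided(a, b):
    """Approximate p-value for the alternative 'a tends to exceed b'."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        return 0.5  # identical vectors: no evidence either way
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24  # no tie correction
    z = (w_plus - mean_w) / math.sqrt(var_w)
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the normal


# Made-up per-gene scores for two hypothetical teams.
team_a = [10, 12, 15, 19, 24, 30, 37, 45]
team_b = [9, 10, 12, 15, 19, 24, 30, 37]
p_a_over_b = wilcoxon_one_sided(team_a, team_b)
p_same = wilcoxon_one_sided(team_a, team_a)
```

The normal approximation is reasonable for thousands of paired scores, as in the challenge, but it is crude for a toy example of eight pairs.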
Nonparametric Friedman’s Test
used to detect differences in treatments across multiple blocks
Score matrix (sij)
      M1    M2    M3    M4
G1    0.5   0.3   0.2   0.6
G2    0.2   0.5   0.5   0.4
G3    0.1   0.3   0.2   0.4
G4    0.3   0.2   0.1   0.4
G5    0.4   0.1   0.4   0.3
G6    0.2   0.4   0.3   0.1
Rank matrix (rij)
      M1    M2    M3    M4
G1    2     3     4     1
G2    4     1.5   1.5   3
G3    4     2     3     1
G4    2     3     4     1
G5    1.5   4     1.5   3
G6    3     1     2     4
Mean  2.75  2.42  2.67  2.17  (overall 2.50)
If the p-value is significant,
appropriate post-hoc multiple
comparisons tests would be
performed
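The ranking step of Friedman's test can be reproduced from the slide's example score matrix with the plain-Python sketch below. It ranks the methods within each block (rank 1 for the highest score, average ranks for ties), averages the ranks per method, and computes the Friedman chi-square statistic without a tie correction. The function names are hypothetical, and no p-value is computed because that would require a chi-square distribution function.

```python
def descending_average_ranks(scores):
    """Rank one block so the largest score gets rank 1; ties share the average."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def friedman(score_matrix):
    """Return (rank matrix, average rank per method, chi-square statistic)."""
    n = len(score_matrix)     # number of blocks (rows G1..G6)
    k = len(score_matrix[0])  # number of methods (columns M1..M4)
    rank_matrix = [descending_average_ranks(row) for row in score_matrix]
    avg = [sum(row[j] for row in rank_matrix) / n for j in range(k)]
    # Uncorrected statistic; a tie correction would change it slightly.
    statistic = 12 * n / (k * (k + 1)) * sum((a - (k + 1) / 2) ** 2 for a in avg)
    return rank_matrix, avg, statistic


# Score matrix from the slide: rows G1..G6, columns M1..M4.
scores = [
    [0.5, 0.3, 0.2, 0.6],
    [0.2, 0.5, 0.5, 0.4],
    [0.1, 0.3, 0.2, 0.4],
    [0.3, 0.2, 0.1, 0.4],
    [0.4, 0.1, 0.4, 0.3],
    [0.2, 0.4, 0.3, 0.1],
]
rank_matrix, average_rank, statistic = friedman(scores)
```

The average ranks round to 2.75, 2.42, 2.67, and 2.17, matching the slide, and the uncorrected statistic for this small example is 0.75.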
Detailed Comparison of Top 4 Teams
• Nonparametric Friedman’s test on rankings over 14738 correlation scores of top four teams
• Methods are different with a p-value = 3e-4
• Tukey’s honestly significant difference criterion as the post-hoc test
• Top three teams could not be separated from each other
[Figure: average rank (axis 2.46 to 2.56) for teams 1 to 4.]
Wisdom of Crowds in Sub-challenge 1
[Chart: overall score (axis 0.00 to 0.25) against teams 1 to 21.]
Sub-challenge 2 Overall Scores
[Bar chart: overall score (axis 0.00 to 0.25) for teams 1 to 13; an annotation marks the teams separated from the rest (next slide).]
Overall Comparison in Sub-challenge 2
Team 1    2    3    4    5    6    7    8    9    10   11   12   13
1    0.5  0.04 0    0    0    0    0    0    0    0    0    0    0
2    0.04 0.5  0.15 0.06 0.06 0    0    0    0    0    0    0    0
3    0    0.15 0.5  0.32 0.31 0.04 0.01 0.01 0    0    0    0    0
4    0    0.06 0.32 0.5  0.48 0.09 0.03 0.04 0    0    0    0    0
5    0    0.06 0.31 0.48 0.5  0.1  0.04 0.04 0    0    0    0    0
6    0    0    0.04 0.09 0.1  0.5  0.3  0.34 0.05 0.06 0    0    0
7    0    0    0.01 0.03 0.04 0.3  0.5  0.45 0.13 0.16 0    0    0
8    0    0    0.01 0.04 0.04 0.34 0.45 0.5  0.1  0.13 0    0    0
9    0    0    0    0    0    0.05 0.13 0.1  0.5  0.45 0.02 0    0
10   0    0    0    0    0    0.06 0.16 0.13 0.45 0.5  0.01 0    0
11   0    0    0    0    0    0    0    0    0.02 0.01 0.5  0    0
12   0    0    0    0    0    0    0    0    0    0    0    0.5  0
13   0    0    0    0    0    0    0    0    0    0    0    0    0.5
• Wilcoxon signed rank
test over 2647
correlation scores for
each team pair
• p-values are displayed
• Blue: significantly
different
• White: no significant
difference
Wisdom of Crowds in Sub-challenge 2
[Chart: overall score (axis 0.00 to 0.25) against teams 1 to 13.]
Overall Scores in Sub-challenge 3
[Bar chart: overall score (axis 0.00 to 0.25) for teams 1 to 14; an annotation marks the teams separated from the rest (next slide).]
Overall Comparison in Sub-challenge 3
Team 1    2    3    4    5    6    7    8    9    10   11   12   13   14
1    0.5  0.04 0    0    0    0    0    0    0    0    0    0    0    0
2    0.04 0.5  0.09 0.03 0    0    0    0    0    0    0    0    0    0
3    0    0.09 0.5  0.3  0    0    0    0    0    0    0    0    0    0
4    0    0.03 0.3  0.5  0    0    0    0    0    0    0    0    0    0
5    0    0    0    0    0.5  0.26 0    0    0    0    0    0    0    0
6    0    0    0    0    0.26 0.5  0.02 0    0    0    0    0    0    0
7    0    0    0    0    0    0.02 0.5  0.3  0.19 0.11 0    0    0    0
8    0    0    0    0    0    0    0.3  0.5  0.35 0.24 0    0    0    0
9    0    0    0    0    0    0    0.19 0.35 0.5  0.36 0    0    0    0
10   0    0    0    0    0    0    0.11 0.24 0.36 0.5  0    0    0    0
11   0    0    0    0    0    0    0    0    0    0    0.5  0.19 0    0
12   0    0    0    0    0    0    0    0    0    0    0.19 0.5  0    0
13   0    0    0    0    0    0    0    0    0    0    0    0    0.5  0
14   0    0    0    0    0    0    0    0    0    0    0    0    0    0.5
• Wilcoxon signed rank
test over 2647
correlation scores for
each team pair
• p-values are displayed
• Blue: significantly
different
• White: no significant
difference
Wisdom of Crowds in Sub-challenge 3
[Chart: overall score (axis 0.00 to 0.25) against teams 1 to 14.]
Further Analysis for Feature
Selection in Sub-challenge 2
[Bar chart: number of features (axis 0 to 10) selected by each of teams 1 to 13, broken down into copy number, gene expression, and mutation features.]
Further Analysis for Feature
Selection in Sub-challenge 3
[Bar chart: number of features (axis 0 to 100) selected by each of teams 1 to 14, broken down into gene expression, mutation, and copy number features.]
Further Analysis for
Predictability of Genes
[Histograms, one panel each for Sub-challenge 1, Sub-challenge 2, and Sub-challenge 3: frequency against scores (axis −0.5 to 1.0).]
Acknowledgments
• Organizers
– Mehmet Gönen, Adam Margolin (OHSU)
– Barbara Weir, Aviad Tsherniak, Sara Howell (Broad Institute)
– Daniel Marbach (EPFL)
– Bruce Hoff, Thea Norman (Sage Bionetworks)
– Gustavo Stolovitzky (IBM Computational Biology Center)
• Funding
– Broad Institute (Projects Achilles and CCLE) and Dana-Farber
Cancer Institute
– NCI Cancer Target Discovery and Development Network (CTDD)
– NCI Integrative Cancer Biology Program (ICBP)
• Journal Partner
– Nature Biotechnology (Craig Mak and Andy Marshall)
Best Performing Teams
• Sub-challenge 1
– BERL (Masayuki Karasuyama and Hiroshi Mamitsuka)
– UPS (Vladislav Uzunangelov, Sahil Chopra, Kiley Graim, Daniel
Carlin, Yulia Newton, Alden Deran, Adrian Bivol, Sam Ng, Kyle
Ellrott, Evan Paull, Artem Sokolov, and Joshua M. Stuart)
– wtwt5237 (Tao Wang, Xiaowei Zhan, Hao Tang, Yang Xie, and
Guanghua Xiao)
• Sub-challenge 2
– Guanlab_UMich (Fan Zhu and Yuanfang Guan)
• Sub-challenge 3
– Team FAT (Peddinti Gopalacharyulu, Alok Jaiswal, Kerstin
Bunte, Suleiman Khan, Jing Tang, Antti Airola, Krister
Wennerberg, Tapio Pahikkala, Samuel Kaski, and Tero
Aittokallio)