Table 1S – Test Dataset
Tab 1- Test dataset Transcription Factor
A subset of the True Interactions: a list of the 90 transcription factors from
TRANSFAC used for the gold standard.
Tab 2- List of Genes in Test dataset
A subset of the True Interactions (TI) from TRANSFAC - 90 transcription factors
and their targets. This list contains 1330 unique elements. If a transcription
factor is also a target it is listed as TF;Tar. The gold standard.
Tab 3- TRANSFAC interactions
The entire list of True Interactions- 2486 TRANSFAC Interactions in the test
dataset. The gold standard.
Tab 4- Full Transcription Factor list
A list of all human transcription factors. Curated from AnimalTFDB and KEGG.
Table 2S – Details of method comparison
Tab 1- Optimized CV value
A table showing, for each dataset (Eppert, Metzeler, Valk, Macrae, and TCGA),
the optimized coefficient of variation (CV) value used in AML 2.1.
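The coefficient of variation is the standard deviation divided by the mean. A minimal sketch of a CV-based gene filter follows. The function names, the toy expression values, and the choice to keep genes at or above the cutoff are assumptions for illustration and are not taken from the tables.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


def filter_by_cv(expression, cv_cutoff):
    """Keep genes whose CV is at least cv_cutoff.

    Assumption: low-variance genes are discarded; the legend does not
    state the direction of the cutoff.
    """
    return {
        gene: values
        for gene, values in expression.items()
        if coefficient_of_variation(values) >= cv_cutoff
    }


# Toy expression values (hypothetical, one list of samples per gene).
expression = {
    "GENE_A": [10.0, 10.0, 10.0, 10.0],  # no variation, CV = 0
    "GENE_B": [2.0, 4.0, 6.0, 8.0],      # CV is roughly 0.52
}
kept = filter_by_cv(expression, cv_cutoff=0.2)
print(sorted(kept))
```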
Tab 2- CV Optimization Comparison
Details of the different methods of network inference. For each method
(ARACNE, Pearson Correlation, GENIE3, and TIGRESS) and dataset the number of
true interactions before the CV cutoff and after the CV cutoff for both 100 and
1,000 of the top interactions are listed. We also compare the precision of the
methods, which for this test was higher for Pearson correlation.
Tab 3- Pre-CV Integrated Network
For each of the overlap groups generated for the various methods, the precisions
(ratios) of TI hits found in overlaps 2 through 5 are listed with no CV cutoff applied. In the
random case, interactions were first randomly selected from all the possible
interactions of each of the 5 Pre-CV datasets. Below this are more details on the
interactions found in the various networks using the other methods, compared
to Pearson correlation. This includes shared interactions and TI hits in addition to
the new interactions compared to the Pearson correlation. The Venn diagram
shows the overlap among all the new TF interactions of the various methods.
Tab 4- Post-CV Integrated Network
For each of the overlap groups generated for the various methods, the precisions
(ratios) of TI hits are listed after the respective cutoff. In the random case,
interactions were first randomly selected from all the possible interactions of
each of the 5 Post-CV datasets. For this test the precision of the methods was
higher for ... Below this are more details on the interactions found in the various
networks using the other methods, compared to Pearson correlation. This
includes shared interactions and TI hits in addition to the new interactions
compared to the Pearson correlation. The Venn diagram shows the overlap
among all the new TF interactions of the various methods.
Table 3S – Poisson optimization
Tab 1- CC Cutoff and # of Interactions
This table shows the cutoff value for each dataset using the Pearson Correlation
method. It also shows the number of interactions after using the correlation
cutoffs and the interactions included in AML 2.1 after reproducibility analysis.
Tab 2- Total Interactions
The number of interactions in each overlap group found using the indicated
average TIs per bin (equivalent to lambda of Poisson) and p-value are shown. We
also show the ratio of the sum of interactions found in overlap 2 and above
divided by the interactions found in overlap 1. This ratio is a measure of
reproducibility.
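The reproducibility ratio described for this tab can be sketched as follows. The sketch assumes that an interaction belongs to overlap group k when it appears in exactly k datasets; that reading, the function names, and the toy edges are illustrative assumptions.

```python
from collections import Counter


def overlap_group_sizes(datasets):
    """Count interactions by the number of datasets that contain them."""
    support = Counter()
    for edges in datasets:
        support.update(set(edges))  # each dataset counts an edge once
    return Counter(support.values())


def overlap_ratio(group_sizes):
    """Interactions in overlap 2 and above, divided by those in overlap 1."""
    single = group_sizes.get(1, 0)
    if single == 0:
        raise ValueError("no interactions in overlap 1")
    multi = sum(n for k, n in group_sizes.items() if k >= 2)
    return multi / single


# Toy datasets (hypothetical TF-target pairs).
d1 = {("TF1", "G1"), ("TF1", "G2"), ("TF2", "G3")}
d2 = {("TF1", "G1"), ("TF2", "G3"), ("TF3", "G4")}
d3 = {("TF1", "G1"), ("TF4", "G5")}

sizes = overlap_group_sizes([d1, d2, d3])
ratio = overlap_ratio(sizes)
print(dict(sizes), round(ratio, 3))
```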
Tab 3- TI Hits
The number of TIs in each overlap group is listed for all combinations of TI rate
per bin and p-value (ranging from 1 to 3 and 0.05 to 0.30, respectively).
Tab 4- TI Enrichment
The number of TIs divided by the number of interactions in that overlap group is
listed for all combinations of TI rate per bin and p-value (ranging from 1 to 3 and
0.05 to 0.30, respectively).
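As an illustration of the TI enrichment ratio, the sketch below divides the number of gold-standard hits by the number of interactions in each overlap group. The function name and the toy interactions are hypothetical.

```python
def ti_enrichment(groups, true_interactions):
    """Return, per overlap group, TI hits divided by group size."""
    result = {}
    for k, edges in groups.items():
        hits = sum(1 for edge in edges if edge in true_interactions)
        result[k] = hits / len(edges) if edges else 0.0
    return result


# Toy overlap groups and gold standard (hypothetical labels).
groups = {
    1: ["e1", "e2", "e3", "e4"],
    2: ["e5", "e6"],
}
gold_standard = {"e1", "e5"}

enrichment = ti_enrichment(groups, gold_standard)
print(enrichment)
```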
Tabs 5 and 6 – Overlap ratio rank and final rank
The parameters are first ranked according to reproducibility (in overlap ratio)
and then (in Final rank) the top third, which had comparable values of
reproducibility, is ranked using the TI enrichment, a measure equivalent to
precision. Lambda = 2 and p-value = 0.1 were therefore used in the Poisson
distribution to identify the correlation coefficient cutoff.
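The two-stage ranking can be sketched as below: sort all parameter settings by overlap ratio, keep the top third, then re-rank that subset by TI enrichment. The `Setting` type, the rounding of "top third" with a ceiling, and all numeric values are assumptions made for illustration; they are not the values in the tables.

```python
import math
from typing import NamedTuple


class Setting(NamedTuple):
    lam: float            # average TIs per bin (Poisson lambda)
    p_value: float
    overlap_ratio: float  # reproducibility measure
    ti_enrichment: float  # precision-like measure


def final_ranking(settings):
    """Rank by reproducibility, keep the top third, re-rank by enrichment."""
    by_repro = sorted(settings, key=lambda s: s.overlap_ratio, reverse=True)
    top_third = by_repro[: math.ceil(len(by_repro) / 3)]
    return sorted(top_third, key=lambda s: s.ti_enrichment, reverse=True)


# Toy values only (hypothetical).
toy = [
    Setting(1, 0.05, 0.30, 0.010),
    Setting(1, 0.10, 0.28, 0.020),
    Setting(2, 0.05, 0.10, 0.050),
    Setting(2, 0.10, 0.12, 0.040),
    Setting(3, 0.05, 0.05, 0.060),
    Setting(3, 0.10, 0.06, 0.030),
]
ranked = final_ranking(toy)
print(ranked[0])
```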
Table 4S – TFG overlaps with TI hits
All Tabs
The transcription factor gene (TFG) interactions and the datasets used (column
D). Each tab represents the different overlap groups from 1 to 5 for AML 2.1. For
overlap 2 we also list separately the interactions from this group that were
included in AML 2.1. The average rank is obtained from the five datasets ranked
by the Pearson correlation coefficient. Additionally, each tab contains the TI
precision (ratio) for that overlap group.
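One way the average rank could be computed is sketched below. It assumes that each dataset ranks its interactions by descending Pearson correlation coefficient (rank 1 is the strongest) and that the average is taken over the datasets in which the interaction appears; both points, and the toy coefficients, are assumptions.

```python
def average_ranks(datasets):
    """Average, per interaction, its within-dataset rank by correlation."""
    ranks = {}
    for scores in datasets:
        # Highest coefficient gets rank 1 (ties keep their input order).
        ordered = sorted(scores, key=scores.get, reverse=True)
        for position, edge in enumerate(ordered, start=1):
            ranks.setdefault(edge, []).append(position)
    return {edge: sum(r) / len(r) for edge, r in ranks.items()}


# Toy correlation coefficients for two datasets (hypothetical).
dataset_1 = {"A-B": 0.9, "A-C": 0.8, "B-C": 0.7}
dataset_2 = {"A-C": 0.95, "A-B": 0.6}

avg = average_ranks([dataset_1, dataset_2])
print(avg)
```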
Table 5S – AML 2.1 PPI overlaps
Overlap Tabs
The protein-protein (PPI) interactions and the datasets used. Each tab represents
the different overlap groups from 1 to 5 for AML 2.1. The mean of the average
coefficient column can be found at the bottom of each overlap tab.
Eppert_HippieHit_15000000_300
All 15,000,000 interactions partitioned into 300 bins with 50,000 interactions per
bin. Ranked left to right in descending order of correlation.
Eppert_HippieHit_500000_500
500,000 interactions partitioned into 500 bins with 1,000 interactions per bin.
Ranked left to right in descending order of correlation.
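The bin sizes follow from the totals: 15,000,000 / 300 = 50,000 and 500,000 / 500 = 1,000. A minimal sketch of counting reference hits per bin over a correlation-ranked list is given below. The function name and toy data are hypothetical, and the reference set stands in for whatever database the hits were scored against (the tab names suggest HIPPIE, which is an assumption here).

```python
def hits_per_bin(ranked_pairs, reference, bin_size):
    """Count reference hits in consecutive bins of a ranked list.

    ranked_pairs must already be sorted by descending correlation.
    """
    counts = []
    for start in range(0, len(ranked_pairs), bin_size):
        chunk = ranked_pairs[start:start + bin_size]
        counts.append(sum(1 for pair in chunk if pair in reference))
    return counts


# Bin sizes implied by the two tab names.
assert 15_000_000 // 300 == 50_000
assert 500_000 // 500 == 1_000

# Toy ranked list, strongest correlation first (hypothetical).
ranked = ["p1", "p2", "p3", "p4", "p5", "p6"]
reference = {"p1", "p2", "p5"}
bins = hits_per_bin(ranked, reference, bin_size=2)
print(bins)
```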
Table 6S – Network AML 2.1
Tab 1- AML_2.1 Network
The list of all interactions in the AML_2.1 Network. Column C denotes pp if part
of the PPI sub-network and pd if part of the TFG sub-network. Column D is 1 if a
directed edge and 0 otherwise.
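A sketch of reading such an edge list is given below. It assumes a tab-delimited export in which columns A and B hold the two interacting genes; only the meaning of columns C and D comes from the description above, and the gene names are placeholders.

```python
import csv
import io
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    target: str
    kind: str       # "pp" (PPI sub-network) or "pd" (TFG sub-network)
    directed: bool  # True when column D is 1


def read_edges(handle):
    """Parse rows of (gene A, gene B, pp/pd, 0/1) from a tab-delimited file."""
    edges = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        a, b, kind, flag = row[:4]
        if kind not in ("pp", "pd"):
            raise ValueError(f"unexpected interaction type: {kind!r}")
        edges.append(Edge(a, b, kind, flag == "1"))
    return edges


# Toy rows (hypothetical gene names, assumed column layout).
toy = "GENE1\tGENE2\tpp\t0\nTF1\tGENE3\tpd\t1\n"
edges = read_edges(io.StringIO(toy))
tfg_edges = [e for e in edges if e.kind == "pd"]
print(tfg_edges)
```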
Tab 2- AML_2.1_Genes
A list of all genes (accompanied by their respective Ensembl_Gene and
EntrezGene IDs) in the AML_2.1 Network.
Table 7S – Cluster analysis
All Tabs
MCODE cluster analysis of the AML_2.1 Network. This table also includes a RECON2
summary and counts indicating which gene sets are elevated in AML or Normal
samples, based on odds ratio and p-value. Additionally, all clusters have a tab
containing statistics obtained using BiNGO GO Enrichment.
Table 8S – AML 2.1 properties and mutations
Tab 1- AML_2.1_Mutation_Subnetwork_Nei
A subset of the AML_2.1 Network containing all interactions that have an edge
to a mutation gene.
Tab 2 - Gene_Mutation_Count
The mutations and mutation first neighbors, as well as the number of edges
separating the gene from a mutation.
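The two quantities in Tabs 1 and 2 can be sketched as follows: the sub-network of edges that touch a mutation gene, and the number of edges separating each gene from the nearest mutation gene, found by breadth-first search. Treating the network as undirected for this distance, and all names in the toy data, are assumptions.

```python
from collections import deque


def mutation_subnetwork(edges, mutated):
    """Keep edges with at least one endpoint in the mutated gene set."""
    return [(a, b) for a, b in edges if a in mutated or b in mutated]


def distance_to_mutation(edges, mutated):
    """Breadth-first search from all mutated genes at once.

    Returns the number of edges to the nearest mutated gene; genes that
    cannot reach a mutated gene are left out of the result.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    dist = {gene: 0 for gene in mutated if gene in neighbors}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Toy network (hypothetical): M1 is the only mutated gene.
edges = [("M1", "A"), ("A", "B"), ("B", "C"), ("X", "Y")]
mutated = {"M1"}
sub = mutation_subnetwork(edges, mutated)
dist = distance_to_mutation(edges, mutated)
print(sub, dist)
```

Genes at distance 1 in `dist` correspond to the mutation first neighbors.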
Tab 3- Gene_GOEnrichment
GO functional enrichment analysis of the mutation sub-network. See Biology
results section, AML Mutations, for more details about these statistics.
Tab 4- AML 2.1 properties
Individual functional and network details for each of the genes in the AML_2.1
Network.
Table 9S – KIEN beta coefficients
Tab 1- Table9S_kinase_equation_coeffic
Full list of coefficients calculated using the method discussed in section
"Regression analysis with KIEN" within the methods of the paper. A larger
coefficient for a kinase means that inhibition of that kinase is associated with a
stronger response in the in vitro experiments with patients' primary cells. Note
that only 101 kinases in this list are present in the AML 2.1 network.
Table 10S TFG
All Tabs
Overlap groups for the TFG sub-network generated by applying the technique
used for AML_2.1 to a total of 12 datasets (Eppert,
Metzeler, Valk, Macrae, TCGA, Alcalay, Alizadeh, Klein_15434, Klein_21261,
Roller, Taskesen, Verhaak). See main text for GEO numbers.
Table 10S PPI
All Tabs
Overlap groups for the PPI sub-network generated by applying the technique
used for AML_2.1 to a total of 12 datasets (Eppert,
Metzeler, Valk, Macrae, TCGA, Alcalay, Alizadeh, Klein_15434, Klein_21261,
Roller, Taskesen, Verhaak). See main text for GEO numbers.
Table 11S – 5 to 12 group 2
All Tabs
TFG data used to generate figure 4B in the main paper.
Table 12S – TI Hits 12 and 10 to 12
Tab 1- TIHit_12
A summary of the TI ratios (precision) for the overlap groups of the 12 datasets
mentioned in Table 10S TFG.
Tab 2- 5 datasets
A summary of AML_2.1 used for comparison to the network containing 12
datasets.
Tab 3- TFG 10 to 12
This table shows how many TFG interactions from overlaps 1 through 10
(obtained using 10 datasets) were present in the next 2 overlap groups or were
still in the same group (using 12 datasets). Shown in figure 4A.
Tab 4- PPI 10 to 12
This table shows how many PPI interactions from overlaps 1 through 10
(obtained using 10 datasets) were present in the next 2 overlap groups or were
still in the same group (using 12 datasets).
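The comparison between 10 and 12 datasets in Tabs 3 and 4 can be sketched as a transition count: for each interaction, compare its overlap group with 10 datasets to its group with 12 datasets, which can stay the same or rise by 1 or 2. The function name and the toy support counts are hypothetical.

```python
from collections import Counter


def migration_table(support_10, support_12):
    """Count interactions by (overlap group with 10 datasets, shift).

    shift is 0 when the interaction stays in the same group and 1 or 2
    when the two added datasets move it to a higher group.
    """
    table = Counter()
    for edge, k10 in support_10.items():
        shift = support_12[edge] - k10
        if not 0 <= shift <= 2:
            raise ValueError(f"inconsistent support for {edge!r}")
        table[(k10, shift)] += 1
    return table


# Toy support counts (hypothetical): number of datasets containing each edge.
support_10 = {"e1": 1, "e2": 1, "e3": 2}
support_12 = {"e1": 1, "e2": 3, "e3": 3, "e4": 1}  # e4 is new, so not tracked

table = migration_table(support_10, support_12)
print(dict(table))
```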