Supplementary Information (Colon cancer molecular subtypes):

- Additional Methods:
  - Analysis of Microsatellite instability
  - Analysis of mutations in KRAS codon 12/13
  - Analysis of BRAF V600E Mutation
  - Analysis of PI3K Mutations
  - Immunohistochemical analysis of β-catenin.
  - Immunohistochemical analysis of Ki-67 (proliferation).
  - Immunohistochemical analysis of M30 (apoptosis).
- Supplementary Tables:
  - Table S1 Analysis of KEGG pathways
  - Table S2 Correlations of tumor subtypes with clinical parameters
  - Table S3 Performance of the classification of the 167-gene Low-stroma predictor
  - Table S4 Coincident genes between different gene signatures
- Supplemental Legend to Figure S1

Additional Methods:

Analysis of Microsatellite instability

DNA was purified from the same TRIZOL tumor lysate used for RNA purification, after the RNA had been extracted. Sequences of Bat25, Bat26 and D5S346 were analyzed by PCR using the following primers:

Bat26-sense: FAM-TGACTACTTTTGACTTCAGCC
Bat26-antisense: AACCATTCAACATTTTTAACCC
Bat25-sense: HEX-TCGCCTCCAAGAATGTAAGT
Bat25-antisense: TCTGCATTTTAACTATGGCTC
D5S346-sense: TET-ACTCACTCTAGTGATAAATCGGG
D5S346-antisense: AGCAGATAAGACAAGTATTACTAGTT

Fragments were run in an ABI Prism 310 using the software ABI Prism Collection and Gene Scan 3.1 (Applied Biosystems). Samples with at least two affected microsatellites were considered MSI+.

Analysis of mutations in KRAS codon 12/13.

Mutations in codons 12 and 13 of KRAS were analyzed by PCR followed by restriction analysis and electrophoresis, following the method of Hatzaki et al. (1) with some modifications.

For KRAS codon 12, the PCR primer sequences were:

Sense: ACTGAATATAAACTTGTGGTAGTTGGACCT
Antisense: (FAM-6)-CCTCTATTGTTGGATCATATTCGTC

The 109 bp PCR product was treated with BstNI, which cuts it into two fragments of 29 and 80 bp. Fragments were run in the ABI Prism 310. The 80 bp fragment was labeled with FAM-6 and was detected by the sequencer if KRAS was wild type.
If codon 12 was mutated, the restriction enzyme did not cut and the complete 109 bp PCR product was detected.

For KRAS codon 13, the PCR primer sequences were:

Sense: (FAM-6)-TAACGCCTGCTGAAAATGACTG
Antisense: GTATCGTCAAGGCACTCTTGCCTAGG

The 79 bp PCR product was treated with HaeIII, which cuts it into two fragments of 53 and 26 bp. Fragments were run in the ABI Prism 310. The 53 bp fragment was labeled with FAM-6 and was detected by the sequencer if KRAS was wild type. If codon 13 was mutated, the restriction enzyme did not cut and the complete 79 bp PCR product was detected.

Analysis of BRAF V600E Mutation:

The BRAF V600E mutation was analyzed using TaqMan assays and real-time PCR (ABI Prism 7500 sequence detection system). PCR primer and reporter sequences were:

BRAF-sense: 5’-CATGAAGACCTCACAGTAAAAATAGGTGAT-3’
BRAF-antisense: 5’-TGGGACCCACTCCATCGA-3’
Wild Type Reporter sequence: 5’-VIC-CTAGCTACAGTGAAATC-3’
Mutant Reporter sequence: 5’-FAM-TAGCTACAGAGAAATC-3’

Real-time PCR was performed in a final reaction volume of 25 μl containing 12.5 μl of 2× TaqMan Universal PCR Master Mix (Applied Biosystems), 0.625 μl of 40× Assay Mix and 20 ng of DNA. Amplification conditions were 10 minutes at 95°C, followed by 40 cycles of 15 seconds at 92°C and 1 minute at 60°C.

Analysis of PI3K Mutations:

PI3K mutations in exon 9 and exon 20 were analyzed by PCR followed by direct sequencing. PCR primer sequences were:

PI3K-Exon 9-sense: GCTTTTTCTGTAAATCATCTGTG
PI3K-Exon 9-antisense: CTGAGATCAGCCAAATTCAGT
PI3K-Exon 20-sense: ACATTCGAAAGACCCTAGCC
PI3K-Exon 20-antisense: CAATTCCTATGCAATCGGTCT

PCR was performed in a final volume of 25 μl using 50 ng of genomic DNA, 2.5 μl 10× Taq buffer, 2.5 μl MgCl2 (25 mM), 0.5 μl of each primer (10 μM), 2.5 μl dNTP (2 mM) and 0.2 μl Taq polymerase (1 U/μl).

PCR conditions for PI3K-Exon 9 were: 10 min at 95°C, followed by 10 cycles of 30 sec at 95°C, 30 sec at 60°C and 30 sec at 72°C, and 25 cycles of 30 sec at 95°C, 30 sec at 55°C and 30 sec at 72°C.
A final step of 10 min at 72°C was added.

PCR amplification conditions for PI3K-Exon 20 were: 10 min at 95°C, followed by 10 cycles of 30 sec at 95°C, 30 sec at 58°C and 30 sec at 72°C, and 25 cycles of 30 sec at 95°C, 30 sec at 55°C and 30 sec at 72°C. A final step of 10 min at 72°C was added.

Direct sequencing was carried out using the BigDye Terminator v1.1 cycle sequencing kit (Applied Biosystems). Sequence analysis was performed in an ABI PRISM 3130. The primers used for sequencing were the same as those used for amplification.

Immunohistochemical analysis of β-catenin.

Sections 3 μm thick of TMAs were cut, deparaffinized in xylol, and then rehydrated in descending dilutions of ethanol. Antigen retrieval was done in a hot bath (100°C) for 15 min with ER-1 Bond Retrieval solution (pH 6.0). The endogenous peroxidase activity was blocked by 10 min of incubation with H2O2 at room temperature. Sections were incubated with β-catenin polyclonal antibody (Master Diagnostica) for 60 min at room temperature. Then, sections were incubated with a secondary antibody (horseradish peroxidase, HRP) for 30 min at room temperature, followed by incubation with diaminobenzidine (DAB) for 5 min at room temperature. Tissues were counterstained with hematoxylin.

Immunohistochemical analysis of Ki-67 (proliferation).

Sections 3 μm thick of TMAs were cut, deparaffinized in xylol, and then rehydrated in descending dilutions of ethanol. Antigen retrieval was done in a steamer with citrate buffer (pH 6.0) (Dako Real Target Retrieval solution) for 8 min. The endogenous peroxidase activity was blocked by 10 min of incubation with H2O2 at room temperature. Sections were incubated with Ki-67 monoclonal antibody (clone MIB1) (Dako) for 30 min at room temperature. Then, sections were incubated with a secondary antibody (horseradish peroxidase, HRP) for 30 min at room temperature, followed by incubation with diaminobenzidine (DAB) for 5 min at room temperature.
Tissues were counterstained with hematoxylin.

Immunohistochemical analysis of M30 (apoptosis).

Sections 3 μm thick of TMAs were cut, deparaffinized in xylol, and then rehydrated in descending dilutions of ethanol. Antigen retrieval was done in a hot bath (100°C) for 15 min with ER-1 Bond Retrieval solution (pH 6.0). The endogenous peroxidase activity was blocked by 10 min of incubation with H2O2 at room temperature. Sections were incubated with M30 Cytodeath monoclonal antibody (Roche Pharma) for 60 min at room temperature. Then, sections were incubated with a secondary antibody (horseradish peroxidase, HRP) for 30 min at room temperature, followed by incubation with diaminobenzidine (DAB) for 5 min at room temperature. Tissues were counterstained with hematoxylin.

1. Hatzaki A, Razi E, Anagnostopoulou K, et al: A modified mutagenic PCR-RFLP method for KRAS codon 12 and 13 mutations detection in NSCLC patients. Mol Cell Probes 2001; 15:243-247.

Supplemental Tables:

Table S1: Analysis of KEGG Pathways

| KEGG pathway | Pathway description | LS permutation p-value | KS permutation p-value | Efron-Tibshirani's GSA test p-value | Goeman's global test p-value |
|---|---|---|---|---|---|
| hsa04512 | ECM-receptor interaction | 0.00001 | 0.00001 | < 0.005 | < 0.0000001 |
| hsa01430 | Cell Communication | 0.00001 | 0.0023208 | < 0.005 | < 0.0000001 |
| hsa04514 | Cell adhesion molecules (CAMs) | 0.00001 | 0.00001 | 0.15 | < 0.0000001 |
| hsa04510 | Focal adhesion | 0.00001 | 0.00001 | 0.01 | < 0.0000001 |
| hsa04360 | Axon guidance | 0.00001 | 0.00001 | 0.01 | < 0.0000001 |
| hsa04610 | Complement and coagulation cascades | 0.00001 | 0.0046577 | 0.01 | < 0.0000001 |
| hsa04810 | Regulation of actin cytoskeleton | 0.00001 | 0.00001 | 0.05 | < 0.0000001 |
| hsa04670 | Leukocyte transendothelial migration | 0.00001 | 0.0012397 | 0.07 | < 0.0000001 |
| hsa00190 | Oxidative phosphorylation | 0.00001 | 0.00001 | 0.14 | < 0.0000001 |
| hsa04060 | Cytokine-cytokine receptor interaction | 0.00001 | 0.00001 | 0.15 | < 0.0000001 |
| hsa00230 | Purine metabolism | 0.00001 | 0.00001 | 0.38 | < 0.0000001 |
| hsa04310 | Wnt signaling pathway | 0.00001 | 0.00001 | 0.51 | < 0.0000001 |
| hsa00532 | Chondroitin sulfate biosynthesis | 0.0001609 | 0.0699698 | 0.04 | < 0.0000001 |
| hsa04350 | TGF-beta signaling pathway | 0.0001835 | 0.0629803 | 0.01 | < 0.0000001 |
| hsa04940 | Type I diabetes mellitus | 0.0006505 | 0.0003245 | 0.3 | 0.0000033 |
| hsa05130 | NA | 0.0032635 | 0.1833333 | 0.12 | < 0.0000001 |
| hsa05131 | NA | 0.0032635 | 0.1833333 | 0.12 | < 0.0000001 |
| hsa04640 | Hematopoietic cell lineage | 0.0039121 | 0.0003785 | 0.26 | < 0.0000001 |
| hsa04612 | Antigen processing and presentation | 0.0046006 | 0.0071049 | 0.39 | 0.0000083 |
| hsa00530 | Aminosugars metabolism | 0.063982 | 0.0031385 | 0.06 | < 0.0000001 |

KEGG pathway comparison between clusters using 14764 genes for random variance estimation. 20 out of the 164 investigated gene sets passed the 0.005 significance threshold in at least two tests. The LS/KS permutation test found 20 significant gene sets; Efron-Tibshirani's maxmean test found 2 significant gene sets; Goeman's global test found 156 significant gene sets.

Table S2: Correlations of tumor subgroups with clinical parameters

| Parameter | Category | Cluster 1 (35) | Cluster 2 (12) | Cluster 3 (22) | Cluster 4 (14) |
|---|---|---|---|---|---|
| Localization | Right colon | 18 | 3 | 7 | 9 |
| | Left colon | 17 | 9 | 15 | 5 |
| Histologic grade | High grade | 1 | 2 | 2 | 1 |
| | Low grade | 34 | 10 | 20 | 13 |
| Tumoral margin | Infiltrative | 27 | 9 | 15 | 7 |
| | Mixed | 4 | 1 | 4 | 1 |
| | Expansive | 4 | 2 | 3 | 6 |
| Lymphocyte infiltration | Absent | 24 | 6 | 12 | 7 |
| | Low | 3 | 3 | 6 | 2 |
| | Medium | 6 | 3 | 3 | 5 |
| | High | 2 | 0 | 1 | 0 |
| Vascular invasion | Yes | 17 | 5 | 8 | 2 |
| | No | 18 | 7 | 14 | 12 |
| Perineural invasion | Yes | 3 | 1 | 1 | 1 |
| | No | 32 | 11 | 21 | 13 |
| Tumour extension (T) | T1 | 1 | 0 | 0 | 0 |
| | T2 | 8 | 5 | 5 | 5 |
| | T3 | 21 | 6 | 17 | 7 |
| | T4 | 5 | 1 | 0 | 2 |
| Lymph nodes (N) | N0 | 21 | 8 | 12 | 11 |
| | N1 | 8 | 2 | 6 | 3 |
| | N2 | 6 | 2 | 4 | 0 |
| Metastasis (M) | M0 | 26 | 10 | 16 | 13 |
| | M1 | 9 | 2 | 6 | 1 |
| K-Ras Mutation | Mutated | 17 | 5 | 5 | 4 |
| | WT | 18 | 7 | 17 | 10 |
| PI3K Mutation | Mutated | 7 | 2 | 3 | 6 |
| | WT | 28 | 10 | 19 | 8 |
| Sex | Male | 20 | 5 | 8 | 9 |
| | Female | 15 | 7 | 14 | 5 |
| Age | Mean | 70.5 | 72.3 | 72.6 | 71.0 |
| | SD | 13.3 | 10.5 | 9.6 | 11.6 |
| RIN | Mean | 9.0 | 8.7 | 8.9 | 9.1 |
| | SD | 0.6 | 0.8 | 0.7 | 0.8 |
| Ki-67 (IHC) | Mean | 54.2 | 64.0 | 53.4 | 55.0 |
| | SD | 19.6 | 25.2 | 13.0 | 20.6 |
| Apoptosis (IHC) | Mean | 8.5 | 7.8 | 8.5 | 6.6 |
| | SD | 8.6 | 6.2 | 6.1 | 4.9 |

p-global: 0.103 LR 0.462 LR 0.325 LR 0.452 0.168 LR 0.356 LR 0.458 LR 0.376 0.217 0.229 0.341 0.825 χ² LR 0.282 0.438 χ² LR 0.944 0.865 χ² χ² KW KW KW KW

χ²: Chi-Square; LR: Likelihood Ratio; RIN: RNA Integrity Number.

Table S3 Performance of the 167-gene Low-stroma-subtype predictor

| Class | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|
| Low-stroma | 0.958 | 0.972 | 0.979 | 0.946 |
| Other | 0.972 | 0.958 | 0.946 | 0.979 |

Mean % of correct classification: 96%
PPV: Positive-Predictive Value, NPV: Negative-Predictive Value

Table S4 Coincident genes between different gene signatures

Coincident genes (2) between Low-stroma predictor and Eschrich et al. predictor:
- NM_000582 Homo sapiens secreted phosphoprotein 1 (SPP1)
- NM_173574 Homo sapiens zinc finger protein 683 (ZNF683)

Coincident gene (1) between Low-stroma predictor and Garman et al. predictor:
- NM_006475 Homo sapiens periostin, osteoblast specific factor (POSTN)

Coincident gene (1) between Low-stroma predictor and Wang et al. predictor:
- NM_004063 CDH17

Coincident genes (12) between Low-stroma predictor and Jorissen et al. predictor:
- L12350 Human thrombospondin 2 (THBS2)
- NM_000104 Homo sapiens cytochrome P450 (CYP1B1)
- NM_000138 Homo sapiens fibrillin 1 (FBN1)
- NM_000362 Homo sapiens TIMP metallopeptidase inhibitor 3 (TIMP3)
- NM_000582 Homo sapiens secreted phosphoprotein 1 (SPP1), transcript variant 2
- NM_002192 Homo sapiens inhibin, beta A (INHBA)
- NM_003239 Homo sapiens transforming growth factor, beta 3 (TGFB3)
- NM_004791 Homo sapiens integrin, beta-like 1 (ITGBL1)
- NM_006475 Homo sapiens periostin, osteoblast specific factor (POSTN)
- NM_006873 Homo sapiens stonin 1 (STON1)
- NM_080927 Homo sapiens discoidin, CUB and LCCL domain containing 2 (DCBLD2)
- NM_148672 Homo sapiens chemokine (C-C motif) ligand 28 (CCL28)

Coincident genes (2) between Low-stroma predictor and Oncotype DX predictor:
- NM_004460 FAP
- NM_002192 INHBA

Coincident gene (1) between Low-stroma predictor and ColoPrint predictor:
- NM_000862 HSD3B1

Legend to the supplemental figure

Figure S1 - Hierarchical clustering of the combined set of 159 tumor samples.
Samples from cluster 5, normal tissue and adenomas were excluded. Yellow shadow: Cluster-1 or Low-stroma-subtype; green shadow: Cluster-2 or Immunoglobulin-related-subtype; red shadow: Cluster-3 or High-stroma-subtype; blue shadow: Cluster-4 or Mucinous-subtype; (E): Eschrich samples.
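
The KRAS codon 12/13 read-out described in the Additional Methods (the labeled cut fragment indicates wild type, the uncut product indicates a mutation) can be sketched as a small lookup. This is only an illustration: the function name, the 2 bp sizing tolerance, the "no call" outcome, and the choice to call a sample mutated whenever the uncut peak is present (even alongside the cut peak) are assumptions, not part of the published protocol.

```python
from typing import Iterable

# FAM-labeled fragment sizes (bp) stated in the Additional Methods.
ASSAYS = {
    "codon12": {"cut": 80, "uncut": 109},  # BstNI digest of the 109 bp product
    "codon13": {"cut": 53, "uncut": 79},   # HaeIII digest of the 79 bp product
}


def call_kras(assay: str, peaks_bp: Iterable[float], tol_bp: float = 2.0) -> str:
    """Call KRAS status from the labeled peaks seen on the sequencer.

    tol_bp is a hypothetical sizing tolerance; the source does not give one.
    """
    sizes = ASSAYS[assay]
    peaks = list(peaks_bp)
    has_uncut = any(abs(p - sizes["uncut"]) <= tol_bp for p in peaks)
    has_cut = any(abs(p - sizes["cut"]) <= tol_bp for p in peaks)
    if has_uncut:
        # Assumption: any undigested product is read as a mutated allele.
        return "mutated"
    if has_cut:
        return "wild type"
    return "no call"  # assumption: neither expected peak was detected


print(call_kras("codon12", [80.3]))         # wild type
print(call_kras("codon13", [53.1, 78.8]))   # mutated
```

Keeping the fragment sizes in one table makes it easy to check them against the protocol text.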
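
The microsatellite instability criterion from the Additional Methods (a sample is MSI+ when at least two of the analyzed microsatellites are affected) reduces to a simple count. The sketch below is illustrative only; the function name and the "MSI-" label for samples below the threshold are assumptions, since the source defines only the MSI+ class.

```python
from typing import Iterable


def call_msi(marker_affected: Iterable[bool], min_affected: int = 2) -> str:
    """Classify a sample from per-marker instability flags.

    marker_affected holds one boolean per analyzed microsatellite
    (three markers in the protocol). min_affected=2 follows the text;
    the "MSI-" label is an assumed name for the complementary class.
    """
    affected = sum(1 for flag in marker_affected if flag)
    return "MSI+" if affected >= min_affected else "MSI-"


print(call_msi([True, True, False]))   # MSI+
print(call_msi([True, False, False]))  # MSI-
```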
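
The performance measures reported in Table S3 follow from a 2×2 confusion matrix. As a worked illustration, the sketch below computes them for the Low-stroma class. The counts used (46 true positives, 2 false negatives, 1 false positive, 35 true negatives) are hypothetical: they are not given in the source and were chosen only because they reproduce the rounded values in Table S3.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard diagnostic measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),        # true positives among actual positives
        "specificity": tn / (tn + fp),        # true negatives among actual negatives
        "ppv": tp / (tp + fp),                # positive-predictive value
        "npv": tn / (tn + fn),                # negative-predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Hypothetical counts for the Low-stroma class (not reported in the source).
low_stroma = binary_metrics(tp=46, fn=2, fp=1, tn=35)
for name, value in low_stroma.items():
    print(f"{name}: {value:.3f}")
# sensitivity: 0.958
# specificity: 0.972
# ppv: 0.979
# npv: 0.946
# accuracy: 0.964
```

Swapping the roles of the two classes (treating "Other" as the positive class) exchanges sensitivity with specificity and PPV with NPV, which is why the two rows of Table S3 mirror each other.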