LINCS Fall Consortia Meeting
Broad Institute U54 Team

Todd Golub, co-PI
Wendy Winckler, co-PI
Aravind Subramanian, Team Leader

October 27, 2011

[Figure labels: BASIC DISCOVERIES; GENETIC CONNECTIONS; THERAPEUTIC IMPACT; PATHWAYS; DISEASE STATES; TOOL COMPOUNDS; DRUGS; GWAS; TCGA; RNAi; CHEMICAL SCREENS; NAT’L PRODUCTS; SLOW (SOME NEVER START); DOES NOT SCALE; NO LEVERAGE; DIAGNOSTICS]

LINCS as a Solution
• perturbations scalable to genome
• high information content read-outs (e.g. gene expression)
• inexpensive
• mechanism to query database

Toward a reduced representation of the transcriptome
gene expression is correlated
[Figure: axes labeled genes and samples]

Reduced Representation of Transcriptome
[Figure: ‘landmarks’ (reduced representation of the transcriptome) -> computational inference model -> genome-wide expression profile]
[Figure: % connections vs. number of landmarks measured; ~100,000 profiles; 80% level marked. A. Subramanian, R. Narayan]

1000-plex Luminex bead profiling
[Figure: assay schematic with labeled steps RT, ligation, PCR, hybridization; Luminex Beads (500 colors, 2 genes/color)]
Reagent cost: $3/sample

Validation of L1000 approach
Gene-level validation
[Figure: scatter plot, 1000-plex Luminex vs. Affymetrix]
• 92% R² > 0.6
• Similar to AFFX vs ILMN

C-Map Connections      Affymetrix     Luminex 1,000-plex
(Affymetrix, $500)     simulation     ($5)
Published (32)         26 (80%)       28 (86%)
Internal (152)         121 (80%)      142 (94%)

Putting it all together
Illustration: Bang Wo

Cell Types
GTEx
• Primary hTERT-immortalized cells
• Patient-derived iPS cells*
• Banked primary cells* (T-cells, macrophages, hepatocytes, myocytes, adipocytes)
• Cancer cell lines
* in assay optimization
[Figure: iPS workflow. Labels: Cell Repository (e.g. Coriell); somatic cell isolation; fibroblasts; Reprogramming [Oct4, Sox2, Klf4, Myc]; Neural Differentiation; Neural progenitors; Neuron; Astrocyte; Oligodendrocyte. Durations shown: 2-3 weeks, 3-4 weeks, 4-6 weeks]

Perturbagens
• Small-molecules (n=4,000)
• Genes (n=3,000)

Automated Quality Control Measures
Overall failure rate ~ 8%

LINCS Proposal (~ 600,000 profiles)
4,000 compounds
• 1,300 off-patent FDA-approved drugs
• 700 bioactive tool compounds
• 2,000 screening hits (MLPCN + others)
2,000 genes (shRNA + cDNA)
• known targets of FDA-approved drugs (n=150)
• drug-target pathway members (n=750)
• candidate disease genes (n=600)
• community nominations (n=500)
20 cell lines
• emphasis on reproducibility and availability
• cancer and primary, non-cancer
• some ‘doubling down’ to assess intra-lineage diversity

Progress to date
http://www.broadinstitute.org/lincs_beta/
DATA RELEASE (BETA)
[Figure: progress chart; legend: proposed, actual, projected]

Signature of p53 ORF
p53 vs. empty vector
• p53 is NOT a Landmark Gene
• p53 pathway is #1 pathway of 512 in MSigDB
P < 0.001
Ramnik Xavier

Making connections in primary macrophages
NF-kB pathway genes (all INFERRED)
pathway rank: 1/512
LPS
pathways curated from literature (n=512)
Jens Lohr

Prioritizing human genetics candidates
Ramnik Xavier, MGH

Signatures of genetic variants connect to disease genes
Ramnik Xavier, MGH

Disease variants connect to pathways
e.g. CD40 to ATG16L1 (both regulators of autophagy)
Ramnik Xavier, MGH

ERG transcription factor
important in hematopoietic stem cells, prostate cancer
ERG-BINDING SMALL-MOLECULES

Defining a gene expression signature of ERG activity
integrating experimental and clinical data
• Gain of Function: Primary prostate + hTERT +ST +AR +/-ERG
• Loss of Function: VCaP cells +/- ERG shRNA
• 120 Patient Samples: Physician’s Health Study

3/69 ERG-binders inhibit ERG gene expression program

L1000 as primary small-molecule screen read-out
12,985 compounds screened for ERG signature

Name                      Rank   Library
BRD-K42581894-001-01-1    1      DOS
BRD-K42581894-001-01-1    2      DOS
BRD-K14408783-001-01-5    3      DOS
Wortmannin                4      Bioactives
BRD K78122587             5      ChemDiv
BRD-K91899208-001-01-8    6      DOS
BRD-K24750847-001-01-2    7      DOS
BRD-K18273607-001-02-1    8      DOS
BRD-K76892938-001-01-9    9      DOS
AZD2281 (Olaparib)        10     Bioactives
BRD-K86715531-001-01-1    11     DOS
BRD-K95688283-001-01-9    12     DOS
BRD-K99179945-001-01-5    13     DOS

Analytical and software challenges
1. Infrastructure: data and compute server
2. Optimization of connectivity metrics and statistics
3. Optimization of inference models (context-aware)
4. UI: query tools and results visualization
5. Addressing off-target effects of perturbagens

Aravind Subramanian
Wendy Winckler
Justin Lamb
Computational
Rajiv Narayan
Josh Gould
RNAi Platform
Laboratory
Chemical Biology Platform
Dave Peck
Genetic Analysis Platform
Willis Reed-Button
Broad Program Scientists
Xiaodong Lu
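
Illustrative sketch (not from the slides): the reduced-representation idea, measuring a small set of landmark genes and inferring the rest with a computational model, can be pictured as a regression fit on existing genome-wide profiles. The deck does not specify the inference model; ordinary least squares is assumed here purely for illustration, the data are synthetic, and the function names `fit_inference_model` and `infer_targets` are hypothetical.

```python
import numpy as np

def fit_inference_model(landmarks_train, targets_train):
    """Fit least-squares weights mapping landmark expression to unmeasured genes.

    Assumption: a plain linear model with an intercept; the model actually
    used for L1000 is not described in the slides.
    """
    n_samples = landmarks_train.shape[0]
    design = np.hstack([landmarks_train, np.ones((n_samples, 1))])  # add intercept column
    weights, *_ = np.linalg.lstsq(design, targets_train, rcond=None)
    return weights

def infer_targets(landmarks_new, weights):
    """Predict unmeasured-gene expression from landmark measurements."""
    n_samples = landmarks_new.shape[0]
    design = np.hstack([landmarks_new, np.ones((n_samples, 1))])
    return design @ weights

# Synthetic stand-in for a compendium of genome-wide profiles (toy sizes).
rng = np.random.default_rng(0)
n_train, n_new, n_landmarks, n_targets = 200, 5, 10, 30
true_weights = rng.normal(size=(n_landmarks, n_targets))

train_landmarks = rng.normal(size=(n_train, n_landmarks))
train_targets = train_landmarks @ true_weights + 0.01 * rng.normal(size=(n_train, n_targets))

weights = fit_inference_model(train_landmarks, train_targets)

# New samples where only landmarks are measured.
new_landmarks = rng.normal(size=(n_new, n_landmarks))
inferred = infer_targets(new_landmarks, weights)
expected = new_landmarks @ true_weights  # ground truth, known only because the data are synthetic
```

In this toy setting the inferred values track the synthetic ground truth closely because the targets were generated as linear functions of the landmarks; real expression data would need held-out validation, as in the gene-level comparison against Affymetrix shown earlier.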