MicroArrays #1: Technology & Gene Target ID
Timothy J. Triche, MD, PhD
Pathologist-in-Chief, Childrens Hospital Los Angeles
University of Southern California

Basic Forms of Array Technology
• Multiple Northerns/slot blots
• Macroarrays (spotted cDNAs, filter substrate)
• Microarrays (à la Pat Brown) (cDNA spots)
• Microarrays (oligomers synthesized in situ) (Affymetrix)
• Microarrays (spotted oligomers, à la Wold/Caltech 50mer core)

Macro Array Technology
• Dozens or hundreds of cDNAs spotted onto nylon membranes in duplicate arrays, in replicate
• cDNAs from samples of interest hybridized to replicate membranes
• Differential expression inferred from spot density
• Quantitation and hybridization to homologous genes are problems

Spotted cDNA Arrays
• Up to 10,000 genes per chip
• Full length cDNA spots
• Requires competitive hybridization
• Based on Pat Brown technology
• Sequence validation a problem

Oligomeric Spotted Arrays
• Customizable like spotted arrays
• Performance characteristics ~ Gene Chips
• Gene representation ~ Gene Chips, unlike spotted arrays
• Low cost
Illustration courtesy of B. Wold, Caltech

Reproducibility
• Same sample preparation
• 2 probe arrays
• Comparative expression (r=1 +/- 2 SD)
• Virtually no false positives

Differential Gene Expression
T cells, stimulated (y axis) vs.
control (x axis)
• All genes outside inner R lines are dysregulated (up or down)

Rationale for Affymetrix Gene Chips
Pro:
• Current Standard for Array Technology
• Greatest versatility (re-sequencing, expression, multiple species, directed arrays in near future)
• Sequences by definition verified
• Allows user-constructed x-wise comparisons
• No need for competitive hybridizations
• Uses minimum amount of tissue/RNA
Con:
• No choice of individual genes on array
• Customized arrays prohibitively expensive

Gene Expression Analysis
A) Cut pilot section of OCT embedded frozen tissue
B) Cut ~12 frozen sections
C) Extract RNA (<5ug total RNA)
D) Synthesis of double-stranded cDNA
E) In-vitro transcription w/ biotinylated nucleotides
F) Size confirmation of cRNA transcripts
G) Fragmentation of cRNA
[Figure labels: tumor, non-tumor, pure tumor; dissection of tumor tissue when possible; 500 bp]
Affymetrix Core Facility

Micro Array Technology
• > 20 25mers of known expressed gene sequence tiled across chip, with paired mismatches
• Fluorescent cRNA hybridized to chip
• Fluorescence intensity quantitated and analyzed by defined algorithms

Affymetrix Probe Arrays
• Unlike cDNA arrays, a series of >20 25mers are synthesized in situ by photolithography, representing exonic sequences from 5’ to 3’ of the genes of interest

Some Currently Available Chips
• HuFL (1,300 to 60,000 human genes)
• P53 (detects over 400 known p53 mutations, in exons 2-11)
• HIV (HIV PRT sequencing, protease & RT, codons 1-400, over 1,500 nucleotides)
• CYP450 (cytochrome p450, 10 alleles of 2D6 gene & 2 alleles of 2C19 gene)
• Mu 19, 11, & 6.5 (19,000 to 6,500 mouse genes)
• Rat Genome U34 (>24,000 rat genes & ESTs)
• Yeast 61 (6,100 Saccharomyces ORFs)

Affymetrix SNP Chips
• >1,500 intragenic polymorphisms distributed over entire genome
• Re-sequencing chip ~ p53, HIV
• Number of SNPs to vastly increase in near future
• Polymorphisms likely to predict disease susceptibility, therapeutic responsiveness, disease severity

Pathway Analysis: Tumor
The P53
Paradigm
[Figure: fold change (y axis, 0 to 25) for IGF-II, APC, PI.3Kgam, mapk3pk, rps6ka2, hek8, c-yes, FHIT, OB-cad2, wnt-1, ATM, plakoglobin, mdm2, intg/alpha, Bcl-2, erb-A, pax3, caveolin, waf1, gadd45, erb-B2, GLI, Kup70]
[Diagram labels: DNA, Gene Mutation, Mutation Detection; RNA, Gene Expression, Expression Profiles; Biologic Defect; Biologic Pathways; Cell Cycle (cancer), Apoptosis, DNA Repair; p53, Bax, Rb, p21, mdm2, Gadd45, Bcl-2, BRCA1, MSH2, CdK, cyclin D1]

P53: Mutation vs. Function

Neuroblastoma
[Figures: N-myc Southern; N-myc FISH]

Shimada Classification System
[Flow chart labels: Stroma-Rich (Nodular, Intermixed, Well Differentiated); Stroma-Poor (Differentiating, Undifferentiated); Distribution of Immature Cells (Isolated, Clumped); Age (< 18 mos., 18-60 mos., > 5); MKI (> 100, > 200, < 200); Unfavorable Histology]

Neuroblastoma Survival
N-myc Status, Shimada classification, and Outcome

Neuroblastoma Prognosis
Comparison of MYCN Status, TRKA expression, & Survival

Myc-Mediated Cell Pathways
[Diagram labels: Amplified N-myc DNA; Abundant N-myc Protein; Myc:Max Heterodimers; DNA Instability; P53 Protein; Inhibit differentiation; Apoptosis; Promote Proliferation]

Biologic Interpretation
[Cell cycle diagram labels: phases G1, S, G2, M; cdk2, Cyclin A, Cyclin B, Cyclin D, Cyclin E, RbE2F, RB-PO4+E2F, Cdc4 or 6, p16INK4, p19, p21, p27KIP, MDM2, PCNA, GADD45, p53, BAX, BCL2; Induction, Inhibition, Degradation, Apoptosis]

Colon Ca & Colon: clustering
Alon, et al: PNAS, June, 1999

Colon: Tumor vs.
Normal
Alon, et al: PNAS, June, 1999

Tumor Profiling by Array Analysis
Khan, et al: Cancer Res, Nov. 1998

Ewing’s Sarcoma

Ewing’s Sarcoma Cytogenetics
[Diagram labels: Chr 11 and chr 22, each shown as Normal and Der; EWS, FLI-1]

Fusion Genes in Childhood Sarcomas
• PAX/FKHR fusion gene found only in Alveolar RMS
• EWS/ets defines Ewing’s/pPNETs, including EOE in IRS
• No fusion gene found in Emb RMS
• Other fusion genes define other STSs

Sarcoma               Fusion Gene
Alveolar RMS          PAX3,7/FKHR
Embryonal RMS         None (LOI, LOH)
Ewing’s               EWS/ets
Synoviosarcoma        SYT/SSX
Desmoplastic RCT      EWS/WT-1
Soft Part Melanoma    EWS/ATF
Liposarcoma           TLS/CHOP

Ewing’s Interphase FISH

Alternate Splicing of EWS/Fli-1 Transcript
[Diagram labels: EWS (1 to 656, NH2 to COOH, RNA BD; positions 349, 265); Human Fli-1 (1 to 452, NH2 to COOH, ETS DD; positions 198, 219); EWS/Fli-1 (NH2 to COOH, ETS D) with Type 1, Type 2, and Type 3 Inserts]

PCR of Chimeric Gene Transcripts
• Virtually all Ewing’s & pPNETs express a variant of EWS/FLI-1 or an EWS/ets chimeric gene
[Gel labels: Type 3, Type 2, Type 1]

Ewing’s Sarcoma: Survival by Fusion Gene Status
[Figure: overall survival (y axis, 0 to 1.0) vs. follow-up in months (x axis, 0 to 140); EWS-FLI1 type 1 (n=46) vs. EWS-FLI1 other types (n=27); p=0.034]
From de Alava, et al: J Clin Oncol 1998

Induced EWS/FLI-1 Expression
• 3 clones compared to typical Ewing’s cell line
• EWS/FLI-1 compared to housekeeping gene (GAPDH)
• Induced levels ~ native tumor levels

EWS/FLI-1 Induces Cell Cycle
• E/F expression induced, cells harvested 24 hrs.
later
• Induced compared to uninduced
• Majority of uninduced cells in G0/G1
• Vast majority of induced cells in S/G2

Known EWS/FLI-1 Gene Targets
• Numerous genes identified by RDA, DD-PCR, Subtraction Libraries, etc., but no consistent pattern emerges
  – Denny et al: May: Sorensen: Wu et al
  – Triche et al: type 16 keratin, Integrin
  – Manic Fringe
  – Gastrin-Releasing Peptide
  – MAT1
  – E2F, Cyclin D

Ewing’s Gene Targets

E/F Induced Gene Expression
• Human cells + / - EWS/FLI1 expression were analyzed
• mRNA was hybridized to a gene expression chip
• Genes showing significant variance were plotted
• Three genes, including the most affected, IGF-II, are critical to development of the dorsal somitic mesoderm (=> RMS) and neural crest (=> PNET)
[Figure: bar chart (y axis, -15 to 25) for IGF-II/ex, APC, PI.3Kgam, mapk3pk, rpS6ka2, hek8, c-yes, FHIT, OB-cad-2, wnt-1, ATM, plakoglob, mdm2-a, intg/alph, Bcl-2-bet, erb-A, pax2, caveolin, waf1, gadd45, erb-B2 (he, GLI]

Genes in Developing Somites
• The Dorsal Somitic Mesoderm gives rise to both skeletal muscle and neural crest derivatives
[Diagram labels: smo, wnt, ptc, shh, gli, PAX, IGF-II, ?, IGF-R]

#1 Problem in Array Interpretation: Bioinformatics!
• All array technologies generate massive amounts of data
• Individual gene data are rarely informative
• Marginally deviant genes and clusters often the most informative
• None of the above is generally apparent from casual inspection

Bioinformatics Solutions
• Affymetrix GeneChip (basic)
• Spotfire (visualization)
• GeneSpring (visualization, analytic)
• MolPat (academic; cluster analysis)
• GENECLUSTER (MIT)
• SOMs (Self Organizing Maps)
• Custom solutions (EM, etc.)

Needed Solutions: Or, The Biologist’s Wish List
• Neural net models of biologic pathways
• In silico representation of array data in these models
• Linkage of in silico model systems to data from biological systems (eg, transfectants, knockouts, knockins, cre-lox mice)
• Predictive tools for additional gene finding
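
Worked Sketch: Fold Change Filtering
The induced vs. uninduced comparison described above (genes showing significant variance were plotted) can be sketched in a few lines of Python. This is only an illustration of the general idea, not the algorithm used by the Affymetrix software or in this study. The gene names, intensity values, the intensity floor of 20, and the 2-fold cutoff are all hypothetical.

```python
from typing import Dict, List, Tuple


def signed_fold_change(induced: float, uninduced: float, floor: float = 20.0) -> float:
    """Return +x for an x-fold increase and -x for an x-fold decrease."""
    # Clamp low or negative intensities to a floor so the ratio stays finite.
    # The floor value is an assumption chosen for illustration.
    a = max(induced, floor)
    b = max(uninduced, floor)
    return a / b if a >= b else -(b / a)


def rank_changed_genes(
    induced: Dict[str, float],
    uninduced: Dict[str, float],
    min_fold: float = 2.0,
) -> List[Tuple[str, float]]:
    """Keep genes whose absolute fold change meets the cutoff, largest first."""
    changes = []
    for gene, value in induced.items():
        if gene not in uninduced:
            continue  # skip genes not measured in both samples
        fc = signed_fold_change(value, uninduced[gene])
        if abs(fc) >= min_fold:
            changes.append((gene, fc))
    return sorted(changes, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical intensities, not measured data.
induced = {"gene_a": 2400.0, "gene_b": 150.0, "gene_c": 510.0, "gene_d": 5.0}
uninduced = {"gene_a": 100.0, "gene_b": 1500.0, "gene_c": 500.0, "gene_d": 10.0}

ranked = rank_changed_genes(induced, uninduced)
for gene, fc in ranked:
    print(f"{gene}\t{fc:+.1f}")
```

A fixed cutoff like this discards the marginally deviant genes that the bioinformatics slide flags as often the most informative, which is one reason cluster-based tools are listed as solutions.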