Shirley©
Functional Genomics (1)
Yow-Ling Shiue 薛佑玲
Institute of Biomedical Science, National Sun Yat-sen University

Steps of Genome Analysis
• Genome sequence assembled, markers
• Gene location/gene map (mapping)
• Gene prediction: train a model for each genome (including EST & cDNA sequences)
• Genome annotation
• Functional genomics
• Identify repetitive sequences: mask out, filter
• Comparative genomics & integrative genomics
http://www.ornl.gov/sci/techresources/Human_Genome/research/function.shtml

Functional Genomics Technology Goals
• Generate sets of full-length cDNA clones and sequences that represent the genes of humans and model organisms
• Support research on methods for studying functions of nonprotein-coding sequences
• Develop technology for comprehensive analysis of gene expression
• Improve methods for genome-wide mutagenesis
• Develop technology for large-scale protein analyses
• http://www.ornl.gov/sci/techresources/Human_Genome/research/function.shtml

Definition (1) – Hieter & Boguski 1997
• The development & application of global (genome-wide or system-wide) experimental approaches to assess gene function by making use of the information & reagents provided by structural genomics
• It is characterized by high-throughput or large-scale experimental methodologies
    • Combined with statistical or computational analysis of the results

Definition (2) – UC Davis Genome Center
• As a means of assessing phenotype, functional genomics differs from more classical approaches primarily with respect to
    • The scale & automation of biological investigations
• A classical investigation of gene expression might examine how the expression of a single gene varies with the development of an organism in vivo
• Modern functional genomics approaches, however, would examine how 1,000-10,000 genes are expressed
as a function of development
http://genomics.ucdavis.edu/index_html.html

Definition (3) – Hunt & Livesey (ed.)
• Subtracted cDNA libraries
• Differential display (DD)
• Representational difference analysis
• Suppression subtractive hybridization
• cDNA microarrays
• 2-D gel electrophoresis
http://www.oup.co.uk/isbn/0-19-963774-1

Functional Genomics
• How to do
• What to know
    • Gene expression
    • Gene regulation
    • Genome-wide mutagenesis

• Data-mining [SAGE]
• Microarray analysis
• Subtractive cDNA libraries
• Yeast two-hybrids
• Transgenics
• Transposon targeting
• RNAi & miRNA
http://www.ncbi.nlm.nih.gov/Tools/

Expression Arrays - Microarray
• Cell growth in different environments, treatments, etc.
• Isolate RNA → cDNAs
• Measure expression using array technology
• Create database of expression information
• Data analysis
• Display information in an easy-to-use format
• Show ratio of expression under different conditions
Affymetrix® food chip

Historical Perspective
• DNA hybridization (1960s)
• Detection of hybrids
    • Hydroxyapatite, Ca5(PO4)3OH
    • Radioactive labeling
    • Enzyme-linked detection
    • Fluorescent labeling
• Fixing sample on solid support
    • Southern blots (1970s)
    • Northern blots
    • Dot blots

Basic Principles
• Main novelty is one of scale
    • Hundreds or thousands of probes rather than tens
• Probes are attached to solid supports
• Robotics are used extensively
• Informatics is a central component at all stages

Gene Expression Analysis (Whole Genome)
• Quantitative Analysis of Gene Activities
• Transcription Profiles
Yang et al.
BMC Genomics 2005 6:90 doi:10.1186/1471-2164-6-90

Major Technologies
• cDNA probes (> 200 nt), usually produced by PCR, attached to either nylon or glass supports
• Oligonucleotides (25-80 nt) attached to glass support
• Oligonucleotides (25-30 nt) synthesized in situ on silica wafers (Affymetrix)
• Probes attached to tagged beads

Principal Uses of Chips (1)
• Genome-scale gene expression analysis
    • Differentiation
    • Responses to environmental factors
    • Disease processes
    • Effects of drugs
• Genome-scale profiling of gene expression in hepatocellular carcinoma: classification and survival prediction
• CCR Frontiers in Science (2006); Lee et al. Hepatology 40:667-76 (2004)
[Figure: 4187 genes; 91 samples]

Principal Uses of Chips (2)
• Detection of sequence variation
    • Genotyping
    • Detection of somatic mutations (e.g. in oncogenes)
    • Direct sequencing
Allele-specific hybridization (ASH): Chee et al. 1996; Wang et al. 1998; Lindblad-Toh et al. 2000; 40 different 25-bp oligos
Toshiba's hepatitis C SNP typing chip
SNP Strategy - GeneChip Mapping Assay

cDNA Chips
• Probes are cDNA fragments, usually amplified by PCR
• Probes are deposited on a solid support, either positively charged nylon or a glass slide
• Samples (normally polyA+ RNA) are labeled using fluorescent dyes
• At least two samples are hybridized to the chip
• Fluorescence at different wavelengths is measured by a scanner
Molecular Cell Biology, Lodish 5th Ed.
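
The two-sample design on the cDNA Chips slide is usually summarized per spot as a ratio of the two fluorescence channels. The sketch below is only an illustration of that idea: the channel names `cy5` and `cy3`, the intensity values, and the `floor` used to keep weak spots defined are all assumptions, not taken from the slides.

```python
import math


def log2_ratios(sample_a, sample_b, floor=1.0):
    """Return the per-spot log2(sample_a / sample_b).

    Intensities below `floor` are raised to `floor` so the ratio stays
    defined; this is a deliberately simple, assumed way to treat weak spots.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("both channels must cover the same spots")
    ratios = []
    for a, b in zip(sample_a, sample_b):
        a = max(a, floor)
        b = max(b, floor)
        ratios.append(math.log2(a / b))
    return ratios


# Hypothetical background-subtracted intensities for four spots.
cy5 = [800.0, 200.0, 400.0, 0.0]   # sample 1 channel
cy3 = [200.0, 800.0, 400.0, 50.0]  # sample 2 channel

ratios = log2_ratios(cy5, cy3)
for spot, r in enumerate(ratios, start=1):
    print(f"spot {spot}: log2 ratio = {r:+.2f}")
```

A log scale treats a 4-fold increase (+2) and a 4-fold decrease (-2) symmetrically, which is why it is preferred over the raw ratio for display.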
cDNA Chip Design
• Probe selection
    • Non-redundant set of probes
    • Includes genes of interest to project
    • Corresponds to physically available clones
• Chip layout
    • Grouping of probes by function
    • Correspondence between wells in microtiter plates and spots on the chip

Probe Selection
• Make sure that database entries are cDNA
    • Preference for RefSeq entries
• Criteria for non-redundancy
    • >98% identity over >100 nt
    • Accession number is unique
• Mapping of sequence to clone
    • Use UniGene clusters
    • Directly use data from a sequence-verified collection (e.g. Research Genetics)
    • Independently verify sequence
Agilent Technology: 60-mer probe selection; GeneBin

cDNA Arrays on Nylon and Glass
• Nylon arrays
    • Up to about 1,000 probes per filter
    • Use radiolabeled cDNA target
    • Can use phosphorimager or X-ray film
• Glass arrays
    • Up to about 40,000 probes per slide, or 10,000 per 2 cm² area (limited by arrayer's capabilities)
    • Use fluorescent targets
    • Require specialized scanner
RZPD nylon array

Overview of the Production of a Pair of Cheap, Low-density Nylon Arrays of PCR Products
StemCellDB: library ID
http://stemcell.princeton.edu/v1/sbs_screen.html

Actual image of two duplicate arrays of 332 clones each, probed with Sca+ (-) AA4- (top) or AA4- (-) Sca+ (bottom) subtracted probe populations
http://stemcell.princeton.edu/v1/sbs_screen.html

Northern Blotting Confirmation
http://stemcell.princeton.edu/v1/sbs_screen.html

Array Type & Spot Density

Array Type             Spot Density (per cm²)   Probe    Target   Labeling
Nylon Macroarrays      < 100                    cDNA     RNA      Radioactive
Nylon Microarrays      < 5,000                  cDNA     mRNA     Radioactive/Fluorescent
Glass Microarrays      < 10,000                 cDNA     mRNA     Fluorescent
Oligonucleotide Chips  < 250,000                oligos   mRNA     Fluorescent

Glass Chip Manufacturing
• Choice of coupling method
    • Physical (charge), non-specific chemical, specific chemical (modified PCR primer)
• Choice of printing method
    • Mechanical pins: flat tip, split tip, pin & ring
    • Piezoelectric deposition ("ink-jet")
• Robot design
    • Precision of movement in 3 axes
    • Speed and throughput
    • Number of pins, number of spots per pin load
CHIP 1000, Shimadzu Biotech

Physical Spotting

Typical Ink Jet Spot Deposition Results
Volume per spot: 250 nl; Spot size: 1,100 µm; Spot density: 70/cm²
Volume per spot: 0.5 nl; Spot size: 115 µm; Spot density: 4,800/cm²
Labelled BSA (Cy5)

Typical Pin Spot Deposition Microarray Results
7x11 microarray consisting of identical Cy5-BSA spots (pitch 500 µm)
Typical CV: ≤ 5%

Protocol

Labeling and Hybridization
• Targets are normally prepared by oligo(dT)-primed cDNA synthesis
    • Probes should contain the 3' end of the mRNA
    • Need CoT1 DNA as competitor (esp. LINE)
• Alternative protocol is to make ds cDNA containing a bacterial promoter, then cRNA
    • Can work with smaller amounts of RNA
    • Less quantitative
• Hybridization usually under coverslips

Scanning the Arrays
• Laser scanners
    • Excellent spatial resolution
    • Good sensitivity, but can bleach fluorochromes
    • Still rather slow
• CCD (Charge-Coupled Device) scanners
    • Spatial resolution can be a problem
    • Sensitivity easily adjustable (exposure time)
    • Faster and cheaper than lasers
BioRad: VersArray ChipReader™ laser confocal scanners
• In all cases, raw data are images showing fluorescence on the surface of the chip

Example: Zeptosens Planar Waveguide Principle – for High Sensitivity Fluorescence Microarray Detection
[Figure labels: free label; microarray on chip; excitation of bound label; imaging of surface-confined fluorescence; CCD camera]
Glass Microarray – 326 Rat Heart Genes, 2X spotting

Coffee Break
• What did one math book say to the other?
• I have a lot of problems!
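
Going back to the spot-deposition figures above: spot density and spot pitch are two views of the same quantity if the spots sit on a regular square grid. The sketch below assumes exactly that (a square grid, which the slides do not state) and reads the quoted pitch as 500 µm; the function names are illustrative.

```python
import math

UM_PER_CM = 10_000  # micrometres per centimetre


def density_from_pitch(pitch_um):
    """Spots per cm^2 on a square grid with the given centre-to-centre pitch (µm)."""
    return (UM_PER_CM / pitch_um) ** 2


def pitch_from_density(spots_per_cm2):
    """Centre-to-centre pitch (µm) of a square grid with the given spots per cm^2."""
    return UM_PER_CM / math.sqrt(spots_per_cm2)


# Pitch quoted for the 7x11 Cy5-BSA array (assumed to be in micrometres).
print(f"500 µm pitch  -> {density_from_pitch(500):.0f} spots/cm^2")

# Densities quoted in the deposition results, converted back to a pitch.
for density, spot_size_um in [(70, 1100), (4800, 115)]:
    pitch = pitch_from_density(density)
    print(f"{density}/cm^2 -> pitch {pitch:.0f} µm (spot size {spot_size_um} µm)")
```

Under this assumption the implied pitch is larger than the quoted spot size in both cases, so the two sets of numbers are at least consistent with non-overlapping spots.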
The Affymetrix Approach
• Probes are oligos synthesized in situ using a photolithographic approach
• There are at least 13-16 oligos per gene (PM, perfect match), plus an equal number of negative controls (MM, mismatch)
• The apparatus requires a fluidics station for hybridization and a special scanner
• Only a single fluorochrome is used per hybridization
• It is very expensive!

Affymetrix GeneChip®

Affymetrix Chip Production - GeneChip® (Photolithography)
• Production of an Affymetrix GeneChip: through the use of photolithography & combinatorial chemistry, specific DNA probes are constructed on the chip surface (Coe & Antler 2004)
• The use of oligonucleotide arrays: mRNA is extracted from cells and amplified through a process that labels the RNA for analysis. The sample is then applied to an array, and bound RNA is stained (Coe & Antler 2004).

Probe Design
R = Discrimination Score = (PM - MM)/(PM + MM)
http://www.affymetrix.com/support/technical/technotes/statistical_reference_guide.pdf

Commercial Chips
• Clontech, Incyte, Research Genetics
    • Filter-based arrays with up to about 8,000 clones
• Incyte/Synteni
    • 10,000 probe chips, not distributed (have to send them target RNA)
• Affymetrix
    • Oligo-based chips with 12,000 genes of known function (13-16 oligos/gene) and 4x10,000 from ESTs
    • http://www.affymetrix.com/products/arrays/index.affx
[Figure label: Incyte microarray]

Affymetrix Designs

Alternative Technologies
• Synthesis of probes on microbeads
    • Hybridization in solution
    • Identification of beads by fluorescent bar coding or by embedding transponders
    • Readout using micro-flow cells or optic fibers
• Production of "universal" arrays
    • Array uses a unique combination of oligos, and probes containing the proper complements
[Figure label: 535 Multipurpose Cell]

Two-color Assay: DASL Hybridization of Labeled Amplicons to
Bead-based Address Code Sequences on Sentrix Universal Arrays
http://www.illumina.com/products/arraysreagents/universal_arrays.ilmn
Illumina©

Universal array
A. 100 beads with different probe DNA are arrayed in a capillary in the intended order
B. Microscopic image
C. A bead-array system
Sample, buffer & waste reservoir
• Sample solution from the sample reservoir moves back & forth inside the bead-array during hybridization, & buffer solution from the buffer reservoir is introduced during washing

Fiber Optics Technology
To learn more: Illumina's Web site

Arrays for Genetic Analysis
• Mutation detection
    • Molecular Inversion Probe Technology for SNP Genotyping (next slide)
    • 20,000 SNPs in a single array
    • PCR followed by primer extension, with detection of alleles by MALDI-TOF mass spectrometry (MS) (Sequenom)
• Gene loss & amplification
    • Measure gene dosage in genomic DNA by hybridization to genomic probes

Genome Research 2005 15, 269-75.
http://www.affymetrix.com/technology/mip_technology.affx#snp
• Four-color single array technology; up to 12,000 SNPs per reaction
• Amplification with a universal PCR primer pair
• Each amplified probe contains a unique tag sequence that is complementary to a sequence on the universal tag array
• Tags have been selected to have a similar Tm & base composition & to be maximally orthogonal in sequence complementarity

Bioinformatics of Microarrays
• Array design: choice of sequences to be used as probes
• Analysis of scanned images
    • Spot detection, normalization, quantitation
• Primary analysis of hybridization data
    • Basic statistics, reproducibility, data scattering, etc.
• Comparison of multiple samples
    • Clustering, SOMs, k-means classification …
    • SOMs = Self-Organizing Maps (a subtype of artificial neural network; low-dimensional views of high-dimensional data)
    • Unsupervised learning
• Sample tracking and databasing of results

Microarray Data Pipeline

Microarray Data on the Web
• Many groups have made their raw data available, but in many formats
• Some groups have created searchable databases
• There are several initiatives to create "unified" databases
    • EBI: ArrayExpress
    • NCBI: Gene Expression Omnibus
• Companies are beginning to sell microarray expression data (e.g. Incyte)

Other Web Links
• Leming Shi's Gene-Chips.com page
    • Very rich source of basic information and commercial and academic links
• DNA chips for dummies animation
• The Big Leagues: Pat Brown and NHGRI microarray projects

http://www.coactivepr.com/assets/pdf/writing_samples/sequenom/Genotyping%20Brochure_v8.pdf 2004
[Figure labels: protons; e-; matrix-assisted laser desorption/ionization; RNase A: U and C; RNase T1: G-specific, digestion of the dC-transcript of the opposite strand]

Single Nucleotide Polymorphisms
RNase A: U and C
• A sequence change can have multiple effects on the mass spectra
• It can result in a mass shift, introduction of a cleavage site, or removal of a cleavage site
• The forward reactions indicate the presence of a SNP through a mass shift
• The reverse reactions pinpoint the location of the SNP in the amplicon reference sequence

Only One Final Word of Wisdom...
• "...although the computer is a wonderful helpmate for the sequence searcher and comparer, biochemists and molecular biologists must guard against the blind acceptance of any algorithmic output; given the choice, think like a biologist and not a statistician"
• Russell F.
Doolittle, 1990

Suppression Subtractive Hybridization cDNA libraries
• Starting populations: tester cDNA with Adaptor 1, tester cDNA with Adaptor 2, driver cDNA (in excess)
• First hybridization: all components denatured, to remove the most common sequences; yields molecule types a, b, c, d
• Second hybridization: mix, add freshly denatured driver; anneal; yields a, b, c, d + e
• Fill in the ends
• Add primers; PCR amplify
    • a: no amplification
    • b: no amplification
    • c: linear amplification
    • d: no amplification
    • e: exponential amplification
(Diatchenko et al., 1996. Proc. Natl. Acad. Sci. USA 93:6025)

Efficacy of SSH
Ji et al. 2002 BMC Genomics 3:12
• Diatchenko et al. 1996 (PNAS 93:6025)
    • Could detect as little as 0.001% target
• Critical factor is the relative concentration of target in tester and driver populations
• Effective enrichment when
    • Target present at >= 0.01%
    • Concentration ratio >= 5-fold

SSH Advantages & Drawbacks
• Advantages
    • Normalization of transcript levels
    • Detects small (2-fold) differences in transcript levels
    • Identifies previously uncharacterized genes (novel genes)
    • Generates subtracted libraries rapidly
• Drawbacks
    • Isolating & sequencing transcripts is slow & laborious
    • Many clones may contain the same sequences
    • All transcripts must be verified by Northern or quantitative RT-PCR

Yeast Two-Hybrid System (1)
• Protein-protein interaction
• A yeast vector for expressing a DNA-binding domain and flexible linker region without the associated activation domain, e.g., the deleted GAL4 containing amino acids 1-692
• A cDNA sequence encoding a protein or protein domain of interest (the bait domain) is fused in frame to the flexible linker region, so that the vector will express a hybrid protein composed of the DNA-binding domain, linker region, and bait domain
Molecular Cell Biology, Lodish 5th Ed.
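
"Fused in frame" on the slide above means that the bait codons are read in the same reading frame as the DNA-binding domain and linker that precede them. The following is a minimal sketch of such a check; the sequences are short made-up examples (not GAL4 or any real vector), and the two conditions tested are a simplification chosen for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA string into complete codons, ignoring trailing bases."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_in_frame_fusion(upstream, bait):
    """Return True if `bait` is read in the same frame as `upstream`.

    Simplified conditions:
      1. the upstream coding sequence (DNA-binding domain + linker) has a
         length divisible by 3, so the first bait base starts a new codon;
      2. no in-frame stop codon occurs in `upstream`, or in `bait` before
         its final codon.
    """
    if len(upstream) % 3 != 0:
        return False
    if any(c in STOP_CODONS for c in codons(upstream)):
        return False
    return not any(c in STOP_CODONS for c in codons(bait)[:-1])


# Hypothetical sequences for illustration only.
upstream = "ATGAAGCTACTGGGTGGA"   # 18 nt: six codons, no stop
bait = "GCTGAAGATTAA"             # three codons followed by a stop

in_frame = is_in_frame_fusion(upstream, bait)
frameshifted = is_in_frame_fusion(upstream + "C", bait)   # one extra base
early_stop = is_in_frame_fusion(upstream, "GCTTAAGATTAA")  # stop at codon 2

print(in_frame, frameshifted, early_stop)
```

A single extra base between linker and bait is enough to shift the frame, in which case the hybrid protein would no longer contain the intended bait domain.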
Yeast Two-Hybrid System (2)
• A cDNA library is cloned into multiple copies of a second yeast vector that encodes a strong activation domain & flexible linker, to produce a vector library expressing multiple hybrid proteins, each containing a different fish domain
• The bait vector & library of fish vectors are then transfected into engineered yeast cells in which the only copy of a gene required for histidine synthesis (HIS) is under control of a UAS with binding sites for the DNA-binding domain of the hybrid bait protein
• Transformed cells that express the bait hybrid & an interacting fish hybrid will be able to activate transcription of the HIS gene
• The flexibility in the spacing between the DNA-binding & activation domains of eukaryotic activators makes this system work

Yeast Two-Hybrid System (3)
• A two-step selection process is used
• The bait vector also expresses a wild-type TRP gene, and the fish (library) vector expresses a wild-type LEU gene
• Transfected cells are first grown in a medium that lacks tryptophan & leucine but contains histidine
• Only cells that have taken up the bait vector & one of the fish plasmids will survive in this medium
• The cells that survive are then plated on a medium that lacks histidine

Yeast Two-Hybrid System (4)
• Those cells expressing a fish hybrid that does not bind to the bait hybrid cannot transcribe the HIS gene & consequently will not form a colony on medium lacking histidine
• The few cells that express a bait-binding fish hybrid will grow & form colonies in the absence of histidine
• Recovery of the fish vectors from these colonies yields cDNAs encoding protein domains that interact with the bait domain

Coffee Break
• What do boxers and astronomers have in common?
• They both see stars!!!
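
The two-step selection described in the Yeast Two-Hybrid System (3) and (4) slides can be written out as a small truth table. This is only a sketch of the logic as stated on the slides; the class and field names are illustrative, and real screens have complications (leaky reporters, self-activating baits) that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YeastCell:
    has_bait_vector: bool   # bait vector carries the wild-type TRP gene
    has_fish_vector: bool   # fish vector carries the wild-type LEU gene
    fish_binds_bait: bool   # fish domain interacts with the bait domain


def survives_step1(cell):
    """Step 1 medium lacks tryptophan and leucine but contains histidine."""
    return cell.has_bait_vector and cell.has_fish_vector


def survives_step2(cell):
    """Step 2 medium lacks histidine, so the HIS reporter must be transcribed."""
    return survives_step1(cell) and cell.fish_binds_bait


cells = {
    "bait only": YeastCell(True, False, False),
    "fish only": YeastCell(False, True, False),
    "bait + non-binding fish": YeastCell(True, True, False),
    "bait + binding fish": YeastCell(True, True, True),
}

results = {
    name: (survives_step1(cell), survives_step2(cell))
    for name, cell in cells.items()
}
for name, (step1, step2) in results.items():
    print(f"{name:26s} step 1: {step1!s:5s} step 2: {step2}")
```

Only the last row survives both media, which is why recovering the fish vector from the surviving colonies identifies candidate interaction partners of the bait.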