MCB 317 Genetics and Genomics
Topic 11, part 2: Genomics

Need to Add to part 2 or 3
A. ChIP-seq
B. Deep sequencing for expression profiling
C. Illumina? movie

Genomics Summary
A. Microarrays: expression profiling and other uses
B. Global Gene Knockouts
C. Global protein localization in yeast
D. Global complex identification in yeast
E. Global two-hybrid analysis in yeast and other organisms
F. RNAi
G. Transgenics, gene “knock-outs” (genetics not genomics)
H. Human Genome Project, Next Generation Sequencing, and Comparative Genomics

Yeast “Knockout” Library
Delete YFG
Delete all genes (individually)

Disruption of “All” Yeast Genes
• Approx. 6000 genes
• Make 6000 sets of disruption primers
• Disrupt each gene in a diploid
• Dissect all 6000 diploids
  – Identify set of essential genes
  – Identify set of non-essential genes

Yeast “Knockout” Library
• Delete one copy of each gene in diploid
  – 5,916 “genes” deleted
  – 5,916 diploid strains constructed
• Dissect to determine if gene is essential
  – 1,105 genes = essential
  – 18.7% of genes = essential
• Construct an ordered library of haploids for nonessential genes
  – 4,811 mutant strains in library

[Figure: Genomics: High-throughput genetics. Diagram linking Gene, Mutant Gene, Mutant Organism, Protein, Ab, Txn Profile, Protein Profile/Localization, Subunits of Protein Complex, and Orthologs and Paralogs across Genetics, Biochemistry, and Genomics, labeled with the outline letters A to H.]

8100 Human DBD-ORFs x 8100 Human AD-ORFs

Evolution of RNAi (current model)
1. a. Viruses are bad (so are transposons).
   b. Many viruses have dsRNA genomes
   c. Eukaryotes originally lacked dsRNAs
   d. Invent mechanism to kill dsRNA
2. Evolve mechanism to regulate endogenous genes
   a. RNA degradation
   b. Inhibit translation
   c. Form heterochromatin
3. Use as experimental technique

Knockout Mouse: The Goal
[Figure labels: YFG, Marker Gene]
Replace the coding region of YFG with a selectable marker gene

Knockout Mouse
Transfected DNA can integrate at random sites (standard transgenic organism). This is a relatively common event.
Or the transfected DNA can replace the endogenous copy of the gene via homologous recombination. This is a relatively rare event.
[Figure: Gene deletion by homologous recombination, using a marker gene]

Knockout Mouse
How do we select for the cells in which the Occludin gene is replaced with a mutant allele (a null allele), given that most of the transformed DNA will integrate at random sites?
• The neo^r gene makes mammalian cells resistant to the drug G418
• The tk^HSV gene makes mammalian cells sensitive to the drug ganciclovir

Knockout Mouse
[Figure labels: YFG, neo^r, tk^HSV]
Red = Mouse DNA including YFG and regions upstream and downstream of YFG
Blue = neo^r gene
Green = tk^HSV gene
Black = plasmid DNA (not homologous to any mouse DNA)

Knockout Mouse
Lodish 5-40

Knockout Mouse
Different Types of Transgenic Organisms

Goals of Human Genome Project
1. Generate genetic, physical, and sequence maps of the human genome
2. Sequence genomes of a variety of model organisms: Comparative Genomics
3. Develop improved technology for mapping and sequencing
4. Develop computational tools for capturing, storing, analyzing, displaying, and distributing map and sequence data
5. Sequence ESTs and cDNAs
6. Consider social, ethical, and legal challenges posed by genetic information

Genomicists look at two basic features of genomes: sequence and polymorphism
• Major challenges: determine the sequence of each chromosome in the genome and identify many polymorphisms
  – How does one sequence a 500 Mb chromosome 600 bp at a time?
  – How accurate should a genome sequence be?
    • DNA sequencing error rate is about 1 per 600 bp
  – How does one distinguish sequence errors from polymorphisms?
    • Rate of polymorphism in diploid human genome is about 1 in 1000 bp
  – Repeat sequences may be hard to place
  – Unclonable DNA cannot be sequenced
    • Up to 30% of genome is heterochromatic DNA that cannot be cloned

Whole-genome shotgun sequencing
Used by the private company Celera to sequence the whole human genome
• Whole genome randomly sheared three times
  – Plasmid library constructed with ~2 kb inserts
  – Plasmid library with ~10 kb inserts
  – BAC library with ~200 kb inserts
• Computer program assembles sequences into chromosomes
• No physical map construction
• Only one BAC library
• Overcomes problems of repeat sequences
Fig. 10.13

Pyrosequencing, pt 1
Rxn 1: (DNA)n + dNTP --DNAP--> (DNA)n+1 + PPi
Rxn 2: APS + PPi --ATP sulfurylase--> ATP
(APS = adenosine phosphosulfate)

Pyrosequencing, pt 2
Luciferin + ATP --Luciferase--> Oxyluciferin + Light (the diagram also shows ADP)
Apyrase: dNTP -> dNDP + Pi -> dNMP + Pi + Pi

Pyrosequencing, overview
[Figure: primer strand GCTACACT paired with template CGATGTGACTGTA; labels dTTP, APS, PPi, ATP, Luciferin, Luciferase, Oxyluciferin + Light]

Pyrosequencing
Add one nucleotide (A) -> detect light (yes or no)
Apyrase degrades excess nucleotide (A)
Add next nucleotide (C) -> detect light (yes or no)
Apyrase degrades excess nucleotide (C)
Repeat cycle 100’s of times

Pyrosequencing: Emulsion PCR
1. Add linkers (primers) to ends of genomic fragments
2. Attach fragments to 1000’s of beads in a mixture
3. Add PCR reagents
4. Add oil and make an emulsion so that each bead is in its own droplet (its own PCR reaction)
5. Amplify DNA to make millions of identical copies. Each bead has millions of copies of a single DNA

Pyrosequencing: Pico-titer plate
200,000-400,000 wells per plate
1. Add beads to picotiter plate; only one bead fits in each well
2. Add a second, smaller type of bead that holds the DNA bead in the well and delivers enzymes to the well
3. Flow the nucleotides into the wells one at a time and record the light emitted from each well using a CCD camera

Pyrosequencing
Currently the best machines can sequence 400 - 600 million base pairs in one 10 hour run
Haploid human genome = 3,000,000,000 bp
At about 500 million bp per run, 1x depth takes about 6 runs, so one machine can sequence a haploid human genome to 1x depth in about 6 days at roughly one run per day.

The current target goal for sequencing individual human genomes is to get the cost down to $1,000 per genome.
At present the cost is around “$5,000-$10,000” per individual (last year)…
Illumina claims to have hit the $1,000 cost per genome in January of 2014

Illumina Sequencing Technology
See Movie
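
The add-one-nucleotide cycle from the pyrosequencing slides can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the instrument's actual software. The function names `flowgram` and `call_bases` and the T, A, C, G flow order are hypothetical choices for the example. The sketch also assumes the light signal equals the number of bases incorporated during a flow, so a run of identical bases gives one proportionally brighter flash rather than a plain yes or no.

```python
# Watson-Crick pairing used to decide whether a flowed nucleotide is incorporated.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flowgram(template, flow_order="TACG", max_cycles=100):
    """Simulate pyrosequencing flows over a template.

    template   : template bases in the order the polymerase reads them
                 (the bases downstream of the primer).
    flow_order : order in which nucleotides are flowed in each cycle
                 (an assumption made for this example).
    Returns a list of (nucleotide, signal) pairs, where signal is the number
    of bases incorporated in that flow (0 means no light).
    """
    pos = 0          # next template base to be copied
    signals = []
    for _ in range(max_cycles):
        for nt in flow_order:
            if pos >= len(template):
                return signals  # template fully copied
            count = 0
            # Polymerase keeps extending while the flowed nucleotide pairs
            # with the next template base (homopolymer runs extend at once).
            while pos < len(template) and COMPLEMENT[template[pos]] == nt:
                count += 1
                pos += 1
            # Apyrase then degrades the excess nucleotide before the next flow.
            signals.append((nt, count))
    return signals


def call_bases(signals):
    """Convert flow signals back into the synthesized strand."""
    return "".join(nt * count for nt, count in signals)


if __name__ == "__main__":
    # Last six bases of the template shown on the overview slide.
    flows = flowgram("ACTGTA")
    print(flows)
    print(call_bases(flows))  # TGACAT
```

With the last six bases of the overview slide's template (ACTGTA), the first flow of T gives a signal, matching the dTTP step drawn on that slide, and the called sequence is TGACAT, the complement of the template.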