Genomics
A. Overview:
B. Sequencing:
C. Finding Genes – structural genomics and ‘annotation’:
- Once you have the sequence data, you have really just started.
- The goals are then to:
  - identify where genes are (Open Reading Frames)
  - find promoters and regulatory elements to confirm this is a gene (and not a pseudogene)
  - in eukaryotes, find splice sites, introns and exons
  - identify structural sequences like telomeres and centromeres
  - convert the DNA sequence into the predicted AA sequence of the protein
  - predict protein structure and function by identifying ‘domains’ and ‘motifs’
- These goals are attained by computer analyses of gene/AA sequence data, and comparison with known, described genes. This is BIOINFORMATICS.
1. NCBI – BLAST search compares a sequence to other sequences in the database
      1 ggggcacccc tacccactgg ttagcccacg ccatcctgag gacccagctg cacccctacc
     61 acagcacctc gggcctaggc tgggcggggg gctggggagg cagagctgcg aagaggggag
    121 atgtggggtg gactcccttc cctcctcctc cccctctcca ttccaactcc caaattgggg
    181 gccgggccag gcagctctga ttggctgggg cacgggcggc cggctccccc tctccgaggg
    241 gcagggttcc tccctgctct ccatcaggac agtataaaag gggcccgggc cagtcgtcgg
    301 agcagacggg agtttctcct cggggtcgga gcaggaggca cgcggagtgt gaggccacgc
    361 atgagcggac gctaaccccc tccccagcca caaagagtct acatgtctag ggtctagaca
    421 tgttcagctt tgtggacctc cggctcctgc tcctcttagc ggccaccgcc ctcctgacgc
    481 acggccaaga ggaaggccaa gtcgagggcc aagacgaaga cagtaagtcc caaacttttg
    541 ggagtgcaag gatactctat atcgcgcctt gcgcttggtc ccgggggccg cggcttaaaa
    601 cgagacgtgg atgatccgga gactcgggaa tggaagggag atgatgaggg ctcttcctcg
    661 gcgccctgag acaggaggga gctcaccctg gggcgaggtt ggggttgaac gcgccccggg
    721 agcgggaggt gagggtggag cgccccgtga gttggtgcaa gagagaatcc cgagagcgca
    781 accggggaag tggggatcag ggtgcagagt gaggaaagta cgtcgaagat gggatggggg
    841 cgccgagcgg ggcatttgaa gcccaagatg tagaagcaat caggaaggcc gtgggatgat
    901 tcataaggaa agattgccct ctctgcgggc tagagtgttg ctgggccgtg ggggtgctgg
    961 gcagccgcgg gaagggggtg cggagcgtgg gcgggtggag gatgagaaac tttggcgcgg
   1021 actcggcggg gcggggtcct tgcgccccct gctgaccgat gctgagcact gcgtctcccg
2. Open Reading Frames: base sequences which would code for long stretches of AA’s before a stop codon is reached. Typically, these are found by looking for [5’ – ATG… – 3’] sequences that follow a promoter (TATA, CAAT, GGGCGG). The complementary (template) strand reads [3’ – TAC… – 5’], which is transcribed into the start codon in RNA [5’ – AUG… – 3’].
3. Regulatory regions and splicing sites (GT-AG): introns typically begin with GT and end with AG.
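The open reading frame search in item 2 can be sketched in a few lines of Python. This is only a minimal illustration, not how NCBI tools do it: it scans the three reading frames of the given strand for an ATG, reads codons until a stop codon, and translates what lies between using the standard genetic code. The names `CODON_TABLE` and `find_orfs` and the `min_codons` cutoff are assumptions made for this example; promoters, introns and the complementary strand are ignored.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def find_orfs(dna, min_codons=30):
    """Return (start, end, protein) for each ATG...stop run on the given strand.

    start/end are 0-based positions; end is just past the stop codon.
    min_codons is a hypothetical cutoff for what counts as a 'long' ORF.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        # Split this reading frame into complete codons.
        codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
        start = None
        for k, codon in enumerate(codons):
            if start is None and codon == "ATG":
                start = k  # first start codon opens a candidate ORF
            elif start is not None and CODON_TABLE.get(codon) == "*":
                if k - start >= min_codons:
                    protein = "".join(CODON_TABLE.get(c, "X") for c in codons[start:k])
                    orfs.append((frame + 3 * start, frame + 3 * k + 3, protein))
                start = None  # stop codon closes the candidate
    return orfs


# Toy sequence: ATG AAA TTT TAG embedded in flanking bases.
print(find_orfs("CCATGAAATTTTAGCC", min_codons=2))  # [(2, 14, 'MKF')]
```

A real gene finder would also scan the reverse complement and, in eukaryotes, account for introns, so an ORF found this way is a candidate and not a confirmed gene.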
D. Identifying Gene Function – functional genomics:
- Sequence Homology: search libraries for similar sequences already described in other proteins with known function, in other species or in the same species.
- Domain / Motif Analysis: certain AA sequences are known to have a certain structure (‘motif’, like “helix-turn-helix”) or function (‘domain’, like an ion channel sequence or a DNA-binding region).
- Mutant Analysis: mutate the gene (insert a non-functional sequence) in vitro, then insert it into cells and observe the effects of “knocking out” function in different tissues or in the whole organism.
Capecchi, Evans, and Smithies were awarded the 2007 Nobel Prize for their technique for inserting a gene into embryonic cells. This gene can be a mutant, non-functional gene (“knock-out”) or a functional gene (“knock-in”). Typically, you would then screen mice for those in which, by luck, transformed cells ended up in the gonads. These mice will pass the mutation to their gametes; so if you mate a male and a female that carry it, you can produce offspring that are homozygous for the mutation in every cell… and you can see its effects.
E. Comparing Protein Expression
- Construction of a microarray – ‘gene chip’
Can create a chip with unique-sequence DNA from every gene in a genome (‘probe’).
1. Take a tissue sample
2. Isolate mRNA
3. Make labeled cDNA
4. Expose to the chip and allow complementary strands to hybridize
5. Wash
6. Analyze fluorescence at each point; binding denotes that this tissue has this gene on at this point in development
F. Phylogenetic Analyses: Comparative Genomics
- DNA or AA sequences can be compared across species
For example… download the sequence for cytochrome-c from different organisms:
>Arabidopsis
MASFDEAPPGNPKAGEKIFRTKCAQCHTVEKGAGHKQGPNLNGLFGRQSGTTPGYSYSAA
NKSMAVNWEEKTLYDYLLNPKKYIPGTKMVFPGLKKPQDRADLIAYLKEGTA
>Euglena
GDAERGKKLFESRAGQCHSSQKGVNSTGPALYGVYGRTSGTVPGYAYSNANKNAAIVWED
ESLNKFLENPKKYVPGTKMAFAGIKAKKDRLDIIAYMKTLKD
>Hippo
GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLHGLFGRKTGQSPGFSYTDANKNKGITWG
EETLMEYLENPKKYIPGTKMIFAGIKKKGERADLIAYLKQATNE
>Mosquito
MGVPAGDVEKGKKLFVQRCAQCHTVEAGGKHKVGPNLHGLFGRKTGQAAGFSYTDANKAK
GITWNEDTLFEYLENPKKYIPGTKMVFAGLKKPQERGDLIAYLKSATK
>Rice
MASFSEAPPGNPKAGEKIFKTKCAQCHTVDKGAGHKQGPNLNGLFGRQSGTTPGYSYSTA
NKNMAVIWEENTLYDYLLNPKKYIPGTKMVFPGLKKPQERADLISYLKEATS
Use clustalX to align the sequences and resolve a phylogeny
Use n-j plot to see the tree
[Figure: resulting tree with tips Euglena, Mosquito, Hippo, Rice, Arabidopsis; scale bar 0.02]
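To make the comparison step concrete, here is a rough Python sketch of the idea behind it, using three of the cytochrome-c sequences listed above. It does not reproduce what clustalX does. Instead it swaps in a simple pairwise Needleman-Wunsch global alignment and reports the p-distance (the fraction of aligned, ungapped positions that differ), and it stops short of building a tree. The function names `global_align` and `p_distance` and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
from itertools import combinations

# Cytochrome-c sequences copied from the FASTA records above (three of the five).
SEQS = {
    "Arabidopsis": ("MASFDEAPPGNPKAGEKIFRTKCAQCHTVEKGAGHKQGPNLNGLFGRQSGTTPGYSYSAA"
                    "NKSMAVNWEEKTLYDYLLNPKKYIPGTKMVFPGLKKPQDRADLIAYLKEGTA"),
    "Rice": ("MASFSEAPPGNPKAGEKIFKTKCAQCHTVDKGAGHKQGPNLNGLFGRQSGTTPGYSYSTA"
             "NKNMAVIWEENTLYDYLLNPKKYIPGTKMVFPGLKKPQERADLISYLKEATS"),
    "Hippo": ("GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLHGLFGRKTGQSPGFSYTDANKNKGITWG"
              "EETLMEYLENPKKYIPGTKMIFAGIKKKGERADLIAYLKQATNE"),
}


def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    def sub(x, y):
        return match if x == y else mismatch

    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def p_distance(a, b):
    """Fraction of aligned, ungapped positions at which the two sequences differ."""
    aligned_a, aligned_b = global_align(a, b)
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(1 for x, y in pairs if x != y) / len(pairs)


for name_a, name_b in combinations(SEQS, 2):
    print(f"{name_a:12s} {name_b:12s} {p_distance(SEQS[name_a], SEQS[name_b]):.3f}")
```

A neighbor-joining program builds its tree from a table of pairwise distances of this general kind, although real tools use multiple alignment and substitution matrices rather than the flat scores assumed here.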
G. Conclusions from Genomic Studies:
- there is remarkable homology in protein/gene sequence between species
- physiological/developmental complexity is not correlated with genome size
- only 2-5% of the human genome codes for proteins
- although there are about 100,000 proteins, there are only about 20,000 genes… suggesting that most genes encode multiple proteins, produced through transcript processing (such as alternative splicing) and post-translational processing.
- Most of the genome does NOT encode protein. However, large fractions of the DNA do encode nc-RNA’s (“non-coding RNA’s”), which are produced by transcription but not translated, and which then exert a regulatory function (mi-RNA’s and others).
- So, organisms with similar coding genes can be remarkably different… as a consequence of how the production of those proteins is regulated in different cell types and at different developmental periods.
PHEW!!!!
Recombinant DNA Technology
- combines DNA from different sources – usually different species
Utility: this is done
- to study DNA sequences
- to mass-produce proteins
- to give recipient species new characteristics
- as a therapy/curative for genetic disorders (‘gene therapy’)
[Figures: human insulin, created in bacteria; corn damaged by corn borer and fungi; “bt-corn”, with a bacterial gene]
Genetic Engineering
A. To mass-produce proteins
[Figure: Making human insulin]
A1-antitrypsin was the first; antithrombin is the first transgenic protein produced in animals to be approved by the FDA for human use.
Eukaryote genes may not be read properly by bacterial hosts because of introns and regulatory elements. In addition, the protein may not be processed or folded correctly. Using a eukaryotic host solves these problems… but then getting expression in the right tissue is the problem.
Vaccines (HPV vaccine – ‘Gardasil’) are being synthesized that consist of only a few proteins that initiate the immune response, rather than the entire virus (or bacterium). The genes for these proteins could be put in food, to initiate an immune response.
B. To give species new characteristics
The EPSP synthase gene in E. coli confers resistance to glyphosate – the active ingredient in herbicides like Round-Up©.
Agrobacterium is a plant pathogen that transfers DNA from its Ti plasmid into host cells. These plasmids have been used as vectors for introducing the gene into plant tissues, which grow into new plants.
Bacillus thuringiensis is a bacterium that produces a protein that crystallizes in insect guts, killing the insect. Since the 1930’s, the bacteria have been sprayed on crops to reduce insect damage. The treatment was very short term, as the bacteria died quickly.
Same process – splice to an Agrobacterium plasmid, with tissue-specific promoters.
Issues:
- genetic homogeneity of crop plants
- 2011 study – toxin present in 93% of pregnant women in a town in Canada, and increases in immunological responses
- used as feed for animal stock
- patterns of use and the evolution of resistance
Place the gene for growth hormone from chinook salmon into Atlantic salmon, next to a constitutive promoter (gene always on, right?)
- Grow 10x faster, to the same mature size
- Models suggest they would outcompete native species if released into the wild
C. Gene Therapy
- Create a viral vector with a functional human allele – adenosine deaminase
- Infect target tissue
- Probably need to repeat unless you can transform stem cells
1990 – first trial of gene therapy – Ashanti DeSilva. 40 treated since then with 100% efficacy.
OTC – ornithine transcarbamylase deficiency syndrome. An X-linked disorder resulting in the inability to bind and convert ammonia to urea. Total loss of this protein is usually fatal shortly after birth.
Jesse Gelsinger – died in 1999 at age 18, as a consequence of a gene therapy trial for OTC deficiency involving an adenovirus vector. He had an immunological reaction to the virus and died.
“First, although Gelsinger and his family were under the impression that the pre-clinical animal studies had affirmed the trial's safety, two monkeys had actually died. This information appeared on the consent form submitted to the National Institutes of Health review board, but did not appear on the form signed by Jesse. Moreover, the Penn researchers did not disclose to either the Gelsingers or federal regulators that human volunteers in the same study had suffered adverse reactions - side effects serious enough to have halted the trials had they been reported.
Not reporting adverse events in gene therapy clinical trials is clearly wrong, but it seems to have been par for the course in the 1990s: evidence collected shortly after Gelsinger's death showed that fewer than six percent of adverse events associated with gene therapy were properly reported at this time. Lastly, the lead researcher in the Penn study - James Wilson - did not disclose to the Gelsingers that he was conducting the clinical trial with a private company in which he had a stake. Wilson had a direct financial interest - not merely an academic one - in the trial's successful outcome.”
From Center for Genetics and Society http://www.geneticsandsociety.org/article.php?id=4955
Bioethics
A. GMO’s – Genetically Modified Organisms
Should the consumer know? If content is < 5%, should it be labeled GMO-free? Labeling is required in Europe and Asia… why not in the U.S., which produces 65% of GM food worldwide?
B. Genetic Testing
- 2008 – Genetic Information Nondiscrimination Act (GINA) “prohibits the improper use of genetic information in health insurance and employment”
“GINA does not cover an individual's manifested disease or condition--a condition from which an individual is experiencing symptoms, being treated for, or that has been diagnosed.”
Sex discrimination in the workplace was not prohibited under GINA… so the Lilly Ledbetter Fair Pay Act (“Lily”) was needed as a separate “equal pay for equal work” act.
[Figure: Lily Ledbetter; XX, XY, ?]
- Genetic screening and embryo selection. “Preimplantation Genetic Diagnosis” (PGD) – used in in vitro fertilization, screening early embryos for genetic abnormalities… or other traits? Suppose a child needs a bone marrow transplant… should parents be allowed to select among embryos to make a sibling who can serve as a donor?
- Germline engineering
- Enhancement Gene Therapy
Why not insert “better” genes? For Youth? Strength? Health?