The Human Genome Project and 100 Million Years of Human Evolution
David Haussler
Center for Biomolecular Science and Engineering
University of California, Santa Cruz

The human genome is a recipe for an entire body and brain
• The genome is organized into 23 pairs of human chromosomes (1-22 and the pair X,Y or X,X)
• Each chromosome consists of DNA, a molecular string of the bases A, C, G, and T, about 3 billion in all across the genome
• All cells in the body have the same DNA that was in the original fertilized egg
• Genes are DNA sequences that code for proteins (only about 1.5% of the human genome)
To what extent does a person’s genome define them?

On July 7, 2000, UCSC posted the human genome on the web
Outgoing UCSC internet traffic for the year 2000

The UCSC Genome Browser: a new kind of web-based genome microscope
• Data from all over the world are fed into nightly updates of the UCSC browser database, analysis, and display
• Every day, more than 7,000 biomedical researchers use it to scan the genome at ever greater detail, dimension, and depth, making more than 300,000 web page requests
Explore the genome at http://genome.ucsc.edu
UCSC Genome Bioinformatics Group

Large-scale Operations in Genome Evolution
Zack Sanborn

Example: evolutionary history of a mammalian chromosome
History of rat chromosome X
Jian Ma, Bernard Suh, Brian Raney

Morpheus: new genes by segmental duplication
Baboon chromosome
Human chromosome
Evan Eichler

The demise of a gene
Codon TGG for the amino acid tryptophan became a stop codon in this gene before the human-chimp ancestor, killing the gene. Proteins of this type (acyltransferase 3) appear in all branches of life; this was the last in the hominid genome.
Jing Zhu, Zack Sanborn, Craig Lowe

Project to reconstruct the evolutionary history of the genomes of placental mammals
Data from NHGRI Comparative Genome Sequencing Program
Homo sapiens sapiens
Homo sapiens neanderthalensis
Homo sapiens
chimpanzee (Pan troglodytes)
Homo/Pan
Gorilla
Homo/Pan/Gorilla
orangutan
Hominidae (great apes)
gibbon
Hominoidea (apes)
rhesus macaque
Catarrhini (Old World monkeys and apes)
marmoset
Anthropoidea
tarsier
Haplorhines
bushbaby
Primates
pygmy tree shrew
Euarchonta
mouse (Mus musculus “genomicus”)
Euarchontoglires
Boreoeutheria
common shrew
Eutheria (placental mammals)
elephant shrew

Tursiops truncatus
Not all descendants of the eutherian ancestor are shrew-like

We found 49 genomic regions that showed extremely accelerated evolution in humans
Human Accelerated Region 1
Katie Pollard and Sofie Salama

HAR1 produces a structured RNA sequence that is expressed in the fetal brain
New interactions in the human version of this gene
Computational prediction of structure conserved throughout amniotes
Jakob Pedersen

The six layers of the cerebral cortex are built during fetal brain development
During development, the cerebral cortex is built “inside-out” by neurons migrating radially from the subventricular zone to the pial surface. This process is guided by the neurodevelopmental gene Reelin.
Image: www.thebrain.mcgill.ca

HAR1 is expressed in the same cells as Reelin (the Cajal-Retzius neurons), and during the same period of development (8-20 gestational weeks)
Nelle Lambert, Marie-Alexandra Lambot, Sandra Coppens, Pierre Vanderhaeghen
We are pursuing the hypothesis that HAR1 functions in cortical development and was involved in the evolution of the human brain

Grand challenge of human molecular evolution
Reconstruct the evolutionary history of each base in the human genome
• Discover functional elements of the genome
• Find the human evolutionary innovations
• Map the important human genetic variation
• Map the genome adaptations in individual cancer tumors that make them dangerous

The UCSC Team
Katie Pollard and Gill Bejerano
Jim Kent
Sofie Salama
Adam Siepel

Extended Credits
Thanks to Jim Kent, Sofie Salama, Gill Bejerano*, Katie Pollard*, Adam Siepel*, Robert Baertsch, Galt Barber, Hiram Clawson, Mark Diekhans, Jorge Garcia, Rachel Harte, Angie Hinrichs, Fan Hsu, Donna Karolchik, Sol Katzman, Andy Kern, Bryan King, Robert Kuhn, Victoria Lin, Andre Love, Craig Lowe, Yontao Lu, Jian Ma, Chester Manuel, Courtney Onodera, Jakob Pedersen, Andy Pohl, Brian Raney, Brooke Rhead, Kate Rosenbloom, Krishna Roskin, Zack Sanborn, Kayla Smith, Mario Stanke, Bernard Suh, Paul Tatarsky, Archana Thakkapallayil, Daryl Thomas, Heather Trumbower, Jason Underwood, Ting Wang, Erich Weiler, Chen-Hsiang Yeang, Jing Zhu, and Ann Zweig, in my group at UCSC
And to Webb Miller, Nadav Ahituv, Manny Ares, Mathieu Blanchette, Rico Burhans, Michele Clamp, Richard Gibbs, Eric Green, Haller Igel, John Karro, Eric Lander, Kerstin Lindblad-Toh, Jim Mullikin, Tom Pringle, Eddy Rubin, Armen Shamamian, Pierre Vanderhaeghen, and many other outside collaborators

Single nucleotide polymorphisms (SNPs)
• When we compare the genomes of many people, we see ~3 million variable bases (SNPs). That is about one every 1,000 bases.
• Each SNP is a change that happened only once.
• The more ancient the SNP, the more common it is; most SNPs come from before the time of a population bottleneck about 100,000 years ago, before our ancestors migrated out of Africa.
• Each of your kids has about 175 new DNA changes, but nearly all changes are lost within 20 generations.
• SNPs inherited together with no recombination form “haplotype blocks”.

Polymorphism Data is Used to Help Locate Disease Genes
• With new genotyping technology, there has been a revolution in our ability to discover disease-related genes. New discoveries have been made for diabetes, cancer, cardiovascular disease, autoimmune diseases, and neurological diseases.
• The ability to interactively explore the genome on the web is accelerating biomedical research and will eventually help us to better diagnose and cure disease.

Genomes and the Central Dogma of Molecular Biology
The Tree of Life
DNA -> DNA (molecular evolution)
DNA -> RNA -> protein (molecular cell biology)

Neutral drift: a genetic change that does not affect the organism
• Mutations occur all the time in protein-coding regions; some do not change the protein, so they do not affect the fitness of the organism
• Changing the third DNA base in this codon does not change the amino acid it encodes, alanine (A)
Browser: Kent et al.; conservation track: Siepel and Rosenbloom

Negative selection: rejecting a change that decreases fitness
• Some mutations would change the protein and thereby reduce fitness
• Such changes are rejected by natural selection, and the DNA is conserved
Browser: Kent et al.; conservation track: Siepel and Rosenbloom

Positive selection: a genetic change that increases fitness
• Some mutations have a positive effect: this change from C to A in the gene FOXP2 changed the amino acid from threonine (T) to asparagine (N), which may have improved fitness
• Possible role in the evolution of speech
Browser: Kent et al.; conservation track: Siepel and Rosenbloom; FOXP2 results: Enard et al., Nature, 2002
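
The codon-level changes described in these slides (a third-base change that keeps alanine, the tryptophan codon TGG turning into a stop codon, and the C to A change that swaps threonine for asparagine) can be told apart by translating a codon before and after a single-base substitution. The following is a minimal sketch, assuming the standard genetic code. The function name classify_substitution is hypothetical, and the specific codons used (GCT for alanine, TGA as the resulting stop codon, ACC for threonine) are illustrative assumptions: the slides name only the amino acids and, for FOXP2, the C to A change.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each position.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, position, new_base):
    """Classify a single-base change in a codon (hypothetical helper).

    position is 0, 1, or 2 (0 is the first base of the codon).
    Returns (mutated_codon, amino_before, amino_after, kind).
    """
    codon = codon.upper()
    new_base = new_base.upper()
    if codon not in CODON_TABLE or new_base not in BASES or position not in (0, 1, 2):
        raise ValueError("expected a DNA codon, a position 0-2, and a base in TCAG")

    mutated = codon[:position] + new_base + codon[position + 1:]
    before = CODON_TABLE[codon]
    after = CODON_TABLE[mutated]

    if before == after:
        kind = "synonymous"   # protein unchanged: a candidate for neutral drift
    elif after == "*":
        kind = "nonsense"     # new stop codon: truncates the protein
    elif before == "*":
        kind = "stop-lost"    # a stop codon became an amino acid
    else:
        kind = "missense"     # amino acid changed
    return mutated, before, after, kind


if __name__ == "__main__":
    # Third-base change in an alanine codon (GCT is an assumed example codon).
    print(classify_substitution("GCT", 2, "C"))
    # Tryptophan codon TGG becoming a stop codon (TGA is an assumed outcome).
    print(classify_substitution("TGG", 2, "A"))
    # C to A at the second base of a threonine codon (ACC is an assumed codon).
    print(classify_substitution("ACC", 1, "A"))
```

The classification says only what a change does to the encoded protein. Whether a missense change is rejected by negative selection or favored by positive selection depends on its effect on fitness, which the codon table alone cannot tell us.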