Human Evolution Timelines
[Figure: two timelines, Research History and Evolution History.]
Source: Jobling, Hurles & Tyler-Smith (2004) Human Evolutionary Genetics.

Human Evolution
The Data
• Genetic: allele frequencies, SNPs, haplotypes
• Non-genetic: language, culture, pets, pathogens, ...
The Dynamics
• Mutation, selection, recombination, ...
The Genealogical Structure
• Phylogeny, Ancestral Recombination Graph, pedigree
• Relationship to the great apes, ancestral population of the human/chimp ancestor, Out of Africa
• Ancestral population structure, selection, migrations & age of alleles
• Genealogies: Iceland, models of pedigrees
• Languages & pathogens

Populations & Basic Genealogical Structures
• Pedigree: traces the ancestry of individuals.
• Phylogeny: traces the ancestry of sequence points.
• ARG: traces the ancestry of sequences.
[Figure: three generations labelled Grandparents, Parents and Now.]
Other genealogical structures are possible: network, language merging, population splitting.

Recombination
[Figure: recombination and gene conversion in 1 meiosis.]
• Total haploid map length: males 25.9 M, females 44.6 M.
• Gene conversions are 1-2 orders of magnitude more frequent. Length 300-2000 bp.
Lander et al. (2001) “Initial sequencing and analysis of the human genome” Nature 409: 860-912.
Kong, A. et al. (2002) “A high resolution recombination map of the human genome” Nature Genetics.

Mutations and Mutation Rates
[Figure: a sequence copied over 1 mitosis or generation.]
Mutation rates:
• Single nucleotide substitutions: ~10^-7
• Microsatellites (~100,000): ~10^-2
• Small insertions/deletions: ~10^-8
Average number of mitoses:
• Per male generation: depends on age (age:mitoses, 15:35 .. 20:150)
• Per female generation: ~24
Selection: positive & negative.
Crow, J.F. (2000) “The Origins, Patterns and Implications of Human Spontaneous Mutation” Nature Reviews Genetics 1(1): 40-47.
Strachan and Read (2004) chapter 11.
Jobling, Hurles and Tyler-Smith (2004) chapter 2.

Coalescent Issues
1. The number of genetic ancestors
2. When gene-trees differ from species trees
3. Out of Africa
4. Ages of Alleles
5. Allele Gradients
6. Number of Genetic Ancestors
7. Selective Sweeps

Human History
Levels: Physical, Cultural & Genealogical
The physical population size, N(t), and the effective population size, Ne(t), are separate concepts.
i. N(t) can mainly be studied by historical/archaeological means.
ii. Ne(t) can be studied genealogically, for instance by tracing the ancestries of DNA sequences.
Main departures from the simplest population genetic models:
A. Long epochs of exponential growth at increasing rates.
B. Bottlenecks & small populations.
C. Migrations & geographical subdivisions.

Our Relationship to the Great Apes (from Nei, 2003)
[Figure: phylogeny of chimp, pygmy chimp, humans, gorilla and orangutan, with node ages of 13, 7, 5.5 and 1 Myr.]

Ancestral Population of Human and Chimp
[Figure: species tree of human (H), chimp (C) and gorilla (G), with splits at 7 Myr and 5 Myr before now.]
$P(\text{GeneTree} \neq \text{SpeciesTree}) = \frac{2}{3} e^{-t/2N_e}$
Here t is the time between the two splits, in generations, and N_e is the effective size of the ancestral population.
Example: Chen & Li (2001), 53 triads: 31 (H,C), 10 (H,G) & 11 (C,G).

Out-of-Africa and Different Degrees of Replacement
[Figure: three scenarios for Africa, Asia and Europe (total replacement, partial replacement, no replacement), each with time marks at 1-1.2 Myr and 80-130 Kyr.]
Example: Takahata (2001) found that the data could be explained by total replacement.

Allele Frequencies and Principal Components (Cavalli-Sforza, 2001)
• Allele frequencies for different localities are subjected to a smoothing procedure.
• Principal components are found and projected onto geographical maps.
• Strongly criticized (Sokal et al.): even data with no geographical structure will “look like” geographical structure, there is no timing of gradients, ...
1. Agriculture 6-10 Kyr
2. Greek colonisation 3 Kyr
3. Retraction of the Basques
4. Uralic people
5. Horse domestication

Time Slices
[Figure: a population of N sequences traced back in time, marking the time at which all positions have found a common ancestor and the time at which all positions have found a common ancestor on one sequence.]

Number of Genetic Ancestors to the Human Genome
[Figure: the ancestry of one sequence traced back in time through coalescence (C) and recombination (R) events.]
S_r: number of segments. $E(S_r) = 1 + r$.
Simulations: statements about the number of ancestors are much harder to make.
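A minimal numerical sketch of E(S_r) = 1 + r, assuming that r is the scaled recombination rate, that is 4Ne times the map length in Morgans. That scaling is an assumption here, as are the function names; the inputs are the chromosome 1 parameters quoted in this deck (4Ne = 20,000 and 263 cM).

```python
def scaled_recombination_rate(four_ne: float, map_length_cm: float) -> float:
    """Return r = 4Ne * map length in Morgans (assumed scaling, see lead-in)."""
    # 100 cM = 1 Morgan; multiply first so the division stays exact for these inputs.
    return four_ne * map_length_cm / 100.0


def expected_segments(r: float) -> float:
    """Expected number of ancestral segments, E(S_r) = 1 + r."""
    return 1.0 + r


if __name__ == "__main__":
    # Chromosome 1 parameters as quoted in the deck (hypothetical variable names).
    r_chr1 = scaled_recombination_rate(four_ne=20_000, map_length_cm=263)
    print(f"r = {r_chr1:.0f}")
    print(f"E(S_r) = {expected_segments(r_chr1):.0f}")
```

Under these assumptions r comes out at about 52,600, which is of the same order as the roughly 52,000 segments quoted for chromosome 1.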
Wiuf conjectured ~ r/ln(r).

Applications to Human Genome (Wiuf and Hein, 97)
Parameters used: 4Ne = 20,000; chromosome 1: 263 Mb, 263 cM.
• Chromosome 1: 52,000 segments, 6,800 ancestors.
• All chromosomes: 86,000 ancestors.
• Physical population: 1.3-5.0 million.
A randomly picked ancestor (ancestral material comes in batteries!):
[Figure: ancestral material of one ancestor along chromosome 1 at three zoom levels: 0-260 Mb (segments 0-52,000); magnified ×35 to 0-7.5 Mb (segments 6,890-8,360); magnified ×250 to 0-30 kb.]

Many Sampled Alleles Relative to Ne (Wakeley 2003, Pitman, Schweinsberg)
1. Simultaneous events.
2. Multifurcations.
3. Underestimation of coalescent rates.

Cystic Fibrosis (Wiuf 2000)
ΔF508: possibly maintained by heterosis (1.023), through higher resistance to Salmonella infections.
Data:
1. Frequency of the ΔF508 allele: 0.022.
2. Inter variability in 1,705 individuals: 46 variable positions.
3. Model of human demography.
Model parameters: mutation rate, heterosis advantage and an exponential growth model of human population expansion.
The age of ΔF508 is estimated to be *

Pedigree Issues
i. Icelandic pedigree
ii. Theoretical models
Sources:
• Chinese: http://demography.anu.edu.au/People/Staff/zhongwei.html
• Burke’s British Peerage: http://www.burkes-peerage.net/sites/wars/sitepages/home.asp
• Mormons: http://genealogy-mormons.com/
• Icelandic: http://www.decode.com and Helgason, A. et al. (2003 June) “A population-wide coalescent analysis of Icelandic matrilineal and patrilineal genealogies: Evidence for a faster evolutionary rate of mtDNA lineages than Y-chromosomes” American Journal of Human Genetics.
• Quebec French: Heyer and Tremblay, 1998 PNAS

Icelandic Genealogies (Helgason, 2003)
Of the 276,00 Icelanders (June 2002), the 131,060 born after 1972 were traced back.
[Figure: total genealogy, males-only genealogy and females-only genealogy, linking an ancestor cohort (1848-1892) to the contemporary cohort (1972-2002).]

Icelandic Genealogies (Helgason, 2003)
Ancestors to the 1972 cohort along backtraceable matrilines and patrilines.
[Figure: percent of ancestors against number of descendants, for matrilines and patrilines, with ancestral cohorts born 1848-1892 and 1698-1742 and the descendant cohort born after 1972 (N = 64,150 and N = 66,910). Cohort sizes shown: N = 31,817, N = 31,659, N = 20,443 and N = 18,023. Generation counts shown: g = 4.3, 3.8, 7.9 and 8.8.]

Icelandic Genealogies (Helgason, 2003)
[Figure: age of parent (years) against birth year of individual, 1700-2000, for patrilines and matrilines.]
[Figure: average number of offspring against birth year of parent, 1700-2000, for patrilines and matrilines.]
• Variation in annual offspring number is greater for females than for males, due to shorter generation time.
• Positive correlation in fertility between parent and offspring.

Finding (Great)^k Grandparents: Finding Ancestral Individuals
Joe Chang, 1999 Dec., Adv. Appl. Prob.
[Figure: generations 0 (NOW) to 11 back in time.]
Let T be the time when somebody was everybody’s ancestor.
Chang’s result: $\lim_{N \to \infty} T / \log_2(N) = 1$ (prob. 1).

Combining Ancestral Individuals and the Coalescent
Finding common ancestors (Wiuf & Hein, 2000).
Unify the two processes:
• Sample more individuals.
• Let each have p parents (p possibly stochastic, >= 1).
Result: a discontinuity at p = 1. For p > 1, $\log_2$ changes to $\log_p$.
Comment: genetic ancestors are a vanishing set within genealogical ancestors.

Derrida
[Figure: generations G and G+1.]
Offspring distribution for a marriage: $p_k$; m is the expected offspring number.
Recursion: $w_\alpha(G+1) = \frac{1}{2} \sum_{\alpha' \text{ children of } \alpha} w_{\alpha'}(G)$
where α is an individual (an ancestor in the tree) and w is its weight, the probability that a uniformly random path leads to α.
Initialization: $w_\alpha(0) = \delta_{\alpha,\beta}$, where β is the individual whose ancestry is traced.

Kammerle 89: Pair Moran Model
A pair of children is born; they choose parents randomly. A pair is erased and the pair of children takes its place.
A. The stationary distribution of the number of ancestors to the present population is hypergeometric:
$\pi_i = \binom{N}{i} \binom{N}{i-1} \Big/ \binom{2N}{N-1}$
B. If $R_N \sim (\pi_i)_{i=1,\dots,N}$, then $(R_N - N/2)/\sqrt{N} \to N(0, 1/8)$.
C. Let $\hat{R}_N(t) := (R(t) - N/2)/\sqrt{N}$ and $S_N(t) := \hat{R}_N(t)$. Then $S_N(t)$ converges to an Ornstein-Uhlenbeck process with infinitesimal variance $\sigma^2(y) = 1/4$ and infinitesimal drift $\mu(y) = -y$.

Non-Contributing Ancestors (Kevin Donnelly, 1983, TPB)
[Figure: chromosomes 1 .... 22, x, y passed down the generations with and without recombination.]
• Generation 0: 1 ancestor; 46 packets.
• Generation 1: 2^1 ancestors; no recombination: 46 packets; recombination: < ≈ 72 + 46 packets.
• Generation k: 2^k ancestors; no recombination: 46 packets; recombination: < ≈ k*72 + 46 packets.

Non-Contributing Ancestors (Yun Song, pers. comm., 2003; Kevin Donnelly, 1983)
The probability
1. of any non-contributing ancestor;
2. that a randomly chosen ancestor is non-contributing.
[Figure: both probabilities plotted against 1, 2, 4, 8, 16, 32, 64, 128, 256 and 512.]

Pedigree Inference
Prior on pedigrees
Three processes:
1. Choosing parents
2. Recombination
3. The mutational process
Probability of data given pedigree
Posterior on pedigrees
[Figure: pedigree with mother and father.]
• Elston-Stewart (1971): temporal peeling algorithm
• Lander-Green (1987): genotype scanning algorithm

Inheritable Phenomena
• Genetic material: sequences, “allele frequencies”
• Language
• Culture
• Pathogens
• Pests
• Pets
• Morphological characters

Pathogen Phylogenies (Falush 2003)
• Helicobacter pylori is transmitted from mother to child.
• Falush et al. sequenced 8 genes from 370 strains from 27 populations, 3850 nucleotides each.
• 5 ancestral populations: East Asia, Euro1, Euro2, Afr1, Afr2.
• Structure assigns each polymorphism to an ancestral population.
• American Indians are grouped as Asian, showing that H. pylori infection is ancient.
• Diversity of H. pylori is 50 times larger than that of humans.
• Much recombination, i.e. positions can be treated as independent.
A. Maori is East Asian.
B. Inuit is Euro1 + Euro2.
C. South African is Afr2.
D. English

Cavalli-Sforza: Language Trees
Cavalli-Sforza (1997) Genes, Peoples and Languages. PNAS 94: 7719-24.
Principle of comparison:
• Loss of cognates (“homologous” words).
• Syntax comparison.
• Sound use.
• Reconstruction (dependent on interpretation): stretches back 2,000-6,000 years, dependent on criteria.

Historical Linguistics
• William Jones (1786) observes similarities between Sanskrit, Greek & Latin.
• Swadesh (1952) makes one of the first glottochronological studies.
• Kruskal, Dyen & Black (1971): large successful investigation.
Principles:
• Distance: Swadesh’s rule, 20% lost per millennium.
• Parsimony
• Compatibility
• Likelihood
Criticisms:
• Word loss is not clocklike.
• Languages merge and borrow, giving non-tree-like structure.
• Not much research goes into this area.

Global Phylogeny (Cavalli-Sforza, 2001; Ruhlen, 1994)
[Figure: global tree of language families.]
Families shown: Khoisan, Niger-Kordofanian, Nilo-Saharan, Afro-Asiatic, Kartvelian, Dravidian, Indo-European, Uralic, Altaic, Eskimo-Aleut, Chukchi-Kamchatkan, Amerind, Na-Dene, Sino-Tibetan, Caucasian/Basque/Burushaski, Austronesian, Austro-Tai, Daic, Miao-Yao, Austro-Asiatic, Indo-Pacific, Australian.
Dated groupings: Homo sapiens sapiens 100-70 Kyr; Asian 70-50 Kyr; Eurasian 60-40 Kyr; Eurasian/American 40-20 Kyr; Dene-Caucasian 40-20 Kyr; Eurasiatic 20-10 Kyr.
Other groupings: African, Congo-Saharan, Austric, Pacific.

Indo-European Language Trees (Dyen, Kruskal & Black, 1992; Piazza, Cavalli-Sforza, 2001)
[Figure: tree with a time axis running from 9000 to 1000.]
Languages shown:
• Afghan, Baluchi, Persian, Ossetic
• Bengali, Hindi, Punjabi, Marathi, Nepali, Kashmiri, Singhalese
• Celtic: Welsh, Irish, Breton
• Slavic: Bulgarian, Macedonian, Belorussian, Ukrainian, Polish, Czech, Russian, Serbo-Croatian, Slovenian
• Latvian, Lithuanian
• Romance: Walloon, French, Italian, Ladin, Portuguese, Spanish, Sardinian, Rumanian
• Germanic: Danish, Swedish, Riksmaal, Faroese, Icelandic, Dutch, German, Frisian, English
• Greek, Armenian, Albanian

Germanic Language Trees (from Embleton, 1986)
[Figure: tree with dated splits.]
Languages shown: Swedish, Danish, Norwegian, Faroese, Icelandic, English, Tok Pisin, Frisian, Flanders, Afrikaans, Dutch, Yiddish, Hamburg, Lower Saxony, Pennsylvanian German, German.
Split dates shown on the tree: 194, 246, 1025, 1051, 1234, 1239, 1423, 1476, 1540, 1558, 1668, 1803, 1842.

TrS The Coalescent & Human Evolution (11.6.04)
Human History
• Methodological problems: reconstructing haplotypes, defining haplotype blocks + HapMap.
• Relationship to the great apes, ancestral population of the human/chimp ancestor, Out of Africa, the Neanderthal.
• Human population growth, ancestral population structure, selection, migrations & age of alleles.
• SNPs, haplotypes, recombination hotspots & haplotype blocks.
• Individual stories: mitochondria, Y, autosomal chromosomes & alleles.
Empirical Genealogies
• Iceland
• Other genealogical issues: genealogical ancestors, genetic and non-contributing ancestors
Heritable Characters
• Languages
• Associated animals, plants & pathogens
• Surnames
• Morphological characters
The Role of Coalescent Theory
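
As one concrete instance of the role coalescent theory plays, the gene-tree/species-tree formula from the human/chimp slide, P = (2/3) e^(-t/2Ne), can be evaluated and inverted numerically. The sketch below is illustrative only: the function names are hypothetical, the 20-year generation time is an assumption not taken from the slides, t = 2 Myr is read off the 7 Myr and 5 Myr splits on that slide, and the triad counts are used exactly as listed there (31, 10 and 11).

```python
import math


def mismatch_probability(t_generations: float, ne: float) -> float:
    """P(gene tree != species tree) = (2/3) * exp(-t / (2 * Ne))."""
    return (2.0 / 3.0) * math.exp(-t_generations / (2.0 * ne))


def ne_from_mismatch(p_mismatch: float, t_generations: float) -> float:
    """Invert the formula above: Ne = -t / (2 * ln(3p/2)), valid for 0 < p < 2/3."""
    if not 0.0 < p_mismatch < 2.0 / 3.0:
        raise ValueError("mismatch probability must lie strictly between 0 and 2/3")
    return -t_generations / (2.0 * math.log(1.5 * p_mismatch))


if __name__ == "__main__":
    # Triad counts as listed on the slide: (H,C) matches the species tree.
    concordant, discordant = 31, 10 + 11
    p_hat = discordant / (concordant + discordant)

    # Assumptions: 2 Myr between the two splits, 20 years per generation.
    t_gen = 2_000_000 / 20

    ne_hat = ne_from_mismatch(p_hat, t_gen)
    print(f"observed mismatch fraction: {p_hat:.3f}")
    print(f"implied ancestral Ne: {ne_hat:,.0f}")
```

With these inputs the implied ancestral effective size is on the order of 10^5. The value is sensitive to the assumed generation time and split interval, so it should be read as an order-of-magnitude illustration rather than an estimate.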