Trees & Topologies
Chapter 3, Part 2
2/24/2009, COMP 790

A simple lineage
• Consider a given gene in a sample of size n.
• How long does it take before this gene coalesces with another gene in the sample?
[Figure: plot over 1 to 20 on the horizontal axis; vertical axis from 0 to 2.5]

Single Lineage
• How many events pass before it coalesces with another gene?
[Figure: plot over 1 to 20 on the horizontal axis; vertical axis from 0 to 8]

Disjoint subsamples
Consider a sample of size n that is divided into two disjoint subsamples, A and B, of sizes k and n-k, respectively.

Disjoint Subsamples (cont'd)
• The probability that all genes in A find a MRCA before coalescing with any gene in B is: [Equation]
• The probability that one of the two samples finds a MRCA before coalescing with members of the other sample is: [Equation]

Jump Process of Disjoint Subsamples
• Jump processes:
  – (i,j) -> (i-1,j) with probability (i+1)/(i+j)
  – (i,j) -> (i,j-1) with probability (j-1)/(i+j)
• The process starts in (k, n-k) and continues until it reaches (1,j) for some j. It eventually jumps to (0,j) for some j and finally reaches (0,1), where 0 denotes that sample A has been fully absorbed into B.

Disjoint Subsamples Example
Gene tree of the PDHA1 gene from a sample of Africans and non-Africans.

A sample partitioned by a mutation
Now consider a sample of size n where a polymorphism divides the sample into two disjoint subsamples, A and B, of sizes k and n-k, respectively.
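
The jump process for disjoint subsamples (start in (k, n-k), jump probabilities (i+1)/(i+j) and (j-1)/(i+j)) can be explored numerically. The following is a minimal Python sketch and not part of the original slides. The function names are hypothetical, the jump probabilities are taken exactly as stated on the slide, and `prob_a_mrca_first` assumes the standard coalescent, in which every pair of lineages is equally likely to coalesce next.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


def jump_probs(i, j):
    """Jump probabilities from state (i, j) as stated on the slide (i, j >= 1)."""
    to_a = Fraction(i + 1, i + j)  # (i, j) -> (i-1, j)
    to_b = Fraction(j - 1, i + j)  # (i, j) -> (i, j-1)
    return to_a, to_b


def b_lineages_when_a_has_mrca(k, n):
    """Exact distribution of j in the first state (1, j) reached from (k, n-k)."""
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    mass = {(k, n - k): Fraction(1)}
    # Each jump removes one lineage, so visit states by decreasing i + j.
    for total in range(n, 2, -1):
        for i in range(2, k + 1):
            j = total - i
            p = mass.pop((i, j), None)
            if p is None:
                continue  # state not reachable (or not a valid state)
            to_a, to_b = jump_probs(i, j)
            mass[(i - 1, j)] = mass.get((i - 1, j), 0) + p * to_a
            if to_b:  # to_b is zero when j == 1
                mass[(i, j - 1)] = mass.get((i, j - 1), 0) + p * to_b
    # Only states of the form (1, j) remain.
    return {j: p for (_, j), p in mass.items()}


@lru_cache(maxsize=None)
def prob_a_mrca_first(i, j):
    """P(i A-lineages find a MRCA before any of them coalesces with one of
    j B-lineages), by recursion on the unconditioned coalescent (assumption:
    all pairs of lineages are equally likely to coalesce next)."""
    if i == 1:
        return Fraction(1)
    pairs = comb(i + j, 2)
    # Next event is a coalescence within A ...
    p = Fraction(comb(i, 2), pairs) * prob_a_mrca_first(i - 1, j)
    # ... or within B; a coalescence between A and B contributes zero.
    if j >= 2:
        p += Fraction(comb(j, 2), pairs) * prob_a_mrca_first(i, j - 1)
    return p
```

For k = 2 and n = 4, `b_lineages_when_a_has_mrca(2, 4)` assigns probability 3/4 to j = 2 and 1/4 to j = 1, and `prob_a_mrca_first(2, 2)` is 2/9. Under the stated assumption, multiplying the unconditioned within-A coalescence probability by `prob_a_mrca_first(i - 1, j) / prob_a_mrca_first(i, j)` reproduces (i+1)/(i+j), which suggests the slide's jump process is the coalescent conditioned on A finding its MRCA first.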

Comparing the mean values
Jump processes:
• (i,j) -> (i-1,j) with probability i/(i+j-1)
• (i,j) -> (i,j-1) with probability (j-1)/(i+j-1)

Unknown ancestral state
• If we do not know which of the two alleles is older, the situation is slightly different.
• The probability that an allele found in frequency k out of n genes is the older one is k/n.
• The probability that A carries the mutant allele is 1-k/n = (n-k)/n.
• The jump processes become:
  – (i,j) -> (i-1,j) with probability i/(i+j)
  – (i,j) -> (i,j-1) with probability j/(i+j)

The age of the MRCA for two sequences
Now consider the situation of two sequences with S2 = k segregating sites.

Probability of going from n ancestors to k ancestors
• Probabilities of different numbers of ancestors, starting with seven ancestors at time 0.

Probability of going from n ancestors to k ancestors
Probabilities of different numbers of ancestors, starting with seven ancestors at time 0 and ending with 4 ancestors at a different time.

Probability of going from n ancestors to k ancestors
Probability that a sample of three genes has two ancestors at time r.

Questions?
• Slides are available on the Wiki at: http://compgen.unc.edu/Courses/index.php/Comp_790-087
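
As a supplement to the slides on going from n ancestors to k ancestors, the probabilities can be computed numerically. This is a minimal Python sketch and not part of the original slides. It assumes the standard coalescent with time measured in units where each pair of lineages coalesces at rate 1, so that m ancestors drop to m-1 at rate m(m-1)/2. The function name `ancestors_prob` is hypothetical, and the partial-fraction formula used is the general one for a pure-death process with distinct rates.

```python
from math import comb, exp, prod


def ancestors_prob(n, k, t):
    """P(k ancestors at time t | n ancestors at time 0).

    Assumption: pure-death process in which m ancestors become m-1
    at rate C(m, 2), with t in coalescent time units.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    if t < 0:
        raise ValueError("need t >= 0")
    # Rates for states k..n; C(1, 2) = 0, so state 1 is absorbing.
    rate = {m: comb(m, 2) for m in range(k, n + 1)}
    # Product of the rates of the n - k jumps that must occur.
    scale = prod(rate[m] for m in range(k + 1, n + 1))
    total = 0.0
    for j in range(k, n + 1):
        # Partial-fraction term; the rates C(m, 2) are all distinct.
        denom = prod(rate[m] - rate[j] for m in range(k, n + 1) if m != j)
        total += exp(-rate[j] * t) / denom
    return scale * total


if __name__ == "__main__":
    # Distribution of the number of ancestors of seven genes at t = 0.5.
    for k in range(7, 0, -1):
        print(k, round(ancestors_prob(7, k, 0.5), 4))
```

For three genes and two ancestors, the sum reduces to (3/2)(e^(-t) - e^(-3t)), which gives a quick check on the general formula.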