Methods

Data

We used whole-genome multiple alignments of the genomes of 8 vertebrates with that of H. sapiens (hg19 assembly), and of 7 insects with that of D. melanogaster (dm3), obtained from the UCSC genome database [11]. These data sets were chosen to represent the highest number of clades that branch off the lineage of H. sapiens (D. melanogaster), in order to maximize our resolution for mapping the initial (A→B) replacements onto this lineage (Figure 1); a single species with the highest coverage was chosen from each such clade, and only species with coverage > 6 were used. Using the canonical splicing variants of 21,018 UCSC hg19 KnownGenes [12] for vertebrates, or of 13,300 FlyBase genes (BDGP release 5) [13] for insects, we extracted the alignment slices of the protein-coding regions of orthologous genes. Single-nucleotide polymorphism data were obtained from dbSNP (release 134) [14] for human, and from the Drosophila Genetic Reference Panel website [15] for D. melanogaster. Codon sites with gaps or missing data in any of the species were excluded from analysis. The total numbers of genes and codons analyzed are given in Table 1. Lengths of segments of phylogenetic trees were taken from the UCSC Genome Bioinformatics Site. All lengths are measured in the genome-average numbers of nucleotide replacements per site.

Analysis

The nucleotides at the internal nodes of the phylogenies were reconstructed using maximum likelihood as implemented in the codeml program of the PAML package [16]. The results obtained using maximum parsimony were similar (data not shown). We mapped the nucleotide replacements to the phylogenetic trees as follows: whenever the nucleotides ascribed to two neighboring nodes differed, a nucleotide replacement was inferred to have occurred on the edge connecting these two nodes. Along the lineage of H. sapiens (D. melanogaster), a codon was allowed to carry no nonsynonymous replacements other than the initial replacement (A→B) at one of the five (four) internal segments of this lineage (Figure 1) and, for the calculation of F_d, the reversal (B→A) or orthogonal replacement (B→C) at the terminal segment of this lineage. Throughout the paper, A and B refer to the two amino acids separated by a single nucleotide substitution in the second position of a codon, and C is either of the one or two remaining amino acids different from both A and B and separated from them by a single nucleotide substitution in the second position of the codon. Codons with nonsynonymous replacements beyond those two, i.e. with nonsynonymous replacements in the first or the third position of the codon, were not analyzed.

For the analysis of polymorphisms, we then counted the numbers of second-position nucleotides that experienced a replacement in one of the internal segments of the H. sapiens (D. melanogaster) lineage and are currently polymorphic in H. sapiens (D. melanogaster). More precisely, for each of the considered ages of the initial replacement, i.e. for each of the considered internal segments where such a replacement has occurred (Figure 1), we calculated the frequency of polymorphisms that restore the ancestral amino acid as follows:

$$F_p(B \to A) = \frac{\mathrm{poly}(B \to A) \mid \mathrm{initial}(A \to B)}{\mathrm{initial}(A \to B)},$$

and the frequency of polymorphisms that give rise to an amino acid different from the ancestral one as follows:

$$F_p(B \to C) = \frac{\mathrm{poly}(B \to C) \mid \mathrm{initial}(A \to B)}{\mathrm{initial}(A \to B)}.$$

Here, initial(A→B) is the number of A→B replacements inferred to have happened at this internal segment; poly(B→A) and poly(B→C) are the numbers of codons with a B→A or B→C polymorphism, respectively, such that the ancestral state for the polymorphism is B; and the vertical line means that the frequencies of such polymorphisms are measured at the sites of the inferred A→B replacement.
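The edge-mapping rule and the A, B, C notation used above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the representation of a lineage as a list of reconstructed codons ordered from the oldest node to the tip, and the outcome labels ("none", "reversal", "orthogonal") are all assumptions made for illustration, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def replacements_along_lineage(node_codons):
    """Infer a replacement on every edge whose two neighboring nodes differ.

    node_codons: reconstructed codons at successive nodes of one lineage,
    ordered from the oldest node to the tip (hypothetical input format).
    Returns a list of (edge_index, parent_codon, child_codon).
    """
    pairs = zip(node_codons, node_codons[1:])
    return [(i, p, c) for i, (p, c) in enumerate(pairs) if p != c]


def classify_site(node_codons):
    """Return (initial_edge, outcome) for an analyzable codon, else None.

    outcome is "none", "reversal" (B->A) or "orthogonal" (B->C), and refers
    to the terminal edge. Synonymous changes are ignored; any nonsynonymous
    change not confined to the second codon position excludes the codon.
    """
    terminal_edge = len(node_codons) - 2
    nonsyn = []
    for edge, parent, child in replacements_along_lineage(node_codons):
        if CODON_TABLE[parent] == CODON_TABLE[child]:
            continue  # synonymous replacement
        changed = [k for k in range(3) if parent[k] != child[k]]
        if changed != [1]:
            return None  # nonsynonymous change outside the second position
        nonsyn.append((edge, CODON_TABLE[parent], CODON_TABLE[child]))

    # Exactly one initial A->B replacement is required on an internal edge.
    if not nonsyn or nonsyn[0][0] == terminal_edge:
        return None
    # At most one further nonsynonymous change, and only on the terminal edge.
    if len(nonsyn) > 2 or (len(nonsyn) == 2 and nonsyn[1][0] != terminal_edge):
        return None

    initial_edge, ancestral_aa, _ = nonsyn[0]
    if len(nonsyn) == 1:
        return (initial_edge, "none")
    new_aa = nonsyn[1][2]
    return (initial_edge, "reversal" if new_aa == ancestral_aa else "orthogonal")


# Toy lineage: Ala (GCT) -> Val (GTT) on edge 0, back to Ala on the terminal edge.
example = classify_site(["GCT", "GTT", "GTT", "GCT"])
```

In this toy case `example` is `(0, "reversal")`: the initial A→B replacement is placed on the first internal edge, and the terminal edge restores the ancestral amino acid.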
The ancestral state for polymorphism was determined as the allele carried by the reference genome (hg19 or dm3); for insects, it was additionally required that at least 50% of the D. melanogaster genotypes carry the ancestral allele B.

The replacements were analyzed similarly, except that instead of the polymorphism in H. sapiens (D. melanogaster), we used the number of replacements in the H. sapiens (D. melanogaster) lineage after its divergence from M. musculus (D. sechellia), as inferred by codeml. More precisely, for each of the five (four) internal segments of the H. sapiens (D. melanogaster) lineage, the following statistics were defined:

$$F_d(B \to A) = \frac{\mathrm{terminal}(B \to A) \mid \mathrm{initial}(A \to B)}{\mathrm{initial}(A \to B)},$$

$$F_d(B \to C) = \frac{\mathrm{terminal}(B \to C) \mid \mathrm{initial}(A \to B)}{\mathrm{initial}(A \to B)}.$$

Here, terminal(B→A) and terminal(B→C) are the numbers of B→A and B→C replacements, respectively, in the H. sapiens (D. melanogaster) lineage after its divergence from M. musculus (D. sechellia).
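Both F_p and F_d are conditional frequencies of the same form: for each internal segment, the number of sites showing a B→A (or B→C) event is divided by the number of sites with an inferred initial A→B replacement on that segment. The following Python sketch shows this calculation under stated assumptions. The function name, the input format (one `(segment, outcome)` pair per analyzed site) and the outcome labels are hypothetical, and the counts in the usage example are toy values, not data from this study.

```python
from collections import Counter


def conditional_frequencies(sites):
    """Compute per-segment frequencies of B->A and B->C events.

    sites: iterable of (segment, outcome) pairs, one per codon site with an
    inferred initial A->B replacement on `segment`; outcome is "reversal"
    (B->A), "orthogonal" (B->C) or "none" (hypothetical labels).
    Returns {segment: (F(B->A), F(B->C))}.
    """
    initial = Counter()     # initial(A->B) per segment
    reversal = Counter()    # B->A events at those sites
    orthogonal = Counter()  # B->C events at those sites
    for segment, outcome in sites:
        initial[segment] += 1
        if outcome == "reversal":
            reversal[segment] += 1
        elif outcome == "orthogonal":
            orthogonal[segment] += 1
    return {
        seg: (reversal[seg] / n, orthogonal[seg] / n)
        for seg, n in initial.items()
    }


# Toy input: four sites on segment 0 and two sites on segment 1.
toy_sites = [
    (0, "reversal"), (0, "orthogonal"), (0, "none"), (0, "none"),
    (1, "reversal"), (1, "none"),
]
freqs = conditional_frequencies(toy_sites)
```

With these toy values, `freqs[0]` is `(0.25, 0.25)` and `freqs[1]` is `(0.5, 0.0)`. Feeding the function polymorphism outcomes yields F_p, and feeding it terminal-segment replacement outcomes yields F_d.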