Supplementary Materials

1) The relationship between sequence variability and probability of site patterns

To investigate the quantitative relationship between sequence variability and the probability of site patterns, we consider a three-taxon star tree (1, 2, 3) in which every branch has the same length b. On this tree we simulated 10,000 amino acid sites under the WAG model for a range of branch lengths. We then calculated the probabilities of occurrence of the three site patterns AAA, AAB and ABC, where A, B and C stand for any of the 20 amino acid residues and are mutually different. Several quantities measuring the differences in probability between the site patterns are plotted against branch length in Figure S1. For branch lengths less than 2, the probability of the identical pattern AAA is on average higher than that of AAB, which in turn is on average higher than that of the most variable pattern ABC. As the branches get longer, the probability differences among the site patterns gradually diminish to 0.

Fig. S1 Differences in the probabilities of occurrence of the site patterns AAA, AAB and ABC under the WAG model for a star tree (1:b, 2:b, 3:b), as a function of the branch length b.

2) The correlation between the minimum number of differing nucleotides in codons for a pair of amino acids and its amino acid exchangeability score

Figure S2 shows box plots of the WAG exchangeabilities (Whelan S, Goldman N: Mol Biol Evol 18:691-699, 2001) as a function of the minimum number of nucleotide differences between the codons of the two amino acids. The 75 amino acid pairs whose codons can differ by a single nucleotide have significantly higher WAG scores than the 101 pairs whose codons differ by at least two nucleotides, which in turn have higher WAG scores than the 14 pairs whose codons differ at all three nucleotides.

Fig. S2 Distribution of the WAG exchangeabilities for pairs of amino acids.
Pairs of amino acids are binned according to the minimum number of nucleotide changes required to get from one to the other.

3) Comparing the power of SLL in the six simulation cases using different codon frequencies when analyzing the data under M8A

To determine thresholds for the SLL test, a null distribution is obtained from a large number of codons (e.g., 20,000) simulated under neutral evolution using the M8A model, with parameter values taken from the analysis of the original data under M8A. Two commonly used codon frequency models can be used in M8A: (1) the F0 model, with equal codon frequencies (all 1/61); (2) the F3x4 model, with codon frequencies expected from the nucleotide frequencies at the three codon positions. The following table shows the power of SLL under the two codon frequency models for the six cases of positive selection introduced in the main text.

Table S1 Power of the SLL tests using different codon frequencies in the M8A simulations.
Entries are the number of positive sets predicted when simulating under M8A with the indicated codon frequency model (F0 or F3x4).

                                              F0                                                     F3x4
Positive selection case (a)                   Binomial test*  Simes test  Combined and unique sets*  Binomial test*  Simes test  Combined and unique sets*
1) Original conditions from Lysin data        100 (100)       73          100 (100)                  100 (100)       63          100 (100)
2) Branch lengths increased 10 fold           72 (97)         13          73 (97)                    74 (98)         9           74 (98)
3) Branch lengths decreased 10 fold           67 (96)         31          79 (96)                    58 (96)         28          71 (96)
4) Weak conditions 1 (ω3 = 1.5, p3 = 0.05)    16 (76)         13          27 (80)                    4 (77)          12          16 (80)
5) Weak conditions 2 (ω3 = 1.5, p3 = 0.269)   71 (98)         16          75 (98)                    44 (89)         16          50 (90)
6) Concatenated 3 M0 datasets                 10 (57)         5           15 (60)                    3 (32)          2           5 (32)

(a) For each case, 100 datasets, each of 200 codon sites, were simulated and analyzed.
* The first number is based on θc = 0.075, corresponding to the standard site-wise test at α = 0.05; the second number, in brackets, is based on θc = 0.05, corresponding to a site-wise test at α = 0.03 (see main text for details).
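
As a rough illustration of the two codon frequency models named in section 3, the Python sketch below builds F0 frequencies (1/61 for each sense codon) and F3x4 frequencies (the product of the position-specific nucleotide frequencies). It assumes the standard genetic code with its three stop codons excluded, and it assumes that the F3x4 products are renormalized to sum to 1 over the 61 sense codons, a common convention that the text above does not spell out. The function names and the example nucleotide frequencies are hypothetical and are not taken from the Lysin data.

```python
from itertools import product

BASES = "TCAG"
# Assumption: standard genetic code, so three stop codons are excluded.
STOP_CODONS = {"TAA", "TAG", "TGA"}
SENSE_CODONS = [
    "".join(triplet)
    for triplet in product(BASES, repeat=3)
    if "".join(triplet) not in STOP_CODONS
]


def f0_frequencies():
    """F0 model: every sense codon has the same frequency, 1/61."""
    n = len(SENSE_CODONS)
    return {codon: 1.0 / n for codon in SENSE_CODONS}


def f3x4_frequencies(pos_freqs):
    """F3x4 model: codon frequency proportional to the product of the
    nucleotide frequencies at codon positions 1, 2 and 3.

    pos_freqs is a list of three dicts mapping each base to its frequency
    at that codon position. The products are renormalized over the sense
    codons (an assumed convention, see the lead-in).
    """
    raw = {
        codon: pos_freqs[0][codon[0]] * pos_freqs[1][codon[1]] * pos_freqs[2][codon[2]]
        for codon in SENSE_CODONS
    }
    total = sum(raw.values())
    return {codon: value / total for codon, value in raw.items()}


# Hypothetical position-specific nucleotide frequencies, for illustration only.
EXAMPLE_POS_FREQS = [
    {"T": 0.20, "C": 0.20, "A": 0.35, "G": 0.25},
    {"T": 0.25, "C": 0.25, "A": 0.30, "G": 0.20},
    {"T": 0.30, "C": 0.30, "A": 0.15, "G": 0.25},
]

if __name__ == "__main__":
    f0 = f0_frequencies()
    f3x4 = f3x4_frequencies(EXAMPLE_POS_FREQS)
    for codon in ("ATG", "TTT", "GGG"):
        print(codon, round(f0[codon], 5), round(f3x4[codon], 5))
```

With uniform nucleotide frequencies at all three positions, F3x4 reduces to F0, which makes a convenient sanity check for the sketch.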