Evolutionary constraint facilitates interpretation of genetic variation in resequenced human genomes

David L. Goode, Gregory M. Cooper, Jeremy Schmutz, Mark Dickson, Eidelyn Gonzales, Ming Tsai, Kalpana Karra, Eugene Davydov, Serafim Batzoglou, Richard M. Myers, Arend Sidow

Supplemental Material

Supplemental Discussion

SELECTION AGAINST VARIATION IN CONSTRAINED ELEMENTS AND AT CONSTRAINED SITES

An initial indication of the effectiveness of using evolutionary constraint for evaluating the discovered variation is the significant underrepresentation of SNPs within constrained elements (CEs) compared to the flanking neutral regions (hypergeometric P = 1.26 x 10^-4; Supplemental Table 3). Variation in CEs also exhibits a significantly lower average derived allele frequency (DAF) (Wilcoxon P = 1.97 x 10^-9; Supplemental Table 4) compared to neutral regions. CEs are the usual unit of prediction in comparative sequence analysis (Cooper et al 2005; Siepel et al 2005; Asthana et al 2007; Margulies et al 2007; The ENCODE Consortium 2007), but classifying variation at the element level, rather than the single-nucleotide level, does not optimally leverage the signal of evolutionary constraint. Sites within CEs can be under very different levels of constraint (exemplified by synonymous versus nonsynonymous sites within coding exons), and many constrained sites reside outside of predicted CEs (Asthana et al 2007; Supplemental Table 3). We therefore conducted all analyses by classifying variation according to site-specific estimates of evolutionary constraint produced by GERP. For each site in an alignment, GERP calculates a “Rejected Substitution” (RS) score, so called because it is the estimated amount of evolutionary variation that was ‘rejected’ by past negative selection (Supplemental Methods).
Classification of our 2214 SNPs by RS provides a substantially and significantly better signal of both depletion of SNPs and reduction in DAF at constrained sites than does classification by CEs (Supplemental Table 3), allowing us to extend our understanding of variation and constraint beyond previous studies showing that variation is suppressed in CEs (Drake et al 2006; Asthana et al 2007; Chen et al 2007; Katzman et al 2007). The reduction in DAF of alleles found at constrained sites is consistent across multiple annotation categories (Supplemental Figure S7).

Furthermore, negative selection acts against variation at all DAF levels, with the strength of selection increasing as DAF increases. We binned all 2214 SNPs according to their DAF and examined how the SNPs in each bin were distributed with respect to constrained sites. We found that the number of SNPs at constrained sites in every DAF bin was lower than expected by chance (Supplemental Figure S2), with the ratio of the observed to the expected number of SNPs decreasing as DAF increases. Even SNPs with a DAF < 0.125% (1 SNP in > 800 chromosomes) are under-represented at constrained sites, indicating that negative selection suppresses even the rarest variants in our sample. Furthermore, as the level of constraint increases, the strength of selection against variation in all DAF bins increases (compare the four lines in Supplemental Figure S2).

INDELS

The 124 insertions and deletions (indels) we detected in parallel with the SNPs also avoid constrained sites. The majority of indels are short, with only 14 (11.7%) having a length over 5 bp and 48.4% (60/124) of indels having a length of 1 bp. The longest indel is a 313 bp deletion of part of the 3’ UTR of the L10 ribosomal protein (RPL10), found in seven heterozygotes belonging to the Luhya population from Webuye, Kenya. We ascertained 92 derived alleles that delete a total of 693 bases.
Of these, 159 bases (22.9%) have RS > 1, a significant under-representation of constrained bases compared to the fraction of all sequenced bases that were constrained (51.1% with RS > 1; hypergeometric P = 8.45 x 10^-53). Derived alleles that constitute insertions have, by definition, no evolutionary constraint, and we therefore evaluated the constraint of the three bases on either side of each insertion. Of these 192 sites (representing the six neighbors of each of our 32 insertions), 73 (38.0%) are constrained at RS > 1. This is a significant underrepresentation (hypergeometric P = 1.57 x 10^-4), showing that insertions avoid constrained neighbors. We note that excluding coding exons from the indel analysis did not change the results. Given their relatively small numbers and the uncertainty in inferring their DAFs, indels were excluded from subsequent analyses.

POPULATION-SPECIFIC ANALYSES

The depletion of variants as RS increases is generally consistent among the five geographic groups in our sample (Supplemental Figure S3). However, the average DAF of Europeans and Asians is substantially higher than the DAF of Africans, with the difference decreasing as RS increases and vanishing at the most constrained positions (Supplemental Figure S4). This pattern is likely due to historical demographic effects such as bottlenecking during the migration out of Africa, during which rare alleles were lost and common alleles increased in frequency (Marth et al 2004; Boyko et al 2008; Lohmueller et al 2008). The magnitude of the effect is substantial and suggests that evaluation of variation that focuses only on common alleles systematically underestimates the amount of functional variation in Africans compared to other geographic groups.
The average number of derived alleles with RS > 1 carried by African individuals is virtually indistinguishable from that carried by European or Asian individuals (Supplemental Figure S5), and differences between the populations only emerge when genotypes are considered. African individuals carry more derived alleles in the heterozygous state (data not shown), whereas non-African individuals carry more derived alleles in the homozygous state (Supplemental Figure S6). Functional noncoding variation, which is present in our dataset in proportions that are roughly representative of the whole genome, therefore appears to behave similarly to coding variation with regard to its distribution among human populations (Lohmueller et al 2008).

FALSE DISCOVERY RATE ESTIMATES FOR CONSTRAINED NONCODING SITES

The sites we identify as constrained are a mixture of truly constrained sites and neutrally evolving sites that appear constrained by chance (Cooper et al 2005; Eddy 2005). If this false discovery rate (FDR) is dramatically higher for noncoding sequence than for coding sequence, this could explain the preponderance of noncoding SNVs we observe at constrained sites in an individual’s genome. To investigate this possibility, we applied an FDR of 0 to all coding sites; that is, we took every coding site with RS > 2 to be truly constrained. We then estimated the fraction of noncoding sites that could appear constrained by chance using the Poisson distribution. For each aligned site, the number of substitutions expected to be observed at that site (E) is equal to the neutral rate of evolution of the tree connecting the species present in the alignment at that site. For each discrete value of E found at the 1,301,261,424 noncoding sites having E > 2, we used the Poisson distribution with its rate parameter equal to E to calculate the probability of a site with that E having RS > 2 by chance.
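The chance-constraint calculation described above can be sketched as follows. This is an illustrative sketch rather than the authors' code: it assumes that the number of substitutions observed at a neutral site is Poisson-distributed with mean E, so that RS = E - observed exceeds the cutoff only when fewer than E - cutoff substitutions are seen. The function names and the toy histogram of sites per E value are hypothetical; only the genome-wide totals in the last lines are the figures reported in this section.

```python
import math


def prob_rs_above(expected_subs: float, cutoff: float) -> float:
    """P(RS > cutoff) at a neutral site, assuming observed ~ Poisson(expected_subs).

    RS = expected - observed, so RS > cutoff means observed < expected - cutoff.
    """
    limit = expected_subs - cutoff
    total = 0.0
    k = 0
    while k < limit:  # sum the Poisson pmf over all counts strictly below the limit
        total += math.exp(-expected_subs) * expected_subs ** k / math.factorial(k)
        k += 1
    return total


def expected_chance_sites(sites_per_e: dict, cutoff: float) -> float:
    """Sum over E values of (number of sites with that E) x P(RS > cutoff | E)."""
    return sum(n * prob_rs_above(e, cutoff) for e, n in sites_per_e.items())


def false_discovery_rate(chance_sites: float, observed_sites: float) -> float:
    """Fraction of apparently constrained sites that are expected by chance."""
    return chance_sites / observed_sites


# Hypothetical toy histogram: {E value: number of noncoding sites with that E}.
toy_sites_per_e = {2.5: 1000, 3.0: 2000, 4.0: 500}
print(expected_chance_sites(toy_sites_per_e, cutoff=2.0))

# Genome-wide totals reported in this section (chance sites, observed sites).
print(round(false_discovery_rate(164_146_071, 252_819_327), 2))  # RS > 2
print(round(false_discovery_rate(4_603_204, 35_428_081), 2))     # RS > 4
```

The last two lines reproduce the FDR figures quoted for the RS > 2 and RS > 4 cutoffs (roughly 65% and 13%).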
This probability was multiplied by the total number of sites with that E to get the number of sites with RS > 2 expected by chance for sites with that E. This led to a genome-wide estimate of 164,146,071 non-exonic sites having RS > 2 by chance, suggesting that up to 65% of all noncoding sites with RS > 2 (252,819,327) could appear constrained by chance. Thus, the number of “truly constrained” noncoding sites predicted by our Poisson modeling (88,673,256) represents only 2.8% of the genome. This number, and the fact that exons comprise only about 1.1% of the genome, suggest that a strict Poisson model may be too conservative.

We also obtained an estimate of FDR using a stricter cutoff for constraint (RS > 4). There are 35,428,081 non-exonic sites with RS > 4 in the genome, whereas only 4,603,204 are expected to be observed by chance. This gives an FDR of 13% for noncoding sites with RS > 4. This indicates that even if our inferences based on SNVs with RS > 2 are influenced by a high FDR at noncoding sites, the importance of noncoding and common alleles in human functional variation is demonstrated by the preponderance of noncoding and common variants at sites with RS > 4, where the FDR is much lower.

Supplemental Figures

Supplemental Figure S1: Mean derived allele frequency by RS score, for sites harboring SNVs (black diamonds) and sites one base 3’ of a SNV (blue circles) or one base 5’ of a SNV (green circles).

Supplemental Figure S2: The ratio of the observed number of SNPs at constrained sites to the number expected by chance for SNPs in six DAF bins, using four different RS cutoffs to define “constrained” sites.

Supplemental Figure S3: Ratio of the observed to the expected number of SNPs in each RS bin, in five geographically distinct populations. RS bins were obtained by ranking SNPs at positions with a RS > 0 by RS score and dividing into quintiles (each group contains 20% of the SNPs with RS > 0).
The expected number of SNPs in each bin was calculated by considering what fraction of the 227,252 sites we resequenced fall into each RS score range.

Supplemental Figure S4: Mean DAF of SNPs at constrained positions, in five geographically distinct populations. RS bins were obtained by ranking SNPs at positions with a RS > 0 by RS score and dividing into quintiles (such that each group contained 20% of the SNPs with RS > 0).

Supplemental Figure S5: Number of derived alleles carried per individual, in each population. Horizontal bars correspond to the 10%, 50%, and 90% quantiles.

Supplemental Figure S6: Number of homozygous derived genotypes carried per individual, in each population. Horizontal bars correspond to the 10%, 50%, and 90% quantiles.

Supplemental Figure S7: Mean derived allele frequency (DAF) of SNPs in each of five annotation categories at sites with RS <= 1 (unconstrained) and RS > 1 (constrained).

Supplemental Tables 1 – 4

Supplemental Table 1: Negative selection acts against variation at constrained sites in all three individual genomes. The number of SNPs observed in each RS bin in each individual genome, relative to the number expected given the percentage of the whole genome that falls into each RS bin. Sites for which RS could not be measured due to insufficient sequence depth were placed in the ‘RS < 0’ bin. SNPs in each genome were binned according to the RS score of the site where they were found and the number of SNPs in each bin was compared to the number of SNPs expected in each bin, which was calculated by multiplying the percentage of the whole human reference genome that falls into each RS score bin by the total number of SNPs observed in each individual. All bins with a RS score > 0 have fewer than expected SNPs, and as the RS score range increases, SNPs become more and more under-represented.
Whole Genome

RS bin    # of sites    % of sites
Total     3080419480
<0        1565352986    50.82%
0 to 2    1241001096    40.29%
2 to 5     257201259     8.35%
>5          16864139     0.55%

Individual Genomes (# SNPs)

          Chinese                                  Venter                                   Yoruba
RS bin    Observed  Expected  Observed/Expected    Observed  Expected  Observed/Expected    Observed  Expected  Observed/Expected
Total     3237659                                  3230557                                  3568675
<0        1721913   1645256   1.05                 1752123   1641647   1.07                 1880205   1813466   1.04
0 to 2    1145638   1304348   0.88                 1127298   1301487   0.87                 1273104   1437703   0.89
2 to 5     180103    270330   0.67                  170700    269737   0.63                  202144    297968   0.68
>5           4951     17725   0.28                    4868     17686   0.28                    5539     19537   0.28

Supplemental Table 2: The number and percentage of bases sequenced from the ENCODE regions and of the SNPs at constrained positions that are coding and noncoding, using two RS score thresholds. RS > 3 is a more stringent threshold for constraint than is RS > 1.

Supplemental Table 3: Variation is significantly depleted at constrained sites both within and outside of constrained elements. The number of sequenced bases or SNPs within constrained elements and/or at constrained positions (defined as having a rejected substitutions (RS) score > 1), with the total number of sequenced bases or SNPs given in parentheses. P-values were obtained from the hypergeometric distribution.

Supplemental Table 4: SNPs in CEs also exhibit significantly lower average derived allele frequencies (DAF) than SNPs in neutral regions. The mean DAF of SNPs within/outside constrained elements (top row) and/or at constrained (RS > 1)/unconstrained positions (RS <= 1). Mean DAF is expressed as a percent (%). P-values were obtained using the Wilcoxon rank-sum test.
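The expected counts and observed/expected ratios in Supplemental Table 1 follow from the rule given in its legend: the expected number of SNPs in an RS bin is the individual's total SNP count multiplied by the fraction of the reference genome that falls in that bin. A minimal sketch for the Chinese genome, using values from the table; the variable and function names are illustrative assumptions, not part of the original analysis (which was carried out in R).

```python
# Whole-genome site counts per RS bin (Supplemental Table 1).
GENOME_SITES = {
    "<0": 1565352986,
    "0 to 2": 1241001096,
    "2 to 5": 257201259,
    ">5": 16864139,
}

# Total and per-bin observed SNP counts for the Chinese genome (Supplemental Table 1).
CHINESE_TOTAL = 3237659
CHINESE_OBSERVED = {"<0": 1721913, "0 to 2": 1145638, "2 to 5": 180103, ">5": 4951}


def observed_vs_expected(observed: dict, total_snps: int, genome_sites: dict) -> dict:
    """Return {bin: (observed, expected, observed/expected)} for one genome."""
    genome_total = sum(genome_sites.values())
    rows = {}
    for rs_bin, n_sites in genome_sites.items():
        # Expected SNPs = total SNPs x fraction of the genome in this RS bin.
        expected = total_snps * n_sites / genome_total
        rows[rs_bin] = (observed[rs_bin], expected, observed[rs_bin] / expected)
    return rows


rows = observed_vs_expected(CHINESE_OBSERVED, CHINESE_TOTAL, GENOME_SITES)
for rs_bin, (obs, exp, ratio) in rows.items():
    print(f"{rs_bin:>7}  {obs:>8}  {exp:>10.0f}  {ratio:.2f}")
```

The ratio falls steadily as the RS bin increases, which is the depletion pattern the table legend describes.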
Supplemental Methods

Resequencing

DNA Samples

DNA was obtained from the Coriell Cell Repositories (http://ccr.coriell.org/Default.aspx) for 432 individuals from 5 populations: Yoruba from Ibadan, Nigeria (YRI; 121 individuals), Luhya from Webuye, Kenya (LWK; 90 individuals), CEPH (CEU; Utah residents with Northern and Western European ancestry; 75 individuals), Han Chinese from Beijing (CHB; 73 individuals) and Japanese from Tokyo (JPT; 73 individuals). All individuals were unrelated. Overall, an equal number of males and females were resequenced, with nearly equal numbers of both in each population. A complete list of samples, including the Coriell catalog ID for each sample, is given in Supplemental Table 6. A complete list of the Coriell plates from which our samples were obtained is given in Supplemental Table 7.

PCR Primer Selection

A total of 575 regions (74-1427 bp, median length 463 bp) were selected for PCR amplification and resequencing from the set of ‘merge6’ constrained elements identified from multiple alignments of the ENCODE regions using GERP 1.03, and from a subset of sequences from the ENCODE regions that were found to have promoter function in a high-throughput transient transfection luciferase reporter assay (Cooper et al 2006). For the location of each resequenced region, see Supplemental Table 8. For more information on constrained element detection, please refer to the ‘Measures of constraint’ section below. Amplicons were designed using Primer3 (Rozen and Skaletsky 2000), with primers selected from the neutral DNA flanking constrained elements, using sequence masked by RepeatMasker (Smit et al 1996-2004). If a suitable pair of primers could not be found after this first attempt, the left side was masked as above but the right side was masked for low complexity sequence only, by running RepeatMasker with the "-s -noint" options.
If that also failed, another attempt was made, with the left side masked for low complexity only and the right side fully masked. The parameter $pcrhighsize was initially set to 600 bp, and increased if necessary. The Primer3 parameters used were:

PRIMER_MAX_SIZE=24
PRIMER_OPT_SIZE=21
PRIMER_PRODUCT_SIZE_RANGE=400-$pcrhighsize
PRIMER_PRODUCT_OPT_SIZE=500
PRIMER_NUM_RETURN=1000
PRIMER_OPT_TM=62
PRIMER_MIN_TM=59
PRIMER_MAX_TM=65
PRIMER_MIN_GC=25
PRIMER_MAX_GC=60
PRIMER_SALT_CONC=50
PRIMER_DNA_CONC=50
PRIMER_SELF_ANY=8
PRIMER_SELF_END=3
PRIMER_MAX_POLY_X=3
PRIMER_GC_CLAMP=0
PRIMER_MAX_END_STABILITY=8
PRIMER_PAIR_WT_PRODUCT_SIZE_LT=1.2
PRIMER_MAX_TEMPLATE_MISPRIMING=12

PCR Amplification

DNA samples were arrayed onto plates and PCR amplification was performed simultaneously on each sample. Samples were amplified in two batches comprising 48 and 384 samples. The reaction volume in each well was 5.5 µL, comprising 1 µL DNA (5.0 ng/µL), 0.5 µL each of the forward and reverse primer (10 µM), 1 µL 5X PE Taq Gold Buffer, 0.4 µL dNTPs (2.5 M), 0.035 µL of Amplitaq Gold (PE) (5 U/µL) and 2.065 µL H2O. The following cycling conditions were used for PCR: 10 minutes at 95˚C, followed by 35 cycles of 30 seconds at 94˚C, 30 seconds at 63˚C and 23 seconds at 72˚C, and finishing with 3 minutes and 30 seconds at 72˚C. PCR products were kept at 4˚C after the reaction was completed. After PCR, products were cleaned up using 10 U of exonuclease I and 1 U of shrimp alkaline phosphatase in a 2.0 µL reaction volume.

Sequencing

For each sample on the PCR plate, 1.0 µL of DNA was taken and combined with 1.0 µL primer (10 µM), 1.0 µL 50% BigDye Terminator v3.1 mix, 1.0 µL 2.5X BigDye Terminator v1.1, v3.1 Sequencing Buffer and 1.0 µL H2O. The following cycling conditions were used for sequencing: 5 minutes at 96˚C, followed by 30 cycles of 10 seconds at 96˚C, 5 seconds at 50˚C and 4 minutes at 60˚C. Samples were kept at 4˚C after the sequencing was completed. Data were collected using an ABI 3730XL with 50 cm arrays.
Sequencing was performed at the Stanford Human Genome Center.

Identification and verification of SNPs and indels

All sequencing reads were visually inspected for quality, and acceptable reads from each region were assembled using Phrap v.990319 (Ewing and Green 1998; Ewing et al 1998). SNPs were identified using PolyPhred 5 (Stephens et al 2006), and all PolyPhred SNP calls were confirmed by visual inspection using Consed (Gordon et al 1998). Indels were identified through visual inspection of reads. To estimate our false positive rate for single heterozygotes, we selected 84 SNPs for which a single heterozygote was found and resequenced the heterozygous individual using the same primer combination used in the first round of sequencing. Heterozygosity was confirmed for 72 (85.7%) of these SNPs, indicating that 14.3% of our single heterozygote SNPs are false positives. This implies that of the 936 single heterozygote SNPs for which we could ascertain the derived allele, 125 are false positives. Though single heterozygotes are the largest category of SNPs in our data set, they make up a small fraction (936/46690, 2.0%) of the total number of heterozygous sites we sequenced. We do not anticipate that error in identifying heterozygotes should have a large effect on the derived allele frequencies (DAFs) of more common alleles: the probability of seeing two false positive heterozygotes at a single locus is 0.02 (0.143 squared), the probability of seeing three is 0.0029 (0.143 cubed), and this probability rapidly becomes negligible as the number of false positive heterozygotes increases. We did not observe an increased tendency to misidentify heterozygotes in any one population relative to the others.

Processing of SNPs and indels

After visual inspection of SNP calls, 2688 SNPs and 144 indels were confirmed in at least one individual.
From this set we removed any SNP or indel that was not genotyped in at least 80% of the individuals (i.e., having a “no-call” rate of > 20%) in every one of the 5 sample populations, or that was at a position with a neutral rate lower than 0.25 substitutions per site, as these positions are too shallow for reliable measurement of constraint (see ‘Measures of constraint’ below). In addition, all data from two individuals (NA19098, from the YRI population, and NA19085, from the JPT population) were removed from the study, as the genotyping of these individuals failed at > 20% of their loci. After this step, a set of 2573 SNPs and 134 indels remained. A summary of the number of SNPs found in each sample population is given in Supplemental Table 9. The location within the ENCODE regions of all SNPs, indels and resequenced regions was determined using Blat19. The derived allele for each SNP was ascertained by comparison to the aligned position in baboon and chimpanzee (see ‘Alignments’ below). SNPs at positions where the sequence differed between baboon and chimpanzee, or where data were missing for one of the species, were removed from further analysis. Likewise, resequenced regions without at least one SNP for which the derived allele could be ascertained were also excluded. This left us with 2214 SNPs and 124 indels in 516 regions totaling 227,252 bp in length. All analyses described in this paper were performed using this trimmed data set. The 2214 SNPs in the final data set are listed in Supplemental Table 10, with their locations, derived allele frequencies in each of the sample populations and annotations.

Alignments

Multisequence alignments for each of the 44 ENCODE regions were constructed using the Threaded Blockset Aligner (TBA) program (Blanchette et al 2004; Margulies et al 2007), using the September 2007 ENCODE sequence data freeze.
In this set of alignments, the non-human sequences have been reordered to be syntenic with the human sequence, such that all human bases are present in each alignment and have at most one aligned nucleotide from each other species. Further details about the alignments can be found in Margulies et al (2007). We processed these alignments by removing all non-mammalian species and any species which had extensive gaps or which could be aligned to <50% of the ENCODE regions. Each alignment contains 24 – 26 mammalian species. We also removed from the alignment any positions that were gapped in human, in order to make the human sequence continuous. Gaps were permitted in other species, however.

Measures of constraint

A recently finished, improved version of the Genomic Evolutionary Rate Profiling program (GERP 2.1, http://mendel.stanford.edu/SidowLab/downloads/GERP/index.html) was used to obtain the site-specific estimates of constraint and the constrained elements (CEs) used in this study. The processed alignments described in the previous section were used as input for GERP, along with the neutral tree for the species in the alignment. The branch lengths of the neutral tree were calculated using four-fold degenerate sites. For each position in an alignment, GERP calculates a “Rejected Substitutions” (RS) score, which is the difference between the number of substitutions expected at that position, given the neutral distance between the species present in the alignment at that position, and the number of substitutions observed at that position. The neutral distances to species that are gapped (i.e., not present in the alignment) at that position do not contribute to the calculation of the RS score. A positive RS score reflects a deficit of substitutions, implying that a position is under constraint at the evolutionary scale covered by the alignment.
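As a rough illustration of the definition above, the sketch below computes an RS score for a single alignment column. It is not GERP itself: the nested-list tree encoding, the toy branch lengths and species set, and the function names are all assumptions made for this example, and the observed number of substitutions (which GERP estimates from the alignment) is simply passed in as a number. Branches that lead only to species gapped at the position are excluded from the neutral distance.

```python
from typing import List, Set, Tuple, Union

# A tree is either a leaf name or a list of (subtree, branch_length) pairs.
Tree = Union[str, List[Tuple["Tree", float]]]

# Hypothetical toy neutral tree (branch lengths in substitutions per site).
TOY_TREE: Tree = [
    ([("human", 0.1), ("chimp", 0.1)], 0.3),
    ([("mouse", 0.4), ("rat", 0.4)], 0.5),
    ("dog", 0.6),
]


def leaf_names(tree: Tree) -> Set[str]:
    """Collect the names of all leaves in the tree."""
    if isinstance(tree, str):
        return {tree}
    names: Set[str] = set()
    for child, _ in tree:
        names |= leaf_names(child)
    return names


def neutral_distance(tree: Tree, present: Set[str]) -> float:
    """Total branch length of the subtree connecting the species in `present`."""
    n_present = len(leaf_names(tree) & present)

    def walk(node: Tree) -> Tuple[int, float]:
        # Returns (number of present leaves below node, counted branch length).
        if isinstance(node, str):
            return (1 if node in present else 0), 0.0
        count, length = 0, 0.0
        for child, branch in node:
            child_count, child_length = walk(child)
            count += child_count
            length += child_length
            # A branch counts only if present species lie on both sides of it.
            if 0 < child_count < n_present:
                length += branch
        return count, length

    return walk(tree)[1]


def rs_score(tree: Tree, present: Set[str], observed_subs: float) -> float:
    """RS = expected (neutral) substitutions minus observed substitutions."""
    return neutral_distance(tree, present) - observed_subs


# Column where rat and dog are gapped and 0.2 substitutions were observed.
print(rs_score(TOY_TREE, {"human", "chimp", "mouse"}, observed_subs=0.2))
```

In this toy column the neutral distance shrinks from 2.4 to 1.4 once the gapped species are dropped, so the highest attainable RS score at a position depends on how many species are aligned there.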
For the alignments used here, the maximum possible RS score is 4.22, reflecting the depth of the neutral tree relating the aligned sequences. Adjacent constrained positions are aggregated and labeled as a constrained element if their cumulative score exceeds that expected by chance for a random sample of positions of the same size as the length of the putative element. Further details on the calculation of RS scores and identification of constrained elements can be found by downloading the GERP 2.1 documentation at http://mendel.stanford.edu/SidowLab/downloads/gerp/index.html.

Annotations

Annotations from the “RefSeq Genes” track were downloaded from the UCSC ENCODE browser (Karolchik et al 2008), for version hg18 of the human genome, and used to annotate each base in each ENCODE region. Each base was annotated as belonging to a coding exon, an intron or a UTR, or as intergenic if it did not fall into any of these three categories. SNPs in coding exons were further divided into synonymous and non-synonymous, based on the derived allele.

Statistical analysis

All statistical analyses were conducted using R (http://www.r-project.org/).

Supplemental Tables 5 – 11

Supplemental Table 5: Estimates of nucleotide diversity (π) and θ for the ENCODE resequencing data for evolutionarily constrained sites (RS > 1) and sites that are not evolutionarily constrained (RS <= 1). π was estimated from allele frequencies and θ was estimated using Watterson’s estimator (Watterson 1975).

Sequence                   π (confidence interval)                          Watterson's estimator for θ
Constrained (RS > 1)       4.24 x 10^-7 (7.22 x 10^-6 - 8.41 x 10^-7)       1.25 x 10^-6
Unconstrained (RS <= 1)    9.54 x 10^-7 (1.87 x 10^-6 - 3.60 x 10^-8)       1.85 x 10^-6

Supplemental Table 6: Description of resequenced individuals. Listed are the Coriell Catalog ID of each sample and the Coriell Plate ID of the plate each sample was taken from.
Samples came from the following populations: Yoruba from Ibadan (YRI), Nigeria, Luhya from Webuye (LWK), Kenya, CEPH (CEU), Han Chinese from Beijing (CHB) and the Japanese from Tokyo (JPT). Catalog ID Population Sex Plate ID NA06984 NA06986 NA06989 NA07031 NA07037 NA07045 NA07051 NA07346 NA07347 NA07435 NA11829 NA11830 NA11831 NA11832 NA11843 NA11891 NA11892 NA11893 NA11894 NA11917 NA11918 NA11919 NA11920 NA11930 NA11931 NA11992 NA11993 NA11994 NA11995 NA12003 NA12004 NA12005 NA12006 NA12045 NA12058 NA12154 NA12155 NA12156 NA12236 NA12272 NA12273 NA12275 NA12282 NA12283 CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU Male Male Female Female Female Female Male Female Male Male Male Female Male Female Male Male Female Male Female Male Female Male Female Male Female Male Female Male Female Male Female Male Female Male Female Male Male Female Female Male Female Female Male Female HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT01 HAPMAPPT01 HAPMAPPT01 HAPMAPPT01 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT01 HAPMAPPT01 HAPMAPPT01 HAPMAPPT01 HAPMAPPT01 HAPMAPPT01 HAPMAPPT01 HAPMAPPT01 HAPMAPPT06 HAPMAPPT06 HAPMAPPT01 HAPMAPPT01 HAPMAPPT01 HAPMAPPT01 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 NA12286 NA12287 NA12340 NA12341 NA12342 NA12343 NA12347 NA12348 NA12383 NA12399 NA12400 NA12413 NA12414 NA12489 NA12546 NA12718 NA12748 NA12749 NA12774 NA12775 NA12776 NA12777 NA12778 NA12827 NA12828 NA12829 NA12830 NA12842 NA12843 NA12889 NA12890 NA18524 NA18526 NA18529 NA18530 NA18532 NA18534 NA18536 NA18537 NA18540 NA18542 NA18543 NA18544 NA18545 NA18546 NA18547 NA18548 NA18549 NA18550 NA18552 CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU CEU 
CEU CEU CEU CEU CEU CEU CEU CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB Male Female Male Female Male Female Male Female Female Male Female Male Female Female Male Female Male Female Male Male Female Male Female Male Female Male Female Male Female Male Female Male Female Female Male Female Male Male Female Female Female Male Male Female Male Female Male Male Female Female HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 XC01451 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT06 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT05 HAPMAPPT02 HAPMAPPT05 HAPMAPPT05 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT05 HAPMAPPT05 HAPMAPPT02 HAPMAPPT05 HAPMAPPT02 HAPMAPPT05 HAPMAPPT05 HAPMAPPT02 HAPMAPPT02 NA18555 NA18557 NA18558 NA18559 NA18561 NA18562 NA18563 NA18564 NA18566 NA18570 NA18571 NA18572 NA18595 NA18596 NA18597 NA18599 NA18602 NA18603 NA18605 NA18606 NA18608 NA18609 NA18610 NA18611 NA18612 NA18613 NA18614 NA18615 NA18616 NA18617 NA18618 NA18619 NA18620 NA18625 NA18626 NA18627 NA18628 NA18630 NA18631 NA18634 NA18638 NA18639 NA18640 NA18641 NA18642 NA18643 NA18645 NA18647 NA18740 NA18745 CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB CHB Female Male Male Male Male Male Male Female Female Female Female Male Female Female Female Female Female Male Male Male Male Male Female Male Male Male Female Female Female Female Female Female Male Female Female Female Female Female Female Female Male Male Female Female Female Male Male Male Male Male HAPMAPPT02 HAPMAPPT05 HAPMAPPT02 HAPMAPPT05 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT05 
HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT02 HAPMAPPT02 HAPMAPPT05 HAPMAPPT02 HAPMAPPT02 HAPMAPPT05 HAPMAPPT02 HAPMAPPT02 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT02 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 NA18747 NA18748 NA18749 NA18757 NA18939 NA18940 NA18941 NA18942 NA18943 NA18944 NA18945 NA18946 NA18947 NA18948 NA18949 NA18951 NA18952 NA18953 NA18954 NA18955 NA18956 NA18957 NA18959 NA18960 NA18961 NA18962 NA18963 NA18964 NA18965 NA18966 NA18967 NA18968 NA18969 NA18972 NA18973 NA18975 NA18976 NA18977 NA18979 NA18992 NA18993 NA19001 NA19002 NA19005 NA19009 NA19010 NA19054 NA19055 NA19056 NA19057 CHB CHB CHB CHB JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT Male Male Male Male Female Male Female Female Male Male Male Female Female Male Female Female Male Male Female Male Female Female Male Male Male Male Female Female Male Male Male Female Female Female Female Female Female Male Female Female Female Female Female Male Male Female Female Male Male Female HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT02 HAPMAPPT05 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT05 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT05 HAPMAPPT05 HAPMAPPT02 HAPMAPPT05 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT05 HAPMAPPT05 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT02 HAPMAPPT05 HAPMAPPT05 HAPMAPPT02 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT02 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 NA19058 NA19059 NA19060 NA19062 NA19063 NA19064 NA19065 NA19066 NA19067 NA19068 NA19070 NA19072 NA19074 NA19075 NA19076 NA19077 NA19078 
NA19079 NA19080 NA19081 NA19082 NA19083 NA19084 NA19085 NA19086 NA19087 NA19088 NA19027 NA19028 NA19031 NA19035 NA19036 NA19038 NA19041 NA19044 NA19046 NA19307 NA19308 NA19309 NA19310 NA19311 NA19313 NA19314 NA19315 NA19316 NA19317 NA19318 NA19319 NA19321 NA19324 JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT JPT LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK Male Female Male Male Male Female Female Male Male Male Male Male Female Male Male Female Female Male Female Female Male Male Female Male Male Female Male Male Male Male Male Female Female Male Male Male Male Male Male Female Male Female Female Female Female Male Male Male Female Female HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPPT05 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 NA19327 NA19328 NA19332 NA19334 NA19346 NA19347 NA19350 NA19352 NA19359 NA19360 NA19371 NA19372 NA19373 NA19374 NA19375 NA19376 NA19377 NA19379 NA19380 NA19381 NA19382 NA19383 NA19384 NA19385 NA19390 NA19391 NA19393 NA19394 NA19396 NA19397 NA19398 NA19399 NA19403 NA19404 NA19428 NA19429 NA19430 NA19431 NA19434 NA19435 NA19436 NA19437 NA19438 NA19439 NA19440 NA19443 NA19444 NA19445 NA19446 NA19448 LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK Female Female Female Male Male Male Male Male Male Male Male Male Male Male Male Male Female Female Male Female Male 
Male Male Male Female Female Male Male Female Male Female Female Female Female Male Male Male Female Female Female Female Female Female Female Female Male Male Female Female Male HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 NA19449 NA19451 NA19452 NA19455 NA19456 NA19457 NA19462 NA19463 NA19466 NA19467 NA19468 NA19469 NA19470 NA19471 NA19472 NA19473 NA19474 NA18485 NA18486 NA18487 NA18488 NA18489 NA18498 NA18499 NA18501 NA18502 NA18504 NA18505 NA18507 NA18508 NA18510 NA18511 NA18516 NA18517 NA18519 NA18520 NA18522 NA18523 NA18852 NA18853 NA18855 NA18856 NA18858 NA18859 NA18861 NA18862 NA18867 NA18868 NA18870 NA18871 LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK LWK YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI Female Male Male Male Female Female Female Female Male Female Female Female Female Female Female Female Female Male Male Male Female Female Male Female Male Female Male Female Male Female Male Female Male Female Male Female Male Female Female Male Female Male Female Male Female Male Female Male Female Male HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPV12 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 
HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 NA18873 NA18874 NA18907 NA18908 NA18909 NA18910 NA18912 NA18913 NA18916 NA18917 NA18923 NA18924 NA18933 NA18934 NA19092 NA19093 NA19095 NA19096 NA19098 NA19099 NA19101 NA19102 NA19107 NA19108 NA19113 NA19114 NA19116 NA19117 NA19118 NA19119 NA19121 NA19122 NA19127 NA19128 NA19130 NA19131 NA19137 NA19138 NA19140 NA19141 NA19143 NA19144 NA19146 NA19147 NA19149 NA19150 NA19152 NA19153 NA19159 NA19160 YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI Female Male Female Male Female Male Female Male Female Male Male Female Female Male Male Female Female Male Male Female Male Female Male Female Male Female Female Male Female Male Male Female Female Male Male Female Female Male Female Male Female Male Male Female Female Male Female Male Female Male HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 NA19171 NA19172 NA19175 NA19176 NA19178 NA19179 NA19181 NA19182 NA19184 NA19185 NA19189 NA19190 NA19192 NA19193 NA19197 NA19198 NA19200 NA19201 NA19203 NA19204 NA19206 NA19207 NA19209 NA19210 NA19213 NA19214 NA19222 NA19223 NA19225 NA19226 NA19235 NA19236 NA19238 NA19239 NA19247 NA19248 NA19256 NA19257 YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI YRI 
YRI YRI YRI YRI YRI YRI YRI Male Female Male Female Male Female Male Female Male Female Male Female Male Female Female Male Male Female Male Female Female Male Female Male Male Female Female Male Female Male Female Male Female Male Female Male Male Female HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT03 HAPMAPPT03 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04 HAPMAPPT04

Supplemental Table 7: List of the Coriell plates from which samples were drawn:

Plate ID      Population(s) on plate    Number of samples obtained from plate
HAPMAPPT01    CEU                       16
HAPMAPPT02    JPT & CHB                 55
HAPMAPPT03    YRI                       60
HAPMAPPT04    YRI                       61
HAPMAPPT05    JPT & CHB                 91
HAPMAPPT06    CEU                       58
HAPMAPV12     LWK                       90
XC01451       CEU                       1

Supplemental Table 8: Chromosomal locations and lengths (in bp) of the regions selected for PCR amplification and resequencing.
Region Length (bp) Chromosome Start Coordinate End Coordinate 00421.ENm001 00504.ENm001 00535.ENm001 00579.ENm001 00580.ENm001 00690.ENm001 00696.ENm001 00714.ENm001 00721.ENm001 00731.ENm001 00863.ENm001 00873.ENm001 00876.ENm001 00877.ENm001 00878.ENm001 00899.ENm001 00900.ENm001 00902.ENm001 00918.ENm001 00919.ENm001 00925.ENm001 00926.ENm001 00932.ENm001 00933.ENm001 00934.ENm001 00940.ENm001 00942.ENm001 00945.ENm001 00947.ENm001 00953.ENm001 00955.ENm001 00960.ENm001 00964.ENm001 00967.ENm001 00970.ENm001 00973.ENm001 00974.ENm001 00975.ENm001 00977.ENm001 00983.ENm001 00987.ENm001 00989.ENm001 00991.ENm001 00993.ENm001 00999.ENm001 01001.ENm001 01002.ENm001 01005.ENm001 01010.ENm001 204 299 144 220 343 230 401 75 368 359 588 444 139 441 195 120 573 405 120 189 392 271 233 188 561 390 311 197 263 325 440 308 317 135 87 171 481 121 343 301 123 211 542 109 271 211 363 157 234 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 115816627 116100829 116170093 116309134 116315164 116757847 116795718 116903453 116931382 116965288 115933155 116769552 117336777 117372175 117473189 116038950 116224628 116326267 116437695 116195511 116182487 117041806 116016419 117017589 116237542 116030318 117434701 115853627 116222211 117155278 117084100 117308247 116448294 115908431 115928125 116016557 116161424 115824554 116713393 117092667 117435024 117439725 116225368 116139834 116162101 117183808 116641511 116749940 116554448 115816830 116101127 116170236 116309353 116315506 116758076 116796118 116903527 116931749 116965646 115933742 116769995 117336915 117372615 117473383 116039069 116225200 116326671 116437814 116195699 116182878 117042076 116016651 117017776 116238102 116030707 117435011 115853823 116222473 117155602 117084539 117308554 116448610 115908565 115928211 116016727 116161904 115824674 116713735 117092967 117435146 117439935 116225909 116139942 116162371 117184018 116641873 116750096 116554681 01011.ENm001 01012.ENm001 
01013.ENm001 01014.ENm001 01019.ENm001 01026.ENm001 01028.ENm001 01033.ENm001 01036.ENm001 01039.ENm001 01043.ENm001 01044.ENm001 01045.ENm001 01048.ENm001 01049.ENm001 01050.ENm001 01053.ENm001 01055.ENm001 01058.ENm001 01059.ENm001 01060.ENm001 01063.ENm001 01064.ENm001 01070.ENm001 01071.ENm001 01072.ENm001 01073.ENm001 01075.ENm001 01077.ENm001 01078.ENm001 01080.ENm001 01082.ENm001 01084.ENm001 01086.ENm001 01087.ENm001 01088.ENm001 01089.ENm001 01092.ENm001 01094.ENm001 01095.ENm001 01097.ENm001 01099.ENm001 01100.ENm001 01103.ENm001 01104.ENm001 01107.ENm001 01108.ENm001 01110.ENm001 01118.ENm001 01122.ENm001_2_1 01123.ENm001 01126.ENm001 01133.ENm001 207 770 366 363 283 417 492 481 342 499 580 1093 525 321 377 191 399 225 290 214 526 314 499 233 90 418 154 322 435 543 306 1286 104 520 537 344 620 223 417 122 242 398 358 386 225 112 267 330 312 473 447 309 403 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 116042789 116645488 116167244 115955016 116264961 116543487 117233233 115677402 116169380 116809953 116556726 116740308 116331346 117420039 116961400 116019147 116807113 117390691 116649613 117185101 116453162 116051067 116199640 116704008 116340693 116129228 115952809 116643903 116975856 116033264 116198696 115987299 117202441 117297949 115847153 116204549 116109354 116299449 117151683 116201686 115789011 117069655 117212383 116601865 115933928 116906311 115782844 115936064 116343206 115649897 116713694 117371795 116119919 116042995 116646257 116167609 115955378 116265243 116543903 117233724 115677882 116169721 116810451 116557305 116741400 116331870 117420359 116961776 116019337 116807511 117390915 116649902 117185314 116453687 116051380 116200138 116704240 116340782 116129645 115952962 116644224 116976290 116033806 116199001 115988584 117202544 117298468 115847689 116204892 116109973 116299671 117152099 116201807 115789252 117070052 117212740 116602250 115934152 116906422 115783110 115936393 
116343517 115650369 116714140 117372103 116120321 01139.ENm001 01140.ENm001 01142.ENm001 01148.ENm001 01149.ENm001 01154.ENm001 01155.ENm001 01158.ENm001 01161.ENm001 01166.ENm001 01170.ENm001 01171.ENm001 01178.ENm001 01179.ENm001 01180.ENm001 01182.ENm001 01184.ENm001 01185.ENm001 01186.ENm001 01189.ENm001 01190.ENm001 01191.ENm001 01196.ENm001 01201.ENm001 01202.ENm001 01205.ENm001 01206.ENm001 01207.ENm001 01209.ENm001 01211.ENm001 01213.ENm001 01214.ENm001 01216.ENm001 01221.ENm001 01239.ENm001 01244.ENm001 01245.ENm001 01247.ENm001 01248.ENm001 01249.ENm001 01252.ENm001 01253.ENm001 01257.ENm001 01259.ENm001 01260.ENm001 01261.ENm001 01263.ENm001 01266.ENm001 01267.ENm001 01270.ENm001 01273.ENm001 01276.ENm001 01281.ENm001 336 408 286 615 425 273 363 173 124 586 217 306 343 344 321 312 171 432 341 369 359 386 386 469 366 410 212 388 231 455 238 174 435 143 197 286 367 429 299 554 390 328 925 464 208 126 305 281 289 263 469 112 135 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 116908746 116180176 116790831 116644885 116705284 117143034 115910546 115959099 115845845 116616446 117178012 116442264 116957596 116724855 116505778 115949662 115765683 116849251 117047007 117255210 116744622 117152226 117207689 116198948 115826966 117209929 116230206 116203915 115767796 116546635 116224295 115822969 117288913 115987204 116745439 117468372 116873501 117033743 116969195 116636785 116991799 117095368 116697193 117162422 116002122 116002203 115661757 116862957 115706383 115972353 116718484 116448727 116599798 116909081 116180583 116791116 116645499 116705708 117143306 115910908 115959271 115845968 116617031 117178228 116442569 116957938 116725198 116506098 115949973 115765853 116849682 117047347 117255578 116744980 117152611 117208074 116199416 115827331 117210338 116230417 116204302 115768026 116547089 116224532 115823142 117289347 115987346 116745635 117468657 116873867 117034171 116969493 116637338 116992188 
117095695 116698117 117162885 116002329 116002328 115662061 116863237 115706671 115972615 116718952 116448838 116599932 01283.ENm001 01287.ENm001 01289.ENm001 01290.ENm001 01292.ENm001 01296.ENm001 01297.ENm001 01302.ENm001 01309.ENm001 01313.ENm001 01316.ENm001 01317.ENm001 01318.ENm001 01319.ENm001 01322.ENm001 01323.ENm001 01327.ENm001 01329.ENm001 01335.ENm001 01339.ENm001 01340.ENm001 01342.ENm001 01345.ENm001 01346.ENm001 01356.ENm001_1_0 01357.ENm001 01359.ENm001 01360.ENm001 01361.ENm001 01363.ENm001 01366.ENm001 01370.ENm001 01371.ENm001 01374.ENm001 01375.ENm001 01378.ENm001 01381.ENm001 01382.ENm001 01384.ENm001 01387.ENm001 01388.ENm001 01389.ENm001 01390.ENm001 01393.ENm001 01398.ENm001 01405.ENm001 01406.ENm001 01409.ENm001 01411.ENm001 01412.ENm001 01413.ENm001 01414.ENm001 01422.ENm001 102 198 1428 150 183 275 288 414 638 713 571 367 507 374 167 794 391 188 250 337 772 121 483 547 440 385 236 566 399 281 121 376 116 237 300 1059 175 233 390 202 577 493 178 354 134 254 331 219 106 328 355 371 312 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 116166099 116751491 116564645 117427138 116562236 115783714 116167513 116958060 116060989 116043186 116222842 117001268 116742285 116584730 117254987 116652870 117145134 116813195 116449318 115927484 116600078 115735081 116191471 115782278 115638715 116847306 117322673 117067041 117390812 115970994 116000252 115679506 116010348 116026300 117178131 117218278 117043943 116105985 116184856 116823340 116159204 117038665 116640396 116138738 117177202 116060514 115699277 116908395 116960257 116962487 116986662 117030761 116600871 116166200 116751688 116566072 117427287 116562418 115783988 116167800 116958473 116061626 116043898 116223412 117001634 116742791 116585103 117255153 116653663 117145524 116813382 116449567 115927820 116600849 115735201 116191953 115782824 115639154 116847690 117322908 117067606 117391210 115971274 116000372 115679881 116010463 
116026536 117178430 117219336 117044117 116106217 116185245 116823541 116159780 117039157 116640573 116139091 117177335 116060767 115699607 116908613 116960362 116962814 116987016 117031131 116601182 01426.ENm001 01427.ENm001 01428.ENm001 01429.ENm001 01431.ENm001 01432.ENm001 01433.ENm001 01435.ENm001 01436.ENm001 01438.ENm001 01439.ENm001_5_0 01444.ENm001 01447.ENm001 01449.ENm001 01454.ENm001 01456.ENm001 01459.ENm001 01466.ENm001 01467.ENm001 01468.ENm001 01470.ENm001 01471.ENm001 01474.ENm001 01476.ENm001 01481.ENm001 01482.ENm001 01484.ENm001 01488.ENm001 01489.ENm001 01490.ENm001 01499.ENm001 01501.ENm001 01515.ENm001 01516.ENm001 01517.ENm001 01518.ENm001 01519.ENm001 01520.ENm001 01521.ENm001 01522.ENm001 01525.ENm001 01526.ENm001 01527.ENm001 01528.ENm001 01533.ENm001 01534.ENm001 01535.ENm001 01536.ENm001 01544.ENm001 01545.ENm001 01546.ENm001 01550.ENm001 01555.ENm001 443 90 498 393 1075 435 407 486 263 400 477 312 324 548 283 265 612 182 300 123 321 431 121 591 420 553 149 212 256 490 347 127 272 96 299 386 74 215 889 552 181 350 329 320 298 377 338 168 198 191 301 339 439 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 115684537 116114650 116185610 116603461 116617253 116158761 116236825 115972721 117071721 116118210 115713836 116462540 117208207 116669570 116168076 116136791 116228551 116646350 116936161 115845359 115854904 117146712 115653237 117204598 115920748 117420360 117323860 117455387 116644502 116751196 115866940 116223468 115965415 116100376 117386161 117099786 117266169 116463769 115781355 116747631 115820734 117014942 117329433 117162092 117363795 116019834 116141269 117300068 116920240 116325996 116106904 115733249 117410338 115684979 116114739 116186107 116603853 116618327 116159195 116237231 115973206 117071983 116118609 115714312 116462851 117208530 116670117 116168358 116137055 116229162 116646531 116936460 115845481 115855224 117147142 115653357 117205188 115921167 117420912 
117324008 117455598 116644757 116751685 115867286 116223594 115965686 116100471 117386459 117100171 117266242 116463983 115782243 116748182 115820914 117015291 117329761 117162411 117364092 116020210 116141606 117300235 116920437 116326186 116107204 115733587 117410776 01557.ENm001 01559.ENm001 01562.ENm001 01565.ENm001 01568.ENm001 01571.ENm001 01574.ENm001 01576.ENm001 01578.ENm001 01586.ENm001 01587.ENm001 01590.ENm001 01592.ENm001 01593.ENm001 01595.ENm001 01600.ENm001 01601.ENm001 01604.ENm001 01605.ENm001 01606.ENm001 01608.ENm001 01610.ENm001 01613.ENm001 01616.ENm001 01621.ENm001 01622.ENm001 01623.ENm001 01624.ENm001 01629.ENm001 01631.ENm001 01632.ENm001 01634.ENm001 01642.ENm001 01644.ENm001 01650.ENm001 01652.ENm001 01653.ENm001 01655.ENm001 01657.ENm001 01667.ENm001 01668.ENm001 01670.ENm001 01672.ENm001 01677.ENm001 01680.ENm001 01681.ENm001 01682.ENm001 01683.ENm001 01688.ENm001 01693.ENm001 01694.ENm001 01696.ENm001 01698.ENm001 345 929 569 495 161 219 211 365 301 875 131 378 551 167 292 537 350 145 314 365 348 458 278 570 441 420 197 408 495 138 289 735 539 256 478 156 504 111 452 116 466 139 469 643 678 246 395 273 475 187 214 201 513 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 116964759 116495797 115820930 117236218 116704397 115984435 116223779 116694543 117173044 115744153 117468758 116559069 116190229 117071858 116020765 116604509 116432939 116213117 117303746 117288344 116204958 116210459 117177287 115939206 117315926 116753158 115985441 117238001 115676255 115967232 116647069 117054436 117091682 117256178 116650021 117303585 116656938 116076485 116752283 116656295 115827276 116681372 117194192 116106427 116700504 115981791 116907163 116649115 116668476 115881873 116072971 116888476 115969285 116965103 116496725 115821498 117236712 116704557 115984653 116223989 116694907 117173344 115745027 117468888 116559446 116190779 117072024 116021056 116605045 116433288 116213261 117304059 
117288708 116205305 116210916 117177564 115939775 117316366 116753577 115985637 117238408 115676749 115967369 116647357 117055170 117092220 117256433 116650498 117303740 116657441 116076595 116752734 116656410 115827741 116681510 117194660 116107069 116701181 115982036 116907557 116649387 116668950 115882059 116073184 116888676 115969797 01708.ENm001 01711.ENm001 01717.ENm001 01722.ENm001 01727.ENm001 01739.ENm001 01742.ENm001 01758.ENm001 01759.ENm001 01761.ENm001 01766.ENm001_9_1 01771.ENm001 01776.ENm001 01782.ENm001 01785.ENm001 01791.ENm001 01792.ENm001 01799.ENm001 01802.ENm001 01810.ENm001 01815.ENm001 01816.ENm001 01828.ENm001 01833.ENm001 01848.ENm001 01852.ENm001 01876.ENm001 01883.ENm001 01891.ENm001 01894.ENm001 01906.ENm001 01915.ENm001 01917.ENm001 01931.ENm001 01936.ENm001 01943.ENm001 01944.ENm001 01945.ENm001 01948.ENm001 01949.ENm001 01957.ENm001 01963.ENm001 01976.ENm001 01977.ENm001 01981.ENm001 01987.ENm001 01988.ENm001 01996.ENm001 01999.ENm001 02013.ENm001 02023.ENm001 02025.ENm001 02028.ENm001 204 429 495 628 521 1142 379 526 501 391 508 553 511 549 627 639 380 580 409 542 373 636 436 603 629 391 636 565 520 627 434 578 538 416 555 626 559 302 480 620 538 628 543 930 1016 558 505 641 472 431 497 931 512 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 116994335 116164534 116103062 115663865 116180406 116585899 116744740 116808081 115807298 117037728 116222616 116144743 116196811 115955268 117265788 116202758 116657787 115698519 115687130 116229582 117079899 115755406 115654850 115919745 117121171 117094119 115986112 115743455 115998236 116714876 117291059 115935416 116633013 116186436 116447269 117212990 116812728 116333459 117410526 117402119 115943902 115942147 116675796 116132041 116743210 115685833 116230437 116564113 116479109 117439936 116967183 116209145 115685764 116994538 116164962 116103556 115664492 116180926 116587040 116745118 116808606 115807798 117038118 116223123 
116145295 116197321 115955816 117266414 116203396 116658166 115699098 115687538 116230123 117080271 115756041 115655285 115920347 117121799 117094509 115986747 115744019 115998755 116715502 117291492 115935993 116633550 116186851 116447823 117213615 116813286 116333760 117411005 117402738 115944439 115942774 116676338 116132970 116744225 115686390 116230941 116564753 116479580 117440366 116967679 116210075 115686275 02034.ENm001 02037.ENm001 02038.ENm001 02045.ENm001 02047.ENm001 02048.ENm001 02049.ENm001 02054.ENm001 02057.ENm001 02058.ENm001 02060.ENm001 02075.ENm001 02079.ENm001 02085.ENm001 02090.ENm001 02101.ENm001 02116.ENm001 02250.ENm001 02255.ENm001 02256.ENm001 01023.ENm002_4_1 01032.ENm002_24_0 01093.ENm002_2_1 01096.ENm002_14_1 01492.ENm002_15_1 01594.ENm002_5_1 01674.ENm002_35_1 00971.ENm003_8_0 00978.ENm003_8_0 01553.ENm003_6_1 01685.ENm003_5_0 01925.ENm003_2_0 01047.ENm004_5_0 01218.ENm004_28_0 01268.ENm004_8_0 01315.ENm004_30_0 01379.ENm004_13_0 01434.ENm004_3_0 01609.ENm004_31_0 01611.ENm004_27_0 01857.ENm004_24_0 01924.ENm004_30_0 01961.ENm004_4_1 02148.ENm004_5_0 02198.ENm004_26_1 02227.ENm004_15_1 02282.ENm004_16_0 00929.ENm005_7_0 01008.ENm005_28_0 01021.ENm005_29_1 01031.ENm005_28_0 01127.ENm005_28_0 01219.ENm005_37_1 665 631 561 437 550 612 434 638 492 419 530 481 571 454 490 629 587 611 564 518 446 454 478 582 517 480 565 493 501 472 498 585 479 475 558 482 601 505 505 463 495 508 521 630 615 617 528 520 470 486 512 498 509 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5 5 5 5 5 5 5 11 11 11 11 11 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 21 21 21 21 21 21 117219507 115667186 117382177 116749590 115645428 116491306 115669362 117018519 116557592 116339232 116188999 116509206 116001739 117173141 116205937 115600113 115962693 117304520 115951967 117236495 131423869 131907028 131366775 131722153 131752123 131437023 132229848 116246063 116237178 116205451 116199121 116151060 30388101 31190347 30409225 31237715 30671254 30170555 31732597 31138161 
31096937 31228325 30298758 30346917 31112433 30768781 30885180 32906693 33907635 33936159 33935786 33880157 34242660 117220171 115667816 117382737 116750026 115645977 116491917 115669795 117019156 116558083 116339650 116189528 116509686 116002309 117173594 116206426 115600741 115963279 117305130 115952530 117237012 131424314 131907481 131367252 131722734 131752639 131437502 132230412 116246555 116237678 116205922 116199618 116151644 30388579 31190821 30409782 31238196 30671854 30171059 31733101 31138623 31097431 31228832 30299278 30347546 31113047 30769397 30885707 32907212 33908104 33936644 33936297 33880654 34243168 01272.ENm005_36_0 01303.ENm005_25_0 01671.ENm005_14_0 02020.ENm005_17_0 00992.ENm006_40_1 01083.ENm006_46_0 01131.ENm006_3_1 01251.ENm006_23_0 01306.ENm006_24_1 01343.ENm006_28_1 01469.ENm006_35_0 01478.ENm006_20_1 01660.ENm006_24_1 01908.ENm006_21_0 02000.ENm006_37_0 02262.ENm006_25_0 01173.ENm007_15_0 01288.ENm007_11_0 01295.ENm007_17_0 01331.ENm007_12_1 01365.ENm007_4_1 01367.ENm007_26_0 01418.ENm007_33_0 01645.ENm007_17_0 01648.ENm007_38_1 01720.ENm007_19_1 01743.ENm007_11_0 01772.ENm007_21_0 01853.ENm007_12_1 01900.ENm007_42_1 01965.ENm007_28_0 01992.ENm007_9_0 02035.ENm007_16_0 02196.ENm007_16_0 00861.ENm008_1_1 00928.ENm008_19_0 01081.ENm008_23_0 01116.ENm008_29_1 02055.ENm008_26_0 00883.ENm009_1_1 00949.ENm009_7_0 01067.ENm009_17_1 01349.ENm009_14_0 01423.ENm009_8_0 01641.ENm009_20_1 01770.ENm009_7_0 00963.ENm010_5_0 01117.ENm010_6_1 01125.ENm010_10_1 01254.ENm010_14_0 01275.ENm010_6_1 01305.ENm010_19_1 01679.ENm010_4_0 507 636 532 600 502 469 488 476 476 514 489 424 446 477 489 462 462 495 481 489 444 473 484 593 550 481 507 695 462 496 507 495 556 452 521 625 481 581 503 536 465 594 483 621 496 519 538 485 499 391 455 151 457 21 21 21 21 X X X X X X X X X X X X 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 19 16 16 16 16 16 11 11 11 11 11 11 11 7 7 7 7 7 7 7 34209940 33785786 33319086 33417675 153423240 153632930 152820760 153280254 
153282468 153313399 153366134 153186390 153279544 153236386 153392844 153290745 59355165 59310772 59384085 59324392 59107508 59541623 59676159 59385284 59846726 59396061 59310158 59433353 59310276 59989167 59568182 59297664 59368608 59360164 1113 199637 336353 387909 363937 4965533 5232541 5602336 5487630 5483321 5667179 5232475 27125610 27127656 27152897 27191295 27121114 27248564 27108816 34210446 33786421 33319617 33418274 153423741 153633398 152821247 153280729 153282943 153313912 153366622 153186813 153279989 153236862 153393332 153291206 59355626 59311266 59384565 59324880 59107951 59542095 59676642 59385876 59847275 59396541 59310664 59434047 59310737 59989663 59568688 59298158 59369163 59360615 1633 200261 336833 388489 364439 4966068 5233005 5602929 5488112 5483941 5667674 5232993 27126147 27128140 27153395 27191685 27121568 27248714 27109272 01338.ENm011_12_0 01416.ENm011_7_1 01596.ENm011_9_1 01599.ENm011_24_1 01662.ENm011_24_1 01788.ENm011_3_0 02162.ENm011_5_1 00981.ENm013_9_1 01495.ENm013_7_1 01797.ENm013_7_1 01863.ENm013_3_1 01038.ENm014_1_0 01419.ENr111_2_0 00930.ENr112_1_1 00856.ENr121_3_0 01654.ENr121_3_0 01712.ENr121_1_1 01626.ENr122_4_1 01834.ENr122_5_1 01902.ENr122_5_1 01164.ENr131_1_1 01194.ENr131_1_1 01235.ENr131_6_0 01243.ENr131_1_1 01399.ENr131_5_1 01630.ENr131_7_1 01849.ENr131_3_0 01885.ENr131_1_1 01979.ENr131_1_1 00924.ENr132_12_1 00959.ENr132_12_1 01057.ENr132_9_1 01666.ENr132_12_1 01731.ENr132_4_0 01037.ENr133_3_0 01798.ENr133_5_0 01162.ENr212_1_0 01234.ENr212_1_0 01165.ENr213_1_0 01242.ENr221_3_1 01175.ENr222_2_1 01372.ENr222_4_0 00889.ENr223_12_0 01364.ENr223_10_0 01842.ENr223_4_0 01919.ENr223_5_1 01922.ENr223_11_1 02081.ENr223_6_0 01358.ENr231_2_1 01394.ENr231_6_0 01464.ENr231_13_1 01560.ENr231_7_0 01733.ENr231_13_1 498 471 527 512 533 559 597 661 487 543 558 468 581 448 469 450 645 506 526 500 587 593 492 441 584 572 577 540 505 473 512 491 576 579 521 685 471 478 487 590 519 479 112 565 557 540 564 581 496 467 530 424 502 11 11 11 11 
11 11 11 7 7 7 7 7 13 2 2 2 2 18 18 18 2 2 2 2 2 2 2 2 2 13 13 13 13 13 21 21 5 5 18 5 6 6 6 6 6 6 6 6 1 1 1 1 1 1974812 1858091 1900281 2279416 2280060 1735195 1811789 89870140 89813527 89819624 89712006 126670477 29631676 51801380 118482602 118426372 118288159 59528001 59593182 59570906 234264659 234302098 234427824 234333282 234401530 234554758 234317237 234291811 234265913 112746556 112681545 112604108 112794109 112456883 39558665 39640057 141980997 142045722 23786201 56240339 132264368 132494548 74285803 74218534 74076420 74075577 74245934 74161465 149437256 149566717 149813039 149586262 149778992 1975309 1858561 1900807 2279927 2280592 1735753 1812385 89870800 89814013 89820166 89712563 126670944 29632256 51801827 118483070 118426821 118288803 59528506 59593707 59571405 234265245 234302690 234428315 234333722 234402113 234555329 234317813 234292350 234266417 112747028 112682056 112604598 112794684 112457461 39559185 39640741 141981467 142046199 23786687 56240928 132264886 132495026 74285914 74219098 74076976 74076116 74246497 74162045 149437751 149567183 149813568 149586685 149779493 01760.ENr231_1_0 01280.ENr232_8_1 01547.ENr232_3_1 01732.ENr232_5_1 01756.ENr232_5_1 01168.ENr233_5_0 01228.ENr233_16_0 01446.ENr233_13_0 01807.ENr233_11_0 02082.ENr233_1_0 01773.ENr312_3_0 01837.ENr312_1_1 01494.ENr323_5_1 01699.ENr323_2_1 01879.ENr323_7_1 01958.ENr323_3_0 01514.ENr324_2_1 01541.ENr324_1_0 00968.ENr331_11_0 00988.ENr331_15_1 01337.ENr331_15_1 01348.ENr331_8_1 01647.ENr331_7_0 00181.ENr332_5_1 00948.ENr332_20_0 01134.ENr332_11_0 01452.ENr332_20_0 01493.ENr332_12_0 01510.ENr332_24_0 01554.ENr332_23_0 01570.ENr332_17_0 01618.ENr332_12_0 01619.ENr332_24_0 01726.ENr332_16_0 02089.ENr332_11_0 02146.ENr332_7_0 01062.ENr333_10_0 01691.ENr333_16_0 01768.ENr333_5_0 02210.ENr333_4_0 00927.ENr334_1_1 01250.ENr334_5_0 01274.ENr334_4_0 01369.ENr334_10_0 01475.ENr334_2_1 01659.ENr334_2_1 01781.ENr334_8_1 01938.ENr334_6_0 01663.ENm007_10_1 490 541 472 511 614 477 473 518 457 
508 568 630 530 537 451 530 585 478 584 490 572 504 480 481 477 457 490 478 529 447 647 442 495 535 453 602 660 461 597 578 538 623 497 491 466 492 599 624 520 1 9 9 9 9 15 15 15 15 15 11 11 6 6 6 6 X X 2 2 2 2 2 11 11 11 11 11 11 11 11 11 11 11 11 11 20 20 20 20 6 6 6 6 6 6 6 6 19 149429193 130944212 130786688 130862915 130870934 41797089 41904105 41879447 41826496 41572542 130915807 130745243 108593574 108384842 108722399 108491988 122694309 122694479 220132262 220170439 220178719 220086861 220079834 64114476 64380308 64283934 64379042 64291728 64424932 64417419 64355432 64302734 64427278 64334628 64284699 64133766 33629985 33793500 33489224 33463220 41411076 41767053 41766680 41879620 41670272 41662756 41862871 41823014 59297510 149429682 130944752 130787159 130863425 130871547 41797565 41904577 41879964 41826952 41573049 130916374 130745872 108594103 108385378 108722849 108492517 122694893 122694956 220132845 220170928 220179290 220087364 220080313 64114956 64380784 64284390 64379531 64292205 64425460 64417865 64356078 64303175 64427772 64335162 64285151 64134367 33630644 33793960 33489820 33463797 41411613 41767675 41767176 41880110 41670737 41663247 41863469 41823637 59298030

Supplemental Table 9: Summary of SNP and indel data for each population. “Private alleles” are SNPs found in a single population. “Single heterozygotes” are SNPs in the rarest class, those found on only one chromosome in a single individual. The numbers of SNPs and indels in the “Overall” row are less than the sum of the numbers of SNPs and indels found in each population, because a SNP or indel may be segregating in more than one population.

Population    Individuals    SNPs    Private alleles    Single heterozygotes    Indels
YRI           120            1362    409                256                     86
CHB           73             719     227                192                     48
JPT           72             649     187                136                     44
CEU           75             772     301                191                     55
LWK           90             1327    358                246                     79
Overall       430            2573    1482               1021                    134

Supplemental Table 10: Please see GoodeEtAl_TableS10.xls for the set of 2214 SNPs for which the derived allele could be ascertained.
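The note on the “Overall” row of Supplemental Table 9 can be illustrated with a small sketch. In the table, the per-population SNP counts sum to 4829 while the overall count is 2573, whereas the private-allele counts sum to exactly the overall value of 1482, as expected if each private allele belongs to one population only. The code below reproduces that behaviour on toy data; the variant IDs, the `observed` dictionary and the `summarize` function are hypothetical names introduced for illustration and are not taken from the study.

```python
from collections import Counter

# Hypothetical toy data (not from the study): variant IDs seen in each population.
observed = {
    "YRI": {"snp1", "snp2", "snp3"},
    "CEU": {"snp2", "snp4"},
    "JPT": {"snp2", "snp5"},
}


def summarize(observed):
    """Return per-population counts, per-population private counts, and the overall count."""
    # Number of populations in which each variant is seen.
    n_pops = Counter(v for variants in observed.values() for v in variants)
    per_pop = {pop: len(variants) for pop, variants in observed.items()}
    # A variant is private if it is seen in exactly one population.
    private = {
        pop: sum(1 for v in variants if n_pops[v] == 1)
        for pop, variants in observed.items()
    }
    # The overall count is the size of the union, so shared variants count once.
    overall = len(set().union(*observed.values()))
    return per_pop, private, overall


per_pop, private, overall = summarize(observed)
print(sum(per_pop.values()), overall)  # 7 5
print(sum(private.values()))           # 4
```

Here the shared variant `snp2` is counted three times in the per-population totals but once overall, which is why the per-population sum (7) exceeds the overall count (5), while the private counts add up without double counting.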
Supplemental Table 11: The set of 124 indels for which the derived allele was ascertained. “Start Coordinate” gives the position of the first base of a deletion, or the base 5’ of an insertion site. Allele frequencies are given as percentages. Each indel ID combines the ENCODE region in which the indel was found with its position within that region.

Indel ID          Chromosome  Start Coordinate  Type       Ancestral Allele Frequency  Derived Allele Frequency  Length (bp)
ENm001_1003300    chr7        116601056         Insertion  99.88                       0.12                      1
ENm001_101040     chr7        115698796         Insertion  72.85                       27.15                     1
ENm001_101834     chr7        115699590         Deletion   98.49                       1.51                      2
ENm001_1019220    chr7        116616976         Insertion  24.35                       75.65                     1
ENm001_1059603    chr7        116657359         Deletion   99.88                       0.12                      1
ENm001_1096931    chr7        116694687         Deletion   89.24                       10.76                     1
ENm001_1116305    chr7        116714061         Deletion   33.21                       66.79                     20
ENm001_1117135    chr7        116714891         Deletion   96.47                       3.53                      13
ENm001_1147044    chr7        116744800         Insertion  52.44                       47.56                     1
ENm001_1210727    chr7        116808477         Deletion   99.88                       0.12                      2
ENm001_1309745    chr7        116907501         Deletion   99.88                       0.12                      2
ENm001_135558     chr7        115733314         Deletion   80.58                       19.42                     2
ENm001_1369622    chr7        116967378         Deletion   99.88                       0.12                      8
ENm001_1389126    chr7        116986882         Deletion   99.65                       0.35                      3
ENm001_1389242    chr7        116986998         Insertion  99.53                       0.47                      10
ENm001_1417472    chr7        117015228         Insertion  99.88                       0.12                      1
ENm001_1433168    chr7        117030924         Deletion   99.88                       0.12                      2
ENm001_1436328    chr7        117034084         Insertion  99.77                       0.23                      1
ENm001_1496675    chr7        117094431         Deletion   99.88                       0.12                      31
ENm001_1497669    chr7        117095425         Deletion   99.88                       0.12                      1
ENm001_1607317    chr7        117205073         Deletion   99.4                        0.6                       1
ENm001_1612488    chr7        117210244         Deletion   99.88                       0.12                      2
ENm001_1620567    chr7        117218323         Deletion   0.36                        99.64                     1
ENm001_1691303    chr7        117289059         Deletion   99.88                       0.12                      2
ENm001_1725036    chr7        117322792         Deletion   99.88                       0.12                      4
ENm001_1842400    chr7        117440156         Deletion   97.53                       2.47                      3
ENm001_184364     chr7        115782120         Deletion   99.53                       0.47                      21
ENm001_184666     chr7        115782422         Deletion   99.64                       0.36                      2
ENm001_185147     chr7        115782903         Deletion   3.95                        96.05                     1
ENm001_185286     chr7        115783041         Deletion   94.95                       5.05                      3
ENm001_223039     chr7        115820795         Deletion   98.47                       1.53                      2
ENm001_2713       chr7        115600469         Deletion   90.81                       9.19                      4
ENm001_323370     chr7        115921126         Deletion   99.76                       0.24                      1
ENm001_329989     chr7        115927745         Deletion   90.56                       9.44                      1
ENm001_338353     chr7        115936109         Deletion   99.76                       0.24                      17
ENm001_341673     chr7        115939424         Deletion   99.88                       0.12                      2
ENm001_354639     chr7        115952395         Insertion  96.13                       3.87                      1
ENm001_354719     chr7        115952475         Insertion  98.31                       1.69                      2
ENm001_357336     chr7        115955092         Deletion   97.74                       2.26                      1
ENm001_357413     chr7        115955169         Deletion   98.77                       1.23                      1
ENm001_357557     chr7        115955313         Insertion  36.66                       63.34                     1
ENm001_357866     chr7        115955621         Insertion  99.88                       0.12                      1
ENm001_404411     chr7        116002167         Deletion   99.64                       0.36                      2
ENm001_41344      chr7        115639100         Deletion   55.97                       44.03                     1
ENm001_418776     chr7        116016532         Deletion   95.43                       4.57                      2
ENm001_436021     chr7        116033777         Deletion   99.76                       0.24                      1
ENm001_463370     chr7        116061125         Insertion  0.48                        99.52                     3
ENm001_48217      chr7        115645972         Deletion   86.32                       13.68                     1
ENm001_534840     chr7        116132596         Deletion   99.88                       0.12                      4
ENm001_543766     chr7        116141522         Deletion   99.88                       0.12                      1
ENm001_547458     chr7        116145214         Deletion   99.65                       0.35                      4
ENm001_563759     chr7        116161515         Insertion  3.18                        96.82                     1
ENm001_587421     chr7        116185177         Deletion   99.65                       0.35                      4
ENm001_591581     chr7        116189337         Deletion   98.66                       1.34                      2
ENm001_593923     chr7        116191679         Deletion   98.82                       1.18                      1
ENm001_601329     chr7        116199085         Insertion  93.22                       6.78                      2
ENm001_615365     chr7        116213121         Deletion   99.18                       0.82                      2
ENm001_625404     chr7        116223160         Deletion   99.88                       0.12                      4
ENm001_632768     chr7        116230524         Insertion  99.88                       0.12                      2
ENm001_64298      chr7        115662054         Insertion  71.22                       28.78                     2
ENm001_66351      chr7        115664107         Deletion   98.93                       1.07                      2
ENm001_88096      chr7        115685852         Deletion   56.44                       43.56                     1
ENm001_88303      chr7        115686059         Deletion   99.64                       0.36                      4
ENm001_893723     chr7        116491479         Deletion   93.81                       6.19                      1
ENm002_152927     chr5        131437240         Deletion   99.77                       0.23                      1
ENm002_622925     chr5        131907238         Deletion   98.01                       1.99                      1
ENm004_1004585    chr22       31138538          Insertion  80.92                       19.08                     1
ENm004_1094826    chr22       31228779          Deletion   99.88                       0.12                      2
ENm004_165297     chr22       30299250          Deletion   4.83                        95.17                     1
ENm004_36745      chr22       30170698          Deletion   99.3                        0.7                       3
ENm004_635304     chr22       30769257          Deletion   0.24                        99.76                     1
ENm005_1117639    chr21       33785875          Deletion   0.35                        99.65                     1
ENm005_1118150    chr21       33786386          Deletion   99.53                       0.47                      1
ENm005_650940     chr21       33319176          Deletion   97.56                       2.44                      4
ENm006_419349     chrX        153186840         Insertion  98.14                       1.86                      1
ENm006_514992     chrX        153282473         Deletion   99.17                       0.83                      313
ENm006_655434     chrX        153423278         Deletion   97.54                       2.46                      1
ENm006_865534     chrX        153633378         Deletion   99.76                       0.24                      3
ENm007_286690     chr19       59310274          Deletion   99.19                       0.81                      3
ENm007_287433     chr19       59311017          Deletion   99.88                       0.12                      1
ENm007_345043     chr19       59368626          Insertion  65.34                       34.66                     1
ENm007_345102     chr19       59368685          Insertion  0.85                        99.15                     15
ENm007_360614     chr19       59384198          Deletion   99.77                       0.23                      2
ENm007_362045     chr19       59385629          Deletion   99.65                       0.35                      1
ENm007_372784     chr19       59396370          Insertion  99.04                       0.96                      1
ENm007_410183     chr19       59433767          Deletion   64.69                       35.31                     3
ENm007_518461     chr19       59542045          Insertion  0.24                        99.76                     3
ENm007_823333     chr19       59846917          Deletion   99.88                       0.12                      1
ENm009_496888     chr11       5227883           Deletion   88.54                       11.46                     4
ENm009_501984     chr11       5232979           Deletion   96.57                       3.43                      6
ENm009_756832     chr11       5487827           Insertion  28.8                        71.2                      2
ENm009_871434     chr11       5602429           Deletion   99.64                       0.36                      1
ENm009_936269     chr11       5667264           Insertion  99.53                       0.47                      3
ENm009_936586     chr11       5667581           Insertion  99.88                       0.12                      1
ENm010_197155     chr7        27121200          Deletion   94.79                       5.21                      2
ENm010_201661     chr7        27125706          Deletion   99.77                       0.23                      1
ENm010_204029     chr7        27128074          Deletion   99.88                       0.12                      1
ENm010_324951     chr7        27248996          Deletion   5.19                        94.81                     1
ENr122_116072     chr18       59528372          Deletion   0.12                        99.88                     1
ENr131_108512     chr2        234265075         Deletion   66.12                       33.88                     5
ENr131_145749     chr2        234302311         Insertion  93.26                       6.74                      14
ENr132_119389     chr13       112457453         Insertion  0.72                        99.28                     4
ENr133_395759     chr21       39640199          Deletion   23.32                       76.68                     1
ENr133_395774     chr21       39640239          Deletion   99.87                       0.13                      5
ENr212_100955     chr5        141981105         Deletion   39.11                       60.89                     2
ENr212_165801     chr5        142045951         Deletion   14.82                       85.18                     1
ENr222_46093      chr6        132264632         Insertion  73.67                       26.33                     1
ENr222_46233      chr6        132264772
Deletion 74.81 25.19 59 ENr223_429072 chr6 74219024 Insertion 99.74 0.26 2 ENr231_12970 chr1 149437654 Insertion 97.18 2.82 1 ENr231_142418 chr1 149567102 Deletion 99.88 0.12 1 ENr231_354372 chr1 149779056 Deletion 91.51 8.49 35 ENr231_354628 chr1 149779312 Deletion 99.88 0.12 4 ENr231_354683 chr1 149779367 Deletion 99.88 0.12 1 ENr232_137971 chr9 130863093 Deletion 99.6 0.4 13 ENr232_146181 chr9 130871303 Deletion 99.88 0.12 1 ENr232_219141 chr9 130944263 Deletion 98.22 1.78 2 ENr233_359735 chr15 41879823 Insertion 98.35 1.65 1 ENr323_120959 chr6 108492355 Deletion 97.29 2.71 5 ENr323_222491 chr6 108593887 Insertion 99.53 0.47 1 ENr331_146953 chr2 220132542 Deletion 99.39 0.61 1 ENr332_362186 chr11 64303074 Deletion 99.88 0.12 1 ENr332_484516 chr11 64425404 Deletion 99.64 0.36 1 ENr333_488867 chr20 33793795 Deletion 99.76 0.24 1 Supplemental References Asthana S, Roytberg M, Stamatoyannopoulos J, Sunyaev S. 2007. Analysis of sequence conservation at nucleotide resolution. PLoS Comput Biol 3:e254. doi:10.1371/journal.pcbi.0030254 Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED et al. 2004. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res 14: 708-715. Boyko AR, Williamson SH, Indap AR, Degenhardt JD, Hernandez RD, Lohmueller KE, Adams MD, Schmidt S, Sninsky JJ, Sunyaev SR, et al. 2008. Assessing the evolutionary impact of amino acid mutations in the human genome. PLoS Genet 4: e1000083. doi: 10.1371/journal.pgen.1000083. Chen CT, Wang JC, Cohen BA. 2007. The strength of selection on ultraconserved elements in the human genome. Am J Hum Genet 80: 692-704. Cooper GM, Stone EA, Asimenos G, Green ED, Batzoglou S, Sidow A. 2005. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res 15: 901–913. Cooper SJ, Trinklein ND, Anton ED, Nguyen L, Myers RM. 2006. 
Comprehensive analysis of transcriptional promoter structure and function in 1% of the human genome. Genome Res 16: 1-10. Drake JA, Bird C, Nemesh J, Thomas DJ, Newton-Cheh C, Reymond A, Excoffier L, Attar H, Antonarakis SE, Dermitzakis ET et al. Conserved noncoding sequences are selectively constrained and not mutation cold spots. 2006. Nat. Genet. 38: 223-227. Eddy SR 2005. A Model of the Statistical Power of Comparative Genome Sequence Analysis. PLoS Biol 3: e10. doi:10.1371/journal.pbio.0030010 Ewing B, Green P. 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8: 186-194. Ewing B, Hillier L, Wendl MC, Green P. 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 8: 175-185. Gordon D, Abajian C, Green P. 1998. Consed: a graphical tool for sequence finishing. Genome Res 8: 195-202. Karolchik D, Kuhn RM, Baertsch R, Barber GP, Clawson H, Diekhans M, Giardine B, Harte RA, Hinrichs AS, Hsu F et al. 2008. The UCSC Genome Browser Database: 2008 update. Nucl. Acids Res. 36 (Database issue): D773-779. Katzman S, Kern AD, Bejerano G, Fewell G, Fulton L, Wilson RK, Salama SR, Haussler D. 2007. Human genome ultraconserved elements are ultraselected. Science 317: 915. Lohmueller KE, Indap AR, Schmidt S, Boyko AR, Hernandez RD, Hubisz MJ, Sninsky JJ, White TJ, Sunyaev SR, Nielsen R, et al. 2008. Proportionally more deleterious genetic variation in European than in African populations. Nature 451: 994–997. Margulies EH, Cooper GM, Asimenos G, Thomas DJ, Dewey CN, Thomas DJ, Dewey, CN, Siepel A, Birney E, Keefe D et al. 2007. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res 17: 760-774. Marth GT, Czabarka E, Murvai J, Sherry ST. 2004. The allele frequency spectrum in genome-wide human variation data reveals signals of differential demographic history in three large world populations. Genetics 166: 351-372. 
Rozen S and Skaletsky. HJ 2000. Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S (eds) Bioinformatics Methods and Protocols: Methods in Molecular Biology. Humana Press, Totowa, NJ, pp 365-386. Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S,et al. 2005. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15: 1034-1050. Smit AFA, Hubley R and Green P. 1996-2004. RepeatMasker Open-3.0. <http://www.repeatmasker.org>. Stephens M, Sloan JS, Robertson PD, Scheet P, Nickerson DA. 2006. Automating sequence-based detection and genotyping of SNPs from diploid samples. Nat. Genet. 38: 375-81. The ENCODE Consortium. 2007. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447: 799-816. Watterson GA. 1975. On the number of segregation sites. Theoretical Population Biology 7: 256-276.