Supporting Information

Supporting Tables

Table S1: Strains and Sequence Data Used in This Study

| Strain Name | Organism | GenBank Accession Number | Source |
| --- | --- | --- | --- |
| CCMV strain Heberling | Panine herpesvirus 2 | NC_003521.1 | Throat Swab |
| RhCMV strain 68-1 | Macacine herpesvirus 3 | NC_006150.1 | Urine |
| RhCMV Strain 180.92 | Macacine herpesvirus 3 | DQ120516 | Lung Tissue |
| Strain 3301 | Human herpesvirus 5 | GQ466044.1 | Urine |
| Strain 6397 | Human herpesvirus 5 | JX512197.1 | Urine |
| Strain BE/9/2010 | Human herpesvirus 5 | KC519319.1 | Urine |
| Strain BE/10/2010 | Human herpesvirus 5 | KC519320.1 | Urine |
| Strain BE/11/2010 | Human herpesvirus 5 | KC519321.1 | Urine |
| Strain BE/21/2010 | Human herpesvirus 5 | KC519322.1 | Urine |
| Strain BE/27/2010 | Human herpesvirus 5 | KC519323.1 | Urine |
| Strain Davis | Human herpesvirus 5 | JX512198.1 | Liver |
| Strain HAN | Human herpesvirus 5 | KJ426589.1 | Unknown |
| Strain HAN1 | Human herpesvirus 5 | JX512199.1 | Bronchoalveolar Lavage |
| Strain HAN2 | Human herpesvirus 5 | JX512200.1 | Bronchoalveolar Lavage |
| Strain HAN3 | Human herpesvirus 5 | JX512201.1 | Bronchoalveolar Lavage |
| Strain HAN8 | Human herpesvirus 5 | JX512202.1 | Bronchoalveolar Lavage |
| Strain HAN12 | Human herpesvirus 5 | JX512203.1 | Bronchoalveolar Lavage |
| Strain HAN16 | Human herpesvirus 5 | JX512204.1 | Urine |
| Strain HAN19 | Human herpesvirus 5 | JX512205.1 | Bronchoalveolar Lavage |
| Strain HAN22 | Human herpesvirus 5 | JX512206.1 | Bronchoalveolar Lavage |
| Strain HAN28 | Human herpesvirus 5 | JX512207.1 | Bronchoalveolar Lavage |
| Strain HAN31 | Human herpesvirus 5 | JX512208.1 | Bronchoalveolar Lavage |
| Strain JHC | Human herpesvirus 5 | HQ380895.1 | Blood |
| Strain JP | Human herpesvirus 5 | GQ221975.1 | Prostate |
| Strain PAV16 | Human herpesvirus 5 | KJ872539.1 | Amniotic Fluid |
| Strain PAV18 | Human herpesvirus 5 | KJ872540.1 | Amniotic Fluid |
| Strain PAV21 | Human herpesvirus 5 | KJ872542.1 | Amniotic Fluid |
| Strain Toledo | Human herpesvirus 5 | GU937742.1 | Urine |
| Strain TR | Human herpesvirus 5 | KF021605.1 | Vitreous Humor |
| Strain U01 FINAL | Human herpesvirus 5 | JN379814 | Urine |
| Strain U04 FINAL | Human herpesvirus 5 | JN379815 | Urine |
| Strain U33 FINAL | Human herpesvirus 5 | JN379816 | Urine |
| Strain UKNEQAS1 | Human herpesvirus 5 | KJ361971.1 | Urine |
| Strain VR1814 | Human herpesvirus 5 | GU179289.1 | Cervical Secretions |
| Strain Merlin | Human herpesvirus 5 | NC_006273.2 | Urine |
| Strain HAN20 | Human herpesvirus 5 | GQ396663.1 | Bronchoalveolar Lavage |
| Strain HAN38 | Human herpesvirus 5 | GQ396662.1 | Bronchoalveolar Lavage |
| Strain 3157 | Human herpesvirus 5 | GQ221974.1 | Urine |
| Strain HAN13 | Human herpesvirus 5 | GQ221973.1 | Bronchoalveolar Lavage |
| Strain PAV20 | Human herpesvirus 5 | KJ872541.1 | Amniotic Fluid |
| Strain TB/40E | Human herpesvirus 5 | EF999921 | Throat Wash |
| Strain AF1 | Human herpesvirus 5 | GU179291.1 | Amniotic Fluid |
| Strain U11 | Human herpesvirus 5 | GU179290.1 | Urine |
| Strain U8 | Human herpesvirus 5 | GU179288.1 | Urine |

Table S2: Linear model fit of intraspecies diversity model^a

| | Estimate | Standard Error | t | Pr(>\|t\|) |
| --- | --- | --- | --- | --- |
| Intercept | -1.87962 | 0.06348 | -29.61 | 2 x 10^-16 |
| Divergence | 0.66219 | 0.10703 | 6.187 | 1.71 x 10^-09 |
| Recombination Rate | 0.16581 | 0.03164 | 5.241 | 2.75 x 10^-07 |

a. F-statistic: 45.75 on 2 and 470 DF, p-value < 2.20 x 10^-16, Adjusted R^2 = 0.2018

Figure S1: HCMV genome-wide recombination rates. Population-scaled recombination rates (2Ner) were calculated in 500 base pair windows, log transformed and plotted by genomic coordinate.

Figure S2: Log transformation of intraspecies diversity (π). Distribution of intraspecies diversity (π) prior to (A) and after log transformation (C). Normal Q-Q plots of untransformed (B) and log transformed (D) intraspecies diversity.

Figure S3: Log transformation of population-scaled recombination rates (2Ner). Distribution of population-scaled recombination rates (2Ner) prior to (A) and after log transformation (C). Normal Q-Q plots of untransformed (B) and log transformed (D) population-scaled recombination rates (2Ner).

Figure S4: Log transformation of interspecies divergence (Dxy). Distribution of interspecies divergence (Dxy) prior to (A) and after log transformation (C).
Normal Q-Q plots of untransformed (B) and log transformed (D) interspecies divergence (Dxy).

Figure S5: CLR values from neutral simulations. Presented are results from two sets of neutral simulations. The first set consisted of 1000 simulations performed with the ms program, with values of theta and rates of recombination set equal to the HCMV genome-wide averages and a constant population size. The second set of simulations was similar to the first but included a 99% reduction of population size 500 generations in the past followed by exponential growth to the current size. Both sets of results are presented on the same scale for ease of comparison. Red lines indicate the 99% significance threshold obtained from the simulated data.

Figure S6: Correlation of HCMV intraspecies diversity, interspecies divergence and recombination rates from a thinned dataset. Intraspecies diversity (π), interspecies divergence (Dxy), and population recombination rates (2Ner) were calculated in 500 bp windows. A thinned dataset was selected by including only windows located every 5,000 bp across the genome. Scatter plots of (A) population recombination rates (2Ner) and intraspecies diversity (π) (Pearson's R = 0.44, P = 1.6 x 10^-03); (B) interspecies divergence (Dxy) and intraspecies diversity (π) (Pearson's R = 0.49, P = 2.8 x 10^-04); (C) population recombination rates (2Ner) and interspecies divergence (Dxy) (Pearson's R = 0.35, P = 1.3 x 10^-02); and (D) the corrected levels of intraspecies diversity (π/Dxy) and population recombination rates (2Ner) (Pearson's R = 0.28, P = 6.0 x 10^-02). Blue lines represent linear regressions of the data and grey shading indicates the 95% confidence intervals. All statistics were log transformed prior to calculation of correlations and plotting.

Figure S7: Haplotype Maps of Putative Regions of Selective Sweeps.
Shown are haplotype maps of 100 bp regions within the putative selective sweep regions identified in Figure 6. The maps are shown for the regions corresponding to (A) lncRNA2.7, (B) lncRNA4.9, and (C) lncRNA5. For each panel, the top sequence is CCMV and all sequences below are the HCMV sequences analyzed. Mismatches within the alignment are colored (G = Yellow, C = Blue, A = Red, T = Green).
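
The thinning and correlation procedure described for Figure S6 can be illustrated with a minimal sketch. It is not the authors' code. It assumes non-overlapping 500 bp windows ordered along the genome, and it reads "every 5,000 bp" as keeping every tenth window starting from the first. Windows where either statistic is zero are dropped before the log transform, which is a further assumption. The function names are hypothetical, and only Pearson's R is computed, not the P value.

```python
import math


def thin_indices(n_windows, window_bp=500, spacing_bp=5000):
    """Indices of windows kept after thinning.

    Assumption: windows are contiguous and non-overlapping, so keeping one
    window every `spacing_bp` means keeping every (spacing_bp // window_bp)-th.
    """
    step = spacing_bp // window_bp
    return list(range(0, n_windows, step))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def thinned_log_correlation(stat_a, stat_b, window_bp=500, spacing_bp=5000):
    """Thin two per-window statistics, log transform them, and correlate.

    Returns (Pearson's R, number of windows used). Windows with a
    non-positive value are skipped because their log is undefined
    (an assumption; the source does not say how such windows were handled).
    """
    keep = thin_indices(len(stat_a), window_bp, spacing_bp)
    pairs = [
        (math.log10(stat_a[i]), math.log10(stat_b[i]))
        for i in keep
        if stat_a[i] > 0 and stat_b[i] > 0
    ]
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    return pearson_r(xs, ys), len(pairs)
```

The base of the logarithm does not affect Pearson's R, since changing the base only rescales both variables by a constant. The same function could be applied to any pair of per-window statistics, for example π and 2Ner, or π/Dxy and 2Ner.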