Supplementary Text

Introduction

The longitudinal expanse of land stretching from Central Europe to the Pacific Ocean has witnessed numerous demographic movements in both the prehistoric and historic past.1,2 Prehistoric dispersals include a Bronze Age migration from west to east (as substantiated by east/west admixture of mtDNA haplotypes and the presence of a West Eurasian paternal lineage, R1a1a-M198, in East Eurasia)3 and the Neolithic diffusion of agriculture along with the seasonal nomadism of hunters and herders.1,2 Sections of this vast region were occupied at various times by many different populations including the Persians4, Macedonians5, Graeco-Bactrians6, Parthians7, Indians, and Han Chinese8, among others. Further, in historical times, this region also encompassed a network of trade corridors known collectively as the Silk Road8,9, established during the Han Dynasty (206 BC to 220 AD), that enabled the trans-continental movement of both goods and people.

Results

The YSNP markers that were found to be non-polymorphic were as follows. Ancestral: M356, M93, M407, P53.1, P20, P76, P16, P286, M370, M52, M197, M39, M97, APT, P254, M258, M47, M67, M68, M319, M339, M419, M340, M12, PK3, M128, P67, MEH2, M25, M323, M3, and S116. The YSNP markers that were found to be informative (i.e., derived) were: N1, P99, M217, M357, M178, M134, M479, P25, M335, SRY1532, Z93, and M479.

Discussion

Haplogroups D, H and R exhibit the highest gene diversity or Vp indices (0.97, 0.71 and 0.67, respectively) as well as the oldest genealogical time estimates (TE) (17 ± 7.5, 13 ± 2.8 and 12 ± 1.5 kya, respectively) in Ladakh. These high Vp indices and large TE values likely reflect multiple migration events from different source populations. O appears to be less diverse (0.42, TE Gen of 7.2 ± 1.2 kya). This inference is also supported by the results of the CA plot, in which Ladakh segregates at an intermediate position between the East Asian and West Asian clusters.
Extensive inter-population gene flow among West/South/Central Asian populations is also suggested by the MDS analysis. The Euclidean proximity to both the West Asian (GAR, LAV) and South Asian (LIN, VOK, J&K, KAT) populations is also congruent with an East-West genetic continuity. Further, the lack of geographical partitioning and the limited number of mutational steps connecting haplotypes in all networks are congruent with extensive gene flow resulting from various demographic episodes. Altogether, our data suggest that the paternal ancestry of Ladakh is a genetically diverse mosaic resulting from a multitude of migrations at various time intervals from different sources, a conclusion compatible with major demographic episodes such as the Silk Road.

When discussing the haplogroup distribution of the Ladakh region, it is important to keep in mind that the deep-branching haplogroups (i.e., C, D, F, H) emerged well before the last ice age, whereas this high-altitude region was colonized more recently by expanding populations (with postglacial coalescent estimates). From the TE/gene diversity data (Table S3), it is apparent that haplogroups D, H and R exhibit the highest gene diversity or Vp indices (0.97, 0.71 and 0.67, respectively) as well as the oldest genealogical time estimates (TE) (17 ± 7.5, 13 ± 2.8 and 12 ± 1.5 kya, respectively), whereas O appears to be less diverse (0.42, TE Gen of 7.2 ± 1.2 kya). Although it is tempting to connect the order of the Vp and TE values with the relative age of the haplogroup source population, the resulting conclusions may be misleading, since high Vp indices and large TE values may also reflect multiple migration events from different seed populations. Thus, at this point, it is difficult to gauge which of the three most diverse haplogroups represents the earliest inhabitants, since multiple episodes of gene flow can artificially increase estimates of the time to the most recent common ancestor.
Further, although the low gene diversity and relatively small TE values may indicate that O is a more recent introduction into Ladakh compared to the Y haplogroups D, R and H, a major founder effect or severe genetic bottleneck impacting the genetic diversity of the O haplogroup may also account for the differences. In addition, interpretation is constrained by the coarse level of resolution and wide time frames due to the lack of distinguishing Y chromosomal haplotypes as well as the limited availability of sample data.

It is interesting that all of the Middle Eastern and West Asian groups and four of the five western-most South Asian groups (LIN, PAK, PUN and VOK) cluster in the upper right of the CA plot (Figure 2 of main text). This is due, most likely, to the abundance of R (27% to 69%) and J (4% in VOK to 58% in IRQ) haplogroups combined with the near or total absence of O-specific markers.

The diversity and time estimates should be viewed with caution since a small sample size may inflate Vp values. Thus, the high diversity index of H* may be a consequence of a small sample size (N = 5) as opposed to an intrinsically diverse set of H* lineages. With a few exceptions, including that of H* (40.6 ± 12.4 kya), the relative order of the TE values, in general, follows that of the Vp indices. D1a-N1 (N = 58, TE Evo: 42.2 ± 16.9 kya and TE Gen: 16.3 ± 6.5 kya) is estimated to be the oldest haplogroup and D3-P99 (N = 19, TE Evo: 10.2 ± 2.0 kya and TE Gen: 3.9 ± 0.8 kya) the youngest.

Conclusions

Ladakh is a remote region within the Himalayan range, yet despite its location it exhibits extreme Y chromosomal diversity. Ladakh, Southern Iran and Pakistan are the most genetically heterogeneous of all Asian populations examined in this study. It is also interesting that these three populations lie in a region of what seems to be a genetic confluence, geographically located in a West to East Asian corridor.
Another interesting discovery from our study is that the four major polymorphic Y chromosomal haplogroups detected in Ladakh (O = 19%, D = 32%, R = 22%, and H = 11%) are each representative of a different geographical region of Asia (East, Central, West and South, respectively). These distribution patterns are also evident in the Y haplogroup contour gradient maps. Also, haplogroups D, H and R exhibit the highest gene diversity or Vp indices (0.97, 0.71 and 0.67, respectively) as well as the oldest genealogical time estimates (TE) (17 ± 7.5, 13 ± 2.8 and 12 ± 1.5 kya, respectively) in Ladakh. These high Vp indices and large TE values likely reflect multiple migration events from different source populations. O appears to be less diverse (0.42, TE Gen of 7.2 ± 1.2 kya). This inference is also supported by the results of the CA plot, in which Ladakh segregates at an intermediate position between the East Asian and West Asian clusters. Extensive inter-population gene flow among West/South/Central Asian populations is also suggested by the MDS analysis (Supplementary Figure 8 of main text). The Euclidean proximity to both the West Asian (GAR, LAV) and South Asian (LIN, VOK, J&K, KAT) populations is also congruent with an East-West genetic continuity. Further, the lack of geographical partitioning and the limited number of mutational steps connecting haplotypes in all networks are congruent with extensive gene flow resulting from various demographic episodes. Altogether, our data suggest that the paternal ancestry of Ladakh is a genetically diverse mosaic resulting from a multitude of migrations at various time intervals from different sources, a conclusion compatible with major demographic episodes such as the Silk Road.

References

1 Frachetti MD: Pastoralist Landscapes and Social Interaction in Bronze Age Eurasia. Berkeley: University of California Press, 2008.
2 Frachetti MD: The Multi-Regional Emergence of Mobile Pastoralism and the Growth of Non-Uniform Institutional Complexity Across Eurasia. Current Anthropology 2012; 53: 2-38.
3 Li C, Li H, Cui Y et al: Evidence that a West-East admixed population lived in the Tarim Basin as early as the early Bronze Age. BMC Biology 2010; 8: 15.
4 Dandamayev MA: Media and Achaemenid Iran; in History of Civilizations of Central Asia, Volume II, edited by J Harmatta. Paris: UNESCO Publishing, 1994.
5 Dani AH and Bernard P: Alexander and His Successors in Central Asia; in History of Civilizations of Central Asia, Volume II, edited by J Harmatta. Paris: UNESCO Publishing, 1994.
6 Bernard P: The Greek Kingdoms of Central Asia; in History of Civilizations of Central Asia, Volume II, edited by J Harmatta. Paris: UNESCO Publishing, 1994.
7 Koshelenko GA and Pilipko VN: Parthia; in History of Civilizations of Central Asia, Volume II, edited by J Harmatta. Paris: UNESCO Publishing, 1994.
8 Yong M and Yutang S: The Western Regions under the Hsiung-nu and the Han; in History of Civilizations of Central Asia, Volume II, edited by J Harmatta. Paris: UNESCO Publishing, 1994.
9 Thorley J: The Silk Trade between China and the Roman Empire at Its Height, Circa A.D. 90-130. Greece and Rome 1971; 18(1): 71-80.