Supplementary Materials

Selection of autochthonous surnames

A list of selected surnames for each community was compiled through extended archival research. All surnames known to have been present in each community between the first occurrence of surnames and the year 1575 were compiled from 'buitenpoorter' (external burgher) lists of several Flemish cities (e.g. the lists of Aalst and Geraardsbergen) and from real property or notarial records in archived documents specific to each region, e.g. the so-called 'Penningkohieren' for Idegem, the 'Gichten' in Limburg (Alken and Tongeren) and the 'Ommelopers' for West-Flanders (Snellegem and Oudenburg). For some regions, large parts of these surname lists had already been compiled by earlier historical studies, as for Oudenburg [1] and Velzeke [2]. Finally, surnames were considered autochthonous within the regional samples when the name occurred before the year 1575 in the regional archival documents, as documented in Flemish anthroponymical sources [3].

Selection of communities

Six villages and towns, hereafter referred to as 'communities', were selected within contemporary Flanders based on their geography and historical development (Fig. 1). The six communities are grouped geographically into three pairs: a pair by the coast in the province of West-Flanders, namely Oudenburg and Snellegem; a pair in central Flanders in the province of East-Flanders, namely Velzeke and Idegem; and a pair in the easternmost part of Flanders in the province of Limburg, namely Tongeren and Alken (Fig. 1).
Within each of these three pairs, one locality is well known to have been populated since the Roman period (58 BC - circa 410 AD; hereafter the 'Gallo-Roman' or 'GR' research group): Oudenburg (presumably the portus Epatiacus mentioned in the early 5th century in the Notitia Dignitatum), where an important Roman castellum with an accompanying settlement was situated [4]; Velzeke (presumably Feliciacum), a well-known small town or vicus at the crossroads of two important Roman roads [2,5]; and Tongeren, the only Roman town (Municipium Tungrorum) within the borders of contemporary Flanders [6]. The other community in each pair is known as a settlement that developed mainly from the Early Middle Ages onwards (hereafter the 'Early medieval' or 'EM' research group): Snellegem, a known Carolingian fiscus (Fiscus Snetlinghehem) with indications of Norman invasions [7,8]; and Idegem and Alken, both mentioned in several archival records since the end of the 10th century [9,10].

Y-chromosome genotyping

A buccal swab sample was collected from each selected participant, and DNA was extracted using the Maxwell 16 System (Promega Corporation, Madison, WI, USA), followed by real-time PCR quantification (Quantifiler Human DNA kit, Thermo Fisher Scientific, Waltham, Massachusetts, USA). Y-STR loci were genotyped for all samples as described in previous studies [11,12], except that the recently developed PowerPlex® Y23 System (Promega Corporation) was used instead of PowerPlex® Y [13,14]. As a result, a set of 42 Y-STR loci was genotyped for these samples instead of the 38 Y-STRs used in the previous studies on Y chromosomal diversity in Flanders [11,12]. For all individuals that showed non-amplified loci, the whole process was repeated with new primer sets to exclude technical errors or mutations in the standard primer positions. All haplotypes were submitted to Whit Athey's Haplogroup Predictor [15] to obtain probabilities for the inferred haplogroups.
The samples were assigned to specific Y-SNP assays to confirm the inferred haplogroup and to assign the sub-haplogroup according to the Y chromosomal tree used in previous studies on the Flemish population, so that the community samples could be compared with earlier genotyped regional data [16]. The sub-haplogroups were named using the nomenclature proposed in van Oven et al. [17]. A total of 17 multiplex systems covering 120 Y-SNPs were developed using SNaPshot mini-sequencing assays (Thermo Fisher Scientific) according to previously published protocols [18,19]. All primer sequences and concentrations for the analysis of the Y-SNPs are available from the authors upon request.

Relatedness analysis

Differences in the rate of positive Y chromosomal matches (i.e. the ratio of the number of positive Y chromosomal matches to the total number of combinations) between three types of couples of DNA donors were calculated. The subsets of DNA donors are defined in Figure A.

Figure A Representation of all DNA donors in the relatedness analysis: the donors within 'communities' A-F; the donors within the 'regions' G (defined here without the donors of A and B), H (defined here without the donors of C and D) and I (defined here without the donors of E and F); and the donors within 'Flanders' J (defined here without the donors of A, B, C, D, E, F, G, H and I).

The three types of couples of DNA donors are:

1. All couples with a DNA donor of a community and another DNA donor of the same community, i.e. all couples of DNA donors within subsets A, B, C, D, E and F (defined as in the legend of Figure A). In total there were 5,376 combinations.

2. All couples with a DNA donor of a community and another DNA donor of the region to which the community belongs (including the DNA donors of the other community of the region but excluding the DNA donors of its own community), i.e. all couples of DNA donors with one from subset A and one from subset B (A & B); A & G; B & G; C & D; C & H; D & H; E & F; E & I; F & I (defined as in the legend of Figure A). There were 27,166 combinations.

3. All couples with DNA donors of Flanders (excluding couples of DNA donors from the same region), i.e. all couples of DNA donors with one from A and one from C (A & C); A & D; A & E; A & F; A & H; A & I; A & J; B & C; B & D; B & E; B & F; B & H; B & I; B & J; C & E; C & F; C & G; C & I; C & J; D & E; D & F; D & G; D & I; D & J; E & J; F & J; G & H; H & I; G & I; G & J; H & J; I & J; and all couples of DNA donors within J (defined as in the legend of Figure A). There were 153,227 combinations.

Results of the redundancy analysis (RDA)

The RDA was performed with the R software and the vegan package [20] to study the influence of geography (the three regions NW-Flanders, SE-Flanders and S-Brabant) versus history (GR and EM research groups) on the distribution of the Y chromosomal lineages in the six communities. The RDA had little statistical power because of the low number of degrees of freedom. Nevertheless, the difference between the results of Model 1 and Model 2 gave clear indications that geography, rather than the history of the communities, influenced the distribution of the Y chromosomal lineages (Table A).

Table A Results of the RDA.

Sub-haplogroups model   Constrained variables   R²        R² adj                 P value
Model 1                 History                 19.58 %   0.00 % (neg. result)   0.5033
Model 2                 Region                  61.06 %   35.10 %                0.0656

R²: the coefficient of determination; R² adj: adjusted R², or the variance in frequencies of haplotypes explained by the constrained variables.

Supplementary figure

Figure S1 Rarefaction curves of the Y chromosomal diversity of the four regions (NW-Flanders, SE-Flanders, S-Brabant & S-Limburg) and six communities under study.
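As a rough check on the adjusted R² values reported in Table A, the sketch below applies the standard (Ezekiel) adjustment to the unadjusted R² values. The source does not state which adjustment was used, so the formula is an assumption, as are the inputs chosen here: six observations (the six communities), one constraining variable for History (GR versus EM) and two for Region (three regions coded as two dummy variables). The function name `adjusted_r2` is illustrative only.

```python
def adjusted_r2(r2: float, n_obs: int, n_constraints: int) -> float:
    """Ezekiel's adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - m - 1)."""
    residual_df = n_obs - n_constraints - 1
    if residual_df <= 0:
        raise ValueError("not enough observations for this many constraints")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / residual_df


# Assumption: the six communities are the observations of the RDA.
N_COMMUNITIES = 6

# Unadjusted R^2 values are taken from Table A; the number of
# constraining variables per model is an assumption (see lead-in).
models = {
    "Model 1 (History)": (0.1958, 1),  # GR vs EM -> one dummy variable
    "Model 2 (Region)": (0.6106, 2),   # three regions -> two dummy variables
}

for name, (r2, n_constraints) in models.items():
    adj = adjusted_r2(r2, N_COMMUNITIES, n_constraints)
    print(f"{name}: R2 = {r2:.2%}, adjusted R2 = {adj:.2%}")
```

Under these assumptions the adjustment gives a slightly negative value for Model 1 (about -0.5 %), which fits the "0.00 % (neg. result)" entry, and 35.10 % for Model 2. With only six observations the penalty term (n - 1) / (n - m - 1) is large, which illustrates why the analysis has so few degrees of freedom.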
Supplementary tables

Table S1 Overview of all criteria applied in the sampling protocol to select or exclude DNA donors in order to obtain the final dataset for the six selected communities, the four regions and Flanders. ORPA: oldest reported paternal ancestor; EPP: extra-pair paternity; Y-chr: Y chromosomal.

Criterion              Community                                                Region / Flanders
Surname criteria       Occurrence of surname in community before 1575           Occurrence of surname in region/Flanders before 1575
                       No toponym in surname from outside community             No toponym in surname from outside region/Flanders
                       No surname in dialect/language from outside community    No surname in dialect/language from outside region/Flanders
ORPA criteria          ORPA lived in community (radius 5 km) before 1800        ORPA lived in region/Flanders before 1800
                       ORPA is not a foundling or illegitimate child            ORPA is not a foundling or illegitimate child
EPP criterion          No donors with same ORPA but no Y-chr match              No donors with same ORPA but no Y-chr match
Patrilineal criterion  One donor per couple with same surname and Y-chr match   One donor per couple with same surname and Y-chr match

Table S2 Results for the nine independent couples of DNA donors with a genealogical common ancestor (GCA). As the individuals within each couple were assigned to the same subhaplogroup at the highest phylogenetic resolution and their haplotypes showed no more than seven Y-STR differences out of 38 Y-STR loci, the GCA is also taken to be the biological common ancestor (BCA).

Pair  Number of meioses  Subhaplogroup  Y-STR differences  GCA = BCA?
1     7                  R-U106*        2                  Yes
2     7                  J-M241*        3                  Yes
3     9                  R-Z195*        1                  Yes
4     9                  R-M529*        0                  Yes
5     10                 R-P312*        0                  Yes
6     10                 R-L48          0                  Yes
7     15                 R-L2*          2                  Yes
8     17                 E-V13*         1                  Yes
9     17                 R-U198         0                  Yes

Table S3 Distribution (N) and frequency (f, in %) of the Y chromosomal subhaplogroups within the six selected communities and four regions within Flanders. The sub-haplogroups were named using the nomenclature proposed in van Oven et al. [17].
(Sub-)haplogroup    Oudenburg    Snellegem       Idegem      Velzeke     Tongeren        Alken  NW-Flanders  SE-Flanders    S-Brabant    S-Limburg
                      N     f      N     f      N     f      N     f      N     f      N     f      N     f      N     f      N     f      N     f
E-M34*                0   0.0      1   2.0      0   0.0      0   0.0      0   0.0      0   0.0      1   1.5      1   0.8      1   0.6      1   1.4
E-V12*                0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   0.6      0   0.0
E-V13                 0   0.0      0   0.0      1   2.3      1   2.8      3   7.0      3   6.5      1   1.5      1   0.8      9   5.7      1   1.4
E-V65                 0   0.0      0   0.0      0   0.0      1   2.8      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0
E-M123*               0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   1.4
E-M215*               0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   0.8      0   0.0      0   0.0
E-M81*                0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      2   2.8
E-V22*                0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   0.6      1   1.4
G-P15*                0   0.0      0   0.0      0   0.0      0   0.0      1   2.3      0   0.0      0   0.0      0   0.0      0   0.0      2   2.8
G-U8*                 0   0.0      0   0.0      1   2.3      1   2.8      0   0.0      1   2.2      2   2.9      4   3.3      4   2.5      1   1.4
G-M406*               0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   1.4
I-M223*               0   0.0      1   2.0      1   2.3      0   0.0      1   2.3      1   2.2      1   1.5      5   4.2      6   3.8      3   4.2
I-M227*               0   0.0      0   0.0      0   0.0      0   0.0      1   2.3      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0
I-M253*               6  16.7      7  14.3      9  20.9      6  16.7      3   7.0      2   4.3      8  11.8     23  19.2     14   8.9      8  11.1
I-P215*               1   2.8      1   2.0      0   0.0      1   2.8      1   2.3      1   2.2      2   2.9      1   0.8      6   3.8      0   0.0
I-P37.2               0   0.0      1   2.0      0   0.0      0   0.0      0   0.0      1   2.2      0   0.0      3   2.5      1   0.6      0   0.0
I-P109                0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      2   2.9      1   0.8      2   1.3      0   0.0
I-P95                 0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   0.6      1   1.4
I-M284                0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   0.6      0   0.0
J-M12*                0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   2.2      0   0.0      0   0.0      0   0.0      0   0.0
J-M241*               0   0.0      0   0.0      0   0.0      0   0.0      1   2.3      1   2.2      0   0.0      0   0.0      1   0.6      0   0.0
J-M410*               0   0.0      0   0.0      0   0.0      0   0.0      2   4.7      0   0.0      3   4.4      1   0.8      4   2.5      1   1.4
J-M92*                0   0.0      0   0.0      0   0.0      0   0.0      1   2.3      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0
J-P58*                0   0.0      0   0.0      0   0.0      1   2.8      0   0.0      2   4.3      0   0.0      2   1.7      2   1.3      0   0.0
J-M267*               0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   0.8      1   0.6      0   0.0
J-M319                0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      2   1.3      0   0.0
J-M67*                0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      3   1.9      0   0.0
L-M317*               0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   0.6      0   0.0
N-M231                0   0.0      0   0.0      0   0.0      0   0.0      1   2.3      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0
Q-P36.2*              0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      2   1.3      0   0.0
R-L2*                 3   8.3      2   4.1      3   7.0      2   5.6      1   2.3      2   4.3      7  10.3     11   9.2      7   4.4      5   6.9
R-L20                 1   2.8      0   0.0      0   0.0      0   0.0      0   0.0      1   2.2      2   2.9      1   0.8      4   2.5      1   1.4
R-L48                10  27.8      8  16.3      8  18.6      8  22.2      6  14.0      2   4.3     11  16.2     13  10.8     20  12.7      4   5.6
R-M198*               1   2.8      3   6.1      0   0.0      2   5.6      2   4.7      5  10.9      3   4.4      3   2.5      7   4.4      1   1.4
R-M269*               0   0.0      0   0.0      0   0.0      1   2.8      0   0.0      0   0.0      1   1.5      1   0.8      6   3.8      0   0.0
R-M412*               0   0.0      1   2.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0
R-M529                3   8.3      5  10.2      2   4.7      0   0.0      1   2.3      4   8.7      4   5.9      9   7.5      8   5.1      4   5.6
R-P297*               1   2.8      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   1.5      0   0.0      0   0.0      0   0.0
R-P310*               0   0.0      2   4.1      1   2.3      1   2.8      1   2.3      0   0.0      0   0.0      0   0.0      2   1.3      1   1.4
R-P312*               5  13.9      7  14.3      6  14.0      3   8.3      5  11.6      8  17.4     12  17.6     10   8.3     17  10.8     13  18.1
R-SRY2627             0   0.0      0   0.0      3   7.0      1   2.8      1   2.3      1   2.2      2   2.9      0   0.0      1   0.6      1   1.4
R-U106*               0   0.0      1   2.0      0   0.0      1   2.8      0   0.0      0   0.0      0   0.0      1   0.8      1   0.6      2   2.8
R-U152*               1   2.8      1   2.0      0   0.0      0   0.0      2   4.7      3   6.5      2   2.9      5   4.2      5   3.2      4   5.6
R-U198                0   0.0      2   4.1      3   7.0      0   0.0      2   4.7      2   4.3      0   0.0      1   0.8      1   0.6      1   1.4
R-Z18                 1   2.8      2   4.1      1   2.3      0   0.0      2   4.7      1   2.2      1   1.5      2   1.7      2   1.3      3   4.2
R-Z195*               2   5.6      1   2.0      1   2.3      2   5.6      1   2.3      2   4.3      0   0.0      3   2.5      3   1.9      2   2.8
R-Z381*               1   2.8      3   6.1      3   7.0      4  11.1      3   7.0      2   4.3      2   2.9     14  11.7     11   7.0      6   8.3
R-P25*                0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   0.8      0   0.0      0   0.0
R-SRY10831.2*         0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   1.4
T-L208*               0   0.0      0   0.0      0   0.0      0   0.0      1   2.3      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0
T-L131*               0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      0   0.0      1   0.8      0   0.0      0   0.0
Total                36           49           43           36           43           46           68          120          158           72

References

1 Gysseling M: Toponymie van Oudenburg. Brussel, Naaml. Venn. Standaard-Boekhandel, 1950.
2 van Durme L: Toponymie van Velzeke-Ruddershove en Bochoute. Gent, Secretariaat van de Koninklijke Academie voor Nederlandse Taal- en Letterkunde, 1986.
3 Debrabandere F: Woordenboek van de familienamen in België en Noord-Frankrijk. Amsterdam/Antwerpen, L.J. Veen/Het Taalfonds, 2003.
4 Vanhoutte S: Onrust aan de kust: een Romeins castellum in Oudenburg tijdens de 3de en de 4de eeuw; in: De Clercq W (ed): Over vlees en bloed - Menapische boeren en soldaten aan de rand van het Romeinse Rijk. Velzeke, pamVelzeke, 2011, pp 82-85.
5 Rogge M: Een legerplaats uit de vroeg-Romeinse tijd te Velzeke. Hermeneus: maandblad voor de antieke cultuur 1980; 52: 71-75.
6 Lendering J, Bosman A: De rand van het Rijk. De Romeinen in de Lage Landen. Amsterdam, Athenaeum-Polak & van Gennep, 2010.
7 Noterdaeme J: De fiscus Snellegem en de vroegste kerstening in het westen van Brugge. Handelingen der Maatschappij voor Geschiedenis en Oudheidkunde te Gent 1957; 11: 49-128.
8 Koch ACF: Vikingen in Vlaanderen? Een 10de-eeuwse lijst met persoonsnamen uit Snellegem (bij Brugge). Naamkunde 1984; 16: 183-200.
9 Decrits M, Van Isterdael H: Idegem (1610-1800). Brussel, Algemeen rijksarchief, 2000.
10 S.N.: Alken - Tussen mammoet en computer. De geschiedenis van een gemeente. Alken, Gemeentebestuur Alken, 1994.
11 Larmuseau MHD, Vanoverbeke J, Gielis G, Vanderheyden N, Larmuseau HFM, Decorte R: In the name of the migrant father - Analysis of surname origin identifies historic admixture events undetectable from genealogical records. Heredity 2012; 109: 90-95.
12 Larmuseau MHD, Vanderheyden N, Jacobs M, Coomans M, Larno L, Decorte R: Microgeographic distribution of Y-chromosomal variation in the central-western European region Brabant. Forensic Science International-Genetics 2011; 5: 95-99.
13 Thompson JM, Ewing MM, Frank WE et al: Developmental validation of the PowerPlex® Y23 System: A single multiplex Y-STR analysis system for casework and database samples. Forensic Science International-Genetics 2013; 7: 240-250.
14 Purps J, Siegert S, Willuweit S et al: A global analysis of Y-chromosomal haplotype diversity for 23 STR loci. Forensic Science International-Genetics 2014; 12: 12-23.
15 Athey WT: Haplogroup prediction from Y-STR values using a Bayesian-allele-frequency approach. Journal of Genetic Genealogy 2006; 2: 34-39.
16 Larmuseau MHD, Vanderheyden N, Van Geystelen A, van Oven M, Kayser M, Decorte R: Increasing phylogenetic resolution still informative for Y chromosomal studies on West-European populations. Forensic Science International-Genetics 2014; 9: 179-185.
17 van Oven M, Van Geystelen A, Kayser M, Decorte R, Larmuseau MHD: Seeing the wood for the trees: a minimal reference phylogeny for the human Y chromosome. Human Mutation 2014; 35: 187-191.
18 Caratti S, Gino S, Torre C, Robino C: Subtyping of Y-chromosomal haplogroup E-M78 (E1b1b1a) by SNP assay and its forensic application. International Journal of Legal Medicine 2009; 123: 357-360.
19 van Oven M, Ralf A, Kayser M: An efficient multiplex genotyping approach for detecting the major worldwide human Y-chromosome haplogroups. International Journal of Legal Medicine 2011; 125: 879-885.
20 R version 2.13.0 (2011), The R Foundation for Statistical Computing.