Supplementary Material Behavioural tests Mice trapped as adults in the field were maintained in quarantine under controlled conditions (12/12 photoperiod, food ad libitum) for two to four months in malefemale pairs before being tested. Preference of males and females was assessed during two-way choice tests carried out at the end of the light phase. The test apparatus was a Y-shaped transparent tubular Plexiglas device connected to three boxes, the start box and two peripheral boxes (Nunes et al. 2009; Latour et al. 2014). A few days before each test the mouse was left (~10min) to explore the entire empty apparatus, to reduce stress, neophobia and spatial investigation not directed towards the stimuli during the experiment. Urine stimuli (10µl each) were spread over 2cm² delimited areas at the extremity of each peripheral box. The tubes containing urine were labelled so that behaviour recording was blind. The left and right positions of the two stimuli were shifted between tests to avoid any effect of laterality. A given test mouse was introduced in the start box, separated from the rest of the apparatus with a perforated transparent sliding door; the slide door was then opened and the test started as soon as the mouse entered the Y maze. A choice test lasted 5 minutes during which the time spent by the mouse in contact sniffing or touching the stimuli were recorded with ‘The Observer’ 5.0.31 software. The apparatus was thoroughly cleaned between each test. Urine samples were obtained directly upon handling of a mouse or after leaving it to urinate in a cleaned box from where urine drops were pipetted and kept in Eppendorf tubes at -20°C before being used. Sampling took place over several days and times of the day to capture diurnal variation in urine composition. Urine donors were adult mice of the two subspecies trapped at the border of the hybrid zone, maintained in similar standardised conditions to avoid any odour heterogeneity due to differences in housing conditions or food. The stimuli were pools of urine from 8-9 mice sampled in distinct sites. Pooling was per sex and subspecies. All females were tested while in oestrus; pregnant females were neither tested nor used to obtain urine for the stimuli. 1 The difference in time spent in close contact with the musculus stimulus versus the domesticus one was used to compare patterns of preference across sexes and geography (two-ways ANOVA after checking residuals distribution and homoscedasticity), and to test whether assortative preference occurred (Student t test with H0: X = sniff(mus-dom) = 0). Genomic regions and markers analysed Search of microsatellite loci and design of PCR primers We used an automated pipeline (adapted from the msfinder Perl pipeline originally developed by Dr. Till Bayer from the GEOMAR Helmholtz Center for Ocean Research) to select in silico the microsatellite loci to study and to design PCR primer pairs flanking them. Microsatellite loci were searched in the regions of interest (the 49 clusters defined above, sequence obtained from build 37 of the mouse reference genome) using program Tandem Repeats Finder (TRF) (Benson 1999). We selected only loci with repeat motif sizes between 2 and 8 nucleotides, with 5 to 50 repeat motifs, and an alignment match (as compared to a perfect repeat) of at least 80%. We then defined PCR primers flanking each of these loci, using program PRIMER 3.0 (Untergasser et al. 2012), with default settings, except for PCR product size. The program was run four times with four different PCR product size ranges (50-150, 151-250, 251-350 and 351-450 base pairs). Each primer pair for a given PCR product size range was tested by electronic PCR on the whole mouse genome using the e-PCR program (Rotmistrovsky et al. 2004). When more than one PCR product was thus predicted, a different primer pair was searched in the same size range, and again tested by electronic PCR. If this second attempt failed, the locus was discarded for this size range. For each retained primer pair, the predicted PCR fragment was searched for tandem repeats other than the focal microsatellite, again using TRF program. This was intended to eliminate PCR primer pairs encompassing additional tandem repeats potentially interfering with the focal microsatellite. Tandem repeats with more than a certain fraction of overlap with the focal microsatellite (measured as the ratio of the overlap to the size of the focal microsatellite) were considered not interfering (i.e., part of the same tandem array). We used a threshold of 90% for this step. This is intended not to eliminate slightly imperfect microsatellites, which are rather common. Repeats with non-null overlap smaller than this threshold were 2 considered interfering. Those with no overlap were considered interfering if at least one of the following conditions was met: (i) distance to the focal microsatellite less a certain value (10 base pairs was used); (ii) match to a perfect repeat greater than a certain value (100% was used, so the condition was never met); (iii) number of repeats greater than a certain number (5 was used). The PCR primer pairs generating fragments containing interfering repeats were discarded. The PCR primer search section of the above algorithm was run once using the genomic sequence as template, and a second time on the sequence masked for repeated sequences. All steps other than primer search were identical. For each combination of locus and PCR size range, we selected only one primer pair, if possible from the masked template. When the pair had to be chosen from the unmasked template, we checked for potential overlap of the primers with repeated sequences, and discarded those pairs for which both primers overlapped. Choice of loci to PCR-amplify The in silico process described above recovered all microsatellite loci in the genomic regions of interest predicted to be amplifiable by PCR, based on the reference genome sequence, and obeying the various criteria described. In each of the chromosome segments of interest (“clusters” in the main text), we applied the following strategy to choose a subset of loci to study experimentally. For each gene in the chromosome segment (excluding pseudogenes), we chose the closest microsatellite. After removing these loci, as well as those overlapping with them, the process was reiterated until we reached the planned size of the experiment, 1,248 loci. In fact this was enough to saturate the regions of interest, i.e. additional rounds of the choice process would add loci all lying outside of the interval between the first and last genes of the genomic fragment considered. Of these 1,248 microsatellite loci, 531 had one of the two primers overlapping with a repeated element. Genotyping General methods For each microsatellite locus, we chose one PCR size range among the possibilities offered by the in silico design, while distributing equally (and randomly) the loci among the four ranges. The corresponding primer pairs were synthesized, with the addition of an 18-bp 5’tail (5’TGTAAAACGACGGCCAGT3’) to one primer of each 3 pair. Since the in silico parameters used to design the primers were identical for all loci, we used a single PCR condition. An initial denaturation for 5 min at 95°C was followed by 35 cycles of amplification. Denaturation during each cycle lasted 30 s at 95°C and elongation 30 s at 72°C. Annealing lasted 1.5 min at a temperature that decreased from 62°C to 56°C by steps of 0.3°C during the first 20 cycles, and was maintained at 56°C for the remaining 15 cycles. A final elongation at 60°C was applied for 30 min. The reactions were performed in 384-well plates in a total volume of 10 µL per well, containing a 1:2 dilution of the commercial Qiagen « Type-it Microsatellite PCR Kit », 30 ng of genomic DNA, and the primers. The latter included the locus-specific primer with the 18-bp tail (0.5 pmol), the other locusspecific primer without tail (2 pmol) and a fluorescently labelled primer with the sequence of the 18-bp tail (1.5 pmol). Each of the four fragment size ranges received a different fluorescent label for this latter primer (FAM, NED, PET and VIC from the smallest to the largest size range, respectively). This provided a way to fluorescently label the PCR products of the many loci to be typed in the main experiment (1,248) using only four fluorescently labelled oligonucleotides (method described for instance Schuelke 2000), rather than one specific labelled primer for each locus, as is usually done in this type of experiment but would have represented a prohibitive cost for our experiment (1,248 different labelled primers). For each sample, the PCR products were multiplexed four by four in equal volumes, combining loci with different colours (also corresponding to the four different PCR range sizes), and diluted 1:100 in water. A 3 µL aliquot of this dilution was added into 7 µL of formamide denaturating solution, together with 0.2 µL of a fluorescent size marker (600 LIZ, ABI), and the result was loaded onto an ABI 3130 16-capillary automated sequencer. In this way the loci were separated both by size and colour in the electrophoregrams, to avoid as much as possible interference between loci, given the complexity of the profiles expected from the pooled samples. Although the PCR product size ranges targeted during the design were adjacent (50-150, 151-250, 251350 and 351-450 bp), the actual product sizes tended to naturally cluster at the upper bound of their ranges (since longer flanks offer more possibilities of primer design), and were thus overall well separated across ranges. We were however careful not to pool together loci with close PCR product sizes, by ranking loci inside each size range according to predicted PCR size, and pooling loci with identical ranks. 4 The numerous pipetting steps needed, from handling of genomic DNA, primers and reagents to loading of the reactions on the sequencer were performed using pipetting robots (Tecan Genesis RSP, TM) equipped with washable pipetting needles and for low volume manipulation. We used two robots in different rooms for unamplified and amplified DNA, respectively. We thus minimized the risk of cross-contamination, of errors in sample and locus identification along the process, and of errors in the preparation of reaction mixes. References Benson, G. 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27:573–580. Boissinot, S., and P. Boursot. 1997. Discordant phylogeographic patterns between the Y chromosome and mitochondrial DNA in the house mouse: selection on the Y chromosome. Genetics 146:1019–1034. Latour, Y., M. Perriat-Sanguinet, P. Caminade, P. Boursot, C. M. Smadja, and G. Ganem. 2014. Sexual selection against natural hybrids may contribute to reinforcement in a house mouse hybrid zone. Proc. R. Soc. B Biol. Sci. 281. Nunes, A. C., M. da L. Mathias, and G. Ganem. 2009. Odor preference in house mice: influences of habitat heterogeneity and chromosomal incompatibility. Behav. Ecol. 20:1252–1261. Rotmistrovsky, K., W. Jang, and G. D. Schuler. 2004. A web server for performing electronic PCR. Nucleic Acids Res. 32:W108–12. Schuelke, M. 2000. An economic method for the fluorescent labeling of PCR fragments. Nat. Biotechnol. 18:233–4. Nature America Inc. Untergasser, A., I. Cutcutache, T. Koressaar, J. Ye, B. C. Faircloth, M. Remm, and S. G. Rozen. 2012. Primer3--new capabilities and interfaces. Nucleic Acids Res. 40:e115. 5