Supplementary Information (SI) - Online Material

Tracing Hepatitis B virus to the 16th century in a Korean mummy

Material sampling: An endoscopic examination of a mummified child from the 16th century AD, excavated in Yangju, Korea (1, 2), was performed in an operating room of Dankook University College hospital using a sterilized fiber-optic endoscope (GIF XQ230, Olympus Co., Japan). Biopsies taken from the dry parenchymatous organs were subjected to microscopic and aDNA analysis. The liver biopsies were placed in sterile tubes to prevent contamination and were opened only in the ancient DNA (aDNA) laboratories (3).

Tissue handling: Tissue handling and DNA extraction were performed in dedicated hoods, which were disinfected before and after use; disposables and all waste were autoclaved prior to discarding; nondisposable equipment was sterilized before and after use. The surface of the tissue samples was treated with sodium hypochlorite (1,000 ppm) prior to extraction to eliminate possible surface contamination.

Methods: Identification of ancient HBV: The exploratory phase of HBV DNA detection was conducted independently in three laboratories, in Korea, Israel and the UK.

In Korea, DNA was extracted using phenol/chloroform/isoamyl alcohol (25:24:1) (4). Two gene regions, the PreC/Core and the PreS/S, were amplified, cloned and sequenced using published primers (5, 6).

In Israel, specimens received from Korea were treated with sodium hypochlorite (1,000 ppm in water) to destroy any contamination on the surface. DNA extractions were conducted using three methods: commercial kits for DNA tissue extraction (Qiagen®, Germany and Macherey-Nagel®, Germany) and guanidinium thiocyanate (GuSCN) extraction followed by silica capture (7, 8). PCR amplification was performed with the HBV Monitor kit (Roche, USA), amplifying a 245-bp PreC/C region. In addition, a short 223-bp fragment of the PreS/S was amplified using two published primer sets (9, 10).
An HBV genotyping kit, the INNO-LIPA HBV Genotyping Assay (Roche, USA), was used to amplify a fragment of the polymerase gene (11). Positive amplifications were directly sequenced and cloned.

Genetic profile of the mummy: Several attempts were made to amplify mitochondrial and nuclear human DNA from liver and lung samples. The entire hypervariable region I (HVR-I) of the human mitochondrial control region was amplified in two overlapping fragments of 271 bp and 232 bp, following primers and PCR conditions described previously (12). The two overlapping fragments (each represented by 2-3 sequences obtained from different PCR attempts) yielded a contiguous sequence with the following two substitutions relative to the CRS (13): 16223T and 16362C. In addition, the sequence of the lab technician (LH) differed from the mummy sequence by four substitutions (16223C, 16274A, 16325C and 16362T), supporting the authenticity of the mummy sequence and the lack of contamination. The mummy HVR-I sequence matches human haplogroup D, of Asian origin, probably Japanese (http://www.bioanth.cam.ac.uk/mtDNA/toc.html).

DNA extracted from the same mummy lung and liver biopsies was further used to amplify nuclear DNA using the commercial MiniFiler® Short Tandem Repeat (STR) kit (Applied Biosystems). Only a partial profile (7 out of 15 autosomal STRs) was obtained, owing to degradation of the human DNA. The determination of the Y chromosome STR (AMEL) confirms the initial morphologic identification of the mummy as a male. Four alleles identified in the successfully amplified autosomal STRs were not found in the positive control profile. The human genetic profile of the mummy differed from that of the investigators (Table SI1). For three STRs (D3S1358, D1S1656, D21S11) only one allele was determined, making it difficult to distinguish homozygosity from heterozygosity in the case of allele drop-out.
Moreover, the partial profile prevents comparison with other population profiles to determine the exact origin of the sample. Therefore, we searched the Earth Human STR Allele Frequencies Database using the Most Probable Geographical Origin model (http://www.ehstrafd.org/modules/MPGO) and found that the geographical origin estimated for the partial profile is probably in Asia. The first 10 populations found with high similarity are from India, China, Vietnam, South Korea and Japan. The identification of the mummy as a male, while all the investigators involved in the technical laboratory work were female, together with the partial autosomal profile, supports the authenticity of the results and the lack of contamination.

Whole genome analysis: Once HBV was identified in the liver biopsies, we proceeded to determine the complete genome sequence. Analysis was performed in a dedicated ancient DNA laboratory at a different campus of the Hebrew University and in Korea.

Primer design: Overlapping primer sets were designed using Primer 3 software (http://primer3.sourceforge.net/) to amplify the entire genome based on published HBV genomes (Table SI2). Sequences obtained from the liver samples were used to design new primer sets to amplify and sequence the overlapping regions (Table SI2). Overall, 44 primer sets were used to amplify and sequence the entire HBV genome. To increase the overlap between sequences, we amplified larger fragments using combined primer sets (Table SI2).

DNA extraction and amplification: In Israel: Liver biopsies were divided into small equal portions of ~10 mg each, which were used for separate DNA extractions. Prior to DNA extraction, samples were incubated in 1,000 ppm bleach (5 min), followed by incubation in double-distilled water (15 min), to eliminate contemporary contamination introduced during sampling. DNA extraction was conducted with guanidinium thiocyanate (GuSCN) followed by silica capture (7, 8).
All PCRs were performed in a volume of 25 µl (10× PCR buffer, 0.2 mM of dNTPs, 2.5 mM of MgCl2, 0.4 µM of each primer and 0.5 Units/reaction of AmpliTaq Gold (Applied Biosystems, Inc., USA)) using a touchdown PCR method (14). This consisted of an initial denaturation at 95°C for 10 min, followed by a total of 45 cycles of 15 sec denaturation at 94°C, 45 sec annealing (two cycles each at 60°C, 58°C, 56°C, 54°C and 52°C, then 35 cycles at 50°C or 48°C) and 45 sec elongation at 72°C, with a final elongation step of 10 min at 72°C. The PCRs were performed using high-fidelity AmpliTaq Gold to minimize polymerase errors.

In Korea: Liver samples (0.1-0.2 g) were incubated in 1 ml of lysis buffer (EDTA 50 mM, pH 8.0; 1 mg/ml of proteinase K; SDS 1%; 0.1 M DTT) at 56°C for 24 hr. Total DNA was extracted using a phenol/chloroform/isoamyl alcohol method (4). DNA isolation and purification were performed using a QIAmp PCR purification kit (QIAGEN, Hilden, Germany). Purified DNA was eluted in 35 µl of EB buffer (QIAGEN, Germany). The quantity of extracted DNA was measured with a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific, USA). Overlapping primer sets (Table SI2) and 40 ng of the total DNA extract were used for PCR amplification. The PCR conditions were as follows: pre-denaturation at 94°C for 10 min; 40 or 45 cycles of denaturation at 94°C for 45 sec, annealing at 54 to 56°C for 45 sec and extension at 72°C for 45 sec; final extension at 72°C for 10 min. PCR amplification was performed using a PTC-200 DNA Engine (Bio-Rad Laboratories, CA). PCR products were separated by electrophoresis on a 2.5% agarose gel and visualized under UV after staining with ethidium bromide. The electrophoresed PCR products were purified with a QIAquick Gel Extraction Kit (Qiagen, Germany).
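The touchdown annealing profile described for the Israeli laboratory can be written out as a short schedule to confirm that the steps add up to the stated 45 cycles. This is only an illustrative sketch: the function name and the printed layout are assumptions made for the example, and the temperatures and times are taken directly from the text above.

```python
from typing import List, Tuple


def touchdown_schedule(final_annealing_c: int = 50) -> List[Tuple[int, int]]:
    """Return (annealing temperature in deg C, number of cycles) steps.

    Two cycles each at 60, 58, 56, 54 and 52 deg C, followed by 35 cycles
    at the final annealing temperature (50 or 48 deg C in the text).
    """
    steps = [(temp, 2) for temp in (60, 58, 56, 54, 52)]
    steps.append((final_annealing_c, 35))
    return steps


schedule = touchdown_schedule(50)
total_cycles = sum(cycles for _, cycles in schedule)

for temp, cycles in schedule:
    # Denaturation and elongation settings are constant across all cycles.
    print(f"{cycles:2d} cycles: 94C 15 s / {temp}C 45 s / 72C 45 s")
print("total cycles:", total_cycles)
```

Calling `touchdown_schedule(48)` gives the alternative profile with the lower final annealing temperature; the cycle total is unchanged.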
Sequencing: In Israel: PCR products were analyzed by electrophoresis; positive amplifications were purified using Exonuclease Shrimp Alkaline Phosphatase (Exo-Sap IT, HDV Pharmacia) or an Accura Kit (Bioneer® Country) and the products were sequenced directly. Both sense and antisense strands were sequenced using the BigDye Terminator system (Applied Biosystems, USA) and resolved on an ABI PRISM 3700 (Applied Biosystems, USA) at the Center for Genomic Technologies, The Hebrew University of Jerusalem.

Cloning: In Israel: Several of the purified PCR products (from the PreS/S and PreC/C gene regions) were cloned into a TOPO-TA vector and grown in One Shot TOP10 chemically competent E. coli cells (Invitrogen, USA). At least 8-10 isolated colonies from each transformation were grown. DNA was isolated using a QIAprep kit (Qiagen, Germany). Sequences were generated using the BigDye Terminator Cycle Sequencing kit (ABI, USA) and resolved on an ABI PRISM 3700 DNA Analyzer. For each positive clone, primers kz64 and kz77 (15) were used to sequence the insert.

In Korea: The positive amplifications were cloned using the pGEM-T Easy Vector (Promega, US) and competent cells (ECOS-101, Yeastern Biotech, Taiwan). Plasmid isolation was performed using a QIAprep spin miniprep kit (Qiagen, Germany). Sequencing was performed on an ABI 3730xl Genetic Analyzer (Applied Biosystems, USA).

Cytosine deamination: Ancient DNA is frequently extensively damaged, as manifested by the deamination of cytosine residues to uracil. Such reactions can cause nucleotide misincorporations during PCR, generating so-called type II errors (i.e. GC>AT mismatches) (16). In principle, nucleotide misincorporations could provide a means of establishing that DNA is ancient, but this approach was found to be limited and is therefore best used as a quantitative rather than a qualitative difference (17, 18).
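The type II (GC>AT) mismatches described above can be flagged by comparing an aligned read against a reference, position by position. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the two short sequences are invented for illustration and are not data from this study, and the sequences are assumed to be already aligned and of equal length.

```python
# Type II damage pattern as described in the text: C read as T, or G read as A.
TYPE_II = {("C", "T"), ("G", "A")}


def type_ii_mismatches(reference: str, read: str):
    """Return (1-based position, reference base, read base) for C>T and G>A differences."""
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    pairs = zip(reference.upper(), read.upper())
    for pos, (ref_base, read_base) in enumerate(pairs, start=1):
        if (ref_base, read_base) in TYPE_II:
            hits.append((pos, ref_base, read_base))
    return hits


# Invented toy sequences, for illustration only.
reference = "ACGTCCGA"
read = "ATGTCTAA"
print(type_ii_mismatches(reference, read))
```

Other substitution classes are deliberately ignored here; a fuller damage analysis would also count them so that the type II fraction can be expressed quantitatively.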
In our study, treatment of DNA extracted from the liver biopsies with uracil N-glycosylase, the enzyme that removes deaminated cytosine from DNA, prevented DNA amplification. This confirmed that the ancient DNA was damaged and that the nucleotide misincorporations (C <> T and G <> A) often seen in ancient DNA were to be expected. Therefore, our consensus sequence was determined from multiple overlapping sequences obtained from different PCR reactions using DNA extracted from different tissue biopsies (Table SI4, Figure SI1). The consensus sequence represents regions that are either identical in all overlapping sequences or identical in the majority of them. An ambiguity of one nucleotide in one sequence among the multiple overlapping sequences was treated as a type II error (i.e. GC>AT mismatch) and was ignored. The four different consensus sequences represent four substitutions that were found in several sequences among the overlapping sequences. These substitutions may be authentic, representing different viral strains. Their authenticity is supported by previous reports that identified natural mutations occurring in the same liver cell during replication, enabling more than one variant per cell (19, 20). The representation of four aHBV sequences in the analysis takes into account possible variation among the HBV sequences, which represents the diversity existing in the different cells. As indicated in the manuscript (P.10 L.219-220), the sequences were obtained from 24 liver samples.

Phylogenetic and allele sharing analysis: A total of 161 sequences, both sense and antisense, were found to be of high quality and were used to determine the entire HBV genome. The number of sequences for each region of the genome varied from two to 16, with overlaps of 21 bp up to 250 bp between them (Table SI4).
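The majority-rule consensus described in the cytosine deamination section can be sketched as a per-column vote over aligned, overlapping sequences. This is an illustrative sketch only, not the procedure actually used to build the aHBV consensus: the function name, the use of '-' for positions a sequence does not cover, the use of 'N' for ties, and the toy reads are all assumptions made for the example.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the majority base per alignment column.

    '-' marks positions not covered by a read and is not counted.
    Columns with no coverage, or with a tie for first place, yield 'N'.
    """
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all reads must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != "-")
        ranked = counts.most_common(2)
        if not ranked:
            consensus.append("N")
        elif len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            consensus.append("N")
        else:
            consensus.append(ranked[0][0])
    return "".join(consensus)


# Invented toy alignment: a single-read C>T difference at column 2 and a
# single-read difference at column 8 are outvoted by the other reads.
reads = [
    "ACGTAC--",
    "ACGTACGT",
    "ATGTACGT",
    "--GTACGA",
]
print(majority_consensus(reads))
```

A substitution supported by several independent reads would survive this vote or produce a tie, which is the situation the text handles by reporting separate consensus sequences.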
The phylogenetic relationships inferred for the aHBV DNA sequences, using an alignment of each gene separately and representing all four ORFs, distinguished genotype C from the other genotypes with high bootstrap support. The focus of the analysis was to determine the phylogenetic relationship of the aHBV sequences to other genotype C sequences; a number of representatives of other genotypes were used as an outgroup. The genotype C sequences are separated into two clusters, HBV-C1 and HBV-C2, with high bootstrap support. Within the HBV-C2 clade two groups can be identified, but this separation is not supported by high bootstrap values. In all gene phylogenetic analyses the aHBV sequences form a distinct cluster within the HBV/C2 clade, which is supported by high bootstrap values (Figure 4 and Table SI3). The clustering of the aHBV consensus sequences distinguishes the aHBV DNA C2 from the other contemporary sequences studied.

Estimation of tMRCA: tMRCA was estimated under a relaxed molecular clock model using the BEAST software. The aHBV sequences from the Korean mummy were found to cluster together. The aHBV cluster has a posterior distribution to some of the contemporary HBV/C2 sequences, indicating that the aHBV sequences are ancestral (Figure SI3). If HBV co-diverged with its host, then our estimate of the mummy tMRCA should be similar in age to the origin of the virus. As we do not have a specific calibration, we used the estimated divergence times available in the literature. Different theories on the origin of HBV have been proposed. The common theory holds that the evolutionary history of HBV corresponds to the spread of anatomically modern humans as they migrated from Africa ~100,000 years ago, and that the different genotypes infecting humans have evolved since this dispersal (Norder et al., 2004).
Alternatively, Gunther and colleagues suggested that the HBV genotypes might have evolved later than, and independently of, human migration (Gunther et al., 1999). Based on 22 years of observations of nucleotide substitutions among HBV sequences, Orito and colleagues estimated that the origin of HBV is ~3000 years (Orito et al., 1989; Zhou et al., 2007 and Jazayeri et al., 2010). The results of the relaxed molecular clock models indicate that the tMRCA of the aHBV is similar to the outgroup tMRCA (Figure SI3b). Therefore, the aHBV sequence represents an ancient sequence, probably one of the first viruses that migrated to Asia from Africa.

References

1. Lee IS, Kim MJ, Yoo DS, Lee YS, Park SS, Bok GD, Han SH, et al. Three-dimensional reconstruction of medieval child mummy in Yangju, Korea, using multi-detector computed tomography. Annals of Anatomy-Anatomischer Anzeiger 2007;189:558-568.
2. Shin DH, Choi YH, Shin KJ, Han GR, Youn M, Kim CY, Han SH, et al. Radiological analysis on a mummy from a medieval tomb in Korea. Ann Anat 2003;185:377-382.
3. Kim MJ, Park SS, Bok GD, Choi YH, Lee IS, Shin KJ, Han GR, Youn M, Han SH, Kang IW, Chang BS, Cho YJ, Chung YH, Shin DH. Medieval mummy from Yangju. Archaeology, Ethnology & Anthropology of Eurasia 2006;4:122-129.
4. Kemp BM, Smith DG. Use of bleach to eliminate contaminating DNA from the surface of bones and teeth. Forensic Sci Int 2005;154:53-61.
5. Cha B, Lee SM, Park JC, Hwang KS, Kim SK, Lee YS, Ju BK, Kim TS. Detection of Hepatitis B Virus (HBV) DNA at femtomolar concentrations using a silica nanoparticle-enhanced microcantilever sensor. Biosens Bioelectron 2009;25:130-135.
6. Gunther S, Li BC, Miska S, Kruger DH, Meisel H, Will H. A novel method for efficient amplification of whole hepatitis B virus genomes permits rapid functional analysis and reveals deletion mutants in immunosuppressed patients. J Virol 1995;69:5437-5444.
7. Boom R, Sol CJ, Salimans MM, Jansen CL, Wertheim-van Dillen PM, van der Noordaa J. Rapid and simple method for purification of nucleic acids. J Clin Microbiol 1990;28:495-503.
8. Hoss M, Paabo S. DNA extraction from Pleistocene bones by a silica-based purification method. Nucleic Acids Res 1993;21:3913-3914.
9. Lindh M, Gonzalez JE, Norkrans G, Horal P. Genotyping of hepatitis B virus by restriction pattern analysis of a pre-S amplicon. Journal of Virological Methods 1998;72:163-174.
10. Mizokami M, Nakano T, Orito E, Tanaka Y, Sakugawa H, Mukaide M, Robertson BH. Hepatitis B virus genotype assignment using restriction fragment length polymorphism patterns. Febs Letters 1999;450:66-71.
11. Klein A, Spigelman M, Grant P, Pappo O, Kim MJ, Shin DH, Shouval D. Tracing hepatitis B Virus DNA back to the 16th century in a Korean mummy. Hepatology 2007;46:Abs 925.
12. Faerman M, Nebel A, Filon D, Thomas MG, Bradman N, Ragsdale BD, Schultz M, et al. From a dry bone to a genetic portrait: a case study of sickle cell anemia. Am J Phys Anthropol 2000;111:153-163.
13. Anderson S, Bankier AT, Barrell BG, de Bruijn MH, Coulson AR, Drouin J, Eperon IC, et al. Sequence and organization of the human mitochondrial genome. Nature 1981;290:457-465.
14. Don R, Cox PT, Wainwright BJ, Baker K, Mattick JS. Touchdown PCR to circumvent spurious priming during gene amplification. Nucleic Acids Res 1991;19:4008.
15. Weber DS, Stewart BS, Schienman J, Lehman N. Major histocompatibility complex variation at three class II loci in the northern elephant seal. Molecular Ecology 2004;13:711-718.
16. Hofreiter M, Jaenicke V, Serre D, von Haeseler A, Paabo S. DNA sequences from multiple amplifications reveal artifacts induced by cytosine deamination in ancient DNA. Nucleic Acids Res 2001;29:4793-4799.
17. Briggs AW, Stenzel U, Johnson PL, Green RE, Kelso J, Prufer K, Meyer M, et al. Patterns of damage in genomic DNA sequences from a Neandertal. Proc Natl Acad Sci U S A 2007;104:14616-14621.
18. Green RE, Briggs AW, Krause J, Prufer K, Burbano HA, Siebauer M, Lachmann M, et al. The Neandertal genome and ancient DNA authenticity. EMBO J 2009;28:2494-2502.
19. Baumert TF, Marrone A, Vergalla J, Liang TJ. Naturally occurring mutations define a novel function of the hepatitis B virus core promoter in core protein expression. J Virol 1998;72:6785-6795.
20. Parekh S, Zoulim F, Ahn SH, Tsai A, Li J, Kawai S, Khan N, et al. Genome replication, virion secretion, and e antigen expression of naturally occurring hepatitis B virus core promoter mutants. J Virol 2003;77:6601-6612.

Table SI1: Summary of the autosomal STR alleles characterizing the Korean mummy.

Samples (rows): HBV12 (Liver); HBV14 (Liver); HBV15 (Lung); HBV15a (Lung); aHBV Consensus; Positive control; Investigator 1; Investigator 2
STR loci (columns): Amel; D3S1358; D19S433; D2S1338; D22S1045; D16S539; D18S51; D1S1656; D10S1248; D2S441; TH01; vWA; D21S11; D12S391; D8S1179; FGA
Allele calls, in extracted order (assignment to individual cells not recoverable):
16 | 23, ? | ?, 29 | Y, ? | ?, 12 | ?, 29 | 16, ? | 9, 12 | 14, 16
aHBV Consensus: ?, Y | 16, ? | 9, 12 | 14, 16 | ?, 16 | 9, 13 | 16, 18 | 1, 13
Positive control: X,Y | 17,18 | 13,14 | 22,25
Investigator 1: X,X | 15,18 | 13,13 | 20,23
Investigator 2: X,X | 16,18 | 12,15.2 | 19,24
16 | ?, 29 | 13, 15 | 15,19 | 10,11 | 13,13 | 11, 13 | 14, 15 | 15,16 | 8, 9 | 15,17 | 12, 18.3 | 12, 13 | 19,? | 13, 15 | 19, 23 | 29, 31.2 | 18, 23 | 14, 15 | 20, 23 | 6, 6 | 17,19 | 29,30 | 17, 20 | 13,16 | 20, 25 | 8, 9 | 16,18 | 29,30 | 18,23 | 12, 14 | 24, 25 | 10, 14 | 6, 9.3 | 11.3, 11.3 | 10, 14 | 16, 19 | 12, 13
? = unknown allele

Table SI2: Summary of primer sets designed and used in the study

Primer   F/R  Position  Annealing Temp.  Product  Sequence (5' - 3')
L1       F    750       54               164      GGTATTGGGGGCCAAGTCTG
L1       R    914       54                        TGCGGTAAAGTACCCCAAC
L2       F    850       54               207      AAACGTTGGGGCTACTCCCT
L2       R    1057      54                        GCATCAAGGCAGGATAGCCA
L3       F    981       54               241      GAAAGTATGTCAAAGAATTGTGG
L3       R    1222      54                        CTATGGCCAAGCCCCATC
L7       F    1629      54               192      CACCAGGTCTTGCCCAAGGT
L7       R    1821      54                        AGTTGCATGGTGCTGGTGA
F1R1     F    135       56               204      CTGGGGACCCTGCACCGAAC
F1R1     R    339       56                        GGTGAGTGATTGGAGGTTGG
F2R2     F    280       56               187      AGGGGGAGCACCCACGTGTC
F2R2     R    467       56                        GCAACATACCTTGGTAGTCC
F3R3     F    567       56               203      GCTGTACAAAACCTTCGGACG
F3R3     R    770       56                        ACAGACTTGGCCCCCAATAC
F7R7     F    1770      56               207      GTACTAGGAGGCTGTAGGCA
F7R7     R    1977      56                        GAAGGAAAGAAGTCAGAAGG
F8R8     F    2098      56               208      GAATCTGGCCACCTGGGTGGGA
F8R8     R    2306      56                        TTGGTGGTCTGTAAGCGGGAGG
F11R11   F    2457      56               200      CCTTGGACTCATAAGGTGGG
F11R11   R    2657      56                        AGGATAGAACCTAGCAGGCA
S1       F    3195      56               222      TCCTCAGGCCATGCAGTGGA
S1       R    202       56                        CTGTAACACGAGAAGGGGTCCTAG
S2       F    422       56               189      TGCCTCATCTTCTTGTTGGT
S2       R    611       56                        GGATGGGAATACAAGTGCAG
X2       F    2910      56               175      TCTGGGATTCTTTCCCGATCACC
X2       R    3085      56                        GTTGTCAATATGCCCTGAGCCTG
X3       F    3042      56               210      AGGGTTCACCCCACCACACG
X3       R    39        56                        ACTCTGGGATCTAGCAGAGCTTG
Hep1     F    90        61.8             181      CTGTTCCGACTACTGCCTCAC
Hep1     R    271       62.4                      GAGAGAAGTCCACCACGAGTCTA
Hep2     F    249       62.4             171      TAGACTCGTGGTGGACTTCTCTC
Hep2     R    420       57.3                      AGCAGCAGGATGAAGAGGAA
Hep3     F    337       57.3             141      ACCAACCTCTTGTCCTCCAA
Hep3     R    478       57.3                      AGGACAAACGGGCAACATAC
Hep4     F    459       57.3             149      GTATGTTGCCCGTTTGTCCT
Hep4     R    608       55.9                      TGGGAATACAAGTGCAGTTTC
Hep5     F    573       55.3             161      AAAACCTTCGGACGGAAACT
Hep5     R    734       64.6                      CTGAAAGCCAAACAGTGGGGGAAAG
Hep6     F    710       64.6             190      CTTTCCCCCACTGTTTGGCTTTCAG
Hep6     R    900       59.7                      CCAACTTCCAATTACATATCCCATG
Hep7     F    876       59.7             157      CATGGGATATGTAATTGGAAGTTGG
Hep7     R    1033      59.8                      GTGTAAAAGGGGCAGCAAAGC
Hep8     F    1013      59.8             119      GCTTTGCTGCCCCTTTTACAC
Hep8     R    1132      58.4                      CAACGGGGTAAAGGTTCAGATA
Hep9     F    1153      58.4             128      TATCTGAACCTTTACCCCGTTG
Hep9     R    1281      59.4                      GAGTTCCGCAGTATGGATCG
Hep10    F    1261      59.4             175      CGATCCATACTGCGGAACTC
Hep10    R    1436      59.4                      GACGGGACGTAGACAAAGGA
Hep11    F    1417      59.4             129      TCCTTTGTCTACGTCCCGTC
Hep11    R    1546      59.4                      AGACCGCGTAAAGAGAGGTG
Hep12    F    1527      59.4             153      CACCTCTCTTTACGCGGTCT
Hep12    R    1680      59.8                      TTGCTGAGAGTCCAAGAGTCC
Hep13    F    1660      59.8             156      GGACTCTTGGACTCTCAGCAA
Hep13    R    1816      57.3                      GGCAGAGGTGAAAAAGTTGC
Hep14    F    1787      58.9             155      GCATAAATTGGTCTGTTCACCAG
Hep14    R    1942      57.9                      CTCCACAGAAGCTCCAAATTC
Hep15    F    1922      57.9             156      GAATTTGGAGCTTCTGTGGAG
Hep15    R    2078      60.3                      TCATCAACTCACCCCAACACAG
Hep16    F    2054      59.4             221      CATACAGCACTCAGGCAAGC
Hep16    R    2275      57.3                      CCACACTCCAAAAGACACCA
Hep17    F    2256      57.3             114      TGGTGTCTTTTGGAGTGTGG
Hep17    R    2370      59.4                      GAGGCGAGGGAGTTCTTCTT
Hep18    F    2345      61.4             198      GTTAGACGACGAGGCAGGTC
Hep18    R    2543      57.3                      AAAGGAGGGAGTTTGCCACT
Hep19    F    2524      57.3             132      AGTGGCAAACTCCCTCCTTT
Hep19    R    2656      58.4                      GGATAGAACCTAGCAGGCATAA
Hep20    F    2635      58.4             168      TTATGCCTGCTAGGTTCTATCC
Hep20    R    2803      59.4                      GCGCTGCGTGTAGTTTCTCT
Hep21    F    2784      59.4             130      AGAGAAACTACACGCAGCGC
Hep21    R    2914      57.3                      CCAGAGGATTGGGAACAGAA
Hep22    F    2895      57.3             207      TTCTGTTCCCAATCCTCTGG
Hep22    R    3102      56.7                      CAATATGCCCTGAGCCTGA
Hep23    F    3084      56.7             121      TCAGGCTCAGGGCATATTG
Hep23    R    3205      59.4                      AGACAGTCATCCTCAGGCCA

OHep1 to OHep6, in extracted order (assignment of primers and products to individual sets not recoverable; products listed: 216, 191, 224, 183):
F/R  Position  Annealing Temp.  Sequence (5' - 3')
F    659       58.8             CGTTTCTCCTGGCTCAGTTTAC
R    875       59.4             AAGTTAAGGGAGTAGCCCCAAC
F    942       59.7             TTTTCGGAAACTGCCTGTAAAT
R    966       60.1             ATGTTTTCGGAAACTGCCTGTA
F    1044      61.2             ACCTGCCTTGATGCCTTTAT
R    1235      60.3             ATGCGCTGATGGCCTATG
F    1212      59.8             CTTGGCCATAGGCCATCAG
R    1436      59.6             GACGGGACGTAGACAAAGGA
F    1861      60.1             TGTTCAAGCCTCCAAGCTGT
R    2588      60.1             ATATGTGGGCCCTCTCACAG
F    2771      60.9             CAGCCTTGCCCACAAAGTAT

Table SI3: Published contemporary HBV sequences used in the analyses of the aHBV DNA genome.

Accession no.  Genotype  Location  Reference
DQ536410       C2        Korea     Kim et al. (2007)
DQ536412       C2        Korea     Kim et al. (2007)
AY641563       C2        Korea     Song et al. (2005)
AY247030       C2        Korea     Odgerel et al. (2003)
AY247031       C2        Korea     Odgerel et al. (2003)
X14193         C2        Korea     Rho et al. (1989)
Y18857         C2        China     Guo and Hou (1999)
EU916241       C2        China     Fang and Gu, Direct submission
AF182805       C2        China     Lin et al. (2001)
EU579443       C2        China     Liu et al., Direct submission
FJ032361       C2        China     Liu et al., Direct submission
EU589345       C2        China     Liu et al., Direct submission
AF533983       C2        China     Dong and He, Direct submission
D50520         C2        Japan     Asahina et al. (1996)
D23681         C2        Japan     Horikita et al. (1994)
AB298721       C2        Japan     Inoue et al. (2008)
AB113879       C2        Japan     Michitaka and Tran, Direct submission
AB033553       C2        Japan     Okamoto et al. (1987)
AB368297       C2        Japan     Nakajima and Abe, Direct submission
X04615         C2        Japan     Okamoto et al. (1986)
AB111946       C1        Vietnam   Huy et al. (2004)
AB112065       C1        Vietnam   Huy et al. (2004)
AB031262       C1        Vietnam   Yuasa et al. (2000)
AB112066       C1        Myanmar   Huy et al. (2004)
AB112348       C1        Myanmar   Huy et al. (2004)
AB112408       C1        Myanmar   Huy et al. (2004)
AF068756       C1        Thailand  Monkongdee et al. (1998)
AB112472       C1        Thailand  Huy et al. (2004)
AB112471       C1        Thailand  Huy et al. (2004)
AB074755       C1        Thailand  Sugauchi et al. (2002)
AB205118       A         Japan     Nakajima et al. (2005)
AY128092       A         Canada    Osiowy and Giles (2003)
AB205126       D         Japan     Nakajima et al. (2005)
AB120308       D         Japan     Michitaka et al. (2006)
AM422939       D         France    Mrani et al., Direct submission
AB205191       E         Ghana     Huy et al. (2006)
AB205010       H         Japan     Nakajima et al. (2005)

Table SI4: Summary of sequences obtained for each primer set and the overlap coverage.

Primer        Nucleotide position  Product size  No. sequences  Direct/Cloned  No. PCRs
KPS/S         1                    178           9              C              2
Hep1          90                   183           6              D              3
F1R1          135                  204           1              D              1
Hep2          249                  173           4              D              2
SAT-KZ        272                  438           3              C              2
Hep3          337                  142           4              D              2
Hep4          347                  150           3              D              3
Hep3-Hep4                          263           6              D              3
F2R2          280                  187           1              D              1
S2            422                  189           12             C              3
Hep5          574                  161           4              D              2
F3R3          567                  203           1              D              1
Hep6          710                  192           3              D              2
L1            750                  164           1              C              1
Hep7          874                  161           4              D              2
Hep6-Hep7                          304           2              D              1
L2            850                  207           1              C              1
L3            981                  241           1              C              1
Hep8          1013                 142           4              D              3
Hep9          1132                 151           4              D              3
Hep8-Hep9                          268           4              D              2
Hep10         1262                 175           2              D              1
Hep11         1418                 129           2              D              1
Hep12         1528                 153           8              D              4
L7            1629                 192           1              C              1
Hep13         1661                 176           4              D              2
F7R7          1770                 207           1              D              1
Hep14         1820                 124           2              D              2
Hep15         1923                 176           6              D              3
Hep16         2049                 222           5              D              3
F8R8          2098                 208           1              D              1
Hep17         2256                 136           6              D              3
Hep18         2346                 199           5              D              3
F11R11        2457                 200           1              D              1
Hep19         2525                 133           6              D              3
Hep20         2689                 121           4              D              4
Hep19-Hep20                        234           4              D              4
Hep21         2785                 131           3              D              2
PS+M13        2841                 436           12             C              3
Hep22         2894                 147           4              D              2
X2            2910                 175           3              C              1
X3            3042                 210           2              C              1
S1            3195                 222           1              C              1

Overlap (bp), in extracted order (assignment to rows not recoverable): 60; 89 | 24 | 165 | 150, 85 | 272, 263, 35 | 35 | 148 | 24 | 27 | 22 | 23 | 20 | 19 | 20 | 20 | 17 | 21 | 43 | 21 | 46 | 20 | 20, 76, 21 | 210 | 144 | 149
Gene in annotation, in extracted order (assignment to rows not recoverable): Pre-S2/S | S gene | S gene | S gene | GRE | S gene | S gene | S gene | S gene | S gene | S gene | S gene | S gene | S gene | S gene | S gene | Enhancer | Enhancer | Enhancer | X gene | X gene | DR1, P gene | P gene | PreC | P gene, PreC | DR2, X, C, Poly A | Poly A, C gene | C gene | P gene | C gene | P gene | TATA box | PreS1 | PreS1 | PreS1/2 | PreS1/2 | PreS/S

D = direct sequencing; C = cloned sequencing
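As a quick arithmetic check, the per-primer sequence counts in Table SI4 can be summed and compared with the total of 161 sequences quoted in the phylogenetic analysis section. The list below is transcribed from the "No. Sequences" column in its extracted order; the variable names are chosen for this sketch only.

```python
# "No. Sequences" column of Table SI4, one entry per primer set (43 rows),
# in the order the rows appear in the table.
sequences_per_primer = [
    9, 6, 1, 4, 3, 4, 3, 6, 1, 12,
    4, 1, 3, 1, 4, 2, 1, 1, 4, 4,
    4, 2, 2, 8, 1, 4, 1, 2, 6, 5,
    1, 6, 5, 1, 6, 4, 4, 3, 12, 4,
    3, 2, 1,
]

total_sequences = sum(sequences_per_primer)
print("primer sets:", len(sequences_per_primer))
print("total sequences:", total_sequences)
```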