SimAdmix V1.0 (9/4/15)

Introduction

SimAdmix is a C++ program for simulating chromosome admixture between different populations. It takes as input a pool of haplotypes from multiple pure ancestries, the proportion of each ancestral population in the simulated families, and the family structure in PLINK format. It outputs datasets in PLINK format, together with the true ancestry in the three formats described below.

Current Version

As of 9/4/15 (1.0)

Download

(attach the binary file here)

Basic Example

A typical command line specifies a few options together with the input files. Here are two examples of how to run SimAdmix. In both, $cm is a shell variable holding the value passed to --cm.

Example command line (2-way):

    ./SimAdmix --pedfile test.ped --datfile test.dat --cm $cm --nfam 10 --hapfile CEU_chr1.hap,YRI_chr1.hap --snplist CEU_chr1.snps,YRI_chr1.snps --prop 0.4,0.6 --nrep 2 --prefix test.cm$cm

Example command line (3-way):

    ./SimAdmix --pedfile test.ped --datfile test.dat --cm $cm --nfam 10 --hapfile CEU_chr1.hap,YRI_chr1.hap,CHB_JPT_chr1.hap --snplist CEU_chr1.snps,YRI_chr1.snps,CHB_JPT_chr1.snps --prop 0.2,0.7,0.1 --nrep 2 --prefix test.cm$cm

Example Files

(attach test.ped)
(attach test.dat)
(attach *.hap)
(attach *.snps)

Options

    --pedfile   pedigree file in PLINK format
    --datfile   dat file in PLINK format
    --nfam      number of replications of families
    --hapfile   a pool of simulated or real haplotypes, one chromosome per row (comma-separated, one file per population)
    --snplist   SNP names in the order of the haplotypes in hapfile, one SNP per row (comma-separated, one file per population)
    --nrep      the number of replications
    --cm        the interval (in centimorgans) at which admixture happens
    --prop      proportion of each input population (comma-separated, one value per population)
    --prefix    prefix of output files (e.g. prefix.rep1.ped, prefix.rep2.ped)

Output Format

Truth.full (no header)

Column Annotation:

    Family ID   Individual ID   Haplotype   SNP1_from_population1   SNP1_from_population2   …   SNPk_from_population1   …

NOTE: In k-way admixture, each SNP takes k columns. Exactly one of these columns is 1 (indicating that the SNP comes from the corresponding population) and the rest are 0.
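The one-hot layout described in the NOTE above can be decoded back into one population label per SNP. The Python sketch below is illustrative only and is not part of SimAdmix. The function name decode_truth_full_row is hypothetical, and the code assumes that fields are separated by whitespace, that the first three fields are the family ID, individual ID and haplotype, and that populations are numbered from 1 in column order.

```python
from typing import List, Tuple


def decode_truth_full_row(line: str, k: int) -> Tuple[str, str, str, List[int]]:
    """Decode one Truth.full row into (family, individual, haplotype, labels).

    Assumptions (not confirmed by the SimAdmix documentation):
    whitespace-separated fields, three ID fields first, and population
    labels numbered 1..k in the order of the one-hot columns.
    """
    fields = line.split()
    family, individual, haplotype = fields[:3]
    flags = [int(x) for x in fields[3:]]
    if len(flags) % k != 0:
        raise ValueError("number of indicator columns is not a multiple of k")

    labels = []
    for start in range(0, len(flags), k):
        group = flags[start:start + k]
        # Each SNP must have exactly one 1 and (k - 1) zeros.
        if sorted(group) != [0] * (k - 1) + [1]:
            raise ValueError(f"SNP {start // k + 1} is not one-hot: {group}")
        labels.append(group.index(1) + 1)
    return family, individual, haplotype, labels


# Made-up 2-way row with two SNPs: SNP1 from population 2, SNP2 from population 1.
print(decode_truth_full_row("Fam1 Person1 HAP1 0 1 1 0", k=2))
# ('Fam1', 'Person1', 'HAP1', [2, 1])
```

The decoded labels take the same per-SNP form as a Truth.pop row, so this kind of helper can be used to cross-check the two truth files against each other.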
Example output 1 (2-way):

    Fam1 Person1 HAP1   0 1 … 1 0 …
    Fam1 Person1 HAP2   0 1 … 0 1 …
    Fam1 Person2 HAP1   1 0 … 0 1 …
    Fam1 Person2 HAP2   0 1 … 0 1 …

Example output 1 (3-way):

    Fam1 Person1 HAP1   0 1 0 … 1 0 0 …
    Fam1 Person1 HAP2   0 0 1 … 0 1 0 …
    Fam1 Person2 HAP1   1 0 0 … 0 0 1 …
    Fam1 Person2 HAP2   0 1 0 … 0 1 0 …

Truth.compact

Column Annotation:

    Family ID   Individual ID   Haplotype   Positions_where_admixture_happens   Original_population_between_these_admixtures   Haplotype_index_in_sampled_population   Individual_ancestry_proportions   Founder_or_not

NOTE: In k-way admixture, Original_population_between_these_admixtures takes values from 1 to k. Correspondingly, Individual_ancestry_proportions has k values that add up to 1.

Example output 1 (2-way):

    Fam1 Person1 HAP1   1,3821,5235,…   1,1,2…   18,5,…   0.21,0.79   FOUNDER
    Fam1 Person1 HAP2   1,153,2152,…    2,1,2…   8,1,…    0.19,0.81   FOUNDER
    Fam1 Person2 HAP1   1,2214,4271,…   1,1,2…   1,3,…    0.20,0.80   CHILD
    Fam1 Person2 HAP2   1,2591,6913,…   1,2,1…   9,7,…    0.18,0.82   CHILD

Example output 1 (3-way):

    Fam1 Person1 HAP1   1,3821,5235,…   1,3,2…   18,5,…   0.31,0.12,0.57   FOUNDER
    Fam1 Person1 HAP2   1,153,2152,…    3,1,2…   8,1,…    0.28,0.12,0.60   FOUNDER
    Fam1 Person2 HAP1   1,2214,4271,…   1,3,2…   1,3,…    0.30,0.13,0.57   FOUNDER
    Fam1 Person2 HAP2   1,2591,6913,…   1,3,2…   9,7,…    0.29,0.10,0.61   FOUNDER

Truth.pop

Column Annotation:

    Family ID   Individual ID   Haplotype   Original_population_at_this_SNP

Example output 1 (2-way):

    Fam1 Person1 HAP1   1 1 1 … 1 2 2 2 …
    Fam1 Person1 HAP2   2 2 2 … 2 2 2 2 …
    Fam1 Person2 HAP1   2 2 2 … 2 2 1 1 …
    Fam1 Person2 HAP2   2 2 2 … 2 2 2 2 …

Example output 1 (3-way):

    Fam1 Person1 HAP1   3 3 3 … 3 3 1 1 …
    Fam1 Person1 HAP2   1 1 1 … 2 2 2 2 …
    Fam1 Person2 HAP1   2 2 2 … 2 2 2 2 …
    Fam1 Person2 HAP2   2 2 2 … 3 3 3 3 …
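
A Truth.pop row can be summarised as the fraction of SNPs assigned to each population, which gives a rough check of a simulated haplotype against the values passed to --prop. The Python sketch below is illustrative and not part of SimAdmix. The helper name ancestry_proportions is hypothetical, and the code assumes whitespace-separated fields, three ID fields first, and population labels from 1 to k. It is not claimed to reproduce how SimAdmix computes Individual_ancestry_proportions in Truth.compact.

```python
from collections import Counter
from typing import List


def ancestry_proportions(line: str, k: int) -> List[float]:
    """Return the fraction of SNPs from populations 1..k in one Truth.pop row.

    Assumptions (not confirmed by the SimAdmix documentation):
    whitespace-separated fields, with the family ID, individual ID and
    haplotype in the first three fields and one label per SNP after them.
    """
    fields = line.split()
    labels = [int(x) for x in fields[3:]]
    if not labels:
        raise ValueError("row contains no SNP labels")
    if any(label < 1 or label > k for label in labels):
        raise ValueError("population label outside the range 1..k")

    counts = Counter(labels)
    # Counter returns 0 for populations that never appear in the row.
    return [counts[pop] / len(labels) for pop in range(1, k + 1)]


# Made-up 3-way row with four SNPs.
print(ancestry_proportions("Fam1 Person2 HAP1 3 3 1 2", k=3))
# [0.25, 0.25, 0.5]
```

Because each row describes a single haplotype, the result for one row will usually differ from the --prop values; averaging over many haplotypes gives a more meaningful comparison.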