Documentation

SimAdmix
V1.0 (9/4/15)
Introduction
SimAdmix is a C++ program for simulating chromosome admixture between different populations. The input files are a pool of
haplotypes from multiple pure ancestries, the proportion of each ancestral population in the simulated families, and the
family structure in PLINK format. The output includes datasets in PLINK format as well as the true ancestry ("truth") in
the 3 formats described below.
Current Version as of 9/4/15 (1.0)
Download (attach the binary file here)
Basic Example
In a typical command line, a few options need to be specified together with the input files. Here are two examples of
how SimAdmix works:
Example command line (2-way):
./SimAdmix --pedfile test.ped --datfile test.dat --cm $cm --nfam 10 \
    --hapfile CEU_chr1.hap,YRI_chr1.hap \
    --snplist CEU_chr1.snps,YRI_chr1.snps \
    --prop 0.4,0.6 --nrep 2 --prefix test.cm$cm
Example command line (3-way):
./SimAdmix --pedfile test.ped --datfile test.dat --cm $cm --nfam 10 \
    --hapfile CEU_chr1.hap,YRI_chr1.hap,CHB_JPT_chr1.hap \
    --snplist CEU_chr1.snps,YRI_chr1.snps,CHB_JPT_chr1.snps \
    --prop 0.2,0.7,0.1 --nrep 2 --prefix test.cm$cm
Example Files
(attach test.ped)
(attach test.dat)
(attach *.hap)
(attach *.snps)
Options
--pedfile
pedigree file in PLINK format
--datfile
dat file in PLINK format
--nfam
number of families to simulate (replicates of the family structure in pedfile)
--hapfile
a pool of simulated or real haplotypes, one chromosome per row
--snplist
SNP names in the order in which they appear in the haplotypes in hapfile, one SNP per row
--nrep
number of replicate datasets to generate (one set of output files per replicate, see --prefix)
--cm
the interval (in centimorgans) for admixture to happen
--prop
proportion of each input population, comma-separated (e.g. 0.4,0.6)
--prefix
prefix of output files (e.g. prefix.rep1.ped, prefix.rep2.ped)
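The options above can be assembled programmatically, for instance when scripting many runs. The following is a minimal Python sketch, not part of SimAdmix: the helper name build_simadmix_args is hypothetical, the value 5 stands in for the shell variable $cm, and the two checks (one hapfile, snplist and proportion per population; proportions summing to 1) are assumptions inferred from the example command lines rather than documented behavior.

```python
def build_simadmix_args(pedfile, datfile, cm, nfam, hapfiles, snplists,
                        props, nrep, prefix, binary="./SimAdmix"):
    """Return a SimAdmix argument list (hypothetical helper, not shipped with SimAdmix)."""
    # Assumption: each population contributes one hapfile, one snplist and one proportion.
    if not (len(hapfiles) == len(snplists) == len(props)):
        raise ValueError("need one hapfile, snplist and proportion per population")
    # Assumption: proportions sum to 1, as in the documented examples.
    if abs(sum(props) - 1.0) > 1e-9:
        raise ValueError("proportions are assumed to sum to 1")
    return [
        binary,
        "--pedfile", pedfile,
        "--datfile", datfile,
        "--cm", str(cm),
        "--nfam", str(nfam),
        "--hapfile", ",".join(hapfiles),
        "--snplist", ",".join(snplists),
        "--prop", ",".join(str(p) for p in props),
        "--nrep", str(nrep),
        "--prefix", prefix,
    ]


args = build_simadmix_args(
    "test.ped", "test.dat", 5, 10,
    ["CEU_chr1.hap", "YRI_chr1.hap"],
    ["CEU_chr1.snps", "YRI_chr1.snps"],
    [0.4, 0.6], 2, "test.cm5",
)
print(" ".join(args))
```

The resulting list can be passed to subprocess.run(args, check=True) once the binary and input files are in place.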
Output Format
Truth.full (no header)
Column Annotation:
Family ID   Individual ID   Haplotype   SNP1_from_population1   SNP1_from_population2   …   SNPk_from_population1   …
NOTE: in k-way admixture, each SNP takes k columns, with exactly one column being 1 (indicating
that this SNP comes from the corresponding population) and the rest being 0.
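The one-column-per-population encoding can be decoded back into a single population label per SNP. The following Python sketch illustrates this under stated assumptions: fields are whitespace-separated, the first three fields are family ID, individual ID and haplotype label, and populations are numbered from 1 in column order. The function name decode_truth_full_line is hypothetical.

```python
def decode_truth_full_line(line, k):
    """Decode one Truth.full row into (family, individual, haplotype, labels).

    Assumes whitespace-separated fields and k indicator columns per SNP.
    """
    fields = line.split()
    fam, ind, hap = fields[:3]
    flags = [int(x) for x in fields[3:]]
    if len(flags) % k != 0:
        raise ValueError("number of indicator columns is not a multiple of k")
    labels = []
    for i in range(0, len(flags), k):
        group = flags[i:i + k]
        # Exactly one of the k columns must be 1, the rest 0.
        if sorted(group) != [0] * (k - 1) + [1]:
            raise ValueError(f"SNP {i // k + 1}: expected exactly one 1 in {group}")
        labels.append(group.index(1) + 1)  # populations numbered from 1
    return fam, ind, hap, labels


# Two SNPs in a 2-way admixture: SNP1 from population 2, SNP2 from population 1.
print(decode_truth_full_line("Fam1 Person1 HAP1 0 1 1 0", k=2))
```

The validation step is a design choice for catching truncated or misaligned rows early; it can be dropped when speed matters more than safety.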
Example output 1 (2-way):
Fam1  Person1  HAP1  0  1  ...  1  0  …
Fam1  Person1  HAP2  0  1  ...  0  1  …
Fam1  Person2  HAP1  1  0  ...  0  1  …
Fam1  Person2  HAP2  0  1  ...  0  1  …
Example output 1 (3-way):
Fam1  Person1  HAP1  0  1  0  ...  1  0  0  …
Fam1  Person1  HAP2  0  0  1  ...  0  1  0  …
Fam1  Person2  HAP1  1  0  0  ...  0  0  1  …
Fam1  Person2  HAP2  0  1  0  ...  0  1  0  …
Truth.compact
Column Annotation:
Family ID   Individual ID   Haplotype   Positions_where_admixture_happens   Original_population_between_these_admixtures   Haplotype_index_in_sampled_population   Individual_ancestry_proportions   Founder_or_not
NOTE: in k-way admixture, the values in the Original_population_between_these_admixtures section
range from 1 to k. Correspondingly, the Individual_ancestry_proportions section has k
values that add up to 1.
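A compact row can be expanded into one population label per SNP. The Python sketch below shows one way to do this, under assumptions that the documentation does not confirm: positions are read as 1-based SNP indices marking the start of each segment (the first being 1), the i-th population applies from the i-th position up to the SNP before the next position, and the total number of SNPs (n_snps) is known from the snplist. The function names are hypothetical.

```python
def expand_segments(positions, populations, n_snps):
    """Expand segment starts and populations into one label per SNP.

    Assumes 1-based, strictly increasing start positions beginning at 1.
    """
    if len(positions) != len(populations):
        raise ValueError("need one population per segment start")
    if not positions or positions[0] != 1:
        raise ValueError("first segment is assumed to start at SNP 1")
    if any(b <= a for a, b in zip(positions, positions[1:])) or positions[-1] > n_snps:
        raise ValueError("positions must be strictly increasing and within n_snps")
    per_snp = []
    for i, (start, pop) in enumerate(zip(positions, populations)):
        # A segment ends just before the next start, or at the last SNP.
        end = positions[i + 1] if i + 1 < len(positions) else n_snps + 1
        per_snp.extend([pop] * (end - start))
    return per_snp


def parse_proportions(field, tol=1e-6):
    """Parse a comma-separated proportions field and check that it sums to 1."""
    props = [float(x) for x in field.split(",")]
    if abs(sum(props) - 1.0) > tol:
        raise ValueError("ancestry proportions should add up to 1")
    return props


# Toy row: segments start at SNPs 1, 4 and 6 out of 8 SNPs in total.
print(expand_segments([1, 4, 6], [1, 1, 2], n_snps=8))
print(parse_proportions("0.21,0.79"))
```

Consecutive segments may carry the same population, as in the toy row; under the reading assumed here, such a breakpoint switches the sampled haplotype without changing the ancestry.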
Example output 1 (2-way):
Fam1  Person1  HAP1  1,3821,5235,…  1,1,2…  18,5,…  0.21,0.79  FOUNDER
Fam1  Person1  HAP2  1,153,2152,…   2,1,2…  8,1,…   0.19,0.81  FOUNDER
Fam1  Person2  HAP1  1,2214,4271,…  1,1,2…  1,3,…   0.20,0.80  CHILD
Fam1  Person2  HAP2  1,2591,6913,…  1,2,1…  9,7,…   0.18,0.82  CHILD
Example output 1 (3-way):
Fam1  Person1  HAP1  1,3821,5235,…  1,3,2…  18,5,…  0.31,0.12,0.57  FOUNDER
Fam1  Person1  HAP2  1,153,2152,…   3,1,2…  8,1,…   0.28,0.12,0.60  FOUNDER
Fam1  Person2  HAP1  1,2214,4271,…  1,3,2…  1,3,…   0.30,0.13,0.57  FOUNDER
Fam1  Person2  HAP2  1,2591,6913,…  1,3,2…  9,7,…   0.29,0.10,0.61  FOUNDER
Truth.pop
Column Annotation:
Family ID   Individual ID   Haplotype   Original_population_at_this_SNP
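Because Truth.pop stores one population label per SNP, a row can be summarized directly, for example as the fraction of SNPs assigned to each population. The Python sketch below assumes whitespace-separated fields, three leading ID fields, and labels from 1 to k. The function name pop_fractions is hypothetical, and the result is a per-haplotype SNP fraction that is not claimed to equal the Individual_ancestry_proportions reported in Truth.compact.

```python
from collections import Counter


def pop_fractions(line, k):
    """Return (family, individual, haplotype, fractions) for one Truth.pop row.

    fractions[p - 1] is the share of SNPs labelled with population p.
    """
    fields = line.split()
    fam, ind, hap = fields[:3]
    labels = [int(x) for x in fields[3:]]
    if not labels:
        raise ValueError("row contains no SNP labels")
    if any(not 1 <= p <= k for p in labels):
        raise ValueError("population labels are assumed to lie in 1..k")
    counts = Counter(labels)
    n = len(labels)
    return fam, ind, hap, [counts[p] / n for p in range(1, k + 1)]


# Toy row with four SNPs: three from population 1, one from population 2.
print(pop_fractions("Fam1 Person1 HAP1 1 1 1 2", k=2))
```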
Example output 1 (2-way):
Fam1  Person1  HAP1  1  1  1  …  1  2  2  2  …
Fam1  Person1  HAP2  2  2  2  …  2  2  2  2  …
Fam1  Person2  HAP1  2  2  2  …  2  2  1  1  …
Fam1  Person2  HAP2  2  2  2  …  2  2  2  2  …
Example output 1 (3-way):
Fam1  Person1  HAP1  3  3  3  …  3  3  1  1  …
Fam1  Person1  HAP2  1  1  1  …  2  2  2  2  …
Fam1  Person2  HAP1  2  2  2  …  2  2  2  2  …
Fam1  Person2  HAP2  2  2  2  …  3  3  3  3  …