mStruct: Structure under Mutations

mStruct: Inference of population structure in the presence of genetic admixing and allele mutations
Suyash Shringarpure and Eric Xing
Carnegie Mellon University
Significance
Genetic Population Structure
• Structure (Pritchard et al., 2000)
[Figure: ancestral proportions of individuals, grouped by region: Africa, Europe, Mid-East, Cent./S. Asia, East Asia, Oceania. Source: Genetic structure of Human Populations (Rosenberg et al. 2002)]
Generative model: Structure
[Figure: generative model of Structure. Labels: "All the alleles observed at this locus"; α (for the dataset); example values 0.8, 0.2, 0.8, 0.2, 0.3, 0.7]
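The generative process in the figure can be sketched in code. This is a minimal sketch of a standard admixture model in the spirit of Structure, not the authors' implementation: each individual draws ancestry proportions from a Dirichlet with parameter α, each allele copy picks an ancestral population from those proportions, and the allele is then drawn from that population's allele frequencies at the locus. The function name, the `ploidy` argument, and the toy frequency values are assumptions made for illustration.

```python
import numpy as np

def sample_structure_genotypes(alpha, allele_freqs, n_individuals, ploidy=2, seed=0):
    """Sample genotypes from a simple admixture model (illustrative sketch).

    alpha:        Dirichlet parameter, one entry per ancestral population.
    allele_freqs: list over loci; each entry has shape (K, n_alleles_at_locus)
                  and every row sums to 1.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    n_pops = alpha.shape[0]
    # Ancestry proportions, one row per individual.
    theta = rng.dirichlet(alpha, size=n_individuals)
    genotypes = np.empty((n_individuals, len(allele_freqs), ploidy), dtype=int)
    for i in range(n_individuals):
        for locus, beta in enumerate(allele_freqs):
            for copy in range(ploidy):
                # Pick the ancestral population for this allele copy.
                z = rng.choice(n_pops, p=theta[i])
                # Draw the allele index from that population's frequencies.
                genotypes[i, locus, copy] = rng.choice(beta.shape[1], p=beta[z])
    return theta, genotypes

# Toy example (hypothetical values): one locus, two populations, two alleles.
allele_freqs = [np.array([[0.8, 0.2],
                          [0.3, 0.7]])]
theta, genotypes = sample_structure_genotypes([1.0, 1.0], allele_freqs, n_individuals=5)
```

Note that alleles enter this model only as category indices, so nothing in it says that two alleles are similar to each other.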
Modeling allele similarity
• Microsatellite
– Repeats of a small DNA unit, say
[Figure: example microsatellite alleles: Allele 2, Allele 9, Allele 10]
• Allele 9 is much more similar to allele 10 than to allele 2.
• Allele 10 might be a mutation of allele 9.
• Encode this idea mathematically in the model.
• mStruct: Structure under mutations
Hypothesis
• Individual genomes in modern populations are
a result of
– Admixture of ancestral populations.
– Mutations from ancestral alleles.
• Ancestral populations have fewer alleles
– (Mostly) True for microsatellites
Generative model: mStruct
[Figure: generative model of mStruct. Labels: "All the alleles observed at this locus"; α (for the dataset); example values 0.8, 0.2, 0.8, 0.3, 0.2, 0.7; mutation parameters δ1, δ2]
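A rough sketch of how the mStruct generative process differs from Structure at one locus: after choosing an ancestral population, an ancestral allele is drawn from that population, and the observed allele is then drawn from a mutation distribution centred on it. The single-step form P(b|a) ∝ δ^|b−a| is used for the mutation step. All names, the one-δ-per-population layout, and the toy values are assumptions for illustration; the actual model may be parameterised differently.

```python
import numpy as np

def mutate(rng, ancestral, observed_alleles, delta):
    """Draw an observed allele b with probability proportional to delta**|b - a|."""
    weights = delta ** np.abs(observed_alleles - ancestral)
    return int(rng.choice(observed_alleles, p=weights / weights.sum()))

def sample_mstruct_locus(theta, ancestral_alleles, ancestral_freqs, deltas,
                         observed_alleles, seed=0):
    """Sample one allele per individual at a single locus (illustrative sketch).

    theta:             ancestry proportions, one row per individual.
    ancestral_alleles: per population, array of ancestral repeat counts.
    ancestral_freqs:   per population, frequencies of those ancestral alleles.
    deltas:            per population, mutation parameter (assumed 0 < delta < 1).
    """
    rng = np.random.default_rng(seed)
    observed_alleles = np.asarray(observed_alleles)
    samples = []
    for theta_i in theta:
        z = int(rng.choice(len(theta_i), p=theta_i))                   # admixture step
        a = int(rng.choice(ancestral_alleles[z], p=ancestral_freqs[z]))  # ancestral allele
        b = mutate(rng, a, observed_alleles, deltas[z])                # mutation step
        samples.append((z, a, b))
    return samples

# Toy example (hypothetical values): two individuals, two ancestral populations.
theta = np.array([[0.8, 0.2],
                  [0.3, 0.7]])
ancestral_alleles = [np.array([9]), np.array([4, 12])]
ancestral_freqs = [np.array([1.0]), np.array([0.5, 0.5])]
deltas = [0.2, 0.4]
observed_alleles = np.arange(2, 15)
samples = sample_mstruct_locus(theta, ancestral_alleles, ancestral_freqs,
                               deltas, observed_alleles)
```

Each returned triple holds the chosen population, the ancestral allele, and the observed (possibly mutated) allele.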
Mutation models
• How to derive descendant alleles from
ancestral alleles?
• Distribution based on the single-step
model
• P(b|a) ∝ δ^|b−a|, δ < 1
• Computationally “easy”
• δ is NOT a conventional mutation rate.
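To make the formula concrete, here is a small sketch that normalises δ^|b−a| over a set of observed alleles. The function name and the choice to normalise over the observed alleles at the locus are assumptions for illustration.

```python
import numpy as np

def mutation_distribution(ancestral, observed_alleles, delta):
    """Return P(b | a) proportional to delta**|b - a| over the observed alleles."""
    observed_alleles = np.asarray(observed_alleles)
    weights = np.power(float(delta), np.abs(observed_alleles - ancestral))
    return weights / weights.sum()

# Alleles from the microsatellite example: 2, 9 and 10 repeats.
observed = np.array([2, 9, 10])

# With ancestral allele 9, allele 10 (one step away) is far more likely
# than allele 2 (seven steps away).
p = mutation_distribution(9, observed, delta=0.5)

# As delta shrinks toward 0, nearly all mass stays on the ancestral allele.
p_small = mutation_distribution(9, observed, delta=1e-6)
```

The second call illustrates the limiting behaviour: with a very small δ the mutation step almost never changes the allele, so alleles are effectively treated as unrelated categories.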
Finding ancestral alleles
• Fit mixtures of mutation distributions
• Try using 1, 2, 3, ... ancestral alleles
• Use information theory to decide how many ancestral alleles are appropriate
[Figure: histogram of observed alleles]
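The procedure above can be sketched as follows. This is a simplified illustration, not the authors' algorithm: candidate ancestral alleles are restricted to the observed allele values, δ is shared by all components and chosen from a small grid, mixture weights are fitted by EM, and BIC stands in for the information-theoretic criterion, which the slide does not specify. All function names and toy values are hypothetical.

```python
import itertools
import numpy as np

def component_probs(centers, support, delta):
    """One row per ancestral allele a: P(b | a) proportional to delta**|b - a|."""
    w = delta ** np.abs(support[None, :] - np.asarray(centers)[:, None])
    return w / w.sum(axis=1, keepdims=True)

def fit_weights(comp, counts, n_iter=100):
    """EM for the mixture weights with fixed components, on histogram counts."""
    k = comp.shape[0]
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        resp = pi[:, None] * comp                   # unnormalised responsibilities
        resp /= resp.sum(axis=0, keepdims=True)     # E-step
        pi = (resp * counts).sum(axis=1) / counts.sum()  # M-step
    loglik = float((counts * np.log(pi @ comp)).sum())
    return pi, loglik

def choose_ancestral_alleles(support, counts, max_k=3, deltas=(0.1, 0.2, 0.3, 0.5)):
    """Try 1..max_k ancestral alleles and keep the fit with the lowest BIC."""
    support = np.asarray(support)
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    best = None
    for k in range(1, max_k + 1):
        for centers in itertools.combinations(support, k):
            for delta in deltas:
                comp = component_probs(centers, support, delta)
                pi, loglik = fit_weights(comp, counts)
                # Assumed parameter count: k centers, k - 1 weights, one delta.
                bic = -2.0 * loglik + 2 * k * np.log(n)
                if best is None or bic < best[0]:
                    best = (bic, centers, delta, pi)
    return best

# Toy histogram: expected counts from a known mixture of two ancestral
# alleles (7 and 12 repeats), so the recovered answer can be checked.
support = np.arange(5, 16)
true_comp = component_probs((7, 12), support, 0.2)
counts = 200 * (0.6 * true_comp[0] + 0.4 * true_comp[1])
bic, centers, delta, pi = choose_ancestral_alleles(support, counts)
```

The exhaustive search over candidate centers is only practical for the small allele ranges typical of a single locus; a larger problem would need a smarter search.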
Comparing population structure maps
Phylogenetic Trees from the Structural Maps
[Figure: phylogenetic trees derived from the mStruct and Structure maps]
HGDP SNP results
Implications of Inconsistency
• Simplistic mutation model
• SNP mutations harder to discover from data
• The model reduces to Structure
• Fundamental difference
– Different markers treated differently
• Structure’s treatment of alleles is almost categorical
Contour of Empirical Mutation
Conclusion
• Generative model for population structure
• Modeling mutations from ancestral alleles
• Gives mutational information in addition to
population structure.
• Paper in press at Genetics
• Online version available now.
Graphical model representations
[Figure: graphical models of Structure and mStruct]