Population Genetics: The study of naturally occurring genetic differences among organisms

Consortium for Comparative Genomics
University of Colorado School of Medicine
Biochemistry and Molecular Genetics
Consortium for Comparative Genomics
Human Medical Genetics, Computational Bioscience
University of Colorado School of Medicine
David.Pollock@UCDenver.edu
www.EvolutionaryGenomics.com
Topics
Population Genetics
Molecular Evolution
Phylogenetics
Statistical Inference in Genetics
The Study of Polymorphism &
Divergence
Stabilizing Selection: Human
Birth Weight
Phenotype
Genotype (Variation)
Mutagenic Lesion => Mutation (Recombination, Indels, Segmental Duplication) => Variation => Evolution (Fixation)
Molecular Characteristics: Expression, Modification, Folding, Interaction, Function
Phenotypic Effects: Dominant, Recessive, Pleiotropic Effects
Background Genetics (Compensatory Changes)
Environment Dependence
Genetic Essentials I
Genotype and Phenotype
Most complex traits (hair color, height, weight,
behavior, life span, reproductive fitness) are
influenced by many genes
Genetic Essentials I
Alleles and Polymorphism
Mutagenic lesion, mutation, fixation
SNP, non-synonymous, synonymous, indel
Microsatellite, minisatellite, tandem repeat
Transposable elements, rearrangement, inversion
Reversion, multiple mutations, recombination
Evolution Creates Diversity
Populations (species)
split, multiply, and
diverge
And sometimes go
extinct
Schaal BA, Olsen KM. PNAS 2000;97:7024-7029
Evolution Creates Diversity
Genes duplicate, multiply, and diverge
And sometimes are deleted
Uses of Polymorphism I
Estimate level of genetic variation and
understand patterns in different types
To understand evolutionary mechanisms that
produce variation
DNA fingerprinting: ID individuals
Uses of Polymorphism II
As indicators of population history
To understand evolutionary origin, global
expansion, and diversification of humans
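European Origins
[Figure: map of human lineage dispersal. Region labels: Africa (origin 120-150 KYA); Europe; Asia, NA, SA, Australia; North America. Lineage labels: L; M; N; H, U, X; T, U, V, W; I, J, K; F, B, Z, A; C, D, G, Y. Date labels: 55-75 KYA; 35-50 KYA.]
• Dispersal from Africa 55-75 thousand years ago (KYA)
• Major European lineages (I, J, K, T, U, V, W, H, X) split 35-50 thousand years ago
• Some lineages were more successful
[Figure: lineage frequency chart. Lineage labels: M, L, H, R, V, A, I, U, W, X, J, T, K. Percentages shown: 34.02%, 2.24%, 12.00%, 8.26%.]
Some Perspective
[Figure: timeline. Labels: 41K, Modern Europeans; 195K; 370K; AGA (Ancestral Great Ape).]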
Uses of Polymorphism II
As indicators of population history
To understand evolutionary origin, global
expansion, and diversification of humans
To understand the origins of domestic
species
To determine origins of species (phylogeny)
and of morphology, behavior and other
adaptations
Looking into the Past (or Future)
Uniformitarianism
Past
Present
Future
Laws of Geology, Nature, Physics
Sexual reproduction
Diploid Parents
Haploid Gametes
(sperm, eggs)
Diploid Zygote (offspring)
Genetic Essentials II
Genotype: the combination of alleles at a
locus (one from Ma, one from Pa)
What happens to genotypic variation?
Mixing, recombination, selection, drift
Homozygotes, heterozygotes, recessive,
dominant, codominant
Genotype => phenotype
Linkage disequilibrium
Measuring Variation
Number of individuals sampled
+/+: 76 people; +/D32: 22 people; D32/D32: 2 people
Express as proportions => genotype frequencies
+/+: 76/100 = 0.76
+/D32: 22/100 = 0.22
D32/D32: 2/100 = 0.02
D32 is a variant of CCR5, a human chemokine receptor gene;
the variant confers strong resistance to infection by HIV-1
CCR5 is a major macrophage coreceptor for HIV-1
Measuring Variation
Allele frequency of D32 allele is
<freq of D32 allele> = (2*2 + 22)/(2*100) = 0.13
p = allele frequency; <p> is its estimator
Estimated variance is <Var(p)>; its square root is the standard error
About 95% of estimates fall within 2 standard errors => 95% confidence interval
Sampled n individuals, so 2n alleles; q = 1 - p
D32 allele frequency is about 10% in Northern Europeans (black death?)
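The allele-frequency arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's own calculation: it assumes simple binomial sampling of the 2n alleles, so the variance estimate is p(1 - p)/(2n), and the variable names are made up for the example.

```python
import math

# Genotype counts from the slide: +/+, +/D32, D32/D32
counts = {"+/+": 76, "+/D32": 22, "D32/D32": 2}
n = sum(counts.values())  # individuals sampled, so 2n alleles

# Genotype frequencies are just proportions of the sample
genotype_freqs = {g: c / n for g, c in counts.items()}

# Each D32 homozygote carries two copies, each heterozygote one
p_hat = (2 * counts["D32/D32"] + counts["+/D32"]) / (2 * n)

# Assumption: binomial sampling of 2n alleles
var_hat = p_hat * (1 - p_hat) / (2 * n)
se = math.sqrt(var_hat)

# Rough 95% confidence interval: estimate +/- 2 standard errors
ci = (p_hat - 2 * se, p_hat + 2 * se)

print(genotype_freqs)
print(f"p_hat = {p_hat:.3f}, se = {se:.4f}, CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

With these counts the estimate is 0.13 and the standard error is roughly 0.024, so the interval runs from about 0.08 to about 0.18.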
Example
CCR5 D32, Pyrenees
q = 0.018
se = 0.0009
n = 111
p = 1.7 x 10^-19
Isolated small population in mountains ~18,000 years ago
Species and Geographical Structure
Biological species have a nonrandom spatial
distribution of organisms
Clumping, aggregation, herd (school, flock)
formation, colonies
Environmental patchiness
What’s a species anyway?
“Population” is easier to define
Local interbreeding units in restricted geographic
area, N individuals
Systematic changes in allele frequencies
Monday, July 25,
Biology 3040
Hardy-Weinberg ~1908
Mathematical model: non-overlapping
generations, sexual, diploid
First approximation
Oversimplified, but often good enough
Random mating, large N
With respect to the genotype under
consideration
For now, ignore mutation, selection,
dominance, migration
Hardy-Weinberg
Genotype frequencies achieve equilibrium
after ONE generation
Diploid parents => haploid gametes A and a,
with frequencies p and q, p + q = 1
AA: D’ = p^2,
Aa: H’ = 2pq,
aa: R’ = q^2
Random mating of individuals is usually
equivalent to random union of gametes
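A small sketch of the one-generation claim, assuming random union of gametes and none of the ignored forces (mutation, selection, migration). The function names and the starting population are hypothetical, chosen only to make the point visible.

```python
def allele_freq(D, H, R):
    """Frequency of allele A from genotype frequencies AA=D, Aa=H, aa=R."""
    return D + H / 2


def hw_genotype_freqs(p):
    """Genotype frequencies (AA, Aa, aa) after random union of gametes."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


# Hypothetical starting population far from equilibrium: no heterozygotes
start = (0.5, 0.0, 0.5)

gen1 = hw_genotype_freqs(allele_freq(*start))
gen2 = hw_genotype_freqs(allele_freq(*gen1))

print("start:", start)
print("gen 1:", gen1)
print("gen 2:", gen2)
```

The first generation of random mating moves the population to (0.25, 0.5, 0.25), and the second generation leaves it there, because the allele frequency itself has not changed.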
Graphical Genotype Frequencies
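[Figure: Punnett square of gamete frequencies. Rows and columns are gametes A (frequency p) and a (frequency q); cells are AA, Aa, Aa, aa, with areas p^2, pq, pq, q^2.]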
If one allele rare, almost all individuals that
have it are heterozygotes
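To see how strong this effect is, the share of carriers of the rare allele a that are heterozygous can be computed directly from the H-W frequencies: 2pq / (2pq + q^2). The sketch below assumes H-W proportions hold; the function name and the example values of q are illustrative only.

```python
def het_share_of_carriers(q):
    """Fraction of individuals carrying allele a that are Aa, under H-W."""
    p = 1.0 - q
    het = 2 * p * q   # Aa frequency
    hom = q * q       # aa frequency
    return het / (het + hom)


for q in (0.5, 0.1, 0.01, 0.001):
    print(f"q = {q}: {het_share_of_carriers(q):.4f} of carriers are heterozygotes")
```

At q = 0.01 about 99.5% of carriers are heterozygotes, which is why rare recessive alleles are largely hidden from selection.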
Hardy-Weinberg
Genotype frequencies can only be calculated from
allele frequencies if H-W EQ is assumed
Don’t calculate p from sqrt(p^2), as this assumes HW is true
Can test if there is deviation: observed genotype
frequencies versus expected
(p^2, 2pq, q^2)
Usually, no deviations. Common exception is due to
mixing of populations (Wahlund effect).
Heterozygotes under-represented
Experiment: Hair Color
• Experimental Outcome
Genotype         DD      Db      bb      N
Count            20      13      3       36
p = 26.5/36 = 53/72 = 0.74
q = 1 - p = 9.5/36 = 19/72 = 0.26
HW calculation   p^2*N   2pq*N   q^2*N   N
Expected         19.5    14      2.5     36
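One common way to compare the observed and expected counts is a chi-square goodness-of-fit statistic. The slides do not say which test was used, so this is an assumed choice; the 3.841 threshold is the standard 5% critical value for 1 degree of freedom (3 genotype classes, minus 1, minus 1 estimated allele frequency).

```python
# Observed hair-color genotype counts from the slide
observed = {"DD": 20, "Db": 13, "bb": 3}
n = sum(observed.values())

# Allele frequency of D by counting alleles: (2*20 + 13) / 72 = 53/72
p = (2 * observed["DD"] + observed["Db"]) / (2 * n)
q = 1 - p

# Expected counts under Hardy-Weinberg
expected = {"DD": p * p * n, "Db": 2 * p * q * n, "bb": q * q * n}

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi2 = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)

CRITICAL_5PCT_1DF = 3.841  # assumed significance level: 5%
deviates = chi2 > CRITICAL_5PCT_1DF

print({g: round(e, 1) for g, e in expected.items()})
print(f"chi2 = {chi2:.3f}, deviates from HW: {deviates}")
```

The statistic comes out near 0.18, far below the threshold, so these counts give no evidence of deviation. The expected bb count is below 5, so the chi-square approximation is rough for a sample this small.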
Mixing Populations
            MM     MN     NN     Size
Aborigine   22     216    492    730
Navajo      305    52     4      361
Total       327    268    496    1091
Test for HW, pooled population
p = (327 + 268/2) / 1091 = 461/1091 = 0.423
q = 0.577
            MM     MN     NN     Size
Expected    195    533    363    1091
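The Wahlund effect in these data can be checked by running the same H-W expectation on each population separately and then on the pooled sample. This is a sketch using the counts from the table; the helper name hw_expected is made up for the example.

```python
def hw_expected(mm, mn, nn):
    """Return (p_M, expected MM/MN/NN counts) under H-W for one sample."""
    n = mm + mn + nn
    p = (2 * mm + mn) / (2 * n)  # frequency of allele M
    q = 1 - p
    return p, (p * p * n, 2 * p * q * n, q * q * n)


# Observed MM, MN, NN counts from the slide
samples = {"Aborigine": (22, 216, 492), "Navajo": (305, 52, 4)}

# Pool the two samples column by column
pooled = tuple(sum(col) for col in zip(*samples.values()))

for name, counts in {**samples, "Pooled": pooled}.items():
    p, exp = hw_expected(*counts)
    print(f"{name}: p = {p:.3f}, observed MN = {counts[1]}, "
          f"expected MN = {exp[1]:.0f}")

p_pooled, exp_pooled = hw_expected(*pooled)
```

Each population on its own is close to H-W (216 observed versus about 214 expected heterozygotes, and 52 versus about 55), but the pooled sample shows 268 heterozygotes where about 532 would be expected: the heterozygote deficit described above.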
Sexual reproduction
Diploid Parents
Haploid Gametes
(sperm, eggs)
Diploid Zygote (offspring)
What is Sex?
Mixing
Gender
Ploidy
Recombination
Multiple Loci: Linkage
2 loci, A and B, in a good population, each in HW
equilibrium
Not necessarily randomly associated
The gametes may have nonrandom associations
Gametes are A1B1, A1B2, A2B1, and A2B2
Allele frequencies of A and B, separately, are
p1 and p2 for A, q1 and q2 for B
Gamete frequencies for A and B combined
(A1B1, A1B2, A2B1, A2B2) are
P11, P12, P21, P22, respectively
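Nonrandom association between the two loci is usually summarized by the linkage disequilibrium coefficient D = P11 - p1*q1, which is zero when gametes combine alleles at random. The slides shown here do not give this formula, so treat it as the standard textbook definition rather than the lecturer's notation; the gamete frequencies below are hypothetical.

```python
# Hypothetical gamete frequencies for A1B1, A1B2, A2B1, A2B2 (sum to 1)
P11, P12, P21, P22 = 0.4, 0.1, 0.1, 0.4

# Allele frequencies recovered from the gamete frequencies
p1 = P11 + P12  # frequency of A1
q1 = P11 + P21  # frequency of B1

# Linkage disequilibrium: observed A1B1 minus the random-association value
D = P11 - p1 * q1

# Equivalent form: difference of the products of the two gamete pairs
D_alt = P11 * P22 - P12 * P21

print(f"p1 = {p1}, q1 = {q1}, D = {D:.3f}, D_alt = {D_alt:.3f}")
```

Here each locus has allele frequencies of 0.5, yet A1B1 gametes occur at 0.4 instead of the 0.25 expected under random association, giving D of about 0.15.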