PopGen

advertisement
PopGen
TM
for Windows
Ver. 1.0 (Beta)
Jouni Aspi & Jaakko Lumme (1995)
University of Oulu
Department of Genetics
Contents:
Contents:
What is PopGen
General description of the model
How to run the program?
Using Simulation Models
1. Selection
2. Genetic Drift
F-statistics
3. Migration
Stepping stones model
4. Hardy & Weinberg equation
Subdivided populations
Hardy-Weinberg with selfing
Sex differences in allele frequencies
Page
2
3
3
4
5
6
7
8
10
10
11
12
13
14
What is PopGen?
PopGen is a simulation program designed to clarify various population genetic
events. It has been programmed by Jouni Aspi using Microsoft's Visual Basic
(ver. 3.0). The code is based on previous GW-Basic and QuickBasic programs
by Jaakko Lumme and Jouni Aspi.
General description of the model
With PopGen you can simulate some deterministic and stochastic population
genetic processes in a simple one locus, two allele system. There are two alleles
A1 and A2. Frequency of allele A1 is p and frequency of allele A2 is q.
Genotypes of individuals and their frequencies are:
Genotype
A1A1
A1A2
A2A2
Frequency
p2
2pq
q2
With the models of PopGen you can study how these allele frequencies are
affected by:
1. Genetic drift
2. Selection
3. Migration (stepping stone)
You can also study sample sizes you need to detect significant deviations from
Hardy Weinberg proportions, when:
1. The studied population is divided to two subpopulations
2. Allele frequencies are not similar in different sexes
3. The mating is not random, but there is some inbreeding
in the population
How to run the program?
From Windows Program Manager select File from the Menubar and then Run.
Then write a:popgen to the command line and press Enter.
Figure 1. The opening window of the program
Just click the window and you should be in the main window
Figure 2. The main window of the program
You choose your model from the main window. To select the model choose
Model from the Menubar and the desired model. The selected model is written
in to the heading of the window (in the beginning it is Drift).
Generally you can change the parameters of the model by writing them to the
text boxes or by using horizontal scroll bar to change them. You can get Help
anytime by pressing F1-key or choosing Help and PopGen Help from the
menubar of the Main Window.
Using of simulation models
1. Selection
Natural selection is the driving force of evolution. By this simple one locus
model you can study how selection changes the allele frequencies. This model
is deterministic, it doesn't take into account the effect of genetic drift. You can
only vary the fitnesses of different genotypes. Normally to the best genotype
are given fitness one and the other fitnesses are relative to that.
Write the fitness values in text boxes of the frame titled: Fitness values of the
genotypes or use horizontal scrollbars to change them. Then press OK button
and the mean relative fitness of the population with respect of the frequency of
allele A1 (p) will be drawn in the Fitness function picture box. The magnitude
of a change in p with respect to its frequency during one generation will also be
drawn in the Delta p frame.
Figure 3. The main window while running the Selection model
To change the initial allele frequency of A1 allele (p), change Initial p from the
frame titled Population parameters. You can also change the number of
generations, but you can not change the population. In the Selection model of
PopGen the population size is indefinite.
To start simulation click Start simulation button.
The allele frequency of A1 allele will be drawn in the frame on the bottom of
the screen. The final frequency of the latest simulation is also written to the
frame.
You can clear the bottom frame by clicking the Clear button.
2. Genetic drift
This stochastic model demonstrates the chance fluctuations in allele
frequencies. Such fluctuations occur particularly in small populations as a
result of random sampling among gametes. Like in the Selection model you
can change (1) Initial frequency of A (p) and (2) The number of generations,
but you can also define (3) the number of individuals in the studied
populations.
If you are studying only the effects of random drift set all the fitness values as
1. To start simulation click Start simulation button. The program will
simulate the fate of 100 populations (unless you stop the simulation by clicking
the Stop simulation button) by starting from the give Initial p value. The
number of the simulation will be shown on the bottom right corner of the
window. The allele frequency of each simulated population will be shown in
the bottom frame. The distribution of final frequencies will be shown in the
frame on the upper right corner.
You can also study the joint effects of selection and drift by varying the
fitnesses of different genotypes.
Figure 4. The main window while running the Drift model
F-statistics
When you have finished your simulation, you can choose F-statistics from the
Menubar. F-statistics is a statistical method to describe population subdivision.
In this case it describes the accumulated differences between the simulated
populations, which has started from the same initial p value. Because
population subdivision entails an inbreeding-like effect, it is traditional to
measure the effects in terms of decrease in the proportion of heterozygous
genotypes. A subdivided population has three levels of complexity i.e. (1).
Individual organisms (I), (2). Subpopulations (S), and (3). Total population
The heterozygosity can also be measured in these three levels:
HI = the observed average heterozygosity of an individual in
subpopulations
HS = the expected average heterozygosity of an individual in
subpopulations
HT = the expected average heterozygosity of an individual in total
population with no population subdivision
FIS is the INBREEDING COEFFICIENT. It measures the reduction in
heterozygosity of an individual due to non-random mating within its
subpopulation. Thus,
F IS =
HS - HI
HS
Because the mating within subpopulations in our simulations is random, the
overall Fis should be near zero.
FST is the FIXATION INDEX. It is the reduction in heterozygosity of a
subpopulation due to the random genetic drift. Thus,
F ST =
HT - HS
HT
It quantifies the effects of population subdivision and is always positive.
FIT is the OVERALL INBREEDING COEFFICIENT. It includes a contribution
due to actual non-random mating within subpopulations (Fis) and another
contribution due to the subdivision itself (Fst). Thus,
F IT =
HT - H I
HT
FIT measures the reduction in heterozygosity of an individual relative to the
total population.
3. Migration
Migration, which refers to the movement of the individuals between
subpopulations, is the "glue" that holds subpopulations together, that sets a
limit to how much genetic divergence can occur. If the subpopulations are finite
in size, then genetic drift may result in random differences among them even
with migration.
Stepping-Stone Model
If migration is restricted to adjacent populations, then movement forms a
stepping-stone pattern. The stepping-stone model can be one- or
two-dimensional. The current simulation model is one-dimensional:
┌───┐ m ┌───┐ m ┌───┐ m ┌───┐ m ┌───┐
│ N │ -> │ N │ -> │ N │ -> │ N │ -> │ N │
└───┘ <- └───┘ <- └───┘ <- └───┘ <- └───┘
N = population size of the island
m = migration rate between the islands
In the Migration: stepping stone model of PopGen you can change:
(1) Population size of an island (N)
(2) initial frequency of A1 (p)
(3) Migration rate (m)
The PopGen model demonstrates fluctuations in allele frequencies in the
subpopulations. They are presented in the frame of the upper right corner titled
p in subpopulations.
The model also shows the change in total heterozygosity due to population
subdivision. The program calculates and draws the expected heterozygosity
(2pq = 1-p²-q²) and the expected minus observed heterozygosity in the frame on
the bottom during one hundred generations (unless you stop the simulations
from by clicking the Stop simulation button). The program also shows the
mean p of all populations in the bottom frame.
Figure 5. The main window while running the Migration: stepping stones
model
In the Migration: stepping stones model of PopGen you can also study the
joint effects of drift, selection and migration by varying the fitnesses of
different genotypes.
4. Hardy & Weinberg equation
With PopGen you can also simulate how large samples are generally needed to
detect statistically significant deviations from expected Hardy & Weinberg
proportions, when:
1. The studied population is divided into two subpopulations
2. Allele frequencies are not similar in different sexes
3. The mating is not random, but there is some inbreeding
in the population
To select the desired model choose Model from the Menubar. The selected
model is written in the frame on the upper left corner.
Subdivided populations (Wahlund effect)
Figure 6. The Hardy-Weinberg window while running the Wahlund effect
model
This model simulates the situation when the population you have sampled
consist in reality of two populations with different allele frequencies.
Write sample size and allele frequencies in different populations in text boxes
(or use horizontal scrollbars to change the optional values), and click the Start
sampling button.
The expected genotype frequencies in different populations are drawn in the
picture box in right upper corner (titled Genotype frequencies). The program
writes the number of different genotypes drawn randomly from the different
population in the text boxes in the frame titled Number of individuals
sampled. They are also drawn in the Genotype frequency picture box as circles.
The program writes the sampled and also the a priori (expected if allele
frequency of A1 is the mean of the two populations) and expected genotype
frequencies in the frame on the bottom left. A priori, expected and sampled
allele frequencies are also given. Note that the two latter are always the same.
The program also calculates the test value (G2) and probability of a loglikelihood test given the null hypothesis that there are no differences between
the sampled and expected genotype frequencies.
Hardy-Weinberg with selfing
With this model you can simulate the situation in which pairing of gametes is
not random, because of some inbreeding in the population. You can change the
selfing rate (between 0-1), allele frequency of A1 (p) and sample size.
The program draws the expected genotype frequencies in different sexes in the
picture box in right upper corner (titled Genotype frequencies). The program
writes the number of different genotypes, drawn from the population with the
given selfing rate, in the text boxes in the frame titled Number of individuals
sampled. They are also drawn in the Genotype frequency picture box as circles.
The program writes the sampled and also the a priori (expected if allele
frequency of A1 is the given) and expected genotype frequencies in the frame
on the bottom left. A priori, expected and sampled allele frequencies are also
given. Note that the two latter are always the same. Finally, the program also
calculates the test value (G2) and probability of a log-likelihood test given the
null hypothesis that there are no differences between the sampled and expected
genotype frequencies.
Figure 7. The Hardy-Weinberg window while running Hardy-Weinberg with
selfing model
Sex differences in allele frequencies
With this model you can simulate the situation in which allele frequencies in
different genotypes are different. You can change the allele frequencies of A1
(p) in females and males and sample size.
The program draws the expected genotype frequencies in different sexes in the
picture box in right upper corner (titled Genotype frequencies). The program
writes the number of different genotypes drawn from the population in the text
boxes in the frame titled Number of individuals sampled. They are also drawn
in the Genotype frequency picture box as circles. The program writes the
sampled and also the a priori (expected if allele frequency of A1 is the given)
and expected genotype frequencies in the frame on the bottom left. A priori,
expected and sampled allele frequencies are also given. Note that the two latter
are always the same. Finally, the program calculates the test value (G2) and
probability of a log-likelihood test given the null hypothesis that there are no
differences between the sampled and expected genotype frequencies.
Figure 8. The Hardy-Weinberg window while running Sex differences in
allele frequencies model
Acknowledgement: We thank David Clancy for his comments on this program.
Download