Visual Popgen - Department of Computing | East Tennessee State

VisualPopGen
User’s Manual
This software was produced by students in Software Engineering I (CSCI 3250-001;
Fall 1998) and Software Engineering II (CSCI 3350-001; Spring 1999) at East
Tennessee State University. The course was taught by Dr. Don Gotterbarn, Department
of Computer and Information Sciences. The "client" for whom the software was
designed was Dr. Dan M. Johnson, Department of Biological Sciences. VisualPopGen
may be obtained free from the following web site: http://www-cs.etsu.edu/resrch.htm
For software-related questions, contact Dr. Don Gotterbarn at gotterba@etsu.edu.
For biological questions, contact Dr. Dan M. Johnson at johnsodm@etsu.edu.
VisualPopGen 106754324
Table of Contents
Introduction
  Overview of Software Package
  System Requirements
  Installing VisualPopGen
Biological Background
  Evolutionary Forces, Genetic Parameters, and Default Values
  Definition of Evolutionary Forces and Genetic Parameters
  Equations Implemented in Sequential Order
Familiarizing Yourself With VisualPopGen
  Overview
  Hot Key List
  Menu Options
    File Menu Options
    Generations Menu Options
    Help Menu Options
Saving and Printing Graphs
  Opening a Saved, Non-Bitmap, Graph
  Printing a Graph
  Saving Graph as a Bitmap Image
Example Scenarios
  Simulating Hardy-Weinberg Equilibrium
  Simulating Effects of User Entered Parameters
  Viewing Composite Graphs
Glossary of Terms
Introduction
Overview of Software Package
The VisualPopGen software package is designed for students in the Department of
Biological Sciences (Principles of Biology Lab III) at East Tennessee State University.
The program calculates the sequential effects of genetic drift, inbreeding, selection, gene
flow, and mutation on the proportion of a population’s gene pool made up of the second
of two alleles (these terms are defined under the subheading Biological Background).
Simulations usually run for 100 generations, but they can be conducted for as few as 10
or as many as 1,000 generations. VisualPopGen’s main purpose is to help the user
explore the effects of various evolutionary forces on a population’s gene pool by
graphing the calculations so that the changes can be visualized.
System Requirements
VisualPopGen is designed to run on IBM-compatible machines with a Windows 95, 98,
NT 4.0, or compatible operating system. If the operating system is Windows NT, Service
Pack 3, which is available free of charge, must be installed on the system. The software
can be operated with a keyboard or a mouse and requires an SVGA color monitor set to
a resolution of 800x600 (256 colors) or higher. Users with lower-resolution or
monochrome monitors will be unable to run this software. A printer must be correctly
attached to the system to produce graph printouts. A color or laser printer is highly
recommended for the best printout quality. If a dot matrix printer is used, the distinction
between graphs on a composite printout may be unclear.
Installing VisualPopGen
Before VisualPopGen can be run, the installation program must install the necessary
files onto the system. The installation program detects the requirements of each system.
If installing from a CD:
1. Place the CD into the CD-ROM drive on the computer. The CD should then run
automatically, and the install shield will walk the user through the installation
process. If the CD does not run automatically, continue with the steps below to
install from the drive.
2. Double left click the My Computer icon from the Windows desktop environment.
3. Double left click the Control Panel option.
4. Double left click the Add/Remove Programs option.
5. Choose the Install option then press Next.
6. If VisualPopGen is not on a floppy disk the browse option may be used to find it on
another drive.
7. Choose setup.exe and left click Next.
8. After the installation is complete the install shield will ask the user to reboot the
system. This reboot is necessary. The VisualPopGen icon will not appear until the
reboot is complete.
After the system has rebooted, the VisualPopGen menu option will be available in the
Programs section of your Windows environment. Place the mouse pointer over the
VisualPopGen item, then double click on the submenu item that appears.
Biological Background
Evolutionary Forces, Genetic Parameters, and Default Values
Evolutionary Force  Genetic Parameter  Name                                                    Expected Range  Default Value   Graduation
Genetic Drift       N                  Effective population size                               1 - 1,000,000   1,000,000       1
Inbreeding*         F                  Inbreeding coefficient                                  0 - 0.5         1/(2N) or 0*    0.001
Selection           w11, w12, w22      Relative fitness of the three possible genotypes        0 - 1           1.00            0.01
Gene Flow           m                  Proportion of new migrants per generation               0 - 1           0.000           0.001
Gene Flow           qm                 Proportion of allele 2 in migrants' gene pool           0 - 1           0.000           0.001
Mutation            u                  Forward mutation rate (A1->A2)                          0 - 1           0.0000000       0.0000001
Mutation            v                  Back-mutation rate (A2->A1)                             0 - 1           0.0000000       0.0000001
                    q0                 Initial proportion of A2 in the population's gene pool  0 - 1           0.500           0.001
                    Generations        Number of generations involved in calculations          10 - 1000       100             10
*F = 1/(2N) if Inbreeding is activated for simulation; F = 0 if it is not.
Definition of evolutionary forces and genetic parameters:
Genetic Drift: Parameter (N) – Effective breeding population size.
When N is small, only a small number of alleles (2N) comprise the gene pool
passed from one generation to the next. Given these circumstances, allele
proportions may be affected by chance (random) sampling errors. Changes could
be positive or negative (flip of a coin). Genetic drift may lead to fixation of an
allele by accidentally failing to pass on alternative alleles during one generation.
Inbreeding: Parameter (F) – Inbreeding coefficient.
Breeding among relatives causes some individuals who would otherwise have
been heterozygotes to be homozygous because the two alleles in their genotype
are “identical by descent.” The inbreeding coefficient, F, is the proportion by
which the expected heterozygote genotype proportion is reduced due to
inbreeding. Half of that proportion is added to each of the two homozygous
genotype proportions. The value of F increases by a factor of (1+F) each
generation. Inbreeding will not change allele proportions (p & q) unless
combined with selection--the only evolutionary force that acts directly on
genotypes.
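The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and are not taken from VisualPopGen itself. It computes the three genotype proportions for a given q and F and shows that inbreeding alone leaves the allele proportion unchanged.

```python
def genotype_proportions(q, F):
    """Return (A1A1, A1A2, A2A2) proportions for allele proportion q and inbreeding coefficient F."""
    p = 1.0 - q
    shift = F * p * q  # half of the heterozygote reduction (F * 2pq / 2)
    return p * p + shift, 2 * p * q * (1 - F), q * q + shift


def allele_proportion(a1a1, a1a2, a2a2):
    """Recover q from genotype proportions: all of A2A2 plus half of A1A2."""
    return a2a2 + a1a2 / 2


if __name__ == "__main__":
    for F in (0.0, 0.25, 0.5):
        g = genotype_proportions(0.3, F)
        # Heterozygotes shrink as F grows, but q stays at 0.3.
        print(F, [round(x, 4) for x in g], round(allele_proportion(*g), 4))
```

Because the same amount is added to both homozygous classes, p and q cannot change until selection treats the genotypes differently.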
Selection:
Parameter (w11) – Relative fitness of one homozygous genotype (A1A1).
Parameter (w12) – Relative fitness of the heterozygous genotype (A1A2).
Parameter (w22) – Relative fitness of one homozygous genotype (A2A2).
The most fit genotype must be assigned a relative fitness of 1, the others (those
being “selected against”) are assigned relative values (0 <= w <= 1).
Directional selection involves selection against one of the homozygotes but not
the other. This type of selection tends to favor one allele. This may result in
fixation of the favored allele due to complete elimination of the alternative allele.
Stabilizing selection involves selection against both homozygotes. This type of
selection results in an equilibrium (q stops changing). This situation results in a
“stable polymorphism,” where both alleles remain present in the gene pool.
Note: Selection tends to have a stronger effect on allele proportions when both
alleles are relatively frequent. If nearly all alleles are of one kind, there is little
genetic variation among individuals, and few experience their selective
disadvantage; thus changes in q become smaller per generation.
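The two selection regimes can be compared numerically. The Python sketch below applies the selection step with no inbreeding (F = 0) over many generations. The fitness values are illustrative choices and the function names are hypothetical. For stabilizing selection written as w11 = 1 - s, w12 = 1, w22 = 1 - t, the standard equilibrium is q = s / (s + t), which the second run should approach.

```python
def select(q, w11, w12, w22):
    """One generation of selection on the proportion q of allele A2 (no inbreeding)."""
    p = 1.0 - q
    mean_fitness = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return (p * q * w12 + q * q * w22) / mean_fitness


def run(q, generations, w11, w12, w22):
    """Apply the selection step repeatedly and return the final q."""
    for _ in range(generations):
        q = select(q, w11, w12, w22)
    return q


if __name__ == "__main__":
    # Directional selection against A2: q falls toward 0 (A1 approaches fixation).
    print(run(0.5, 200, w11=1.0, w12=0.75, w22=0.5))
    # Stabilizing selection with s = 0.2 and t = 0.4: q settles near s / (s + t) = 1/3.
    print(run(0.5, 200, w11=0.8, w12=1.0, w22=0.6))
```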
Gene flow:
Parameter (m) – the proportion of new migrants in a breeding population
each generation.
Parameter (qm) – the proportion of second allele (A2) among new migrants.
The effect of migrants entering the breeding population with their own
characteristic allele proportions may be very great if a high proportion of the
population (m) are recent migrants each generation, and those migrants have
allele proportions (qm) that are quite different from those in the rest of the
population. But if the proportion of migrants is low, or if allele proportions are
similar in residents and migrants, the effect of gene flow may be more similar to
that of mutations (an occasional source of alternate allele).
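To get a feel for how m and qm interact, the gene flow step can be tried with a few values. The Python sketch below is an illustration with hypothetical function names. It uses the gene flow equation from this manual, q - m(q - qm), and the closed form that follows from applying it repeatedly with no other force active.

```python
def gene_flow(q, m, qm):
    """One generation of gene flow: q moves a fraction m of the way toward qm."""
    return q - m * (q - qm)


def after_generations(q0, m, qm, n):
    """Closed form for n repeated steps: the gap to qm shrinks by (1 - m) each generation."""
    return qm + (q0 - qm) * (1 - m) ** n


if __name__ == "__main__":
    # Many migrants with very different allele proportions: a large change.
    print(gene_flow(0.5, m=0.1, qm=1.0))
    # Few migrants: a change comparable in size to a mutation rate.
    print(gene_flow(0.5, m=0.001, qm=1.0))
    print(after_generations(0.5, m=0.1, qm=1.0, n=50))
```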
Mutation:
Parameter (u) – Forward mutation rate (A1 --> A2).
Parameter (v) – Backward mutation rate (A2 --> A1).
Mutation is not considered an important force in changing allele proportions. The
importance of mutations is as a source of alternative alleles, some of which might
become abundant in a population due to genetic drift or to selection.
Mutation & Selection. To illustrate the interaction of mutation (a source of
alternative alleles) and selection (a potentially strong force changing allele
proportions), start with q0 = 0 so that A2 must be introduced by mutation.
Purifying selection. Start with only one allele (A1) in the gene pool (set the initial
value of q to 0); then assign realistic mutation rates that cause introduction of the
second allele (A2); and implement selection against the mutant by assigning w22
the lowest value, and w11 = 1.
Progressive selection. Start with only one allele (A1) in the gene pool as above;
use the same realistic mutation rates; and implement selection that favors the
mutant allele (w22 = 1; w11 < 1).
Equations Implemented in Sequential Order
Allele proportions, conventionally called "gene frequencies," of the first generation
parent population, must be specified by the user unless the default value q0 = 0.5 is
accepted.
$$p_0 = 1 - q_0$$
{q0, proportion of second allele (A2) in gene pool at start of simulation}
{p0, proportion of first allele (A1) in gene pool}
{NOTE: p + q = 1 because only two alleles comprise the gene pool.}
{NOTE: Subscripts for q and p below indicate sequential calculations within each
generation. All equations are implemented each generation, but default parameter values
preclude some evolutionary forces from affecting allele proportions unless they are
activated and their genetic parameters are assigned other values by the user.}
Genetic Drift
$$q_1 = q_0 + \left(\frac{p_0 q_0}{2N}\right)^{1/2}$$
$$p_1 = 1 - q_1$$
{N, effective population size; default N = 1,000,000}
Inbreeding & Selection
$$q_2 = \frac{(p_1 q_1 - F p_1 q_1)\,w_{12} + (q_1^2 + F p_1 q_1)\,w_{22}}{(p_1^2 + F p_1 q_1)\,w_{11} + (2 p_1 q_1 - 2 F p_1 q_1)\,w_{12} + (q_1^2 + F p_1 q_1)\,w_{22}}$$
$$p_2 = 1 - q_2$$
{F, inbreeding coefficient; default F = 0 if no inbreeding,
or F = 1/2N if inbreeding is being simulated}
{F increases each generation, $F_{g+1} = F_g (1 + F)$}
{w, relative fitness of genotype; default w11 = w12 = w22 = 1.00}
Gene Flow
$$q_3 = q_2 - m(q_2 - q_m)$$
$$p_3 = 1 - q_3$$
{m, proportion of new migrants per generation; default m = 0}
{qm, frequency of A2 in migrants; default qm = 1}
Mutation
$$q_4 = q_3 + u p_3 - v q_3$$
$$p_4 = 1 - q_4$$
{u, forward mutation rate (A1->A2) per gene per generation; default u = 0}
{v, back mutation rate (A2->A1) per gene per generation; default v = 0}
The resulting gene pool represents the next generation parent population.
$$q_0 = q_4$$
Repeat for a specified number of generations. {default = 100 generations}
{Note: The equations above were adapted by Dan M. Johnson from Population Biology:
The Evolution and Ecology of Populations by Philip W. Hedrick, published by Jones and
Bartlett Publishers, Inc., in 1984.}
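The sequence of calculations within one generation can be sketched in Python. This is a minimal illustration under stated assumptions, not VisualPopGen's actual code. All names (Params, next_generation, simulate) are hypothetical. The sign of the drift step is chosen by a coin flip, following the description of genetic drift in this manual. Back mutation is applied to the A2 proportion, which is the standard form. F is updated as F(1 + F) and capped at 1, and q is clamped to the range 0 to 1.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Params:
    N: int = 1_000_000      # effective population size
    F: float = 0.0          # inbreeding coefficient (0 when inbreeding is off)
    w11: float = 1.0        # relative fitness of A1A1
    w12: float = 1.0        # relative fitness of A1A2
    w22: float = 1.0        # relative fitness of A2A2
    m: float = 0.0          # proportion of new migrants per generation
    qm: float = 0.0         # proportion of A2 among migrants (no effect while m = 0)
    u: float = 0.0          # forward mutation rate, A1 -> A2
    v: float = 0.0          # back mutation rate, A2 -> A1
    q0: float = 0.5         # initial proportion of A2
    generations: int = 100


def clamp(x):
    """Keep a proportion inside [0, 1] (an assumption, not stated in the manual)."""
    return min(1.0, max(0.0, x))


def next_generation(q, F, prm, rng):
    # Genetic drift: step of sqrt(pq / 2N); the coin-flip sign is an assumption.
    p = 1.0 - q
    step = math.sqrt(p * q / (2 * prm.N))
    q = clamp(q + step if rng.random() < 0.5 else q - step)
    p = 1.0 - q
    # Inbreeding and selection.
    fpq = F * p * q
    numerator = (p * q - fpq) * prm.w12 + (q * q + fpq) * prm.w22
    mean_fitness = ((p * p + fpq) * prm.w11
                    + 2 * (p * q - fpq) * prm.w12
                    + (q * q + fpq) * prm.w22)
    if mean_fitness > 0:    # guard against division by zero when every fitness is 0
        q = numerator / mean_fitness
    # Gene flow.
    q = q - prm.m * (q - prm.qm)
    # Mutation: forward mutation adds to q, back mutation removes from q.
    return clamp(q + prm.u * (1.0 - q) - prm.v * q)


def simulate(prm, seed=None):
    """Return the list of q values, starting with q0, one entry per generation."""
    rng = random.Random(seed)
    q, F = prm.q0, prm.F
    trajectory = [q]
    for _ in range(prm.generations):
        q = next_generation(q, F, prm, rng)
        F = min(1.0, F * (1.0 + F))   # one reading of "F increases each generation"
        trajectory.append(q)
    return trajectory


if __name__ == "__main__":
    print(simulate(Params(), seed=1)[-1])                        # stays close to 0.5
    print(simulate(Params(N=100, w12=0.9, w22=0.8), seed=1)[-1]) # drift plus selection
```

Passing a seed makes a run repeatable. Leaving it out gives a different drift path each time, much as repeated use of the Plot button does.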
Familiarizing yourself with VisualPopGen
Overview
This section describes the graphical user interface of VisualPopGen, including the graph
tabs and menu options.
Open VisualPopGen by double clicking the VisualPopGen icon. The program displays one
parent window containing five property pages (tabs): four (4) individual graph pages and
one composite graph page. At initial program startup only one graph page is active; it
contains the default parameter values, and no evolutionary forces are selected. Only
active graph pages may be selected. Selecting New from the File menu (File->New)
activates the next available inactive graph page and sets its parameters to the default
values. Once all four (4) graphs are active, selecting File->New produces an error
message. If two or more graphs are active at any given time the Composite
page will be active as well. Otherwise the composite page will remain inactive.
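The page activation rules above amount to a small state machine. The Python sketch below is a toy model of those rules only. The class and method names are hypothetical and say nothing about how VisualPopGen is implemented.

```python
class GraphPages:
    """Toy model of the four graph pages and the Composite page."""

    MAX_GRAPHS = 4

    def __init__(self):
        self.active_graphs = 1  # one graph page (Plot1) is active at startup

    def new(self):
        """Mimic File->New: activate the next inactive graph page."""
        if self.active_graphs >= self.MAX_GRAPHS:
            raise RuntimeError("All four graph pages are already active.")
        self.active_graphs += 1
        return f"Plot{self.active_graphs}"

    @property
    def composite_active(self):
        """The Composite page is available only when two or more graphs are active."""
        return self.active_graphs >= 2


if __name__ == "__main__":
    pages = GraphPages()
    print(pages.composite_active)  # Composite is inactive with a single graph
    print(pages.new())             # activates the second graph page
    print(pages.composite_active)  # Composite is now active
```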
At the bottom left corner of each graph page are tabs containing the names of the
graph pages. A tab name is shown in normal type as long as the current parameter values
have been used to calculate the current graph. The name becomes bold when that graph
page’s current parameter values are not represented by the existing graph. This signals
the user to Plot the graph again to see the effects of the changed parameter values.
Each graph page will consist of the following:
• A label identifying the name of the graph. The default is “PlotX” where X is the
graph number and the Composite page is named “Composite.” The label will be on
the tab that corresponds to the graph. You will have the option to rename this label (a
maximum of 15 characters long).
• A list of evolutionary forces with a corresponding check box to the left of each. An
evolutionary force will be active when its box is checked and inactive when cleared.
• A list of the genetic parameters with a corresponding text box to the right of each
parameter. The text boxes will contain up and down arrows for using the mouse to
increment/decrement the numeric values contained in the text box. If the user wishes
to enter the text from the keyboard the Tab button must be pressed before input will
be accepted. After a value in a text box has been changed, the new values will not be
accepted until the user places the mouse pointer somewhere outside of the text box
and clicks the left mouse button.
• Each text box for a genetic parameter will initially be inactive until the relevant
evolutionary force check box has been selected.
• A pane with a blank x/y graph. Axes of the graph will be labeled q, the proportion of
the second allele, (on the y-axis) and Generations Over Time (the x-axis).
• A Reset button to reset a graph to its default parameters.
• A Plot button to graph the results of the current simulation. If the Plot button is
selected, the mouse button can be held down, or the Enter button on the keyboard
can be held down, and the graph will repeatedly re-calculate graphs using current
parameter values. This is especially interesting if chance phenomena (genetic drift,
mutation) are activated so that each simulation is expected to be somewhat different.
The composite page will consist of the following:
• A pane with a blank x/y graph. The axes of this graph will be labeled Proportion q
(the y-axis) and Generations Over Time (the x-axis).
• A list of graphs held in the Data Module and a corresponding check box for each.
Checking this box for a graph will include that graph in the composite graph.
• A pane listing the genetic parameters for any graph selected by the user. Only one
set of parameters may be selected at a time.
Hot Key List
File                      ALT + F
  New                     CTRL + N
  Open                    CTRL + O
  Save                    CTRL + S
  Save As
  Save All
  Save As Bitmap          CTRL + B
  Print                   CTRL + P
    Composite
    Graph1
    Graph2
    Graph3
    Graph4
  Exit
Generations               ALT + G
Help                      ALT + H
  Help (Online Help)      F1
  About
Note: The user will be able to get through the main menu by using the tab key.
Menu options
File Menu Options
New
(Activates new graph page...if an inactive page exists. If this
option is selected and no inactive page is available an error
message will appear)
Open
(Opens a graph using genetic parameters stored in a disk file.)
Save
(Save active graph in focus to a disk file. Use this option if the
parameters may be changed at a later date.)
Save As Bitmap
(Save active graph in focus under a new name, 15 characters in
length, as Bitmap (.BMP) graphics format. Use this option if the
graph will be printed at a later date but not changed.)
Save All
(Save all active graphs using their respective tab names. All
graphs may be opened and changed at a later date.)
Print
(Opens a popup submenu for selecting the graph(s) to print, then brings
up a Print dialog box. The default setting is color printing. If a color
printer is not available, or the user wishes to print in grayscale, the
grayscale option must be selected. Graphs may only be printed in
Portrait format.)
This is the popup submenu when you select Print:
Print (Print the selected graphs.)
Default Printer (Shows the default printer the system uses.)
Composite (Select composite page to print.)
Graph1, Graph2, Graph3, Graph4 (Any active graphs that you wish to print will be selected (checked) here.)
Scale (Allows user to select grayscale or color option for printing.)
Exit
(Quit program.)
Generations Menu Options
Set Number of Generations
(Allows user to select from 10 to 1,000
generations for the simulation.)
Help Menu Options
Help
(Help: Online Users Guide to VisualPopGen.)
About
(About screen: The Team)
The VisualPopGen Team
Phase 1:
Project Manager: Roger Snodgrass
Configuration Managers: David Mumpower, Jennifer Smith
Requirements Team: Melanie Timbs, Jay K. Singleton
Preliminary Design Team: Edward Ho, Sushma Patel, Chris Simons
Testing Team: Brandon Doran, Shannon Whitt, Eric Wilhoit
Tool/User Manual Team: David Blair, Aaron Umbarger, Arlene Miles, Wendi Warden
Phase 2:
Project Manager: Roger Snodgrass
Configuration Managers: David Mumpower, Jennifer Smith
Testing Team: Aaron Umbarger, Wendi Warden
Tools/User Manual Team: David Blair, Shannon Whitt, Jay K. Singleton, Arlene Miles
Detailed Design: Melanie Timbs, Sushma Patel, Brandon Doran
Code and Unit Testing: Edward Ho, Chris Simons, Eric Wilhoit
Phase 3:
Project Manager: Roger Snodgrass
Configuration Manager: Jennifer Smith
Testing Team: Chris Simons, Jay K. Singleton, Melanie Timbs, Sushma Patel
Tools/User Manual Team: Wendi Warden, Shannon Whitt, David Blair, Arlene Miles
Code and Unit Testing: Aaron Umbarger, David Mumpower, Edward Ho, Brandon Doran, Eric Wilhoit
Other Contributors:
Doug Harris
Todd Hawthorne
Rob Kubicki
Brian Luethke
Kate Tebeau
James Thomas
Dr. Dan M. Johnson
Dr. Don Gotterbarn
Saving and Printing Graphs
Opening a Saved, Non-Bitmap, Graph
To open a saved, non-bitmap graph, the user must know where the graph file was saved.
First left click on the File menu option and then left click on the Open option. A window
will appear asking the user to select a file to open. Choose the correct file and press the
OK button. The graph will appear in the VisualPopGen environment in the first available
inactive graph page. If there are no inactive graph pages the user must close one active
graph, making room for the new graph.
Printing a Graph
To print a graph, left click once on the File menu option and left click on the Print menu
option. A window will appear asking the user to check the boxes next to the graph(s)
they wish to print. After choosing the graph(s), left click on the Print button in this
window. Another window will appear asking for the printing destination. This option
should default to the printer attached to the system (the print option cannot be performed
unless a working printer is attached to the computer or to the network the computer is
on). Left click on the OK button once the correct printer is set as the working printer. If
multiple printers are used on the system on which VisualPopGen is installed, the user
must choose one by entering the Print Setup menu under the File menu. This screen also
allows the user to print multiple copies if needed. Select the correct printer from the
Name box and left click on the OK button to activate the print option.
Once the graphs are selected, the Print button is pressed, and the printer is selected, the
graph will be printed. A row of genetic parameter values used to generate each
simulation included in the graph will be printed just below the image.
The following Composite graph was printed using the procedure described above. Note
that a row of genetic parameter values used to generate each simulation was printed
below the image. But this image had to be incorporated into this document the "old-fashioned" way--by literally cutting (with scissors) and pasting (with tape).
Saving Graph as a Bitmap Image
When saving a graph, the user has the option of saving the graph as a Bitmap image.
This option allows the user to save a graph and open the image in another application
(such as a word processor) or on another computer for later use. The image can be
opened in any Windows-based environment that supports Bitmap images. The user will
be able to print the image, as well as add commenting text to the graph for class
assignments. In this example the Plot1 graph will be saved as an image. Position the
mouse pointer over the File menu option and left click once. This will cause the File
menu to appear. Left click once on the Save As Bitmap option. A window will appear
allowing the user to choose a destination and enter a file name for the saved graph. File
names should be kept to a maximum of 15 characters.
Select a destination (if using a floppy disk choose A) and name the file. Then left click
once on the Save button and the graph will be saved.
Note that when a Bitmap image is retrieved and inserted into another application (or
printed), the row(s) of genetic parameter values used to generate each simulation will not
be printed below the image. Here is an example:
Example Scenarios
Simulating the Hardy-Weinberg Equilibrium
When any new graph tab is opened or the Reset button is pressed, a set of default genetic
parameter values will appear in the text boxes. These values have the effect of causing
no evolutionary forces to affect allele proportions, simulating what is known as
Hardy-Weinberg Equilibrium. If the user presses the Plot button a graph of a straight line
should appear.
Example: (example from VisualPopGen)
The graph in this example is a (nearly) straight line because the proportion of the gene
pool, q, comprised by the second allele (A2) is not changed from its initial value, q0 = 0.5,
during 100 generations of simulation. The very slight deviations from a straight line are
attributable to genetic drift, even with N = 1,000,000.
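The size of these deviations follows from the drift term in the Genetic Drift equation, (pq / 2N) raised to the power 1/2. As a quick check, the Python sketch below evaluates that term at q = 0.5 for a few population sizes. The function name is hypothetical.

```python
import math


def drift_step(q, N):
    """Size of the per-generation drift term, sqrt(p * q / (2 * N))."""
    return math.sqrt((1.0 - q) * q / (2 * N))


if __name__ == "__main__":
    for N in (10, 100, 1_000_000):
        # The step shrinks as N grows; at N = 1,000,000 it is about 0.00035.
        print(N, round(drift_step(0.5, N), 6))
```

Even with the default N, a change of roughly 0.00035 per generation is possible, which is why the plotted line is not perfectly straight.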
Simulating effects of user entered parameters
To activate an evolutionary force, the user must mark the check box to the left of that
force by either left clicking the mouse button once while the mouse pointer is positioned
over the box, or by pressing the Tab key until it is accessed. This will cause text boxes
to the right of parameter names to become active. The user can then enter values for
those parameters and plot a graph showing the effects of those forces on the value of q.
To enter parameter values in these boxes, highlight the box and key in a valid value. To
activate that value, left click once anywhere on the graph page and VisualPopGen will
accept the value, check it for validity, and format the value to be acceptable in the
implemented equations.
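The manual does not say how the validity check and formatting are carried out. One plausible reading, sketched below in Python, is a range check against the Expected Range column of the parameter table, followed by formatting to the number of decimal places implied by the Graduation column. The SPECS table and the accept function are hypothetical, and the decimal counts are assumptions inferred from that table.

```python
# parameter: (lowest allowed value, highest allowed value, decimal places shown)
SPECS = {
    "N": (1, 1_000_000, 0),
    "F": (0.0, 0.5, 3),
    "w11": (0.0, 1.0, 2),
    "w12": (0.0, 1.0, 2),
    "w22": (0.0, 1.0, 2),
    "m": (0.0, 1.0, 3),
    "qm": (0.0, 1.0, 3),
    "u": (0.0, 1.0, 7),
    "v": (0.0, 1.0, 7),
    "q0": (0.0, 1.0, 3),
}


def accept(name, text):
    """Check typed text against the parameter's range and return it formatted for display."""
    low, high, decimals = SPECS[name]
    value = float(text)  # raises ValueError if the text is not a number
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}")
    return f"{value:.{decimals}f}"


if __name__ == "__main__":
    print(accept("w22", "0.8"))
    print(accept("u", "0.000001"))
    print(accept("N", "100"))
```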
Example: One graph page.
The graph above shows the combined effects of two evolutionary forces--genetic drift
(with parameter N = 100) and selection (against both genotypes that include the second
allele). The effect of selection is to reduce the proportion of the gene pool comprised of
A2 from q0 = 0.5 at the beginning to q = 0 (p = 1; so A1 has been "fixed" in the gene pool)
at Generation 58. If genetic drift were not activated, the effect of selection would be a
very smooth line; genetic drift is responsible for the irregularities in the line--random
changes in allele proportions due to chance.
Viewing Composite graphs
The Composite graph page allows any set of the graphs displayed in the Plot1 through
Plot4 graph pages to be graphed together for comparison. To access graph pages Plot2
through Plot4, the user must enter the File menu by left clicking the mouse button once
over the File option and choosing New; repeat this procedure to activate Plot3 and Plot4.
Once two or more single graphs are active the user can left click on the Composite graph
page tab and select the graphs to compare. The graphs will appear in different colors,
allowing the user to distinguish between them. A key is displayed in the upper left-hand
corner of the graph page to assist the user in identifying the separate lines. Left clicking
on the down arrow box below Plots Shown causes a chart to appear displaying the
parameter values for one of the graphs being shown.
Example: (Composite page)
The composite graph above compares three simulations. The first (drift/select) is like the
previous example, except that random effects of genetic drift are different due to chance.
The second simulation (selection) shows the smooth curve expected if only selection
were activated with the same parameter values used in the first simulation (and in the
previous example). The third simulation (select/geneflow) shows the combined effects of
selection (same as second simulation) and gene flow. The values of all parameters for
the third simulation are displayed in a window to the lower left of the page.
Glossary of Terms
Note: Each term is defined with respect to the meaning within this document. The terms
defined are found italicized in the document.
active - Any window or object which is able to acquire the focus.
allele - One of two alternative forms of a gene comprising a population's gene pool,
namely: A1 and A2.
check box - Allows selecting/deselecting evolutionary forces.
disabled - Making an object or option inactive.
enabled - Making an object or option active.
evolutionary force - One of five factors involved in graphing which are either
selected/deselected via check boxes. When selected the force's respective genetic
parameters are enabled.
focus - Any window or object in which a user is currently working and to which the user
can make changes.
genetic parameters - One of ten factors involved in graphing which are modifiable by the
user. Each parameter (excluding q0) is associated with an evolutionary force.
graph pane - Set of x/y axes containing the plotted function on each page.
graph parameters - Specific arrangement of evolutionary forces and genetic parameters
necessary to create a graph.
inactive - Any window or object which is incapable of acquiring the focus with the current
settings. Known sometimes as "grayed out" or "disabled".
page - One of five "tabs" which contain graph parameters and a graph pane. Only one
page may have the focus at any given time.
spin control - Up and down arrows for using the mouse to increment/decrement the
numeric values contained in the text box.
text box - Allows genetic parameter values to be typed via keyboard.
x/y axis - Quadrant I of the standard Cartesian coordinate plane, where x is the horizontal
axis and y is the vertical axis.