VisualPopGen User’s Manual

This software was produced by students in Software Engineering I (CSCI 3250-001; Fall 1998) and Software Engineering II (CSCI 3350-001; Spring 1999) at East Tennessee State University. The course was taught by Dr. Don Gotterbarn, Department of Computer and Information Sciences. The "client" for whom the software was designed was Dr. Dan M. Johnson, Department of Biological Sciences.

VisualPopGen may be obtained free from the following web site: http://www-cs.etsu.edu/resrch.htm

For software-related questions contact Dr. Don Gotterbarn at gotterba@etsu.edu. For biological questions contact Dr. Dan M. Johnson at johnsodm@etsu.edu.

VisualPopGen 106754324

Table of Contents

Introduction
    Overview of Software Package
    System Requirements
    Installing VisualPopGen
Biological Background
    Evolutionary Forces, Genetic Parameters, and Default Values
    Definition of Evolutionary Forces and Genetic Parameters
    Equations Implemented in Sequential Order
Familiarizing Yourself With VisualPopGen
    Overview
    Hot Key List
    Menu Options
        File Menu Options
        Generations Menu Options
        Help Menu Options
Saving and Printing Graphs
    Opening a Saved, Non-Bitmap, Graph
    Printing a Graph
    Saving Graph as a Bitmap Image
Example Scenarios
    Simulating Hardy-Weinberg Equilibrium
    Simulating Effects of User Entered Parameters
    Viewing Composite Graphs
Glossary of Terms

Introduction

Overview of Software Package

The VisualPopGen software package is designed for students in the Department of Biological Sciences (Principles of Biology Lab III) at East Tennessee State University. This program calculates the sequential effects of genetic drift, inbreeding, selection, gene flow, and mutation on the proportion of a population’s gene pool comprised of the second of two alleles (for definitions of these terms, look under the subheading Biological Background).
Simulations are usually for 100 generations, but they can be conducted for as few as 10 or as many as 1,000 generations. VisualPopGen’s main purpose is to help the user explore the effects of various evolutionary forces on a population’s gene pool by providing graphs of the calculations, allowing the user to visualize the changes.

System Requirements

VisualPopGen is designed to run on IBM-compatible machines with a Windows 95, 98, NT 4.0, or compatible operating system. If the operating system is Windows NT, Service Pack 3, which is available free of charge, must be installed on the system. The software is designed to be used with a keyboard or a mouse, and with an SVGA color monitor set at 800x600 (256 colors) resolution or higher. Users with lower-resolution monitors or with monochrome monitors will be unable to run this software.

A printer must be correctly attached to the system in order to produce graph printouts. A color or laser printer is highly recommended for the best printout quality. If a dot matrix printer is used, the distinction between graphs on a composite printout may be unclear.

Installing VisualPopGen

Before VisualPopGen can be run, the installation program must install the necessary files onto the system. The installation program detects the necessary requirements for each system.

If installing from a CD:

1. Place the CD into the CD-ROM drive on the computer. At this point the CD should automatically run, and the InstallShield program will walk the user through the installation process. If the CD does not automatically run, use the following steps for drive installation.
2. Double left click the My Computer icon from the Windows desktop environment.
3. Double left click the Control Panel option.
4. Double left click the Add/Remove Programs option.
5. Choose the Install option, then press Next.
6. If VisualPopGen is not on a floppy disk, the Browse option may be used to find it on another drive.
7. Choose setup.exe and left click Next.
8. After the installation is complete, the InstallShield program will ask the user to reboot the system. This reboot is necessary. The VisualPopGen icon will not appear until the reboot is complete.

After the system has rebooted, the VisualPopGen menu option will be available in the Programs section of your Windows environment. Place the mouse pointer over the VisualPopGen item, then double click on the entry in the submenu that appears.

Biological Background

Evolutionary Forces, Genetic Parameters, and Default Values

Evolutionary Force   Genetic Parameter   Name                                                    Expected Range   Default Value     Graduation
------------------   -----------------   -----------------------------------------------------   --------------   ---------------   ----------
Genetic Drift        N                   Effective population size                               1 - 1,000,000    1,000,000         1
Inbreeding*          F                   Inbreeding coefficient                                  0 - 0.5          (1/(2N)) or 0*    0.001
Selection            w11                 Relative fitness of the three possible genotypes        0 - 1            1.00              0.01
                     w12                                                                         0 - 1            1.00              0.01
                     w22                                                                         0 - 1            1.00              0.01
Gene Flow            m                   Proportion of new migrants per generation               0 - 1            0.000             0.001
                     qm                  Proportion of allele 2 in migrants' gene pool           0 - 1            0.000             0.001
Mutation             u                   Forward mutation rate (A1->A2)                          0 - 1            0.0000000         0.0000001
                     v                   Back-mutation rate (A2->A1)                             0 - 1            0.0000000         0.0000001
                     q0                  Initial proportion of A2 in the population's gene pool  0 - 1            0.500             0.001
                     Generations         Number of generations involved in calculations          10 - 1000        100               10

*F = 1/(2N) if Inbreeding is activated for simulation; F = 0 if it is not.

Definition of Evolutionary Forces and Genetic Parameters

Genetic Drift:
Parameter (N) – Effective breeding population size.
When N is small, only a small number of alleles (2N) comprise the gene pool passed from one generation to the next. Under these circumstances, allele proportions may be affected by chance (random) sampling errors. Changes could be positive or negative (flip of a coin). Genetic drift may lead to fixation of an allele when the alternative allele accidentally fails to be passed on during one generation.

Inbreeding:
Parameter (F) – Inbreeding coefficient.
Breeding among relatives causes some individuals who would otherwise have been heterozygotes to be homozygous because the two alleles in their genotype are “identical by descent.” The inbreeding coefficient, F, is the proportion by which the expected heterozygote genotype proportion is reduced due to inbreeding. Half of that proportion is added to each of the two homozygous genotype proportions. The value of F increases by a factor of (1+F) each generation. Inbreeding will not change allele proportions (p & q) unless combined with selection, the only evolutionary force that acts directly on genotypes.

Selection:
Parameter (w11) – Relative fitness of one homozygous genotype (A1A1).
Parameter (w12) – Relative fitness of the heterozygous genotype (A1A2).
Parameter (w22) – Relative fitness of the other homozygous genotype (A2A2).
The most fit genotype must be assigned a relative fitness of 1; the others (those being “selected against”) are assigned relative values (0 <= w <= 1).

Directional selection involves selection against one of the homozygotes but not the other. This type of selection tends to favor one allele. This may result in fixation of the favored allele due to complete elimination of the alternative allele.

Stabilizing selection involves selection against both homozygotes. This type of selection results in an equilibrium (q stops changing). This situation results in a “stable polymorphism,” where both alleles remain present in the gene pool.

Note: Selection tends to have a stronger effect on allele proportions when both alleles are relatively frequent. If nearly all alleles are of one kind, there is little genetic variation among individuals, and few experience their selective disadvantage; thus changes in q become smaller per generation.

Gene Flow:
Parameter (m) – The proportion of new migrants in a breeding population each generation.
Parameter (qm) – The proportion of the second allele (A2) among new migrants.
The effect of migrants entering the breeding population with their own characteristic allele proportions may be very great if a high proportion of the population (m) are recent migrants each generation, and those migrants have allele proportions (qm) that are quite different from those in the rest of the population. But if the proportion of migrants is low, or if allele proportions are similar in residents and migrants, the effect of gene flow may be more similar to that of mutation (an occasional source of an alternative allele).

Mutation:
Parameter (u) – Forward mutation rate (A1 --> A2).
Parameter (v) – Backward mutation rate (A2 --> A1).
Mutation is not considered an important force in changing allele proportions. The importance of mutations is as a source of alternative alleles, some of which might become abundant in a population due to genetic drift or to selection.

Mutation & Selection. To illustrate the interaction of mutation (a source of alternative alleles) and selection (a potentially strong force changing allele proportions), start with q0 = 0 so that A2 must be introduced by mutation.

Purifying selection. Start with only one allele (A1) in the gene pool (set the initial value of q to 0); then assign realistic mutation rates that cause introduction of the second allele (A2); and implement selection against the mutant by assigning w22 the lowest value, and w11 = 1.

Progressive selection. Start with only one allele (A1) in the gene pool as above; use the same realistic mutation rates; and implement selection that favors the mutant allele (w22 = 1; w11 < 1).

Equations Implemented in Sequential Order

Allele proportions, conventionally called "gene frequencies," of the first-generation parent population must be specified by the user unless the default value q0 = 0.5 is accepted.
    $q_0$
    $p_0 = 1 - q_0$

{q0, proportion of second allele (A2) in gene pool at start of simulation}
{p, proportion of first allele (A1) in gene pool}
{NOTE: p + q = 1 because only two alleles comprise the gene pool.}
{NOTE: Subscripts for q and p below indicate sequential calculations within each generation. All equations are implemented each generation, but default parameter values preclude some evolutionary forces from affecting allele proportions unless they are activated and their genetic parameters are assigned other values by the user.}

Genetic Drift

    $q_1 = q_0 \pm \left( \dfrac{p_0 q_0}{2N} \right)^{1/2}$
    $p_1 = 1 - q_1$

{N, effective population size; default N = 1,000,000}
{The change may be positive or negative, by chance; see the definition of Genetic Drift.}

Inbreeding & Selection

    $q_2 = \dfrac{(p_1 q_1 - F p_1 q_1)\, w_{12} + (q_1^2 + F p_1 q_1)\, w_{22}}{(p_1^2 + F p_1 q_1)\, w_{11} + (2 p_1 q_1 - 2 F p_1 q_1)\, w_{12} + (q_1^2 + F p_1 q_1)\, w_{22}}$
    $p_2 = 1 - q_2$

{F, inbreeding coefficient; default F = 0 if no inbreeding, or F = 1/2N if inbreeding is being simulated}
{F increases each generation, $F_{g+1} = F_g (1 + F)$}
{w, relative fitness of genotype; default w11 = w12 = w22 = 1.00}

Gene Flow

    $q_3 = q_2 - m (q_2 - q_m)$
    $p_3 = 1 - q_3$

{m, proportion of new migrants per generation; default m = 0}
{qm, frequency of A2 in migrants; default qm = 1}

Mutation

    $q_4 = q_3 + u p_3 - v q_3$
    $p_4 = 1 - q_4$

{u, forward mutation rate (A1->A2) per gene per generation; default u = 0}
{v, back mutation rate (A2->A1) per gene per generation; default v = 0}

The resulting gene pool represents the next-generation parent population.

    $q_0 = q_4$

Repeat for a specified number of generations. {default = 100 generations}

{Note: The equations above were adapted by Dan M. Johnson from Population Biology: The Evolution and Ecology of Populations by Philip W. Hedrick, published by Jones and Bartlett Publishers, Inc., in 1984.}

Familiarizing Yourself With VisualPopGen

Overview

This section describes the graphical user interface of VisualPopGen, including the graph tabs and menu options. Open the VisualPopGen program by double clicking on the VisualPopGen icon.
You should see the VisualPopGen program open. The program will display one parent window containing five property pages (tabs) representing four (4) individual graph pages and one composite graph page. At initial program startup only one graph page will be active; it contains the default parameter values, with no evolutionary forces selected. Only graph pages containing active graphs may be selected. The inactive graphs will not be accessible until you select the File->New main menu item. If you select New from the File menu, the next available inactive graph page will become active and its parameters will be set to the default values. After four (4) graphs have been made active, the File->New menu option will produce an error message if selected. If two or more graphs are active at any given time, the Composite page will be active as well. Otherwise the Composite page will remain inactive.

At the bottom left corner of each graph page will be 4 tabs containing the names of the graph pages. A name is displayed in normal type as long as the current parameter values have been used to calculate the current graph. A tab name becomes bold when that graph page’s current parameter values are not represented by the existing graph. This signals the user to Plot the graph again to see the effects of the changed parameter values.

Each graph page will consist of the following:
• A label identifying the name of the graph. The default is “PlotX,” where X is the graph number, and the Composite page is named “Composite.” The label will be on the tab that corresponds to the graph. You will have the option to rename this label (a maximum of 15 characters long).
• A list of evolutionary forces with a corresponding check box to the left of each. An evolutionary force will be active when its box is checked and inactive when cleared.
• A list of the genetic parameters with a corresponding text box to the right of each parameter.
The text boxes will contain up and down arrows for using the mouse to increment/decrement the numeric values contained in the text box. If the user wishes to enter the text from the keyboard, the Tab key must be pressed before input will be accepted. After a value in a text box has been changed, the new value will not be accepted until the user places the mouse pointer somewhere outside of the text box and clicks the left mouse button.
• Each text box for a genetic parameter will be inactive until the relevant evolutionary force check box has been selected.
• A pane with a blank x/y graph. Axes of the graph will be labeled q, the proportion of the second allele (the y-axis), and Generations Over Time (the x-axis).
• A Reset button to reset a graph to its default parameters.
• A Plot button to graph the results of the current simulation. If the Plot button is selected and the mouse button or the Enter key on the keyboard is held down, the program will repeatedly re-calculate the graph using the current parameter values. This is especially interesting if chance phenomena (genetic drift, mutation) are activated, so that each simulation is expected to be somewhat different.

The composite page will consist of the following:
• A pane with a blank x/y graph. The axes of this graph will be labeled Proportion q (the y-axis) and Generations Over Time (the x-axis).
• A list of graphs held in the Data Module and a corresponding check box for each. Checking the box for a graph will include that graph in the composite graph.
• A pane listing the genetic parameters for any graph selected by the user. Only one set of parameters may be selected at a time.
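The page rules described in this Overview (one graph page active at startup, at most four graph pages, and a Composite page that becomes active once two or more graphs are active) can be summarized in a short sketch. The class and method names below are hypothetical and are used only for illustration; this is not VisualPopGen's actual code.

```python
class GraphPages:
    """Hypothetical model of the page rules; not VisualPopGen's own code."""

    MAX_GRAPHS = 4  # four individual graph pages plus one Composite page

    def __init__(self):
        # Only one graph page is active at initial program startup.
        self.active = ["Plot1"]

    def new(self):
        """Mimic File->New: activate the next inactive graph page."""
        if len(self.active) >= self.MAX_GRAPHS:
            # The manual says an error message appears in this case.
            raise RuntimeError("No inactive graph page is available.")
        name = f"Plot{len(self.active) + 1}"
        self.active.append(name)
        return name

    @property
    def composite_active(self):
        """The Composite page is active once two or more graphs are active."""
        return len(self.active) >= 2
```

Calling new() three times after startup activates Plot2 through Plot4; a fourth call raises the error that stands in for the program's error message.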
Hot Key List

Menu Item                   Hot Key
-------------------------   --------
File                        ALT + F
    New                     CTRL + N
    Open                    CTRL + O
    Save                    CTRL + S
    Save As
    Save All
    Save As Bitmap          CTRL + B
    Print                   CTRL + P
        Composite
        Graph1
        Graph2
        Graph3
        Graph4
    Exit
Generations                 ALT + G
Help                        ALT + H
    Help (Online Help)      F1
    About

Note: The user will be able to move through the main menu by using the Tab key.

Menu Options

File Menu Options

New (Activates a new graph page if an inactive page exists. If this option is selected and no inactive page is available, an error message will appear.)
Open (Opens a graph using genetic parameters stored in a disk file.)
Save (Saves the active graph in focus to a disk file. Use this option if the parameters may be changed at a later date.)
Save As Bitmap (Saves the active graph in focus under a new name, up to 15 characters in length, in Bitmap (.BMP) graphics format. Use this option if the graph will be printed at a later date but not changed.)
Save All (Saves all active graphs using their respective tab names. All graphs may be opened and changed at a later date.)
Print (Activates a popup submenu to select the graph(s) to print, then brings up a Print dialog box. The default setting is color printing. If a color printer is not available, or the user wishes to print in grayscale, the grayscale option must be selected. Graphs may only be printed in Portrait format.)

This is the popup submenu when you select Print:

    Print (Print the selected graphs.)
    Default Printer (Shows the default printer the system uses.)
    Composite (Select composite page to print.)
    Graph1, Graph2, Graph3, Graph4 (Any active graphs that you wish to print will be selected (checked) here.)
    Scale (Allows user to select grayscale or color option for printing.)

Exit (Quit program.)

Generations Menu Options

Set Number of Generations (Allows user to select from 10 to 1,000 generations for the simulation.)

Help Menu Options

Help (Help: Online User's Guide to VisualPopGen.)
About (About screen: The Team)

The VisualPopGen Team

Phase 1:
    Project Manager: Roger Snodgrass
    Configuration Managers: David Mumpower, Jennifer Smith
    Requirements Team: Melanie Timbs, Jay K. Singleton
    Preliminary Design Team: Edward Ho, Sushma Patel, Chris Simons
    Testing Team: Brandon Doran, Shannon Whitt, Eric Wilhoit
    Tool/User Manual Team: David Blair, Aaron Umbarger, Arlene Miles, Wendi Warden

Phase 2:
    Project Manager: Roger Snodgrass
    Configuration Managers: David Mumpower, Jennifer Smith
    Testing Team: Aaron Umbarger, Wendi Warden
    Tools/User Manual Team: David Blair, Shannon Whitt, Jay K. Singleton, Arlene Miles
    Detailed Design: Melanie Timbs, Sushma Patel, Brandon Doran
    Code and Unit Testing: Edward Ho, Chris Simons, Eric Wilhoit

Phase 3:
    Project Manager: Roger Snodgrass
    Configuration Manager: Jennifer Smith
    Testing Team: Chris Simons, Jay K. Singleton, Melanie Timbs, Sushma Patel
    Tools/User Manual Team: Wendi Warden, Shannon Whitt, David Blair, Arlene Miles
    Code and Unit Testing: Aaron Umbarger, David Mumpower, Edward Ho, Brandon Doran, Eric Wilhoit

Other Contributors: Doug Harris, Todd Hawthorne, Rob Kubicki, Brian Luethke, Kate Tebeau, James Thomas, Dr. Dan M. Johnson, Dr. Don Gotterbarn

Saving and Printing Graphs

Opening a Saved, Non-Bitmap, Graph

To open a saved, non-bitmap graph, the user must have a graph saved and must know the location of the saved graph. First left click on the File menu option and then left click on the Open option. A window will appear asking the user to select a file to open. Choose the correct file and press the OK button. The graph will appear in the VisualPopGen environment in the first available inactive graph page. If there are no inactive graph pages, the user must close one active graph, making room for the new graph.

Printing a Graph

When printing a graph, the user will left click once on the File menu option and left click on the Print menu option.
A window will appear asking the user to check the boxes next to the graph(s) they wish to print. After choosing the graph(s), left click on the Print button in this window. Another window will appear asking for the printing destination. This option should default to the printer attached to the system (the user cannot print unless there is a working printer attached to the computer or to the network the computer is on). Left click on the OK button once the correct printer is set as the working printer.

If multiple printers are used on the system on which VisualPopGen is installed, the user must choose one by entering the Print Setup menu under the File menu. This screen will also allow the user to print multiple copies if needed. Select the correct printer from the Name box and left click on the OK button to activate the print option.

Once the graphs are selected, the Print button is pressed, and the printer is selected, the graph will be printed. A row of genetic parameter values used to generate each simulation included in the graph will be printed just below the image.

The following Composite graph was printed using the procedure described above. Note that a row of genetic parameter values used to generate each simulation was printed below the image. But this image had to be incorporated into this document the "old-fashioned" way: by literally cutting (with scissors) and pasting (with tape).

Saving Graph as a Bitmap Image

When saving a graph, the user has the option of saving the graph as a Bitmap image. This option allows the user to save a graph and open the graph image in another application (such as a word processor) or at another computer for later use. The image can be opened in any Windows-based environment that supports Bitmap images. The user will be able to print out the image, as well as add commenting text to the graph for class assignments. In this example the Plot1 graph will be saved as an image.
To do this, position the mouse pointer over the File menu option and left click once. This will cause the File menu to appear. Left click once again on the Save As Bitmap option. This will bring up a window allowing the user to choose a destination and give a file name to the saved graph. It is recommended that file names be kept to a maximum of 15 characters. The user should select a destination (if using a floppy disk, choose A) and name the file. Then left click once on the Save button and the graph will be saved.

Note that when a Bitmap image is retrieved and inserted into another application (or printed), the row(s) of genetic parameter values used to generate each simulation will not be printed below the image.

Here is an example: (bitmap image not reproduced here)

Example Scenarios

Simulating the Hardy-Weinberg Equilibrium

When any new graph tab is opened or the Reset button is pressed, a set of default genetic parameter values will appear in the text boxes. With these values no evolutionary force affects allele proportions, which simulates what is known as Hardy-Weinberg Equilibrium. If the user presses the Plot button, a graph of a straight line should appear.

Example: (example from VisualPopGen)

The graph in this example is a (nearly) straight line because the proportion of the gene pool, q, comprised by the second allele (A2) is not changed from its initial value, q0 = 0.5, during 100 generations of simulation. The very slight deviations from a straight line are attributable to genetic drift, even with N = 1,000,000.
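The Hardy-Weinberg behavior described above can be reproduced with a minimal Python sketch of the sequence given under Equations Implemented in Sequential Order. This is an illustration under stated assumptions, not VisualPopGen's actual code: the function names are hypothetical, the sign of the drift step is chosen by a coin flip (following the description of genetic drift), the back-mutation term is applied to the A2 proportion (the standard form), q is clamped to the range 0 to 1, and F is held fixed rather than increased each generation.

```python
import math
import random


def next_generation(q0, N=1_000_000, F=0.0, w11=1.0, w12=1.0, w22=1.0,
                    m=0.0, qm=1.0, u=0.0, v=0.0, rng=random):
    """Apply drift, inbreeding & selection, gene flow, and mutation once."""
    p0 = 1.0 - q0

    # Genetic drift. ASSUMPTION: the direction of the change is a coin flip.
    step = math.sqrt(p0 * q0 / (2 * N))
    q1 = q0 + step if rng.random() < 0.5 else q0 - step
    q1 = min(1.0, max(0.0, q1))  # ASSUMPTION: keep q a valid proportion
    p1 = 1.0 - q1

    # Inbreeding & selection (F moves heterozygotes into both homozygotes).
    half_het = p1 * q1 - F * p1 * q1
    hom11 = p1 * p1 + F * p1 * q1
    hom22 = q1 * q1 + F * p1 * q1
    mean_fitness = hom11 * w11 + 2 * half_het * w12 + hom22 * w22
    q2 = (half_het * w12 + hom22 * w22) / mean_fitness

    # Gene flow: migrants pull q toward their own proportion qm.
    q3 = q2 - m * (q2 - qm)
    p3 = 1.0 - q3

    # Mutation. ASSUMPTION: back mutation acts on the A2 proportion (q3).
    return q3 + u * p3 - v * q3


def simulate(q0=0.5, generations=100, seed=None, **params):
    """Return the list of q values, starting with q0."""
    rng = random.Random(seed)
    history = [q0]
    for _ in range(generations):
        history.append(next_generation(history[-1], rng=rng, **params))
    return history
```

With the default values, simulate() returns 101 values of q that stay close to 0.5, because each drift step is at most about 0.00035 when N = 1,000,000. Lowering N, for example simulate(N=100), makes the line visibly rougher.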
Simulating Effects of User Entered Parameters

To activate an evolutionary force, the user must mark the check box to the left of that force, either by left clicking once while the mouse pointer is positioned over the box or by pressing the Tab key until the box is reached. This will cause the text boxes to the right of the parameter names to become active. The user can then enter values for those parameters and plot a graph showing the effects of those forces on the value of q. To enter parameter values in these boxes, highlight the box and key in a valid value. To activate that value, left click once anywhere on the graph page; VisualPopGen will accept the value, check it for validity, and format it for use in the implemented equations.

Example: One graph page.

The graph above shows the combined effects of two evolutionary forces: genetic drift (with parameter N = 100) and selection (against both genotypes that include the second allele). The effect of selection is to reduce the proportion of the gene pool comprised of A2 from q0 = 0.5 at the beginning to q = 0 (p = 1; so A1 has been "fixed" in the gene pool) at Generation 58. If genetic drift were not activated, the effect of selection would be a very smooth line; genetic drift is responsible for the irregularities in the line, which are random changes in allele proportions due to chance.

Viewing Composite Graphs

The Composite graph page allows any set of the graphs displayed in the Plot1 through Plot4 graph pages to be graphed together for comparison. To access graph pages Plot2 through Plot4, the user must open the File menu by left clicking once on the File option and choosing New; repeat this procedure to activate Plot3 and Plot4. Once two or more single graphs are active, the user can left click on the Composite graph page tab and select the graphs to compare. The graphs will appear in different colors, allowing the user to distinguish between them.
A key is displayed in the upper left-hand corner of the graph page to assist the user in identifying the separate lines. Left clicking on the down arrow box below Plots Shown causes a chart to appear displaying the parameter values for one of the graphs being shown.

Example: (Composite page)

The composite graph above compares three simulations. The first (drift/select) is like the previous example, except that the random effects of genetic drift are different due to chance. The second simulation (selection) shows the smooth curve expected if only selection were activated with the same parameter values used in the first simulation (and in the previous example). The third simulation (select/geneflow) shows the combined effects of selection (same as the second simulation) and gene flow. The values of all parameters for the third simulation are displayed in a window to the lower left of the page.

Glossary of Terms

Note: Each term is defined with respect to its meaning within this document. The terms defined are found italicized in the document.

active - Any window or object which is able to acquire the focus.
allele - One of two alternative forms of a gene comprising a population's gene pool, namely A1 and A2.
check box - Allows selecting/deselecting evolutionary forces.
disabled - Making an object or option inactive.
enabled - Making an object or option active.
evolutionary force - One of five factors involved in graphing which are either selected or deselected via check boxes. When selected, the force's respective genetic parameters are enabled.
focus - Any window or object in which a user is currently working and to which the user can make changes.
genetic parameters - One of ten factors involved in graphing which are modifiable by the user. Each parameter (excluding q0) is associated with an evolutionary force.
graph pane - Set of x/y axes containing the plotted function on each page.
graph parameters - Specific arrangement of evolutionary forces and genetic parameters necessary to create a graph.
inactive - Any window or object which is incapable of acquiring the focus with the current settings. Known sometimes as "grayed out" or "disabled".
page - One of five "tabs" which contain graph parameters and a graph pane. Only one page may have the focus at any given time.
spin control - Up and down arrows for using the mouse to increment/decrement the numeric values contained in the text box.
text box - Allows genetic parameter values to be typed via keyboard.
x/y axis - Quadrant I of the standard Cartesian coordinate plane, where x is the horizontal axis and y is the vertical axis.