Analysis of Microsatellite Loci on Chromosome 15
Melissa Bochnowicz
Partner: Elizabeth Davidson
Instructor: Dr. Carlini & Ben Gamache
Due: Monday, April 29, 2013
Biology-356 Section 001H

Abstract: Genomic DNA was extracted, then analyzed through PCR amplification and gel electrophoresis for three microsatellite loci on chromosome 15. The resulting bands were measured and compiled by genotype in order to determine both allelic frequencies and whether the sample studied (the class) was in Hardy-Weinberg equilibrium at one of the loci. Based on the data, 17 of 28 possible genotypes were observed with 7 different alleles at the D15S657 locus, 28 of 66 were observed with 11 different alleles at the D15S652 locus, and 16 of 28 were observed with 7 different alleles at the D15S655 locus; of the 51,744 possible three-locus genotypes, only 43 were present. Allelic frequency was calculated by dividing the number of times an allele occurred by the total number of alleles present (in this case, 86). A chi-squared test of equilibrium at the D15S657 locus gave a chi-squared value of 12.63460511 and a p-value of 0.9914 with 27 degrees of freedom, so it was determined that the sample was in Hardy-Weinberg equilibrium.

Introduction: Microsatellites are a category of simple sequence repeats within the DNA that account for roughly 3% of the human genome. Most of these loci are not transcribed into RNA and are considered highly polymorphic, since the mutations that accumulate within these portions of the DNA do not affect the phenotype and therefore are not repaired as often as those in coding portions of the DNA. This polymorphism reflects the highly variable number of copies of the repeat unit in any one allele; such loci are also referred to as Variable Number Tandem Repeats (VNTRs), and new alleles arise from slippage errors by DNA polymerase during replication that change the number of repeats.
Since DNA polymerase does not fix these mutations, they often result in the lengthening or shortening of microsatellites, which in turn changes the number of repeating units (Lai & Sun 2003). Thus, each locus alone can contain hundreds of different alleles, making most individuals heterozygous at these points in the DNA and giving each of them a unique DNA profile (fingerprint) if enough loci are analyzed. Since each individual (other than identical twins) has a unique DNA fingerprint, microsatellite analysis is typically used in criminal forensics and paternity suits to identify the proper individuals (Carlini 2013).

For this experiment, students' genomic DNA was extracted, then analyzed through PCR amplification and gel electrophoresis in order to determine whether the class was in Hardy-Weinberg equilibrium. This was done by using a distinct primer pair for each of the three loci in the PCR amplification, analyzing the genotypic data gathered for the entire class, and comparing these data to the results expected under Hardy-Weinberg equilibrium. It was hypothesized that the class would not be in Hardy-Weinberg equilibrium due to the size and nature of the sample studied.

Materials & Methods: During Week 1, a 0.5X dilution of genomic DNA (obtained earlier in the semester) was created by transferring 5 μl of the DNA to a tube containing 5 μl of ddH2O. A 0.25X dilution was prepared by transferring 5 μl of the 0.5X dilution to a different tube containing 5 μl of ddH2O. Both tubes were gently mixed, and 1 μl from each dilution was pipetted into a separate tube. Once complete, the samples were placed in a thermocycler to undergo the thermal profile shown below. The samples were then kept at 4°C for the next 2 weeks.

Thermal Profile
Step                                    Temperature   Time
Denaturation                            95°C          5:00
5 cycles, decreasing anneal             95°C          0:45
temp. 2°C each cycle:                   68°C          5:00
                                        72°C          1:00
5 cycles, decreasing anneal             95°C          0:45
temp. 2°C each cycle:                   58°C          2:00
                                        72°C          1:00
25 cycles:                              95°C          0:45
                                        50°C          2:00
                                        72°C          1:00
Final Polymerization                    72°C          5:00

After 2 weeks, the samples were loaded into a gel and run for approximately 45 minutes. Images of the gel were then uploaded and the DNA profile analyzed. First, the straight-line vertical distance from the top of the gel was measured for each of the 7 bands in both the far-left ladder and the far-right ladder. These values were compiled and a linear regression was performed for each ladder in order to obtain an equation for calculating the size of the DNA bands. After the distance from the top of the gel was measured for each of the 10 subsequent samples, the size of each allele was calculated using the equation for whichever ladder the sample was closer to. Once complete, all data were compiled and compared.

Next, a series of questions was answered in order to determine first the allelic frequencies, then whether the genotypes observed within the class were in Hardy-Weinberg equilibrium. To start, the number of different alleles was identified, then the number of possible different genotypes was calculated. This value was compared to the actual number of genotypes observed for the class. Allelic frequency was then calculated by counting the number of times a single allele appeared and dividing this value by the total number of alleles present (in this case, 86). Finally, the expected genotype frequencies under Hardy-Weinberg equilibrium were calculated for the D15S657 locus only, using the expanded form of the equation below:

$(p + q + r + s + t + u + v)^2 = 1$

where each variable represents the frequency of one of the 7 alleles. In the expansion, the squared terms (e.g., $p^2$) give the expected frequencies of the 7 homozygous genotypes, and the cross terms (e.g., $2pq$) give the expected frequencies of the 21 heterozygous genotypes. These frequencies were multiplied by the sample size in order to determine the expected number of students with each genotype.
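The genotype count and the expected-number calculation described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet actually used for the report: the function names `possible_genotypes` and `expected_counts` are hypothetical, the allele frequencies are the rounded D15S657 values from Table 1, and the sample size of 43 students is assumed from the 86 alleles counted.

```python
from itertools import combinations_with_replacement

# Rounded D15S657 allele frequencies from Table 1 (allele size -> frequency).
freqs = {344: 0.058, 348: 0.279, 352: 0.198, 356: 0.233,
         360: 0.116, 364: 0.105, 368: 0.012}

# Assumption: 86 alleles counted / 2 alleles per student = 43 students.
n_students = 43


def possible_genotypes(n_alleles):
    """Number of distinct unordered genotypes for n alleles: n(n + 1) / 2."""
    return n_alleles * (n_alleles + 1) // 2


def expected_counts(allele_freqs, n):
    """Expected number of individuals per genotype under Hardy-Weinberg."""
    counts = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        freq = allele_freqs[a] * allele_freqs[b]
        if a != b:
            freq *= 2  # heterozygotes can form two ways (a from either parent)
        counts[(a, b)] = freq * n
    return counts


expected = expected_counts(freqs, n_students)

# Genotypes possible at loci with 7, 11, and 7 alleles, and across all three.
per_locus = [possible_genotypes(k) for k in (7, 11, 7)]
print(per_locus, per_locus[0] * per_locus[1] * per_locus[2])

# Expected number of 348/356 heterozygotes: 2 * 0.279 * 0.233 * 43.
print(expected[(348, 356)])
```

Because the frequencies are rounded to three decimals, the expected counts sum to slightly more than 43; carrying unrounded frequencies would remove that small discrepancy.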
A chi-squared test was then performed to determine whether the observed data were in Hardy-Weinberg equilibrium.

Results:

Table 1: Allele Frequency for Chromosome 15 at 3 Loci
D15S657 Locus         D15S652 Locus         D15S655 Locus
Allele   Frequency    Allele   Frequency    Allele   Frequency
344      0.058        300      0.035        251      0.012
348      0.279        303      0.221        254      0.209
352      0.198        306      0.047        257      0.302
356      0.233        309      0.081        260      0.128
360      0.116        312      0.093        263      0.035
364      0.105        315      0.116        266      0.140
368      0.012        318      0.058        269      0.174
                      321      0.198
                      324      0.093
                      327      0.023
                      330      0.035

Table 2: Test for Hardy-Weinberg Equilibrium at D15S657 Locus
Genotype   Expected Frequency   Expected Number of Students   Observed Number of Students   (O-E)²/E
344/344    0.0038               0.1453                        0                             0.1453
344/348    0.03245              1.3953                        3                             1.8453
344/352    0.02299              0.9884                        1                             0.0001
344/356    0.027028             1.162204                      1                             0.0226381407
344/360    0.013456             0.578608                      0                             0.578608
344/364    0.01218              0.52374                       0                             0.52374
344/368    0.001392             0.059856                      0                             0.059856
348/348    0.077841             3.347163                      4                             0.1273305628
348/352    0.110484             4.750812                      4                             0.1186573283
348/356    0.130014             5.590602                      4                             0.4525478155
348/360    0.064728             2.783304                      2                             0.220448944
348/364    0.05859              2.51937                       2                             0.1070685119
348/368    0.006696             0.287928                      1                             1.761018495
352/352    0.039204             1.685772                      2                             0.0585721177
352/356    0.092268             3.967524                      5                             0.2686831108
352/360    0.045936             1.975248                      2                             0.0003101694086
352/364    0.04158              1.78794                       1                             0.3472428849
352/368    0.004752             0.204336                      0                             0.204336
356/356    0.054289             2.334427                      2                             0.0479095805
356/360    0.054056             2.324408                      3                             0.1963616329
356/364    0.04893              2.10399                       3                             0.3815768707
356/368    0.005592             0.240456                      0                             0.240456
360/360    0.013456             0.578608                      0                             0.578608
360/364    0.02436              1.04748                       3                             3.63952949
360/368    0.002784             0.119712                      0                             0.119712
364/364    0.011025             0.474075                      0                             0.474075
364/368    0.00252              0.10836                       0                             0.10836
368/368    0.000144             0.006192                      0                             0.006192

χ²                   12.63460511
Degrees of Freedom   27
P-Value              0.9914

Discussion: A comparison of the alleles at each locus showed that at the D15S657 locus, the 348 allele occurred most frequently, appearing 24 times within the sample for a frequency of 0.279; at the D15S652 locus, the 303 allele occurred most frequently, appearing 19 times within the sample for a frequency of 0.221; and at the D15S655 locus, the 257 allele occurred most frequently, appearing 26 times within the sample for a frequency of 0.302. For the D15S657 locus, 7 alleles were present and, of the 28 possible genotypes, 17 were observed. At the D15S652 locus, 11 alleles were present and, of the 66 possible genotypes, 28 were observed. Finally, at the D15S655 locus, 7 alleles were present and, of the 28 possible genotypes, 16 were observed. Many students shared the same genotype at a single locus, but only 2 pairs of students (TA13/TA8 & F7/TP13) shared genotypes at 2 loci, in this case the D15S657 locus and the D15S655 locus. No students shared the same genotype at all 3 loci.

The chi-squared analysis indicated that the genotypic frequencies at the D15S657 locus were in accordance with Hardy-Weinberg equilibrium, so the hypothesis that the class would not be in equilibrium was not supported. The chi-squared value of the completed test was 12.63460511 with a p-value of 0.9914 at 27 degrees of freedom. That is, if the class were in Hardy-Weinberg equilibrium, deviations between observed and expected genotype counts at least this large would arise by chance about 99.14% of the time. There was no significant difference between observed and expected, so the data are consistent with the sample being in Hardy-Weinberg equilibrium. The deviations that were observed can be explained by the finite sample size (population), random mutations or natural selection.

Using only these three microsatellite loci, it would be possible to correctly identify any student in the class based solely on an unknown DNA sample. This is possible since no two students share the same genotype at all three of the loci observed. If only the D15S657 and D15S655 loci were observed, a definitive identity could not always be reached because some students share the same genotype at both of these loci.
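As a check on the chi-squared value discussed above, the statistic can be recomputed from the observed and expected student counts in Table 2. The sketch below is an illustration rather than the original calculation: the name `chi_squared` is hypothetical, and the (observed, expected) pairs are transcribed from Table 2 in genotype order. No p-value is computed, since that would require a statistics library.

```python
# (observed, expected) numbers of students for the 28 D15S657 genotypes,
# transcribed from Table 2 in row order (344/344 first, 368/368 last).
table2 = [
    (0, 0.1453), (3, 1.3953), (1, 0.9884), (1, 1.162204),
    (0, 0.578608), (0, 0.52374), (0, 0.059856),
    (4, 3.347163), (4, 4.750812), (4, 5.590602),
    (2, 2.783304), (2, 2.51937), (1, 0.287928),
    (2, 1.685772), (5, 3.967524), (2, 1.975248),
    (1, 1.78794), (0, 0.204336),
    (2, 2.334427), (3, 2.324408), (3, 2.10399), (0, 0.240456),
    (0, 0.578608), (3, 1.04748), (0, 0.119712),
    (0, 0.474075), (0, 0.10836),
    (0, 0.006192),
]


def chi_squared(pairs):
    """Sum of (O - E)^2 / E over all genotype classes."""
    return sum((obs - exp) ** 2 / exp for obs, exp in pairs)


chi2 = chi_squared(table2)
total_observed = sum(obs for obs, _ in table2)

# Degrees of freedom as used in this report: genotype classes minus one.
df_report = len(table2) - 1

print(chi2, df_report, total_observed)
```

The recomputed statistic differs from the reported 12.63460511 only in the fourth decimal place, because the first three expected counts in Table 2 are rounded. The degrees of freedom follow the report (28 classes minus 1). Some treatments of Hardy-Weinberg tests also subtract one degree of freedom for each independently estimated allele frequency, which would give fewer than 27; that alternative is not applied here.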
Although observing only 3 loci is sufficient for this finite sample, a greater number of loci should be examined when searching for an individual in a larger population, because the chance that two individuals share the same genotypes at only 3 loci is greater.

References:
Carlini D. 2013. PCR Amplification of Three Microsatellite Loci on Chromosome 15. Lab Handout.
Carlini D. 2013. Analysis of Microsatellite Data. Lab Handout.
Lai Y, Sun F. 2003. The Relationship Between Microsatellite Slippage Mutation Rate and the Number of Repeat Units. Molecular Biology and Evolution 20(12): 2123-2131.