Blackett Family DNA Activity STR

What is a Short Tandem Repeat Polymorphism (STR)?

STR Polymorphisms

Most of our DNA is identical to the DNA of others. However, there are inherited regions of our DNA that can vary from person to person. Variations in DNA sequence between individuals are termed "polymorphisms". As we will discover in this activity, sequences with the highest degree of polymorphism are very useful for DNA analysis in forensics cases and paternity testing. This activity is based on analyzing the inheritance of a class of DNA polymorphisms known as "Short Tandem Repeats", or simply STRs.

STRs are short sequences of DNA, normally 2-5 base pairs long, that are repeated numerous times in a head-to-tail manner. For example, the 16 bp sequence "gatagatagatagata" represents 4 head-to-tail copies of the tetramer "gata". The polymorphisms in STRs are due to the different numbers of copies of the repeat element that can occur in a population of individuals.

In this activity, you will learn the concepts and techniques behind DNA profiling of the 13 core CODIS "Short Tandem Repeat" loci used for the national DNA databank. CODIS = Combined DNA Index System: http://www.fbi.gov/hq/lab/html/codis1.htm

What are the 13 core CODIS loci?

A National DNA Databank

The Federal Bureau of Investigation (FBI) of the US has been a leader in developing DNA typing technology for use in the identification of perpetrators of violent crime. In 1997, the FBI announced the selection of 13 STR loci to constitute the core of the United States national database, CODIS. All CODIS STRs are tetrameric repeat sequences. All forensic laboratories that use the CODIS system can contribute to a national database. DNA analysts can also attempt to match the DNA profile of crime scene evidence to DNA profiles already in the database.
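To make the head-to-tail repeat idea concrete, here is a minimal Python sketch that counts how many back-to-back copies of a repeat unit occur in a sequence. The function name longest_tandem_run is a hypothetical helper chosen for illustration, and the second test sequence is made up; only the "gatagatagatagata" example comes from the text above. Real STR typing is done with laboratory kits, not with string matching.

```python
import re


def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`.

    Hypothetical helper for illustration only.
    """
    # (?:unit)+ matches one or more consecutive copies of the repeat unit.
    pattern = re.compile(f"(?:{re.escape(unit.lower())})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.lower())]
    return max(runs, default=0)


# The example from the text: 16 bp made of 4 copies of the tetramer "gata".
print(longest_tandem_run("gatagatagatagata", "gata"))  # 4

# A made-up allele with 5 repeats and short flanking sequence on each side.
print(longest_tandem_run("cc" + "gata" * 5 + "tt", "gata"))  # 5
```

Two individuals whose sequences give different counts at the same locus carry different alleles at that locus, which is exactly what an STR genotype such as "15, 18" records.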
There are many advantages to the CODIS STR system:

- The CODIS system has been widely adopted by forensic DNA analysts.
- STR alleles can be rapidly determined using commercially available kits.
- STR alleles are discrete, and behave according to known principles of population genetics.
- The data are digital, and therefore ideally suited for computer databases.
- Laboratories worldwide are contributing to the analysis of STR allele frequency in different human populations.
- STR profiles can be determined with very small amounts of DNA.

Genetics of STR Inheritance

Since there are no phenotypes associated with the CODIS STR loci, understanding the genetics of STR inheritance is simplified compared to other genetic problems. We need only consider the genotypes of the parents and their offspring. The alleles of different STR loci are inherited like any other Mendelian genetic markers. Diploid parents each pass on one of their two alleles to their offspring.

Here is a brief review of the genetic concepts and terms important for understanding STR allele inheritance. For an in-depth tutorial, visit our Monohybrid Cross problem set.

Allele. The different forms of a gene. Different STR repeat lengths represent different alleles at a genetic locus, e.g. 8 and 9 are different alleles of the THO1 locus.

Locus. The position on a specific chromosome where the different alleles of a genetic marker are located. The plural is loci.

Monohybrid Cross. Genetic cross involving parents differing in only one trait. Inheritance of each of the 13 STR loci can be treated as a separate Monohybrid Cross.

Genotype. The genetic composition of the alleles at a locus. Since we are diploid, we each have two alleles at each locus.

Homozygous. Both alleles at a locus are the same, e.g. Fred has a genotype of 29, 29 at the D21S11 locus.

Heterozygous. Alleles at a locus are not the same, e.g. Norma has a genotype of 29, 31 at the D21S11 locus.

Multiple Allelic Series.
Many different alleles at a locus, e.g. the known alleles at the vWA locus are 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, and 21.

Punnett Square. A diagram used to determine all possible genotypes that can occur in a genetic cross. All of the diagrams on this page are Punnett Squares.

A DNA Profile: The 13 CODIS STR loci

As part of his training and proficiency testing for DNA Profile analysis of STR (Short Tandem Repeat) Polymorphisms, Forensic Scientist and DNA Analyst Bob Blackett created a DNA profile of his own DNA. Here is Bob's DNA Profile for the 13 core Genetic Loci of the United States national database, CODIS (Combined DNA Index System):

Locus      D3S1358  vWA      FGA      D8S1179  D21S11   D18S51   D5S818
Genotype   15, 18   16, 16   19, 24   12, 13   29, 31   12, 13   11, 13
Frequency  8.2%     4.4%     1.7%     9.9%     2.3%     4.3%     13%

Locus      D13S317  D7S820   D16S539  THO1     TPOX     CSF1PO   AMEL
Genotype   11, 11   10, 10   11, 11   9, 9.3   8, 8     11, 11   XY
Frequency  1.2%     6.3%     9.5%     9.6%     3.52%    7.2%     (Male)

For each genetic locus, Bob has determined his "genotype", and the expected frequency of his genotype at each locus in a representative population sample. For example, at the genetic locus known as "D3S1358", Bob has the genotype of "15, 18". This genotype is shared by about 8.2% of the population. By combining the frequency information for all 13 CODIS loci, Bob can calculate that the frequency of his profile would be 1 in 7.7 quadrillion Caucasians (1 in 7.7 times 10 to the 15th power)!

In Bob's forensic DNA work, he often compares the DNA profile of biological evidence from a crime scene with a known reference sample from a victim or suspect. If any two samples have matching genotypes at all 13 CODIS loci, it is a virtual certainty that the two DNA samples came from the same individual (or an identical twin).

D7S820

D7S820 is one of the 13 core CODIS STR genetic loci. This DNA is found on human chromosome 7. The DNA sequence of a representative allele of this locus is shown below.
This sequence comes from GenBank, a public DNA database. The tetrameric repeat sequence of D7S820 is "gata". Different alleles of this locus have from 6 to 15 tandem repeats of the "gata" sequence. How many tetrameric repeats are present in the DNA sequence shown below? Notice that one of the tetrameric sequences is "gaca", rather than "gata".

    1 aatttttgta ttttttttag agacggggtt tcaccatgtt ggtcaggctg actatggagt
   61 tattttaagg ttaatatata taaagggtat gatagaacac ttgtcatagt ttagaacgaa
  121 ctaacgatag atagatagat agatagatag atagatagat agatagatag atagacagat
  181 tgatagtttt tttttatctc actaaatagt ctatagtaaa catttaatta ccaatatttg
  241 gtgcaattct gtcaatgagg ataaatgtgg aatcgttata attcttaaga atatatattc
  301 cctctgagtt tttgatacct cagattttaa ggcc

Answer the following questions using http://www.fbi.gov/hq/lab/html/codis1.htm

1. What is CODIS?
2. What is the National DNA Index System (NDIS) and how many states participate in NDIS?
3. Which state has the largest number of offender profiles and how many does that state have?
4. How many profiles does Arizona have?
5. How many labs in Arizona submit data to CODIS – NDIS?
6. What % of Americans are listed in CODIS? Assume an American population of 300,000,000.
7. What is the proposed future of CODIS?
8. What is the general procedure for expunging a DNA profile?

Here are some examples of how STR data can be interpreted in a family DNA study. The numbers outside the Punnett Squares are the parental alleles that can be present in the egg or sperm of the parents. The numbers inside the squares are the genotypes possible for the resulting children.

Case 1

If the genotypes of both parents are known, we use a Punnett Square to predict the possible genotypes of their offspring. Each child inherits one allele of a given locus from each parent.

Panel (a): Genotypes for Bob and his wife at the D21S11 locus: Bob Blackett (31, 29) and his wife (28, 30).

1a. Draw a Punnett Square to illustrate the possible genotypes of any children of Bob and Anne.

1b.
Bob’s mother, Norma, has the genotype 29, 31 at the D21S11 locus. How could you show that Fred is a possible candidate to be Bob’s father? How could you rule out Fred as the father?

Case 2

If the genotypes of a mother and several children are known, it is often possible to unambiguously predict the genotype of the father. In this case, Karen is the mother with a genotype of 9, 9.3 at the THO1 locus. Her three daughters are (8, 9), (9, 9.3) and (9.3, 9.3).

2a. What is dad’s genotype?

2b. If a fourth daughter were to have a genotype of (9.3, 7), what would we know about her?

Case 3

Consider the case of Bob Blackett's 4 first cousins, Marilyn, Buddy, Dick and Janet. Bob did not have DNA samples for their parents, Bud and Louise, who are both deceased. The known genotypes are Buddy (12, 18), Marilyn (12, 13), Dick (17, 18) and Janet (17, 18).

3a. What must be the genotypes of Bob’s Aunt and Uncle?

3b. Can you determine which genotype belongs to his Aunt and which belongs to his Uncle? Explain your answer.

Create a data table to catalogue the genetic profile of the persons listed in the DNA analysis on page 5. Use your data to answer the following questions:

Who are the parents of David and Katie?

Do all of the data you have collected on the genotypes of Bob, Anne, Katie, and David support the conclusion that Bob and Anne are the biological parents of David and Katie? You should justify your answer by reference to the specific genotypes for the STR loci.

What is the genetic legacy of Fred and Norma?

The alleles that Bob passes on to his children have in turn been inherited from Bob's parents, Fred and Norma. Identify the alleles among the 13 CODIS STR loci in the genotypes of Katie and David that have been unambiguously inherited from each of their paternal grandparents. Now identify any additional alleles that might have been inherited from their paternal grandparents.
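The Punnett Square reasoning used in the cases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (punnett_square, is_possible_child, candidate_fathers) are hypothetical, and the genotypes in the example are made up for an unnamed locus so that the worksheet questions are left for you to work out. It also assumes simple Mendelian inheritance with no mutation.

```python
from itertools import combinations_with_replacement, product


def punnett_square(mother, father):
    """All child genotypes from one cross, each as a sorted (allele, allele) tuple."""
    return {tuple(sorted(pair)) for pair in product(mother, father)}


def is_possible_child(child, mother, father):
    """True if the child's genotype appears somewhere in the Punnett Square."""
    return tuple(sorted(child)) in punnett_square(mother, father)


def candidate_fathers(mother, children, known_alleles):
    """Father genotypes (drawn from known_alleles) that could produce every child."""
    return [
        father
        for father in combinations_with_replacement(sorted(known_alleles), 2)
        if all(is_possible_child(child, mother, father) for child in children)
    ]


# Hypothetical genotypes at a made-up locus (not taken from the Blackett data).
mother = (12, 14)
father = (13, 15)

print(sorted(punnett_square(mother, father)))
# [(12, 13), (12, 15), (13, 14), (14, 15)]

print(is_possible_child((14, 13), mother, father))  # True
print(is_possible_child((12, 12), mother, father))  # False: the father has no 12 allele

# Working backwards from a mother and two children to the father.
print(candidate_fathers(mother, [(12, 13), (14, 15)], [12, 13, 14, 15]))
# [(13, 15)]
```

Genotypes are stored as sorted tuples so that (14, 13) and (13, 14) count as the same genotype, just as they do in a Punnett Square. Fractional allele names such as 9.3 can be entered as ordinary numbers.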
Genetic Diversity and Sexual Reproduction

Human geneticists are often asked why children have not inherited a particular trait from their parents. As a human geneticist, you know that one mechanism to ensure genetic diversity is the independent assortment of alleles of different loci during gamete (egg and sperm) production, i.e. Mendel's Second Law of Genetics. To illustrate this important genetic principle, calculate how many genotypes would be possible among the children of Bob and Anne for the combined DNA profile from the D3S1358, vWA, and FGA loci. If you feel really ambitious, now calculate the possible genotypes of the children of Bob and Anne for all 13 CODIS STR loci.

4. How many genotypes are possible in a population for a three-locus DNA Profile?

If there are two alleles, A and B, at a genetic locus in a population, there are three possible genotypes, namely AA, BB, and AB. If there are three alleles, A or B or C, there are six possible genotypes, namely AA, BB, CC, AB, AC, and BC. For N different alleles, the total number of possible genotypes is given by the following expression:

    N(N + 1) / 2

If we assume that the allele reference ladders from our data collection exercise represent all possible alleles (a conservative estimate), how many genotypes are possible in a population for the combined STR loci of D3S1358, vWA, and FGA?

Extra Credit

How many genotypes are possible in a population for the combined CODIS 13 STR loci? If you feel really ambitious, you may wish to calculate the number of possible genotypes considering all 13 CODIS STR markers. The table below shows the number of alleles for each locus. Beware, the number will be very large.

Locus     Alleles
D3S1358   8
vWA       11
FGA       14
D8S1179   12
D21S11    22
D18S51    21
D5S818    10
D13S317   8
D7S820    10
D16S539   9
THO1      7
TPOX      8
CSF1PO    10
AMEL      XY
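The genotype counts quoted above (3 genotypes for 2 alleles, 6 for 3 alleles) fit the closed form N(N + 1)/2: N homozygous genotypes plus N(N - 1)/2 heterozygous pairs. The short Python sketch below checks that formula by listing the genotypes directly, and shows how counts from independent loci multiply. The function names are hypothetical, and the two-locus example uses made-up allele counts rather than the CODIS table, so the worksheet questions still need your own calculation.

```python
from itertools import combinations_with_replacement
from math import prod


def genotypes_at_locus(n_alleles: int) -> int:
    """Number of distinct genotypes at one locus: N(N + 1) / 2."""
    return n_alleles * (n_alleles + 1) // 2


def enumerate_genotypes(alleles):
    """List every unordered pair of alleles, including homozygous pairs."""
    return list(combinations_with_replacement(alleles, 2))


def genotypes_for_profile(allele_counts):
    """Loci assort independently, so the per-locus counts are multiplied."""
    return prod(genotypes_at_locus(n) for n in allele_counts)


# The three-allele example from the text: AA, AB, AC, BB, BC, CC.
print(["".join(g) for g in enumerate_genotypes("ABC")])
# ['AA', 'AB', 'AC', 'BB', 'BC', 'CC']
print(genotypes_at_locus(2), genotypes_at_locus(3))  # 3 6

# Made-up two-locus profile with 2 alleles at one locus and 3 at the other.
print(genotypes_for_profile([2, 3]))  # 18
```

To answer the worksheet questions, pass the allele counts for the loci of interest to genotypes_for_profile. Python integers do not overflow, so the 13-locus case is safe to compute even though the result is very large.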