Genetics Challenge Set II --- Fall 2013

N.B. I encourage you to consult outside references as you work on these questions! Please provide reference information for each article, book, or webpage that you find helpful.

1. You are charged with sequencing the matK gene from Rosa rubus, a local species for which several sequences are already available on boldsystems.org.
   a. Locate relevant sequence data. Note: more than one sequence may be available. If so, use a unique identifier from the website to indicate which sequence you plan to investigate, e.g., "GBVJ2345-11".
   b. Choose a string of 20 nucleotides from anywhere within the recorded sequence data. Be sure to indicate the positions of these 20 nucleotides within the sequence of interest. For example, you might write, "I am focusing on nucleotides 70 through 89 of the recorded sequence data: ATCAGTCAGCTAGCGACAAT."
   c. Draw a gel that could have been used to ascertain the sequence of this 20-nucleotide region using Sanger sequencing with fluorescent nucleotides.
   d. In no more than one paragraph, describe, in general terms, a procedure that could have been used to produce this gel.

2. New methods for sequencing ancient DNA are increasing opportunities for biologists to de-extinct species. As we've discussed in class, there are strong arguments both for and against these efforts. Imagine that you are a member of a group of scientists charged with bringing back Mammut americanum, a long-extinct mastodon that once roamed North America.
   a. In no more than one paragraph, describe a strategy you might use to approach this challenge.
   b. List three genetic/molecular challenges that you might encounter in this work, and propose a strategy to address each of them.
   c. In no more than one sentence, provide an argument either for or against de-extinction of M. americanum.

3. In class, we discussed cases of gynandromorphy in a chicken (Gallus gallus domesticus) and in the northern cardinal (Cardinalis cardinalis).
Draw a diagram to show one mechanism whereby a gynandromorphic individual might arise in a species for which the female is the heterogametic sex.

4. In lab earlier this semester, we considered allele frequencies at a biallelic locus under various assumptions about the relative fitnesses of homozygous and heterozygous genotypes.
   a. Using the R code provided for that lab (and included, again, below), find a set of relative fitness values that would yield non-zero, unequal genotype frequencies for the two homozygous genotypes at a locus that illustrates simple dominance. As you will recall, R is a free program, and so can be downloaded onto your home computer (see http://www.r-project.org). The program is also available on the desktop Macs and PCs in Wilson 214.
   b. Please make a plot of the change in the frequency of the recessive allele over 1000 generations, print your plot, and include it with the rest of your challenge set. You are also welcome to sketch the plot by hand, if you prefer.
   c. In no more than one paragraph, describe the strategy you used to select this particular set of relative fitness values.
   d. Next, find a set of relative fitness values, with a heterozygote fitness greater than 0, that would result in fixation of the dominant allele.
   e. In no more than one paragraph, describe the strategy that you used to select this particular set of fitness values.

5. You are talking with a friend in introductory biology who is struggling with the concept of genetic drift. Draw (by hand or by computer, as you prefer) and annotate a diagram that could help you explain this concept.

###
#R CODE FOR MODELING INFINITE POPULATION SIZE AND HARDY-WEINBERG EQUILIBRIUM
#These lines give the frequencies of the A (given by p) and a (given by q) alleles at the start of the simulation. The default setting is for the "a" allele to be a new mutation; that is, it is present at very low frequency at the start of the simulation.
q=.0001
p=1-q
#These lines give the fitness values for the three genotypes at this locus.
wAA=1
wAa=1
waa=1
#This line creates the vector that stores the value of q (the frequency of the a allele) for each generation.
qlist<-c()
#This line stores the starting value of q. Note that qlisttenpercent is not used elsewhere in this script.
qlisttenpercent<-c(q)
#The loop uses the current allele frequencies to calculate the frequency of the a allele in the next generation. As written, it collects frequency data for 1000 generations.
for(i in 1:1000) {
  q=(p*(1-p)*wAa+(1-p)*(1-p)*waa)/(2*p*(1-p)*wAa+(1-p)*(1-p)*waa+p*p*wAA)
  qlist<-c(qlist,q)
  p<-(1-q)
}
plot(qlist, ylab="frequency of a", xlab="generation number")
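
If you plan to try several sets of fitness values, it may be convenient to wrap the same calculation in a function. The following is only a sketch, not part of the original lab code: the name simulate_q and its arguments (q0, generations) are hypothetical, and the defaults simply mirror the values used above.

```r
# Hypothetical helper (not part of the original lab code): applies the same
# recurrence as the script above for a chosen set of relative fitnesses.
simulate_q <- function(q0 = 0.0001, wAA = 1, wAa = 1, waa = 1,
                       generations = 1000) {
  q <- q0
  qlist <- numeric(generations)  # one slot per generation
  for (i in seq_len(generations)) {
    p <- 1 - q
    # Mean fitness of the population (the denominator in the script above).
    wbar <- p * p * wAA + 2 * p * q * wAa + q * q * waa
    # Frequency of the a allele after one generation of selection.
    q <- (p * q * wAa + q * q * waa) / wbar
    qlist[i] <- q
  }
  qlist
}

# With all three fitnesses equal (the defaults), q should stay at its
# starting value, apart from floating-point rounding.
neutral <- simulate_q()
plot(neutral, ylab = "frequency of a", xlab = "generation number")
```

Calling the function with different fitness arguments, for example simulate_q(wAA = ..., wAa = ..., waa = ...), returns a new vector of q values that can be plotted in the same way.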