Final exam and key for 2005

NAME:
KEY
FINAL EXAMINATION
MOLECULAR GENETICS AND BIOTECHNOLOGY
BIOC6063
May 18, 2005
(1:00-3:00, Room 425C)
Notes: You have two hours to complete the exam. There are 10 questions,
so aim for spending no more than 10-12 minutes on each question. Do not feel
obligated to fill the space provided for answers.
Please return your completed course evaluation (unsigned).
1. A student gets an expression clone from a lab coworker, does expression in E. coli and
recovers an excellent yield of the protein. The coworker then provides a clone of the same gene
in the same expression vector that has been subjected to in vitro mutagenesis to change a
specified encoded residue. The mutation has been confirmed by DNA sequencing. By the same
expression protocol, the student is unable to recover any of the mutagenized protein. At the
suggestion of the supervising professor, a long series of unsuccessful experiments is conducted
changing expression conditions and methods of recovery of the sought after protein. Finally it is
discovered that the cause of the problem is that the insert is backwards in the second clone.
What simple initial characterization by the student would have prevented him from being
victimized by this blunder?
One should establish a restriction enzyme digestion pattern for each important DNA and
use it to verify the identity of subsequently produced samples. For a plasmid, the
enzymes used should be chosen so that the orientation of the insert is obvious from the
pattern. Most commonly, this exercise will save you from being victimized by
mislabeled tubes, miscommunications, and naturally arising deletions. This simple
preliminary characterization would have told you up front that the "mutant" clone was
compromised. As several students pointed out, a single sequence read across a
vector/insert junction would also conveniently confirm orientation. The restriction
pattern, however, detects a broader range of things that could have gone wrong in one
simple experiment.
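As a minimal sketch of why a well-chosen digest makes orientation obvious, the following Python computes the expected fragment sizes for a double digest that cuts once in the vector and once at an off-center site in the insert. The function name, the coordinate convention, and all sizes are illustrative assumptions, not values from the actual clone.

```python
def digest_fragments(vector_len, insert_len, vector_site, insert_site, reverse=False):
    """Fragment sizes (bp) from a double digest of a circular plasmid.

    Assumed coordinate convention (hypothetical, for illustration only):
    - the insert occupies positions 0..insert_len on the circle;
    - vector_site is the distance of the vector cut past the end of the insert;
    - insert_site is the distance of the insert cut from the insert's 5' end
      when the insert is in the forward orientation.
    """
    if not 0 < insert_site < insert_len:
        raise ValueError("insert_site must lie inside the insert")
    if not 0 <= vector_site < vector_len:
        raise ValueError("vector_site must lie inside the vector")

    total = vector_len + insert_len
    # Flipping the insert mirrors the position of its internal site.
    site_in_insert = insert_len - insert_site if reverse else insert_site
    vector_cut = insert_len + vector_site
    one_fragment = vector_cut - site_in_insert
    return sorted([one_fragment, total - one_fragment])


forward = digest_fragments(3000, 1000, vector_site=100, insert_site=200)
backward = digest_fragments(3000, 1000, vector_site=100, insert_site=200, reverse=True)
print("forward: ", forward)
print("backward:", backward)
```

With these made-up numbers the forward clone gives fragments of 900 and 3100 bp and the backwards clone gives 300 and 3700 bp, a difference that is easy to see on a gel. A site near the middle of the insert would give nearly the same pattern in both orientations, which is why the choice of enzymes matters.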
This question is modeled after an actual event. The lab mate did the mutagenesis in
some other plasmid, and then transferred the insert to the expression clone. He
actually thoroughly characterized the DNA after the mutagenesis, and correctly
characterized the orientation of clones produced during the transfer. Then someone
either mislabeled a tube or misread a hand-written label, and you get handed the wrong
material. The moral of the story is that no matter how wonderful the material is
according to someone else's notebook, you always have the obligation to verify the
nature of materials with which you do your experiments.
I'd also say that long before changing expression conditions, I'd redetermine the DNA
sequence through the N terminus, translation controls, and back through the promoter
to be sure all of these were unaltered.
As an aside, after establishing the foundation stock (either the DNA stock or glycerol
stock that is supposed to last the next 4 years), I'd get DNA from it and reconfirm the
mutation by sequencing. Absolutely the worst thing that can happen to you in this
experiment is to inadvertently get the wild type clone, and then spend years studying
this supposed "mutant" that really doesn't have any mutation in it.
2. When PCR primers fail to produce the expected product, a common (and often successful)
strategy is to try again with a lower annealing temperature on the theory that the predicted Tm of
the primers was a little bit too high. Are there any circumstances where it might help to try again
with a higher annealing temperature rather than a lower one? [Hint: That the predicted annealing
temperature is too low is not a good answer.]
Sometimes the lack of the expected product means that the PCR primers are tied up in
some nonproductive interaction that might be destabilized by use of a higher
temperature. The most common of these is production of a primer dimer. Other
nonproductive interactions include hairpin formation or self annealing between primers
in a non priming configuration. Usually if the problem is just low stringency of priming,
the correct product will appear accompanied by other spurious products. However, in
cases where the desired product amplifies poorly because of GC content or large size,
false priming can make small spurious products that use up the primers before the
desired product has a chance to appear. Of course, if you go above the point where the
primers can prime on the intended sites, then the reaction will still fail. But the
predicted Tm may be conservative, so it's worth trying both raising and lowering the
temperature to see what happens. Often people will raise and lower the Mg
concentration instead of the temperature because that experiment can be done all at
once.
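The kind of screening described here can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the Tm estimate uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is a crude approximation for short oligonucleotides, and the 3' overlap check is a naive exact-match search. The function names and the example primers are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def three_prime_overlap(primer_a, primer_b, min_len=4):
    """Length of the longest 3' tail of primer_a that can pair perfectly
    (antiparallel) somewhere on primer_b; 0 if shorter than min_len."""
    a, b = primer_a.upper(), primer_b.upper()
    best = 0
    for n in range(min_len, len(a) + 1):
        if revcomp(a[-n:]) in b:
            best = n
        else:
            break  # a longer tail cannot pair if this one does not
    return best


# Hypothetical primers that both carry an EcoRI site (GAATTC).
fwd = "ACGTTGCAGAATTC"
rev = "TTGAATTCGGCA"
print(wallace_tm(fwd), three_prime_overlap(fwd, rev))
```

A long 3' overlap between the two primers (or of a primer with itself) flags a likely primer dimer, which is the situation where a higher annealing temperature might help.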
Of course, the best practice is to try to exclude these kinds of problems in the primer
design phase. However, sometimes you are constrained by the nature of the sequence
to use a suboptimal primer.
These kinds of problems are most common when there is a large 5' extension on the
primers, because of the increased number of possible interactions these sequences can
get into. In these cases, raising the annealing temperature as high as it can go (to the
extension temperature) may produce your product even if this temperature produces
inefficient priming at the correct sites. This is because the inefficiency will only affect
the first round. At later rounds, the entire length of the primer will be priming.
3. You conduct in vitro mutagenesis to change a protein residue encoded by an expression clone.
Your supervisor asks you to provide some of the DNA to a summer student to try their hand at
DNA sequencing. The summer student uses thermal cycle sequencing, fluorescent dye
terminators, and the institutional sequencing facility includes his reactions in one of their
capillary sequencing runs. The student reports back that your mutagenesis procedure has
induced 4 single base deletions in the insert besides the intended mutation. Of course, you will
obtain sequence data yourself for this clone, rather than relying on the work product of a summer
student. However, in the spirit of assisting in the training of the summer student, are there any
questions you should ask about their result?
You should insist on seeing the chromatogram(s). With the chromatogram in sight, it is
relatively simple to diagnose a variety of artifacts that cause bases to disappear. Of
these, the most common are trying to read too far from the primer (loss of resolution),
and uneven peak spacing from secondary structure (compression).
4. For the expression of mammalian proteins in E. coli, a fusion construct is usually used to
enable affinity purification. The affinity domain can be either fused at the C terminus or the N
terminus. Is there any advantage of putting the affinity domain on one end versus the other?
Most circumstances favor N terminal fusions: When there are problems related to
failure to fold leading to proteolysis, fusion to N terminal globular fusion partners is
thought to be a more effective stabilizing force than fusion to C terminal fusion partners.
(However, be aware that "instability" is the standard rationalization of people who don't
actually know what's wrong with their construct.) If the intent is to cause secretion of the
protein, then the fusion partner with secretion signals must be on the N terminus. N
terminal fusions avoid putting alien sequences next to the translation signals, which can
inadvertently lead to mRNA secondary structure that inhibits translation. N terminal
fusion keeps infrequently used codons in the alien portion of the construct away from
the start of translation, where they are thought to have a greater inhibitory effect on
translation efficiency. There are some sequences that are thought to trigger turnover of
the protein when on the N terminus. N terminal fusions keep the alien sequence away
from that position. If you need to retain the N terminal methionine on your purified
protein, the N terminal fusion partner protects it from post translational removal. You
will have to position the cleavage site so that the N terminus uncovered after in vitro
cleavage is suitable for your purpose. Vectors for globular N terminal fusion partners
have the valuable property that you can validate most important aspects of the
expression system by expression from the vector without any alien insert.
C terminal fusions can work fine if you don't happen to have any of the above
circumstances. If you plan to do experiments with the fusion partner still attached, and
find out that the structure or function of the protein will not tolerate an N terminal fusion,
then a C terminal fusion might be the solution. You can imagine variations on this
structural incompatibility theme that a C terminal fusion might solve. For example, if the
N terminal structure tends to bury the protease cleavage site so that cleavage to
remove the fusion partner is difficult, you might try a C terminal fusion. Some fusion
partners have to be on the C terminus to properly function. For example, in some M13
phage display constructs the fusion partner needs its C terminus free to properly
assemble into the phage.
5. You are asked to do an initial bioinformatics investigation of a newly discovered human gene
that is very distantly related to another human gene your lab has studied for decades.
Specifically, you have been asked to identify a mouse gene that could be used as an animal
model of the human gene. You observe in one protein family database that the gene is part of a
named but uncharacterized family. The family lists yet another human gene and sometimes two
genes for other mammals including mice. A different protein family database also lists an
uncharacterized family containing the target gene, but not a second human gene and only one
mouse gene. What is the likely source of this discrepancy? How will this situation influence
your decision about choosing a mouse gene as the animal model for the human gene?
What is described is a set of (at least) 3 paralogous genes. They are 1) the gene your
lab has studied, 2) the gene you were asked to investigate, and 3) the other homologue
that showed up in database #1. Since paralogues can be expected to have some
degree of functional distinction, you will want to choose the orthologous mouse gene as
your prospective mouse model rather than the paralogous one. The lazy thing to do
would be to assume that curators for database #2 had subdivided the single family from
database #1 into 2 families, such that the single mouse gene that was grouped with
your target human gene was its orthologue. The worst-case scenario would be that
database #2 had only one mouse gene and one human gene in its family because it
was incomplete. Then you'd have a 50% chance of choosing a mouse gene with a
different function to model your human gene. Of course, you could just search
database #2 for the missing paralogues to confirm that there was in fact another family
that included them. Even given that result, family databases use fairly low power and
arbitrary methods to define families. To avoid getting drawn into a paralogous
comparison, you should reevaluate the relationships among these sequences yourself.
The most thorough thing to do would be to make a tree containing all of these genes,
and do a bootstrap analysis. Many students simply said to consider the mouse gene
that was "closest" to be the prospective mouse model. The algorithms that make trees
are designed to be more accurate about measuring what is closest by descent than just
looking at some similarity scores. The bootstrap analysis determines the statistical
confidence that the perceived greater closeness of one of the mouse genes cannot be
accounted for by the random nature of the divergence process.
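The resampling idea behind a bootstrap can be illustrated with a deliberately simplified Python sketch. It does not build a tree: it only resamples alignment columns with replacement and asks how often one mouse sequence has fewer mismatches to the human sequence than the other. A real analysis would run the bootstrap through tree-building software on all of the paralogues and orthologues. The function name and the toy sequences are assumptions made for illustration, and the sequences are assumed to be pre-aligned and of equal length.

```python
import random


def bootstrap_support(human, mouse_a, mouse_b, replicates=1000, seed=0):
    """Fraction of bootstrap replicates in which mouse_a is strictly closer
    to human than mouse_b is, measured by mismatch count over resampled
    alignment columns. Sequences must already be aligned."""
    if not (len(human) == len(mouse_a) == len(mouse_b)):
        raise ValueError("aligned sequences must have equal length")
    rng = random.Random(seed)
    columns = list(zip(human, mouse_a, mouse_b))
    wins = 0
    for _ in range(replicates):
        # Resample columns with replacement, same size as the alignment.
        sample = rng.choices(columns, k=len(columns))
        diff_a = sum(h != a for h, a, _ in sample)
        diff_b = sum(h != b for h, _, b in sample)
        if diff_a < diff_b:
            wins += 1
    return wins / replicates


# Toy aligned sequences (made up): mouse_a differs at one site, mouse_b at two.
support = bootstrap_support("MKTAYIAKQR", "MKTAYIAKQK", "MRTAYLAKQR")
print(f"mouse_a closer in {support:.0%} of replicates")
```

If the support is well below the usual thresholds, the apparent "closest" mouse gene may just reflect chance in a short or poorly conserved alignment, which is the point of doing the bootstrap rather than reading off a similarity score.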
Some students pointed out other issues that are worthy of mention. It is possible that
database curator #1 created two entries out of one gene because there is alternative
splicing. You should indeed track the protein sequence back to the nucleotide
sequence to know if these are really different genes or not. As another issue, the
protein databases actually organize families of domains, not of entire protein
sequences. You will want to track down the nature of any other domains within each of
these genes on the chance that the domain structure between paralogues is different.
6. Write a brief abstract (a few sentences) describing results from a fictitious research project.
Use the following terms in the proper context in the abstract:
recessive
complementation
phenotype
selection
allele
diploid
conditional lethal
To isolate yeast mutants affected in DNA replication, we used mutagenesis and a
selection scheme based on temperature-sensitive, i.e. conditional lethal, 3H-thymidine
incorporation. Mutagenesis was conducted with MATa and MATα haploid cells, and 50
different mutants were found by phenotype analysis of diploids to fall into 5
complementation groups. All were recessive mutations. We have assigned preliminary
genotype designations of rep1-rep5. One complementation group (rep2) was shown to
represent 20 different mutant alleles.
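The assignment of mutants to complementation groups in an abstract like this can be sketched in Python. The sketch assumes that every mutation is recessive and that failure to complement is transitive (no intragenic complementation), so mutants can simply be merged whenever a diploid made from two of them still shows the mutant phenotype. The mutant names and the function name are hypothetical.

```python
def complementation_groups(mutants, fails_to_complement):
    """Group mutants so that any two whose diploid remains mutant
    (i.e., they fail to complement) end up in the same group."""
    parent = {m: m for m in mutants}

    def find(m):
        # Follow parent links to the group representative, compressing paths.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in fails_to_complement:
        parent[find(a)] = find(b)

    groups = {}
    for m in mutants:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical results: diploids m1/m2 and m2/m3 are still mutant.
mutants = ["m1", "m2", "m3", "m4", "m5"]
mutant_diploids = [("m1", "m2"), ("m2", "m3")]
print(complementation_groups(mutants, mutant_diploids))
```

Here m1, m2, and m3 would be alleles of one gene, while m4 and m5 each define a separate complementation group.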
7. Briefly describe the genetic selections used in the following applications:
(a) the yeast two-hybrid system
Co-transformants containing two plasmids (one containing the activation-domain
fusion gene and the other containing the DNA-binding domain fusion gene) are selected
based on nutritional markers on each plasmid (say LEU and TRP). Selection for
interaction is based on activation of GAL4 promoters or UAS sequences connected to
reporter genes (usually HIS and LACZ).
(b) gene replacement or knockout in mammalian embryonic stem cells
Homology boxes on either side of the changed sequence allow for
homologous recombination. Positive selection (i.e., G418 resistance due to a
nearby NEO gene inside the homology boxes) combined with negative
selection (i.e., for resistance to ganciclovir due to loss of TK genes outside
the homology boxes) is used to identify stem cells containing the
replacement at the proper locus.
8. Consider the following situation. You mutagenize Drosophila larvae and then examine the
wings of adults that hatch out from the pupal case. You notice that a group of sensory bristles
that line the periphery of the wing are absent. Based upon what was discussed in class:
a. Define cell autonomous and non-autonomous phenotypes. Based upon these definitions,
state in which cells you would expect to find the mutations that give rise to both
autonomous and non-autonomous phenotypes.
Cell Autonomous: Phenotype resides with genotype. Cells carrying the mutations exhibit
the phenotype. Mutations would reside in bristle cells themselves.
Non-autonomous: Phenotype is independent of genotype. Cells carrying the mutation
are not the ones exhibiting the phenotype. Mutations would reside in cells other than the
bristle cells.
b. Why is it important to distinguish between such phenotypes?
As phenotypes don’t necessarily correlate with genotypes, it is important to determine
where the mutation resides. This will allow you to assess potential mechanisms of
actions for specific mutations and help in determining how genes function in an in vivo
setting.
c. Finally, suggest potential mechanisms whereby autonomous and non-autonomous
mutations might give rise to the observed phenotype.
Cell autonomous mechanism: Mutation resides in the genes involved in formation of
sensory bristles. Could range from master regulatory genes required for sensory cell
maintenance to genes involved in bristle structure (i.e. structural genes).
Non-autonomous mechanism: Mutation resides in cells neighboring the bristle.
Mutations could affect secretion of a signal required for bristle cell formation. They could
also affect lateral inhibition via competition mechanisms.
9. Briefly describe how DNA microarrays are used for expression profiling. What types of
follow-up methods are needed to confirm results?
DNA fragments (PCR products or oligonucleotides) representing a spectrum of
genes (the number may be limited or inclusive for an organism) are displayed on glass
slides or filters. The microarray is hybridized with cDNAs made from control RNA
samples and labeled with one type of fluorophore, and with cDNAs made from test RNA
samples and labeled with a different type of fluorophore. Levels of fluorescent intensity
are compared to determine relative levels of expression of given genes in the control
versus test RNAs. This method has many applications, including analysis of
tissue-specific expression, disease versus non-disease states, mutant versus wild-type
patterns of expression, etc.
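The intensity comparison for a two-color array is commonly summarized as a log2 ratio of test to control signal for each gene. The Python below is a minimal sketch of that calculation, assuming background-corrected intensities are already available. The gene names, the intensity values, and the pseudocount (added only to avoid dividing by zero) are illustrative assumptions, and real pipelines also normalize between the two dye channels.

```python
import math


def log2_ratios(control, test, pseudocount=1.0):
    """Per-gene log2(test / control) for genes present in both channels.

    control and test map gene name -> fluorescence intensity.
    """
    return {
        gene: math.log2((test[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in control
        if gene in test
    }


# Made-up intensities for two hypothetical genes.
control = {"geneA": 99.0, "geneB": 399.0}
test = {"geneA": 399.0, "geneB": 99.0}
for gene, ratio in log2_ratios(control, test).items():
    print(gene, round(ratio, 2))
```

A value of +2 means roughly fourfold higher signal in the test sample and -2 means roughly fourfold lower. Genes with large ratios are the candidates to carry forward to confirmation by an independent method.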
Results of microarray analyses should be verified for some of the gene products by
Northern blots (RNA level) or Western blots (protein level).
10. Researchers have found that knockout of the gene for a particular mouse protein results in
embryonic lethality. They are convinced that the protein is particularly important in the kidney
as part of the insulin secretory pathway. How could they apply Cre-lox methods to test this
hypothesis?
They would construct one mouse line (using stem cell positive/negative
selection) containing the gene for the Cre recombinase under control of a
kidney-specific promoter (ideally one that can be turned on late in development). They would
construct another mouse line containing the gene of interest flanked by loxP sites (i.e., a
floxed gene). By mating they would make the second mouse line homozygous, then
mate this homozygous line with the line carrying the Cre recombinase gene. This
should provide specific knockout of the gene in kidney cells later in development to
study effects on insulin secretion.