BCB 444/544X Lab 8 RNA Secondary Structure Prediction

Due: 10/22/2007 by 5pm
Email to: terrible@iastate.edu
Objectives
1. Learn about the resources available for RNA secondary structure
prediction
2. Practice using RNA secondary structure prediction software
3. Be able to compare the results of RNA secondary structure
predictions
Introduction
Most models of molecular function, and most experimental observations,
make more sense if we know the structures of the molecules involved. For
RNA, it is often important to know if the bases we have determined to be
crucial for function are in a helical region, a loop, or a bulge. Having an
accurate secondary structure prediction for RNA can aid in designing and
interpreting experiments and developing functional models.
Exercises
Required questions are in red.
The first exercise is taken from Baxevanis and Ouellette’s
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins.
Part I. To demonstrate the utility of color annotation on the mfold server,
predict the secondary structure for the Drosophila sucinea R2 3’ UTR, as
shown here:
Figure 6.2 from Baxevanis and Ouellette
R2 elements are a class of retrotransposons that are found in most
arthropods (Eickbush, 2002). During retrotransposition, the 3’ UTR of the
messenger RNA is specifically recognized by the reverse transcriptase during
target-primed reverse transcription (Luan & Eickbush, 1995; Luan et al.,
1993). The secondary structure of the 3’ UTR was predicted for Drosophila
with comparative sequence analysis of 10 sequences (Mathews et al.,
1997). The sequence of the R2 element from D. sucinea, which can adopt the
comparative analysis structure, was later determined (Lathe & Eickbush,
1997). This sequence was chosen for this example because its secondary
structure is known and its prediction by free energy minimization is less
accurate than average, which makes the usefulness of color annotation easy
to see (Zuker & Jacobson, 1995; Zuker & Jacobson, 1998).
Here is the R2 3’ UTR sequence:
UGAUCUCUGUAUUUGUUUCUAUUUUGAACAUUUGCCUGCUACCUUGGCAUA
ACAUCAAUAAGGUACAAACAUCGCAAAAAGUCAUCAUAAGGUGGGUUUUAG
UACGUAGGCGCUGUAGAACUUAAUUGUUCUGAUAGAGCAGCGAGUCGUGCA
UGCUAGUCUAGCAUUUCUUGCUACCUAGUAUCUUUAGAAGAUUUCCCUCCCU
UAGCGGUCAAA
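Because the sequence above is wrapped across several lines, it can help to join and check it before pasting it into a server. The following is a minimal sketch, not part of the original exercise; the names R2_LINES and clean_rna are hypothetical and chosen only for illustration.

```python
# The R2 3' UTR sequence exactly as printed above, one string per line.
R2_LINES = [
    "UGAUCUCUGUAUUUGUUUCUAUUUUGAACAUUUGCCUGCUACCUUGGCAUA",
    "ACAUCAAUAAGGUACAAACAUCGCAAAAAGUCAUCAUAAGGUGGGUUUUAG",
    "UACGUAGGCGCUGUAGAACUUAAUUGUUCUGAUAGAGCAGCGAGUCGUGCA",
    "UGCUAGUCUAGCAUUUCUUGCUACCUAGUAUCUUUAGAAGAUUUCCCUCCCU",
    "UAGCGGUCAAA",
]


def clean_rna(lines):
    """Join lines, drop whitespace, upper-case, and convert any T to U."""
    seq = "".join("".join(line.split()) for line in lines)
    seq = seq.upper().replace("T", "U")
    # Reject anything that is not one of the four RNA bases.
    bad = sorted(set(seq) - set("ACGU"))
    if bad:
        raise ValueError(f"unexpected characters in sequence: {bad}")
    return seq


r2 = clean_rna(R2_LINES)
print(len(r2), "nt")
print(r2)
```

The single-line string printed at the end can be pasted directly into the input field of a folding server.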
Access the mfold Web server and paste the D. sucinea R2 element sequence
into the large input-sequence field on the server Web site. Scroll to
the bottom of the Web page, to the section marked “Choose structure
annotation.” Select the button after “p-num” to choose a color annotation
that reflects how well determined the base pairs are. Keep the default settings
for all other fields. Note, however, that there are links to a help page with
an explanation of each user-definable setting.
Click the “Fold RNA” button at the bottom of the form. This sequence is
short enough that the default immediate job can be performed, so the Web
browser will move quickly to the results page. The results remain available
on the server for 24 hours. Note that the energy dot plot can be viewed by
following a hyperlink at the top of the page. Furthermore, a zip or tar file
can be downloaded that contains all the predicted structures. On the
results page, view the first individual structure by clicking jpg under
Structure 1.
1. In the color-coding scheme, which color means that the base pair has the
highest probability? Which color corresponds to the lowest probability?
Go to the RNAfold server and paste the D. sucinea R2 element sequence into the
input box. Scroll to the bottom and click on Fold it to generate the
prediction.
2. Are there similarities between the structures predicted by mfold and
RNAfold?
3. How do the predicted structures compare to the structure shown
above?
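One way to make the comparisons in questions 2 and 3 more concrete is to count the base pairs that two structures share. The sketch below assumes that both predictions have been written in dot-bracket notation (the format RNAfold reports) over the same sequence; the function names are hypothetical, and the two structures are toy examples, not actual server output.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) base pairs (0-based, i < j) in a dot-bracket string."""
    stack = []
    pairs = set()
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.add((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def compare_structures(a, b):
    """Count base pairs shared by two structures and pairs unique to each."""
    if len(a) != len(b):
        raise ValueError("structures must describe the same sequence length")
    pairs_a = pairs_from_dot_bracket(a)
    pairs_b = pairs_from_dot_bracket(b)
    return {
        "shared": len(pairs_a & pairs_b),
        "only_a": len(pairs_a - pairs_b),
        "only_b": len(pairs_b - pairs_a),
    }


# Toy structures for a 12-nt sequence (illustration only).
toy_a = "((((...))))."
toy_b = "(((.....)))."
print(compare_structures(toy_a, toy_b))
```

Counting exact shared pairs is a strict measure; two structures can look similar by eye while differing by a one-base shift in a helix, so the counts are best read alongside the drawings.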
References cited in this section:
Eickbush, TH (2002). In Mobile DNA II (Craig, NL, Craigie, R, Gellert, M,
and Lambowitz, AM eds).
Lathe, WC and Eickbush, TH (1997). Mol. Biol. Evol. 14, 1232-1241.
Luan, DD and Eickbush, TH (1995). Mol. Cell. Biol. 15, 3882-3891.
Luan, DD et al. (1993). Cell 72, 595-605.
Mathews, DH et al. (1997). RNA 3, 1-16.
Zuker, M and Jacobson, AB (1995). Nucl. Acids Res. 23, 2791-2798.
Zuker, M and Jacobson, AB (1998). RNA 4, 669-679.
Part II. Go through the exercise at:
http://cnx.rice.edu/content/m11065/latest/
4. There are 12 questions in the exercise. Submit answers to all 12.
Part III. For a more real-world problem, use mfold to predict the secondary
structures of the sequences here. These sequences are for an important
regulatory element in the lentiviruses HIV and EIAV called the Rev response
element, or RRE. The sequences have the same function in the two viruses,
and we hypothesize that they may have similar structures.
5. Are there any similarities between the HIV and EIAV RREs?
To help determine if the sequences share a common structure, it may help to
identify regions of high similarity and predict the structure of just those
regions.
Go to ClustalW and enter the two sequences. Use the program with default
parameters to identify any regions of similarity. Save the alignment in a file
for use later.
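Regions of high similarity can also be located numerically once the alignment is saved. The sketch below assumes the two aligned rows have already been copied out of the alignment file as equal-length strings with "-" for gaps; window_identity and the 4-column window are hypothetical choices for illustration, and the two rows are made-up toy data, not the RRE sequences.

```python
def window_identity(aln_a, aln_b, width=20):
    """Slide a window along a pairwise alignment and report fractional identity.

    Returns a list of (start_column, identity) tuples, where identity is the
    fraction of columns in the window holding the same base in both rows.
    Columns containing a gap never count as a match.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned rows must have the same length")
    if width <= 0 or width > len(aln_a):
        raise ValueError("window width must be between 1 and the alignment length")
    scores = []
    for start in range(len(aln_a) - width + 1):
        window = zip(aln_a[start:start + width], aln_b[start:start + width])
        matches = sum(1 for x, y in window if x == y and x != "-")
        scores.append((start, matches / width))
    return scores


# Toy aligned rows (illustration only).
row_a = "ACGU-ACGUACGU"
row_b = "ACGUUACG-ACGA"
for start, identity in window_identity(row_a, row_b, width=4):
    print(start, round(identity, 2))
```

Windows with the highest identity are candidates for the sub-sequences to fold separately; the threshold for "high" is a judgment call and depends on the window width.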
Use mfold to predict the secondary structures of the regions of the
sequences that ClustalW aligns.
6. Are there any similarities between the HIV and EIAV RRE structures
from mfold?
Go to RNAalifold and submit the aligned HIV and EIAV RRE sequences.
Save the PostScript drawing of the predicted structure.
7. How does the structure predicted by RNAalifold compare to the mfold
structures?