BCB 444/544 More RNA Structure BCB 544 Projects Lecture 25

BCB 444/544
Lecture 25
 More RNA Structure
 BCB 544 Projects
#25_Oct19
BCB 444/544 F07 ISU Dobbs #25 - More RNA Structure & BCB 544 Projects
10/19/07
1
Required Reading
(before lecture)
Mon Oct 15 - Lecture 23
Protein Tertiary Structure Prediction
• Chp 15 - pp 214 - 230
Wed Oct 17 & Thurs Oct 18 - Lecture 24 & Lab 8
(Terribilini)
RNA Structure/Function & RNA Structure Prediction
• Chp 16 - pp 231 - 242
Fri Oct 19 - Lecture 25
(& Mon Oct 22)
Gene Prediction
• Chp 8 - pp 97 - 112
Homework Assignment
ALL: HomeWork #4 (emailed & posted online Sat AM)
Due: Mon Oct 22 by 5 PM (not Fri Oct 19)
Read:
Ginalski et al.(2005) Practical Lessons from Protein Structure
Prediction, Nucleic Acids Res. 33:1874-91.
http://nar.oxfordjournals.org/cgi/content/full/33/6/1874
(PDF posted on website)
• Although somewhat dated, this paper provides a nice overview of
protein structure prediction methods and evaluation of predicted
structures.
• Your assignment is to write a summary of this paper - for details
see HW#4 posted online & sent by email on Sat Oct 13
BCB 544 Only:
New Homework Assignment
544 Extra#2 (posted online Thurs?)
Due: Fri Nov 2 by 5 PM
HW#2 is next step in Team Projects
Will end lecture a few minutes early today - to allow time to meet
& discuss 544 Teams & Projects
Seminars this Week
BCB List of URLs for Seminars related to Bioinformatics:
http://www.bcb.iastate.edu/seminars/index.html
• Oct 18 Thur - BBMB Seminar 4:10 in 1414 MBB
• Sachdev Sidhu (Genentech) Phage peptide and antibody
libraries in protein engineering and ligand selection
• Was great talk!
• Oct 19 Fri - BCB Faculty Seminar 2:10 in 102 ScI
• Lyric Bartholomay (Ent, ISU) Computational Biology and
vector-borne disease: from the field to the bench
Chp 16 - RNA Structure Prediction
SECTION V
STRUCTURAL BIOINFORMATICS
Xiong: Chp 16 RNA Structure Prediction (Terribilini)
• RNA Function
• Types of RNA Structures
• RNA Secondary Structure Prediction Methods
• Ab Initio Approach
• Comparative Approach
• Performance Evaluation
RNA Function
• Storage/transfer of genetic information
• Newly discovered regulatory functions
• miRNA & siRNA pathways, especially
• Catalytic
RNA types & functions
Types of RNAs: Primary Function(s)
• mRNA - messenger: translation (protein synthesis); regulatory
• rRNA - ribosomal: translation (protein synthesis) <catalytic>
• tRNA - transfer: translation (protein synthesis)
• hnRNA - heterogeneous nuclear: precursors & intermediates of mature mRNAs & other RNAs
• scRNA - small cytoplasmic: signal recognition particle (SRP); tRNA processing <catalytic>
• snRNA - small nuclear: mRNA processing, polyA addition <catalytic>
• snoRNA - small nucleolar: rRNA processing/maturation/methylation
• regulatory RNAs (siRNA, miRNA, etc.): regulation of transcription and translation, other??
RNA Structures
• RNA forms complex 3D structures
• Mainly "single-stranded" - but:
• Single RNA strands can self-hybridize to form base-paired regions
Levels of RNA Structure
Like proteins, RNA has primary, secondary, and tertiary
structure (& quaternary structure, too)
1. Primary structure = Ribonucleotide sequence
2. Secondary structure = Helix vs turn (base-paired vs single-stranded)
Note: in RNA, helices often involve long-range interactions
3. Tertiary structure = 3D structure (also due to long-range interactions)
4. Quaternary structure = complex of 2 or more RNA strands
Rob Knight, Univ Colorado
Common structural motifs in RNA
• Helices
• Loops
  • Hairpin
  • Interior
  • Bulge
  • Multibranch
• Pseudoknots
• Tetraloops
Fig 6.2, Baxevanis & Ouellette 2005
Covalent & non-covalent bonds in RNA
Primary: Covalent bonds
Secondary/Tertiary: Non-covalent bonds
• H-bonds (base-pairing)
• Base stacking
Fig 6.2, Baxevanis & Ouellette 2005
RNA Structure Prediction
• RNA tertiary structure is very difficult to predict
• Focus on predicting RNA secondary structure:
• Given an RNA sequence, predict its secondary
structure
• Almost all methods ignore higher order secondary
structures such as pseudoknots & tetraloops
• Specialized software is available for predicting these
RNA Pseudoknots & Tetraloops
• Often have important regulatory or catalytic functions
Pseudoknot
http://www.lbl.gov/Science-Articles/ResearchReview/Annual-Reports/1995/images/rna.gif
Tetraloop
http://academic.brooklyn.cuny.edu/chem/zhuang/QD/mckay_hr.gif
Base Pairing in RNA
G-C, A-U, G-U ("wobble") & many variants
See: IMB Image Library of Biological Molecules
http://www.fli-leibniz.de/ImgLibDoc/nana/IMAGE_NANA.html#basepairs
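The pairing rule on this slide can be written down directly. A minimal sketch, assuming only the three pair types named above (G-C, A-U, and the G-U wobble) and ignoring the "many variants"; the names `PAIRS` and `can_pair` are made up for illustration:

```python
# Pairs named on the slide: Watson-Crick G-C and A-U, plus the G-U "wobble".
# Non-canonical variants are deliberately left out of this sketch.
PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "U"), ("U", "A"),
    ("G", "U"), ("U", "G"),
}


def can_pair(x, y):
    """Return True if RNA bases x and y form a G-C, A-U, or G-U pair."""
    return (x.upper(), y.upper()) in PAIRS


print(can_pair("G", "U"))  # wobble pair
print(can_pair("A", "G"))  # not one of the three pair types
```

The later sketches reuse the same pair set when they need to decide whether two positions are allowed to pair.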
Experimental RNA structure determination?
• X-ray crystallography
• NMR spectroscopy
• Enzymatic/chemical mapping
RNA Secondary Structure Prediction
Methods
Two (three, recently) main types of methods:
1. Ab initio - based on calculating most energetically
favorable secondary structure(s)
Energy minimization (thermodynamics)
2. Comparative approach - based on comparisons of
multiple evolutionarily-related RNA sequences
Sequence comparison (co-variation)
3. Combined computational & experimental
Use experimental constraints when available
RNA Secondary structure prediction - 1
1) Energy minimization (thermodynamics)
• Algorithms: Dynamic programming to find high probability pairs (also, some genetic algorithms)
• Software:
  Mfold - Zuker
  RNAfold (Vienna Package) - Hofacker
  RNAstructure - Mathews
  Sfold - Ding & Lawrence
R Knight 2005
RNA Secondary structure prediction - 2
2) Comparative sequence analysis (co-variation)
• Algorithms: Mutual information; Context-free grammars
• Software: RNAalifold, Foldalign, Dynalign
RNA Secondary structure prediction - 3
3) Combined experimental & computational
• Experiments: Map single-stranded vs double-stranded regions in folded RNA
• How?
  Enzymes: S1 nuclease, T1 RNase
  Chemicals: kethoxal, DMS, OH
• Software: Mfold, Sfold, RNAstructure, RNAfold, RNAalifold
[Figure: RNA secondary structure map (positions 200-240) marked with kethoxal and DMS modification sites, mild vs strong]
1 - Ab Initio Prediction
• Requires only a single RNA sequence
• Calculates minimum free energy structure
• Base-paired regions have lower free energy, so
methods "attempt to find secondary structure with
maximal base pairing" (Careful!)
• IMPORTANT: Largest contribution to energy comes from
nearest-neighbor (base-stacking) interactions,
not base-pairing!
Ab Initio Prediction: Clarifications
• Free energy is calculated based on parameters
determined in the wet lab
• Correction: Use known energy associated with
each type of nearest-neighbor pair (base-stacking)
(not base-pair)
• Base-pair formation is not independent: multiple
base-pairs adjacent to each other are more
favorable than individual base-pairs - cooperative because of base-stacking interactions
• Bulges and loops adjacent to base-pairs have a free
energy penalty
Ab Initio Prediction:
What are the assumptions?
• Native tertiary structure or "fold" of an RNA
molecule is (one of) its "lowest" free energy
configuration(s)
Gibbs free energy = ΔG in kcal/mol at 37°C
= equilibrium stability of structure
lower values (negative) are more favorable
Is this assumption valid?
in vivo? - this may not hold, but we don't really know
Energy minimization:
What are the rules?
Stack AA/UU (base pairs A=U, A=U): ΔG = -1.2 kcal/mole
Stack AU/UA (base pairs A=U, U=A): ΔG = -1.6 kcal/mole
What gives here?
C Staben 2005
Energy minimization calculations:
Base-stacking is critical
Stacking energies (kcal/mole), written as top strand / bottom strand:
AA / UU: -1.2
AU / UA (or UA / AU): -1.6
AG, AC, CA, GA / UC, UG, GU, CU: -2.1
CC / GG: -4.8
CG / GC: -3.0
GC / CG: -4.3
GU / UG: -0.3
XG, GX / YU, UY: 0
- Tinoco et al.
C Staben 2005
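To see how these stacking values add up for a short helix, here is a minimal sketch. It uses only the Watson-Crick entries listed on the slide. The key convention (top-strand dinucleotide, bottom-strand dinucleotide) is an assumption about how the slide is laid out, as are the `UA`/`AU` entry and the helper names `stack_energy` and `helix_energy`.

```python
# Stacking energies in kcal/mol, copied from the slide (Tinoco et al.).
# Key = (top-strand dinucleotide, bottom-strand dinucleotide).
# ASSUMPTION: the strand orientation convention is not stated on the slide.
STACK = {
    ("AA", "UU"): -1.2,
    ("AU", "UA"): -1.6,
    ("UA", "AU"): -1.6,  # ASSUMPTION: read from the slide's "AU or UA"
    ("AG", "UC"): -2.1, ("AC", "UG"): -2.1,
    ("CA", "GU"): -2.1, ("GA", "CU"): -2.1,
    ("CC", "GG"): -4.8,
    ("CG", "GC"): -3.0,
    ("GC", "CG"): -4.3,
}


def stack_energy(top, bottom):
    """Energy of one stack of two adjacent base pairs."""
    if (top, bottom) in STACK:
        return STACK[(top, bottom)]
    # The same stack read from the other strand (e.g. UU/AA is AA/UU).
    # Raises KeyError for stacks that are not tabulated above.
    return STACK[(bottom[::-1], top[::-1])]


def helix_energy(top, bottom):
    """Sum stacking energies over each pair of adjacent base pairs."""
    return sum(
        stack_energy(top[i:i + 2], bottom[i:i + 2])
        for i in range(len(top) - 1)
    )


# Same two A-U pairs, different neighbors, different energy:
print(stack_energy("AA", "UU"), stack_energy("AU", "UA"))

# A 4-bp helix has 3 stacks: GC/CG + CA/GU + AA/UU
print(round(helix_energy("GCAA", "CGUU"), 1))
```

Note that a helix of n base pairs contributes n - 1 stacking terms, which is why a single isolated base pair gets no stacking energy at all in this scheme.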
Ab initio RNA Structure Prediction:
Uses Nearest-neighbor parameters
• Most methods for ab initio prediction (free energy
minimization) use nearest-neighbor energy parameters
(derived from experiment) for predicting stability of an
RNA secondary structure (in terms of ΔG at 37°C)
& most available software packages use same set of
parameters
- Mathews, Sabina, Zuker
Ab Initio Energy Calculation
• Search for all possible base-pairing
patterns
• Calculate total energy of each
structure based on all stabilizing and
destabilizing forces
Total free energy for a specific
RNA conformation = Sum of
incremental energy terms for:
• helical stacking
(sequence dependent)
• loop initiation
• unpaired stacking
(favorable "increments" are < 0)
Fig 6.3, Baxevanis & Ouellette 2005
Dot Matrices
• Can be used to find all
possible base pair patterns
• Compare input sequence to
itself and put a dot where
there is a complementary
base
R Knight 2005
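A self-comparison dot matrix is easy to build by hand. A minimal sketch, assuming that "complementary" means G-C, A-U, or G-U; the function names `dot_matrix` and `render` are hypothetical:

```python
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "U"), ("U", "A"),
    ("G", "U"), ("U", "G"),
}


def dot_matrix(seq):
    """matrix[i][j] is True where base i could pair with base j."""
    seq = seq.upper()
    return [[(a, b) in PAIRS for b in seq] for a in seq]


def render(seq):
    """Draw the matrix as text: '*' for a possible pair, '.' otherwise."""
    rows = ["  " + " ".join(seq)]
    for base, row in zip(seq, dot_matrix(seq)):
        cells = " ".join("*" if hit else "." for hit in row)
        rows.append(base + " " + cells)
    return "\n".join(rows)


print(render("GGGAAAUCC"))
```

In the printed matrix, a run of dots along an anti-diagonal marks a stretch of consecutive possible pairs, that is, a candidate helix.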
Dynamic Programming
• Finding optimal secondary structure is difficult - lots of possibilities
• Compare RNA sequence with itself
• Apply scoring scheme based on energy parameters
for base stacking, cooperativity, and penalties for
destabilizing forces
• Find path that represents most energetically
favorable secondary structure
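To make the dynamic programming idea concrete, here is a minimal sketch of a simpler stand-in: a Nussinov-style recursion that maximizes the number of base pairs. It is not the energy-based scoring described above and is not what Mfold or RNAfold implement. The minimum hairpin loop size of 3 and all names (`fold`, `to_dot_bracket`) are assumptions for illustration.

```python
PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "U"), ("U", "A"),
    ("G", "U"), ("U", "G"),
}


def fold(seq, min_loop=3):
    """Return one maximum-pair nested structure as a sorted list of (i, j)."""
    n = len(seq)
    # dp[i][j] = max number of pairs in seq[i..j]; short spans stay 0.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Trace back through the table to recover one optimal set of pairs.
    pairs, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    pairs.append((k, j))
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return sorted(pairs)


def to_dot_bracket(seq, pairs):
    """Render pairs in dot-bracket notation."""
    out = ["."] * len(seq)
    for i, j in pairs:
        out[i], out[j] = "(", ")"
    return "".join(out)


seq = "GGGAAAUCC"
print(to_dot_bracket(seq, fold(seq)))
```

Energy-based methods keep the same table-filling structure but replace the "+1 per pair" score with stacking and loop energy terms, and minimize instead of maximize.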
Problem with DP Approach
• DP returns SINGLE lowest energy structure
• There may be many structures with similar energies
• Also, predicted secondary structure is only as good as
energy parameters used
• Solution: return multiple structures with near optimal
energies
Popular Ab Initio Prediction Programs
• Mfold
• Combines DP with thermodynamic calculations
• Fairly accurate for short sequences, less accurate as
sequence length increases
• RNAfold
• Returns multiple structures near predicted optimal
structure
• Computes larger number of potential secondary structures
than Mfold, so uses a simplified energy function
2 - Comparative Prediction Approaches
• Use multiple sequence alignment
• Assume related sequences fold into same secondary
structure
Co-variation patterns in MSAs are critical
• RNA functional motifs are conserved
• To maintain RNA structure during evolution, a mutation
in a base-paired residue must be compensated for by a
mutation in residue with which it pairs
• Comparative methods search for co-variation patterns
in MSAs
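One way to search for co-variation is to score pairs of alignment columns by mutual information, the measure listed earlier under comparative algorithms. A minimal sketch follows; the four-sequence alignment is a made-up toy example, not real data, and the function name is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Toy alignment (hypothetical): columns 0 and 4 change together so that
# they can always pair; columns 1-3 are fully conserved.
alignment = [
    "GAAAC",
    "CAAAG",
    "AAAAU",
    "UAAAA",
]
columns = ["".join(col) for col in zip(*alignment)]

print(mutual_information(columns[0], columns[4]))  # co-varying columns
print(mutual_information(columns[0], columns[1]))  # conserved column
```

A fully conserved column scores zero against everything, so mutual information picks up compensating mutations but says nothing about pairs that never vary.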
Consensus Structures
• Predict secondary structure of each individual
sequence in an MSA
• Compare all structures and try to identify a
consensus structure
Popular Comparative Prediction Programs
Two main types:
1. Require user to provide MSA
• RNAalifold
2. No MSA required
• Foldalign
• Dynalign
RNAalifold
• Requires user to provide MSA
• Creates a scoring matrix combining minimum free
energy and co-variation information
• DP used to identify minimum free energy structure
Foldalign
• User provides pair of unaligned RNA sequences
• Constructs alignment & computes conserved structure
• Suitable only for relatively short sequences
Dynalign
• User provides two unaligned input sequences
• Calculates possible secondary structures using
algorithm similar to Mfold
• Compares multiple structures from both sequences
to find a common structure
3 - Popular Programs that use Combined
Computational & Experimental Approaches
• Mfold
• Sfold
• RNAstructure
• RNAfold
• RNAalifold
Comparison of Predictions for Single RNA
using Different Methods
[Figure: four predicted structures, each with stem-loops SL X, SL Y, and SL Z labeled]
Sfold: -51.14 kcal/mol
Mfold: -54.84 kcal/mol
RNAstructure: -71.3 kcal/mol
RNAfold: -80.16 kcal/mol
JH Lee 2007
Comparison of Mfold Predictions:
-/+ Constraints
Mfold: -126.05 kcal/mol
Mfold plus constraints: -54.84 kcal/mol
JH Lee 2007
Performance Evaluation
• Ab initio methods? correlation coefficient = 20-60%
• Comparative approaches? correlation coefficient = 20-80%
• Programs that require user to supply MSA are more accurate
• Comparative programs are consistently more accurate than ab initio
• Base-pairs predicted by comparative sequence analysis for large &
small subunit rRNAs are 97% accurate when compared with high
resolution crystal structures!
- Gutell, Pace
• BEST APPROACH? Methods that combine computational prediction
(ab initio & comparative) with experimental constraints (from
chemical/enzymatic modification studies)
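The slide does not say how the correlation coefficient is computed. One common choice, assumed here, is the Matthews correlation coefficient over base pairs, where every possible pair (i, j) counts as either predicted or not, and either present in the reference structure or not. A minimal sketch with hypothetical names and toy pair lists:

```python
from math import sqrt


def pair_mcc(predicted, reference, length):
    """Matthews correlation coefficient between two sets of base pairs.

    ASSUMPTION: the "negatives" are all length*(length-1)/2 possible
    pairs (i < j) that appear in neither set.
    """
    pred, ref = set(predicted), set(reference)
    tp = len(pred & ref)   # predicted and in the reference
    fp = len(pred - ref)   # predicted but not in the reference
    fn = len(ref - pred)   # in the reference but missed
    tn = length * (length - 1) // 2 - tp - fp - fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example: a 9-nt hairpin with 3 reference pairs.
reference = [(0, 8), (1, 7), (2, 6)]
prediction = [(0, 8), (1, 7), (3, 5)]  # two pairs right, one wrong

print(round(pair_mcc(reference, reference, 9), 3))
print(round(pair_mcc(prediction, reference, 9), 3))
```

A perfect prediction scores 1, and the score drops as predicted pairs are missed or wrongly added, so multiplying by 100 gives a percentage in the style quoted on the slide.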
BCB 544 "Team" Projects
• 544 Extra HW#2 is next step in Team Projects
  • Write ~ 1 page outline
  • Schedule meeting with Michael & Drena to discuss topic
  • Read a few papers
  • Write a more detailed plan
• You may work alone if you prefer
• Last week of classes will be devoted to Projects
• Written reports due: Mon Dec 3 (no class that day)
• Oral presentations (15-20') will be: Wed-Fri Dec 5,6,7
• 1 or 2 teams will present during each class period
See Guidelines for Projects posted online