#25 - More RNA Structure & BCB 544 Projects (BCB 444/544, 10/19/07)

BCB 444/544
Lecture 25
 More RNA Structure
 BCB 544 Projects
#25_Oct19
BCB 444/544 F07 ISU Dobbs #25 - More RNA Structure & BCB 544 Projects

Required Reading (before lecture)
Mon Oct 15 - Lecture 23
Protein Tertiary Structure Prediction
• Chp 15 - pp 214 - 230
Wed Oct 17 & Thurs Oct 18 - Lecture 24 & Lab 8 (Terribilini)
RNA Structure/Function & RNA Structure Prediction
• Chp 16 - pp 231 - 242
Fri Oct 19 - Lecture 25 (& Mon Oct 22)
Gene Prediction
• Chp 8 - pp 97 - 112
New Homework Assignment
ALL: HomeWork #4 (emailed & posted online Sat AM)
Due: Mon Oct 22 by 5 PM (not Fri Oct 19)
Read:
Ginalski et al. (2005) Practical Lessons from Protein Structure Prediction, Nucleic Acids Res. 33:1874-91.
http://nar.oxfordjournals.org/cgi/content/full/33/6/1874
(PDF posted on website)
• Although somewhat dated, this paper provides a nice overview of protein structure prediction methods and evaluation of predicted structures.
• Your assignment is to write a summary of this paper - for details see HW#4 posted online & sent by email on Sat Oct 13

BCB 544 Only: Homework Assignment
544 Extra#2 (posted online Thurs?)
Due: Fri Nov 2 by 5 PM
HW#2 is next step in Team Projects
Will end lecture a few minutes early today - to allow time to meet & discuss 544 Teams & Projects
Seminars this Week
BCB List of URLs for Seminars related to Bioinformatics:
http://www.bcb.iastate.edu/seminars/index.html
• Oct 18 Thur - BBMB Seminar 4:10 in 1414 MBB
• Sachdeve Sidhu (Genentech) Phage peptide and antibody libraries in protein engineering and ligand selection
• Was great talk!
• Oct 19 Fri - BCB Faculty Seminar 2:10 in 102 ScI
• Lyric Bartholomay (Ent, ISU) Computational Biology and vector-borne disease: from the field to the bench

Another local example: Combining Structure Prediction, Machine Learning & "Real" (wet-lab) Experiments to Investigate the Lentiviral Rev Protein: A Step Toward New HIV Therapies
Susan Carpenter (Washington State Univ)
Wendy Sparks
Yvonne Wannemuehler
Drena Dobbs, GDCB
Jae-Hyung Lee
Michael Terribilini
Kai-Ming Ho, Physics
Yungok Ihm
Haibo Cao
Cai-zhuang Wang
Gloria Culver, BBMB
Laura Dutca
Chp 16 - RNA Structure Prediction
SECTION V
STRUCTURAL BIOINFORMATICS
Xiong: Chp 16 RNA Structure Prediction (Terribilini)
• RNA Function
• Types of RNA Structures
• RNA Secondary Structure Prediction Methods
• Ab Initio Approach
• Comparative Approach
• Performance Evaluation

RNA Function
• Storage/transfer of genetic information
• Newly discovered regulatory functions
• miRNA & siRNA pathways, especially
• Catalytic
RNA types & functions
Types of RNAs                          Primary Function(s)
mRNA - messenger                       translation (protein synthesis); regulatory
rRNA - ribosomal                       translation (protein synthesis) <catalytic>
tRNA - transfer                        translation (protein synthesis)
hnRNA - heterogeneous nuclear          precursors & intermediates of mature mRNAs & other RNAs
scRNA - small cytoplasmic              signal recognition particle (SRP); tRNA processing <catalytic>
snRNA - small nuclear                  mRNA processing, polyA addition <catalytic>
snoRNA - small nucleolar               rRNA processing/maturation/methylation
regulatory RNAs (siRNA, miRNA, etc.)   regulation of transcription and translation, other??

RNA Structures
• RNA forms complex 3D structures
• Mainly "single-stranded" - but:
• Single RNA strands can self-hybridize to form base-paired regions

Levels of RNA Structure
Like proteins, RNA has primary, secondary, and tertiary structure (& quaternary structure, too)
1. Primary structure = Ribonucleotide sequence
2. Secondary structure = Helix vs turn (base-paired vs single-stranded)
Note: in RNA, helices often involve long-range interactions
3. Tertiary structure = 3D structure (also due to long-range interactions)
4. Quaternary structure = complex of 2 or more RNA strands
Rob Knight
Univ Colorado

Common structural motifs in RNA
• Helices
• Loops
• Hairpin
• Interior
• Bulge
• Multibranch
• Pseudoknots
• Tetraloops
Fig 6.2
Baxevanis & Ouellette 2005
Covalent & non-covalent bonds in RNA
Primary:
Covalent bonds
Secondary/Tertiary:
Non-covalent bonds
• H-bonds (base-pairing)
• Base stacking
Fig 6.2
Baxevanis & Ouellette 2005

Base Pairing in RNA
G-C, A-U, G-U ("wobble") & many variants
See: IMB Image Library of Biological Molecules
http://www.fli-leibniz.de/ImgLibDoc/nana/IMAGE_NANA.html#basepairs

RNA Pseudoknots & Tetraloops
• Often have important regulatory or catalytic functions
Pseudoknot
Tetraloop
http://www.lbl.gov/Science-Articles/ResearchReview/Annual-Reports/1995/images/rna.gif
http://academic.brooklyn.cuny.edu/chem/zhuang/QD/mckay_hr.gif

RNA Structure Prediction
• RNA tertiary structure is very difficult to predict
• Focus on predicting RNA secondary structure:
• Given an RNA sequence, predict its secondary structure
• Almost all methods ignore higher order secondary structures such as pseudoknots & tetraloops
• Specialized software is available for predicting these
Experimental RNA structure determination?
• X-ray crystallography
• NMR spectroscopy
• Enzymatic/chemical mapping
RNA Secondary Structure Prediction Methods
Two (three, recently) main types of methods:
1. Ab initio - based on calculating most energetically
favorable secondary structure(s)
Energy minimization (thermodynamics)
2. Comparative approach - based on comparisons of
multiple evolutionarily-related RNA sequences
Sequence comparison (co-variation)
3. Combined computational & experimental
Use experimental constraints when available
RNA Secondary structure prediction - 1
1) Energy minimization (thermodynamics)
• Algorithms:
Dynamic programming to find high probability pairs
(also, some Genetic algorithms)
• Software:
Mfold - Zuker
RNAfold (Vienna Package) - Hofacker
RNAstructure - Mathews
Sfold - Ding & Lawrence
R Knight 2005

RNA Secondary structure prediction - 2
2) Comparative sequence analysis (co-variation)
• Algorithms:
Mutual information
Context-free grammars
• Software:
RNAalifold
Foldalign
Dynalign

RNA Secondary structure prediction - 3
3) Combined experimental & computational
• Experiments:
Map single-stranded vs double-stranded regions in folded RNA
• How?
Enzymes: S1 nuclease, T1 RNase
Chemicals: kethoxal, DMS, OH•
• Software:
Mfold
Sfold
RNAStructure
RNAFold
RNAalifold
[Figure: modification map of a folded RNA, positions 200-240; legend: Kethoxal modification (mild, strong), DMS modification (mild, strong)]

1 - Ab Initio Prediction
• Requires only a single RNA sequence
• Calculates minimum free energy structure
• Base-paired regions have lower free energy, so methods "attempt to find secondary structure with maximal base pairing" (Careful!)

Ab Initio Prediction: Clarifications
• Free energy is calculated based on parameters determined in the wet lab
• Correction: Use known energy associated with each type of nearest-neighbor pair (base-stacking) (not base-pair)
• Base-pair formation is not independent: multiple base-pairs adjacent to each other are more favorable than individual base-pairs - cooperative - because of base-stacking interactions
• Bulges and loops adjacent to base-pairs have a free energy penalty
• IMPORTANT: Largest contribution to energy comes from nearest-neighbor (base-stacking) interactions, not base-pairing!

Ab Initio Prediction: What are the assumptions?
• Native tertiary structure or "fold" of an RNA molecule is (one of) its "lowest" free energy configuration(s)
Gibbs free energy = ΔG in kcal/mol at 37°C
= equilibrium stability of structure
lower values (negative) are more favorable
Is this assumption valid?
in vivo? - this may not hold, but we don't really know
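To make "lower (more negative) ΔG is more favorable" concrete, the standard thermodynamic relation ΔG = -RT ln K converts a free energy into an equilibrium constant. The sketch below is illustrative only: the function name `equilibrium_constant` is hypothetical, the ΔG values are example inputs, and it assumes ΔG is a standard free energy change in kcal/mol at 37°C.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
T_37C = 310.15         # 37 degrees C expressed in kelvin


def equilibrium_constant(delta_g, temperature=T_37C):
    """Return K = exp(-dG / (R*T)) for a free energy change in kcal/mol."""
    return math.exp(-delta_g / (R_KCAL * temperature))


if __name__ == "__main__":
    # Example values only: a more negative dG gives a larger K,
    # i.e. the folded state is favored more strongly.
    for dg in (0.0, -1.2, -1.6):
        print(f"dG = {dg:5.1f} kcal/mol -> K = {equilibrium_constant(dg):.2f}")
```

Because K depends exponentially on ΔG, small differences in predicted energy correspond to large differences in the predicted population of a structure.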
Energy minimization:
What are the rules?
What gives here?

A A
U U    Basepairs: A=U, A=U    ΔG = -1.2 kcal/mole

A U
U A    Basepairs: A=U, U=A    ΔG = -1.6 kcal/mole

Energy minimization calculations:
Base-stacking is critical
Stacked pairs (top strand / bottom strand)       ΔG (kcal/mole)
AA / UU                                          -1.2
AU / UA or UA / AU                               -1.6
AG / UC, AC / UG, CA / GU, GA / CU               -2.1
CG / GC                                          -3.0
GC / CG                                          -4.3
CC / GG                                          -4.8
GU / UG                                          -0.3
XG / YU, GX / UY                                 0
- Tinoco et al.
C Staben 2005
Ab initio RNA Structure Prediction:
Uses Nearest-neighbor parameters
• Most methods for ab initio prediction (free energy minimization) use nearest-neighbor energy parameters (derived from experiment) for predicting stability of an RNA secondary structure (in terms of ΔG at 37°C)
& most available software packages use same set of parameters
- Mathews, Sabina, Zuker
C Staben 2005

Ab Initio Energy Calculation
• Search for all possible base-pairing patterns
• Calculate total energy of each structure based on all stabilizing and destabilizing forces
Total free energy for a specific RNA conformation = Sum of incremental energy terms for:
• helical stacking (sequence dependent)
• loop initiation
• unpaired stacking
(favorable "increments" are < 0)
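The "sum of incremental energy terms" idea can be sketched for the helical-stacking term alone. The example below uses the stacking values from the lecture's Tinoco et al. table; it ignores loop initiation and unpaired stacking, and it assumes the two strands are antiparallel, so a stack read from the opposite strand (rotated 180 degrees) is the same stack. The names `STACK`, `stack_energy`, and `helix_energy` are hypothetical, and the "XG / YU" row is left out because X and Y are not defined on the slide.

```python
# Stacking energies in kcal/mol, keyed by
# (top-strand dinucleotide, bottom-strand dinucleotide), both read left to right.
STACK = {
    ("AA", "UU"): -1.2,
    ("AU", "UA"): -1.6, ("UA", "AU"): -1.6,
    ("AG", "UC"): -2.1, ("AC", "UG"): -2.1,
    ("CA", "GU"): -2.1, ("GA", "CU"): -2.1,
    ("CG", "GC"): -3.0,
    ("GC", "CG"): -4.3,
    ("CC", "GG"): -4.8,
    ("GU", "UG"): -0.3,
}


def stack_energy(top, bottom):
    """Energy of one stack of two adjacent base pairs."""
    if (top, bottom) in STACK:
        return STACK[(top, bottom)]
    # Assumption: antiparallel strands, so the same stack can be read
    # from the other strand by swapping and reversing the dinucleotides.
    rotated = (bottom[::-1], top[::-1])
    if rotated in STACK:
        return STACK[rotated]
    raise KeyError(f"no stacking energy listed for {top}/{bottom}")


def helix_energy(top, bottom):
    """Sum the stacking energies along an uninterrupted helix."""
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    return sum(stack_energy(top[i:i + 2], bottom[i:i + 2])
               for i in range(len(top) - 1))


if __name__ == "__main__":
    # Same two A=U base pairs, different stacking, different energy.
    print(helix_energy("AA", "UU"))
    print(helix_energy("AU", "UA"))
    # A four-pair helix: three stacks are summed.
    print(round(helix_energy("GGAC", "CCUG"), 1))
```

The first two calls reproduce the "What gives here?" comparison: the base pairs are identical, and only the stacking arrangement changes the energy.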
Dot Matrices
• Can be used to find all possible base pair patterns
• Compare input sequence to itself and put a dot where there is a complementary base
R Knight 2005
Fig 6.3
Baxevanis & Ouellette 2005

Dynamic Programming
• Finding optimal secondary structure is difficult - lots of possibilities
• Compare RNA sequence with itself
• Apply scoring scheme based on energy parameters for base stacking, cooperativity, and penalties for destabilizing forces
• Find path that represents most energetically favorable secondary structure
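The dot-matrix self-comparison can be sketched in a few lines. This is a minimal illustration, not the method of any particular program: the names `PAIRS`, `dot_matrix`, and `render` are hypothetical, and the set of allowed pairs (G-C, A-U, and the G-U wobble) is an assumption taken from the base-pairing slide.

```python
# Allowed base pairs: Watson-Crick plus the G-U wobble pair.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def dot_matrix(seq):
    """Compare a sequence with itself; True marks a complementary pair."""
    seq = seq.upper()
    n = len(seq)
    return [[(seq[i], seq[j]) in PAIRS for j in range(n)] for i in range(n)]


def render(seq):
    """Return a text picture with '*' wherever two positions could pair."""
    matrix = dot_matrix(seq)
    lines = ["  " + " ".join(seq.upper())]
    for base, row in zip(seq.upper(), matrix):
        lines.append(base + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render("GGGAAAUCCC"))
```

Runs of dots along an anti-diagonal correspond to stretches that could form a helix, which is what a later scoring step would evaluate.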
Problem with DP Approach
• DP returns SINGLE lowest energy structure
• There may be many structures with similar energies
• Also, predicted secondary structure is only as good as energy parameters used
• Solution: return multiple structures with near optimal energies

Popular Ab Initio Prediction Programs
• Mfold
• Combines DP with thermodynamic calculations
• Fairly accurate for short sequences, less accurate as sequence length increases
• RNAfold
• Returns multiple structures near predicted optimal structure
• Computes larger number of potential secondary structures than Mfold, so uses a simplified energy function
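To show the shape of the dynamic programming recursion, here is a deliberately simplified, Nussinov-style sketch. It maximizes the number of nested base pairs rather than minimizing free energy, so it is a stand-in for illustration and not what Mfold or RNAfold compute; those use nearest-neighbor energies. The function names and the `min_loop` value of 3 unpaired bases per hairpin are assumptions.

```python
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def fill(seq, min_loop=3):
    """dp[i][j] = maximum number of nested pairs within seq[i..j]."""
    n = len(seq)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                      # case: j is unpaired
            for k in range(i, j - min_loop):         # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp


def traceback(seq, dp, min_loop=3):
    """Recover one optimal set of pairs from the filled table."""
    pairs, stack = [], [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:                        # too short to hold a pair
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    pairs.append((k, j))
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return sorted(pairs)


def dot_bracket(length, pairs):
    """Write pairs as a dot-bracket string."""
    chars = ["."] * length
    for i, j in pairs:
        chars[i], chars[j] = "(", ")"
    return "".join(chars)


if __name__ == "__main__":
    rna = "GGGAAAUCCC"
    table = fill(rna)
    found = traceback(rna, table)
    print(rna)
    print(dot_bracket(len(rna), found))
```

The traceback returns a single optimum, which is exactly the limitation noted above: other structures with the same or nearly the same score are not reported unless suboptimal solutions are explored as well.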
2 - Comparative Prediction Approaches
• Use multiple sequence alignment
• Assume related sequences fold into same secondary structure

Co-variation patterns in MSAs are critical
• RNA functional motifs are conserved
• To maintain RNA structure during evolution, a mutation in a base-paired residue must be compensated for by a mutation in residue with which it pairs
• Comparative methods search for co-variation patterns in MSAs
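One common way to score co-variation between two alignment columns is mutual information. The sketch below is a minimal version under stated assumptions: the alignment is a list of equal-length strings, gap characters are treated as ordinary symbols, no small-sample correction is applied, and the name `mutual_information` and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment."""
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)
    col_j = Counter(seq[j] for seq in msa)
    joint = Counter((seq[i], seq[j]) for seq in msa)
    total = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        total += p_ab * log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return total


if __name__ == "__main__":
    # Toy alignment: columns 0 and 3 change together so that they can
    # always pair, while columns 1 and 2 never change.
    alignment = ["GAAC", "CAAG", "AAAU", "UAAA"]
    print(mutual_information(alignment, 0, 3))   # co-varying columns
    print(mutual_information(alignment, 1, 2))   # invariant columns
```

Columns that mutate in a compensating way score high, while fully conserved columns score zero even if they are paired, which is one reason comparative programs combine co-variation with other evidence.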
Consensus Structures
• Predict secondary structure of each individual sequence in a MSA
• Compare all structures and try to identify a consensus structure

Popular Comparative Prediction Programs
Two main types:
1. Require user to provide MSA
• RNAalifold
2. No MSA required
• Foldalign
• Dynalign
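The consensus-structure idea (predict a structure per sequence, then keep what they share) can be sketched as a simple vote over base pairs. This is an assumption-laden illustration, not the algorithm of RNAalifold, Foldalign, or Dynalign: it assumes each structure is a dot-bracket string already expressed in alignment column coordinates, and the names `pairs_from_dot_bracket`, `consensus_pairs`, and `min_fraction` are hypothetical.

```python
from collections import Counter


def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' in structure")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def consensus_pairs(structures, min_fraction=0.5):
    """Keep pairs that occur in at least min_fraction of the structures."""
    counts = Counter()
    for structure in structures:
        counts.update(pairs_from_dot_bracket(structure))
    needed = min_fraction * len(structures)
    return sorted(pair for pair, seen in counts.items() if seen >= needed)


if __name__ == "__main__":
    predicted = ["((..))", "((..))", "(....)"]
    print(consensus_pairs(predicted))
    print(consensus_pairs(predicted, min_fraction=0.9))
```

A simple vote like this does not resolve conflicts where one position is paired with different partners in different structures; a fuller method would need a rule for that.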
RNAalifold
• Requires user to provide MSA
• Creates a scoring matrix combining minimum free energy and co-variation information
• DP used to identify minimum free energy structure

Foldalign
• User provides pair of unaligned RNA sequences
• Constructs alignment & computes conserved structure
• Suitable only for relatively short sequences
Dynalign
• User provides two unaligned input sequences
• Calculates possible secondary structures using algorithm similar to Mfold
• Compares multiple structures from both sequences to find a common structure

3 - Popular Programs that use Combined Computational & Experimental Approaches
• Mfold
• Sfold
• RNAStructure
• RNAFold
• RNAalifold

Comparison of Predictions for Single RNA using Different Methods
[Figure: four predicted structures, each with stem-loops labeled SL X, SL Y, SL Z]
Sfold: -51.14 kcal/mol
Mfold: -54.84 kcal/mol
RNAstructure: -71.3 kcal/mol
RNAfold: -80.16 kcal/mol
JH Lee 2007

Comparison of Mfold Predictions: -/+ Constraints
[Figure: two predicted structures, each with stem-loops labeled SL X, SL Y, SL Z]
Mfold: -126.05 kcal/mol
Mfold plus constraints: -54.84 kcal/mol
JH Lee 2007
Performance Evaluation
• Ab initio methods? correlation coefficient = 20-60%
• Comparative approaches? correlation coefficient = 20-80%
• Programs that require user to supply MSA are more accurate
• Comparative programs are consistently more accurate than ab initio
• Base-pairs predicted by comparative sequence analysis for large & small subunit rRNAs are 97% accurate when compared with high resolution crystal structures!
- Gutell, Pace
• BEST APPROACH? Methods that combine computational prediction (ab initio & comparative) with experimental constraints (from chemical/enzymatic modification studies)

BCB 544 "Team" Projects
• 544 Extra HW#2 is next step in Team Projects
• Write ~ 1 page outline
• Schedule meeting with Michael & Drena to discuss topic
• Read a few papers
• Write a more detailed plan
• You may work alone if you prefer
• Last week of classes will be devoted to Projects
• Written reports due: Mon Dec 3 (no class that day)
• Oral presentations (15-20') will be: Wed-Fri Dec 5,6,7
• 1 or 2 teams will present during each class period
See Guidelines for Projects posted online