BCB 444/544 F07 ISU Dobbs
#25 - More RNA Structure & BCB 544 Projects (#25_Oct19)
10/19/07

Lecture 25
• More RNA Structure
• BCB 544 Projects

Required Reading (before lecture)
Mon Oct 15 - Lecture 23
Protein Tertiary Structure Prediction
• Chp 15 - pp 214 - 230
Wed Oct 17 & Thurs Oct 18 - Lecture 24 & Lab 8
(Terribilini) RNA Structure/Function & RNA Structure Prediction
• Chp 16 - pp 231 - 242
Fri Oct 19 - Lecture 25 (& Mon Oct 22)
Gene Prediction
• Chp 8 - pp 97 - 112

BCB 544 Only: New Homework Assignment
544 Extra#2 (posted online Thurs?)
Due: Fri Nov 2 by 5 PM
• HW#2 is next step in Team Projects
• Will end lecture a few minutes early today - to allow time to meet & discuss 544 Teams & Projects

Homework Assignment
ALL: HomeWork #4 (emailed & posted online Sat AM)
Due: Mon Oct 22 by 5 PM (not Fri Oct 19)
Read: Ginalski et al. (2005) Practical Lessons from Protein Structure Prediction, Nucleic Acids Res. 33:1874-91.
http://nar.oxfordjournals.org/cgi/content/full/33/6/1874 (PDF posted on website)
• Although somewhat dated, this paper provides a nice overview of protein structure prediction methods and evaluation of predicted structures.
• Your assignment is to write a summary of this paper - for details see HW#4 posted online & sent by email on Sat Oct 13

Seminars this Week
BCB List of URLs for Seminars related to Bioinformatics:
http://www.bcb.iastate.edu/seminars/index.html
• Oct 18 Thur - BBMB Seminar 4:10 in 1414 MBB
  • Sachdev Sidhu (Genentech) Phage peptide and antibody libraries in protein engineering and ligand selection
  • Was great talk!
• Oct 19 Fri - BCB Faculty Seminar 2:10 in 102 ScI
  • Lyric Bartholomay (Ent, ISU) Computational Biology and vector-borne disease: from the field to the bench

Another local example: Combining Structure Prediction, Machine Learning & "Real" (wet-lab) Experiments to Investigate the Lentiviral Rev Protein: A Step Toward New HIV Therapies
• Susan Carpenter (Washington State Univ): Wendy Sparks, Yvonne Wannemuehler
• Drena Dobbs, GDCB: Jae-Hyung Lee, Michael Terribilini
• Kai-Ming Ho, Physics: Yungok Ihm, Haibo Cao, Cai-zhuang Wang
• Gloria Culver, BBMB: Laura Dutca

Chp 16 - RNA Structure Prediction
SECTION V STRUCTURAL BIOINFORMATICS
Xiong: Chp 16 RNA Structure Prediction (Terribilini)
• RNA Function
• Types of RNA Structures
• RNA Secondary Structure Prediction Methods
• Ab Initio Approach
• Comparative Approach
• Performance Evaluation

RNA Function
• Storage/transfer of genetic information
• Newly discovered regulatory functions
  • miRNA & siRNA pathways, especially
• Catalytic

RNA types & functions
Types of RNAs                           Primary Function(s)
mRNA - messenger                        translation (protein synthesis); regulatory
rRNA - ribosomal                        translation (protein synthesis) <catalytic>
tRNA - transfer                         translation (protein synthesis)
hnRNA - heterogeneous nuclear           precursors & intermediates of mature mRNAs & other RNAs
scRNA - small cytoplasmic               signal recognition particle (SRP); tRNA processing <catalytic>
snRNA - small nuclear                   mRNA processing, polyA addition <catalytic>
snoRNA - small nucleolar                rRNA processing/maturation/methylation
regulatory RNAs (siRNA, miRNA, etc.)    regulation of transcription and translation, other??

RNA Structures
• RNA forms complex 3D structures
• Mainly "single-stranded" - but:
• Single RNA strands can self-hybridize to form base-paired regions

Levels of RNA Structure
Like proteins, RNA has primary, secondary, and tertiary structure (& quaternary structure, too)
1. Primary structure = Ribonucleotide sequence
2. Secondary structure = Helix vs turn (base-paired vs single-stranded)
   Note: in RNA, helices often involve long-range interactions
3. Tertiary structure = 3D structure (also due to long-range interactions)
4. Quaternary structure = complex of 2 or more RNA strands
(Rob Knight, Univ Colorado)

Common structural motifs in RNA
• Helices
• Loops
  • Hairpin
  • Interior
  • Bulge
  • Multibranch
• Pseudoknots
• Tetraloops
(Fig 6.2, Baxevanis & Ouellette 2005)

Covalent & non-covalent bonds in RNA
Primary:
• Covalent bonds
Secondary/Tertiary:
• Non-covalent bonds
  • H-bonds (base-pairing)
  • Base stacking
(Fig 6.2, Baxevanis & Ouellette 2005)

RNA Structure Prediction
• RNA tertiary structure is very difficult to predict
• Focus on predicting RNA secondary structure:
  • Given an RNA sequence, predict its secondary structure
• Almost all methods ignore higher order secondary structures such as pseudoknots & tetraloops
  • Specialized software is available for predicting these

RNA Pseudoknots & Tetraloops
• Often have important regulatory or catalytic functions
[Images: Pseudoknot; Tetraloop]
http://www.lbl.gov/Science-Articles/ResearchReview/Annual-Reports/1995/images/rna.gif
http://academic.brooklyn.cuny.edu/chem/z huang/QD/mckay_hr.gif

Base Pairing in RNA
G-C, A-U, G-U ("wobble") & many variants
See: IMB Image Library of Biological Molecules
http://www.fli-leibniz.de/ImgLibDoc/nana/IMAGE_NANA.html#basepairs

Experimental RNA structure determination?
• X-ray crystallography
• NMR spectroscopy
• Enzymatic/chemical mapping

RNA Secondary Structure Prediction Methods
Two (three, recently) main types of methods:
1. Ab initio - based on calculating most energetically favorable secondary structure(s)
   Energy minimization (thermodynamics)
2. Comparative approach - based on comparisons of multiple evolutionarily-related RNA sequences
   Sequence comparison (co-variation)
3. Combined computational & experimental
   Use experimental constraints when available

RNA Secondary structure prediction - 1
1) Energy minimization (thermodynamics)
• Algorithms: Dynamic programming to find high probability pairs (also, some genetic algorithms)
• Software:
  • Mfold - Zuker
  • RNAfold (Vienna Package) - Hofacker
  • RNAstructure - Mathews
  • Sfold - Ding & Lawrence
(R Knight 2005)

RNA Secondary structure prediction - 2
2) Comparative sequence analysis (co-variation)
• Algorithms:
  • Mutual information
  • Context-free grammars
• Software:
  • RNAalifold
  • Foldalign
  • Dynalign

RNA Secondary structure prediction - 3
3) Combined experimental & computational
• Experiments: Map single-stranded vs double-stranded regions in folded RNA
• How?
  • Enzymes: S1 nuclease, T1 RNase
  • Chemicals: kethoxal, DMS, OH•
• Software: Mfold, Sfold, RNAstructure, RNAfold, RNAalifold
[Figure: RNA secondary structure (nucleotide positions 200-240) marked with kethoxal and DMS modification sites, each scored mild or strong]

1 - Ab Initio Prediction
• Requires only a single RNA sequence
• Calculates minimum free energy structure
• Base-paired regions have lower free energy, so methods "attempt to find secondary structure with maximal base pairing" (Careful!)
• IMPORTANT: Largest contribution to energy comes from nearest-neighbor (base-stacking) interactions, not base-pairing!

Ab Initio Prediction: Clarifications
• Free energy is calculated based on parameters determined in the wet lab
• Correction: Use known energy associated with each type of nearest-neighbor pair (base-stacking) (not base-pair)
• Base-pair formation is not independent: multiple base-pairs adjacent to each other are more favorable than individual base-pairs - cooperative - because of base-stacking interactions
• Bulges and loops adjacent to base-pairs have a free energy penalty

Ab Initio Prediction: What are the assumptions?
• Native tertiary structure or "fold" of an RNA molecule is (one of) its "lowest" free energy configuration(s)
  Gibbs free energy = ΔG in kcal/mol at 37°C
  = equilibrium stability of structure
  lower values (negative) are more favorable
  Is this assumption valid? in vivo? - this may not hold, but we don't really know

Energy minimization: What are the rules?
  A A
  U U      Basepairs A=U, A=U      ΔG = -1.2 kcal/mole

  A U
  U A      Basepairs A=U, U=A      ΔG = -1.6 kcal/mole
What gives here?
(C Staben 2005)

Energy minimization calculations: Base-stacking is critical
Stacked pair (top / bottom strand)      ΔG (kcal/mole)
AA / UU                                 -1.2
AU / UA or UA / AU                      -1.6
AG, AC, CA, GA / UC, UG, GU, CU         -2.1
CG / GC                                 -3.0
GC / CG                                 -4.3
GU / UG                                 -0.3
CC / GG                                 -4.8
XG, GX / YU, UY                         0
- Tinoco et al.
(C Staben 2005)

Ab initio RNA Structure Prediction: Uses Nearest-neighbor parameters
• Most methods for ab initio prediction (free energy minimization) use nearest-neighbor energy parameters (derived from experiment) for predicting stability of an RNA secondary structure (in terms of ΔG at 37°C)
  & most available software packages use same set of parameters - Mathews, Sabina, Zuker

Ab Initio Energy Calculation
• Search for all possible base-pairing patterns
• Calculate total energy of each structure based on all stabilizing and destabilizing forces
Total free energy for a specific RNA conformation = Sum of incremental energy terms for:
• helical stacking (sequence dependent)
• loop initiation
• unpaired stacking
(favorable "increments" are < 0)

Dot Matrices
• Can be used to find all possible base pair patterns
• Compare input sequence to itself and put a dot where there is a complementary base

Dynamic Programming
• Finding optimal secondary structure is difficult - lots of possibilities
• Compare RNA sequence with itself
• Apply scoring scheme based on energy parameters for base stacking, cooperativity, and penalties for destabilizing forces
• Find path that represents most energetically favorable secondary structure
(Credits for the two slides above: R Knight 2005; Fig 6.3, Baxevanis & Ouellette 2005)

Problem with DP Approach
• DP returns SINGLE lowest energy structure
• There may be many structures with similar energies
• Also, predicted secondary structure is only as good as energy parameters used
• Solution: return multiple structures with near optimal energies

Popular Ab Initio Prediction Programs
• Mfold
  • Combines DP with thermodynamic calculations
  • Fairly accurate for short sequences, less accurate as sequence length increases
• RNAfold
  • Returns multiple structures near predicted optimal structure
  • Computes larger number of potential secondary structures than Mfold, so uses a simplified energy function

2 - Comparative Prediction Approaches
• Use multiple sequence alignment
• Assume related sequences fold into same secondary structure

Co-variation patterns in MSAs are critical
• RNA functional motifs are conserved
• To maintain RNA structure during evolution, a mutation in a base-paired residue must be compensated for by a mutation in the residue with which it pairs
• Comparative methods search for co-variation patterns in MSAs

Consensus Structures
• Predict secondary structure of each individual sequence in a MSA
• Compare all structures and try to identify a consensus structure

Popular Comparative Prediction Programs
Two main types:
1. Require user to provide MSA
   • RNAalifold
2. No MSA required
   • Foldalign
   • Dynalign

RNAalifold
• Requires user to provide MSA
• Creates a scoring matrix combining minimum free energy and co-variation information
• DP used to identify minimum free energy structure

Foldalign
• User provides pair of unaligned RNA sequences
• Constructs alignment & computes conserved structure
• Suitable only for relatively short sequences

Dynalign
• User provides two unaligned input sequences
• Calculates possible secondary structures using algorithm similar to Mfold
• Compares multiple structures from both sequences to find a common structure

3 - Popular Programs that use Combined Computational & Experimental Approaches
• Mfold
• Sfold
• RNAstructure
• RNAfold
• RNAalifold

Comparison of Predictions for Single RNA using Different Methods
Comparison of Mfold Predictions: -/+ Constraints
[Figures: predicted secondary structures with stem-loops SL X, SL Y and SL Z labeled. Energies shown, in the order extracted from the two slides: Sfold -51.14 kcal/mol; Mfold -54.84 kcal/mol; RNAstructure -71.3 kcal/mol; Mfold -126.05 kcal/mol; Mfold plus constraints -54.84 kcal/mol; RNAfold -80.16 kcal/mol]
(JH Lee 2007)
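
The co-variation idea behind the comparative approach (a mutation at one paired position is compensated by a mutation at its partner) can be illustrated with mutual information between alignment columns, one of the algorithms listed for comparative sequence analysis. This is only a minimal sketch: it is not what RNAalifold, Foldalign or Dynalign actually compute, and the toy alignment and function names are hypothetical, chosen for illustration.

```python
import math
from collections import Counter

def column(msa, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in msa]

def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    ci, cj = column(msa, i), column(msa, j)
    fi, fj = Counter(ci), Counter(cj)   # single-column base counts
    fij = Counter(zip(ci, cj))          # joint counts for the column pair
    mi = 0.0
    for (a, b), count in fij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((fi[a] / n) * (fj[b] / n)))
    return mi

# Hypothetical toy alignment (an assumption, not data from the lecture):
# columns 0 and 5 co-vary as G-C, A-U, C-G, U-A; columns 2 and 3 are conserved.
toy_msa = [
    "GCAAAC",
    "ACAAAU",
    "CCAAAG",
    "UCAAAA",
]

mi_paired = mutual_information(toy_msa, 0, 5)
mi_conserved = mutual_information(toy_msa, 2, 3)
print(f"columns 0 and 5: {mi_paired:.2f} bits")
print(f"columns 2 and 3: {mi_conserved:.2f} bits")
```

Note that perfectly conserved columns carry no mutual information, so a signal of this kind only appears where the aligned sequences actually vary; a conserved helix is invisible to it.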
Performance Evaluation
• Ab initio methods? correlation coefficient = 20-60%
• Comparative approaches? correlation coefficient = 20-80%
  • Programs that require user to supply MSA are more accurate
• Comparative programs are consistently more accurate than ab initio
• Base-pairs predicted by comparative sequence analysis for large & small subunit rRNAs are 97% accurate when compared with high resolution crystal structures! - Gutell, Pace
• BEST APPROACH? Methods that combine computational prediction (ab initio & comparative) with experimental constraints (from chemical/enzymatic modification studies)

BCB 544 "Team" Projects
• 544 Extra HW#2 is next step in Team Projects
  • Write ~ 1 page outline
  • Schedule meeting with Michael & Drena to discuss topic
  • Read a few papers
  • Write a more detailed plan
• You may work alone if you prefer
• Last week of classes will be devoted to Projects
  • Written reports due: Mon Dec 3 (no class that day)
  • Oral presentations (15-20') will be: Wed-Fri Dec 5,6,7
  • 1 or 2 teams will present during each class period
See Guidelines for Projects posted online
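
The Performance Evaluation slide quotes correlation coefficients without defining them. As a rough sketch, assuming they are Matthews-style correlation coefficients computed over base pairs (an assumption, not something the slide states), a predicted structure can be scored against a reference as follows. The dot-bracket strings and function names are hypothetical examples, and pseudoknots are not handled.

```python
import math

def parse_pairs(dot_bracket):
    """Return the set of base pairs (i, j) encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.add((stack.pop(), pos))
    return pairs

def compare_structures(predicted, reference):
    """Sensitivity, PPV and Matthews correlation over base pairs."""
    n = len(reference)
    pred, ref = parse_pairs(predicted), parse_pairs(reference)
    tp = len(pred & ref)                   # pairs predicted and in reference
    fp = len(pred - ref)                   # predicted but not in reference
    fn = len(ref - pred)                   # in reference but missed
    tn = n * (n - 1) // 2 - tp - fp - fn   # all other possible (i, j) pairs
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, ppv, mcc

# Hypothetical example: the prediction misses the innermost pair of a hairpin.
reference = "((((....))))"
predicted = "(((......)))"
sens, ppv, mcc = compare_structures(predicted, reference)
print(f"sensitivity={sens:.2f} PPV={ppv:.2f} MCC={mcc:.2f}")
```

Counting true negatives over all position pairs is one possible convention; published benchmarks may count them differently, which changes the coefficient.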