2010 RNA_Bioinformatics Lecture 1

Introduction to
RNA Bioinformatics
Craig L. Zirbel
October 5, 2010
Based on a talk originally given by Anton Petrov.
Outline
Lecture 1
• Importance of RNA, examples (miRNA, riboswitches).
• RNA 2D and 3D structure.
• RNA structure prediction.
Lecture 2
• RNA basepairs and 3D motifs
• Predicting secondary structure from sequence (mfold)
Lecture 3
• Statistical variability of protein and RNA sequences
Of the approximately 3 billion nucleotides in the human genome, only about
1.5% code for proteins, although up to 93% are transcribed into RNA.
What is this “non-coding” RNA doing?
ENCODE Project Consortium, Identification and
analysis of functional elements in 1% of the human
genome by the ENCODE pilot project. Nature. 2007
Jun 14;447(7146):799-816.
Mattick, J.S. (2004) The hidden genetic program of complex organisms. Scientific American 291 (4): 60-67.
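To put the percentages above into absolute numbers, here is a small worked calculation. It uses only the figures quoted on the slide (3 billion nucleotides, 1.5% coding, up to 93% transcribed). The subtraction at the end assumes that all protein-coding sequence is itself transcribed; that is an assumption of this sketch, not a claim from the slide.

```python
# Figures quoted on the slide (approximate).
GENOME_SIZE = 3_000_000_000      # nucleotides in the human genome
CODING_PER_MILLE = 15            # 1.5% codes for proteins
TRANSCRIBED_PERCENT = 93         # up to 93% is transcribed into RNA

# Integer arithmetic avoids floating-point rounding noise.
coding_nt = GENOME_SIZE * CODING_PER_MILLE // 1000
transcribed_nt = GENOME_SIZE * TRANSCRIBED_PERCENT // 100

# Assumption of this sketch: coding sequence is a subset of transcribed sequence.
noncoding_transcribed_nt = transcribed_nt - coding_nt

print(f"protein-coding:          {coding_nt:>13,} nt")
print(f"transcribed (upper end): {transcribed_nt:>13,} nt")
print(f"transcribed, non-coding: {noncoding_transcribed_nt:>13,} nt")
```

Under these figures, roughly 45 million nucleotides are protein-coding, while up to about 2.7 billion are transcribed without coding for protein.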
Figure labels: DNA; Transcription; RNA; tRNA; Ribosomal RNA; Translation; Protein
Figure labels: DNA; Reverse Transcription; Transcription; micro RNA; Introns (RNA); RNA; tRNA; Ribosomal RNA; Many other types of ncRNA; Splicing; Translation of exons; Protein
Mattick, J.S. (2004) The hidden genetic program of complex organisms. Scientific American 291 (4): 60-67.
microRNA
miRNAs in a transcript, waiting to be diced out
Mattick, J.S. (2004) The hidden genetic program of complex organisms.
Scientific American 291 (4): 60-67.
Bioinformatic challenge:
given a DNA sequence,
predict microRNA genes and
their respective targets.
Kim VN, MicroRNA biogenesis: coordinated cropping and dicing. Nat Rev Mol Cell Biol. 2005 May;6(5):376-85
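One simple way to approach the target-prediction half of this challenge is seed matching: look in a candidate transcript for sites complementary to the miRNA "seed", often described as nucleotides 2 to 8 of the miRNA. The sketch below is a toy illustration of that idea, not the method of any particular tool; the function names and both sequences are made up for the example, and real predictors add conservation, site context, and thermodynamic filters.

```python
# Watson-Crick complements for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna, transcript, seed_start=1, seed_end=8):
    """Find 0-based start positions of perfect seed matches in `transcript`.

    The seed is mirna[seed_start:seed_end], i.e. nucleotides 2-8 in
    1-based numbering by default (an assumption of this sketch).
    """
    seed = mirna[seed_start:seed_end]
    site = reverse_complement(seed)  # what the target must contain
    hits = []
    pos = transcript.find(site)
    while pos != -1:
        hits.append(pos)
        pos = transcript.find(site, pos + 1)  # allow overlapping hits
    return hits


# Hypothetical miRNA and transcript, invented for illustration only.
toy_mirna = "UACGGAUCCAGUUGACAUGCAA"
toy_transcript = "AAUU" + "GAUCCGU" + "CCAAGG" + "GAUCCGU" + "A"

print(seed_sites(toy_mirna, toy_transcript))
```

Here the seed is ACGGAUC, its reverse complement is GAUCCGU, and the toy transcript was built to contain that site twice.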
Acquisition of novel microRNAs (shown in white boxes) may be
a driving force of recent evolution. Also a factor in cancers?
Peterson, K.J., Dietrich, M.R. and McPeek, M.A. (2009) MicroRNAs and metazoan macroevolution:
insights into canalization, complexity, and the Cambrian explosion. BioEssays 31:736–747.
There are 84 mammal-specific
microRNAs, and 84 more that are
found exclusively in apes.
RIBOSWITCHES
RNAs that bind specific other molecules when those molecules are
present; binding alters the shape and function of the RNA.
Bioinformatic
challenges: find
riboswitches in
genomic sequences,
design novel
riboswitches.
Montange, R. K., & Batey, R. T. (2008). Riboswitches: emerging themes in RNA structure and function. Annu Rev Biophys 37:117-133.
Types of RNA
Bioinformatic
challenges: Is this list
final? Could there be
more types of noncoding RNA (ncRNA) that
we don’t know about yet?
How can we search for
novel ncRNAs in
genomes?
http://en.wikipedia.org/wiki/List_of_RNAs
Goals of RNA bioinformatics
• Find and classify RNA genes in genomic
sequences (using both experimental and
computational methods).
• Predict secondary and 3D structure from RNA
sequence.
• Infer function from structure.
• Rationally design RNA molecules for
biotechnology.
• Find diseases associated with RNAs (e.g., cancer
and miRNA)
Why RNA is unique
• Similar to DNA in chemical composition, primary
and secondary structure, and information content,
but with structures more complicated than helices
• Similar to proteins in tertiary (3D) structure and
function, but also very different: mostly base-base
interactions, fewer backbone-backbone interactions
• Binds substrates and catalyzes reactions, just as
proteins do.
• Participates in all stages of gene expression and
information transfer: transcription, splicing,
translation. Frequent target of antibiotics.
Similarities Between Protein and RNA
3D Structures
• Compact folding
• Hierarchical
organization
• Modular domains
• Specific tertiary
interactions
• Molecular “mimicry”
-- Proteins that
“mimic” RNA
Figure: The tertiary structures of tRNA-mimic translation factors and tRNA.
(a) Thermus thermophilus EF-G:GDP (PDB accession code 1DAR).
(b) Thermus aquaticus EF-Tu:GDPNP:Phe-tRNAPhe (1TTT).
(c) Thermus thermophilus RRF (1EH1).
(d) Yeast Phe-tRNAPhe.
LIANG, H., & LANDWEBER, L. F. (2005). Molecular mimicry: Quantitative methods to study structural similarity between protein and RNA. R
1172.
RNAs do not stay linear: they fold back on
themselves so that complementary stretches pair up
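As a minimal sketch of what "complementary" means here, the function below checks whether two stretches of the same RNA could form a gap-free antiparallel helix. It allows the Watson-Crick pairs A-U and G-C plus the G-U wobble pair; the function name and the example hairpin are invented for illustration, and real RNAs also form other kinds of basepairs that this check ignores.

```python
# Pairs allowed in this sketch: Watson-Crick plus the G-U wobble.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def can_pair(upstream, downstream):
    """True if the two segments can form a gap-free antiparallel helix.

    Both segments are written 5' to 3'. The first base of `upstream`
    pairs with the last base of `downstream`, and so on inward.
    """
    if len(upstream) != len(downstream):
        return False
    return all((x, y) in PAIRS for x, y in zip(upstream, reversed(downstream)))


# Hypothetical hairpin: stem GGCAC, loop UUCG, stem GUGCC.
hairpin = "GGCAC" + "UUCG" + "GUGCC"
print(can_pair(hairpin[:5], hairpin[-5:]))  # the two stem halves
```

The two halves of the toy stem pair as G-C, G-C, C-G, A-U, C-G, so the strand can fold back on itself around the four-nucleotide loop.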
RNA 2D Structure Elements
Basepairs are the basic units
of secondary structure.
Bioinformatics: sequence and genome analysis By David W. Mount
Bioinformatic challenges:
predict the most stable 2D
structures, resolve
pseudoknotted regions, etc.
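To give a feel for 2D structure prediction, here is a sketch of the classic Nussinov dynamic program, which finds a nested (pseudoknot-free) structure with the maximum number of basepairs. This is a deliberately simplified stand-in: it counts pairs rather than minimizing free energy, so it is not what energy-based tools such as mfold compute. The allowed pairs, the minimum loop length of 3, and the example sequence are assumptions of this sketch.

```python
# Pairs allowed in this sketch: Watson-Crick plus the G-U wobble.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for a nested structure of `seq`.

    `min_loop` is the minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0, ""
    # dp[i][j] = maximum number of pairs within seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Traceback: recover one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], "".join(structure)


# Hypothetical sequence, invented for illustration.
pairs, fold = nussinov("GGCACUUCGGUGCC")
print(pairs)
print(fold)
```

Because the recursion only combines nested sub-intervals, pseudoknots are excluded by construction, which is one reason pseudoknotted regions need separate treatment. Several structures can tie for the maximum pair count, so the dot-bracket string printed is just one optimal solution.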
2d to 3d structure of RNA