Bioinformatics Master Course II: DNA/Protein structure-function analysis and prediction
Lecture 12: DNA/RNA structure
Centre for Integrative Bioinformatics VU

Biological Functions of Nucleic Acids
• DNA → (transcription) → mRNA → (translation) → Protein
• tRNA (transfer RNA, adaptor in translation)
• rRNA (ribosomal RNA, component of the ribosome)
• snRNA (small nuclear RNA, component of the spliceosome)
• snoRNA (small nucleolar RNA, takes part in processing of rRNA)
• RNase P (ribozyme, processes tRNA)
• SRP RNA (RNA component of the signal recognition particle)
• …

transcription + translation = expression
Eukaryotes have spliced genes…
DNA makes RNA makes Protein

Some facts about human genes
• Comprise about 3% of the genome
• Average gene length: ~8,000 bp
• Average of 5-6 exons/gene
• Average exon length: ~200 bp
• Average intron length: ~2,000 bp
• ~8% of genes have a single exon
• Some exons can be as small as 1 or 3 bp.
• HUMFMR1S is a typical gene: 17 exons, each 40-60 bp long, comprising 3% of a 67,000 bp gene
• The human factor VIII gene (whose mutations cause hemophilia A) is spread over ~186,000 bp. It consists of 26 exons ranging in size from 69 to 3,106 bp, and its 25 introns range in size from 207 to 32,400 bp. The complete gene comprises ~9 kb of exon and ~177 kb of intron.
• The biggest human gene yet is for dystrophin. It has >30 exons and is spread over 2.4 million bp.
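As a quick check of the factor VIII figures above, the exonic share of the gene can be worked out from the rounded sizes quoted on the slide (~9 kb of exon, ~177 kb of intron). This is only an illustrative sketch; the function name `exon_fraction` and the constant names are made up for this example and do not come from the lecture or from any library.

```python
def exon_fraction(exon_bp: int, intron_bp: int) -> float:
    """Return the fraction of a gene's length that is exonic."""
    total = exon_bp + intron_bp
    if total <= 0:
        raise ValueError("gene length must be positive")
    return exon_bp / total


# Rounded factor VIII sizes as quoted on the slide (assumed exact here).
F8_EXON_BP = 9_000
F8_INTRON_BP = 177_000

f8_total = F8_EXON_BP + F8_INTRON_BP
f8_share = exon_fraction(F8_EXON_BP, F8_INTRON_BP)

# Report total gene length and the percentage that is exon.
print(f"factor VIII: {f8_total:,} bp total, {f8_share:.1%} exonic")
```

With these rounded inputs the two parts add up to the ~186,000 bp quoted for the whole gene, and just under 5% of it is exon; the rest is intron.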
Nucleic Acid Basics
• Nucleic Acids Are Polymers
• Each Monomer Consists of Three Moieties: A Base + A Ribose Sugar + A Phosphate
  – Nucleoside = base + sugar
  – Nucleotide = base + sugar + phosphate
• A Base Can be One of the Five Rings:
  – Pyrimidines (C, T, U)
  – Purines (A, G)
• Pyrimidines and Purines Can Base-Pair (Watson-Crick Pairs)

Nucleic Acids As Heteropolymers
• Nucleosides, Nucleotides
• Single Stranded DNA (5’ → 3’)
• A single stranded RNA will have OH groups at the 2’ positions
• Note the directionality of DNA or RNA

Stability of base-pairing
• C-G base pairing is more stable than A-T (A-U) base pairing
• 3rd codon position has freedom to evolve (synonymous mutations)
• Species can therefore optimise their G-C content (e.g. thermophiles are GC rich)

DNA compositional biases
• Base composition of genomes:
  – E. coli: approximately 25% A, 25% C, 25% G, 25% T
  – P. falciparum (Malaria parasite): 82% A+T
• Translation initiation:
  – ATG (AUG) is the near universal motif indicating the start of translation in a DNA coding sequence.

Genetic diseases: Cystic Fibrosis
• Known since very early on (“Celtic gene”)
• Inherited autosomal recessive condition (Chr. 7)
• Symptoms:
  – Clogging and infection of lungs (early death)
  – Intestinal obstruction
  – Reduced fertility and (male) anatomical anomalies
• CF gene CFTR has a 3-bp deletion leading to Del508 (Phe) in the 1480 aa protein (epithelial Cl- channel)
  – protein is degraded in the ER instead of being inserted into the cell membrane

Structure Overview of Nucleic Acids
• Unlike the three-dimensional structures of proteins, DNA molecules assume simple double-helical structures largely independent of their sequences. Three kinds of double helix have been observed in DNA: type A, type B, and type Z, which differ in their geometries. The double-helical structure is essential to the coding function of DNA.
Watson (biologist) and Crick (physicist) first proposed the double-helix structure in 1953, based on X-ray diffraction data.
• RNA, on the other hand, can adopt structures as diverse as those of proteins, as well as the simple type A double helix. This ability to be both informational and structurally diverse suggests that RNA was the prebiotic molecule that could function in both replication and catalysis (the RNA World Hypothesis). In fact, some viruses (e.g. retroviruses) use RNA as their genetic material.

Three Dimensional Structures of Double Helices
[Figure: A-DNA double helix, with minor groove and major groove labelled]

Forces That Stabilize Nucleic Acid Double Helix
• There are two major forces that contribute to the stability of helix formation
  – Hydrogen bonding in base-pairing
  – Hydrophobic interactions in base stacking
[Figure: antiparallel strands (5’→3’ and 3’→5’) showing same-strand stacking and cross-strand stacking; A-RNA]

Types of DNA Double Helix
• Type A: major conformation of RNA, minor conformation of DNA;
• Type B: major conformation of DNA;
• Type Z: minor conformation of DNA
[Figure: A: narrow, tight; B: wide, less tight; Z: left-handed, least tight]

Secondary Structures of Nucleic Acids
• DNA is primarily in duplex form.
• RNA is normally single stranded and can adopt diverse secondary structures other than the duplex.

Non-B-DNA secondary structures
• Cruciform
• Triple-helical H-DNA
• Slipped DNA
[Figure legend: = Hoogsteen basepair]

More Secondary Structures
• Pseudoknots
Source: Cornelis W. A. Pleij in Gesteland, R. F. and Atkins, J. F. (1993) THE RNA WORLD. Cold Spring Harbor Laboratory Press.
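A pseudoknot can be stated combinatorially: two base pairs (i, j) and (k, l) cross when i < k < j < l, so the stems cannot be drawn as properly nested hairpins. The sketch below checks a list of base pairs for such a crossing. It is a minimal illustration only; the function name `has_pseudoknot` and the toy pair lists are hypothetical and not taken from the lecture.

```python
from itertools import combinations


def has_pseudoknot(pairs):
    """Return True if any two base pairs cross (i < k < j < l)."""
    # Normalise each pair so the 5' index comes first, then sort by it.
    ordered = sorted((min(a, b), max(a, b)) for a, b in pairs)
    for (i, j), (k, l) in combinations(ordered, 2):
        # After sorting, i <= k; a crossing means k opens inside (i, j)
        # but closes outside it.
        if i < k < j < l:
            return True
    return False


# Toy examples (0-based positions along a single RNA strand).
hairpin = [(0, 9), (1, 8), (2, 7)]               # nested stem only
pseudoknot = [(0, 9), (1, 8), (5, 14), (6, 13)]  # second stem crosses the first

print(has_pseudoknot(hairpin))
print(has_pseudoknot(pseudoknot))
```

The pairwise check is quadratic in the number of base pairs, which is fine for small illustrative structures.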
rRNA Secondary Structure Based on Phylogenetic Data

3D Structures of RNA: Transfer RNA Structures
• Secondary Structure of tRNA
• Tertiary Structure of tRNA
[Figure labels: TΨC loop, variable loop, anticodon stem, D loop, anticodon loop]

3D Structures of RNA: Ribosomal RNA Structures
• Secondary Structure of large ribosomal RNA
• Tertiary Structure of large ribosome subunit
Ban et al., Science 289 (905-920), 2000

3D Structures of RNA: Catalytic RNA
• Secondary Structure of Self-splicing RNA
• Tertiary Structure of Self-splicing RNA

Some structural rules:
• Base-pairing is stabilising
• Un-paired sections (loops) destabilise
• 3D conformation with interactions makes up for this

Sense/antisense RNA
• Antisense RNA blocks translation through hybridization with the sense (coding) mRNA

Sense/antisense peptides
• Have been therapeutically used

Sense/antisense proteins
• Does it make (anti)sense?
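The antisense RNA idea above rests on Watson-Crick complementarity: an antisense strand is the reverse complement of the sense strand, so the two can hybridize along their whole length. A minimal sketch of that relationship follows; the function name `antisense_rna` and the six-base toy sequence are hypothetical, chosen only for illustration.

```python
# Watson-Crick complements for RNA bases (A-U, G-C).
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(sense: str) -> str:
    """Return the antisense RNA (reverse complement), written 5'->3'."""
    sense = sense.upper()
    if set(sense) - set("ACGU"):
        raise ValueError("sequence must contain only A, C, G, U")
    # Complement every base, then reverse so the result also reads 5'->3'.
    return sense.translate(RNA_COMPLEMENT)[::-1]


# Toy sense fragment beginning with the AUG start codon.
sense = "AUGGCC"
anti = antisense_rna(sense)
print(sense, anti)
```

Applying the function twice returns the original sequence, which is a simple sanity check that the two strands are mutually complementary.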