Does a folded RNA have an inside and outside?
Latham and Cech (1989) used hydroxyl
radical probing to answer this question on
a tRNA molecule. Riboses buried in tertiary
structure will not be accessible to the
hydroxyl radical.
The tRNA was enzymatically synthesized and
so it contained no modified bases. Its end was
labeled with 32P, the RNA was folded by
heating and cooling, and then incubated with
Fe(EDTA) and peroxide.
[Fe(EDTA)]2- + H2O2 → [Fe(EDTA)]- + •OH + OH-
Fenton chemistry is the generation of a hydroxyl radical by Fe(EDTA) that is oxidized by
peroxide. The neutral hydroxyl radical abstracts a hydrogen from deoxyribose or ribose, resulting in
strand scission. Preferred sites are H5′ > H4′ > H3′ ≈ H2′ (deoxyribose).
Figure legend: blue, H4′; red, H5′,5″. SASA = solvent-accessible surface area.
Cleavage is determined by solvent accessibility, and for the famous Drew-Dickerson DNA
dodecamer, cleavage efficiency reveals how sequence dependence in B-form structure exposes
or protects the deoxyribose.
Bishop et al., Chemical Biol in press (Tullius lab)
The pattern of protection is mapped onto the secondary structure.
Riboses of nucleotides involved in tertiary interactions are not cleaved.
The 414 nucleotide Tetrahymena Group I intron is one of the most extensively studied RNAs.
It has been the model RNA system for folding studies for the past 20 years. Ask first: Does it
have an inside and an outside? Proteins have solvent-accessible surfaces and internal cores;
is this true for a large folded RNA?
Without Mg2+ there is uniform
cleavage over the entire RNA.
With Mg2+ there are areas that are
preferentially cleaved and some
that were completely protected.
The differences between traces
were mapped onto the secondary
structure of the intron.
Protected sites are shaded.
Cleaved sites are outlined.
The authors made several points:
Proteins have tightly packed interiors as a
natural consequence of their nonpolar and
hydrophobic amino acid side chains that
avoid solvation.
In contrast, the planar bases of a duplex are
in the middle of the RNA helix and the
anionic phosphates and polar sugars are on
the outside of a duplex.
How does this structure lend itself to
compaction?
The authors suggest that tertiary hydrogen
bonding interactions between bases, sugars,
and phosphates like those in tRNA will be
present in the structure of the Group I intron.
Stacking interactions also contribute to
tertiary interactions.
“Finally, magnesium ions, neutralizing the
anionic phosphates and perhaps bridging
helices, could allow the backbones of
different helices to be packed close
together”.
In 1998, the probing experiment was repeated with two differences: folding was followed
over time, and the hydroxyl radicals were generated by an X-ray beam.
Experiment: prepare end-labeled RNA. Mix it in the stopped flow with Mg2+ to a final concentration of 10
mM. After mixing, start sampling by irradiating with the beam and collecting samples. Time resolution is
10 msec. Run the samples out on a denaturing polyacrylamide gel and quantify the cleavage (protection)
with time. (Sclavi et al., 1998. Science 279: 1940-1945.)
Here are the data describing the time dependence of protection of sites.
Ybar is the fractional saturation of single protected sites, determined
from the power dependence of the beam:
p = p_lower + (p_upper - p_lower) * Ybar
Ybar = 1 - e^(-kt)
Here p is the saturation, p_upper and p_lower are the upper and lower
limits of the transition curve, k is the first-order rate constant, and t is time in
seconds.
Curve A is protection of nt 174-176: k = 2.7 (-1.3, + 1.8) s-1
Curve B is protection of nt 183-189: k = 0.9 (±0.3) s-1
Curve C is protection of nt 57-59: k = 0.20 (±0.05) s-1
Open symbols are controls of pre-equilibrated RNA with Mg2+.
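The rate constants quoted for curves A, B, and C can be turned into folding timescales. The following is a minimal sketch, assuming the single-exponential form Ybar = 1 - e^(-kt) stated for these data; the function names and the 1 s sample time are illustrative choices, not part of the original analysis.

```python
import math

# First-order rate constants (s^-1) quoted for the three protection curves.
RATE_CONSTANTS = {
    "A (nt 174-176)": 2.7,
    "B (nt 183-189)": 0.9,
    "C (nt 57-59)": 0.20,
}


def fractional_saturation(k, t):
    """Ybar = 1 - exp(-k t) for a single first-order transition."""
    return 1.0 - math.exp(-k * t)


def half_time(k):
    """Time at which Ybar reaches 0.5: t_1/2 = ln(2) / k."""
    return math.log(2) / k


def protection(p_lower, p_upper, k, t):
    """p = p_lower + (p_upper - p_lower) * Ybar."""
    return p_lower + (p_upper - p_lower) * fractional_saturation(k, t)


if __name__ == "__main__":
    for label, k in RATE_CONSTANTS.items():
        print(f"{label}: t1/2 = {half_time(k):.2f} s, "
              f"Ybar(1 s) = {fractional_saturation(k, 1.0):.2f}")
```

With these values the fastest site is half-protected in roughly a quarter of a second, while the slowest of the three takes several seconds.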
Map these data onto the secondary structure:
P indicates a
duplex
Regions with similar folding rates are
color-coded.
Green is fast: 2 sec-1.
Orange has a tetraloop/receptor
and a fast folding rate of ~1 sec-1.
Pink folds slower: 0.2 sec-1.
Yellow folds slowest: 0.06 sec-1.
MODEL:
Russell et al. (2000; Nat Str Biol 7:367-370) characterized this
form and others using SAXS (small angle X-ray scattering).
They found that when the intron folds in the presence of Mg2+, it is compact.
Moreover, the compaction happened fast, but the native structure wasn’t
completely formed until later. This led them to propose “electrostatic collapse”
for the RNA.
General principles:
RNA folding often requires Mg2+ ions.
RNA folding is hierarchical.
RNA molecules can misfold.
Concept: An RNA folding funnel.
M is Misfolded, and N is correct
Native fold.
Intermediates are also shown.
Paths depend on ions, temperature,
mutations, starting structures.
How do you think an RNA folds during
transcription?
Proteins also have folding funnels – how might they differ from the RNA funnels?
More ligands and nucleic acids and folding.
DNA also has close associations with ions. This is a Dickerson dodecamer crystal
structure that is very high resolution.
spermidine
Na+
Hexahydrate Mg2+
[Shui et al (Williams lab) 1998. Biochemistry 37:8341]
The “Spine of Hydration”
Nucleic acids are extensively hydrated, and the concept that there is a network of
ordered water molecules held in place through hydrogen bonding to bases and
phosphates is generally accepted. The “spine of hydration” was thought to occur in the
minor groove of B-form DNA based on early crystal structures of the Dickerson
dodecamer. Shui et al. revisited that study, and concluded that many of those waters
were in fact ions (Na+) that constituted the first layer of the spine. The waters were in a
second ordered layer.
More ligands and nucleic acids.
Ligand = protein.
The arrangement of hydrogen bond donors and acceptors allows a protein to
distinguish among AT, TA, CG, and GC in the major groove. In the minor groove, only
AT-type pairs (AT or TA) can be discriminated from GC-type pairs (GC or CG).
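The difference between the grooves can be checked by writing out what each base pair presents to a protein. This is a small sketch using the standard textbook donor/acceptor assignments, which are an assumption brought in here rather than something listed in these slides; the dictionary and function names are illustrative.

```python
# Pattern read across each groove for the four base pairs.
# A = H-bond acceptor, D = H-bond donor, H = nonpolar hydrogen, M = methyl.
# Standard textbook assignments (assumed, not taken from this lecture).
MAJOR_GROOVE = {"AT": "ADAM", "TA": "MADA", "GC": "AADH", "CG": "HDAA"}
MINOR_GROOVE = {"AT": "AHA", "TA": "AHA", "GC": "ADA", "CG": "ADA"}


def distinguishable_classes(groove):
    """Group base pairs that present an identical pattern in a groove."""
    classes = {}
    for pair, pattern in groove.items():
        classes.setdefault(pattern, []).append(pair)
    return sorted(classes.values())


if __name__ == "__main__":
    print("major groove:", distinguishable_classes(MAJOR_GROOVE))
    print("minor groove:", distinguishable_classes(MINOR_GROOVE))
```

All four pairs fall into separate classes in the major groove, whereas the minor groove patterns collapse into two classes.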
Protein:NA recognition mechanisms.
1) Coulombic interactions (with consequential ion release)
2) van der Waals (dipole-dipole and induced dipole)
3) Solvent driven (hydrophobic effect)
4) Hydrogen bonding
These interactions will be highly dependent on solution conditions of
temperature, salt concentrations, and pH. These conditions must be explicitly
stated in any description of protein binding to RNA or DNA.
1) Coulombic interactions (with consequential ion release)
Protein + DNA <=> Protein:DNA
Kobs = [PD]/([P][D])
This is too simple, since DNA (and RNA) are polyanions and bind counterions.
Logically, since Protein binds a nucleic acid, it must also ‘bind’ anions.
When the nucleic acid binds protein, it must release its counterions and waters
from sites that will interact with protein (vice versa for the protein).
P(aM+, bX-, cH+, dH2O) + D(eM+, fH2O) <=> PD(gM+, hX-, jH2O)
So a more accurate equilibrium reaction is
P + D <=> PD + xM+ (x = (a+e)-g, the net number of cations released)
so increasing the concentration of M+ will
shift the equilibrium to the left (free P and
D).
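Treating the released cations as products of the reaction gives K = [PD][M+]^x/([P][D]), so the observed constant scales as Kobs = K/[M+]^x. The sketch below illustrates that trend only. It ignores the anion, proton, and water terms, and the reference constant and the value of x are hypothetical numbers, not measurements.

```python
import math


def k_obs(k_ref, x, salt_molar, ref_salt_molar=1.0):
    """Observed association constant at a given monovalent salt concentration.

    k_ref is Kobs at ref_salt_molar; x is the net number of cations
    released when the complex forms (Kobs is proportional to [M+]^-x).
    """
    return k_ref * (ref_salt_molar / salt_molar) ** x


# Hypothetical values chosen only to illustrate the trend.
K_REF = 1.0e3    # M^-1 at 1 M salt (assumed)
X_RELEASED = 4   # cations released on binding (assumed)

if __name__ == "__main__":
    for salt in (0.05, 0.1, 0.2, 0.5):
        k = k_obs(K_REF, X_RELEASED, salt)
        print(f"[M+] = {salt:4.2f} M  log10 Kobs = {math.log10(k):.2f}")
```

On a plot of log Kobs against log [M+], this model gives a straight line with slope -x, which is how the number of released cations is usually estimated from a salt titration.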
2) van der Waals (dipole-dipole and induced dipole)
London dispersion forces are weak interactions between induced dipoles.
4) Hydrogen bonding
Recognition of a specific site is often described in terms of ‘direct readout’ – amino acids
of the protein ‘recognize’ the 3D arrangement of hydrogen bond donors and acceptors on
the nucleic acid. ‘Indirect readout’ – the protein recognizes conformational features of the
nucleic acid.
Hydrogen bonding is the most common device to obtain specificity of interactions, since
hydrogen bonding has preferences for length and bond angle.
However, it is not sufficient to consider direct interactions between protein and nucleic
acid since many specific interactions are mediated by water molecules (not necessarily
visible in crystal structures).
Energetically, residues that are involved in intermolecular hydrogen bonding are often
hydrogen bonded to water in the free state, so there is not a large energy gain in
formation of the protein:nucleic acid hydrogen bond (about -1.1 to -1.7 kcal/mol per H-bond).
But, if a hydrogen bond to water is not replaced by an equivalent hydrogen bond, then
there is an energy loss associated with complex formation.
Specificity due to hydrogen bonding is more related to losing a hydrogen bond than
forming one, although the opposing effects are often impossible to separate.
Essential features for modulating the binding of a protein to a nucleic acid are:
1> Reversible binding.
2> Competitive binding.
The same protein for different sites or many proteins for the same site.
3> Modulation of binding affinity and specificity by small effector ligands
4> Competition between different protein subunits.
Binding can be modulated in two ways:
1> Thermodynamic, or equilibrium control.
In this case, regulation is achieved by equilibrium binding affinities of various
proteins for their DNA/RNA sites, and so the percent site occupancy by a given
protein is the key.
2> Kinetic control.
The rates of complex formation or dissociation are most important.
To describe complex formation, it is necessary to know the binding affinity and rates of
binding and dissociation.
In practical terms, in order to understand regulation by a protein:nucleic acid interaction, it
is necessary to know the binding mechanism.
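The distinction between equilibrium and kinetic control can be made concrete with a simple 1:1, one-step binding model. This is a sketch under that assumption; real mechanisms may have more steps, and the rate constants below are hypothetical values picked so that two proteins share the same affinity.

```python
def fraction_bound(free_protein, k_d):
    """Fractional site occupancy for simple 1:1 binding: [P] / (Kd + [P])."""
    return free_protein / (k_d + free_protein)


def dissociation_constant(k_off, k_on):
    """Kd = k_off / k_on for a one-step binding mechanism."""
    return k_off / k_on


def complex_lifetime(k_off):
    """Mean lifetime of the complex in seconds, 1 / k_off."""
    return 1.0 / k_off


# Two hypothetical proteins with the same Kd but different kinetics.
PROTEINS = {
    "fast exchange": {"k_on": 1.0e8, "k_off": 1.0e-1},  # M^-1 s^-1, s^-1
    "slow exchange": {"k_on": 1.0e6, "k_off": 1.0e-3},
}

if __name__ == "__main__":
    free = 1.0e-9  # 1 nM free protein (assumed)
    for name, r in PROTEINS.items():
        kd = dissociation_constant(r["k_off"], r["k_on"])
        print(f"{name}: Kd = {kd:.1e} M, occupancy = "
              f"{fraction_bound(free, kd):.2f}, "
              f"lifetime = {complex_lifetime(r['k_off']):.0f} s")
```

Both hypothetical proteins occupy the site equally at equilibrium, yet one complex persists a hundred times longer than the other, which is why affinity alone does not describe regulation.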
What proteins bind to DNA and RNA?
The helix-turn-helix motif binds to DNA duplexes.
The motif is not stable outside the context
of the whole protein.
Zinc finger specificity can be
modulated.
Three tandem
fingers bind to DNA.
What’s the
advantage of having
more than one
finger?
Wolfe et al. (Pabo lab)
used a leucine zipper linked to
two Zn-fingers to create a new
DNA binding protein.
Leucine zippers themselves can bind
DNA (fos/jun, GCN4, bZIP)
Complex DNA binding proteins
E. coli SSB protein
(Lohman lab, WUMS)
EcoRI restriction enzyme + DNA
(Rosenberg lab, U Pitt)
Proteins that bind RNA.
1. The Arginine-rich motif
(ARM)
BIV Tat and HIV-1 Rev peptides can
fit into the major groove because
there is a dramatic deformation of
the A-form duplex. The groove is
widened by the bulged nucleotides so that
the peptide can make contact with
bases, sugars, and phosphates.
Due to the number of interactions
(hydrogen bonds, electrostatics,
and stacking of hydrophobic amino
acids), the dissociation constants
of these small complexes can be
nanomolar.
1BIV,1MNB
1ETF
1A4T
Weiss & Narayana (1998) Biopolymers 48:167
P22: NAKTR RHER9R RKLAI ERDTI
The P22 peptides WT, Pro9, and Ala9
mutants do not show evidence of a stable
helix. Binding of the Ala9 peptide is at least
as tight as the wt peptide, but the Pro
mutant doesn’t bind. [A construct with four
Ala added to the terminus does have the
CD signature of an α-helix.]
wt
Pro9
Ala9
The point is: there is not one unique way for an ARM peptide to make specific contact
with an RNA. It is difficult to predict and almost impossible to model.
A is P22/BoxB. The α-helix fits into
the groove of the RNA and bends
over the GAAAA loop where its Trp
stacks with an adenosine. The
peptide bends at R11 to allow the
helical sidechains to stack with the
nucleobases.
B is HIV Rev/RRE. The REV
peptide forms an α-helix, but
positions itself in the RNA
bulge. The peptide contacts
both RNA strands.
C is BIV Tat/TAR. The Tat
peptide forms a β hairpin as it
positions itself in the RNA
bulge. The peptide contacts
both RNA strands.
2. dsRBD
Double-stranded RNA
binding domains
(dsRBDs) are not sequence-specific,
but are sensitive to A-form
structure.
This is the domain from
PKR (Protein Kinase R).
The affinity comes from many contacts between the protein and 2′ OH groups in the
minor groove. If a DNA were A-form, these contacts would be missing, but the protein
could still make electrostatic contacts with the phosphates. dsRBD binding is
structure-selective, but not sequence-specific.
Other very important proteins have this motif:
ADAR1, the RNA-specific adenosine deaminase that converts adenosine to inosine in duplexes contains
three dsRBDs.
DCR (Dicer) is the enzyme that cleaves double-stranded RNAs into 21 base-pair pieces. These small
duplex RNAs go on to become incorporated into the RISC, where they are bound by Ago and become the
templates for RNAi cleavage of mRNAs. DCR has one dsRBD.
Argonaute (Ago) proteins, in contrast, do not contain dsRBDs; they hold their guide RNAs through PAZ, MID,
and PIWI domains. They bind to miRNA and siRNA as part of the process of
gene regulation by translation repression (the current model for miRNA activity) or mRNA degradation
(RNAi).
3. RNA Recognition Motif (RRM). It is the most common eukaryotic RNA
binding domain. RRMs are identified by their conserved sequences:
Consensus
RNP-2: LFVGNL (alternate residues: I, Y, I, K, L)
RNP-1: KGFGFVXF (alternate residues: R, Y, A, Y)
Two or three aromatic residues are solvent-exposed
on the surface of the β sheet
RNP-1 begins in Loop 3 and extends through β3.
RNP-2 extends through β1.
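Because RRMs are identified by conserved sequence, a first-pass search can be written as a pattern match. The sketch below assumes the RNP-1 octamer reads (K/R) G (F/Y) (G/A) F V X (F/Y), which is one reading of the consensus and its alternate residues. The test sequence is made up for illustration and is not a real protein.

```python
import re

# RNP-1 octamer as a regular expression, assuming the alternates align as
# (K/R) G (F/Y) (G/A) F V X (F/Y); "." stands for the unconstrained X.
RNP1 = re.compile(r"[KR]G[FY][GA]FV.[FY]")


def find_rnp1(sequence):
    """Return (start index, matched octamer) for each RNP-1-like match."""
    return [(m.start(), m.group()) for m in RNP1.finditer(sequence.upper())]


# Hypothetical toy sequence containing two octamers that fit the pattern.
TOY_SEQUENCE = "MSDLTAKGFGFVEFGGSRGYAFVTYAA"

if __name__ == "__main__":
    for start, octamer in find_rnp1(TOY_SEQUENCE):
        print(start, octamer)
```

Real RRMs often deviate from the consensus at one or more positions, so a strict pattern like this will miss family members; profile-based searches are the more usual tool.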
Birney, E., Kumar, S., Krainer, A.R. 1993. Analysis of the RNA-recognition motif
and RS and RGG domains: conservation in metazoan pre-mRNA splicing factors.
Nucleic Acids Res. 21(25): 5803–5816.
In 2011, there are 64620 RRM sequences in the NCBI Protein Database
and 400-500 structures.
The human U1A protein is the best studied of all the RRMs. It binds stem-loop II of U1 snRNA.
A cocrystal shows how the
RNA sits on the surface of the
β-sheet and how a protein loop
pokes through the 10
nucleotide RNA loop.
What we have learned: Prediction of an RNA:protein interaction will have
unique difficulties. You can’t simply dock the molecules, since both of them
change their conformation. You can’t design a new RNA binding protein the
same way you can design a new zinc finger DNA binder, or a helix-turn-helix.