Genomic SELEX to identify RNA targets of plant RNA binding proteins

Olga Bannikova and Andrea Barta
Max F. Perutz Laboratories, Medical University of Vienna, Dr. Bohrgasse 9/3, A-1030 Wien, Austria
Address correspondence to: Andrea Barta or Olga Bannikova, Max F. Perutz Laboratories, Medical
University of Vienna, Dr. Bohrgasse 9/3, A-1030 Wien, Austria; email:
andrea.barta@meduniwien.ac.at; olga.bannikova@univie.ac.at
1. Abstract
Systematic evolution of ligands by exponential enrichment (SELEX) is an elegant technique
that allows the isolation of RNA and DNA sequences which directly interact with a protein of
interest. Genomic SELEX is an expression-level-independent selection method which is
useful when multiple RNA targets are expected. These RNAs might be expressed under different
conditions, be differentially localized, or have diverse expression levels. The
method therefore allows the determination of RNA sequences within a particular genome which are
potential binding partners of a particular protein. First, a DNA library is constructed by
random priming of sheared Arabidopsis DNA with a forward and a reverse primer and selection
of fragments with a desired length of 200-300 nucleotides. The RNA library is then generated by
transcribing the DNA fragments with T7 polymerase. Several rounds of selection with a
protein of choice yield a highly specific pool of potential RNA targets. This pool is best
sequenced by a deep sequencing method such as 454 sequencing technology, and the
sequences of the selected library are analyzed by bioinformatics methods. The result of a
two-month experiment is a collection of RNAs which are binding targets for the protein
used for selection. These RNAs usually allow the determination of binding motifs.
Keywords: RNA selection, RNA binding proteins
Bannikova and Barta, Genomic Selex
2. Introduction
Systematic evolution of ligands by exponential enrichment (SELEX) is a combination
of a combinatorial chemistry approach and experimental molecular biology techniques
allowing the isolation of high affinity binding partners to a given molecular target [1]. The first
SELEX experiments used randomized artificial RNA-aptamer libraries from which strong
binders for proteins or small molecules were selected. Typically, the initial aptamer library
contains around 10^15 to 10^16 DNA oligonucleotides with a randomized central part and fixed
flanking regions [2]. Such a pool can be easily converted to RNA by in vitro transcription and
then, after the selection step, back to DNA via RT-PCR. The selection is based on incubation
of the library with the target molecule followed by separation of the unbound fraction from the
formed complex, which is usually performed on a nitrocellulose filter [3, 4]. The nucleic acids of
the selected complex are isolated and then amplified. Such cycles are repeated several
times to select for high affinity binders for the target molecule. The aim of this selection is the
isolation of RNA or DNA oligonucleotides which have the strongest binding affinity to the
target of interest. These tight binding partners are often used in diagnostic and therapeutic
applications [5].
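The effect of repeating the selection and amplification cycle can be illustrated with a toy calculation. The sketch below is not part of the protocol: it assumes a pool made of only two classes of molecules (binders and non-binders), and the retention probabilities and starting fraction are hypothetical values chosen only to show why a few cycles are enough to enrich rare binders.

```python
# Toy model of SELEX enrichment (illustrative only; all numbers are assumptions).

def enrich(fraction_binder, retain_binder, retain_background, rounds):
    """Return the binder fraction of the pool after each selection round.

    fraction_binder   -- starting fraction of the pool that binds the target
    retain_binder     -- probability that a binder survives one round
    retain_background -- probability that a non-binder survives one round
    """
    history = []
    f = fraction_binder
    for _ in range(rounds):
        kept_binder = f * retain_binder
        kept_background = (1.0 - f) * retain_background
        # Amplification restores the pool size but preserves the proportions.
        f = kept_binder / (kept_binder + kept_background)
        history.append(f)
    return history


if __name__ == "__main__":
    # Hypothetical values: 1 binder per million molecules, 50% vs 1% retention.
    for i, f in enumerate(enrich(1e-6, 0.5, 0.01, 8), start=1):
        print(f"round {i}: binder fraction = {f:.3g}")
```

In this model the ratio of binders to non-binders is multiplied by the same factor in every round, which is the sense in which the enrichment is exponential. Real pools contain a continuum of affinities, so the numbers should not be read quantitatively.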
Several problems can emerge throughout the selection procedure. One possible
problem is the loss of bound oligonucleotides or the enrichment of nonspecific binders
due to the separation procedure used. To overcome this problem, several other methods for
the separation step have been described [for review see 6]. Because the SELEX method described above is
optimized to select the winners that bind best to a particular molecule, it is most
beneficial when such artificial aptamers are used in clinical studies. Applications range from
inhibition of a particular protein with these aptamers to using them as substitutes for
antibodies. However, these selected aptamer sequences might not be present in the genome
of the organism, as such strong binders might be detrimental to the function of the protein
used in the selection. Biological processes necessitate a certain dynamic equilibrium, as
proteins have to bind specifically in one situation but often have to be released again in order
for the biological process to continue. Therefore, using a randomized aptamer library for the
SELEX procedure is not optimal for finding natural binding sites for your favorite RNA binding
protein, as the best binder might not correspond to real binders occurring in vivo in the
cell.
To overcome some of these problems in the field of RNA-protein interactions, another
type of SELEX, termed Genomic SELEX, was developed. In contrast to a randomized library,
only sequences occurring in the genome of a specific organism are used for library
construction, allowing the search for naturally occurring DNAs or RNAs from the organism of
interest.
The first DNA library for Genomic SELEX was developed in 1997 by B.S. Singer [7] for
E. coli, S. cerevisiae and human genomes. Subsequently, several proteins have been used to
find in vivo RNA targets via Genomic SELEX experiments [8, 9]. These experiments showed
that Genomic SELEX is able not only to find known targets for these proteins, but also to detect
many new targets, which were then confirmed in in vivo studies.
The selection procedure does not vary significantly from a randomized aptamer
SELEX. The only difference lies in the library construction, where short genomic sequences
are prepared with a Klenow-dependent addition of adaptors to the DNA. These adaptors are
chosen to permit the construction of an RNA library which contains most genomic DNA
sequences of an organism in the form of RNA sequences. This allows selection of protein
binding regions independent of their level of transcription. Unsurprisingly, many of the problems
which have to be dealt with in Genomic SELEX are similar to those in randomized aptamer-based
selection.
A critical point for this genomic approach is the loss of weak but biologically
significant binders during selection. Consequently, the stringency of the selection conditions
should be decreased, especially in the first rounds of SELEX, which boosts the diversity of
RNA targets in comparison to selecting only a few winners. Another source of interference in such
experiments is the possible formation of secondary structures between the central part and
the adaptor sequences in a library, which could lead to nonspecific selection of targets. An
approach to handle this issue was suggested by Wen and Gray [10], who developed a
primer-free Genomic SELEX. Finally, Genomic SELEX, like any other SELEX procedure,
remains an in vitro technique, and results of the selection procedure have to be confirmed by
other in vitro methods such as a gel shift assay and by experiments that show binding and activity
in vivo (CLIP or ChIP-chip technology) [1].
Here we present our variant of Genomic SELEX, which was developed for an
Arabidopsis thaliana (as well as an S. pombe) RNA library and in which we tried to avoid possible
problems inherent in SELEX techniques. This protocol can be performed in two
months and results in a highly specific RNA pool enriched in target sequences for the
protein of interest. These RNAs are suitable for further sequencing and bioinformatic
analysis.
3. Protocol
Protocol 1: Genomic library construction
3.1 DNA preparation
3.1.1 Genomic DNA isolation
Genomic DNA was isolated from 2-week-old seedlings of wild-type Arabidopsis
thaliana (Columbia) using the DNeasy Plant Mini Kit (Qiagen) by following the manufacturer's
instructions. About 2 g of plant tissue ground in liquid nitrogen is optimal for obtaining 30 µg
of pure genomic DNA. DNA should be in TE buffer (10 mM Tris/HCl pH 8.0, 1 mM EDTA);
its concentration should be quantified by measuring OD260nm, and its purity should be checked
on a 1% agarose gel. If the DNA is still contaminated with RNA, an additional RNase treatment
following a standard protocol is required.
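The OD260 quantification mentioned above can be turned into a small helper. This sketch assumes the standard spectrophotometric conversion for double-stranded DNA (an absorbance of 1.0 at 260 nm in a 1 cm cuvette corresponds to about 50 µg/ml); the function names and the example reading are hypothetical and not taken from the protocol.

```python
# Convert an OD260 reading into a dsDNA concentration and a pipetting volume.
# Assumption: 1.0 A260 unit (1 cm path length) ~ 50 µg/ml double-stranded DNA.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_conc_ug_per_ml(a260, dilution_factor=1.0):
    """Concentration of the undiluted sample in µg/ml."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def volume_for_mass_ul(mass_ug, conc_ug_per_ml):
    """Volume in µl that contains the requested mass of DNA."""
    return mass_ug / conc_ug_per_ml * 1000.0


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.30 measured on a 1:20 dilution.
    conc = dsdna_conc_ug_per_ml(0.30, dilution_factor=20)
    print(f"concentration: {conc:.0f} µg/ml")
    print(f"volume holding 30 µg: {volume_for_mass_ul(30, conc):.0f} µl")
```

RNA contamination also absorbs at 260 nm and inflates the estimate, which is one reason the gel check and the optional RNase treatment matter before the amount of DNA is trusted.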
3.1.2. DNA fragmentation
30 µg of pure genomic DNA was placed in a 13 ml round-bottom Falcon tube and
fragmented by ultrasound treatment using a Bandelin Sonoplus UW2070 device with an MS73
microtip. DNA was sonicated 8 times with 10 pulses for 10 seconds at 70% power.
Sonication should produce fragments from 100 bp to 4 kb in length, which should be checked
by agarose gel electrophoresis and compared with unsheared control DNA (Figure 1A). The
sheared DNA was then precipitated overnight in the presence of 1/10 volume of 3 M NaOAc pH 5.4
and 3 volumes of absolute EtOH. The pellet was resuspended in 100 μl of TE buffer.
3.2 Library development
3.2.1 Primer design
Two pairs of primers were used in the protocol. The first pair was needed for the
introduction of adaptor sequences to the sheared DNA. The second pair introduced the T7
promoter sequence for creating the RNA library. The adaptors should meet several
requirements: 1. They should be able to prime at any sequence in the genome, using randomized
8 nucleotides. 2. They should have a fixed sequence for further manipulation. 3. The sequences of
the primers should not be present in the genome.
Based on the features above, the forward and reverse primers for the Klenow reaction were:
Fran: AGGGGAATTCGGAGCGGGGCAGCNNNNNNNNN
Rran: CGGGATCCTCGGGGCTGGGATGNNNNNNNNN
The second pair of primers should be complementary to the fixed part of the first primer pair.
In addition, the forward primer must have a T7 promoter sequence for the in vitro
transcription.
Fclcf: CCAAGTAATACGACTCACTATAGGGGAATTCGGAGCGGG
Rclcr: CGGGATCCTCGGGGCTG
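The relationship between the two primer pairs can be checked in silico before ordering oligonucleotides. The sketch below uses the four sequences listed above; the helper names are hypothetical, and the T7 promoter string is the commonly used minimal promoter sequence, which is an assumption and is not spelled out in the text.

```python
# Consistency check for the two primer pairs (sequences copied from the text).
FRAN = "AGGGGAATTCGGAGCGGGGCAGCNNNNNNNNN"
RRAN = "CGGGATCCTCGGGGCTGGGATGNNNNNNNNN"
FCLCF = "CCAAGTAATACGACTCACTATAGGGGAATTCGGAGCGGG"
RCLCR = "CGGGATCCTCGGGGCTG"
# Assumption: widely used minimal T7 promoter sequence (not given in the text).
T7_PROMOTER = "TAATACGACTCACTATAG"


def fixed_part(primer):
    """Primer without its randomized 3' stretch of N positions."""
    return primer.rstrip("N")


def random_length(primer):
    """Number of randomized (N) positions at the 3' end."""
    return len(primer) - len(fixed_part(primer))


def suffix_prefix_overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


if __name__ == "__main__":
    print("T7 promoter in Fclcf:", T7_PROMOTER in FCLCF)
    print("Fclcf 3' overlap with fixed part of Fran:",
          suffix_prefix_overlap(FCLCF, fixed_part(FRAN)), "nt")
    print("Rclcr is a prefix of Rran:", fixed_part(RRAN).startswith(RCLCR))
    print("randomized positions (Fran, Rran):",
          random_length(FRAN), random_length(RRAN))
```

The same helpers can be reused when the adaptors are redesigned for another genome, for example to confirm that a new second-pair primer still anneals to the fixed part of the first pair.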
3.2.2 Primer labeling
In order to visualize the incorporation of the randomized primer-adaptors into the
genomic DNA, a portion of the adaptors was radioactively labeled at the 5’ end with [γ-32P]ATP and
T4 polynucleotide kinase using standard protocols. Unincorporated nucleotides were
removed with a G-50 column from GE Healthcare following the manufacturer’s instructions.
3.2.3 Introducing adaptor sequences to the genomic DNA by Klenow reaction
The starting material was about 25 µg of sheared and purified DNA (3.1.2.) at a
concentration of 1 mM. The concentration of the primers should be sufficient to allow annealing
once every 40 nucleotides as described in [8]. To monitor the reaction, a portion of the
forward and of the reverse primer (containing the randomized sequence) was kinased. Then
the reaction mixture was split into two tubes to control the introduction of the forward and
reverse primer separately. The DNA solution was mixed with 255 μM primer Rran to a final
concentration of 12 μM and one tube was supplemented with 2 µM radioactive Rran. Both
tubes were incubated for 3 min at 93 °C, then placed on ice and further treated in parallel.
After addition of 10x Klenow buffer and deoxyribonucleotides to a final concentration of 1
mM, the reaction was started with 67 U of Klenow exo-minus enzyme (Fermentas) and
incubated for 5 min on ice. Incubation was continued for 25 min at room temperature and then for 5 min at
50 °C. The reaction was inactivated by adding EDTA (final conc. 15 mM) and heating for 10
min at 75 °C. Low molecular weight substances were removed from the reaction mixture
with YM-30 columns (Millipore). At this point the efficiency of incorporation could be monitored
in the radioactive sample on a denaturing 8% polyacrylamide gel with 7 M urea (Figure 1B).
The same protocol was applied for the forward primer reaction, and now the radioactive Fran
primer was added to the non-radioactive sample. Note that the specific activity of the
primers was low, so no high incorporation of radioactivity was expected. Typically, however, a
smear of labeled DNA above the radioactive primer should be visible (Figure 1B).
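Two pieces of arithmetic from this step can be sketched as follows: the volume of 255 µM primer stock needed for a 12 µM final concentration, and a rough estimate of how many priming sites correspond to "once every 40 nucleotides" for 25 µg of DNA. The reaction volume in the example and the average nucleotide mass of 330 g/mol are assumptions, not values given in the protocol.

```python
# Back-of-the-envelope helpers for the Klenow reaction (illustrative only).
AVG_DNA_NT_MASS_G_PER_MOL = 330.0  # assumed average mass of one DNA nucleotide


def stock_volume_ul(stock_um, final_um, final_volume_ul):
    """Volume of stock to add, from C1 * V1 = C2 * V2."""
    return final_um * final_volume_ul / stock_um


def priming_sites_nmol(dna_ug, spacing_nt=40):
    """nmol of primer needed to anneal once every `spacing_nt` nucleotides."""
    nucleotides_nmol = dna_ug * 1e-6 / AVG_DNA_NT_MASS_G_PER_MOL * 1e9
    return nucleotides_nmol / spacing_nt


if __name__ == "__main__":
    # Hypothetical final reaction volume of 100 µl.
    v = stock_volume_ul(stock_um=255, final_um=12, final_volume_ul=100)
    print(f"255 µM Rran stock for 12 µM in 100 µl: {v:.2f} µl")
    print(f"priming sites in 25 µg DNA: {priming_sites_nmol(25):.2f} nmol")
```

The site estimate treats the denatured DNA as a single pool of nucleotides and ignores strand and sequence effects, so it only indicates the order of magnitude of primer required.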
3.2.4. Gel-purification and size selection of DNA fragments
At this point the DNA has to be separated from unincorporated primers. The two
samples from the previous steps were combined and fractionated for about 2 h at 100 V on a
preparative 8% denaturing polyacrylamide (7 M urea) gel with labeled size markers.
DNA of 100 to 700 bp was then excised from the gel; the gel slice was cut into small pieces,
frozen at -80 °C for 15 min, and then eluted with buffer (10 mM Tris-HCl pH 8.0, 2 mM EDTA
pH 8.0, 0.3 M NaOAc pH 5.4). The mixture was heated for 5 min at 95 °C and left overnight
at 25 °C with shaking (900 rpm). Alternatively, shaking could be performed at 65 °C for 3 hours.
Next, the gel mixture was filtered through a 0.22 µm nitrocellulose filter (Millipore) and
the DNA was precipitated with 2 volumes of EtOH for at least 3 hours.
3.2.5. Introduction of the T7 promoter
For the introduction of the T7 promoter sequence, the second pair of primers (Fclcf and
Rclcr) was used in a PCR reaction. The number of PCR cycles should not exceed 10
to avoid artificial byproducts. A proof-reading DNA polymerase (we
usually use Phusion polymerase from Finnzymes) should be used to prevent mutations.
Typically, PCR was performed with 40 s denaturation at 95 °C, 40 s annealing at 55 °C (3 °C
below the Tm of the primer) and 20 s elongation at 72 °C. The PCR reaction was cleaned up by
phenol/chloroform extraction followed by a PCR clean-up kit which has a low cut-off, so that
small DNA fragments are retained in the library (e.g. NucleoSpin Extract II, Macherey-Nagel).
3.2.6. Library cleaning and verification
At this point the DNA library was almost ready, but could still contain some undesirable
products, such as fragments carrying the same primer (forward or reverse) at both ends.
To get rid of such products, the library was subjected to an in vitro transcription reaction
followed by reverse transcription and PCR (RT-PCR).
In order to produce a large amount of RNA (usually around 50 µg) from a limited amount of
DNA, a high-yield transcription kit (e.g. from Fermentas) is strongly recommended.
Transcribed RNA was extracted with phenol/chloroform and precipitated with 2 volumes of
EtOH overnight. For precise quantification of the RNA, it was crucial to purify the RNA with
a clean-up kit (e.g. MEGAclear RNA clean-up kit from Ambion). After that, the quality of the
RNA was checked by agarose gel electrophoresis.
For reverse transcription of the RNA, a one-step RT-PCR kit from Qiagen was used,
because it contains two types of reverse transcriptase, which allows reverse transcription
of low- and high-abundance transcripts from the mixture. When using this kit, the number of
PCR cycles should be kept to 7-9 to decrease the formation of nonspecific products. After
checking the concentration of the DNA library, it could be further amplified using a
proof-reading polymerase. The DNA library is now ready for the selection procedure and can
be stored at -20 °C for at least 6 months or for a longer period at -80 °C.
It is important to check the quality and comprehensiveness of the library by choosing 5-15
primer pairs for single-copy genes and amplifying a product of appropriate size from the library by
standard PCR [7].
Protocol 2: Affinity selection of RNA targets
3.3. In vitro transcription
In each cycle of the SELEX procedure, the same amount of starting DNA from the
previous cycle was used. Optimally, about 1 µg of DNA should be utilized as starting material
for in vitro transcription. Transcription should be performed as described in section 3.2.6.
3.4. Selection of RNAs bound to the protein of choice
The protein of interest was usually a recombinant protein purified from a prokaryotic
or eukaryotic protein expression system. In our case, a recombinant GST-tagged protein (the
RRM and Zn-knuckle domain from atCyp59 [11]) expressed in E. coli was used for selection.
The protein should be dialyzed against a 1000-fold excess of binding buffer.
There were several important points to take care of in this step. First, 10 µg of
clean RNA (without remaining DNA and nucleotides) was used in each cycle. Second, each RNA-protein
complex behaves differently, meaning that in each case the binding conditions
(such as buffer, working pH and binding temperature) should be determined empirically. If
nothing is known about the protein of interest or a similar protein, PBS buffer (0.135 M
NaCl, 27 mM KCl, 8 mM Na2HPO4, 2 mM NaH2PO4) with 10 mM Mg2+ could be used. Third,
siliconized Eppendorf tubes are recommended for binding and washing to prevent loss of
glutathione beads.
Before the binding reaction of the protein to the glutathione beads was performed, an aliquot of
40 µl of 50% Glutathione Sepharose 4B (which is sufficient to bind up to 8 µg protein) was
blocked with 0.5 ml of tRNA (200 µg/ml; Sigma) for 30 min at 4 °C with slow rotation
and then washed three times with 10 volumes of binding buffer (1x PBS with 10 mM MgCl2). In
parallel, the protein-RNA binding reaction was performed in solution (100 µl reaction mixture
containing 10 µg of RNA in binding buffer, supplemented with protease inhibitor cocktail
(Roche) and RNase inhibitor (Invitrogen)). This mixture was heated for 5 min at 70 °C and
then left for 10 min at 25 °C (refolding process). Then the dialyzed protein was added at a 3:1
molar excess of RNA over protein in the first 3 cycles. In later cycles the stringency was increased
by using a ratio of 10:1. The binding reaction was incubated for 30 min
at 4 °C (or another temperature appropriate for the protein of interest). Then the blocked and
washed glutathione beads were added to the binding reaction and incubated for another 30 min.
The beads were washed 3 times with 10 volumes of binding buffer and then eluted
twice with 100 µl of reduced glutathione elution buffer according to the manufacturer's
instructions (GE Healthcare). For the control (anti-GST) selection, purified
GST-tag protein at the 10:1 molar ratio was used instead of the protein of interest. To release
RNA from the RNA-protein complex, 400 µl of FES buffer (20 mM citrate buffer pH 5.0, 7 M urea,
1 mM EDTA pH 8.0) and 400 µl of phenol pH 6.0 were added and the mixture was vigorously shaken for 10
min at 900 rpm. Then 200 µl of H2O was added and the mixture was extracted with an equal
volume of phenol/chloroform/isoamyl alcohol (25:24:1). The RNA was precipitated overnight
by adding 40 μg of glycogen, 1/10 volume of 3 M NaOAc pH 5.4, and 2 volumes of EtOH.
The precipitate was dissolved in 20 µl of water, cleaned of residual phenol with the
MEGAclear RNA clean-up kit, and the yield of the selected RNA was then measured.
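The molar ratios used in the binding reaction (3:1 and 10:1 RNA over protein) translate into protein amounts as sketched below. The average RNA length, the protein molecular weight and the average ribonucleotide mass of 340 g/mol are assumptions chosen for illustration; they should be replaced with values for the actual library and protein.

```python
# Amount of protein needed for a given RNA:protein molar ratio (illustrative).
AVG_RNA_NT_MASS_G_PER_MOL = 340.0  # assumed average mass of one ribonucleotide


def rna_pmol(mass_ug, length_nt):
    """pmol of RNA molecules of the given average length."""
    grams_per_mol = length_nt * AVG_RNA_NT_MASS_G_PER_MOL
    return mass_ug * 1e-6 / grams_per_mol * 1e12


def protein_ug_for_ratio(rna_ug, rna_length_nt, rna_to_protein_ratio, protein_kda):
    """µg of protein giving the requested molar excess of RNA over protein."""
    protein_pmol = rna_pmol(rna_ug, rna_length_nt) / rna_to_protein_ratio
    return protein_pmol * 1e-12 * protein_kda * 1000.0 * 1e6


if __name__ == "__main__":
    # Hypothetical values: 250 nt average RNA length, 50 kDa fusion protein.
    print(f"10 µg RNA = {rna_pmol(10, 250):.1f} pmol")
    for ratio in (3, 10):
        ug = protein_ug_for_ratio(10, 250, ratio, protein_kda=50)
        print(f"{ratio}:1 RNA over protein -> {ug:.2f} µg protein")
```

With these example values the protein amounts stay below the stated bead capacity of up to 8 µg per 40 µl aliquot, but this should be rechecked for the real protein size.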
3.5. RT-PCR, amplification.
In each cycle the selected RNA was reverse transcribed as described in section 3.2.6,
and then the cycle of in vitro transcription and selection was repeated up to 7-12 times. The
number of cycles depends on the affinity for the protein and the diversity of the RNA targets.
Generally, selection may be stopped when half of the maximum possible fraction of the RNA pool
binds to the protein. For example, with a 10:1 excess of RNA over protein in the binding reaction,
at most 10% of the RNA can be bound (assuming one RNA per protein), so half of the
possible yield is 5% of the RNA binding to the protein (Figure 2).
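The stopping rule can be written as a short calculation. This is a sketch under the assumption that each protein molecule binds at most one RNA molecule, so the maximum bound fraction is the inverse of the RNA:protein ratio; the function names and the example recoveries are hypothetical.

```python
# Stop criterion for the selection cycles (assumes 1:1 RNA:protein binding).

def max_bound_fraction(rna_to_protein_ratio):
    """Largest fraction of the RNA pool that the protein can bind."""
    return min(1.0, 1.0 / rna_to_protein_ratio)


def stop_threshold(rna_to_protein_ratio):
    """Half of the maximum possible bound fraction."""
    return 0.5 * max_bound_fraction(rna_to_protein_ratio)


def should_stop(recovered_rna_ug, input_rna_ug, rna_to_protein_ratio):
    """True once the recovered fraction reaches the threshold."""
    recovered_fraction = recovered_rna_ug / input_rna_ug
    return recovered_fraction >= stop_threshold(rna_to_protein_ratio)


if __name__ == "__main__":
    print(f"threshold at 10:1: {stop_threshold(10):.0%}")
    # Hypothetical recoveries from 10 µg of input RNA.
    for recovered in (0.1, 0.3, 0.6):
        print(f"recovered {recovered} µg -> stop: {should_stop(recovered, 10, 10)}")
```

The recovered fraction is simply the yield measured at the end of section 3.4 divided by the RNA input of the same cycle.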
4. Example from an experiment
Figure 1B shows the size distribution of the radioactively labeled DNA library on an 8%
denaturing polyacrylamide gel after the ligation of the first pair of primer-adaptors.
Figure 2 shows a typical example of RNA recovery after each cycle in a SELEX experiment
using the RRM and Zn-knuckle domain of atCyp59 [11].
Table 1 shows a statistical analysis of the sequenced RNA pool after the 11th cycle, enriched
with binding targets for the AtCyp59 protein after a SELEX experiment.
5. Troubleshooting
Problem: No PCR product for a particular gene during the library verification step.
Reason + solution: Library is not representative. Check the purity of genomic DNA and try to increase the amount of the first pair of primers during the Klenow extension reaction.
Reason + solution: Primers chosen amplify a longer gene product than the average size of the DNA library. Choose primers to amplify a smaller fragment of the gene.

Problem: No RNA recovered from the selection step.
Reason + solution: Binding conditions are not appropriate. Try to optimize buffer, pH, temperature, incubation time, amount of MgCl2.
Reason + solution: Complex was not eluted from the beads. Ensure that glutathione buffer was stored in proper conditions. Use only fresh aliquots.

Problem: Yield of RNA over cycles doesn’t increase.
Reason + solution: RNA-protein complex is very specific to one RNA. Decrease stringency of the selection.

Problem: Size of DNA library shifts to longer fragments than in the original library.
Reason + solution: Too many cycles of PCR during selection. Decrease number of cycles to a minimum of 7-10 or avoid second amplification step after RT-PCR.
6. References
1. Djordjevic, M. (2007). SELEX experiments: new prospects, application and data analysis
in inferring regulatory pathways. Biomol. Engineering 24, 179-189.
2. Gold, L. (1995). Oligonucleotides as research, diagnostic, and therapeutic agents. J. Biol.
Chem. 270(23), 13581-13584.
3. Tuerk, C., Gold, L. (1990). Systematic evolution of ligands by exponential enrichment:
RNA ligands to bacteriophage T4 DNA polymerase. Science 249, 505-510.
4. Schneider, D., Gold, L., Platt, T. (1993). Selective enrichment of RNA species for tight
binding to Escherichia coli Rho factor. FASEB J. 7, 201-207.
5. Bunka, D.H., Stockley, P.G. (2006). Aptamers come of age - at last. Nat. Rev. Microbiol.
4(8), 588-596.
6. Gopinath, S.C. (2007). Methods developed for SELEX. Anal. Bioanal. Chem. 387, 171-182.
7. Singer, B.S., Shtatland, T., Brown, D., Gold, L. (1997). Libraries for genomic SELEX.
Nucleic Acids Research 25(4), 781-786.
8. Lorenz, C., Pelchrzim, F., Schroeder, R. (2006). Genomic systematic evolution of
ligands by exponential enrichment (Genomic SELEX) for the identification of protein-binding
RNAs independent of their expression levels. Nat. Protocols 1(5), 2204-2212.
9. Kim, S., Shi, H., Lee, D., Lis, J.T. (2003). Specific SR protein-dependent splicing
substrates identified through genomic SELEX. Nucleic Acids Research 31(7), 1955-1961.
10. Wen, J.-D., Gray, D.M. (2004). Selection of genomic sequences that bind tightly to Ff
gene 5 protein: primer-free genomic SELEX. Nucleic Acids Research 32(22), 182-192.
11. Gullerova, M., Barta, A., Lorkovic, Z. (2006). AtCyp59 is a multidomain cyclophilin from
Arabidopsis thaliana that interacts with SR proteins and the C-terminal domain of the RNA
polymerase II. RNA 12(4), 631-643.
Acknowledgements:
The authors are grateful to M. Kalyna for fruitful discussions. This work was funded by the
EU FP6 Programme Network of Excellence on Alternative Splicing (EURASNET) [LSHG-CT-2005-518238];
the Austrian Science Foundation (FWF: SFB-F017/10/11; DK W1207, RNA
Biology) and the Austrian GEN-AU program (ncRNAs).
Figure legends
Figure 1. Library development
A. Ethidium bromide stained 1.2% agarose gel showing the size distribution of genomic
Arabidopsis thaliana DNA after isolation (lane 1) and after fragmentation via ultrasound (lane
2). Lane M1 is a lambda HindIII marker, lane M2 is a 100 bp DNA size ladder (Fermentas).
B. Example of an autoradiogram showing the size distribution of the fragmented DNA after
ligation of primer-adaptors. Lanes 2 and 4 are control lanes showing radioactively labeled
reverse (Rran) and forward (Fran) primers, respectively. Lanes 1 and 3 are DNA from the first
and second Klenow extension reaction, respectively. Lane M is a φX174 DNA/HinfI labeled
size marker (Promega).
Figure 2. Recovery of RNA during the selection cycles with atCyp59 as bait
Example of a typical Genomic SELEX experiment showing increasing recovery of selected
RNAs with increasing numbers of selection cycles. The first 4 cycles are carried out in
relaxed selection conditions with a molar excess of RNA over protein of 3:1. These
conditions allow selection of ligands with low affinity. Further cycles of selection are
performed with excess RNA over protein of 10:1. In the penultimate step, a counter selection
step is included with GST-tag protein alone to assure exclusion of binders to the GST-tag
from the pool.
Overview: Scheme of a Genomic SELEX experiment
Outcome:
RNA pool enriched in RNA targets for the protein of interest, ready for sequencing
Questions to be answered:
What are the direct RNA binding partners for your protein of interest?
What are the RNA binding motifs of the protein?