Supplemental Information
for
Regulation of alternative splicing in Drosophila by 56 RNA binding proteins
Angela N. Brooks, Michael O. Duff, Gemma May, Li Yang, Mohan Bolisetty,
Jane Landolin, Ken Wan, Jeremy Sandler, Susan E. Celniker,
Brenton R. Graveley*, Steven E. Brenner*
*Corresponding Authors
Brooks et al.
Supplemental Figure 1: RT-PCR validations of RNAi knockdowns. a. RT-PCR was
performed on all samples using primers to RP49 as a control for RNA quantities. b. The
efficiency of depletion was monitored by RT-PCR of the target gene in both replicates of
the RNAi samples and compared to the levels in both replicates of the untreated control
samples.
Supplemental Figure 2: Correlation between JuncBASE PSI and a. RT-PCR PSI
or b. Bradley et al. PSI. Scatter plots of the PSI as calculated by JuncBASE from
the RNA-seq data against the PSI a. from the RT-PCR reactions or b. reported in Bradley et al. 2015.
Values for splicing events significantly affected by the knockdown of one or more
proteins as determined by JuncBASE are shown. A best-fit regression line is shown with
R² based on the Spearman correlation.
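As an illustration of the statistic named in this legend, the following minimal Python sketch computes a Spearman rank correlation (the Pearson correlation of tie-averaged ranks) and squares it. It is not the code used to produce the figure; the function names and the example PSI values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical PSI values for four splicing events (not data from the study).
juncbase_psi = [12.0, 35.5, 60.0, 88.0]
rtpcr_psi = [15.0, 30.0, 71.0, 80.0]
rho = spearman_rho(juncbase_psi, rtpcr_psi)
r_squared = rho ** 2
```

Because only ranks enter the calculation, the statistic measures monotonic agreement between the two PSI estimates and does not depend on the slope of the regression line.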
[Supplemental Figure 3 image: heatmap titled "RBP Knockdown Efficiency". Rows: RBP Gene; columns: RNAi Sample (gene labels run from tra2 to Hrb98DE in the same order on both axes); color scale: RNAi Efficiency, 0.0 to 1.0.]
Supplemental Figure 3: Depletion efficiency/specificity of each RNA Binding
Protein Sample. The FPKM of all 56 RNA Binding Protein genes was calculated in
each sample and the depletion efficiency calculated as
(FPKM_untreated - FPKM_knockdown) / FPKM_untreated. The results are plotted as a heatmap with the RNAi samples in columns
and the RNA Binding Protein genes in rows. The bright diagonal indicates efficient
depletion of the target gene.
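The efficiency formula in this legend can be sketched in Python as follows. This is a minimal illustration and not the authors' pipeline; the function names and the FPKM values in the example are hypothetical.

```python
def depletion_efficiency(fpkm_untreated, fpkm_knockdown):
    """Fraction of expression lost in the knockdown relative to untreated cells."""
    if fpkm_untreated <= 0:
        raise ValueError("untreated FPKM must be positive")
    return (fpkm_untreated - fpkm_knockdown) / fpkm_untreated


def efficiency_matrix(untreated, knockdowns):
    """Build the heatmap values: rows are genes, columns are RNAi samples.

    untreated:  dict gene -> FPKM in the untreated control
    knockdowns: dict sample -> (dict gene -> FPKM in that RNAi sample)
    """
    return {
        gene: {
            sample: depletion_efficiency(untreated[gene], fpkm[gene])
            for sample, fpkm in knockdowns.items()
        }
        for gene in untreated
    }


# Hypothetical FPKM values for two genes and two RNAi samples.
untreated = {"geneA": 40.0, "geneB": 20.0}
knockdowns = {
    "geneA_RNAi": {"geneA": 10.0, "geneB": 20.0},
    "geneB_RNAi": {"geneA": 40.0, "geneB": 5.0},
}
matrix = efficiency_matrix(untreated, knockdowns)
```

In this toy example the on-target entries (geneA in geneA_RNAi, geneB in geneB_RNAi) are 0.75 and the off-target entries are 0.0, which corresponds to the bright diagonal described above. A value below zero would indicate that the gene is more highly expressed in the RNAi sample than in untreated cells.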
[Supplemental Figure 4 image: for each RNAi sample, grouped by class (SR, hnRNP, Core, EJC, Other, Novel), the number of affected genes (axis 0 to 1000) and the magnitude of the changes as log2(RNAi FPKM/Untreated FPKM) (scale -4 to 4).]
Supplemental Figure 4: Significant changes in gene expression observed upon
knockdown of 56 proteins. The fold change in FPKM levels of all genes was calculated in
each RNAi sample in comparison to the untreated control sample. The number of genes
that were significantly affected and the magnitude of those changes are plotted for each
sample.
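The magnitude plotted in this figure is a log2 ratio of RNAi FPKM to untreated FPKM. A minimal Python sketch of that ratio is given below. The function name and the optional pseudocount are assumptions added for illustration, and the significance test used to select affected genes is not described in this legend and is not reproduced here.

```python
import math


def log2_fold_change(fpkm_rnai, fpkm_untreated, pseudocount=0.0):
    """log2(RNAi FPKM / untreated FPKM).

    pseudocount is a hypothetical convenience for genes with zero FPKM;
    the default of 0.0 gives the plain ratio.
    """
    numerator = fpkm_rnai + pseudocount
    denominator = fpkm_untreated + pseudocount
    if numerator <= 0 or denominator <= 0:
        raise ValueError("FPKM values (plus pseudocount) must be positive")
    return math.log2(numerator / denominator)


# Hypothetical values: a four-fold decrease and an eight-fold increase.
down = log2_fold_change(5.0, 20.0)
up = log2_fold_change(8.0, 1.0)
```

Negative values correspond to genes whose expression decreases in the knockdown, and positive values to genes whose expression increases.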
Supplemental Figure 5: Comparison of the number of gene expression and
splicing events affected in each sample. For each protein, the number of gene
expression and splicing events that were significantly affected was calculated and
plotted as the rank order.
Supplemental Figure 6: Cross regulation of gene expression between the 56 RNA
binding proteins. The fold change in FPKM levels of each of the 56 RNA binding
proteins tested was calculated in each RNAi sample in comparison to the untreated
control sample and plotted as a heatmap. The fold change of the target gene itself is
not shown; those cells are colored grey.
Supplemental Figure 7: Specific and shared effects by 56 proteins. For each class
of splicing event, the number of events that were affected by each number of proteins is
plotted. Most events are only affected by one protein, but many are affected by more
than one.
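The tally described in this legend can be sketched as follows, assuming the significantly affected events are available as one set of event identifiers per protein. The function name, the data layout, and the identifiers are hypothetical; to reproduce the per-class breakdown, the same function would be applied to the events of each class separately.

```python
from collections import Counter


def events_by_protein_count(affected):
    """Map 'number of proteins affecting an event' -> 'number of such events'.

    affected: dict protein -> iterable of event IDs significantly affected
              by the knockdown of that protein.
    """
    proteins_per_event = Counter()
    for events in affected.values():
        # set() guards against an event being listed twice for one protein.
        proteins_per_event.update(set(events))
    return Counter(proteins_per_event.values())


# Hypothetical example: event e2 is affected by three proteins.
affected = {
    "proteinA": {"e1", "e2"},
    "proteinB": {"e2", "e3"},
    "proteinC": {"e2"},
}
histogram = events_by_protein_count(affected)
```

Here `histogram[1]` counts the events specific to a single protein and higher keys count the shared events.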
Supplemental Figure 8: Schematic depiction of the various classes of alternative
splicing events. The various classes of alternative splicing events are depicted.
Supplemental Tables
Supplemental Table 1: RNA Binding Proteins Studied. The RNA binding proteins
analyzed in this study are listed. The gene names (CG number and synonyms), domain
types, category, and number of uniquely aligned reads are indicated. Additionally, the
ranges of correlation of exon coverage from RNA sequencing lanes between biological
replicates are shown. The categories are not mutually exclusive. Core: core component
of the spliceosome. EJC: exon junction complex or nonsense-mediated decay. Prior, No
Prior: prior or no prior evidence for a role in splicing regulation.
Supplemental Table 2: Primers used for RT-PCR Validation of RNAi Depletion
Supplemental Table 3: Annotation and PSI values for all splicing events observed
from RNA-seq
Supplemental Table 4: Annotation of all splicing events considered significantly
affected by at least one of the 56 proteins. Percent spliced in (PSI) values are given
only for samples with differential splicing from the reference; otherwise “NA” is given.
Supplemental Table 5: Significant changes in gene expression upon knockdown
of 56 proteins.
Supplemental Methods
Identifying D. melanogaster proteins with an RRM or KH domain
All D. melanogaster protein sequences were obtained from UniProt (Jain et al. 2009).
Each sequence was searched against the Pfam database (Finn et al. 2010) using
hmmpfam (hmmer.org) for the presence of the Pfam domains RRM_1, RRM_2, RRM_3,
KH_1, KH_2 with the default cutoff score. To find additional proteins that may have
been missed by Pfam, each sequence was also compared against SMART (Letunic et
al. 2009) domains RRM, RRM_1, and KH and against the Prosite (Hulo et al. 2006)
domains RRM (PS50102), KH_TYPE_1 (PS50084), and KH_TYPE_2 (PS50823). The
SMART HMMs were scanned with hmmsearch (hmmer.org) and the Prosite domains with
the local version of ScanProsite, ps_scan.pl (Gattiker et al. 2002).
Determining expression of putative splicing regulator genes from Affymetrix tiling
array data
Before selecting target genes, we checked for their expression in S2-DRSC cells. S2-DRSC mRNA expression data was available from 38 bp Affymetrix tiling arrays
(Cherbas et al. 2011). Transcribed fragments (transfrags) from the data were selected
using a bandwidth of 0, maxgap 90, and minrun 50. Genes with at least 10% transfrag
coverage were considered expressed. For genes not passing the transfrag coverage
cutoff, probe intensities were reviewed manually and additional genes were called
expressed.
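The 10% coverage cutoff can be illustrated with a short Python sketch. It assumes that coverage is the fraction of the gene span overlapped by transfrags and that intervals are half-open (start inclusive, end exclusive); neither detail is stated above. The function names and coordinates are hypothetical, and the transfrag-calling step itself (bandwidth, maxgap, minrun) is not reproduced.

```python
def covered_fraction(gene_start, gene_end, transfrags):
    """Fraction of [gene_start, gene_end) overlapped by transfrag intervals."""
    length = gene_end - gene_start
    if length <= 0:
        raise ValueError("gene_end must be greater than gene_start")
    # Clip each transfrag to the gene and drop those that do not overlap it.
    clipped = sorted(
        (max(start, gene_start), min(end, gene_end))
        for start, end in transfrags
        if end > gene_start and start < gene_end
    )
    covered = 0
    cur_start = cur_end = None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            # Close the previous merged block and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent transfrag: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / length


def is_expressed(gene_start, gene_end, transfrags, min_coverage=0.10):
    """Apply the coverage cutoff (10% by default)."""
    return covered_fraction(gene_start, gene_end, transfrags) >= min_coverage


# Hypothetical gene of 1,000 bp with three transfrags, two of them overlapping.
fraction = covered_fraction(100, 1100, [(50, 150), (140, 200), (1000, 2000)])
```

Overlapping transfrags are merged before summing so that no base is counted twice. Genes that fail this automated test would still need the manual review of probe intensities described above.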
RNAi depletion
RNA interference was performed essentially as described previously (Brooks et al.
2011). Vectors encoding double-stranded RNAs for the target mRNAs were generated
as described previously (Park et al. 2004; Park and Graveley 2005). Briefly, cDNA
fragments encoding the specific dsRNA were amplified by RT-PCR with gene-specific
primers from total RNA isolated from S2-DRSC cells, cloned into the pCRII-TOPO
vector (Invitrogen), and sequenced to verify the identities of the inserts. DNA templates
were amplified with M13 forward and M13 reverse primers and the PCR products were
used in individual in vitro transcription reactions with the AmpliScribe SP6 and T7 High Yield
Transcription kits (Epicentre) to generate the sense and
antisense RNA strands. After DNase I digestion, the single-stranded RNAs were
annealed to generate dsRNAs. The integrity of the PCR products, the single-stranded RNA
transcripts, and the dsRNAs was monitored by agarose gel electrophoresis.
S2-DRSC cells (obtained from the Drosophila Genomics Resource Center at Indiana
University) were cultured in Schneider’s medium (Sigma/Aldrich) plus 10% heat-inactivated fetal calf serum (FCS) (HyClone) at 27°C. One day prior to dsRNA treatment,
cells were split into six-well culture dishes at a density of 1 × 10^6 cells/mL. Immediately
prior to the addition of dsRNA, the culture medium was replaced with fresh Schneider’s
medium without FCS, followed by the addition of 20 µg of each dsRNA directly into the
FCS-free medium and the cells incubated for 5 h at 27°C. After incubation with the
dsRNA, 10% FCS was added back to the cell culture. After 2 d, a second dose of 20 µg of
dsRNA was added to each well in the same manner as described above and the cells
incubated for two additional days after the re-addition of 10% FCS. After the dsRNA
treatment, total RNA was isolated using TRIzol reagent (Invitrogen) according to the
manufacturer’s directions. Parallel dsRNA treatments and total RNA preparations were
performed independently for each replicate. Untreated S2-DRSC cells were used as a
reference. To monitor the level of mRNA depletion, primer sets (Supplemental Table
2) that amplify regions of the targeted mRNAs outside of the dsRNA region were used
for RT-PCR amplification, and compared with the results from the untreated cells
(Supplemental Figure 1).
Sequences of dsRNAs used for RNA Interference
>glo
GTGAAGCTTCGTGGTCTGCCATATGCCGTCACTGAGCAGCAAATCGAGGAGTTCTTCTCTGGGTTGGATATCAA
AACGGATCGGGAGGGCATACTTTTTGTTATGGACAGAAGGGGTCGTGCAACTGGGGAAGCTTTTGTTCAGTTCG
AAAGCCAGGACGACACTGAGCAAGCCTTGGGCCGAAATCGGGAAAAAATTGGGCACAGGTATATTGAGATATTC
CGCAGCTCGATTGCTGAAATGAAGAGGGCCACAGGCGCCGGTGGCGGTGTCGGAGGACGCCCTGGCCCTTATGA
CATACGTGATCGTGGTGC
>rump
ATACGACTACCGTTGGCAGGATCTGAAGGATCTGTTCCGCCGCATCGTCGGCTCCATTGAGTACGTCCAGCTGT
TCTTCGATGAGAGCGGCAAGGCTCGCGGCTGTGGCATCGTAGAGTTCAAGGATCCGGAGAACGTACAGAAGGCC
TTGGAGAAAATGAACCGCTATGAGGTGAATGGCCGCGAACTGGTGGTCAAGGAGGATCACGGCGAGCAGCGCGA
TCAATACGGACGCATTGTGCGAGATGGTGGTGGTGGTGGAGGCGGCGGTGGCGGCGTACAAGGAGGCAATGGTG
GCAACAATGGAGGAGGTGGCGGCGGTGGCCGTGACCACATGGATGACCGCGATCGGGGTTTCTCCCGGCGAGAC
GACGACAGACTATCTGGGCGTAATAATTTTAACATGATGTCAAATGATTATAATAATTCGTCGAATTACAATTT
GTATGGGCTTTCTGCTTCGTTTTTG
>msi
GAAGGTCGAGTGCAAGAAGGCACAGCCCAAGGAAGCAGTCACACCGGCTGCTCAGCTTCTCCAGAAGCGCATTA
TGTTGGGCACCCTCGGCGTCCAGCTGCCCACAGCTCCTGGCCAGCTGATTGGAGCCCGTGGTGCCGGCGTGGCC
ACCATGAACCCACTGGCCATGCTTCAAAATCCCACACAGCTACTGCAATCCCCGGCAGCAGCCGCTGCCGCCCA
GCAGGCCGCCCTCATATCACAGAACCCATTTCAAGTACAAAACGCCGCTGCGGCAGCCTCGATTGCCAATCAGG
CTGGCTTCGGCAAGCTGTTGACCACATATCCGCAGACTGCGCTGCATAGCGTCAGATATGCACCCTACTCGATC
CCCGCCAGCGCCGCCACTGCCAACGCCGCCTTGATGCAGGCTCATCAGGCGCAAAGCGTGGCCGCCGCTGCCCA
TCATCACCAGCAGCAGCAACAGCAGCAGCATCATCAC
>Hrb98DE
AACTACGGCAACCAGAATGGTGGCGGCAACTGGAACAACGGTGGCAACAACTGGGGCAACAACCGCGGGGGTAA
CGACAACTGGGGCAACAACAGCTTCGGTGGTGGCGGCGGCGGCGGTGGTGGTTATGGCGGTGGCAACAACAGCT
GGGGCAATAACAATCCGTGGGACAATGGCAATGGAGGCGGCAACTTTGGAGGCGGCGGCAACAATTGGAACAAT
GGTGGCAATGATTTTGGAGGCTACCAGCAGAACTATGGCGGCGGTCCGCAGCGAGGTGGCGGCAACTTCAACAA
CAATCGCATGCAGCCCTACCAAGGAGGTGGTGGATTCAAAGCAGGCGGTGGCAATCAAGGCAACTATGGCGGAA
ACAATCAGGGCTTCAATAACGGTGGC
>Hrb87F
CTACCGCACCACAGATGATGGCCTGAAGGCTCACTTCGAGAAGTGGGGCAACATTGTCGACGTGGTGGTGATGA
AGGATCCCAAGACGAAGCGCTCTCGCGGCTTCGGTTTCATCACGTACTCCCAGTCGTACATGATCGACAATGCG
CAGAATGCCAGGCCACACAAGATCGATGGACGCACCGTGGAGCCCAAGAGGGCTGTGCCACGCCAGGAGATCGA
TTCCCCGAATGCGGGAGCCACGGTAAAGAAGCTCTTTGTGGGCGGGCTTCGAGACGATCACGATGAAGAGTGCC
TGCGCGAGTACTTCAAGGACTTTGGCCAGATCGTGAGCGTGAACATTGTTTCCGACAAGGACACCGGCAAGAAG
CGCGGCTTCGCCTTCATTGAGTTC
>Syp
GACAGTTCCTGGAATCGAACCTGGAGCACGTGTCAAACAAGTCCGCCTACCTATGCGGCGTGATGAAGACGTAC
CGACAGAAGAGTCGAGCCAGCCAACAGGGCGTGGCCGCGCCCGCAACTGTCAAAGGTCCCGACGAGGACAAGAT
CAAGAAAATCCTCGAGCGCACCGGCTACACATTAGATGTGACGACAGGTCAGCGTAAATACGGCGGACCGCCGC
CGCATTGGGAGGGAAATGTGCCAGGCAACGGTTGCGAGGTTTTCTGCGGCAAGATACCCAAGGACATGTACGAG
GACGAACTGATTCCGCTATTCGAGAACTGCGGCATAATCTGGGACCTACGACTCATGATGGACCCGATGACGGG
CACAAATCGTGGTTATGCATTTGTCACATTCACAAATCGCGAAGCGGCCGTCAATGCAGTGCGACAGCTCGATA
ATC
>sqd
GGAAACTGTTTGTCGGTGGTCTGAGCTGGGAAACGACTGAGAAGGAACTCCGCGATCACTTCGGCAAATATGGC
GAGATCGAGAGCATCAATGTCAAGACAGATCCCCAGACCGGTCGGTCCCGAGGATTCGCCTTCATCGTGTTTAC
AAACACCGAGGCCATTGACAAAGTCAGCGCCGCGGATGAGCACATAATCAACAGCAAGAAGGTCGATCCCAAGA
AGGCCAAGGCCAGGCACGGCAAGATCTTTGTCGGCGGCCTCACCACAGAGATCAGCGATGAGGAGATTAAGACC
TACTTTGGACAGTTCGGCAATATCGT
>HnRNP-K
AAATACTTTGAGGAGCGCGACGAGGACTTTGATGTGCGTCTACTTATACACCAGAGCTTGGCCGGCTGCGTCAT
TGGCAAAGGTGGACAAAAGATCAAGGAGATCCGCGATCGCATCGGCTGCCGCTTTTTGAAGGTCTTCTCGAATG
TGGCACCACAGAGCACAGATCGAGTGGTGCAGACCGTTGGCAAGCAGAGCCAGGTCATCGAAGCGGTTCGTGAG
GTGATCACACTTACACGGGACACTCCCATCAAGGGGGCGATACATAACTATGATCCTATGAACTTTGACCGCGT
ATATGCCGATGAGTACGGTGGCTATGGC
>CG30122
GATGAAGGTGGTGGACCTGCGCAACGAGCTCCAGTCGCGCGGCCTGGACACCAAAGGAGTCAAAGCGGTGCTCG
TCGAGCGCCTGAGGGCATATGTGGAAGGAGGAGCCGGCGACGGTGAAAATGCGCCGGTCACACCAAGCCGCCGT
CAGCGTCGCACGCGCTCTATGTCCCGCTCTCCATCGCCGGTGCAAGCTGCTCCCGTGGCCGCAGAACCAGTGCT
CGATACTCTCGAAGAGGAGGAGCAGCCGGAGGATAAGACAGTGCCACAGCCAGAACCAGAAAGTGAACAGCCAG
CAGCCGAGCCGGAACCAGAACAAAGTGAGCCGGAGGAAGCTGAGCCAGCTGCAGCAGTGACAGAGGACACAACC
GTCAACCAAG
>Srp54
CTTGACCAACACGGTGTTCATCGATCGCGCCCTAATTGTCATACCCGTTCTGGCCATACCCGAGGAGTATCGGG
CCCTGGAGATGCTCAAGAACGGAACCATTGTGCCGGGACTCCAGAAGCCGGACTCCAAGCTACCGCCCGAAGTC
ATTAACCGCATCGAGGGACAGCTGCCGCAGCAAGTGATCAAGACGTACGACCCCAAGTTGGTGGAATTCAATCT
GCCGGAGTACCCGGCCTTACCCTCGTTCTACGATGCGCGCAAAATCGAGGAGATTCGGCGCACCATTATCGTGT
GCGATGTTAAGAACGAGTGGCGGCTAGACGATCTGATGGAATGCTTTCAGCGCGCTGGGGAGGTGAAGTATGCC
CGTTGGGCCGAGAAGGATAACAAGACGTACTGCATGATTGAGTTCTGCGAACAGACCAGCATTATTCACGCCCT
GCGCATGCAGGGCCAGGAGTTCAAGGGTGGCCAT
>Rsf1
GTTCACAAAGTATGGCAAGCTGAATTCGGTGTGGATAGCCTTCAATCCGCCGGGATTTGCGTTCGTCGAGTTCG
AGCACCGCGACGACGCCGAAAAGGCGTGCGACATACTGAACGGATCCGAGCTGCTCGGCTCCCAGCTGCGCGTG
GAGATCTCAAAAGGGCGGCCACGCCAGGGTAGGCGTGGCGGACCCATGGACAGGGGCGGACGACGCGGCGACTT
TGGCCGGCACAGCATCACAAGCGGTGGTAGCGGCGGAGGCGGTTTCCGGCAGCGCGGATCCAGCGGATCCTCAA
GCCGGCACACGGAGCGGGGCTATAGCTCCGGCCGATCAGGTGCAAGCAGCTATAATGGCAGAGAGGGCGGCGGC
AGCGGCTTCAATCGCCGCGAGGTTTACGGCGGTGGACGCGACAGCAGCCGCTACAGCAGCGGAAGTAG
>Rbp1-like
ACAAGTCCAGTGGTACAACAAATACCAAAAATCCATTACAGAACCGGAGGAGCAGCACCTCCAGCCACATACAT
ATTCATACATAATGCCACGCTACCGTGAATGGGATTTAGCCTGCAAAGTTTACGTGGGCAATCTGGGATCCTCG
GCTCCAAATACGAGATCGAGAACGCCTTTAGCAAATACGGACCCTTGCGCAACGTCTGGGTGGCCCGCAATCCG
CCCGGTTTCGCCTTCGTCGAGTTCGAGGATCGTCGCGACGCTGAGGATGCGACCCGTGGCCTCGACGGCACCCG
CTGCTGTGGCACCCGCATCCGTGTCGAAATGTCATCAGGCCGTTCACGA
>x16
AAGGTGTACGTGGGCGATCTGGGCAACAATGCCCGGAAGAACGACCTGGAGTATGTATTTGGAGCGTACGGCAG
TTTGCGCAGCGTCTGGATAGCCCGCAATCCGCCGGGCTTCGCCTTCGTGGAGTTTGAGAGTGCCCGCGATGCGG
CGGATGCGGTGCGCGGATTGGACGGACGGACGGTTTGCGGGCGCCGAGCCCGTGTGGAATTGTCCACCGGAAAG
TATGCTAGGTCCGGCGGTGGTGGTGGCGGAGGTGGTGGAGGCGGTGGTGGTGGAGGACTCGGAGGACGCGACCG
AGGCGGCGGTGGTCGTGGGGACGATAAGTGCTACGAGTGCGGCGGACGGGGGCATTTCGCTCGCCACTGTCGCG
AAAGGAAGGCCAGGCAGCGACGCAGAAGCAACTCATTCAGCAGATCTCGGAGCACATCGCGACGCAGGCGCACT
CGCTCCAAGTCCGGAACTCGAT
>Rbp1
TGCCGCGATATAGGGAGTGGGACTTGGCCTGCAAGGTGTACGTGGGAAACCTGGGCTCCTCGGCGTCCAAGCAC
GAGATAGAAGGCGCATTTGCCAAATATGGACCCCTGCGAAACGTGTGGGTGGCCCGCAATCCACCAGGTTTCGC
CTTTGTCGAATTTGAGGATCGCCGTGACGCGGAAGACGCAACGCGTGCCCTGGACGGAACACGCTGCTGCGGCA
CTAGGATTCGCGTAGAGATGTCTTCGGGTCGCTCGCGCGATCGCCGGCGCGGAGAAGGCGGCAGTAGTGGTCGC
TCTGGTTCCGGACGCTACAGGTCACGTTCGCCACGTCGCTCCCGATCGCCCCGCAGCCGCAGCTTCTCGCGCGA
TCGTCGAAGTCGCTCGGATTCTCGGGATCGTCATTAA
>SC35
GGATCGCTACACACGTGAGAGCCGCGGATTCGCATTTGTTCGCTTCTATGACAAACGTGATGCCGAGGACGCAC
TGGAGGCCATGGATGGTCGCATGCTAGACGGCAGGGAGCTCCGCGTACAGATGGCCCGCTACGGACGCCCCTCT
TCGCCCACTCGCAGCTCCAGTGGTCGTCGTGGCGGAGGAGGAGGCGGTGGTTCCGGCGGGCGTCGTCGGTCACG
TTCTCGCTCCCCAATGCGCCGTCGTTCGCGCAGTCCGCGTCGCCGATCATACTCCCGTTCCCGCTCGCCTGGTA
GCCACTCGCCGGAACGCCGATCCAAATTTTCACGCAGTCCAGTACGCGGCGACAGCCGCAATGGAATCGGAAGC
GGATCTGGAGGACTGGCCCCAGCCGCGTCTCGTAGTCGCAGTCGCTCCTAGATATCGACGTCACGTTCCATTTA
GTGGGAGTGCGAGATATGACTCGCTG
>B52
CATCAAAAATGGCTACGGCTTTGTGGAATTCGAAGACTATCGTGATGCCGACGATGCCGTCTATGAACTGAATG
GCAAAGAGCTGCTTGGCGAACGTGTGGTTGTTGAACCCGCCAGGGGTACCGCTCGTGGCAGCAACCGCGACCGC
TACGACGATCGATATGGTGGTCGGCGGGGGGGCGGGGGCGGTCGTTACAACGAAAAAAACAAAAATTCCAGATC
ATCCTCTCGTTATGGCCCACCGTTGCGCACTGAGTACCGACTGATTGTGGAGAATTTGTCTAGCCGCGTTAGCT
GGCAGGATCTCAAGGATTACATGCGCCAGGCTGGCGAGGTCACCTATGCCGATGCCCACAAGCAGCGTCGCAAT
GAGGGCGTGGTTGAGTTCGCCTCGTTGTCGGACATGAAGACGGCCATTGAGAAGTTGGATGACACCGAGCT
>ytr
CCCAAATTCACAACAGAAAAATTTCCAAACTCACACACAGACAATTACTTAAAGTATTTGAAAACTTACCACAG
CCCGAAACGCACTCTTCACATACCCATACCCTTATCCTTACCCATACCCATACCCATCCTCATCCGCATCCAGA
TCCCCATCCCGATCCCAACCGCAATCGAAAAGACCTCCCAACAAACTTCCTCCATACGCTGCAA
>CG7971
AAATCTCCAGCCCATTCCCCAGAGGCGCCACCGAAGAAGTCGGTGCCAACGCCAGCCTTCAATCCCTTTAAGGC
GGCCGAGGATACTGTTAACGACATCCTTGGCACAAAGTCGGTGATGGTGGCCCTGGAACAGACTAAGCGACAGC
GGGCGGCTTCCAGCTCTAGCTCGGATTCCGACAGCTCCGGTAGTAGCTCGACTTCCTCGCGTACGCCATCGCCT
AAGCCCACACCTAGGAAACAAAAGAAGAGGAGCAAGACCCCAGAGCTAAAAGAGGTGAAGAAGGAGATTAGCCC
CAGAAAGG
>tra2
AACAAGTACGGACCTATCGAACGCATCCAGATGGTGATTGACGCACAAACACAGCGTTCCCGGGGCTTTTGTTT
CATTTACTTTGAGAAACTCAGCGATGCCCGCGCGGCTAAGGACAGCTGCTCCGGAATAGAAGTGGATGGTCGCC
GTATTCGCGTCGATTTCTCTATAACCCAACGGGCTCATACCCCAACTCCGGGTGTGTATTTGGGTCGTCAGCCG
CGTGGAAAAGCTCCACGCTCATTTTCACCGCGTAGAGGACGCCGTGTGTATCACGATCGCTCCGCTTCGCCCTA
TGACAACTATCGTGATCGCTATGATTACCGCAACGATCGCTACGACCGTAATCTCCGCAGGAGCCCTAGTCGCA
ACCGTTACACTCGCAACAGGAGCTACAGCCGTTCACGCTCTCCGCAACTACGTCGAACTTCATCGCGCTATTAA
AGCGCCTGGGGAGGAGGCTACTTCATTAACTCGTGCTCCTAAGTTCGCCCAACT
>heph
GCGCGGAAGCGACGAACTTTTGAGTCAAGCAGCGGTCATGGCGCCCGCTTCCGACAATAACAATCAGGACCTGG
CCACAAAGAAGGCCAAACTGGAGCCGGGCACTGTGCTGGCCGGCGGAATTGCCAAGGCCTCAAAAGTCATCCAC
TTGCGCAACATTCCGAACGAGTCCGGCGAGGCAGATGTGATTGCCCTGGGCATTCCGTTTGGACGTGTGACCAA
CGTGCTGGTGCTCAAGGGCAAGAACCAGGCTTTCATCGAGATGGCCGACGAGATCTCCGCAACGTCAATGGTGT
CCTGTTACACAGTAACTCCGCCCCAGATGCGCGGCCGCATGGTCTACGTGCAGTTTTCTAATCATCGCGAACTA
AAGACGGACCAAGGTCACAAC
>mub
ATCCATCGGTGACACTCACAATAAGGCTGATTATGCAAGGAAAGGAGGTTGGTAGTATTATTGGTAAAAAGGGT
GAAATTGTCAACAGATTTCGTGAAGAGTCTGGTGCCAAAATCAACATTTCGGATGGCTCATGCCCGGAACGTAT
TGTGACTGTGTCTGGTACAACTAATGCAATCTTTTCGGCATTCACGCTCATTACAAAGAAGTTCGAAGAGTGGT
GCTCGCAGTTCAATGATGTAGGCAAAGTTGGTAAAACTCAAATACCCATTCGATTGATTGTGCCCGCCAGTCAA
TGTGGATCGTTAATTGGCAA
>Upf1
GGAGGAGCTATGGAAGGAGAATATTGAGGCCACGTTTCAGGATCTGGAGAAGCCAGGCATTGACTCGGAGCCAG
CACATGTGCTACTCCGCTACGAGGATGGCTATCAGTACGAGAAGACCTTTGGGCCGCTGGTCCGCCTTGAGGCC
GAATACGACCAAAAACTGAAGGAGTCTGCCACGCAGGAGAACATCGAAGTACGCTGGGACGTCGGCCTCAACAA
AAAGACCATTGCCTACTTTACGCTGGCGAAGACCGATTCGGACATGAAGCTCATGCATGGCGACGAGCTGCGCC
TGCATTATGTGGGCGAGCTGTACAATCCGTGGAGCGAGATCGGCCACGTTATCAAGGTGCCGGACAATTTCGGC
GATGACGTCGGCCTGGAGCTGAAATCCTCAACGAATGCCCCGGTTAAGTGCACCAGTAACTTTACGGTGGACTT
CATCTGGAAGTGCACGTCATTTGATCGCATGACACGTGCTCTGTGCAAATTCGCCATCGA
>qkr58E-2
AGCATGAGAACGAACACAACGCCAACGCAGACGGCGAGAAGGCCCAGCCGGCGCCGGCGGTCCAGAAGTACATG
CAGGAGCTCATGACGGAGCGATCGCGCATGGAAAACCACTTCCCCCTGGCGGTGAAGCTAATTGACGAAGCTCT
GGAGCGTGTGCAGCTAAACGGACGCATTCCCACGAGAGACCAGTACGCCGATGTCTACCAGCAGCGCACCATCA
AGCTGTCCCAAAAAGTGCACGTGCCCATCAAGGACAAGAAGTTCAACTATGTGGGCAAGCTACTGGGGCCCAAG
GGCAACTCACTGCGTCGCCTGCAGGAGGAGACGCAGTGCAAGATCGTCATACTTGGTCGCTTCTCAATGAAGGA
TCGC
>SF1
CAGGATCACGATCAATCCAGAATCCCGTCGCTCTTCGACCGACAGCAGGGATTGGAAACCATCAGGGAGGAAGG
ACGTGAGCAGCGGTTTGATCTTACTCAGACCATCCAGGAGCTAATGGGCAATGCTGGAGGCAACAAAGGATTCG
CCTCGTTCTTTAACAGCCAGAACAGCAACGACTCCACCAGCAATGGAGCATTCGATAACTCAGCGGACAGCGCT
GCGGAGCGAAAGAGGAAGCGGAAGTCTCGCTGGGGCGGCAGTGAAAACGACAAGACCTTCATTCCTGGAATGCC
CACAATTCTGCCCTCCACCCTGGACCCGGCACAGCAGGAGGCCTACCTAGTTCAATTTCAAATCGAGGAGATTA
GTCGCAAGC
>spoon
CTGCCCGGTGTAGCATTTATACTCGGCGTCTTTTGGTTTCGGCGTAGATATAAAAATTGTTTAGACAAGCCCGA
CGACGAGGACTCATCGGCCATCAATGACTCGTCGATTGAACCAACTGTGCAGGCGCGCAAGGCCAACGGAGTCC
TGCAGAATGGCAAGCTGCCACAGCAGTCGGCCAGCAAGTCGATGAACATCAACGGAACTTTAGTTAACGGTAGC
GGAAGCGGTAGCGGAAGTAGCAGTGATGAGAAGGACAGCCCCACTACCATGTTGTATGGTAAATCAGCACCAAT
CAAAATCC
>CG7878
CACCAGGAGTACGTCGTTTAGCGCAGAGCTATATGAAGAATCCCATCCAGGTGTGTGTCGGATCGCTCGATCTG
GCAGCCACGCACTCGGTGAAACAAATTATTAAATTGATGGAGGATGACATGGACAAATTCAACACCATTACATC
TTTCGTTAAGAACATGTCCAGTACGGACAAGATCATCATATTTTGTGGACGCAAGGTTCGTGCTGACGACCTAT
CCAGTGAACTTACGCTGGATGGTTTCATGACCCAGTGCATTCATGGTAATCGCGATCAGATGGATCGTGAGCAG
GCTATTGCCGATATTAAGTCCGGCGTCGTGCGCATTCTGGTTGCTACCGATGTGGCATCACGTGGCCTGGACAT
TGAGGATATCACACATGTCATCAACTATGATTTTCCGCACAACATCGAGGAGTATGTGCACCGTGTTGG
>mask
AACACAGGCTCTGGCTCTGGATCCAATAATAACAATAACAACACCAATCAAAACCCCAACAGACAGTTGAATCA
TAATTTACCCCGAATCGCTGCCGCCAGACAATCGATAGCCGCCGCTCTATTGAAAAACAGCGGGCGGAAGATTC
TGACGGCCAAGAATGAGCCACTGACGACGACGGAGTCATCAGGCGTTTTAACCAACACACCTTTACCCAGCAAT
AGCCGATTGAAAGTTAACAACAACAACAACACCAATAACACTGCCAAGATGTCTGGAACTAGTAGCAGTCAGTC
CTCGGCCACGCCCACACCGCCCACGGCCAGCAGCAGCACAACCACCACAACAACAACGAACATCAGCACCGGAG
GCGGTGGGAGTGGCAGCAGTGGCGGTGGCGGTGGGAGTACCACGGTCATTGCCAATCCCGCATCGGTAACCAAC
ACCGGAGCTGGAA
>Fmr1
AGACCGAGGAGTCTGTGCAGCGTGCCCGCGCGATGCTCGAATACGCCGAGGAGTTCTTCCAGGTGCCCAGGGAG
TTGGTGGGCAAGGTGATTGGCAAGAATGGGCGCATTATCCAGGAGATTGTGGACAAGAGTGGCGTGTTTCGAAT
CAAGATCGCTGGCGACGATGAACAGGATCAGAACATACCACGTGAGCTGGCGCATGTACCCTTTGTGTTCATTG
GCACCGTGGAGAGCATTGCAAATGCCAAAGTGCTGTTGGAGTATCATCTGTCGCACCTGAAGGAAGTAGAACAG
TTGCGTCAGGAGAAGATGGAGATTGATCAGCAGCTTCGCGCCATCCAGGAATCCTCCATGGGCTCCACACAGAG
CTTCCCAGTGACGCGGCGCTCTGAGCGCGGCTACAGCAGTGACATTGAGTCGGTGCGCTCTATGCGCGGCGGTG
GTGGCGGCCAGCGTGGTCGTGTACGCGGACGTGGT
>Psi
CGTTATCATGTTGCGTGGTCAAAGGGATACAGTCACTAAGGGGCGCGAAATGATTCAGAACATGGCCAATCGGG
CTGGCGGGGGACAGGTGGAGGTGCTGTTGACGATCAATATGCCGCCACCGGGACCTAGCGGGTATCCACCTTAC
CAGGAGATCATGATTCCGGGCGCCAAGGTGGGCTTGGTCATTGGCAAGGGCGGCGATACCATTAAACAGCTGCA
GGAGAAGACCGGAGCCAAAATGATCATCATCCAGGACGGACCAAACCAGGAGCTGATCAAACCCCTTCGCATAT
CCGGCGAGGCGCAGAAGATAGAGCACGCCAAGCAGAT
>Dp1
CTACGAGGAGAACTTCACATTCGAGGTGATGACGGTTAATCCTTCGTACTACAAGCACATCATCGGTAAGGCTG
GAGCCAACGTAAATCGCCTGAAGGATGAACTGAAGGTTAACATTAACATCGAAGAGCGCGAGGGCCAGAACAAC
ATCCGTATCGAGGGTCCCAAGGAGGGAGTACGGCAGGCGCAGCTTGAATTACAAGAAAAAATCGACAAACTGGA
AAACGAAAAATCGAAGGATGTGATCATCGACCGCCGTCTCCATCGTTCTATTATCGGAGCTAAGGGCGAGAAGA
TTCGCGAGGTGAAGGACCGCTACCGCCAGGTTACAATCACGATACCTACGCCCCAGGAGAATACCGATATTGTG
AAGCTGCGCGGACCCAAGGAGGATGTGGACAAGTGTCACAAGGATCTGCTTAAGCTGGTCAAGGAGATTCAGGA
ATCGTCGCACATTATCGAGGTGC
>RpS3
CATTGAGTTGTACGCCGAGAAGGTGGCCGCTCGTGGCCTGTGCGCCATTGCCCAGGCTGAGTCGCTGAGGTACA
AGCTCACCGGAGGACTGGCCGTCCGTCGTGCTTGCTATGGTGTGCTCCGCTACATCATGGAGTCGGGAGCCAAG
GGCTGCGAGGTCGTCGTGTCCGGCAAACTGCGTGGTCAGCGTGCCAAGTCGATGAAATTCGTCGATGGCCTGAT
GATCCATTCGGGAGATCCGTGCAACGACTATGTCGAGACCGCCACCCGTCATGTGCTCCTCCGCCAGGGAGTGC
TTGGTATCAAGGTCAAGGTCATGTTGCC
>Cnot4
CTAGCAATAGAACGAGGGCGGATCGTGGAAAAGATCGGACCACGGCTAGTGCAAAGGAGCAGAAGAAGAGCAAG
GAAGCTGCTCCAGCACCTGCAGCAAGTAAACCGGCGGAGCGGGTTGAAACAAGCGAGAGTACAATAAGACAAAA
GAAGGCGGAAGTAACAGAAAGCTGTGAAGATAACTTACCACAAAAGAGATTAGCGGGAACAAACGTTCAAAGAT
CTGTGAGCTCTTGTAGCGAAAATAGCGAAGGACACGTCTCTGAGAGTAGCTTAAGTGAGAAGAGTTTAACTGGT
GATTATGTGGAGGAAAAGTGCAATAGTGTGAATTCGGAAAGCCAGCAAGAAAGTG
>eIF3-S9
CCTGGAGAAGCTGAAGTTGGTCATCAACAAGCTGTTTTCGAACTACGGAGAAATCGTCAATGTGGTCTATCCCG
TCGACGAGGAGGGCAAGACCAAGGGCTACGCCTTCATGGAGTACAAGCAGGCCAGACAGGCGGAGGAAGCCGTC
AAGAAGCTCAACAATCATCGCCTAGACAAAAACCACACCTTTGCCGTCAATCTCTTCACCGATTTCCAAAAGTA
CGAAAACATCCCCGAGAAGTGGGAGCCGCCAACCGTGCAGACCTTCAAAGTGC
>rin
AGATCCACAACCGAATCCAGCAGCTGAACTTCAACGATTGCCACGCGAAGATCAGCCAGGTTGATGCCCAGGCC
ACTTTGGGCAACGGTGTGGTGGTTCAGGTCACCGGGGAGCTATCCAATGATGGCCAGCCGATGCGGCGTTTTAC
CCAGACGTTCGTTCTGGCCGCTCAGTCGCCGAAGAAGTACTACGTGCACAACGACATCTTCCGCTATCAGGATC
TCTACATCGAGGACGAGCAGGATGGCGAGTCGCGATCGGAGAACGATGAGGAGCACGAT
>barc
GGAGAATACAATCCCGCTCTGAAGCCCAAACGCAAGAAGAAGGACAAAGAGAAATTGCAAAAGATGAAGGAAAA
GTTATTTGATTGGCGTCCAGATAAATTGCGTGGCGAACGGTCAAAGAATGAGAAAACCGTCATCATTAAAAACC
TCTTCACCCCAGAACTCTTTGAGAAGGAAGTGGAGCTCATATTGGAGTACCAAAACAATCTGCGTGAGGAGTGC
AGCAAATGCGGGATGGTCCGTAAAGTGGTTATCTATGATCGCCATCCTGATGGTGTAGCCCAGATCAACATGGC
CTCGCCGGAGGAAGCTGACCTCGTCATTCAAATGATGCAGGGGCGTTATTTTGGACAGCGGCAACTAAGTGCGG
AGGCCTGGGATGGCAAGACCAAATACAAAATTGAGGAATCAGCTGTCGAGGCGCATGAACGGCTTTCCAAATGG
GATGAATTCTTGGCAGAAGAAGAAACCG
>eIF3ga
GAGGTGGAGCTCGACTATGGTGGACTACCTCCGACGACGGAGACGGTGGAGAACGGACAGAAGTACGTGACGGA
GTACAAGTACAACAAGGACGACAAGAAGACGAAGGTGGTGCGCACGTACAAGATATCCAAGCAGGTGGTGCCCA
AGACGGTGGCCAAGCGACGCACCTGGACGAAGTTCGGCGACTCGAAGAACGACAAGCCCGGCCCCAACTCGCAG
ACGACCATGGTGTCCGAGGAGATCATCATGCAGTTCCTCAACTCCAAGGAGGACGAGAAGGCCAACGATCCGCT
GCTAGATCCCACCAAGAATATTGCCAAGTGCCG
>RnpS1
ATTCATGTCGGTCGGCTTACCCGCAACGTTACCAAGGACCATGTGTTCGAGATATTTAGCAGCTTTGGGGATGT
GAAGAATGTGGAGTTTCCCGTAGATCGTTTTCATCCTAACTTCGGACGCGGCGTGGCGTTTGTGGAATATGCCA
CACCCGAGGATTGTGAGTCGGCCATGAAGCATATGGATGGCGGGCAGATAGATGGCCAGGAGATTACGGTATCC
CCGGTTGTCTTAGTAAAACAGAGGCCGCCCATGCGTCGTCCTTCGCCACCGATGCGCCGTCCGCAAAACAACCG
CTGGCGATCCCCACCCCAGTTCAATAGGTTCAACAATCGTGGAGG
>Ref1
AACAGCGCTTGGAAGCACGATATGTACGACGGACCGAAGAGGGGTGCCGTCGGTGGAGGATCTGGACCCACCCG
CCTCATCGTCGGTAACCTGGACTACGGCGTATCCAACACGGACATCAAGGAGCTCTTCAACGACTTTGGTCCGA
TAAAGAAGGCGGCAGTGCACTACGATCGCTCCGGTCGCTCGTTGGGCACCGCTGACGTGATTTTCGAACGTCGC
GCCGACGCCTTGAAGGCCATTAAACAGTACCATGGCGTACCTTTGGACGGACGCCCTATGACCATTCAGCTGGC
CGTCTCAGACGTGGCCGTGTTGACCCGTCCCGTAGCCGCCACCGATGTCAAGCGTCGCGTGGGTGGTACTGCAC
CAACTTCATTCAA
>tsu
CCGATGTGTTGGACATTGACAATGCGGAGGAGTTCGAGGTGGACGAGGACGGTGACCAGGGCATTGTGCGCCTG
AAGGAAAAGGCGAAGCACCGCAAGGGACGCGGATTTGGAAGCGACAGTAACACCCGAGAGGCGATCCACAGCTA
CGAGCGTGTGCGCAACGAGGACGACGATGAGCTGGAACCTGGTCCACAAAGGTCCGTCGAGGGCTGGATACTGT
TTGTCACCTCTATCCATGAGGAGGCGCAGGAGGACGAGATTCAGGAAAAGTTCTGCGATTACGGAGAAATCAAG
AACATTCACCTGAACCTCGACCGGCGTACTGGGTTCTCAAAGGGATACGCTCTCG
>shep
ATGGATTCCGGGTTACATGATGACTCAGGTAGATGATCAGACTTCGTATTCTCCACAGTACATGCAGATGGCAG
CTGCCCCTCCGCTGGGAGTAACCTCATACAAACCGGAGGCGGTTAACCAGGTGCAGCCCCGTGGCATCTCGATG
ATGGTTAGCGGTGATACGGGCGTGCCATATGGAACAATGATGCCTCAGTTGGCCACCCTGCAGATTGGCAACTC
TTATATTAGTCCAACTTATCCATATTATGCACCACCACCAACTATTATACCAACAATGCCAATGACAGATTCCG
AACAGGCTAGCA
>snRNP-U1-70K
TTTAAGACGAGGAACTTCAGGAAAAGGTAAAACAAAACAAAAAAGCCCACAAAATGACCCAATATCTGCCGCCG
AATCTGCTGGCGCTGTTCGCGGCACGGGAGCCCATCCCGTTCATGCCGCCGGTGGACAAGCTGCCGCACGAGAA
GAAGTCTCGCGGCTACCTGGGAGTGGCCAAGTTCATGGCCGATTTCGAGGATCCCAAGGACACGCCGCTGCCGA
AAACGGTGGAAACGCGTCAGGAGCGGCTGGAGCGACGCCGGCGCGAGAAGGCCGAGCAAGTGGCCTACAAGCTG
GAGCGTGAGATAGCGCTGTGGGACCCCACAGAGATCAAAAATGCCACGGAGGACCCGTTTCGCACGCTGTTCAT
TGCACGCATCAACTACGACACGTCCGAGTCGAAGCTGCGGCGTGAGTTCGAGTTCTACGGGCCCATCAAGAAGA
TCGTCCTGATCCACGACCAGGAATCAGGTAAACCCAAGGGCTACGCCTTCATCGAGTACGAGCA
>Caper
ACACGCAGGCTGAGAAGAATCGTCTCCAGAATGCAGCGCCGGCATTCCAACCGAAGAGTCACACGGGTCCCATG
CGCCTCTACGTGGGATCACTGCACTTCAACATTACCGAGGACATGCTGCGGGGCATATTCGAGCCCTTTGGCAA
GATCGATGCCATTCAACTGATCATGGATACGGAGACGGGCCGATCCAAGGGCTACGGCTTTATCACGTACCACA
ATGCTGACGATGCCAAAAAGGCTCTGGAACAGCTGAACGGCTTTGAACTGGCCGGTCGGCTCATGAAAGTGGGC
AATGTGACGGAGCGACTGGACATGAATACCACCTCGCTGGACAC
>nonA-l
AACTGATGACGACCTACGGGAGATGTTCAAGCCATATGGCGAGATCGGCGATATATTCTCGAACCCGGAGAAGA
ACTTTACATTCCTGAGGCTAGACTACTACCAAAATGCTGAGAAGGCCAAACGCGCTTTAGATGGCTCCTTGCGC
AAGGGACGAGTGCTGCGTGTCCGCTTTGCGCCCAACGCCATTGTGCGTGTGACTAATCTCAACCAGTTCGTGTC
CAACGAGCTGCTGCACCAGTCCTTTGAGATCTTTGGACCCATCGAGCGCGCCGTTATCTGCGTAGACGATCGCG
GTAAGCATACCGGCGAAGGCATTGTTGAGTTCGCCAAGAAGTCCTCGGCCAGCGCCTGTCTGCGCCTGTGCAAC
GAAAAATGCTTCTTCTTGACTGCTTCATTGCGTCCGTGTCTGGTGGAACCGATGGAGGTGAACAACGACAATGA
CG
>Hrb27C
TGAGCACGTGACCAACGAGCGGTACATCAATCTGAATGGCAAGCAGGTCGAAATCAAGAAGGCCGAGCCTCGTG
ATGGATCTGGCGGCCAAAACTCCAACAACAGTACCGTGGGAGGCGCCTATGGCAAGCTTGGTAACGAGTGCAGC
CACTGGGGACCGCACCATGCTCCCATCAACATGATGCAGGGCCAGAATGGCCAGATGGGTGGACCGCCGCTGAA
TATGCCCATTGGAGCGCCGAATATGATGCCTGGCTATCAGGGTTGGGGCACCTCGCCGCAGCAGCAACAATACG
GCTACGGCAACAGTGGCCCAGGATCGTACCAGGGATGGGGAGCTCCACCAGGACCCCAGGGACCACCACCGCAG
TGGTCGAACTACGCTGGACCTCAGCAGACGCAGGGCTACGGCGGATACGACATGTATAACTCGACGTCGACCGG
AGCTCCTTCGGGACCATCGGGCGGCGGCAGCTGGAACTCGTGGAACATGCCACCTA
>Sxl
TAATCTCTGCGGATTGTCGCTGGGCAGCGGTGGTAGTGATGATCTCATGAACGATCCTCGGGCAAGCAACACCA
ACCTGATTGTCAACTACTTGCCCCAGGACATGACCGATCGCGAGCTGTACGCCCTATTCAGAGCCATTGGACCC
ATCAACACGTGCAGAATCATGCGAGACTATAAGACTGGCTACAGTTTTGGTTATGCTTTCGTGGACTTCACATC
GGAAATGGACTCGCAGCGTGCTATTAAAGTGCTGAATGGCATCACAGTGCGCAACAAGCGGCTTAAGGTTTCCT
ATGCACGTCCCGGCGGAGAATCGATCAAGGACACCAATCTGTATGTGACCAATCTGCCGCGTACCATAACCGAC
GATCAGCTGGACACGATCTTCGGCAAGTACGGTTCCATTGTGCAGA
>Rox8
CCGGTGTAAAGGGAAGTCAACGCCACACCTTCGAGGAAGTGTATAACCAGTCGAGCCCCACCAACACCACCGTA
TACTGTGGCGGATTCCCGCCGAATGTCATCAGTGACGACCTGATGCACAAGCACTTCGTCCAGTTTGGTCCCAT
CCAGGACGTGCGGGTCTTCAAGGACAAGGGCTTCTCGTTCATCAAGTTTGTTACCAAGGAGGCAGCCGCCCACG
CCATCGAGCACACGCACAACAGCGAGGTACATGGAAACCTGGTAAAGTGCTTCTGGGGCAAAGAGAACGGAGGC
GATAACTCGGCCAATAACCTCAATGCCGCCGCTGCCGCGGCAGCAGCCTCTGCCAATGTTGCCGCCGTTGCGGC
AGCCAATGCTGCGGTTGCCGCTGGAGCGGGTATGCCCGGTCAGATGATGACGCAGCAACAG
>bol
CTGATCTAACCCGCGTCTTCAGCGCCTATGGCACGGTAAAGAGCACCAAAATCATCGTGGATCGAGCAGGTGTG
AGCAAGGGCTACGGATTCGTCACCTTCGAGACGGAGCAGGAGGCGCAAAGACTGCAAGCGGATGGTGAATGCGT
GGTACTAAGAGATCGGAAGCTGAACATTGCACCGGCCATCAAAAAGCAGCCCAATCCTCTGCAGTCAATTGTGG
CCACAAACGGAGCCGTCTACTATACCACCACGCCGCCGGCACCGATCAGCAATATACCCATGGATCAGTTCGCA
GCCGCTGTATATCCGCCAGCCGCTGGAGTGCCAGCCATCTACCCACCTTCAGCCATGCAATATCAGCCATTCTA
TCAGTACTACAGTGTGCCAATGAATGTACCCACCATTTGGC
>Rnp4F
CAGGAGGAGGAGCACAAGTCGGAGGAGCTGCGCCAACGATCGCGCCCAACCTGGCCACCGTCGTCCGCCGGCGG
GGATATGACCACCATTGAGTTGATCTCATCGGACGACGAGCCGTCAGTGGAGGAGACTGAGGGAGGCAATGCCG
CTGGCCGTGGCAGAGCGCGCAATGATTCCAGCAGCAGTAGCGATGATGTGGGCGTGATCGAAGGCTCGGAATTG
GAATCGAACAGTGAGGTGTCCAGTGACAGTGACAGTGATAGCGACAACGCTGGCGGCGGAAATCAGCTAGAGCG
CTCGTATCAGGAGCTGAATGCGTTGCCCAGCAAAAAGTTTGCCCAAATGGTCTCGCTCATTGGAATCGCATTCA
AA
>elav
TCGGGATCGCAAAATGGCAGCAACGGCAGCACGGAGACGCGCACAAACCTTATTGTCAACTACTTGCCGCAAAC
AATGACCGAAGACGAGATCCGTTCGCTCTTCTCCAGCGTCGGCGAGATTGAGTCGGTGAAGCTGATACGCGACA
AGTCGCAGGTCTACATCGATCCTCTCAATCCGCAGGCGCCCAGCAAGGGCCAAAGTCTGGGCTACGGCTTTGTT
AACTATGTCCGGCCGCAAGATGCCGAGCAGGCTGTTAATGTTCTAAACGGCCTGCGACTGCAGAACAAAACCAT
AAAGGTGTCGTTTGCCCGCCCGTCGTCCGATGCCATTAAAGGCGCCAACCTTTATGTGTCGGGGCTGCCAAAGA
CGATGACCCAGCAGGAACTGGAGGCCATCTTCGCACCATTCGGAGCAATAATCACATCGCGCATTCTGCAGAAC
GCTGGCAACGATACGCAGACGAAAG
>pea
ATGGACGAGCTGCAGAAGTTGGAGTACCTTTCGCTGGTCTCGAAGATTTGCACTGAGCTAGACAACCACTTGGG
CATCAACGACAAGGACCTGGCCGAGTTTATCATCGATTTAGAAAACAAAAATCGCACATATGACACATTTCGCA
AGGCTTTGCTGGATAATGGCGCCGAATTCCCAGACTCCCTGGTCCAGAACCTGCAGCGCATCATTAATCTTATG
CGCCCCAGCAGACCTGGCGGCGCTAGCCAGGAGAAAACTGTCGGCGACAAGAAGGAAGACAAGAAATCGCAACT
TTTGAAAATGTTTCCCGGCCTCGCTTTGCCCAATGACACCTACA
>CG1646
AATATAATCCGGGCAGTCCCACATCTGAGAGCAACGACGCACAGCCCTCAGAGAAAAAACTCAAGGTCGAAGAA
TCGGAGCCCAAGGAGAAAAAGAAGGAAAAGGAGCGCGATAAAGATAAGGAGAAGGATAAGGACAAAGATAATAA
TAAGGATAAGGAAAAGGAGCGAAAGAAGCTGCCGGACCTAGATAAGTACTGGAGAGCTGTCAAAGAAGACTCCA
CCGACTTCACCGGCTGGACGTACTTGCTGCAATATGTTGACAATGAGTCTGATGCGGAGGCGGCGCGCGAGGCC
TACGACACATTCCTGTCCCACTATCCTTACTGCTACGGATATTGGCGCAAGTATGCCGACTACGAGAAGCGCAA
GGGCATCAAGGCAAACTGCTATAAGGTGTTTGAGCGCGGACTGGAGGCGATTCCGCTGTCCGTGGATCTGTGGA
TCCACTACCTAATGCACGTTAAGTCCAATCACGGAGATGATGA
>CG6227
GTGCCCAACCACTACGAGGACTATGTTCACAGATGTGGTCGCACCGGTCGAGCGGGCAAAAAGGGCAGCGCCTA
CACGTTTATCACACCGGAGCAATCGCGCTATGCCGGCGACATTATCCGCGCCATGGACCTATCAGGCACACTGA
TTCCCGCCGAGCTGCAGGCACTGTGGACGGAGTATAAGGCGCTCCAGGAGGCCGAGGGCAAGACGGTGCACACG
GGCGGCGGCTTTAGCGGCAAGGGCTTCAAGTTCGACGAGCAGGAGTTCAATGCCGCCAAGGAGAGCAAGAAGCT
GCAGAAGGCGGCCTTGGGACTGGCCGATTCCGATGATGAGGAGGA
>CG6841
ATGCCCTCCAAATATTTCCCTCGAAGAAAAGCATCTGGTTGCGAGCCGCCTACTTTGAAAAGAACCATGGCACC
CGCGAATCTTTGGAGGCCCTGTTGCAGCGAGCCGTGGCTCATTGTCCTAAATCGGAGATTCTCTGGCTGATGGG
GGCCAAATCCAAATGGATGGCTGGAGACGTTCCAGCCGCGAGAGGCATTTTGTCCTTGGCTTTCCAGGCCAATC
CCAATTCCGAGGACATTTGGTTGGCTGCCGTTAAGTTGGAATCAGAGAACTCGGAATATGAGCGGGCGAGACGC
TTGTTAGCCAAGGCTAGAGGATCGGCACCGACACCAAGGGTGATGATGAAATCAGCTCGCCTGGAATGGGCTTT
GGAAAAGTTCGACGAAGCT
>Rm62
CATCTACGACACCAGCGAGAGCCCCGGCAAGATTATCATATTCGTGGAGACAAAGCGACGCGTGGACAACCTGG
TGCGCTTCATCCGCAGCTTCGGAGTCCGTTGTGGAGCTATTCACGGTGACAAGTCGCAATCAGAACGAGACTTT
GTGCTCCGTGAGTTCCGCTCGGGCAAGTCCAACATTCTGGTGGCCACCGATGTGGCGGCCCGTGGACTAGACGT
GGACGGCATCAAGTATGTCATCAACTTTGACTACCCGCAAAACAGCGAGGACTACATCCATCGCATCGGTCGCA
CAGGACGATCCAACACAAAGGGCACCTCTTTCGCCTTCTTCACCAAG
>hay
GATCACGGAAATCGACCACTTTGGGTTGCGCCCAATGGTCACGTCTTCCTGGAATCATTCTCGCCCGTCTATAA
GCATGCCCACGATTTTCTTATCGCCATTTCGGAGCCCGTCTGCCGACCCGAACACATTCACGAGTACAAACTTA
CCGCATACAGTTTATATGCCGCCGTTTCGGTGGGACTGCAAACCCATGACATTGTGGAATACTTGAAGAGATTG
AGCAAGACCAGCATTCCCGAAGGCATCCTTGAGTTTATACGACTCTGCACCCTATCCTATGGCAAGGTCAAGCT
GGTCTTGAAGC
>qkr54B
AACTGCTGGAAGGCGAGATAGAAAAGGTCCAGACCACAGGAAGGATTCCTTCCAGAGAGCAAAAGTATGCCGAT
ATCTATAGAGAGAAGCCGCTGCGGATCTCGCAACGTGTTTTAGTTCCCATTAGAGAACATCCCAAGTTCAACTT
CGTTGGAAAACTGCTGGGGCCCAAGGGCAACTCCCTTCGCCGCCTTCAGGAGGAGACCCTTTGCAAGATGACCG
TCCTGGGCCGCAACTCTATGCGCGATCGAGTCAAAGAAGAGGAATTGCGCAGCTCCAAGGATCCCAAGTACGCT
CACCTCAACAGCGATCTGCATGT
>qkr58E-3
AACGACGAGGTTTCACACGAACAGCTGCGCGAGCTGATGGAAATGGATCCCGAGTCAGCCAAAAACATTCACGG
ACCGAATCTGGAGGCCTACAGATCTGTCTTCGACAAGAAGTTTGGAGGCAACAGCAATGGGGCTCCCAAATACA
TCAACCTGATTAAGAGAGCTGCGGAAAATCCGCCCGAAGTCGACGATGTGGAGGAGGTGGCCTATGAGTATGAA
CATCGTATGCCCCCCAAGCGTCCGCCTACGGGCTATGAGTACAGCAAACCACGTCCATCAATAATACCGACAAA
CGCAGCGGCATATAAACGTCCATATCCGACTGACATG
RNA-seq
RNA-seq libraries were prepared from 10 µg of total RNA from each sample using the
Illumina mRNA-seq library preparation kits as described by the manufacturer. Libraries
were sequenced on an Illumina GAIIx using single reads of 75 or 76 bp in length. Each
knockdown was performed in biological duplicate. After quality control analysis of the
correlation of junction and exon read counts, duplicate samples were combined.
RNA-seq alignment strategy
RNA-seq reads were aligned with Bowtie (Langmead et al. 2009) against a reference
sequence consisting of the genome, annotated splice junctions, and unannotated splice
junctions. All RNA-seq reads were trimmed to 75 bp for consistency. To ensure
a 6 bp overhang, splice junction sequences were formed by joining 69 bp of exon
sequence from each side of the splice junction.
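The relationship between read length, flank length, and overhang can be illustrated with a minimal sketch. The function names and the coordinate convention (0-based, half-open, plus strand) are assumptions made for this example and are not taken from the original pipeline; the sketch also ignores exons shorter than 69 bp.

```python
READ_LENGTH = 75      # reads were trimmed to 75 bp
MIN_OVERHANG = 6      # required overhang across the junction
FLANK = READ_LENGTH - MIN_OVERHANG  # 69 bp of exon on each side


def junction_sequence(chrom_seq, donor_exon_end, acceptor_exon_start, flank=FLANK):
    """Join `flank` bp of upstream exon to `flank` bp of downstream exon.

    Hypothetical helper: donor_exon_end is the exclusive end of the upstream
    exon and acceptor_exon_start is the start of the downstream exon.
    """
    upstream = chrom_seq[max(0, donor_exon_end - flank):donor_exon_end]
    downstream = chrom_seq[acceptor_exon_start:acceptor_exon_start + flank]
    return upstream + downstream


def min_overhang(read_length, flank):
    """Smallest number of bases a full-length read must place past the junction."""
    return read_length - flank


if __name__ == "__main__":
    # Toy chromosome: 100 bp exon, 50 bp intron, 100 bp exon.
    chrom = "A" * 100 + "G" * 50 + "C" * 100
    junction = junction_sequence(chrom, 100, 150)
    print(len(junction), min_overhang(READ_LENGTH, FLANK))
```

Because each side of the junction sequence is only 69 bp long, a 75 bp read that aligns to it must extend at least 6 bp into the exon on the other side.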
Novel splice junctions included all novel exon-exon combinations within the same gene
and between different genes; for junctions joining different genes, the splice sites
had to lie within 2 kb of each other. No length restriction was applied to novel
junctions within the same gene. Additional novel junctions were derived from an
annotated splice site and an unannotated splice site (GT or AG dinucleotide) within
2 kb of it.
To remove potential false positive novel junctions, each unannotated splice junction was
given a Shannon-entropy score as previously described (Graveley et al. 2011). Any
novel junction with an entropy score ≥ 3 in one of the 57 samples (56 RNAi samples + 1
untreated) was used for further splicing analysis.
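The entropy filter can be sketched as follows. This is an illustration only: it assumes the score is the base-2 Shannon entropy of the distribution of read offsets across a junction, and it reads the threshold as "at least one sample". The function names and input layout are hypothetical; Graveley et al. 2011 remains the authoritative description.

```python
import math
from collections import Counter


def junction_entropy(offsets):
    """Shannon entropy (log base 2) of read offsets across one junction.

    `offsets` holds one integer per read: the position of the junction
    within that read (an assumed representation).
    """
    counts = Counter(offsets)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def passes_entropy_filter(offsets_by_sample, threshold=3.0):
    """Keep a junction if its entropy reaches the threshold in any sample."""
    return any(
        junction_entropy(offsets) >= threshold
        for offsets in offsets_by_sample.values()
    )


if __name__ == "__main__":
    samples = {
        "untreated": [10] * 20,        # many reads, one offset: low entropy
        "RNAi_A": list(range(16)),     # 16 distinct offsets: high entropy
    }
    print(passes_entropy_filter(samples))
```

Under this definition, a junction supported by many reads that all share one offset scores 0, while reads spread evenly over 8 or more distinct offsets score at least 3, so the filter favors junctions covered at diverse positions.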
Identifying potential RNAi off-target effects
To determine instances in which the dsRNAs used to deplete the target transcripts
encoding RNA binding proteins may have had unintended targets, we first constructed a
Bowtie index of the MDv1 transcriptome annotation (Graveley et al. 2011).
We then generated a set of all possible 20 nt dsRNA fragments by sliding a 20 nt
window over the dsRNA sequences and then used Bowtie (with options --all -y -v 2) to
align the dsRNA fragments to the transcriptome index. There appear to be potential
off-target effects for other RNA binding proteins in two cases. First, we identified
multiple dsRNA fragments from the msi dsRNA that aligned to the Hrb87F (hrp36) mRNA
and observed a significant decrease in Hrb87F gene expression in the msi RNAi sample
(Supplemental Figure 4). We therefore excluded the msi RNAi experiments from our
analysis, but the raw data are still available from GEO and the modENCODE data
repository. Similarly, 70 of the 20 nt fragments from the Hrb87F (hrp36) dsRNA
align to the Hrb98DE (hrp38) gene and there is a ~2.2-fold change in the levels of
Hrb98DE (hrp38) in the Hrb87F (hrp36) RNAi sample. As the impact on the potential
off-target is less than that in the msi case, we have kept this sample in the analysis,
though it is possible that off-target effects may account for some of the effects observed
in the Hrb87F (hrp36) RNAi samples.
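The fragment-generation step can be sketched as below. The helper names and FASTA record naming are assumptions for illustration, not the original scripts. The resulting FASTA file would then be aligned with Bowtie using the options given above (--all -y -v 2), plus -f for FASTA input, against the transcriptome index.

```python
def sliding_fragments(seq, window=20):
    """Return every window-length substring of seq, in order of position."""
    seq = seq.upper()
    return [seq[i:i + window] for i in range(len(seq) - window + 1)]


def fragments_to_fasta(name, seq, window=20):
    """Format the fragments of one dsRNA as FASTA records.

    Records are named <name>_<offset> (a hypothetical naming scheme) so
    that each alignment can be traced back to its dsRNA and position.
    """
    return "".join(
        f">{name}_{i}\n{frag}\n"
        for i, frag in enumerate(sliding_fragments(seq, window))
    )


if __name__ == "__main__":
    # Toy 24 nt sequence; a real run would use the dsRNA sequences listed above.
    print(fragments_to_fasta("toy", "ACGT" * 6), end="")
```

A dsRNA of length n yields n - 19 fragments of 20 nt, so counting the fragments that align to a non-target transcript gives a simple measure of potential off-target overlap.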
Supplemental References
Boyle AP, Araya CL, Brdlik C, Cayting P, Cheng C, Cheng Y, Gardner K, Hillier L,
Janette J, Jiang L et al. 2013. Comparative analysis of regulatory information and
circuits across diverse species. Nature: Submitted.
Brooks AN, Hansen KD, Hundal A, Dudoit S, Meyerson M, Brenner SE. 2013.
JuncBASE: a junction-based analysis of splicing events from RNA-seq data.
Submitted.
Brooks AN, Yang L, Duff MO, Hansen KD, Park JW, Dudoit S, Brenner SE, Graveley
BR. 2011. Conservation of an RNA regulatory map between Drosophila and
mammals. Genome Res 21(2): 193-202.
Cherbas L, Willingham A, Zhang D, Yang L, Zou Y, Eads BD, Carlson JW, Landolin JM,
Kapranov P, Dumais J et al. 2011. The transcriptional diversity of 25 Drosophila
cell lines. Genome Res 21(2): 301-314.
Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P,
Ceric G, Forslund K et al. 2010. The Pfam protein families database. Nucleic
Acids Res 38(Database issue): D211-222.
Gattiker A, Gasteiger E, Bairoch A. 2002. ScanProsite: a reference implementation of a
PROSITE scanning tool. Applied bioinformatics 1(2): 107-108.
Graveley BR, Brooks AN, Carlson JW, Duff MO, Landolin JM, Yang L, Artieri CG, van
Baren MJ, Boley N, Booth BW et al. 2011. The developmental transcriptome of
Drosophila melanogaster. Nature 471: 473-479.
Hulo N, Bairoch A, Bulliard V, Cerutti L, De Castro E, Langendijk-Genevaux PS, Pagni
M, Sigrist CJ. 2006. The PROSITE database. Nucleic Acids Res 34(Database
issue): D227-230.
Jain E, Bairoch A, Duvaud S, Phan I, Redaschi N, Suzek BE, Martin MJ, McGarvey P,
Gasteiger E. 2009. Infrastructure for the life sciences: design and implementation
of the UniProt website. BMC Bioinformatics 10: 136.
Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient
alignment of short DNA sequences to the human genome. Genome Biol 10(3):
R25.
Letunic I, Doerks T, Bork P. 2009. SMART 6: recent updates and new developments.
Nucleic Acids Res 37(Database issue): D229-232.
Negre N, Brown CD, Ma L, Bristow CA, Miller SW, Wagner U, Kheradpour P, Eaton ML,
Loriaux P, Sealfon R et al. 2011. A cis-regulatory map of the Drosophila genome.
Nature 471(7339): 527-531.
Park JW, Graveley BR. 2005. Use of RNA interference to dissect the roles of
trans-acting factors in alternative pre-mRNA splicing. Methods 37(4): 341-344.
Park JW, Parisky K, Celotto AM, Reenan RA, Graveley BR. 2004. Identification of
alternative splicing regulators by RNA interference in Drosophila. Proc Natl Acad
Sci USA 101(45): 15974-15979.
Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL,
Wold BJ, Pachter L. 2010. Transcript assembly and quantification by RNA-seq
reveals unannotated transcripts and isoform switching during cell differentiation.
Nat Biotechnol 28(5): 511-515.