Supplemental Information for

Regulation of alternative splicing in Drosophila by 56 RNA binding proteins

Angela N. Brooks, Michael O. Duff, Gemma May, Li Yang, Mohan Bolisetty, Jane Landolin, Ken Wan, Jeremy Sandler, Susan E. Celniker, Brenton R. Graveley*, Steven E. Brenner*

*Corresponding Authors

Brooks et al.

Supplemental Figure 1: RT-PCR validations of RNAi knockdowns. a. RT-PCR was performed on all samples using primers to RP49 as a control for RNA quantities. b. The efficiency of depletion was monitored by RT-PCR of the target gene in both replicates of the RNAi samples and compared to the levels in both replicates of the untreated control samples.

Supplemental Figure 2: Correlation between JuncBASE PSI and a. RT-PCR PSI or b. Bradley et al. PSI. Scatter plot of the PSI as calculated by JuncBASE from the RNA-seq data against the PSI a. from the RT-PCR reactions or b. reported in Bradley et al. 2015. Values for splicing events significantly affected by the knockdown of one or more proteins as determined by JuncBASE are shown. A best fit regression line is shown with R² based on Spearman.

[Supplemental Figure 3 heatmap, titled "RBP Knockdown Efficiency". Columns: RNAi sample; rows: RBP gene; color scale: RNAi efficiency, 0.0 to 1.0.]

Supplemental Figure 3: Depletion efficiency/specificity of each RNA Binding Protein Sample. The FPKM of all 56 RNA Binding Protein genes was calculated in each sample and the depletion efficiency calculated as (FPKM_untreated - FPKM_knockdown) / FPKM_untreated. The results are plotted as a heatmap with the RNAi samples in columns and the RNA Binding Protein genes in rows. The bright diagonal indicates efficient depletion of the target gene.

[Supplemental Figure 4 plot. Samples are grouped by category (SR, hnRNP, Core, EJC, Other, Novel). Axes: number of affected genes (0 to 1000) and log2(RNAi FPKM/Untreated FPKM) (-4 to 4).]

Supplemental Figure 4: Significant changes in gene expression observed upon knockdown of 56 proteins.
The fold change in FPKM levels of all genes was calculated in each RNAi sample in comparison to the untreated control sample. The number of genes that were significantly affected and the magnitude of those changes are plotted for each sample.

Supplemental Figure 5: Comparison of the number of gene expression and splicing events affected in each sample. For each protein, the number of gene expression and splicing events that were significantly affected was calculated and plotted as the rank order.

Supplemental Figure 6: Cross regulation of gene expression between the 56 RNA binding proteins. The fold change in FPKM levels of each of the 56 RNA binding proteins tested was calculated in each RNAi sample in comparison to the untreated control sample and plotted as a heatmap representation. The fold change of the target gene is not shown and is colored grey.

Supplemental Figure 7: Specific and shared effects by 56 proteins. For each class of splicing event, the number of events that were affected by each number of proteins is plotted. Most events are affected by only one protein, but many are affected by more than one.

Supplemental Figure 8: Schematic depiction of the various classes of alternative splicing events. The various classes of alternative splicing events are depicted.

Supplemental Tables

Supplemental Table 1: RNA Binding Proteins Studied. The RNA binding proteins analyzed in this study are listed. The gene names (CG number and synonyms), domain types, category, and number of uniquely aligned reads are indicated. Additionally, the ranges of correlation of exon coverage from RNA sequencing lanes between biological replicates are shown. The categories are not mutually exclusive. Core: core component of the spliceosome. EJC: exon junction complex or nonsense-mediated decay. Prior, No Prior: prior or no prior evidence for a role in splicing regulation.

Supplemental Table 2: Primers used for RT-PCR Validation of RNAi Depletion

Supplemental Table 3: Annotation and PSI values for all splicing events observed from RNA-seq

Supplemental Table 4: Annotation of all splicing events considered significantly affected by at least one of the 56 proteins. Percent spliced in (PSI) values are given only for samples with differential splicing from reference; otherwise “NA” is given.

Supplemental Table 5: Significant changes in gene expression upon knockdown of 56 proteins.

Supplemental Methods

Identifying D. melanogaster proteins with an RRM or KH domain

All D. melanogaster protein sequences were obtained from UniProt (Jain et al. 2009). Each sequence was searched against the Pfam database (Finn et al. 2010) using hmmpfam (hmmer.org) for the presence of the Pfam domains RRM_1, RRM_2, RRM_3, KH_1, and KH_2 with the default cutoff score. To find additional proteins that may have been missed by Pfam, each sequence was also compared against the SMART (Letunic et al. 2009) domains RRM, RRM_1, and KH and against the Prosite (Hulo et al. 2006) domains RRM (PS50102), KH_TYPE_1 (PS50084), and KH_TYPE_2 (PS50823). The SMART HMMs were scanned with hmmsearch (hmmer.org) and the Prosite domains with the local version of ScanProsite, ps_scan.pl (Gattiker et al. 2002).

Determining expression of putative splicing regulator genes from Affymetrix tiling array data

Before selecting target genes, we checked for their expression in S2-DRSC cells. S2-DRSC mRNA expression data were available from 38 bp Affymetrix tiling arrays (Cherbas et al. 2011). Transcribed fragments (transfrags) from the data were selected using a bandwidth of 0, maxgap 90, and minrun 50. Genes with at least 10% transfrag coverage were considered expressed. For genes not passing the transfrag coverage cutoff, probe intensities were reviewed manually and additional genes were called expressed.

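The 10% transfrag coverage cutoff described above can be illustrated with a minimal Python sketch. The text does not state whether coverage was computed over the whole gene span or over exons only, so the sketch assumes the gene span; the function names, the half-open coordinate convention, and the example coordinates are hypothetical and are not taken from the original analysis.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals [start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of double counting bases.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def transfrag_coverage(gene_start, gene_end, transfrags):
    """Fraction of the gene span [gene_start, gene_end) covered by transfrags."""
    length = gene_end - gene_start
    if length <= 0:
        raise ValueError("gene_end must be greater than gene_start")
    covered = 0
    for start, end in merge_intervals(transfrags):
        # Clip each merged transfrag to the gene boundaries.
        overlap = min(end, gene_end) - max(start, gene_start)
        if overlap > 0:
            covered += overlap
    return covered / length


def is_expressed(gene_start, gene_end, transfrags, min_coverage=0.10):
    """Apply the 'at least 10% transfrag coverage' rule (assumed gene-span based)."""
    return transfrag_coverage(gene_start, gene_end, transfrags) >= min_coverage


# Hypothetical gene of 1000 bp with two overlapping transfrags covering 100 bp.
print(is_expressed(0, 1000, [(100, 150), (140, 200)]))
```

Genes failing this automated rule would still need the manual review of probe intensities mentioned above, which the sketch does not attempt to model.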
RNAi depletion

RNA interference was performed essentially as described previously (Brooks et al. 2011). Vectors encoding double-stranded RNAs for the target mRNAs were generated as described previously (Park et al. 2004; Park and Graveley 2005). Briefly, cDNA fragments encoding the specific dsRNA were amplified by RT-PCR with gene-specific primers from total RNA isolated from S2-DRSC cells, cloned into the pCRII-TOPO vector (Invitrogen), and sequenced to verify the identities of the inserts. DNA templates were amplified with M13 forward and M13 reverse primers, and the PCR products were used in individual in vitro transcription reactions with the Ampliscribe High Yield Transcription SP6 and T7 kits (Epicentre) to generate the sense and antisense RNA strands. After DNase I digestion, the single-stranded RNAs were annealed to generate dsRNAs. The integrity of the PCR products, the single-stranded RNA transcripts, and the dsRNAs was monitored by agarose gel electrophoresis.

S2-DRSC cells (obtained from the Drosophila Genomics Resource Center at Indiana University) were cultured in Schneider’s medium (Sigma/Aldrich) plus 10% heat-inactivated fetal calf serum (FCS) (HyClone) at 27°C. One day prior to dsRNA treatment, cells were split into six-well culture dishes at a density of 1 × 10⁶ cells/mL. Immediately prior to the addition of dsRNA, the culture medium was replaced with fresh Schneider’s medium without FCS, followed by the addition of 20 µg of each dsRNA directly into the FCS-free medium, and the cells were incubated for 5 h at 27°C. After incubation with the dsRNA, 10% FCS was added back to the cell culture. After 2 d, a second dose of 20 µg of dsRNA was added to each well in the same manner as described above, and the cells were incubated for two additional days after the re-addition of 10% FCS. After the dsRNA treatment, total RNA was isolated using TRIzol reagent (Invitrogen) according to the manufacturer’s directions.
Parallel dsRNA treatments and total RNA preparations were performed independently for each replicate. Untreated S2-DRSC cells were used as a reference. To monitor the level of mRNA depletion, primer sets (Supplementary Table 2) that amplify regions of the targeted mRNAs outside of the dsRNA region were used for RT-PCR amplification, and compared with the results from the untreated cells (Supplemental Figure 1). Sequences of dsRNAs used for RNA Interference >glo: GTGAAGCTTCGTGGTCTGCCATATGCCGTCACTGAGCAGCAAATCGAGGAGTTCTTCTCTGGGTTGGATATCAA AACGGATCGGGAGGGCATACTTTTTGTTATGGACAGAAGGGGTCGTGCAACTGGGGAAGCTTTTGTTCAGTTCG AAAGCCAGGACGACACTGAGCAAGCCTTGGGCCGAAATCGGGAAAAAATTGGGCACAGGTATATTGAGATATTC CGCAGCTCGATTGCTGAAATGAAGAGGGCCACAGGCGCCGGTGGCGGTGTCGGAGGACGCCCTGGCCCTTATGA CATACGTGATCGTGGTGC >rump: ATACGACTACCGTTGGCAGGATCTGAAGGATCTGTTCCGCCGCATCGTCGGCTCCATTGAGTACGTCCAGCTGT TCTTCGATGAGAGCGGCAAGGCTCGCGGCTGTGGCATCGTAGAGTTCAAGGATCCGGAGAACGTACAGAAGGCC TTGGAGAAAATGAACCGCTATGAGGTGAATGGCCGCGAACTGGTGGTCAAGGAGGATCACGGCGAGCAGCGCGA TCAATACGGACGCATTGTGCGAGATGGTGGTGGTGGTGGAGGCGGCGGTGGCGGCGTACAAGGAGGCAATGGTG GCAACAATGGAGGAGGTGGCGGCGGTGGCCGTGACCACATGGATGACCGCGATCGGGGTTTCTCCCGGCGAGAC 15 Brooks et al. 
GACGACAGACTATCTGGGCGTAATAATTTTAACATGATGTCAAATGATTATAATAATTCGTCGAATTACAATTT GTATGGGCTTTCTGCTTCGTTTTTG >msi GAAGGTCGAGTGCAAGAAGGCACAGCCCAAGGAAGCAGTCACACCGGCTGCTCAGCTTCTCCAGAAGCGCATTA TGTTGGGCACCCTCGGCGTCCAGCTGCCCACAGCTCCTGGCCAGCTGATTGGAGCCCGTGGTGCCGGCGTGGCC ACCATGAACCCACTGGCCATGCTTCAAAATCCCACACAGCTACTGCAATCCCCGGCAGCAGCCGCTGCCGCCCA GCAGGCCGCCCTCATATCACAGAACCCATTTCAAGTACAAAACGCCGCTGCGGCAGCCTCGATTGCCAATCAGG CTGGCTTCGGCAAGCTGTTGACCACATATCCGCAGACTGCGCTGCATAGCGTCAGATATGCACCCTACTCGATC CCCGCCAGCGCCGCCACTGCCAACGCCGCCTTGATGCAGGCTCATCAGGCGCAAAGCGTGGCCGCCGCTGCCCA TCATCACCAGCAGCAGCAACAGCAGCAGCATCATCAC >Hrb98DE AACTACGGCAACCAGAATGGTGGCGGCAACTGGAACAACGGTGGCAACAACTGGGGCAACAACCGCGGGGGTAA CGACAACTGGGGCAACAACAGCTTCGGTGGTGGCGGCGGCGGCGGTGGTGGTTATGGCGGTGGCAACAACAGCT GGGGCAATAACAATCCGTGGGACAATGGCAATGGAGGCGGCAACTTTGGAGGCGGCGGCAACAATTGGAACAAT GGTGGCAATGATTTTGGAGGCTACCAGCAGAACTATGGCGGCGGTCCGCAGCGAGGTGGCGGCAACTTCAACAA CAATCGCATGCAGCCCTACCAAGGAGGTGGTGGATTCAAAGCAGGCGGTGGCAATCAAGGCAACTATGGCGGAA ACAATCAGGGCTTCAATAACGGTGGC >Hrb87F CTACCGCACCACAGATGATGGCCTGAAGGCTCACTTCGAGAAGTGGGGCAACATTGTCGACGTGGTGGTGATGA AGGATCCCAAGACGAAGCGCTCTCGCGGCTTCGGTTTCATCACGTACTCCCAGTCGTACATGATCGACAATGCG CAGAATGCCAGGCCACACAAGATCGATGGACGCACCGTGGAGCCCAAGAGGGCTGTGCCACGCCAGGAGATCGA TTCCCCGAATGCGGGAGCCACGGTAAAGAAGCTCTTTGTGGGCGGGCTTCGAGACGATCACGATGAAGAGTGCC TGCGCGAGTACTTCAAGGACTTTGGCCAGATCGTGAGCGTGAACATTGTTTCCGACAAGGACACCGGCAAGAAG CGCGGCTTCGCCTTCATTGAGTTC >Syp GACAGTTCCTGGAATCGAACCTGGAGCACGTGTCAAACAAGTCCGCCTACCTATGCGGCGTGATGAAGACGTAC CGACAGAAGAGTCGAGCCAGCCAACAGGGCGTGGCCGCGCCCGCAACTGTCAAAGGTCCCGACGAGGACAAGAT 16 Brooks et al. 
CAAGAAAATCCTCGAGCGCACCGGCTACACATTAGATGTGACGACAGGTCAGCGTAAATACGGCGGACCGCCGC CGCATTGGGAGGGAAATGTGCCAGGCAACGGTTGCGAGGTTTTCTGCGGCAAGATACCCAAGGACATGTACGAG GACGAACTGATTCCGCTATTCGAGAACTGCGGCATAATCTGGGACCTACGACTCATGATGGACCCGATGACGGG CACAAATCGTGGTTATGCATTTGTCACATTCACAAATCGCGAAGCGGCCGTCAATGCAGTGCGACAGCTCGATA ATC >sqd GGAAACTGTTTGTCGGTGGTCTGAGCTGGGAAACGACTGAGAAGGAACTCCGCGATCACTTCGGCAAATATGGC GAGATCGAGAGCATCAATGTCAAGACAGATCCCCAGACCGGTCGGTCCCGAGGATTCGCCTTCATCGTGTTTAC AAACACCGAGGCCATTGACAAAGTCAGCGCCGCGGATGAGCACATAATCAACAGCAAGAAGGTCGATCCCAAGA AGGCCAAGGCCAGGCACGGCAAGATCTTTGTCGGCGGCCTCACCACAGAGATCAGCGATGAGGAGATTAAGACC TACTTTGGACAGTTCGGCAATATCGT >HnRNP-K AAATACTTTGAGGAGCGCGACGAGGACTTTGATGTGCGTCTACTTATACACCAGAGCTTGGCCGGCTGCGTCAT TGGCAAAGGTGGACAAAAGATCAAGGAGATCCGCGATCGCATCGGCTGCCGCTTTTTGAAGGTCTTCTCGAATG TGGCACCACAGAGCACAGATCGAGTGGTGCAGACCGTTGGCAAGCAGAGCCAGGTCATCGAAGCGGTTCGTGAG GTGATCACACTTACACGGGACACTCCCATCAAGGGGGCGATACATAACTATGATCCTATGAACTTTGACCGCGT ATATGCCGATGAGTACGGTGGCTATGGC >CG30122 GATGAAGGTGGTGGACCTGCGCAACGAGCTCCAGTCGCGCGGCCTGGACACCAAAGGAGTCAAAGCGGTGCTCG TCGAGCGCCTGAGGGCATATGTGGAAGGAGGAGCCGGCGACGGTGAAAATGCGCCGGTCACACCAAGCCGCCGT CAGCGTCGCACGCGCTCTATGTCCCGCTCTCCATCGCCGGTGCAAGCTGCTCCCGTGGCCGCAGAACCAGTGCT CGATACTCTCGAAGAGGAGGAGCAGCCGGAGGATAAGACAGTGCCACAGCCAGAACCAGAAAGTGAACAGCCAG CAGCCGAGCCGGAACCAGAACAAAGTGAGCCGGAGGAAGCTGAGCCAGCTGCAGCAGTGACAGAGGACACAACC GTCAACCAAG >Srp54 CTTGACCAACACGGTGTTCATCGATCGCGCCCTAATTGTCATACCCGTTCTGGCCATACCCGAGGAGTATCGGG CCCTGGAGATGCTCAAGAACGGAACCATTGTGCCGGGACTCCAGAAGCCGGACTCCAAGCTACCGCCCGAAGTC 17 Brooks et al. 
ATTAACCGCATCGAGGGACAGCTGCCGCAGCAAGTGATCAAGACGTACGACCCCAAGTTGGTGGAATTCAATCT GCCGGAGTACCCGGCCTTACCCTCGTTCTACGATGCGCGCAAAATCGAGGAGATTCGGCGCACCATTATCGTGT GCGATGTTAAGAACGAGTGGCGGCTAGACGATCTGATGGAATGCTTTCAGCGCGCTGGGGAGGTGAAGTATGCC CGTTGGGCCGAGAAGGATAACAAGACGTACTGCATGATTGAGTTCTGCGAACAGACCAGCATTATTCACGCCCT GCGCATGCAGGGCCAGGAGTTCAAGGGTGGCCAT >Rsf1 GTTCACAAAGTATGGCAAGCTGAATTCGGTGTGGATAGCCTTCAATCCGCCGGGATTTGCGTTCGTCGAGTTCG AGCACCGCGACGACGCCGAAAAGGCGTGCGACATACTGAACGGATCCGAGCTGCTCGGCTCCCAGCTGCGCGTG GAGATCTCAAAAGGGCGGCCACGCCAGGGTAGGCGTGGCGGACCCATGGACAGGGGCGGACGACGCGGCGACTT TGGCCGGCACAGCATCACAAGCGGTGGTAGCGGCGGAGGCGGTTTCCGGCAGCGCGGATCCAGCGGATCCTCAA GCCGGCACACGGAGCGGGGCTATAGCTCCGGCCGATCAGGTGCAAGCAGCTATAATGGCAGAGAGGGCGGCGGC AGCGGCTTCAATCGCCGCGAGGTTTACGGCGGTGGACGCGACAGCAGCCGCTACAGCAGCGGAAGTAG >Rbp1-like ACAAGTCCAGTGGTACAACAAATACCAAAAATCCATTACAGAACCGGAGGAGCAGCACCTCCAGCCACATACAT ATTCATACATAATGCCACGCTACCGTGAATGGGATTTAGCCTGCAAAGTTTACGTGGGCAATCTGGGATCCTCG GCTCCAAATACGAGATCGAGAACGCCTTTAGCAAATACGGACCCTTGCGCAACGTCTGGGTGGCCCGCAATCCG CCCGGTTTCGCCTTCGTCGAGTTCGAGGATCGTCGCGACGCTGAGGATGCGACCCGTGGCCTCGACGGCACCCG CTGCTGTGGCACCCGCATCCGTGTCGAAATGTCATCAGGCCGTTCACGA >x16 AAGGTGTACGTGGGCGATCTGGGCAACAATGCCCGGAAGAACGACCTGGAGTATGTATTTGGAGCGTACGGCAG TTTGCGCAGCGTCTGGATAGCCCGCAATCCGCCGGGCTTCGCCTTCGTGGAGTTTGAGAGTGCCCGCGATGCGG CGGATGCGGTGCGCGGATTGGACGGACGGACGGTTTGCGGGCGCCGAGCCCGTGTGGAATTGTCCACCGGAAAG TATGCTAGGTCCGGCGGTGGTGGTGGCGGAGGTGGTGGAGGCGGTGGTGGTGGAGGACTCGGAGGACGCGACCG AGGCGGCGGTGGTCGTGGGGACGATAAGTGCTACGAGTGCGGCGGACGGGGGCATTTCGCTCGCCACTGTCGCG AAAGGAAGGCCAGGCAGCGACGCAGAAGCAACTCATTCAGCAGATCTCGGAGCACATCGCGACGCAGGCGCACT CGCTCCAAGTCCGGAACTCGAT >Rbp1 18 Brooks et al. 
TGCCGCGATATAGGGAGTGGGACTTGGCCTGCAAGGTGTACGTGGGAAACCTGGGCTCCTCGGCGTCCAAGCAC GAGATAGAAGGCGCATTTGCCAAATATGGACCCCTGCGAAACGTGTGGGTGGCCCGCAATCCACCAGGTTTCGC CTTTGTCGAATTTGAGGATCGCCGTGACGCGGAAGACGCAACGCGTGCCCTGGACGGAACACGCTGCTGCGGCA CTAGGATTCGCGTAGAGATGTCTTCGGGTCGCTCGCGCGATCGCCGGCGCGGAGAAGGCGGCAGTAGTGGTCGC TCTGGTTCCGGACGCTACAGGTCACGTTCGCCACGTCGCTCCCGATCGCCCCGCAGCCGCAGCTTCTCGCGCGA TCGTCGAAGTCGCTCGGATTCTCGGGATCGTCATTAA >SC35 GGATCGCTACACACGTGAGAGCCGCGGATTCGCATTTGTTCGCTTCTATGACAAACGTGATGCCGAGGACGCAC TGGAGGCCATGGATGGTCGCATGCTAGACGGCAGGGAGCTCCGCGTACAGATGGCCCGCTACGGACGCCCCTCT TCGCCCACTCGCAGCTCCAGTGGTCGTCGTGGCGGAGGAGGAGGCGGTGGTTCCGGCGGGCGTCGTCGGTCACG TTCTCGCTCCCCAATGCGCCGTCGTTCGCGCAGTCCGCGTCGCCGATCATACTCCCGTTCCCGCTCGCCTGGTA GCCACTCGCCGGAACGCCGATCCAAATTTTCACGCAGTCCAGTACGCGGCGACAGCCGCAATGGAATCGGAAGC GGATCTGGAGGACTGGCCCCAGCCGCGTCTCGTAGTCGCAGTCGCTCCTAGATATCGACGTCACGTTCCATTTA GTGGGAGTGCGAGATATGACTCGCTG >B52 CATCAAAAATGGCTACGGCTTTGTGGAATTCGAAGACTATCGTGATGCCGACGATGCCGTCTATGAACTGAATG GCAAAGAGCTGCTTGGCGAACGTGTGGTTGTTGAACCCGCCAGGGGTACCGCTCGTGGCAGCAACCGCGACCGC TACGACGATCGATATGGTGGTCGGCGGGGGGGCGGGGGCGGTCGTTACAACGAAAAAAACAAAAATTCCAGATC ATCCTCTCGTTATGGCCCACCGTTGCGCACTGAGTACCGACTGATTGTGGAGAATTTGTCTAGCCGCGTTAGCT GGCAGGATCTCAAGGATTACATGCGCCAGGCTGGCGAGGTCACCTATGCCGATGCCCACAAGCAGCGTCGCAAT GAGGGCGTGGTTGAGTTCGCCTCGTTGTCGGACATGAAGACGGCCATTGAGAAGTTGGATGACACCGAGCT >ytr CCCAAATTCACAACAGAAAAATTTCCAAACTCACACACAGACAATTACTTAAAGTATTTGAAAACTTACCACAG CCCGAAACGCACTCTTCACATACCCATACCCTTATCCTTACCCATACCCATACCCATCCTCATCCGCATCCAGA TCCCCATCCCGATCCCAACCGCAATCGAAAAGACCTCCCAACAAACTTCCTCCATACGCTGCAA >CG7971 19 Brooks et al. 
AAATCTCCAGCCCATTCCCCAGAGGCGCCACCGAAGAAGTCGGTGCCAACGCCAGCCTTCAATCCCTTTAAGGC GGCCGAGGATACTGTTAACGACATCCTTGGCACAAAGTCGGTGATGGTGGCCCTGGAACAGACTAAGCGACAGC GGGCGGCTTCCAGCTCTAGCTCGGATTCCGACAGCTCCGGTAGTAGCTCGACTTCCTCGCGTACGCCATCGCCT AAGCCCACACCTAGGAAACAAAAGAAGAGGAGCAAGACCCCAGAGCTAAAAGAGGTGAAGAAGGAGATTAGCCC CAGAAAGG >tra2 AACAAGTACGGACCTATCGAACGCATCCAGATGGTGATTGACGCACAAACACAGCGTTCCCGGGGCTTTTGTTT CATTTACTTTGAGAAACTCAGCGATGCCCGCGCGGCTAAGGACAGCTGCTCCGGAATAGAAGTGGATGGTCGCC GTATTCGCGTCGATTTCTCTATAACCCAACGGGCTCATACCCCAACTCCGGGTGTGTATTTGGGTCGTCAGCCG CGTGGAAAAGCTCCACGCTCATTTTCACCGCGTAGAGGACGCCGTGTGTATCACGATCGCTCCGCTTCGCCCTA TGACAACTATCGTGATCGCTATGATTACCGCAACGATCGCTACGACCGTAATCTCCGCAGGAGCCCTAGTCGCA ACCGTTACACTCGCAACAGGAGCTACAGCCGTTCACGCTCTCCGCAACTACGTCGAACTTCATCGCGCTATTAA AGCGCCTGGGGAGGAGGCTACTTCATTAACTCGTGCTCCTAAGTTCGCCCAACT >heph GCGCGGAAGCGACGAACTTTTGAGTCAAGCAGCGGTCATGGCGCCCGCTTCCGACAATAACAATCAGGACCTGG CCACAAAGAAGGCCAAACTGGAGCCGGGCACTGTGCTGGCCGGCGGAATTGCCAAGGCCTCAAAAGTCATCCAC TTGCGCAACATTCCGAACGAGTCCGGCGAGGCAGATGTGATTGCCCTGGGCATTCCGTTTGGACGTGTGACCAA CGTGCTGGTGCTCAAGGGCAAGAACCAGGCTTTCATCGAGATGGCCGACGAGATCTCCGCAACGTCAATGGTGT CCTGTTACACAGTAACTCCGCCCCAGATGCGCGGCCGCATGGTCTACGTGCAGTTTTCTAATCATCGCGAACTA AAGACGGACCAAGGTCACAAC >mub ATCCATCGGTGACACTCACAATAAGGCTGATTATGCAAGGAAAGGAGGTTGGTAGTATTATTGGTAAAAAGGGT GAAATTGTCAACAGATTTCGTGAAGAGTCTGGTGCCAAAATCAACATTTCGGATGGCTCATGCCCGGAACGTAT TGTGACTGTGTCTGGTACAACTAATGCAATCTTTTCGGCATTCACGCTCATTACAAAGAAGTTCGAAGAGTGGT GCTCGCAGTTCAATGATGTAGGCAAAGTTGGTAAAACTCAAATACCCATTCGATTGATTGTGCCCGCCAGTCAA TGTGGATCGTTAATTGGCAA >Upf1 20 Brooks et al. 
GGAGGAGCTATGGAAGGAGAATATTGAGGCCACGTTTCAGGATCTGGAGAAGCCAGGCATTGACTCGGAGCCAG CACATGTGCTACTCCGCTACGAGGATGGCTATCAGTACGAGAAGACCTTTGGGCCGCTGGTCCGCCTTGAGGCC GAATACGACCAAAAACTGAAGGAGTCTGCCACGCAGGAGAACATCGAAGTACGCTGGGACGTCGGCCTCAACAA AAAGACCATTGCCTACTTTACGCTGGCGAAGACCGATTCGGACATGAAGCTCATGCATGGCGACGAGCTGCGCC TGCATTATGTGGGCGAGCTGTACAATCCGTGGAGCGAGATCGGCCACGTTATCAAGGTGCCGGACAATTTCGGC GATGACGTCGGCCTGGAGCTGAAATCCTCAACGAATGCCCCGGTTAAGTGCACCAGTAACTTTACGGTGGACTT CATCTGGAAGTGCACGTCATTTGATCGCATGACACGTGCTCTGTGCAAATTCGCCATCGA >qkr58E-2 AGCATGAGAACGAACACAACGCCAACGCAGACGGCGAGAAGGCCCAGCCGGCGCCGGCGGTCCAGAAGTACATG CAGGAGCTCATGACGGAGCGATCGCGCATGGAAAACCACTTCCCCCTGGCGGTGAAGCTAATTGACGAAGCTCT GGAGCGTGTGCAGCTAAACGGACGCATTCCCACGAGAGACCAGTACGCCGATGTCTACCAGCAGCGCACCATCA AGCTGTCCCAAAAAGTGCACGTGCCCATCAAGGACAAGAAGTTCAACTATGTGGGCAAGCTACTGGGGCCCAAG GGCAACTCACTGCGTCGCCTGCAGGAGGAGACGCAGTGCAAGATCGTCATACTTGGTCGCTTCTCAATGAAGGA TCGC >SF1 CAGGATCACGATCAATCCAGAATCCCGTCGCTCTTCGACCGACAGCAGGGATTGGAAACCATCAGGGAGGAAGG ACGTGAGCAGCGGTTTGATCTTACTCAGACCATCCAGGAGCTAATGGGCAATGCTGGAGGCAACAAAGGATTCG CCTCGTTCTTTAACAGCCAGAACAGCAACGACTCCACCAGCAATGGAGCATTCGATAACTCAGCGGACAGCGCT GCGGAGCGAAAGAGGAAGCGGAAGTCTCGCTGGGGCGGCAGTGAAAACGACAAGACCTTCATTCCTGGAATGCC CACAATTCTGCCCTCCACCCTGGACCCGGCACAGCAGGAGGCCTACCTAGTTCAATTTCAAATCGAGGAGATTA GTCGCAAGC >spoon CTGCCCGGTGTAGCATTTATACTCGGCGTCTTTTGGTTTCGGCGTAGATATAAAAATTGTTTAGACAAGCCCGA CGACGAGGACTCATCGGCCATCAATGACTCGTCGATTGAACCAACTGTGCAGGCGCGCAAGGCCAACGGAGTCC TGCAGAATGGCAAGCTGCCACAGCAGTCGGCCAGCAAGTCGATGAACATCAACGGAACTTTAGTTAACGGTAGC GGAAGCGGTAGCGGAAGTAGCAGTGATGAGAAGGACAGCCCCACTACCATGTTGTATGGTAAATCAGCACCAAT CAAAATCC 21 Brooks et al. 
>CG7878 CACCAGGAGTACGTCGTTTAGCGCAGAGCTATATGAAGAATCCCATCCAGGTGTGTGTCGGATCGCTCGATCTG GCAGCCACGCACTCGGTGAAACAAATTATTAAATTGATGGAGGATGACATGGACAAATTCAACACCATTACATC TTTCGTTAAGAACATGTCCAGTACGGACAAGATCATCATATTTTGTGGACGCAAGGTTCGTGCTGACGACCTAT CCAGTGAACTTACGCTGGATGGTTTCATGACCCAGTGCATTCATGGTAATCGCGATCAGATGGATCGTGAGCAG GCTATTGCCGATATTAAGTCCGGCGTCGTGCGCATTCTGGTTGCTACCGATGTGGCATCACGTGGCCTGGACAT TGAGGATATCACACATGTCATCAACTATGATTTTCCGCACAACATCGAGGAGTATGTGCACCGTGTTGG >mask AACACAGGCTCTGGCTCTGGATCCAATAATAACAATAACAACACCAATCAAAACCCCAACAGACAGTTGAATCA TAATTTACCCCGAATCGCTGCCGCCAGACAATCGATAGCCGCCGCTCTATTGAAAAACAGCGGGCGGAAGATTC TGACGGCCAAGAATGAGCCACTGACGACGACGGAGTCATCAGGCGTTTTAACCAACACACCTTTACCCAGCAAT AGCCGATTGAAAGTTAACAACAACAACAACACCAATAACACTGCCAAGATGTCTGGAACTAGTAGCAGTCAGTC CTCGGCCACGCCCACACCGCCCACGGCCAGCAGCAGCACAACCACCACAACAACAACGAACATCAGCACCGGAG GCGGTGGGAGTGGCAGCAGTGGCGGTGGCGGTGGGAGTACCACGGTCATTGCCAATCCCGCATCGGTAACCAAC ACCGGAGCTGGAA >Fmr1 AGACCGAGGAGTCTGTGCAGCGTGCCCGCGCGATGCTCGAATACGCCGAGGAGTTCTTCCAGGTGCCCAGGGAG TTGGTGGGCAAGGTGATTGGCAAGAATGGGCGCATTATCCAGGAGATTGTGGACAAGAGTGGCGTGTTTCGAAT CAAGATCGCTGGCGACGATGAACAGGATCAGAACATACCACGTGAGCTGGCGCATGTACCCTTTGTGTTCATTG GCACCGTGGAGAGCATTGCAAATGCCAAAGTGCTGTTGGAGTATCATCTGTCGCACCTGAAGGAAGTAGAACAG TTGCGTCAGGAGAAGATGGAGATTGATCAGCAGCTTCGCGCCATCCAGGAATCCTCCATGGGCTCCACACAGAG CTTCCCAGTGACGCGGCGCTCTGAGCGCGGCTACAGCAGTGACATTGAGTCGGTGCGCTCTATGCGCGGCGGTG GTGGCGGCCAGCGTGGTCGTGTACGCGGACGTGGT >Psi CGTTATCATGTTGCGTGGTCAAAGGGATACAGTCACTAAGGGGCGCGAAATGATTCAGAACATGGCCAATCGGG CTGGCGGGGGACAGGTGGAGGTGCTGTTGACGATCAATATGCCGCCACCGGGACCTAGCGGGTATCCACCTTAC CAGGAGATCATGATTCCGGGCGCCAAGGTGGGCTTGGTCATTGGCAAGGGCGGCGATACCATTAAACAGCTGCA 22 Brooks et al. 
GGAGAAGACCGGAGCCAAAATGATCATCATCCAGGACGGACCAAACCAGGAGCTGATCAAACCCCTTCGCATAT CCGGCGAGGCGCAGAAGATAGAGCACGCCAAGCAGAT >Dp1 CTACGAGGAGAACTTCACATTCGAGGTGATGACGGTTAATCCTTCGTACTACAAGCACATCATCGGTAAGGCTG GAGCCAACGTAAATCGCCTGAAGGATGAACTGAAGGTTAACATTAACATCGAAGAGCGCGAGGGCCAGAACAAC ATCCGTATCGAGGGTCCCAAGGAGGGAGTACGGCAGGCGCAGCTTGAATTACAAGAAAAAATCGACAAACTGGA AAACGAAAAATCGAAGGATGTGATCATCGACCGCCGTCTCCATCGTTCTATTATCGGAGCTAAGGGCGAGAAGA TTCGCGAGGTGAAGGACCGCTACCGCCAGGTTACAATCACGATACCTACGCCCCAGGAGAATACCGATATTGTG AAGCTGCGCGGACCCAAGGAGGATGTGGACAAGTGTCACAAGGATCTGCTTAAGCTGGTCAAGGAGATTCAGGA ATCGTCGCACATTATCGAGGTGC >RpS3 CATTGAGTTGTACGCCGAGAAGGTGGCCGCTCGTGGCCTGTGCGCCATTGCCCAGGCTGAGTCGCTGAGGTACA AGCTCACCGGAGGACTGGCCGTCCGTCGTGCTTGCTATGGTGTGCTCCGCTACATCATGGAGTCGGGAGCCAAG GGCTGCGAGGTCGTCGTGTCCGGCAAACTGCGTGGTCAGCGTGCCAAGTCGATGAAATTCGTCGATGGCCTGAT GATCCATTCGGGAGATCCGTGCAACGACTATGTCGAGACCGCCACCCGTCATGTGCTCCTCCGCCAGGGAGTGC TTGGTATCAAGGTCAAGGTCATGTTGCC >Cnot4 CTAGCAATAGAACGAGGGCGGATCGTGGAAAAGATCGGACCACGGCTAGTGCAAAGGAGCAGAAGAAGAGCAAG GAAGCTGCTCCAGCACCTGCAGCAAGTAAACCGGCGGAGCGGGTTGAAACAAGCGAGAGTACAATAAGACAAAA GAAGGCGGAAGTAACAGAAAGCTGTGAAGATAACTTACCACAAAAGAGATTAGCGGGAACAAACGTTCAAAGAT CTGTGAGCTCTTGTAGCGAAAATAGCGAAGGACACGTCTCTGAGAGTAGCTTAAGTGAGAAGAGTTTAACTGGT GATTATGTGGAGGAAAAGTGCAATAGTGTGAATTCGGAAAGCCAGCAAGAAAGTG >eIF3-S9 CCTGGAGAAGCTGAAGTTGGTCATCAACAAGCTGTTTTCGAACTACGGAGAAATCGTCAATGTGGTCTATCCCG TCGACGAGGAGGGCAAGACCAAGGGCTACGCCTTCATGGAGTACAAGCAGGCCAGACAGGCGGAGGAAGCCGTC AAGAAGCTCAACAATCATCGCCTAGACAAAAACCACACCTTTGCCGTCAATCTCTTCACCGATTTCCAAAAGTA CGAAAACATCCCCGAGAAGTGGGAGCCGCCAACCGTGCAGACCTTCAAAGTGC 23 Brooks et al. 
>rin AGATCCACAACCGAATCCAGCAGCTGAACTTCAACGATTGCCACGCGAAGATCAGCCAGGTTGATGCCCAGGCC ACTTTGGGCAACGGTGTGGTGGTTCAGGTCACCGGGGAGCTATCCAATGATGGCCAGCCGATGCGGCGTTTTAC CCAGACGTTCGTTCTGGCCGCTCAGTCGCCGAAGAAGTACTACGTGCACAACGACATCTTCCGCTATCAGGATC TCTACATCGAGGACGAGCAGGATGGCGAGTCGCGATCGGAGAACGATGAGGAGCACGAT >barc GGAGAATACAATCCCGCTCTGAAGCCCAAACGCAAGAAGAAGGACAAAGAGAAATTGCAAAAGATGAAGGAAAA GTTATTTGATTGGCGTCCAGATAAATTGCGTGGCGAACGGTCAAAGAATGAGAAAACCGTCATCATTAAAAACC TCTTCACCCCAGAACTCTTTGAGAAGGAAGTGGAGCTCATATTGGAGTACCAAAACAATCTGCGTGAGGAGTGC AGCAAATGCGGGATGGTCCGTAAAGTGGTTATCTATGATCGCCATCCTGATGGTGTAGCCCAGATCAACATGGC CTCGCCGGAGGAAGCTGACCTCGTCATTCAAATGATGCAGGGGCGTTATTTTGGACAGCGGCAACTAAGTGCGG AGGCCTGGGATGGCAAGACCAAATACAAAATTGAGGAATCAGCTGTCGAGGCGCATGAACGGCTTTCCAAATGG GATGAATTCTTGGCAGAAG AAGAAACCG >eIF3ga GAGGTGGAGCTCGACTATGGTGGACTACCTCCGACGACGGAGACGGTGGAGAACGGACAGAAGTACGTGACGGA GTACAAGTACAACAAGGACGACAAGAAGACGAAGGTGGTGCGCACGTACAAGATATCCAAGCAGGTGGTGCCCA AGACGGTGGCCAAGCGACGCACCTGGACGAAGTTCGGCGACTCGAAGAACGACAAGCCCGGCCCCAACTCGCAG ACGACCATGGTGTCCGAGGAGATCATCATGCAGTTCCTCAACTCCAAGGAGGACGAGAAGGCCAACGATCCGCT GCTAGATCCCACCAAGAATATTGCCAAGTGCCG >RnpS1 ATTCATGTCGGTCGGCTTACCCGCAACGTTACCAAGGACCATGTGTTCGAGATATTTAGCAGCTTTGGGGATGT GAAGAATGTGGAGTTTCCCGTAGATCGTTTTCATCCTAACTTCGGACGCGGCGTGGCGTTTGTGGAATATGCCA CACCCGAGGATTGTGAGTCGGCCATGAAGCATATGGATGGCGGGCAGATAGATGGCCAGGAGATTACGGTATCC CCGGTTGTCTTAGTAAAACAGAGGCCGCCCATGCGTCGTCCTTCGCCACCGATGCGCCGTCCGCAAAACAACCG CTGGCGATCCCCACCCCAGTTCAATAGGTTCAACAATCGTGGAGG >Ref1 24 Brooks et al. 
AACAGCGCTTGGAAGCACGATATGTACGACGGACCGAAGAGGGGTGCCGTCGGTGGAGGATCTGGACCCACCCG CCTCATCGTCGGTAACCTGGACTACGGCGTATCCAACACGGACATCAAGGAGCTCTTCAACGACTTTGGTCCGA TAAAGAAGGCGGCAGTGCACTACGATCGCTCCGGTCGCTCGTTGGGCACCGCTGACGTGATTTTCGAACGTCGC GCCGACGCCTTGAAGGCCATTAAACAGTACCATGGCGTACCTTTGGACGGACGCCCTATGACCATTCAGCTGGC CGTCTCAGACGTGGCCGTGTTGACCCGTCCCGTAGCCGCCACCGATGTCAAGCGTCGCGTGGGTGGTACTGCAC CAACTTCATTCAA >tsu CCGATGTGTTGGACATTGACAATGCGGAGGAGTTCGAGGTGGACGAGGACGGTGACCAGGGCATTGTGCGCCTG AAGGAAAAGGCGAAGCACCGCAAGGGACGCGGATTTGGAAGCGACAGTAACACCCGAGAGGCGATCCACAGCTA CGAGCGTGTGCGCAACGAGGACGACGATGAGCTGGAACCTGGTCCACAAAGGTCCGTCGAGGGCTGGATACTGT TTGTCACCTCTATCCATGAGGAGGCGCAGGAGGACGAGATTCAGGAAAAGTTCTGCGATTACGGAGAAATCAAG AACATTCACCTGAACCTCGACCGGCGTACTGGGTTCTCAAAGGGATACGCTCTCG >shep ATGGATTCCGGGTTACATGATGACTCAGGTAGATGATCAGACTTCGTATTCTCCACAGTACATGCAGATGGCAG CTGCCCCTCCGCTGGGAGTAACCTCATACAAACCGGAGGCGGTTAACCAGGTGCAGCCCCGTGGCATCTCGATG ATGGTTAGCGGTGATACGGGCGTGCCATATGGAACAATGATGCCTCAGTTGGCCACCCTGCAGATTGGCAACTC TTATATTAGTCCAACTTATCCATATTATGCACCACCACCAACTATTATACCAACAATGCCAATGACAGATTCCG AACAGGCTAGCA >snRNP-U1-70K TTTAAGACGAGGAACTTCAGGAAAAGGTAAAACAAAACAAAAAAGCCCACAAAATGACCCAATATCTGCCGCCG AATCTGCTGGCGCTGTTCGCGGCACGGGAGCCCATCCCGTTCATGCCGCCGGTGGACAAGCTGCCGCACGAGAA GAAGTCTCGCGGCTACCTGGGAGTGGCCAAGTTCATGGCCGATTTCGAGGATCCCAAGGACACGCCGCTGCCGA AAACGGTGGAAACGCGTCAGGAGCGGCTGGAGCGACGCCGGCGCGAGAAGGCCGAGCAAGTGGCCTACAAGCTG GAGCGTGAGATAGCGCTGTGGGACCCCACAGAGATCAAAAATGCCACGGAGGACCCGTTTCGCACGCTGTTCAT TGCACGCATCAACTACGACACGTCCGAGTCGAAGCTGCGGCGTGAGTTCGAGTTCTACGGGCCCATCAAGAAGA TCGTCCTGATCCACGACCAGGAATCAGGTAAACCCAAGGGCTACGCCTTCATCGAGTACGAGCA >Caper 25 Brooks et al. 
ACACGCAGGCTGAGAAGAATCGTCTCCAGAATGCAGCGCCGGCATTCCAACCGAAGAGTCACACGGGTCCCATG CGCCTCTACGTGGGATCACTGCACTTCAACATTACCGAGGACATGCTGCGGGGCATATTCGAGCCCTTTGGCAA GATCGATGCCATTCAACTGATCATGGATACGGAGACGGGCCGATCCAAGGGCTACGGCTTTATCACGTACCACA ATGCTGACGATGCCAAAAAGGCTCTGGAACAGCTGAACGGCTTTGAACTGGCCGGTCGGCTCATGAAAGTGGGC AATGTGACGGAGCGACTGGACATGAATACCACCTCGCTGGACAC >nonA-l AACTGATGACGACCTACGGGAGATGTTCAAGCCATATGGCGAGATCGGCGATATATTCTCGAACCCGGAGAAGA ACTTTACATTCCTGAGGCTAGACTACTACCAAAATGCTGAGAAGGCCAAACGCGCTTTAGATGGCTCCTTGCGC AAGGGACGAGTGCTGCGTGTCCGCTTTGCGCCCAACGCCATTGTGCGTGTGACTAATCTCAACCAGTTCGTGTC CAACGAGCTGCTGCACCAGTCCTTTGAGATCTTTGGACCCATCGAGCGCGCCGTTATCTGCGTAGACGATCGCG GTAAGCATACCGGCGAAGGCATTGTTGAGTTCGCCAAGAAGTCCTCGGCCAGCGCCTGTCTGCGCCTGTGCAAC GAAAAATGCTTCTTCTTGACTGCTTCATTGCGTCCGTGTCTGGTGGAACCGATGGAGGTGAACAACGACAATGA CG >Hrb27C TGAGCACGTGACCAACGAGCGGTACATCAATCTGAATGGCAAGCAGGTCGAAATCAAGAAGGCCGAGCCTCGTG ATGGATCTGGCGGCCAAAACTCCAACAACAGTACCGTGGGAGGCGCCTATGGCAAGCTTGGTAACGAGTGCAGC CACTGGGGACCGCACCATGCTCCCATCAACATGATGCAGGGCCAGAATGGCCAGATGGGTGGACCGCCGCTGAA TATGCCCATTGGAGCGCCGAATATGATGCCTGGCTATCAGGGTTGGGGCACCTCGCCGCAGCAGCAACAATACG GCTACGGCAACAGTGGCCCAGGATCGTACCAGGGATGGGGAGCTCCACCAGGACCCCAGGGACCACCACCGCAG TGGTCGAACTACGCTGGACCTCAGCAGACGCAGGGCTACGGCGGATACGACATGTATAACTCGACGTCGACCGG AGCTCCTTCGGGACCATCGGGCGGCGGCAGCTGGAACTCGTGGAACATGCCACCTA >Sxl TAATCTCTGCGGATTGTCGCTGGGCAGCGGTGGTAGTGATGATCTCATGAACGATCCTCGGGCAAGCAACACCA ACCTGATTGTCAACTACTTGCCCCAGGACATGACCGATCGCGAGCTGTACGCCCTATTCAGAGCCATTGGACCC ATCAACACGTGCAGAATCATGCGAGACTATAAGACTGGCTACAGTTTTGGTTATGCTTTCGTGGACTTCACATC GGAAATGGACTCGCAGCGTGCTATTAAAGTGCTGAATGGCATCACAGTGCGCAACAAGCGGCTTAAGGTTTCCT 26 Brooks et al. 
ATGCACGTCCCGGCGGAGAATCGATCAAGGACACCAATCTGTATGTGACCAATCTGCCGCGTACCATAACCGAC GATCAGCTGGACACGATCTTCGGCAAGTACGGTTCCATTGTGCAGA >Rox8 CCGGTGTAAAGGGAAGTCAACGCCACACCTTCGAGGAAGTGTATAACCAGTCGAGCCCCACCAACACCACCGTA TACTGTGGCGGATTCCCGCCGAATGTCATCAGTGACGACCTGATGCACAAGCACTTCGTCCAGTTTGGTCCCAT CCAGGACGTGCGGGTCTTCAAGGACAAGGGCTTCTCGTTCATCAAGTTTGTTACCAAGGAGGCAGCCGCCCACG CCATCGAGCACACGCACAACAGCGAGGTACATGGAAACCTGGTAAAGTGCTTCTGGGGCAAAGAGAACGGAGGC GATAACTCGGCCAATAACCTCAATGCCGCCGCTGCCGCGGCAGCAGCCTCTGCCAATGTTGCCGCCGTTGCGGC AGCCAATGCTGCGGTTGCCGCTGGAGCGGGTATGCCCGGTCAGATGATGACGCAGCAACAG >bol CTGATCTAACCCGCGTCTTCAGCGCCTATGGCACGGTAAAGAGCACCAAAATCATCGTGGATCGAGCAGGTGTG AGCAAGGGCTACGGATTCGTCACCTTCGAGACGGAGCAGGAGGCGCAAAGACTGCAAGCGGATGGTGAATGCGT GGTACTAAGAGATCGGAAGCTGAACATTGCACCGGCCATCAAAAAGCAGCCCAATCCTCTGCAGTCAATTGTGG CCACAAACGGAGCCGTCTACTATACCACCACGCCGCCGGCACCGATCAGCAATATACCCATGGATCAGTTCGCA GCCGCTGTATATCCGCCAGCCGCTGGAGTGCCAGCCATCTACCCACCTTCAGCCATGCAATATCAGCCATTCTA TCAGTACTACAGTGTGCCAATGAATGTACCCACCATTTGGC >Rnp4F CAGGAGGAGGAGCACAAGTCGGAGGAGCTGCGCCAACGATCGCGCCCAACCTGGCCACCGTCGTCCGCCGGCGG GGATATGACCACCATTGAGTTGATCTCATCGGACGACGAGCCGTCAGTGGAGGAGACTGAGGGAGGCAATGCCG CTGGCCGTGGCAGAGCGCGCAATGATTCCAGCAGCAGTAGCGATGATGTGGGCGTGATCGAAGGCTCGGAATTG GAATCGAACAGTGAGGTGTCCAGTGACAGTGACAGTGATAGCGACAACGCTGGCGGCGGAAATCAGCTAGAGCG CTCGTATCAGGAGCTGAATGCGTTGCCCAGCAAAAAGTTTGCCCAAATGGTCTCGCTCATTGGAATCGCATTCA AA >elav TCGGGATCGCAAAATGGCAGCAACGGCAGCACGGAGACGCGCACAAACCTTATTGTCAACTACTTGCCGCAAAC AATGACCGAAGACGAGATCCGTTCGCTCTTCTCCAGCGTCGGCGAGATTGAGTCGGTGAAGCTGATACGCGACA AGTCGCAGGTCTACATCGATCCTCTCAATCCGCAGGCGCCCAGCAAGGGCCAAAGTCTGGGCTACGGCTTTGTT 27 Brooks et al. 
AACTATGTCCGGCCGCAAGATGCCGAGCAGGCTGTTAATGTTCTAAACGGCCTGCGACTGCAGAACAAAACCAT AAAGGTGTCGTTTGCCCGCCCGTCGTCCGATGCCATTAAAGGCGCCAACCTTTATGTGTCGGGGCTGCCAAAGA CGATGACCCAGCAGGAACTGGAGGCCATCTTCGCACCATTCGGAGCAATAATCACATCGCGCATTCTGCAGAAC GCTGGCAACGATACGCAGACGAAAG >pea ATGGACGAGCTGCAGAAGTTGGAGTACCTTTCGCTGGTCTCGAAGATTTGCACTGAGCTAGACAACCACTTGGG CATCAACGACAAGGACCTGGCCGAGTTTATCATCGATTTAGAAAACAAAAATCGCACATATGACACATTTCGCA AGGCTTTGCTGGATAATGGCGCCGAATTCCCAGACTCCCTGGTCCAGAACCTGCAGCGCATCATTAATCTTATG CGCCCCAGCAGACCTGGCGGCGCTAGCCAGGAGAAAACTGTCGGCGACAAGAAGGAAGACAAGAAATCGCAACT TTTGAAAATGTTTCCCGGCCTCGCTTTGCCCAATGACACCTACA >CG1646 AATATAATCCGGGCAGTCCCACATCTGAGAGCAACGACGCACAGCCCTCAGAGAAAAAACTCAAGGTCGAAGAA TCGGAGCCCAAGGAGAAAAAGAAGGAAAAGGAGCGCGATAAAGATAAGGAGAAGGATAAGGACAAAGATAATAA TAAGGATAAGGAAAAGGAGCGAAAGAAGCTGCCGGACCTAGATAAGTACTGGAGAGCTGTCAAAGAAGACTCCA CCGACTTCACCGGCTGGACGTACTTGCTGCAATATGTTGACAATGAGTCTGATGCGGAGGCGGCGCGCGAGGCC TACGACACATTCCTGTCCCACTATCCTTACTGCTACGGATATTGGCGCAAGTATGCCGACTACGAGAAGCGCAA GGGCATCAAGGCAAACTGCTATAAGGTGTTTGAGCGCGGACTGGAGGCGATTCCGCTGTCCGTGGATCTGTGGA TCCACTACCTAATGCACGTTAAGTCCAATCACGGAGATGATGA >CG6227 GTGCCCAACCACTACGAGGACTATGTTCACAGATGTGGTCGCACCGGTCGAGCGGGCAAAAAGGGCAGCGCCTA CACGTTTATCACACCGGAGCAATCGCGCTATGCCGGCGACATTATCCGCGCCATGGACCTATCAGGCACACTGA TTCCCGCCGAGCTGCAGGCACTGTGGACGGAGTATAAGGCGCTCCAGGAGGCCGAGGGCAAGACGGTGCACACG GGCGGCGGCTTTAGCGGCAAGGGCTTCAAGTTCGACGAGCAGGAGTTCAATGCCGCCAAGGAGAGCAAGAAGCT GCAGAAGGCGGCCTTGGGACTGGCCGATTCCGATGATGAGGAGGA >CG6841 ATGCCCTCCAAATATTTCCCTCGAAGAAAAGCATCTGGTTGCGAGCCGCCTACTTTGAAAAGAACCATGGCACC CGCGAATCTTTGGAGGCCCTGTTGCAGCGAGCCGTGGCTCATTGTCCTAAATCGGAGATTCTCTGGCTGATGGG 28 Brooks et al. 
GGCCAAATCCAAATGGATGGCTGGAGACGTTCCAGCCGCGAGAGGCATTTTGTCCTTGGCTTTCCAGGCCAATC CCAATTCCGAGGACATTTGGTTGGCTGCCGTTAAGTTGGAATCAGAGAACTCGGAATATGAGCGGGCGAGACGC TTGTTAGCCAAGGCTAGAGGATCGGCACCGACACCAAGGGTGATGATGAAATCAGCTCGCCTGGAATGGGCTTT GGAAAAGTTCGACGAAGCT >Rm62 CATCTACGACACCAGCGAGAGCCCCGGCAAGATTATCATATTCGTGGAGACAAAGCGACGCGTGGACAACCTGG TGCGCTTCATCCGCAGCTTCGGAGTCCGTTGTGGAGCTATTCACGGTGACAAGTCGCAATCAGAACGAGACTTT GTGCTCCGTGAGTTCCGCTCGGGCAAGTCCAACATTCTGGTGGCCACCGATGTGGCGGCCCGTGGACTAGACGT GGACGGCATCAAGTATGTCATCAACTTTGACTACCCGCAAAACAGCGAGGACTACATCCATCGCATCGGTCGCA CAGGACGATCCAACACAAAGGGCACCTCTTTCGCCTTCTTCACCAAG >hay GATCACGGAAATCGACCACTTTGGGTTGCGCCCAATGGTCACGTCTTCCTGGAATCATTCTCGCCCGTCTATAA GCATGCCCACGATTTTCTTATCGCCATTTCGGAGCCCGTCTGCCGACCCGAACACATTCACGAGTACAAACTTA CCGCATACAGTTTATATGCCGCCGTTTCGGTGGGACTGCAAACCCATGACATTGTGGAATACTTGAAGAGATTG AGCAAGACCAGCATTCCCGAAGGCATCCTTGAGTTTATACGACTCTGCACCCTATCCTATGGCAAGGTCAAGCT GGTCTTGAAGC >qkr54B AACTGCTGGAAGGCGAGATAGAAAAGGTCCAGACCACAGGAAGGATTCCTTCCAGAGAGCAAAAGTATGCCGAT ATCTATAGAGAGAAGCCGCTGCGGATCTCGCAACGTGTTTTAGTTCCCATTAGAGAACATCCCAAGTTCAACTT CGTTGGAAAACTGCTGGGGCCCAAGGGCAACTCCCTTCGCCGCCTTCAGGAGGAGACCCTTTGCAAGATGACCG TCCTGGGCCGCAACTCTATGCGCGATCGAGTCAAAGAAGAGGAATTGCGCAGCTCCAAGGATCCCAAGTACGCT CACCTCAACAGCGATCTGCATGT >qkr58E-3 AACGACGAGGTTTCACACGAACAGCTGCGCGAGCTGATGGAAATGGATCCCGAGTCAGCCAAAAACATTCACGG ACCGAATCTGGAGGCCTACAGATCTGTCTTCGACAAGAAGTTTGGAGGCAACAGCAATGGGGCTCCCAAATACA TCAACCTGATTAAGAGAGCTGCGGAAAATCCGCCCGAAGTCGACGATGTGGAGGAGGTGGCCTATGAGTATGAA 29 Brooks et al. CATCGTATGCCCCCCAAGCGTCCGCCTACGGGCTATGAGTACAGCAAACCACGTCCATCAATAATACCGACAAA CGCAGCGGCATATAAACGTCCATATCCGACTGACATG RNA-seq RNA-seq libraries were prepared from 10 µg of total RNA from each sample using the Illumina mRNA-seq library preparation kits as described by the manufacturer. Libraries were sequenced on an Illumina GAIIx using single reads of 75 or 76 bp in length. Each knockdown was performed in biological duplicate. 
After quality control analysis of the correlation of junction and exon read counts, duplicate samples were combined.

RNA-seq alignment strategy

RNA-seq reads were aligned with Bowtie (Langmead et al. 2009) against a reference sequence consisting of the genome, annotated splice junctions, and unannotated splice junctions. All RNA-seq reads were trimmed to 75 bp for consistency. Splice junction sequences were formed by joining 69 bp of exon sequence from each side of the splice junction, so that any 75 bp read aligned to a junction sequence overlaps each exon by at least 6 bp (75 - 69 = 6). Novel splice junctions included all novel exon-exon combinations within the same gene and between different genes with splice sites within 2 kb of each other. No length restriction was applied to novel junctions within the same gene. Additional novel junctions were derived by pairing an annotated splice site with an unannotated splice site (GT or AG dinucleotide) within 2 kb.

To remove potential false positive novel junctions, each unannotated splice junction was given a Shannon-entropy score as previously described (Graveley et al. 2011). Any novel junction with an entropy score ≥ 3 in at least one of the 57 samples (56 RNAi samples + 1 untreated) was used for further splicing analysis.

Identifying potential RNAi off-target effects

To identify cases in which the dsRNAs used to deplete the target transcripts encoding RNA binding proteins may have had unintended targets, we first constructed a Bowtie index of the MDv1 transcriptome annotation (Graveley et al. 2011). We then generated the set of all possible 20 nt dsRNA fragments by sliding a 20 nt window over each dsRNA sequence and used Bowtie (with options --all -y -v 2) to align the fragments to the transcriptome index. There appear to be potential off-target effects on other RNA binding proteins in two cases.
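As an illustration of the fragment-generation step described above, the following Python sketch enumerates every 20 nt window of a dsRNA sequence and formats the windows as FASTA records that could serve as alignment queries. The function names, the record-naming scheme, and the toy sequence are assumptions made for this example; they are not taken from the original analysis scripts.

```python
from typing import Iterator, Tuple


def sliding_fragments(seq: str, window: int = 20) -> Iterator[Tuple[int, str]]:
    """Yield (0-based offset, fragment) for every full-length window in seq."""
    # Remove whitespace left over from line wrapping and normalise case.
    seq = "".join(seq.split()).upper()
    # A sequence shorter than the window yields no fragments.
    for start in range(len(seq) - window + 1):
        yield start, seq[start:start + window]


def fragments_to_fasta(name: str, seq: str, window: int = 20) -> str:
    """Format all fragments of one dsRNA as FASTA records named <name>_<offset>."""
    records = [
        f">{name}_{start}\n{frag}"
        for start, frag in sliding_fragments(seq, window)
    ]
    return "\n".join(records) + ("\n" if records else "")


# Toy input (hypothetical, not one of the dsRNA sequences listed above).
toy = "ACGTACGTACGTACGTACGTACG"  # 23 nt, so 23 - 20 + 1 = 4 windows
fasta = fragments_to_fasta("toy", toy)
```

A dsRNA of length L yields L - 19 fragments of 20 nt. Keeping the offset in each record name makes it possible to trace any aligned fragment back to its position in the dsRNA.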
First, we identified multiple dsRNA fragments from the msi dsRNA that aligned to the Hrb87F (hrp36) mRNA and observed a significant decrease in Hrb87F gene expression in the msi RNAi sample (Supplemental Figure 4). We therefore excluded the msi RNAi experiments from our analysis, but the raw data remain available from GEO and the modENCODE data repository. Second, 70 of the 20 nt fragments from the Hrb87F (hrp36) dsRNA align to the Hrb98DE (hrp38) gene, and there is a ~2.2-fold change in the levels of Hrb98DE (hrp38) in the Hrb87F (hrp36) RNAi sample. Because the impact on the potential off-target is smaller than in the msi case, we kept this sample in the analysis, although off-target effects may account for some of the effects observed in the Hrb87F (hrp36) RNAi samples.

Supplemental References

Boyle AP, Araya CL, Brdlik C, Cayting P, Cheng C, Cheng Y, Gardner K, Hillier L, Janette J, Jiang L et al. 2013. Comparative analysis of regulatory information and circuits across diverse species. Nature: Submitted.
Brooks AN, Hansen KD, Hundal A, Dudoit S, Meyerson M, Brenner SE. 2013. JuncBASE: a junction-based analysis of splicing events from RNA-seq data. Submitted.
Brooks AN, Yang L, Duff MO, Hansen KD, Park JW, Dudoit S, Brenner SE, Graveley BR. 2011. Conservation of an RNA regulatory map between Drosophila and mammals. Genome Res 21(2): 193-202.
Cherbas L, Willingham A, Zhang D, Yang L, Zou Y, Eads BD, Carlson JW, Landolin JM, Kapranov P, Dumais J et al. 2011. The transcriptional diversity of 25 Drosophila cell lines. Genome Res 21(2): 301-314.
Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K et al. 2010. The Pfam protein families database. Nucleic Acids Res 38(Database issue): D211-222.
Gattiker A, Gasteiger E, Bairoch A. 2002. ScanProsite: a reference implementation of a PROSITE scanning tool. Applied Bioinformatics 1(2): 107-108.
Graveley BR, Brooks AN, Carlson JW, Duff MO, Landolin JM, Yang L, Artieri CG, van Baren MJ, Boley N, Booth BW et al. 2011. The developmental transcriptome of Drosophila melanogaster. Nature 471: 473-479.
Hulo N, Bairoch A, Bulliard V, Cerutti L, De Castro E, Langendijk-Genevaux PS, Pagni M, Sigrist CJ. 2006. The PROSITE database. Nucleic Acids Res 34(Database issue): D227-230.
Jain E, Bairoch A, Duvaud S, Phan I, Redaschi N, Suzek BE, Martin MJ, McGarvey P, Gasteiger E. 2009. Infrastructure for the life sciences: design and implementation of the UniProt website. BMC Bioinformatics 10: 136.
Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3): R25.
Letunic I, Doerks T, Bork P. 2009. SMART 6: recent updates and new developments. Nucleic Acids Res 37(Database issue): D229-232.
Negre N, Brown CD, Ma L, Bristow CA, Miller SW, Wagner U, Kheradpour P, Eaton ML, Loriaux P, Sealfon R et al. 2011. A cis-regulatory map of the Drosophila genome. Nature 471(7339): 527-531.
Park JW, Graveley BR. 2005. Use of RNA interference to dissect the roles of trans-acting factors in alternative pre-mRNA splicing. Methods 37(4): 341-344.
Park JW, Parisky K, Celotto AM, Reenan RA, Graveley BR. 2004. Identification of alternative splicing regulators by RNA interference in Drosophila. Proc Natl Acad Sci USA 101(45): 15974-15979.
Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. 2010. Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28(5): 511-515.