The Complete Guide to RNA Sequencing: Everything You Ever Wanted to Know for Successful RNA Sequencing Based Research
Faye Wang, BGI Genomics

CONTENTS
01 RNA Sequencing Overview
02 QC principles of RNA-seq
03 mRNA sequencing
04 ncRNA sequencing
05 Data analysis

RNA classification: Coding RNA vs. non-coding RNA
RNA transcripts
• Coding: mRNA
• Non-coding RNA, 'Housekeeping' RNAs: rRNA, tRNA, snRNA/snoRNA, telomerase RNA
• Non-coding RNA, Regulatory RNAs: LncRNA (>200nt); miRNA, siRNA, piRNA (<50nt)

Overview of RNA-seq Applications
Transcriptome sequencing is used to reveal the presence, quantity and structure of RNA in a biological sample under specific conditions.
Application Fields
• Diagnosis Biomarker Identification
• Plant & Animal Development Research
• Plant & Animal Stress Resistance Research
• Plant-Fungal Interaction Research
• Target Drug Development
• Pathogenic Regulatory Mechanism Research

RNA sequencing and COVID-19
➢ Fast way to generate data
➢ Cost-effective way to get useful data
➢ Covers a broad range of research fields
➢ Mature association analysis solutions to combine with other technologies

Standard RNA Sequencing Project Workflow at BGI
Clients, QC, Delivery
• Total RNA samples
• Tissue or cell line samples for total RNA extraction
• Several protocols for different types of samples
• DNBSEQ platform
• PE100/PE150/SE50
• Standard analysis pipelines
• Customized analysis
• Dr. Tom system delivery

02 QC principles of RNA-seq

Frequently Asked Questions
Q1: How do I understand the Quality Control Report?
Q2: The NanoDrop QC result from my side looks OK. Why am I informed that the RNA sample has not passed QC?
Q3: There are so many kinds of RNA libraries on the market. Can one protocol fulfill all RNA sample types?
Q4: What data can I expect from 50 ng of input total RNA? Is there any influence on the mapping rate, quality or number of detected genes?
QC method for RNA samples
Quality indicators: Concentration, Purity, Integrity
Method: NanoDrop, Gel electrophoresis, Agilent 2100/Fragment Analyzer

BGI QC method for RNA samples
Gel electrophoresis
• 28S
• 18S
• 5S
*Human sample
NanoDrop
• OD260/OD280
• OD260/OD230
Agilent 2100
• RNA Integrity Number (RIN)
• Baseline
• Peaks

RNA sample QC indicators
Example of an electropherogram with an RNA integrity number (RIN) of 10. Four peaks are indicated: the marker, 5S (small RNA), and 18S and 28S (ribosomal RNA) (Norhazlin et al, 2015)

Example of RNA QC result
• Intact RNA, RIN=10
• Partially degraded, RIN=5
• Strongly degraded, RIN=3
• Cell line total RNA, RIN=10
• Cartilage total RNA, RIN=9.8
• Hydrothorax total RNA, RIN=9.5
• Muscle total RNA, RIN=8.5
Specific conditions for lower RIN
• Urine total RNA, RIN=2.7
• Sperm total RNA, RIN=2.4
• Placenta total RNA, RIN=2.1

Example of RNA QC result from sample types without RIN
• Algae total RNA (RIN=N/A)
• Rice root total RNA (RIN=N/A)
• Bee total RNA (RIN=N/A)
• Gibberella zeae total RNA (RIN=N/A)

Example of QC result from specific conditions
• Plant leaf total RNA
• Prokaryotic vs eukaryotic total RNA mixture
• Prokaryotic total RNA

03 mRNA sequencing

RNA sample types
Tissues, Plasma, Buffy coat, Erythrocytes, Cell lines, FFPE

Cell line/tissue based mRNA library preparation
Cell lines: ≥2×10^5 cells; Tissue: ≥30 mg
Cell lines or tissues, then total RNA, then Agilent 2100 QC report (RIN>7)
➢ Specifications:
• Enough total RNA: >200 ng
• Good quality: RIN>7
➢ Protocol
• PolyA+ selection based mRNA enrichment

Cell line/tissue based mRNA library preparation
Reference electropherograms: Intact RNA, RIN=10; Partially degraded, RIN=5; Strongly degraded, RIN=3
➢ Specifications:
• Enough total RNA: >200 ng
• RNA degraded: RIN<5
➢ Protocol
• rRNA depletion

Low input mRNA library preparation
➢ Sample type: UHRR
➢ Library type: PolyA+ selection
➢ Sequencing: DNBSEQ PE150, 30M reads

Protocol/Input | rRNA rate | Duplication rate | Gene detection | Gene expression correlation | qPCR correlation | Alternative splicing
PolyA+ selection (20 ng to 200 ng) | √ | ● | √ | √ | √ | ●

● (Duplication rate): The duplication rate increased as the input decreased, but remained very low (duplication rate <5%).
● (Alternative splicing): The number of alternative splicing events detected decreased slightly as the input decreased.
➢ Specifications:
• 20 ng < total RNA < 200 ng
• Good quality: RIN>7
➢ Protocol
• PolyA+ selection based mRNA enrichment
*In brief, if the total RNA quality is good (RIN>7), qualified data can still be obtained from lower input.

Whole blood based mRNA library preparation
Whole blood fractions: Plasma, Buffy coat, Erythrocytes
Figure label: human β-globin mRNA
➢ About globin mRNA
• Expressed at high levels in red blood cells (RBCs) and reticulocytes (RBC precursors).
• Up to 70% of the mRNA (by mass) in whole blood total RNA is globin transcripts.
➢ Specifications:
• Total RNA extracted from whole blood (H/M/R).
• Total RNA: >500 ng
• Good quality: RIN>7
➢ Protocol
• PolyA+ selection based mRNA enrichment
• Globin mRNA removal is needed.

Whole blood based mRNA library preparation
Whole blood fractions: Plasma, Buffy coat, Erythrocytes
• Human, non-neonate (contains α and β globin mRNA)
• Human, neonate (contains α, β and γ globin mRNA)
• Mouse/Rat
• Other mammals
Tips: Different globin removal kits cover different species. Even for human whole blood samples, the protocols for neonates and non-neonates can differ.
Want to avoid globin mRNA removal altogether? Extract the total RNA from the buffy coat cells.
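The input and RIN thresholds listed so far for cell line, tissue and whole blood samples can be read as a small decision rule. The sketch below is only an illustration of that reading, not BGI's actual intake logic: the function name `suggest_mrna_protocol` and the sample-type labels are hypothetical, and the treatment of exactly 200 ng (which the slides leave open) is an assumption.

```python
from typing import Optional


def suggest_mrna_protocol(sample_type: str, total_rna_ng: float, rin: float) -> Optional[str]:
    """Return the enrichment protocol listed on the slides, or None if not covered.

    sample_type is a hypothetical label: "cell_line", "tissue" or "whole_blood".
    """
    if sample_type == "whole_blood":
        # Whole blood: >500 ng total RNA and RIN > 7, with globin mRNA removal.
        if total_rna_ng > 500 and rin > 7:
            return "Globin mRNA removal + PolyA+ selection"
        return None

    if sample_type in ("cell_line", "tissue"):
        # Good quality RNA: standard input (>200 ng) or low input (20-200 ng).
        # Assumption: exactly 200 ng is treated like the neighbouring ranges.
        if rin > 7 and total_rna_ng > 20:
            return "PolyA+ selection"
        # Degraded RNA (RIN < 5) with enough material (>200 ng).
        if rin < 5 and total_rna_ng > 200:
            return "rRNA depletion"

    # Anything else (for example RIN between 5 and 7) is not specified on the slides.
    return None


if __name__ == "__main__":
    examples = [
        ("tissue", 500, 9.0),
        ("cell_line", 50, 8.5),
        ("tissue", 300, 4.0),
        ("whole_blood", 600, 8.0),
        ("tissue", 300, 6.0),
    ]
    for sample in examples:
        print(sample, "->", suggest_mrna_protocol(*sample))
```

Returning None for combinations the slides do not cover keeps the sketch from implying a recommendation the source never makes; in practice such samples would be discussed case by case.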
FFPE based mRNA library preparation
rRNA depletion RNA-Seq: DNase I treated & purified total RNA, rRNA depletion, RNA fragmentation, cDNA synthesis, adapter ligation & PCR, sequencing
Exome Capture RNA-seq: DNase I treated & purified total RNA, RNA fragmentation, cDNA synthesis, adapter ligation & PCR, hybridization & capture, clean up & PCR, sequencing
➢ Specifications:
• Total RNA extracted from FFPE samples
• Total RNA: >100 ng
• Quality: RIN>2; DV200≥30%
➢ Protocol
• rRNA depletion or exome capture based mRNA enrichment

Summary: Library protocols for mRNA
No. | Sample type | Total RNA | RIN | Protocol
1 | Cell line or tissue | 20 ng < m < 200 ng | >7 | Poly A selection
2 | Cell line or tissue | m > 200 ng | <5 | rRNA depletion
3 | Whole blood | m > 500 ng | >7 | Globin mRNA removal + Poly A selection
4 | Buffy coat cells from whole blood | as No. 1 / as No. 2 | as No. 1 / as No. 2 | Poly A selection / rRNA depletion
5 | FFPE slides | >100 ng | >2 | rRNA depletion / exon capture enrichment

DNBSEQ Transcriptome Experimental workflow
• Total RNA
• mRNA enrichment and fragmentation
• Reverse transcription with N6 primer
• Second-strand cDNA synthesis
• End-repair
• Adaptor ligation
• cDNA amplification & heat separation
• Single-strand circularization (ss Circ)
• Rolling circle amplification
• DNA nanoball generation

High Correlation Between DNBSEQ and qPCR
Correlation with qPCR (Human): UHRR-1, UHRR-2
High correlation with qPCR* results (UHRR-1 vs qPCR, Spearman value >0.88; UHRR-2 vs qPCR, Spearman value >0.88)

RNA sequencing Vs.
Research aims
RNA Sequencing Options: RNA (Transcriptome), Small RNA, LncRNA
• Gene Expression ● ●
• Alternative Splicing ● ●
• Gene Fusion ● ●
• Novel Transcript ● ●
• Small RNA, LncRNA ● ●

04 ncRNA sequencing

Application of LncRNA-seq
◆ Application
• Study the mechanism of pathogenesis regulation
• Discover potential biomarkers and target sites
• LncRNA profiling of plants and animals
• Regulation of plant and animal growth and development
Figure: lncRNA regulation. Run-Wen Yao, et al. Cellular functions of long noncoding RNAs. Nature Cell Biology 21, 542–551 (2019)

Experimental workflow of LncRNA-seq at BGI
➢ Sample type: Total RNA with good quality, RIN>7
➢ Species: Human/mouse/rat
➢ Sample amount: total RNA >200 ng; concentration: 40-2500 ng/μL; >10G PE150 data
Workflow:
• Total RNA
• RNA fragmentation and rRNA removal
• Reverse transcription with N6 primer
• Add dUTP to second cDNA strand
• End-repair
• Adaptor ligation
• PCR (add UDG) & heat separation
• Single-strand circularization (ss Circ)
• Rolling circle amplification
Get mRNA and lncRNA in one shot

Higher correlation rate of dUTP library
• Left bars: pooled reference; middle bars: the control library; right bars: Agilent microarrays
• The dUTP library had the best correlation and lowest RMSE relative to all three references
• MA plots showed an excellent linear relation between the dUTP library and the control library across a broad range of values
Levin et al., 2010, Nat Methods

Small RNA classification

miRNA sequencing application in exosome research
Exosome research: miRNAs
• Cancer research
• Rare disease
• Signaling pathway
• Biomarker identification
• …
https://www.novusbio.com/research-areas/cell-biology/Exosome-research-tools

UMI microRNA research workflow
➢ Total RNA
• Small RNA: ≥20 ng
• Total RNA: ≥1 μg
➢ Low input protocol
• Exosome RNA: ≥1 ng
Sequencing: SE50
Unique molecular identifiers (UMI)
Read structure: Reads, UMI
To quantify gene/transcript expression levels by eliminating the
PCR bias with UMIs.

Higher correlation compared to non-UMI
➢ Sample type: Human exosome
➢ Library type: UMI small RNA
➢ Sequencing: SE50
➢ Data amount: 20M reads
Compared: UMI library, normal library

05 Data analysis

An overview of RNA-seq analysis
• Gene info (sequence/ref)
• Experiment
• Previous study
• Function: GO, KEGG, GSEA etc.
• SNP/InDel
• Promoter, Transposon
• Structure: RNA splicing, UTR length
• Gene Expression: DEGs (Phenotype)
• Regulation: lncRNA regulation, miRNA regulation (activate/inhibit)
• Interaction: PPI, KDA
• Clustering: Heatmap, time series
• Co-expression

Common Analysis Challenges
Q1: I've finally got my result, but I would like to do the enrichment in MSigDB rather than the KEGG or GO databases.
Q2: There are thousands of DEGs in my result! How can I filter genes that are related to my research?
Q3: I'm not able to do bioinformatics analysis myself, but I would like to do further data mining based on the analysis result.
Q4: Is there a platform that integrates common tools that could be used for plotting?
Q5: Can I compare my results to external databases (e.g. TCGA/ARCHS4)?

Features of Dr. Tom system
Dr. Tom small tools overview
• Clustering heatmap
• Association clustering
• Pathway enrichment
• Association network
• KDA
• Pathway classification
• Multi-omics association
• Line chart
• Chi-square test

BGI RNA-seq & Dr. Tom user case study
• In AML, most 11q23 translocations lead to fusion proteins involving the mixed lineage leukemia (MLL) gene
• Chidamide, a novel histone deacetylase (HDAC) inhibitor that inhibits HDAC1-3
• Menin-MLL interaction inhibitor MI-3, which acts to inhibit transcription of MLL target genes
Study design: MOLM-13 and MV4-11 cells exposed to chidamide, MI-3, or both in combination
• Apoptosis analysis
• Western blot analysis
• RNA-seq
• Animal study
Ye J, Zha J, Shi Y, Li Y, Yuan D, Chen Q, Lin F et al.
(2019) Co-inhibition of HDAC and MLL-menin interaction targets MLL-rearranged acute myeloid leukemia cells via disruption of DNA damage checkpoint and DNA repair. Clinical Epigenetics. 11:137

Gene enrichment
Enriched pathways include cell cycle, DNA replication, and several DNA repair mechanisms
A: Combined treatment with chidamide and MI-3 for 24 h
B: Combined treatment with chidamide and MI-3 for 48 h

Heatmap generation (panels C, D, E)
• Venn diagram: DEGs of three treatments
• Overlapped 635 genes
• Differed 59 genes
• Same trends of expression overall
• These genes were associated with several key survival signaling pathways

Summary
✓ Library protocols depend on the sample QC result.
✓ BGI provides specific library preparations for mRNA-seq and non-coding RNA-seq.
✓ The Dr. Tom data mining system creates a friendly environment in which users can analyze their data more easily.

info@bgi.com
www.bgi.com

International Head Offices
BGI Americas: One Broadway, 14th Floor, Cambridge, MA 02142, USA. Tel: +16175002741
BGI Europe: Ole Maaløes Vej 3, DK-2200 Copenhagen N, Denmark. Tel: +4570260806
BGI Asia-Pacific: 16 Dai Fu Street, Tai Po Industrial Estate, New Territories, Hong Kong. Tel: +85236103510

BGI makes no representations or warranties about the accuracy or suitability of any information in the webinars and related materials; all such content is provided on an “as is” basis. BGI HEREBY DISCLAIMS ALL WARRANTIES REGARDING THE CONTENTS OF THESE INFORMATION AND MATERIALS, INCLUDING WITHOUT LIMITATION ALL WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. BGI hereby disclaims all liability for any claims, losses, or damages in connection with use or application of these information and/or webinars. BGI does not guarantee, warrant, or endorse the products or services of any entity, organization, or person.
BGI, or the relevant owner, retains all rights (including copyrights, trademarks, patents as well as any other intellectual property right) in relation to all information and/or materials provided in the webinars (including all texts, graphics and logos).