Uploaded by 王云飞

webinar RNA-seq final version

The Complete Guide to RNA Sequencing:
Everything You Ever Wanted to Know for Successful
RNA Sequencing Based Research
Faye Wang, BGI Genomics
CONTENTS
01 RNA Sequencing Overview
02 QC principles of RNA-seq
03 mRNA sequencing
04 ncRNA sequencing
05 Data analysis
RNA classification: Coding RNA vs. non-coding RNA
RNA transcripts
➢ Coding RNA
• mRNA
➢ Non-coding RNA
• 'Housekeeping' RNAs: rRNA, tRNA, snRNA/snoRNA, telomerase RNA
• Regulatory RNAs: lncRNA (>200 nt); miRNA, siRNA, piRNA (<50 nt)
Overview of RNA-seq Applications
Transcriptome sequencing is used to reveal the presence, quantity and structure of RNA in a
biological sample under specific conditions.
Application fields
• Plant & animal development research
• Diagnosis biomarker identification
• Plant & animal stress resistance research
• Plant-fungal interaction research
• Target drug development
• Pathogenic regulatory mechanism research
RNA sequencing and COVID-19
➢ Fast way to generate data
➢ Cost-effective way to get useful data
➢ Covers a broad range of research fields
➢ Mature association analysis solutions to combine with other technologies
Standard RNA Sequencing Project Workflow at BGI
Clients
• Total RNA samples
• Tissue or cell line samples for total RNA extraction
QC
• Several protocols for different types of samples
• DNBSEQ platform
• PE100/PE150/SE50
• Standard analysis pipelines
• Customized analysis
Delivery
• Dr. Tom system delivery
02 QC principles of RNA-seq
Frequently Asked Questions
Q1: How do I understand the Quality Control Report?
Q2: The NanoDrop QC result from my side looks OK. Why am I informed that the RNA sample has not passed QC?
Q3: There are so many kinds of RNA library on the market. Can one protocol fulfill all RNA sample types?
Q4: What data can I expect from 50 ng of input total RNA? Is there any influence on the mapping rate, quality or number of detected genes?
QC method for RNA samples
Quality indicator | Method
Concentration | NanoDrop
Purity | Gel electrophoresis
Integrity | Agilent 2100 / Fragment Analyzer
BGI QC method for RNA samples
➢ Gel electrophoresis (*human sample)
• 28S
• 18S
• 5S
➢ NanoDrop
• OD260/OD280
• OD260/OD230
➢ Agilent 2100
• RNA integrity number (RIN)
• Baseline
• Peaks
RNA sample QC indicators
Example of an electropherogram with an RNA integrity number (RIN) of 10. Four peaks are indicated: the marker, 5S (small RNA), and 18S and 28S (ribosomal RNA) (Norhazlin et al., 2015).
Example of RNA QC result
Reference traces:
• Intact RNA, RIN=10
• Partially degraded, RIN=5
• Strongly degraded, RIN=3
Sample traces:
• Cell line total RNA, RIN=10
• Cartilage total RNA, RIN=9.8
• Hydrothorax total RNA, RIN=9.5
• Muscle total RNA, RIN=8.5
Specific conditions for lower RIN
• Urine total RNA, RIN=2.7
• Sperm total RNA, RIN=2.4
• Placenta total RNA, RIN=2.1
Example of RNA QC result from sample type without RIN
Algae total RNA (RIN=N/A)
Rice root total RNA (RIN=N/A)
Bee total RNA (RIN=N/A)
Gibberella zeae total RNA (RIN=N/A)
Example of QC result from Specific conditions
• Plant leaf total RNA
• Prokaryotic and eukaryotic total RNA mixture
• Prokaryotic total RNA
03 mRNA sequencing
RNA sample types
Tissues
Plasma
Buffy coat
Erythrocytes
Cell lines
FFPE
Cell line/tissue based mRNA library preparation (mRNA sequencing)
Sample amount:
• Cell lines: ≥2×10^5 cells
• Tissue: ≥30 mg
Total RNA Agilent 2100 QC report: cell lines and tissues, RIN>7
➢ Specifications:
• Enough total RNA: >200 ng
• Good quality: RIN>7
➢ Protocol
• PolyA+ selection based mRNA enrichment
Cell line/tissue based mRNA library preparation (mRNA sequencing)
Reference traces: intact RNA, RIN=10; partially degraded, RIN=5; strongly degraded, RIN=3
➢ Specifications:
• Enough total RNA: >200 ng
• RNA degraded: RIN<5
➢ Protocol
• rRNA depletion
Low input mRNA library preparation (mRNA sequencing)
➢ Sample type: UHRR
➢ Library type: PolyA+ selection
➢ Sequencing: DNBSEQ PE150, 30M reads
Protocol / Input | rRNA rate | Duplication rate | Gene detection | Gene expression correlation | qPCR correlation | Alternative splicing
PolyA+ selection (20 ng to 200 ng) | √ | ● | √ | √ | √ | ●
● (duplication rate): The duplication rate increased as the input decreased, but remained very low (duplication rate <5%).
● (alternative splicing): The number of alternative splicing events detected decreased slightly as the input decreased.
➢ Specifications:
• 20 ng < total RNA < 200 ng
• Good quality: RIN>7
➢ Protocol
• PolyA+ selection based mRNA enrichment
*In brief, if the total RNA quality is high, qualified data can still be obtained from lower input.
Whole blood based mRNA library preparation (mRNA sequencing)
Whole blood fractions: plasma, buffy coat, erythrocytes
Figure label: human β-globin mRNA
➢ About globin mRNA
• Expressed at high levels in red blood cells (RBCs) and reticulocytes (RBC precursors).
• Up to 70% of the mRNA (by mass) in whole blood total RNA is globin transcripts.
➢ Specifications:
• Total RNA extracted from whole blood (H/M/R).
• Total RNA: >500 ng
• Good quality: RIN>7
➢ Protocol
• PolyA+ selection based mRNA enrichment
• Globin mRNA removal is needed.
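To see why globin removal matters, the "up to 70%" figure can be turned into a rough read budget. This is a minimal sketch that assumes the share of sequencing reads roughly follows the share of mRNA mass, which is a simplification; the function name and the 30M read depth are illustrative choices, not part of the BGI protocol.

```python
def non_globin_reads(total_reads: int, globin_fraction: float) -> int:
    """Estimate reads left for non-globin transcripts.

    Assumes (as a simplification) that reads are distributed in
    proportion to mRNA mass, so `globin_fraction` of reads are globin.
    """
    if not 0.0 <= globin_fraction <= 1.0:
        raise ValueError("globin_fraction must be between 0 and 1")
    return round(total_reads * (1.0 - globin_fraction))


# Illustrative numbers: 30M reads, worst case of 70% globin transcripts.
print(non_globin_reads(30_000_000, 0.70))  # 9000000
```

Under this assumption, fewer than a third of the reads would describe the rest of the transcriptome if globin mRNA were left in the library.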
Whole blood based mRNA library preparation
Whole blood fractions: plasma, buffy coat, erythrocytes
Sample categories:
• Human, non-neonate (contains α and β globin mRNA)
• Human, neonate (contains α, β and γ globin mRNA)
• Mouse/rat
• Other mammals
Tips:
• Different globin removal kits cover different species.
• Even for human whole blood samples, the protocols for neonates and non-neonates can differ.
• To avoid dealing with globin mRNA altogether, extract the total RNA from the buffy coat cells.
FFPE based mRNA library preparation (mRNA sequencing)
➢ rRNA depletion RNA-seq
• DNase I treated & purified total RNA
• RNA fragmentation
• rRNA depletion
• cDNA synthesis
• Adapter ligation & PCR
➢ Exome capture RNA-seq
• DNase I treated & purified total RNA
• RNA fragmentation
• cDNA synthesis
• Adapter ligation & PCR
• Hybridization & capture
• Clean up & PCR
Both workflows end with sequencing.
➢ Specifications:
• Total RNA extracted from FFPE samples
• Total RNA: >100 ng
• Quality: RIN>2; DV200≥30%
➢ Protocol
• rRNA depletion or exome capture based mRNA enrichment
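DV200 is the percentage of RNA fragments longer than 200 nucleotides. A minimal sketch of the calculation follows, assuming the fragment size trace is available as (size in nt, signal) pairs; that input format, the function name and the example values are hypothetical, and instrument software computes this from the full electropherogram.

```python
def dv200(trace):
    """Return the percentage of total signal from fragments longer than 200 nt.

    `trace` is a list of (fragment_size_nt, signal) pairs (hypothetical format).
    """
    total = sum(signal for _, signal in trace)
    if total <= 0:
        raise ValueError("trace has no signal")
    above_200 = sum(signal for size, signal in trace if size > 200)
    return 100.0 * above_200 / total


# Made-up trace for illustration only.
example_trace = [(50, 10.0), (100, 20.0), (150, 30.0), (300, 25.0), (1000, 15.0)]
value = dv200(example_trace)
print(value, value >= 30)  # 40.0 True
```

A sample like this one would meet the DV200≥30% specification listed for FFPE total RNA.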
Summary: Library protocols for mRNA
No. | Sample type | Total RNA (m) | RIN | Protocol
1 | Cell line or tissue | 20 ng < m < 200 ng | >7 | Poly A selection
2 | Cell line or tissue | m > 200 ng | <5 | rRNA depletion
3 | Whole blood | m > 500 ng | >7 | Globin mRNA removal + Poly A selection
4 | Buffy coat cells from whole blood | as in No. 1 | as in No. 1 | Poly A selection
  |  | as in No. 2 | as in No. 2 | rRNA depletion
5 | FFPE slides | m > 100 ng | >2 | rRNA depletion/exon capture enrichment
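The summary can be read as a small decision rule. The sketch below encodes it under stated assumptions: the sample type labels and function name are hypothetical, poly A selection is applied to any good-quality sample above 20 ng (combining the low input slide with the standard >200 ng, RIN>7 slide), and combinations the slides do not cover (for example RIN between 5 and 7) return None rather than a guess.

```python
from typing import Optional


def choose_mrna_protocol(sample_type: str, total_rna_ng: float, rin: float,
                         dv200_pct: Optional[float] = None) -> Optional[str]:
    """Map sample QC values to a library protocol from the summary.

    Returns None when the slides do not cover the combination.
    """
    if sample_type == "ffpe":
        # FFPE: >100 ng, RIN > 2; DV200 >= 30% is checked only if measured.
        dv200_ok = dv200_pct is None or dv200_pct >= 30
        if total_rna_ng > 100 and rin > 2 and dv200_ok:
            return "rRNA depletion or exon capture enrichment"
        return None
    if sample_type == "whole_blood":
        # Whole blood: >500 ng, RIN > 7.
        if total_rna_ng > 500 and rin > 7:
            return "globin mRNA removal + poly A selection"
        return None
    if sample_type in ("cell_line", "tissue", "buffy_coat"):
        # Buffy coat cells follow the cell line/tissue rows.
        if rin > 7 and total_rna_ng > 20:
            return "poly A selection"
        if rin < 5 and total_rna_ng > 200:
            return "rRNA depletion"
        return None
    raise ValueError(f"unknown sample type: {sample_type!r}")


print(choose_mrna_protocol("tissue", 150, 8.5))     # poly A selection
print(choose_mrna_protocol("cell_line", 300, 4.0))  # rRNA depletion
print(choose_mrna_protocol("tissue", 300, 6.0))     # None
```

Returning None for uncovered cases keeps the sketch from implying a recommendation that the slides do not make.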
DNBSEQ Transcriptome Experimental workflow
• Total RNA
• mRNA fragmentation and enrichment
• Reverse transcription (N6 primer)
• Second-strand cDNA synthesis
• End-repair
• Adaptor ligation
• cDNA amplification & heat separation
• Single-stranded circular DNA (ssCirc)
• Rolling circle amplification
• DNA nanoball generation
High Correlation Between DNBSEQ and qPCR
Correlation with qPCR (human), samples UHRR-1 and UHRR-2
High correlation with qPCR* results (UHRR-1 vs qPCR, Spearman correlation >0.88; UHRR-2 vs qPCR, Spearman correlation >0.88)
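For readers who want to reproduce this kind of comparison, Spearman correlation is the Pearson correlation of the ranks. The following is a minimal standard-library sketch; the function names and the expression values are made up for illustration and are not the UHRR data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average position of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up expression values for five genes (illustration only).
rnaseq = [5.1, 0.3, 12.0, 7.4, 2.2]
qpcr = [4.8, 0.5, 10.9, 8.0, 1.9]
print(spearman(rnaseq, qpcr))  # 1.0
```

Because only ranks matter, the two platforms can report expression on different scales and still be compared directly.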
RNA sequencing vs. research aims
RNA sequencing options: RNA (transcriptome), small RNA, lncRNA
Research aims: gene expression, alternative splicing, gene fusion, novel transcript, small RNA, lncRNA
[Table: ● marks indicate which sequencing option suits each research aim]
04 ncRNA sequencing
Application of LncRNA-seq
◆ Application
• Study the mechanism of pathogenesis regulation
• Discover potential biomarkers and target sites
• LncRNA profiling of plants and animals
• Regulation of plant and animal growth and development
Figure: lncRNA regulation. Run-Wen Yao, et al. Cellular functions of long noncoding RNAs. Nature Cell Biology 21, 542–551 (2019).
Experimental workflow of LncRNA-seq at BGI
➢ Sample type: total RNA with good quality, RIN>7
➢ Species: human/mouse/rat
➢ Sample amount: total RNA amount >200 ng; concentration: 40-2500 ng/μL
➢ Data: >10G PE150 data
Workflow:
• Total RNA
• RNA fragmentation and rRNA removal
• Reverse transcription (N6 primer)
• Add dUTP to the second cDNA strand
• End-repair
• Adaptor ligation
• PCR (add UDG) & heat separation
• Single-stranded circular DNA (ssCirc)
• Rolling circle amplification
Get mRNA and lncRNA in one shot
Higher correlation rate of dUTP library
• Left bars: pooled reference; middle bars: the control library; right bars: Agilent microarrays
• The dUTP library had the best correlation and lowest RMSE relative to all three references
• MA plots showed an excellent linear relation between the dUTP library and the control library across a broad range of values
Levin et al., 2010, Nat Methods
Small RNA classification
miRNA sequencing application in exosome research
Exosome research: miRNAs
• Cancer research
• Rare disease
• Signaling pathway
• Biomarker identification
• …
https://www.novusbio.com/research-areas/cell-biology/Exosome-research-tools
UMI microRNA research workflow
➢ Total RNA
• Small RNA: ≥20 ng
• Total RNA: ≥1 μg
➢ Low input protocol
• Exosome RNA: ≥1 ng
➢ Sequencing: SE50
Unique molecular identifiers (UMI)
Figure labels: Reads, UMI
UMIs are used to quantify gene/transcript expression levels by eliminating PCR bias.
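The idea behind UMI counting can be sketched in a few lines: reads that share both the same transcript and the same UMI are treated as PCR copies of one original molecule and counted once. This is a simplified illustration, not BGI's pipeline; the function name, input format and read data are hypothetical, and real tools also correct sequencing errors within UMIs.

```python
from collections import defaultdict


def umi_counts(reads):
    """Count unique UMIs per feature.

    `reads` is an iterable of (feature, umi) pairs, for example
    ("hsa-miR-21-5p", "ACGTAC"). Duplicated pairs are counted once.
    """
    umis_by_feature = defaultdict(set)
    for feature, umi in reads:
        umis_by_feature[feature].add(umi)
    return {feature: len(umis) for feature, umis in umis_by_feature.items()}


# Made-up reads: the first miRNA has 4 reads but only 2 distinct UMIs.
example_reads = [
    ("hsa-miR-21-5p", "ACGTAC"),
    ("hsa-miR-21-5p", "ACGTAC"),
    ("hsa-miR-21-5p", "ACGTAC"),
    ("hsa-miR-21-5p", "TTGCAA"),
    ("hsa-let-7a-5p", "GGATCC"),
]
print(umi_counts(example_reads))  # {'hsa-miR-21-5p': 2, 'hsa-let-7a-5p': 1}
```

A plain read count would report 4 for the first miRNA; the UMI count of 2 removes the copies introduced by amplification.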
Higher correlation compared with non-UMI libraries
➢ Sample type: human exosome
➢ Library type: UMI small RNA
➢ Sequencing: SE50
➢ Data amount: 20M reads
Panels: UMI library, normal library
05 Data analysis
An overview of RNA-seq analysis
Inputs: gene info (sequence/ref), experiment, previous study
• Structure: SNP/InDel; promoter, transposon; RNA splicing; UTR length
• Gene expression: DEGs (phenotype)
• Function: GO, KEGG, GSEA etc.
• Regulation: lncRNA regulation; miRNA regulation (active/inhibit)
• Interaction: PPI, KDA
• Clustering: heatmap, time series
• Co-expression
Common Analysis Challenges
Q1: I've finally got my results, but I would like to do the enrichment in MSigDB rather than the KEGG or GO databases.
Q2: There are thousands of DEGs in my result! How can I filter genes that are related to my research?
Q3: I'm not able to do bioinformatics analysis myself, but I would like to do further data mining based on the analysis result.
Q4: Is there a platform that integrates common tools which could be used to do plotting?
Q5: Can I compare my results to external databases (e.g. TCGA/ARCHS4)?
Features of Dr. Tom system
Dr. Tom
Small tools overview
Clustering heatmap
Association clustering
Pathway Enrichment
Association Network
KDA
Pathway classification
Multi-omics association
Line Chart
Chi-square test
BGI RNA-seq & Dr. Tom user case study
Background:
• In AML, most 11q23 translocations lead to fusion proteins involving the mixed lineage leukemia (MLL) gene.
• Chidamide is a novel histone deacetylase (HDAC) inhibitor that inhibits HDAC1-3.
• MI-3 is a menin-MLL interaction inhibitor that acts to inhibit transcription of MLL target genes.
Study design: MOLM-13 and MV4-11 cells were exposed to chidamide, MI-3, or both in combination.
• Apoptosis analysis
• Western blot analysis
• RNA-seq
• Animal study
Ye J, Zha J, Shi Y, Li Y, Yuan D, Chen Q, Lin F et al. (2019) Co-inhibition of HDAC and MLL-menin interaction targets MLL-rearranged acute myeloid leukemia cells via disruption of DNA damage checkpoint and DNA repair. Clinical Epigenetics. 11:137
Gene enrichment
Enriched pathways include cell cycle, DNA replication, and several DNA repair mechanisms
A: Combined treatment with chidamide and MI-3 for 24 h
B: Combined treatment with chidamide and MI-3 for 48 h
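Pathway enrichment of this kind is commonly scored with a hypergeometric (over-representation) test: given how many DEGs fall in a pathway, how likely is an overlap at least that large by chance? The slides do not state which test Dr. Tom uses, so the sketch below is a generic illustration with a hypothetical function name and made-up counts.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_degs, n_overlap):
    """Upper-tail hypergeometric probability P(overlap >= n_overlap).

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_degs:       differentially expressed genes drawn from the background
    n_overlap:    DEGs that are annotated to the pathway
    """
    if not (0 <= n_pathway <= n_background and 0 <= n_degs <= n_background):
        raise ValueError("pathway and DEG counts must fit in the background")
    upper = min(n_pathway, n_degs)
    if not 0 <= n_overlap <= upper:
        raise ValueError("overlap is larger than the pathway or DEG list")
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 20000 background genes, 120 in the pathway,
# 635 DEGs, 15 of which are in the pathway.
p = enrichment_p_value(20000, 120, 635, 15)
print(f"p = {p:.3g}")
```

In practice the p-values for all tested pathways would also be corrected for multiple testing before calling a pathway enriched.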
Heatmap generation
Panels C and D
Venn diagram: DEGs of three treatments
• 635 genes overlapped
Panel E
• 59 genes differed
• Expression trends were the same overall
• These genes were associated with several key survival signaling pathways
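The regions of a three-way Venn diagram of DEG lists are plain set operations. A minimal sketch follows, assuming each treatment's DEGs are available as a list of gene identifiers; the function name and gene names are hypothetical placeholders, not genes from the study.

```python
def venn_three(a, b, c):
    """Return the genes shared by all three lists and those unique to each."""
    a, b, c = set(a), set(b), set(c)
    return {
        "shared_by_all": a & b & c,
        "only_a": a - b - c,
        "only_b": b - a - c,
        "only_c": c - a - b,
    }


# Placeholder gene identifiers for three treatments.
chidamide_degs = ["GENE1", "GENE2", "GENE3", "GENE4"]
mi3_degs = ["GENE2", "GENE3", "GENE5"]
combo_degs = ["GENE2", "GENE3", "GENE4", "GENE6"]

regions = venn_three(chidamide_degs, mi3_degs, combo_degs)
print(sorted(regions["shared_by_all"]))  # ['GENE2', 'GENE3']
print(sorted(regions["only_c"]))         # ['GENE6']
```

The shared set corresponds to the overlapping region of the Venn diagram and is a natural starting list for a heatmap across treatments.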
Summary
✓ The choice of library protocol depends on the sample QC result.
✓ BGI provides specific library preparations for mRNA-seq and non-coding RNA-seq.
✓ The Dr. Tom data mining system creates a user-friendly environment in which users can analyze their data more easily.
info@bgi.com
www.bgi.com
International Head Offices
BGI Americas
One Broadway, 14th Floor
Cambridge, MA 02142,
USA
Tel: +16175002741
BGI Europe
Ole Maaløes Vej 3,
DK-2200 Copenhagen N,
Denmark
Tel: +4570260806
BGI Asia-Pacific
16 Dai Fu Street,
Tai Po Industrial Estate,
New Territories, Hong Kong
Tel: +85236103510
BGI makes no representations or warranties about the accuracy or suitability of any information in the webinars and related materials; all such content is provided on an “as is” basis. BGI
HEREBY DISCLAIMS ALL WARRANTIES REGARDING THE CONTENTS OF THESE INFORMATION AND MATERIALS, INCLUDING WITHOUT LIMITATION ALL WARRANTIES OF
TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.
BGI hereby disclaims all liability for any claims, losses, or damages in connection with use or application of these information and/or webinars. BGI does not guarantee, warrant, or
endorse the products or services of any entity, organization, or person.
BGI, or the relevant owner, retains all rights (including copyrights, trademarks, patents as well as any other intellectual property right) in relation to all information and/or materials provided
in the webinars (including all texts, graphics and logos).