Comparative Transcriptomics as a Gene Discovery Tool in
Solanum pennellii, a Potential Source of Biogasoline
Tom McKnight, Sachi Mandal, Wang Ming Ji, Department of Biology
Photo by TRGC
xkcd.com
[Figure: fractional distillation of heated crude oil. Fractions decrease in density and boiling point toward one end of the column and increase toward the other.]
Petroleum fractions:
C1 to C4: gases
C5 to C9: naphtha
C5 to C10: gasoline
C10 to C16: kerosene
C14 to C20: diesel
C20 to C50: lubricating oil
C50 to C70: fuel oil
> C70: residue
Biofuels by carbon number:
C1: methane
C2: ethanol
C4: biobutanol
C4 to C10: S. pennellii biogasoline
C16 & C18: biodiesel
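The overlap between the S. pennellii chain lengths and the petroleum fractions can be checked with a short sketch. The ranges are the ones listed on this slide; the helper name `shared_carbon_numbers` is illustrative only, and the comparison is by carbon number alone, so it says nothing about fuel properties.

```python
# Carbon-number ranges from the distillation figure (inclusive bounds).
FRACTIONS = {
    "gases": (1, 4),
    "naphtha": (5, 9),
    "gasoline": (5, 10),
    "kerosene": (10, 16),
    "diesel": (14, 20),
    "lubricating oil": (20, 50),
    "fuel oil": (50, 70),
}

# Chain lengths of the S. pennellii glucolipid fatty acids (C4 to C10).
S_PENNELLII = (4, 10)


def shared_carbon_numbers(a, b):
    """Return the carbon numbers present in both inclusive ranges a and b."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return list(range(lo, hi + 1)) if lo <= hi else []


overlap = {name: shared_carbon_numbers(S_PENNELLII, rng)
           for name, rng in FRACTIONS.items()}

for name, shared in overlap.items():
    if shared:
        print(f"{name}: C{shared[0]} to C{shared[-1]}")
```

Under these ranges the gasoline fraction shares six of the seven chain lengths (C5 to C10) with the S. pennellii fatty acids, while diesel and heavier fractions share none.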
Solanum pennellii is native to extremely arid regions of Peru
Glucolipids are secreted by trichomes onto the leaf surface
Fobes, J.F., Mudd, J.B. and Marsden, M.P.F. (1985) Plant Physiol. 77, 567-570.
Glucolipids accumulate to over 20% of dry weight of the plant!
[Figure: extractable lipid (0 to 200 mg/g, equivalent to 0 to 20% DW) versus weeks of growth (5 to 16) for S. pennellii and VF36.]
Fobes, J.F., Mudd, J.B. and Marsden, M.P.F. (1985) Plant Physiol. 77, 567-570.
The S. pennellii glucolipid has three short-chain
fatty acids (C4 to C10) esterified to glucose
[Structure: glucose ring with a CH2OH group, a free OH, and three acyl groups, CH3(CH2)n-C(=O)-O-, attached as esters.]
Transesterification of triglycerides produces biodiesel
Vegetable oil + MeOH --(NaOH)--> Glycerol + 3 long-chain fatty acid esters
Vegetable oil: glycerol backbone (H2C-O-, HC-O-, H2C-O-) carrying three -C(=O)-(CH2)n-CH3 acyl chains
Products: glycerol (H2C-OH, HC-OH, H2C-OH) and 3 CH3-O-C(=O)-(CH2)n-CH3
Transesterification of glucolipid produces biogasoline
Glucolipid + MeOH --(NaOH)--> Glucose + 3 short-chain fatty acid esters
Glucolipid: 2,3,4-tri-O-acylglucose
Products: glucose and 3 CH3-O-C(=O)-(CH2)n-CH3
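To make the products concrete, the following sketch derives the molecular formula and molar mass of the methyl ester released from a saturated fatty acid with a given carbon number. It assumes saturated acyl chains (free acid CnH2nO2, so branched and straight-chain isomers share a formula) and methanol as the alcohol, as in the scheme above; the function names are illustrative.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def methyl_ester_formula(acid_carbons):
    """Atom counts of the methyl ester of a saturated CnH2nO2 fatty acid."""
    # Replacing the acid hydrogen with a methyl group adds one CH2 unit.
    return {"C": acid_carbons + 1, "H": 2 * acid_carbons + 2, "O": 2}


def molar_mass(formula):
    """Molar mass in g/mol from a dict of atom counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


# Fatty acids named in the acylsugar composition table, with carbon numbers.
for name, n in [("2-methylpropanoate", 4), ("8-methylnonanoate", 10), ("n-decanoate", 10)]:
    f = methyl_ester_formula(n)
    print(f"methyl {name}: C{f['C']}H{f['H']}O{f['O']}, {molar_mass(f):.2f} g/mol")
```

A C4 acid gives a C5H10O2 ester (about 102 g/mol) and a C10 acid gives a C11H22O2 ester (about 186 g/mol).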
Predominant fatty acids in acylsugars of S. pennellii accessions
Fatty acid                 LA 0716      LA 1941      LA 1946      LA 1912
                           (n=6)        (n=6)        (n=6)        (n=6)
2-methylpropanoate (C4)    41.8 (0.4)   42.2 (1.6)   41.6 (0.8)   t
2-methylbutanoate (C5)     10.8 (0.2)   9.9 (0.7)    9.0 (0.3)    -
3-methylbutanoate (C5)     4.0 (0.2)    8.5 (2.2)    13.0 (0.5)   t
5-methylhexanoate (C7)     -            t            2.0 (0.2)    -
6-methylheptanoate (C8)    -            -            -            t
7-methyloctanoate (C9)     t            t            t            -
n-octanoate (C8)           -            -            -            -
8-methylnonanoate (C10)    26.3 (0.6)   19.7 (2.3)   9.9 (0.3)    t
n-decanoate (C10)          10.7 (0.4)   10.6 (1.1)   14.8 (0.4)   -
9-methyldecanoate (C11)    t            t            t            t
n-dodecanoate (C12)        4.9 (0.2)    5.7 (0.6)    7.1 (0.6)    t
t = Trace (<2%) measured.
Joseph A. Shapiro et al. (1993) Biochemical Systematics and Ecology 22, 545-561.
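One way to summarize the table is an abundance-weighted mean chain length per accession. The sketch below assumes the tabulated values are percentages of total acyl groups and treats trace ("t") and absent ("-") entries as zero; both are assumptions, not statements from the cited paper. LA 1912 is omitted because every entry for it is trace or absent.

```python
# Percent composition by carbon number, copied from the table above.
# Isomers with the same carbon number are summed.
# Assumption: trace ("t") and absent ("-") entries count as 0.
ACYL_PERCENT = {
    "LA 0716": {4: 41.8, 5: 10.8 + 4.0, 10: 26.3 + 10.7, 12: 4.9},
    "LA 1941": {4: 42.2, 5: 9.9 + 8.5, 10: 19.7 + 10.6, 12: 5.7},
    "LA 1946": {4: 41.6, 5: 9.0 + 13.0, 7: 2.0, 10: 9.9 + 14.8, 12: 7.1},
}


def mean_carbon_number(percent_by_carbon):
    """Abundance-weighted mean carbon number of the listed fatty acids."""
    total = sum(percent_by_carbon.values())
    weighted = sum(carbons * pct for carbons, pct in percent_by_carbon.items())
    return weighted / total


mean_carbon = {acc: mean_carbon_number(p) for acc, p in ACYL_PERCENT.items()}

for accession, value in mean_carbon.items():
    print(f"{accession}: mean chain length C{value:.1f}")
```

Under these assumptions all three accessions average between C6 and C7, inside the C5 to C10 gasoline range quoted earlier.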
Advantages of S. pennellii
• Not a food or feed crop
• Drought tolerant and can grow on marginal land
• Lipid is on leaf surface and can be extracted in the field with a simple ethanol rinse to rapidly yield a high-energy, high-value liquid without transporting large amounts of low-value biomass
• Glucolipid can be converted to gasoline with standard transesterification technology
• Resulting biogasoline should be compatible with existing fuel technology (transportation and engines)
Potential Disadvantages of S. pennellii
• Not perennial (yet)
• Yield per acre is not known (yet)
Biosynthetic Pathway for Glucolipid
• Glucolipid is made from UDP-glucose and short-chain fatty acids.
• Genes encoding the first two enzymes have been cloned and characterized.
• Step 1 – Glucosyltransferase
• Step 2 – Glucose acyltransferase
• There are only two or three additional steps, making this short pathway a good candidate for moving into other plants.
[Pathway diagram: UDP-glucose + fatty acid → 1-O-acyl-β-glucose → 1,2-di-O-acyl-β-glucose → 2,3,4-tri-O-acyl-β-glucose, with glucose also labeled. Two steps are marked "Cloned"; the remainder is marked "Not Cloned or Characterized".]
Information Flow
DNA makes RNA makes Protein
DNA is the chemically stable genetic material.
RNA is an unstable messenger that conveys information from DNA to ribosomes.
Ribosomes read the genetic code on RNA to make proteins.
Proteins do the bulk of the work in cells.
[Figure label: growing protein chain]
Generation of S. pennellii transcriptome
Total RNA isolated from different Solanum pennellii lines
→ Oligo dT selection
→ Purified mRNA (< 1% of total RNA)
→ Reverse transcription
→ cDNA library
→ Next-gen DNA sequencing
→ ~200 million paired-end reads
→ QC & end trimming
→ Trimmed reads (101 or 125 nt long)
→ Assembly
→ Assembled Solanum pennellii transcriptome for further analysis
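The end-trimming step can be illustrated with a minimal sketch. The slides do not say which trimming tool or thresholds were used, so the Phred cutoff of 20, the minimum length of 50, and the function names here are all hypothetical; the code only shows the general idea of removing low-quality bases from the 3' end of a read and then filtering by length.

```python
def trim_3prime(seq, quals, min_quality=20):
    """Drop trailing bases whose Phred quality is below min_quality.

    The cutoff is a hypothetical default, not the one used for these data.
    """
    end = len(seq)
    while end > 0 and quals[end - 1] < min_quality:
        end -= 1
    return seq[:end], quals[:end]


def passes_length_filter(seq, min_length=50):
    """Keep only reads that are still long enough after trimming."""
    return len(seq) >= min_length


# Toy read: the last three bases have low quality scores.
read = "ACGTACGT"
phred = [30, 32, 31, 30, 28, 12, 8, 2]
trimmed_seq, trimmed_quals = trim_3prime(read, phred)
print(trimmed_seq, trimmed_quals)
print(passes_length_filter(trimmed_seq, min_length=4))
```

Only the 3' end is trimmed here; a low-quality base in the middle of a read is left in place.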
Short reads (101 nt) mapped to Gene 1 genomic DNA region
[Figure: read coverage across the intergenic region, promoter (switch), and protein coding region]
Identities of Putative Transcripts
Matched to GO Term: 32,909
Matched, No Info: 8,313
No Match: 20,547
CEGMA: 456 of 458 conserved genes represented (99.5%)
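The annotation counts and the CEGMA completeness figure can be turned into percentages with a few lines. The numbers are the ones on this slide; the variable names are illustrative.

```python
# Annotation categories and counts from the slide.
counts = {
    "Matched to GO term": 32909,
    "Matched, no info": 8313,
    "No match": 20547,
}

total = sum(counts.values())
shares = {label: 100 * n / total for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]:,} of {total:,} ({pct:.1f}%)")

# CEGMA completeness: conserved genes found out of those searched.
cegma_percent = 100 * 456 / 458
print(f"CEGMA: {cegma_percent:.2f}%")
```

Roughly half of the putative transcripts carry a GO term, and about a third have no database match.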
Comparative Transcriptomics
Four high and four low glucolipid-producing accessions
High (>20% DW)
• 0716
• 1941
• 1946
• 1302
Low (<5% DW)
• 1911
• 1912
• 1920
• 1926
~200 million reads (125 bp x 2 ends) for 8 accessions
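As a rough sense of scale, the sequencing depth per accession follows from simple arithmetic. This sketch assumes that the ~200 million figure is the total across all 8 accessions, counts individual reads rather than pairs, and is split evenly; none of these is stated on the slide.

```python
# Figures from the slide; the even split is an assumption.
total_reads = 200_000_000
read_length_nt = 125
accessions = 8

reads_per_accession = total_reads // accessions
bases_per_accession = reads_per_accession * read_length_nt

print(f"{reads_per_accession:,} reads per accession")
print(f"{bases_per_accession / 1e9:.3f} Gb per accession")
```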
Transcriptomes of different Solanum pennellii accessions
after Trinity assembly
Solanum pennellii lines    Number of contigs
0716 Hi, Reference         57242, 70749
1302 Hi                    67214
1941 Hi                    61207
1946 Hi                    60790
1911 Lo                    66962
1912 Lo                    68316
1920 Lo                    62045
1926 Lo                    62908
[Figure: bar chart of total reads (0 to 30,000,000) for each Solanum pennellii line]
Mapping RUBISCO small subunit sequence reads for QC
[Figure: mapped reads shown for 1911_Low, 1912_Low, 1920_Low, 1926_Low, 1302_High, 1941_High, 1946_High, and 0716_High]
Expression level of control genes in different S. pennellii accessions
[Figure: bar chart of RPKM (0 to 700) for phosphoglycerate kinase and ubiquitin conjugating enzyme in high (0716, 1302, 1941, 1946) and low (1911, 1912, 1920, 1926) Solanum pennellii accessions]
RPKM = Reads per kilobase of gene model per million mapped reads
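RPKM is conventionally computed by normalizing a gene's read count both by the gene model length in kilobases and by the total number of mapped reads in millions. A minimal sketch of that calculation follows; the example numbers are made up for illustration and are not taken from the data shown here.

```python
def rpkm(gene_reads, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads."""
    kilobases = gene_length_bp / 1_000
    millions_mapped = total_mapped_reads / 1_000_000
    return gene_reads / (kilobases * millions_mapped)


# Hypothetical example: 500 reads on a 2,000 bp gene model
# in a library with 25 million mapped reads.
example = rpkm(gene_reads=500, gene_length_bp=2_000, total_mapped_reads=25_000_000)
print(example)
```

Because of the two normalizations, values can be compared between genes of different lengths and between accessions sequenced to different depths.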
Gene 1: sequence and expression levels
[Figure: Gene 1 read tracks for 1911_Low, 1912_Low, 1920_Low, 1926_Low, 1302_High, 1941_High, 1946_High, and 0716_High]
Expression level of Gene1 & 2 in different S. pennellii accessions
[Figure: bar chart of RPKM (0 to 600) for Gene1 and Gene2 in low (1911, 1912, 1920, 1926) and high (0716, 1302, 1941, 1946) Solanum pennellii accessions]
RPKM = Reads per kilobase of gene model per million mapped reads
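The comparison between high and low producers can be sketched as a log2 fold change of mean RPKM between the two groups, followed by a threshold. This is only an illustration of the general approach: the slides do not state which statistic or cutoff was used, and the gene names, RPKM values, pseudocount, and threshold below are all hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(high, low, pseudocount=1.0):
    """log2 ratio of mean RPKM in high vs. low producers.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return math.log2((mean(high) + pseudocount) / (mean(low) + pseudocount))


# Hypothetical RPKM values for four high and four low accessions.
expression = {
    "gene_a": {"high": [400.0, 420.0, 380.0, 400.0], "low": [24.0, 26.0, 25.0, 25.0]},
    "gene_b": {"high": [100.0, 110.0, 90.0, 100.0], "low": [95.0, 105.0, 100.0, 100.0]},
}


def candidate_genes(expression, min_abs_log2fc=2.0):
    """Return genes whose |log2 fold change| meets the threshold."""
    hits = {}
    for gene, groups in expression.items():
        lfc = log2_fold_change(groups["high"], groups["low"])
        if abs(lfc) >= min_abs_log2fc:
            hits[gene] = lfc
    return hits


candidates = candidate_genes(expression)
for gene, lfc in candidates.items():
    print(f"{gene}: log2FC = {lfc:.2f}")
```

A control gene with similar expression in both groups, like the hypothetical `gene_b`, gives a fold change near zero and is filtered out. A real analysis would also test for statistical significance across replicates.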
Alpha/beta hydrolase family
Sugar (and other) transporter
Thaumatin family
Glutathione S-transferase, N-terminal domain
alpha/beta hydrolase fold
Initiation factor 2 subunit family
Cyclophilin type peptidyl-prolyl cis-trans isomerase/CLD
Fatty acid desaturase
hypothetical protein
Plant invertase/pectin methylesterase inhibitor
hypothetical protein
hypothetical protein
Xylanase inhibitor N-terminal
Prephenate dehydratase
hypothetical protein
hypothetical protein
Plant invertase/pectin methylesterase inhibitor
AP2 domain
EamA-like transporter family
Glycosyl hydrolases family 17
Chitinase class I
Chitinase class I
Dirigent-like protein
Uncharacterised protein family (UPF0041)
hypothetical protein
hypothetical protein
Cytochrome P450
Glycosyl hydrolases family 16
Light regulated protein Lir1
Potato inhibitor I family
Pathogenesis-related protein Bet v I family
Major intrinsic protein
Thaumatin family
Subtilase family
Subtilase family
Subtilase family
Protein of unknown function (DUF_B2219)
hypothetical protein
UDP-glucoronosyl and UDP-glucosyl transferase
hypothetical protein
hypothetical protein
No apical meristem (NAM) protein
Peptidase family M20/M25/M40
BURP domain
Cytochrome P450
Patatin-like phospholipase
Cytochrome P450
Serine carboxypeptidase
Transmembrane amino acid transporter protein
Leucine Rich repeats (2 copies)
Leucine Rich repeats (2 copies)
hypothetical protein
F-box associated
Cytochrome P450
Mycolic acid cyclopropane synthetase
Cytochrome P450
Reverse transcriptase (RNA-dependent DNA polymerase)
Rieske (2Fe-2S) domain
Gibberellin regulated protein
Myb-like DNA-binding domain
AP2 domain
Putative lysophospholipase
Pathogenesis-related protein Bet v I family
Glycosyl hydrolases family 32 N-terminal domain
DnaJ domain
Plastocyanin-like domain
hypothetical protein
Yippee zinc-binding/DNA-binding/Mis18, centromere assembly
Leucine rich repeat
non-haem dioxygenase in morphine synthesis N-terminal
Domain associated at C-terminal with AAA
Alpha/beta hydrolase family
K-box region
Chitinase class I
Pectinesterase
Terpene synthase family, metal binding domain
Papain family cysteine protease
hypothetical protein
non-haem dioxygenase in morphine synthesis N-terminal
hypothetical protein
EF hand
Chitinase class I
Gibberellin regulated protein
hypothetical protein
Translationally controlled tumour protein
Glutathione S-transferase, N-terminal domain
Plant mobile domain
Uncharacterised protein family (UPF0113)
Protein of unknown function (DUF1298)
Helix-loop-helix DNA-binding domain
PA domain
Cytochrome P450
hypothetical protein
Xylanase inhibitor C-terminal
MOSC N-terminal beta barrel domain
gag-polypeptide of LTR copia-type
Reverse transcriptase (RNA-dependent DNA polymerase)
hypothetical protein
Protein kinase domain
Cytochrome P450
Glycosyl hydrolases family 17
Protein kinase domain
Leucine Rich Repeat
Xylanase inhibitor C-terminal
Glutathione S-transferase, N-terminal domain