Transposon insertion frequency distinguishes essential from non

Functional Encyclopedia of Bacteria and Archaea
Matthew Blow
Deutschbauer lab, LBNL
Adam Deutschbauer
Morgan Price
Kelly Wetmore
Adam Arkin
JGI
Cindi Hoover
Feng Chen
Jim Bristow
mjblow@lbl.gov
1. Gene function annotation using transposon
mutagenesis and sequencing (TnSeq)
2. A ‘Functional Encyclopedia of Bacteria and
Archaea’ (FEBA)
Problem: Computational annotation of
microbial genomes is imperfect
Current computational genome annotation pipeline:
Isolate
Sequence
Predict gene structure and function
Incomplete model
Limitations of homology:
• Median bacterial genome:
3261 protein coding genes
971 “hypothetical” protein coding genes
• New experimental approaches are
necessary to rapidly annotate
and characterize microbial genomes.
Our solution: Experimental evidence based
annotation of genomes
Develop a rapid experimental pipeline to:
1) Assess phenotypic capability via growth assays (~300 metabolic and stress conditions)
2) Correct gene structure and identify promoters with RNAseq
In D. vulgaris, 507 gene revisions and 1,124 promoters at single-nucleotide resolution.
3) Predict gene function with TnSeq in multiple conditions per microbe
Gene function annotation by TnSeq
Microbe of interest
i) Transposon mutagenesis
ii) Recovery
iii) Antibiotic selection
Mutant population: millions of cells, 1 random mutant per cell
Selection under 100s of conditions (Condition A, Condition B, …)
Identify mutant fitness effects by PCR and sequencing
Possible outcomes per gene: essential in condition B, essential in condition C, essential in all conditions
Is there evidence that this approach works to annotate gene function?
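The selection logic on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis used in the project: the function name `classify_essentiality`, the `min_inserts` threshold and the toy counts are all hypothetical.

```python
def classify_essentiality(counts, min_inserts=1):
    """Label each gene by the conditions in which its mutants are missing.

    counts: {gene: {condition: number of Tn insertions recovered}}
    min_inserts: assumed threshold; fewer insertions than this in a
                 condition is treated as "essential" in that condition.
    """
    labels = {}
    for gene, per_condition in counts.items():
        if not per_condition:
            labels[gene] = "no data"
            continue
        depleted = sorted(c for c, n in per_condition.items() if n < min_inserts)
        if len(depleted) == len(per_condition):
            labels[gene] = "essential in all conditions"
        elif depleted:
            labels[gene] = "essential in " + ", ".join(depleted)
        else:
            labels[gene] = "non-essential"
    return labels


# Toy example (made-up numbers)
toy_counts = {
    "geneX": {"A": 5, "B": 0, "C": 4},
    "geneY": {"A": 0, "B": 0, "C": 0},
    "geneZ": {"A": 3, "B": 2, "C": 6},
}
labels = classify_essentiality(toy_counts)
```

A hard zero-insertion threshold is the simplest possible rule; a real analysis would likely compare abundances against a reference condition instead.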
Proof of principle: Gene function annotation using
transposon mutagenesis and microarray-based analysis
S. oneidensis MR-1
Metal-reducing bacterium
Bioremediation
Mutant population
Growth under ~300 conditions (Condition 1, Condition 2, Condition 3, …etc)
Assay selected populations on microarray
(Deutschbauer et al., PLoS Genetics 2011)
290 diverse conditions
3,355 genes (average 7 mutants per gene)
Proof of principle: Gene function annotation using
transposon mutagenesis and microarray-based analysis
Genes with Tn mutants: 3,355
Genes with significant phenotypes: 1,230
Genes with proposed annotations of specific molecular function: 40
[Figure: fitness heatmap; legend: –ve fitness effect, no fitness effect, +ve fitness effect]
(Deutschbauer et al., PLoS Genetics 2011)
A Functional Encyclopedia of
Bacteria and Archaea (FEBA)
~50 phylogenetically diverse organisms (GEBA)
[Figure: bacterial phylogenetic tree; * = GEBA / candidate F-GEBA]
Phylogeny approach to maximize functional diversity
TnSeq under 50 growth conditions
300 possible growth conditions:
• Carbon sources: 96
• Nitrogen sources: 48
• Sulfur sources: 12
• Phosphorus sources: 8
• Environmental stresses (temp, pH, salinity): 9
• Small molecule stresses (metals, antibiotics): 165
Outcome: 1000s of novel gene function annotations
Plans for a FEBA pilot project
Aim 2
Culturing and transposon mutagenesis of ~40 diverse bacteria
..etc
Growth assays
RNASeq
TnSeq
Analysis / integration
Functional genome annotation
Aim 1
a) Work through the entire
functional annotation
pipeline for one bacterium
(P. stutzeri)
b) Expand to ~10 bugs
Plans for a FEBA pilot project
Aim 1
Culturing and transposon mutagenesis of ~40 diverse bacteria
..etc
?
Growth assays
RNASeq
TnSeq
Analysis / integration
Functional genome annotation
Strategy for identifying transposon insertions
1. Isolate genomic DNA from mutant population
2. Sonicate DNA
3. Ligate custom truncated Illumina adapter
4. PCR using Tn-specific primer
PCR primer contains adapter arm and 5’ index sequence
Genomic-DNA-only inserts are not amplifiable by downstream PCR
5. Sequencing (HiSeq or MiSeq): index read + Tn-specific primer
6. Mapping to reference genome and counting
[Diagram labels: Illumina universal adapter, Read 1 primer, Read 2 primer, Tn-specific primer, transposon complementary sequence, random 5mer, DNA / Tn junction]
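Step 6 (mapping and counting) can be illustrated with a toy sketch. It rests on strong assumptions: `TN_END` is a placeholder string and not the real transposon end sequence, mapping is done by exact forward-strand string search rather than a real aligner, and the function names are hypothetical.

```python
from collections import Counter

TN_END = "GGTTCCAACC"  # placeholder only, NOT the real transposon end sequence


def extract_flank(read, tn_end=TN_END, min_flank=12):
    """Return the genomic sequence that follows the transposon end, or None."""
    if not read.startswith(tn_end):
        return None  # read does not start inside the transposon
    flank = read[len(tn_end):]
    return flank if len(flank) >= min_flank else None


def count_insertions(reads, genome, tn_end=TN_END, min_flank=12):
    """Count reads per insertion position (exact, unique, forward-strand hits)."""
    counts = Counter()
    for read in reads:
        flank = extract_flank(read, tn_end, min_flank)
        if flank is None:
            continue
        pos = genome.find(flank)
        # keep the hit only if the flank occurs exactly once in the genome
        if pos != -1 and genome.find(flank, pos + 1) == -1:
            counts[pos] += 1
    return counts


# Toy genome and reads (made up for the example)
toy_genome = "CCGATTACGGATCCGTTAGCATGCAAGTCTAGGCTA"
toy_reads = [
    TN_END + toy_genome[5:20],
    TN_END + toy_genome[5:20],
    "AAAA" + toy_genome[5:20],   # no transposon end: ignored
    TN_END + toy_genome[17:32],
]
toy_counts = count_insertions(toy_reads, toy_genome)
```

In practice a read aligner would replace `str.find`, and both strands would be considered; the sketch only shows the filter, trim, map, count order of operations.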
Does this sequencing strategy work?
Can we use it to identify function of known genes?
Proof of principle: Identification of genes required for
survival in minimal media in Pseudomonas stutzeri
P. stutzeri
Soil bacterium with potential applications in bioremediation
Transposon mutagenesis
>10^6 mutant cells
Select in LB
Select in minimal media
Compare
TnSeq specifically identifies Tn insertions and
is highly reproducible
             Tn insertion is at TA   Map to the genome
Replicate 1  99.91%                  97.81%
Replicate 2  99.92%                  97.80%
[Figure: scatter plot of Tn inserts per gene, Rep 1 vs. Rep 2 (both axes 0 to 150); Pearson correlation 0.99]
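The replicate comparison boils down to a Pearson correlation over per-gene insertion counts. A minimal sketch follows; the function name and the toy counts are assumptions for illustration, and the slide does not say which software was actually used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy per-gene insertion counts for two replicates (made-up numbers)
rep1 = [12, 40, 0, 95, 33]
rep2 = [10, 42, 1, 90, 35]
r = pearson(rep1, rep2)
```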
“Essential” genes appear as transposon-free regions
[Figure: Illumina read depth (0 to 230) along the genome; transposon insertions in non-essential genes on either side of an insertion-free site]
Essential gene: dihydroxy-acid dehydratase
(required for biosynthesis of amino acids)
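Finding transposon-free genes from mapped insertion positions can be sketched as below. The gene coordinates, the 0-based half-open interval convention and the function name are assumptions made for the example.

```python
from bisect import bisect_left


def insertion_free_genes(genes, insertion_positions):
    """Return names of genes that contain no insertion.

    genes: iterable of (name, start, end), 0-based, half-open [start, end)
    insertion_positions: iterable of genome coordinates of Tn insertions
    """
    positions = sorted(insertion_positions)
    free = []
    for name, start, end in genes:
        # index of the first insertion at or after the gene start
        i = bisect_left(positions, start)
        if i == len(positions) or positions[i] >= end:
            free.append(name)
    return free


# Toy annotation and insertion sites (made-up numbers)
toy_genes = [("g1", 0, 100), ("g2", 100, 250), ("g3", 250, 400)]
toy_sites = [10, 55, 99, 250, 300]
candidates = insertion_free_genes(toy_genes, toy_sites)
```

Short genes can be insertion-free by chance, so a list like `candidates` is a set of candidate essential genes rather than a final call.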
Top 20 genes advantageous for survival in minimal media
Tn insertion ratio (LB / minimal), followed by gene:
7.0  Phosphoribosylanthranilate_isomerase
6.2  phosphoserine_phosphatase_SerB
5.0  3-isopropylmalate_dehydrogenase
4.7  Predicted_membrane_protein
3.8  Putative_threonine_efflux_protein
3.5  O-succinylhomoserine_sulfhydrylase
3.4  Chemotaxis_protein_histidine_kinase_and_related_kinases
3.2  tryptophan_synthase,_beta_subunit
3.2  Indole-3-glycerol_phosphate_synthase
3.1  hypothetical_protein
3.1  anthranilate_phosphoribosyltransferase
3.0  ATP_phosphoribosyltransferase,_regulatory_subunit
3.0  methionine_biosynthesis_protein_MetW
2.9  5,10-methenyltetrahydrofolate_synthetase
2.8  Membrane_protease_subunits,_stomatin/prohibitin_homologs
2.8  3-isopropylmalate_dehydratase,_large_subunit
2.7  anthranilate_synthase_component_I
2.7  Predicted_integral_membrane_protein
2.7  Imidazoleglycerol-phosphate_dehydratase
2.6  5,10-methylenetetrahydrofolate_reductase
Red = known role in amino acid biosynthesis
Blue = known role in purine biosynthesis
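A ranking like the one above could be produced along the following lines. This is a sketch under assumptions: the slides do not state how the LB / minimal ratio was normalized, so the per-library normalization and the pseudocount here are illustrative choices, and the gene names and counts are made up.

```python
def insertion_ratios(lb_counts, minimal_counts, pseudocount=1.0):
    """Rank genes by normalized Tn insertion ratio (LB / minimal).

    lb_counts, minimal_counts: {gene: insertion count} for each library.
    pseudocount: assumed smoothing term that avoids division by zero.
    """
    lb_total = sum(lb_counts.values())
    min_total = sum(minimal_counts.values())
    if lb_total == 0 or min_total == 0:
        raise ValueError("both libraries need at least one insertion")
    ratios = {}
    for gene in set(lb_counts) | set(minimal_counts):
        lb = (lb_counts.get(gene, 0) + pseudocount) / lb_total
        mn = (minimal_counts.get(gene, 0) + pseudocount) / min_total
        ratios[gene] = lb / mn
    # highest ratio first: most depleted in minimal media relative to LB
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Toy counts (made-up numbers)
ranked = insertion_ratios({"geneA": 70, "geneB": 30}, {"geneA": 10, "geneB": 90})
```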
Conclusion:
- TnSeq strategy works
- Identifies genes required for growth in minimal media
The next experiment:
P. Stutzeri
Mutant library
Selection under
multiple conditions
Synthesis of
libraries in plate
based format
Sequencing of
pooled
experiments
Plans for a FEBA pilot project
Aim 2
Culturing and transposon mutagenesis of ~40 diverse bacteria
..etc
Growth assays
RNASeq
TnSeq
Analysis / integration
Functional genome annotation
Aim 2
a) Work through the entire
functional annotation
pipeline for one bacterium
(P. stutzeri)
b) Expand to ~10 bugs
Progress toward culturing and mutagenesis of ~40 bacteria
44 bugs (9 phyla): in hand at LBNL
15 bugs (5 phyla): cultured
9 bugs (2 phyla): Tn mutagenesis attempted
Was mutagenesis successful?
MiSeq analysis of transposon mutant libraries
from four new bugs
Tn mutants of four marine bacteria with similar culturing conditions:
Alcanivorax jadensis, Dinoroseobacter shibae, Kangiella aquimarina, Phaeobacter gallaeciensis
Isolate and pool DNA
PCR Tn inserts and sequence on MiSeq
Map to four genomes
Result: Alcanivorax jadensis insertions, Dinoroseobacter shibae insertions, Kangiella aquimarina insertions, Phaeobacter gallaeciensis insertions
MiSeq analysis of transposon mutant libraries
from four new bugs
96% of reads map to unincorporated transposon!
But…
Candidate transposon insertions from all 4 bugs
Candidate transposon insertions are at expected TA dinucleotides
TA = insertion site preference of pHIMAR transposon
[Figure: four bar charts, one per organism. x-axis: dinucleotide sequence of Tn insertion site (AA, AC, AG, AT, CA, CC, CG, CT, GA, GC, GG, GT, TA, TC, TG, TT); y-axis: fold enrichment (insertion dinucleotide frequency / genome dinucleotide frequency)]
Kangiella aquimarina (639 potential insertions)
Phaeobacter gallaeciensis (158 potential insertions)
Dinoroseobacter shibae (170 potential insertions)
Alcanivorax jadensis (161 potential insertions)
Conclusion:
- We are able to culture and mutagenize diverse bacteria
- Need to demonstrate that we can generate high diversity mutant libraries
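The fold enrichment plotted here (insertion dinucleotide frequency / genome dinucleotide frequency) can be computed as in the sketch below. It assumes a single uppercase strand and overlapping dinucleotide counts; the function names and toy sequences are hypothetical.

```python
from collections import Counter


def dinucleotide_freqs(seq):
    """Frequency of each overlapping dinucleotide in a sequence."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {dinuc: n / total for dinuc, n in counts.items()}


def fold_enrichment(insertion_dinucs, genome):
    """Insertion-site dinucleotide frequency divided by genome frequency."""
    genome_freq = dinucleotide_freqs(genome)
    site_counts = Counter(insertion_dinucs)
    total = sum(site_counts.values())
    return {
        dinuc: (n / total) / genome_freq[dinuc]
        for dinuc, n in site_counts.items()
        if dinuc in genome_freq  # skip dinucleotides absent from the genome
    }


# Toy genome and insertion-site dinucleotides (made up for the example)
toy_genome = "TATACGCG"
toy_sites = ["TA", "TA", "TA", "CG"]
enrichment = fold_enrichment(toy_sites, toy_genome)
```

A value well above 1 for TA and near or below 1 for the other dinucleotides is the pattern the slide describes.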
Summary
We are developing high-throughput experimental approaches to annotate gene function
[Figure: bacterial phylogenetic tree with * marking selected organisms]
The ‘FEBA’ project will provide functional annotation for 50 diverse organisms / 1000s of novel genes
Future ‘product’ of JGI? Keen to target bugs of interest to DOE and to the JGI user community
mjblow@lbl.gov
Example of specific novel gene function annotation from
transposon mutagenesis
Gene SO_3749 = hypothetical gene with no homology-based annotation
1. Functional evidence from mutant assays
[Figure: fitness heatmap of Arg biosynthesis genes across conditions; legend: strong –ve fitness effect, no fitness effect]
Does SO_3749 catalyze the missing step in Arg biosynthesis?
2. Function confirmed in complementation assay
Conclusion: SO_3749 encodes a functional acetyl-ornithine deacetylase
No homology to the functional ortholog (argE) in E. coli
Transposon mutagenesis through bacterial conjugation
E. coli ‘donor’ cell (vector carrying transposon) and target cell
Conjugation
Growth in absence of DAP (E. coli dies)
THIS STEP DIDN’T WORK PROPERLY
Further growth (vector is lost)