Shewanella oneidensis MR-1

Microbial Functional Genomics,
Genomic Technologies, And Their
Applications
Jizhong (Joe) Zhou
Zhouj@ornl.gov, 865-576-7544
Environmental Sciences Division, Oak Ridge
National Laboratory, Oak Ridge, TN 37831, USA
Gene Expression Patterns
Microbial
functional
Genomics
Whole Genome
Microarrays
Genomic
Technology
Community &
Ecosystem
Genomics
Microbial Community
Diversity & Mechanisms
Microbial
Ecology &
Extremophiles
Oligonucleotide
Arrays
Functional
Gene
Arrays
Producing Magnetic
Nanoparticles
Community
Genome Arrays
Uranium Reduction
Protein array
Challenges in functional genomics
• Defining gene functions: 30-60% of open reading frames are functionally unknown.
• Regulatory networks: Differences in gene number cannot explain phenotypic differences, suggesting that regulation is the key.
Microbial Functional Genomics
Integrating Gene Expression Profiling, Bioinformatics, mutagenesis and
Proteomics
MUTAGENESIS
BIOINFORMATICS
Structure-Based
Function
Prediction
sacB
aac1
Gmr
pDS31
PROTEOMICS
TRANSCRIPTOMICS
2-D Gels
DNA Microarrays
Genome Sequence
ORF #, putative functiona
Group 1
r = 0.93
Group 2
r = 0.86
Extracellular
Periplasm
Jun
Mass Spectrometry
Group 3
r = 0.84
POI
Cytoplasm
PSP promoter
Jun
pIII
Fos
Transcription &
Translation
1203,
3458,
4138,
2987,
3455,
3457,
4141,
4142,
3454,
1863,
1752,
1754,
2851,
2952,
2849,
3388,
3005,
2389,
2390,
3134,
624,
4403,
4405,
4406,
487,
488,
2262,
3280,
3290,
4795,
722,
-
3961,
749,
748,
3960,
3956,
3954,
1073,
3958,
2778,
3959,
3957,
cytochrome c552, nrfA
dimethyl sulfoxide reductase, dmsB
Ni/Fe hydrogenase, hydA
fumarate reductase, fcc
outer membrane protein
dimethyl sulfoxide reductase, dmsA
Ni/Fe hydrogenase, hydB
Ni/Fe hydrogenase, hydC
deca-heme cytochrome c
fumarate reductase, flavocytochrome c3
formate dehydrogenase, fdhA
formate dehydrogenase, fdhC
periplasmic nitrate reductase, napA
di-heme split-soret cytochrome c
ferredoxin-type protein napH
prismane
formate dehydrog., Se-cystein, fdhA
fumarate reductase, frdA
fumarate reductase, frdB
bacterioferritin, bfr
cytochrome c'
cbb3-cytochrome oxidase, ccoP
cbb3-cytochrome oxidase, ccoQ
cbb3-cytochrome oxidase, ccoN
cytochrome d ubiquinol oxidase, cydA
cytochrome d ubiquinol oxidase, cydB
mono-heme c-type cytochrome, scyA
probable oxidoreductase ordL
conserved hypothetical protein
cytochrome b, cybP
NADH dehydrogenase, ndh
1.11 (±0.04) c
3.16 (±1.26)
3.37 (±1.34)
2.25 (±0.35)
4.16 (±1.31)
4.99 (±0.49)
2.13 (±0.71)
3.11 (±1.22)
5.91 (±1.49)
2.08 (±0.42)
5.57 (±0.86)
4.74 (±0.56)
3.53 (±1.43)
3.55 (±0.32)
2.04 (±0.15)
2.89 (±1.62)
2.29 (±0.58)
2.69 (±0.98)
1.80 (±0.05)
0.30 (±0.06)
0.54 (±0.01)
0.52 (±0.02)
0.60 (±0.13)
0.64 (±0.33)
0.62 (±0.06)
0.83 (±0.05)
0.50 (±0.05)
0.43 (±0.12)
0.42 (±0.28)
0.37 (±0.09)
0.43 (±0.09)
3.33 (±0.47)
1.93 (±0.04)
3.46 (±0.16)
1.78 (±0.10)
2.46 (±0.51)
3.21 (±0.80)
2.09 (±0.41)
1.34 (±0.18)
2.51 (±0.21)
2.01 (±0.48)
10.38 (±4.45)
12.48 (±1.61)
1.05 (±0.31)c
nd d
0.82 (±0.04)
0.59 (±0.06)
1.24 (±0.36) c
nd d
nd d
0.26 (±0.12)
0.33 (±0.02)
0.36 (±0.03)
0.34 (±0.05)
0.36 (±0.04)
0.26 (±0.06)
0.29 (±0.14)
0.37 (±0.07)
0.55 (±0.08)
0.58 (±0.08)
0.47 (±0.02)
0.65 (±0.11)
p rom ote r
geneIII
o ri
R
CMP
Nitrate
A. Electron transport:
-
C o lE 1
F os loxP
Gene
B. Intermediary carbon metabolism:
KanR
pJun
ori SC101
Mean intensity ratiob
Fumarate
F1 F2 N1 N2
AMP R
M 1 3 ori
loxP
o ri
R 6 ky
Group 4
r = 0.90
succinyl-CoA synthetase, sucD
glucose-6-phosphate isomerase, gpi
transaldolase B, talB
succinyl-CoA synthetase, sucC
succinate dehydrogenase, sdhA
citrate synthase, gltA
malate oxidoreductase, sfcA
2-oxoglutarate dehydrogenase, sucA
malate dehydrogenase, mdh
2-oxoglutarate dehydrogenase, sucB
succinate dehydrogenase, sdhB
0.99
0.59
0.51
0.46
0.54
0.52
0.52
0.40
0.58
0.75
0.70
(±0.26) c
(±0.11)
(±0.10)
(±0.02)
(±0.08)
(±0.17)
(±0.18)
(±0.08)
(±0.20)
(±0.12)
(±0.03)
0.54
0.67
0.70
0.44
0.50
0.46
0.49
0.41
0.40
0.41
0.57
0.44
0.41
0.59
0.65
0.48
0.40
0.43
2.27
2.43
(±0.11)
(±0.10)
(±0.01)
(±0.24)
(±0.16)
(±0.12)
(±0.05)
(±0.81)
(±1.02)
0.57 (±0.03)
0.40 (±0.05)
0.60 (±0.06)
0.24 (±0.05)
1.10 (±0.13)
nd d
0.93 (±0.21) c
1.32 (±0.11)
1.74 (±0.05)
(±0.06)
(±0.11)
(±0.04)
(±0.04)
(±0.08)
(±0.12)
(±0.06)
(±0.05)
(±0.06)
(±0.05)
(±0.14)
C. Transcription regulation:
Phage Display
Group 5
r = 0.86
Group 6
r = 0.81
-
3006,
2099,
3965,
1987,
4603,
1386,
721,
4019,
1382,
H2O2-activator, hpkR, LysR family
histidine utilization repressor, hutC
ferric uptake regulatory protein, fur
transcriptional regulator, DeoR family
sensor histidine kinase, kinA
ATP-dependent protease, hslV
transcriptional regulator, LacI family
chemotaxis CheV homolog
tetrathionate sensor kinase, ttrS
Figure 2
Whole genome microarrays available at
ORNL
Geobacter metallireducens: Metal-reducing bacterium (GTL)
Shewanella oneidensis MR-1: Metal-reducing bacterium (MGP, GTL)
Rhodopseudomonas
palustris: Photosynthetic
bacterium (MGP, GTL)
Nitrosomonas europaea:
Ammonium-oxidizing
bacterium (MGP)
Desulfovibrio vulgaris:
Sulfate-reducing
bacterium (GTL, NABIR)
Deinococcus radiodurans
R1: Radiation-resistant
bacterium (GTL)
Methanococcus
maripaludis (GTL)
Two primary uses of microarrays for
functional analysis
• Hypothesis-generating, i.e., exploratory,
Gene expression profiling under different
conditions:
e.g., Radiation responses in Deinococcus
radiodurans .
• Hypothesis-driven:
e.g., mutant characterization in Shewanella
oneidensis MR-1.
Deinococcus radiodurans R1 Genome: 3.3 Mb
Replicon                       Size
Chromosome I                   2.65 Mbp
Chromosome II                  412.3 Kbp
Megaplasmid                    177.5 Kbp
Plasmid                        45.7 Kbp

Feature                        Value
% G+C                          66.6%
# ORFs                         3,195
Mean ORF size                  937 bp
% Coding                       91%
# Similar to known proteins    52.2%
# Conserved hypothetical       16%
# Hypothetical                 31.5%
rRNA operons                   9
*D. radiodurans R1 genome sequence and annotation courtesy of TIGR
Radiation Resistance of D. radiodurans R1
Radiation Survival Curve
• The majority of E. coli cells are dead at ~500 Gy.
• D. radiodurans exhibits a shoulder of resistance up to ~5,000 Gy, with no loss of viability.
[Figure: survival curves for D. radiodurans R1 and E. coli; gel of DNA at 0, 1.5, 3, 5, 9, and 24 hours post irradiation with lanes M and CK and size markers 23.1, 9.4, 6.6, and 4.4]
• Very little is known about the DNA repair pathways enabling D. radiodurans to resist ionizing and UV irradiation.
Deinococcus cells can survive acute γ-radiation due to their ability to repair direct damage and remove free radicals.
• Direct damage (20%)
• Indirect damage due to free radicals (80%)
DNA damage repair
Re-initiate DNA synthesis
(early events after irradiation)
γ-radiation
γ-photon
(20%)
Cells
DNA
damages
mRNA
degradation
Irradiation-induced
Free radicals (80%)
Protein
degradation
Minimize free radical levels
(late events after irradiation)
Cellular functions
impaired
Replication
impaired
Cell division
arrested
Cells grow
slow or dead
Gene Expression Profiling: Experimental Design
• Recovery of D. radiodurans (wild-type strain R1) from acute radiation (exposure dose = 15,000 Grays of γ-radiation)
Cell Sample                  Recovery Time (in hours) at 32°C
Control (non-irradiated)     –
Irradiated:
1                            0
2                            0.5
3                            1.5
4                            3
5                            5
6                            9
7                            12
8                            16
9                            24
• 3 biological replicates (different mRNAs)
• 4 technical replicates
• Total replicates: 12
Collaboration with Mike Daly
•More than 800 genes
Time (h)
Hierarchical Clustering Analysis
of Expression Profile Patterns
Gene#, putative functiona
A. recA-like activation pattern
r=0.83
DR0911
DR2220
DR2221
DRB0069
DRB0067
DR0261
DRA0344
DR0099
DR2129
DR2128
DR0324
DR2337
DRA0346
DR1825
DR1771
DRA0345
DR0422
DR1143
DR0003
DR1776
DR2340
DR2610
DR1645
DR0696
DR0421
DR1775
DR1561
DR2285
DR2356
DR2275
DR0206
DR0204
DR1354
DR0203
DR0205
DR1357
DR2482
DR2483
DRA0008
DRA0234
DR1359
DR2127
DR1356
DRB0136
DR1548
DR0207
DRA0249
DR0665
DR0596
DR0912
recA-like expression profile:
DNA replication
DNA
repair
Recombination
Cell wall metabolism
Cellular transport
Uncharacterized proteins
Superoxide dismutase
r=0.71
0.5
5
3
3
3
0.5
1.5
0.5
1.5
1.5
0.5
1.5
0.5
1.5
1.5
1.5
1.5
1.5
1.5
1.5
1.5
0.5
1.5
1.5
1.5
1.5
1.5
3
3
3
3
3
3
1.5
3
1.5
1.5
1.5
3
1.5
1.5
3
3
3
3
3
3
3
0.5
0.5
are induced at 1.5 hr
radiation.
regulated than downregulated.
genes which are
functionally unknown
are significantly
changed upon
irradiation.
B. Growth-related activation pattern
DR1172
DR0461
DR1595
DRA0043
DRA0042
DRA0031
DRA0065
DR2263
DRA0275
DR1279
Proteases, nucleases
Lea76/LEa29-like desiccation resistance protein
Bacillus yacB ortholog
6-phosphogluconate dehydrogenase, gnd
TDP-rhamnose synthetase
Glucose-1-phosphate thymidylyltransferase, rfbA
Glucose-1-phosphate thymidylyltransferase
Chromosomal protein HU HupA, hupA
Bacterioferritin, Iron chelating protein
Soluble cytochrome C
Superoxide dismutase (Mn)
2.66
2.58
2.30
5.08
3.70
2.48
7.71
6.41
4.80
3.91
(±0.60)
(±0.81)
(±0.52)
(±2.12)
(±1.19)
(±1.64)
(±2.07)
(±1.97)
(±1.22)
(±1.43)
24
24
24
12
12
12
24
16
24
24
0.33
0.25
0.37
0.48
0.42
0.23
0.25
0.46
0.35
0.45
(±0.12)
(±0.05)
(±0.13)
(±0.22)
(±0.12)
(±0.07)
(±0.06)
(±0.09)
(±0.15)
(±0.25)
12
3
3
1.5
1.5
3
1.5
1.5
3
5
C. Repressed pattern
r=0.77
DR1126
DR1337
DR0728
DR0977
DR1742
DR1998
DR1146
DR0493
DR0674
DR2620
TCA cycle
0.2
Genes involved in de novo synthesis of
amino acids and nucleotides
(hr)c
•More than 40% of the
Glyoxylate shunt
Repressed Genes (early to mid
phases):
1.99 (±1.37)
3.13 (±1.49)
5.24 (±2.94)
3.18 (±1.39)
4.37 (±1.21)
3.36 (±1.68)
1.80 (±1.08)
3.01 (±1.20)
5.92 (±2.09)
4.03 (±2.80)
3.30 (±1.47)
7.41 (±5.71)
3.52 (±1.94)
3.21 (±1.48)
3.52 (±1.15)
10.05 (±4.39)
18.85 (±7.46)
8.85 (±4.26)
14.03 (±5.53)
4.70 (±2.83)
7.98 (±3.86)
4.13 (±1.67)
5.88 (±2.79)
7.19 (±2.16)
4.94 (±2.30)
3.30 (±1.69)
6.00 (±1.40)
2.36 (±0.40)
3.35 (±0.45)
4.93 (±1.81)
5.45 (±2.65)
6.01 (±1.35)
3.78 (±0.42)
3.82 (±0.86)
4.10 (±2.45)
6.79 (±2.56)
5.75 (±2.92)
5.43 (±1.22)
6.60 (±2.00)
12.76 (±5.27)
24.83 (±11.13)
5.40 (±1.50)
9.85 (±5.98)
5.22 (±0.46)
5.62 (±2.35)
15.47 (±8.31)
6.47 (±4.43)
11.66 (±5.74)
3.22 (±1.31)
3.19 (±0.80)
Time
•More genes are up-
Induced Genes (early to mid phases):
Stress response
DNA-directed RNA polymerase beta subunit, rpoC
Tellurium resistance protein TerB
Tellurium resistance protein TerE
Subtilisin serine protease
Extracellular nuclease with Fibronectin III domains
8-oxo-dGTPase, mutT
LEXA repressor, HTH+protease, lexA
SsDNA-binding protein, ssb
Ribosomal component L17 , rplQ
RNA polymerase alpha subunit, rpoA
Probable glutamate formiminotransferase
Uncharacterized protein
PprA protein, involved in DNA damage resistance
Protein-export membrane protein
UVRA ABC family ATPase, uvrA-1
Predicted esterase
Trans-aconitate methylase
Uncharacterized protein
Uncharacterized protein
Nudix family pyrophosphatase
RecA, recA
Periplasmic binding protein, fliY
Teichoic acid biosynthesis protein, wecG
V-type ATPase synthase, subunit K
Uncharacterized protein
Superfamily I helicase, uvrD
UDP-N-acetylglucosamine 2-epimerase, wecB
MutY, A/G-specific adenine glycosylase, mutY
Nudix family hydrolase
Excinuclease ABC subunit B, uvrB
Uncharacterized protein
Uncharacterized membrane protein
Excinuclease ABC subunit C, uvrC
Uncharacterized membrane protein
ABC transporter ATPase
ABC transporter, permease subunit
Predicted transcription regulator
McrA nuclease
Conserved membrane protein
Uncharacterized protein,
ABC transporter, periplasmic subunit
Ribosomal protein S4, rpsD
ABC transporter, ATP-binding protein
Putative DEAH ATP-dependent helicase, hepA
Bacillus ykwD ortholog, PRP1 superfamily protein
ComEA related protein, secreted
Metalloproteinase, leishmanolysin-like
Uncharacterized protein
Resolvasome RuvABC, subunit B, ruvB
DNA-directed RNA polymerase beta subunit, rpoB
Ratio
(fold)b
1
5
RecJ like DHH superfamily Phosphohydrolase
Transaldolase, tal
Fructokinase, cscK
Phosphoenolpyruvate carboxykinase, pckA
Glucose-6-phosphate isomerase, pgi
Catalase, CATX, katA
GSP26 general stress like protein
Formamidopyrimidine-DNA glycosylase, mutM
Argininosuccinate synthase, ASSY, argG
Cytochrome oxidase subunit I, COX1, caaA
Discovery of a Novel ATP-dependent DNA ligase
Ligase (DR0100)
[Figure: relative expression level (0 to 16) vs. time (h) for DRB0098 (HD family phosphohydrolase and nucleotide kinase), DRB0099 (uncharacterized conserved protein), DRB0100 (predicted DNA ligase), and DR2069 (NAD-dependent ligase, dnlJ)]
• A novel ATP-dependent DNA ligase was highly expressed with the recA profile.
• It shares consensus motifs with ligases from eukaryotes.
Alignment of ligase motifs (start position, motif segments, number of residues between motifs, end position); an asterisk (*) marks a position in motif I in the original figure.
Sequence                start  motif I       gap  motif III  gap  motif IIIa    gap  motif IV  end
6459863 DNLJ_DR2069     123    FTGELKIDGLSV  44   LEVRGEVYL  44   KAILYAVGKRDG  50   ADGTVLK   300
2506362 DNLJ_ECOLI      110    WCCELKLDGLAV  46   LEVRGEVFL  44   TFFCYGVGVLEG  51   IDGVVIK   290
1352290 DNL1_MOUSE      561    FTCEYKYDGQRA  41   FILDTEAVA  31   CLYAFDLIYLNG  51   CEGLMVK   723
1706482 DNL4_HUMAN      201    FYIETKLDGERM  46   CILDGEMMA  28   CYCVFDVLMVNN  51   EEGIMVK   365
1706481 DNL3_HUMAN      416    MFSEIKYDGERV  40   MILDSEVLL  27   CLFVFDCIYFND  51   LEGLVLK   573
11498455 AF0849         91     VVLEEKMNGYNV  40   YMLCCEAVG  16   EFFLFDVREGKT  46   REGVVFK   232
15894039 CAC0752        38     CVLEEKVDGANC  49   YVMYGEWLY  12   YFMEFDIFDKKE  50   RENLEIR   188
6460914 DRB0100         35     VVVTEKLDGENT  37   WRFCGENVY  12   YFYLFSVWDDLN  42   MEGYVVR   165
consensus/100%                 hh...KhsG.th       h.h.sE.hh       .hh.ashh...t       .-sh.h+
secondary str (1DGS)           EEEEE EEE          EEEEEEEE        EEEE               EEEEE
Liu et al. 2003. PNAS, 100: 4191-4196
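The consensus line for motif I shows two upper-case positions (K and G). A minimal sketch of how such fully conserved columns can be found is given here, using only the eight motif I segments listed in the alignment; the function name `conserved_columns` and the 0-based column indexing are illustrative choices, not part of the original analysis.

```python
# Motif I segments as listed in the alignment (one string per sequence).
MOTIF_I = {
    "DNLJ_DR2069": "FTGELKIDGLSV",
    "DNLJ_ECOLI": "WCCELKLDGLAV",
    "DNL1_MOUSE": "FTCEYKYDGQRA",
    "DNL4_HUMAN": "FYIETKLDGERM",
    "DNL3_HUMAN": "MFSEIKYDGERV",
    "AF0849": "VVLEEKMNGYNV",
    "CAC0752": "CVLEEKVDGANC",
    "DRB0100": "VVVTEKLDGENT",
}


def conserved_columns(segments):
    """Return {column_index: residue} for columns identical in every segment."""
    rows = list(segments)
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all segments must have the same length")
    conserved = {}
    for col in range(length):
        residues = {row[col] for row in rows}
        if len(residues) == 1:  # same residue in all sequences
            conserved[col] = residues.pop()
    return conserved


CONSERVED = conserved_columns(MOTIF_I.values())
print(CONSERVED)  # {5: 'K', 8: 'G'}
```

The two invariant columns (0-based positions 5 and 8) line up with the K and G of the consensus string `hh...KhsG.th`, including for the predicted ligase DRB0100.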
Highly coordinated regulations
• Energy pathway switching,
less energy produced.
• Minimizing energy demands --Shutdown de novo biosynthetic
pathways
Energy
• Energy pathway switching --- less
free radicals produced.
• Increasing activities of the genes
involved in removing free radicals.
Free radicals
Biosynthetic
precursors
• Shutdown de novo biosynthetic pathways to minimize energy
requirement.
• Increasing activities of proteases and nucleases to provide amino
acids and nucleotides for protein, DNA and RNA synthesis.
Shewanella oneidensis – MR-1
Habitats:
•
•
•
•
Formate
Lactate
Pyruvate
Amino
Acids
H2
O2
NO3-, NO2-
lake & marine sediments
Mn(IV)
deep sea
Mn(III)
oil brine
spoiled food
Fe (III)
Fumarate
S
Mine waste
Black Sea
Oneida Lake
Green Bay
Panama Basin
Mississippi Delta
North Sea Redox Interfaces
DMSO
TMAO
S0
S2O3 2-
U(VI)
Cr(VI), Tc, As, Se, I
With this kind of versatility, what will it really do?
DOE Shewanella Federation
TIGR (John Heidelberg) Sequencing, annotation
Metabolomics
Center for Microbial
Ecology, MSU
(J.Tiedje, J.Cole,
J.Klappenbach)
USC, JPL (K.Nealson)
ORNL ESD
Microbial
Functional
Genomics Group
UCB
(J.
Keasling)
ANL (C.Giometti)
BCM (T. Palzkill)
B.Palsson (UCSD)
Adam Arkin (LBL)
M.Riley (Woods Hole)
ISB (E. Kolker)
PNNL (J.Frederickson, D. Smith)
ORNL LSD, CASD
(F.Larimer, B.
Hettich)
Large Genomes To Life Project: $38M for 5 years
Rapid Deduction of Stress Response Pathways in
Metal/Radionuclide Reducing Bacteria
Stress responses on:
Desulfovibrio vulgaris
Shewanella oneidensis
Geobacter metallireducens
National Laboratories
Universities
Private Organizations
UC Berkeley
U Washington
U Missouri
(Consultant)
Summary of microarray analysis for Shewanella
• Responses to 11 different electron acceptors
• Mutant characterization with chemostats
• Low-pH and high-pH stress
• Heat shock, cold shock
• Oxidative stress (e.g., H2O2) (Ting Li)
• High salt
• Carbon starvation
• Metal stress: strontium, chromium
• Hypothetical proteins
• Many mutants
Defining Gene Function through Deletion
Mutagenesis, ~ 80 deletion mutants
GLOBAL REGULATORS: etrA, narQ, fur, crp, arcA, envZ
cAMP-BINDING REGULATORS: cAMP1, cAMP2, cAMP3
ADENYLATE CYCLASES: cya1, cya2, cya3
OUTER MEMBRANE PROTEINS AND CYTOCHROMES: mtrC, mtrA, omcA
SIGMA FACTORS: rpoH, rpoE,
STRESS RESPONSE:
oxyR, bolA, dps, ompR, cpxR
DOUBLE MUTANTS: etrA-fur, etrA-crp, cpxR-cpxA, ompR-envZ, cpxR-cpxA
PAS domain (old annotation): 0834, 0906, 1761,4254, 4326, 4917
Hypothetical proteins: 1377, 3584
Transcriptional factors: 220 genes, 78 within single operon,
Cytochrome genes: 42 genes
Computational Prediction of the function of
the SO1328 Gene Product (LysR)
• It was annotated as a LysR family protein.
• It is induced 5- to 7-fold by H2O2 treatment.
• It shares ~34% sequence homology with the E. coli OxyR gene.
• Its 3D structure is similar to that of OxyR in E. coli.
C-terminal domain
N-terminal DNA-binding
domain
Growth phenotype of LysR
deletion mutant (SO1328)
[Figure: growth curves (log OD vs. time, 0 to 10 hours) for wild type (WT) and the LysR mutant, each with 0 µM and 2,000 µM H2O2]
• Less growth was obtained when the WT cells were treated with 2,000 µM H2O2.
• Wild-type cells were sensitive to H2O2.
• No differences were observed between treatment and control for the mutant cells.
• The LysR mutant is not sensitive to H2O2.
• In E. coli, the OxyR mutant is more sensitive to H2O2.
Microarray analysis of LysR
mutant in response to H2O2 stress
Deregulation of the major H2O2-responsive genes (40 µM H2O2, 2 min)
[Figure: fold induction (0 to 100) of the Dps family protein, ahpC, katG-1, and ahpF genes in WT and the LysR mutant]
• Key genes (e.g., dps, katG) known to be involved in oxidative stress were not affected by H2O2 in the mutant.
• Since the OxyR mutant is more resistant to H2O2, the genes involved in oxidative stress would be expected to be highly expressed, but they are not. This suggests that novel mechanisms and pathways may exist.
• The OxyR-dps double mutant is also resistant to H2O2, suggesting that the oxidative responses in MR-1 are very complicated.
Proteomics
Tools for studying proteomics
2-Dimensional gel electrophoresis
Mass spectrometry
Phage-display
Yeast two hybrid system
Protein arrays
Structural determination: X-rays, NMR
Using phage-display to study protein-protein interactions and regulation
[Figure: phage-display scheme based on a Gateway cloning vector; labels include extracellular, periplasm, and cytoplasm compartments, POI, Fos, Jun, pIII, PSP promoter, pJun, geneIII, transcription & translation, loxP sites, ColE1, M13, SC101, and R6K origins, and KanR, AmpR, and CmpR markers]
• First key step: cloning all
genes into universal vector.
• The cloning systems were
optimized.
• All primers were
synthesized.
• 3,853 genes were cloned.
• Sequenced 50 clones, no
errors were found.
Expression of Shewanella proteins
from the pDEST17 vector
[Figure: gel of NarQ, ArcA, Fur, and EtrA expressed from the pDEST17 vector, with a GST lane; size labels 175, 83, 62, 48, 33, and 25 kDa and 70.2, 34.2, 32.4, and 20.5 kDa]
n = no insert control; i = expression induced with 0.5 mM IPTG
Global regulatory genes are well expressed in E. coli
Identification of binding motifs of ArcA
by gel shifting assays
gltA
aceA
aceB
1. Consistent with E. coli: icd, gltA-sdhCAB, sucABCD.
2. Different from E. coli: aceBA, potentially regulating the glyoxylate shunt pathway.
3. Shewanella ArcA can also interact with promoters of other TCA cycle-related genes (not found in E. coli): SO0970 (fumarate reductase flavoprotein subunit precursor), SO1538 (isocitrate dehydrogenase), SO2222 (fumarate hydratase).
Icd
sucAB
sdhCAB
sucCD
Using promoter microarrays to study protein-DNA interactions and understand regulatory networks
1
In vitro/vivo pull down
qPCR amplification
2
Non specific competitors
1. BSA/milk
Direct binding
2. Random DNA
Verification by EMSA/RT-PCR/cDNA microarray
Challenges in protein arrays
• Antibodies are commonly used as probes in protein arrays.
• Two big challenges:
– Loss of activity: The antibody's active binding site may bond chemically to the slide surface, leaving it unavailable to the antigen.
– Cross-reactivity: Specificity is also a big issue for antibody protein arrays.
Development of novel chemistry
for protein array fabrication
Langmuir 20, (2004), 8877-8885.
Proteomics, in revision
Thin film coating on a cleaned glass substrate:
1. Polycation
2. Wash
3. Polyanion
4. Wash
(repeat)
5. Polycation
Proteins are affixed on the slide by:
• Entrapment by porous structure of the polymer
• Electrostatic interaction
• But not by covalent bonding
Proteins spotted on different slides
2 fold decrease
Nanofilm coated slide
• More sensitive
• Less background noise
Nanofilm-coated
Superaldehyde
Poly-Lysine
Superamine
Antibody arrays
1
2
3
4
5
Anti-Human IgG
BSA
Anti-Fibronectin
BSA
Streptavidin
Very good specificity of the antibody-antigen reactions was obtained.
BSA
• A patent was filed and licensed to a company
• Nominated by ORNL for R&D100 Award.
Detection of Single Base Pair Differences
GAG GGG GAA AGC GGG GGA TCG CAA GAC CTC GCG TGA TTG GAG CGG CCG AT
One-mismatch probe: CCT AGC GTT XTG GAG CGC A
Two-mismatch probe: CCT AGC GTT XYG GAG CGC A
Three-mismatch probe: CCT AGC GTT XYZ GAG CGC A
[Figure: discrimination factor (Fm/Fp, 0 to 1.2) for probes 1 to 16, covering the perfect match and 1-, 2-, 3-, 4-, and 5-mismatch probes (X = C, G, A, T; XY = GG, AA, AT, GA; XYZ = GAC, AGC); blank bars: polymer-coated slide; filled bars: SuperAldehyde slide]
• Short oligos (<25 bp) without end modification, typically $20/oligo.
• More than a 5-fold difference in signal intensity between perfect-match (PM) and mismatch (MM) probes.
• A single mismatch can be clearly differentiated.
Arbitrary cutoff for network identification
Correlation matrix of 5 genes
Main challenges
• All methods define a cutoff arbitrarily.
• Identified clusters or modules are ambiguous.

Original matrix (upper triangle):
1    0.3  0.9  0.5  0.4
     1    0.7  0.3  0.8
          1    0.4  0.2
               1    0.6
                    1

Rc = 0.4: 7 interactions left
1    0    0.9  0.5  0.4
     1    0.7  0    0.8
          1    0.4  0
               1    0.6
                    1

Rc = 0.7: only 3 interactions left
1    0    0.9  0    0
     1    0.7  0    0.8
          1    0    0
               1    0
                    1
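The effect of the cutoff on the 5-gene example can be reproduced with a short sketch. The pair values are the upper-triangle correlations from the slide; the helper name `interactions_left` and the convention that a pair is kept when |r| is at or above Rc are assumptions made for illustration.

```python
# Upper-triangle correlations for genes 1-5, as read from the slide.
CORR = {
    (1, 2): 0.3, (1, 3): 0.9, (1, 4): 0.5, (1, 5): 0.4,
    (2, 3): 0.7, (2, 4): 0.3, (2, 5): 0.8,
    (3, 4): 0.4, (3, 5): 0.2,
    (4, 5): 0.6,
}


def interactions_left(corr, cutoff):
    """Keep gene pairs whose absolute correlation is at or above the cutoff Rc."""
    return {pair: r for pair, r in corr.items() if abs(r) >= cutoff}


KEPT_04 = interactions_left(CORR, 0.4)
KEPT_07 = interactions_left(CORR, 0.7)
print(len(KEPT_04), len(KEPT_07))  # 7 3
```

Moving Rc from 0.4 to 0.7 removes four of the seven links, which is the ambiguity the slide points to: the inferred network depends directly on a cutoff chosen by hand.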
Novel approach for network
identification
Random Matrix Theory and Level Statistics
• Poisson distribution (cutoff > 0.7)
• Wigner-Dyson distribution (cutoff < 0.7)
Level Spacing Distribution of Yeast Gene Correlation Matrix
[Figure: level spacing distribution P(s) vs. level spacing s (0 to 3), with curves labeled P(0.8), P(0.7), P(0.6), and P(0.5)]
Poisson distribution:
$$P(s) = \exp(-s)$$
Wigner-Dyson distribution:
$$P(s) = \frac{\pi}{2}\, s \exp\left(-\frac{\pi s^{2}}{4}\right)$$
• Random properties: Wigner-Dyson distribution
• Nonrandom properties: Poisson distribution
Main advantages:
• Universal laws support
• Automatic cutoff
• Reliable, sensitive, robust
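As a quick numerical check on the two level-spacing laws named on this slide, the sketch below evaluates both densities and confirms that each is normalized with unit mean spacing. It uses the standard forms P(s) = exp(-s) and P(s) = (π/2) s exp(-π s²/4); the trapezoidal helper `integrate` and its step count are illustrative choices, and this is not the cutoff-selection procedure itself.

```python
import math


def poisson_spacing(s):
    """Poisson nearest-neighbor spacing density: P(s) = exp(-s)."""
    return math.exp(-s)


def wigner_dyson_spacing(s):
    """Wigner-Dyson density: P(s) = (pi/2) * s * exp(-pi * s**2 / 4)."""
    return 0.5 * math.pi * s * math.exp(-math.pi * s * s / 4.0)


def integrate(f, upper=20.0, steps=200_000):
    """Trapezoidal integral of f on [0, upper]; the tail beyond is negligible."""
    h = upper / steps
    total = 0.5 * (f(0.0) + f(upper))
    for i in range(1, steps):
        total += f(i * h)
    return total * h


NORM_P = integrate(poisson_spacing)
NORM_WD = integrate(wigner_dyson_spacing)
MEAN_P = integrate(lambda s: s * poisson_spacing(s))
MEAN_WD = integrate(lambda s: s * wigner_dyson_spacing(s))

# Level repulsion: the Wigner-Dyson density vanishes at s = 0, the Poisson one does not.
print(poisson_spacing(0.0), wigner_dyson_spacing(0.0))
```

The behavior near s = 0 is what separates the two regimes in practice: small spacings are common under the Poisson law and suppressed under the Wigner-Dyson law.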
Identification of 27 Modules from Yeast
Cell Cycle Expression Data
Experimental Validation of
some hypothetical proteins
• Cycloheximide inhibits
protein synthesis by
blocking peptidyl
transferase.
• The mutants are more sensitive to this drug, suggesting that they have defective ribosomes.
• Thus, these genes are likely involved in ribosome biogenesis.
Functional identification of a
hypothetical protein in Shewanella
For Shewanella heat shock data, SO2017 is grouped with heat shock proteins.
[Figure: module containing 1. dnaK, 2. htpG, 3. groEL, 4. groES, 5. Lon, 6. dnaJ, 7. SO2017]
Experimental validation of SO2017
[Figure: growth (OD600, log scale from 0.01 to 10) vs. time (0 to 8 h) for DSP10 and the SO2017 mutant at 30°C and 42°C]
• The SO2017 mutant is sensitive to heat shock.
• This gene is indeed involved in the heat shock response, suggesting that the prediction is correct.
Pioneering advances in microarray-based technologies to address challenges in microbial community genomics
• Challenges:
– Specificity: Environmental sequence divergences.
– Sensitivity: Low biomass.
– Quantification
– Existence of contaminants: Humic materials, organic contaminants, metals and radionuclides.
• Solutions:
– Developing different types of microarrays and novel chemistry to address different levels of specificity.
– Developing novel signal amplification strategies to increase sensitivity.
– Optimizing microarray protocols for reliable quantification.
Summary of 50mer-based FGAs
for environmental studies
Oligonucleotide probe size: 50 bp
Tiquia et al. 2004. BioTechniques 36, 664-675
Rhee et al. 2004, AEM 70:4303-4317
• Nitrogen cycling: 302
• Sulfate reduction: 204
• Carbon cycling: 566
• Phosphorus utilization: 79
• Organic contaminant degradation: 770
• Metal resistance and oxidation: 85
• Total: 2,006 probes
• All probes are < 88% similarity
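The category counts above can be tallied with a few lines to confirm the stated total and to see each category's share of the array. The dictionary simply restates the slide's numbers; the variable names are illustrative.

```python
# Probe counts per functional category, as listed on the slide.
FGA_PROBES = {
    "Nitrogen cycling": 302,
    "Sulfate reduction": 204,
    "Carbon cycling": 566,
    "Phosphorus utilization": 79,
    "Organic contaminant degradation": 770,
    "Metal resistance and oxidation": 85,
}

TOTAL = sum(FGA_PROBES.values())

# Share of the array taken by each category, in percent (one decimal place).
SHARES = {name: round(100.0 * n / TOTAL, 1) for name, n in FGA_PROBES.items()}

print(TOTAL)  # 2006
for name, pct in SHARES.items():
    print(f"{name}: {pct}%")
```

Organic contaminant degradation and carbon cycling together account for roughly two thirds of the probes.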
Specificity of 50-mer microarrays
Specific hybridization was obtained for probes with ≤ 85% similarity
• 5 nirS genes were mixed together; only the corresponding genes hybridized.
• 6 types of genes (nirK, nirS, nifH, amoA, pmoA, dsrAB) were mixed together; only the corresponding genes hybridized.
Sensitivity
[Figure: hybridization images, panels 1 to 8, for cells (labeled 1.6 × 10^9, 1.3 × 10^7, and 3.0 × 10^6) and genomic DNA (labeled 500 ng, 50 ng, and 25 ng)]
Detection limit
• 50 ng pure DNA in the presence of nontarget templates
• 10^7 cells
Quantification and validation
[Figure: microarray hybridization signal (log signal ratio) vs. cell number (log N) for 3.0 × 10^6 to 1.6 × 10^9 cells, and real-time PCR (log copy number) vs. microarray hybridization (log SNR) for 12 genes; the panels report r2 = 0.98 and r = 0.86]
Genes: 1: gi4704462-TFD; 2: gi4704463-TFD-Microcosm; 3: gi4704464-TFD-Enrichment; 4: gi4704463-TFD; 5: gi4704464-TFD-Microcosm; 6: gi4704465-TFD-Enrichment; 7: gi2828015-TFD; 8: gi2828016-TFD-Microcosm; 9: gi2828017-TFD-Enrichment; 10: gi2828018-TFD; 11: gi2828019-TFD-Microcosm; 12: gi2828020
Quantification
• Good linear relationship
• Quantitative
• Microarray results are consistent with real-time PCR
Novel amplification approach for
increasing hybridization sensitivity
[Figure: gel with lanes M, A1 to B8, M; log signal intensity (2.6 to 4.6) vs. log DNA template concentration (ng, -2 to 3), with r2 = 0.9910 (SO4131), 0.9922 (SO3234), 0.9924 (SO1077), 0.9934 (SO4136), and 0.9942 (SO2637)]
• As low as 10 fg (2 cells) can be detected
• Amplification is quantitative for the majority of the genes
Submitted to PNAS
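The "10 fg (2 cells)" equivalence can be sanity-checked with a back-of-the-envelope calculation. Both inputs below are assumptions that do not come from the slide: an average mass of about 650 g/mol per double-stranded base pair and a genome of roughly 5 Mbp for the test organism.

```python
AVOGADRO = 6.02214076e23     # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # ASSUMPTION: average mass of one double-stranded base pair
GENOME_BP = 5.0e6            # ASSUMPTION: approximate genome size in bp (not from the slide)


def genome_mass_fg(genome_bp):
    """Mass of one genome copy in femtograms."""
    grams = genome_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e15


def genome_equivalents(dna_fg, genome_bp):
    """Number of genome copies represented by a given DNA mass."""
    return dna_fg / genome_mass_fg(genome_bp)


MASS_FG = genome_mass_fg(GENOME_BP)
COPIES = genome_equivalents(10.0, GENOME_BP)
print(f"{MASS_FG:.2f} fg per genome, {COPIES:.2f} genome copies in 10 fg")
```

Under these assumptions one genome weighs a little over 5 fg, so 10 fg corresponds to about two genome copies, in line with the figure quoted on the slide.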
NABIR Field Research
Center Samples
Sample     pH    Nitrate   Uranium   Nickel   TOC
FW-300*    6.1   1.200     0.001     0.005    30
FW-003     6.0   1060      0.01      0.015    100
FW-005     3.9   175.0     6.40      5.00     70
FW-010     3.5   42000     0.17      18.0     175
FW-015     3.4   8300      7.70      8.80     65
TPB-16     6.3   30.00     1.10      ND       65
• 2 L groundwater
• Genes analyzed: 16S rRNA, nirS, nirK, dsrAB, amoA
[Map: capped S-3 Ponds contaminant source with Areas 1, 2, and 3; wells 003, 005, 010, 015, and 16; zones labeled most, less, and least contaminated; north arrow; scale marks 30 m and 275 m]
6 samples were taken to assess the effects of contaminants on microbial community structure
Groundwater samples with
very low biomass
• 2 L groundwater from six different sites.
• Cell counts: 1-5 × 10^5/ml
• DNA was isolated; 1/20 of the DNA was manipulated and used for hybridization.
• Good hybridization was obtained with the DNA manipulated with the new method.
• No hybridization was obtained if the DNA was not manipulated.
Difference of functional genes in samples
from NABIR Field Research Center
[Figure: signal intensity (0 to 40,000) across 53 functional genes for FW300 (reference site) and FW010 (highly contaminated site)]
• Clear differences were observed between contaminated and noncontaminated sites.
• E.g., some genes are present in the noncontaminated site but not in the contaminated sites.
Overall diversity among different samples
                                      FW300      FW003      FW021      FW010     FW024
FW300                                 61(20%)    189(36%)   174(35%)   80(21%)   111(23%)
FW003                                            25(11%)    144(35%)   61(17%)   84(20%)
FW021                                                       10(5%)     64(20%)   90(24%)
FW010                                                                  6(5%)     118(37%)
FW024                                                                            30(16%)
Total Genes Detected                  302        219        192        130       190
Genetic diversity, Simpson’s (1/D)a   125.5      67.1       26.6       17.4      35.7
• Overall diversity correlates with contaminant level.
• The proportion of overlapping genes between samples was consistent with the
contaminant level and geochemistry.
• A significant portion (5-20%) of all detected genes were unique to each sample,
even though they are very close. Thus, important microbial populations appear to
be highly heterogeneous in this groundwater system.
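For readers unfamiliar with the reciprocal Simpson index (1/D) used in the table, a minimal sketch follows. It uses the common definition D = Σ p_i², where p_i is the relative abundance of gene i; how abundances were derived from hybridization signals in the original analysis is not stated on the slide, and the example inputs are hypothetical.

```python
def simpson_reciprocal(abundances):
    """Reciprocal Simpson index 1/D, with D = sum of squared relative abundances."""
    total = float(sum(abundances))
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    d = sum((a / total) ** 2 for a in abundances)
    return 1.0 / d


# Hypothetical signal intensities (not from the slide).
EVEN = simpson_reciprocal([10, 10, 10, 10])   # four equally abundant genes
UNEVEN = simpson_reciprocal([5, 1, 1, 1])     # one dominant gene
print(EVEN, round(UNEVEN, 3))
```

With perfectly even abundances 1/D equals the number of genes detected, and it drops as a few genes dominate, which is why the index can fall well below the total gene count reported for each sample.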
CommOligo --- New oligo probe
design program for community analysis
Number and specificity of designed probes (50-mer) by different programs
Group sequences of nirS and nirK
(842 gene sequences)
Program              Total ORFs   ORFs rejected   Probes designed   Specific probes   Nonspecific   Group-specific
ArrayOligoSelector   842          0               842               117               725           0
OligoArray           842          35              807               70                737           0
OligoArray 2.0       842          51              791               35                756           0
OligoPicker          842          657             185               141               44            0
CommOligo            842          512             330               147               0             183
• Useful for both whole genome microarrays and community arrays
• Able to design group-specific probes
• Better performance than other programs
Probes Designed for a Second
Generation FGA
• Nitrogen cycling: 5089
• Carbon cycling: 9198
• Sulfate reduction: 1006
• Phosphorus utilization: 438
• Organic contaminant degradation: 5359
• Metal resistance and oxidation: 2303
Total: 23,408 genes
• 23,000 probes designed
• Will be very useful for community and ecological studies
Community Genomics
Grand challenges
• Extremely high diversity, 5000
species/g soil
• 99% of the microbial species are
uncultured
Whole community sequencing
[Figure: phylogenetic tree of clones labeled 010A, 010B, and 010D with reference sequences including Ralstonia eutropha, Ralstonia NI1, Azoarcus eutrophus, Azoarcus FL05, Acidovorax 3DHB1, Rhodoferax antarcticus, Aquaspirillum autotrophicum, Pseudomonas marginalis, Pseudomonas stutzeri, Rhizobium gallicum, and several uncultured clones; bootstrap values shown at nodes; scale bar 0.05]
• Sample from NABIR Field Research Center at ORNL
• Sequenced by DOE Joint Genome Institute
• 20 species based on 16S rRNA
Sequencing a stable thermophilic
terephthalate (TA)-degrading community
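[Diagram labels: TA, Ac, H2 + CO2, CO2, CH4 + CO2; panels (A) and (B)]
Reactions and ΔG°' (kJ/reaction):
(1) $\mathrm{TA^{2-} + 8\,H_2O \rightarrow 3\,acetate^- + 3\,H^+ + 2\,HCO_3^- + 3\,H_2}$ (43.2)
(2) $\mathrm{4\,H_2 + HCO_3^- + H^+ \rightarrow CH_4 + 3\,H_2O}$ (-135.6)
(3) $\mathrm{acetate^- + H_2O \rightarrow HCO_3^- + CH_4}$ (-31.0)
(4) $\mathrm{4\,TA^{2-} + 35\,H_2O \rightarrow 17\,HCO_3^- + 9\,H^+ + 15\,CH_4}$ (-151.9)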
• Terephthalate (TA) or 1,4-benzene dicarboxylic acid is a major byproduct
of the plastics manufacturing industry.
• Three dominant populations:
– Pelotomaculum: converting TA to acetate and hydrogen.
– Methanothrix: converting acetate to methane and carbon dioxide.
– A representative of candidate bacterial phylum OP5, unknown
function, but may also ferment TA.
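The overall reaction (4) can be recovered by combining reactions (1) to (3), which also makes the energetic logic of the syntrophy explicit. The sketch below is a bookkeeping check only: the multipliers 4, 3, and 12 are worked out here from the stoichiometry, and the species names are plain labels.

```python
# Net stoichiometry of each step (products positive, reactants negative).
REACTIONS = {
    1: {"TA": -1, "H2O": -8, "acetate": 3, "H+": 3, "HCO3-": 2, "H2": 3},
    2: {"H2": -4, "HCO3-": -1, "H+": -1, "CH4": 1, "H2O": 3},
    3: {"acetate": -1, "H2O": -1, "HCO3-": 1, "CH4": 1},
}
DELTA_G = {1: 43.2, 2: -135.6, 3: -31.0}   # kJ/reaction, from the slide
MULTIPLIERS = {1: 4, 2: 3, 3: 12}          # 4 x (1) + 3 x (2) + 12 x (3)


def combine(reactions, multipliers):
    """Sum scaled reactions and drop species that cancel out."""
    net = {}
    for key, factor in multipliers.items():
        for species, coeff in reactions[key].items():
            net[species] = net.get(species, 0) + factor * coeff
    return {species: c for species, c in net.items() if c != 0}


NET = combine(REACTIONS, MULTIPLIERS)
DG_TOTAL = sum(MULTIPLIERS[k] * DELTA_G[k] for k in MULTIPLIERS)
DG_PER_TA = DG_TOTAL / 4

print(NET)
print(round(DG_TOTAL, 1), round(DG_PER_TA, 1))
```

Acetate and H2 cancel, leaving exactly the stoichiometry of reaction (4). The summed value is about -606 kJ for four TA, or roughly -151.5 kJ per TA, close to but not identical with the -151.9 listed for (4). The first step is unfavorable on its own (+43.2) and becomes feasible only when the methanogenic steps consume its products.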
Syntrophic Interaction
Shewanella-Clostridium Co-Culture
[Diagram labels: MeOH + Fe(III), 14CO2, Fe(II), growth (Daniel, Gottschalk et al. 1999)]
• Functional genomics of Shewanella in co-culture [towards microbial communities]
• Establish Shewanella-Clostridium co-culture
• MR-1 & Clostridium acetobutylicum or C. sphenoides
• Global expression analyses of co-cultures
Also: Desulfovibrio (H2 production) + Methanococcus (H2 utilization)
Genomics, community functions
and stability
Linking genomics to populations, to community diversity, functions, stability and to global change
Dynamics, stability in nature
Analyses: genome
sequencing, FGA
microarrays
Obj 4. Effects of elevated
CO2 on microbial
community, functions &
stability in nature
Natural system
Many species
Obj 5. Integration, modeling,
simulation & prediction
across different organization
levels
Defined system
2 species
Obj 2. AOB-NOB
interactions,
regulation & stability
Analyses: mRNA,
protein, metabolites,
populations dynamics,
community function
Insights into the stability of mutualistic
interactions in more complex systems
Providing systems and knowledge for
constructing more complex systems
Defined systems
3 & 4 - species
Obj 3. Competition,
functional redundancy,
stresses, & stability
Providing signature target
genes for monitoring
Natural system
Many species
Probe sequences, diversity
Dynamics, stability in nature
Isolates, sequences
Mechanistic understanding of
coexistence in nature
Obj 1. Genome
diversity of nitrifying
community & isolation
• Nitrifying communities.
• One of the biggest NSF programs in life science.
• 1M/yr for 5 years.
• The preproposal was panel reviewed, and a full proposal was invited.
Proposal to NSF Frontiers In Integrated Biological Research (FIBR) program.
Predictive Microbial Ecology
• Qualitative microbial ecology: Because experimental data have been difficult to obtain, microbial ecology has been qualitative rather than quantitative.
• Opportunity for quantitative microbial science: With the availability of genomic technologies, microbial ecology is no longer limited by a deficiency of experimental data.
• Challenges: Modeling, simulation and prediction
– A big mathematical challenge: the dimensionality problem. The sample number is less than the gene number.
– Possible solution: Systems ecology + genomics
An example of the conceptual integration scheme
i. Modeling microarray data at individual gene level
$$\frac{d}{dt} x_i^{(k)}(t) = \sum_{j=1}^{m_k} W_{ij}^{(k)} \, x_j^{(k)}(t)$$
ii. Modeling interactions between functional gene groups or guilds.
$$\frac{d}{dt} y_k(t) = f_k(t) + \sum_{j=1}^{n} Q_{kj} \, y_{kj}(t), \qquad y_k = \frac{1}{m_k} \sum_{i}^{m_k} x_i^{(k)}$$
iii. Modeling interactions between functional gene groups or guilds.
$$\frac{d}{dt} z_p(t) = g_p(t) + \sum_{q=1}^{N} U_{pq} \, z_{pq}(t), \qquad z_k(t_1) = \frac{1}{n} \sum_{i}^{n} y_i(t_1)$$
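To make the gene-level model concrete, here is a minimal sketch that integrates dx/dt = W x with a forward-Euler step and then averages the genes of one group, as in the group-level variable y_k. The matrix `W`, the initial values, and the step size are hypothetical, and forward Euler is used only because it is the simplest scheme; nothing here reflects how the original modeling was implemented.

```python
import math


def euler_linear(W, x0, t_end, dt=1e-3):
    """Forward-Euler integration of dx/dt = W x from t = 0 to t_end."""
    x = list(x0)
    n = len(x)
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dx = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


def group_mean(x):
    """Group-level variable: mean expression over the genes in the group."""
    return sum(x) / len(x)


# Hypothetical two-gene group with independent decay rates (illustrative only).
W = [[-1.0, 0.0],
     [0.0, -2.0]]
X_END = euler_linear(W, [1.0, 1.0], t_end=1.0)
Y_END = group_mean(X_END)

print([round(v, 4) for v in X_END], round(Y_END, 4))
print(round(math.exp(-1.0), 4), round(math.exp(-2.0), 4))  # exact values for comparison
```

With a diagonal W each gene decays exponentially, so the numerical result can be compared directly with exp(-t) and exp(-2t). Off-diagonal entries of W would encode the gene-gene interactions the model is meant to capture, and estimating them is where the dimensionality problem (fewer samples than genes) arises.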
Grand Challenges for Systems Biology
Sequence and
Pathway
Analyses
Sequencing
Microarray
Data Analysis &
Management
Population
Level
Modeling
Experiment
Design
Experiment
Species 3
Species 1
Species 2
•
Network identification and modeling
•
Scaling from single cells to ecosystems
• Spatial
• Temporal
Community
Level
Modeling
First Book on Microbial Functional Genomics
• Authors: Jizhong Zhou, Dorothea Thompson, Ying Xu, James M. Tiedje
• John Wiley & Sons, March 19, 2004
• 15 chapters, > 600 pages
• Rita Colwell, former NSF Director, wrote the foreword
• To our knowledge, this is the first book on microbial functional genomics
Acknowledgement
(1)
• Department of Energy
– Microbial Genome Program
– Genomes To Life Program
– NABIR Program
– Ocean Margin Program
– Carbon cycling programs
• Oak Ridge National Laboratory
– Laboratory Directed Research and Development
Microbial Genomics and Ecology Group at
Environmental Sciences Division, ORNL
• ORNL
– Zhili He
– Liyou Wu
– Dorothea Thompson
– Yongqing Liu
– Ting Li
– Matthew Fields
– Xuedan Liu
– Tingfen Yan
– Sung-Keun Rhee
– Song Chong
– Yunfeng Yang
– Jost Liebich
– Christopher Schadt
– Dawn Stanek
– Adam Leaphart
– Weimin Gao
– Terry Gentry
– Steve Brown
– Qiang He
– Feng Luo
– Crystal McAlvin
– Susan Carroll
– Lisa Fagan
– Haichun Gao
– Hongbin Pan
– Xiufeng Wan
– Xichun Zhou
– Zamin Yang
– Jianxin Zhong
– Dong Yu
– Ying Xu
• Michigan State University
– James M. Tiedje
– James Cole
– Joel Klappenbach
• USUHS
– Mike Daly
• USC
– Ken Nealson
• Argonne National Lab
– Carol Giometti
• Univ of Iowa
– Caroline Harwood
• Oregon State Univ
– Dan Arp
• UC Berkeley
– Jay Keasling
• Ohio State Univ
– Bob Tabita
• Univ of Missouri
– Judy Wall
• Baylor College
– Tim Palzkill
• SREL
– Chuanlun Zhang
• PNNL
– Jim Frederickson
– Margie Romine
– Yuri Gorby
– Dick Smith
– Mary Lipton
• LBL
– Terry Hazen
– Adam Arkin
• Perkin Elmer
– Xinyuan Li