Scalable metabolic reconstruction
for metagenomic data
and the human microbiome
Sahar Abubucker, Nicola Segata,
Johannes Goll, Alyxandria M. Schubert, Jacques Izard,
Brandi L. Cantarel, Beltran Rodriguez-Mueller, Jeremy Zucker,
Mathangi Thiagarajan, Bernard Henrissat, Owen White,
Scott T. Kelley, Barbara Methé, Patrick D. Schloss,
Dirk Gevers, Makedonka Mitreva,
Curtis Huttenhower
Harvard School of Public Health
Department of Biostatistics
07-16-11
What’s metagenomics?
Total collection of microorganisms
within a community
Also microbial community or microbiota
Total genomic potential of
a microbial community
Study of uncultured microorganisms
from the environment, which can include
humans or other living hosts
Total biomolecular repertoire
of a microbial community
2
Valm et al, PNAS 2011
What to do with your metagenome?
Reservoir of gene and protein functional information
Who’s there?
What are they doing?
Comprehensive snapshot of microbial ecology and evolution
Who’s there varies: your microbiota is plastic and personalized.
What they’re doing is adapting to their environment:
you, your body, and your environment.
Public health tool monitoring population health and interactions
Diagnostic or prognostic biomarker for host disease
4
The Human Microbiome Project for a normal population
300 people / 15 (18) body sites
Multifaceted data
• >6,000 samples
• >50M 16S seqs.
• 4 Tbp unique metagenomic sequence
• >1,900 reference genomes
• Full clinical metadata
Multifaceted analyses
• Human population
• Microbial population
• Novel organisms
• Biotypes
• Viruses
• Metabolism
2 clin. centers, 4 seq. centers, data generation,
technology development, computational tools, ethics…
Metabolic/Functional Reconstruction:
The Goal
Healthy/IBD, BMI, Diet
Taxon abundances
Gene expression
SNP genotypes
Enzyme family abundances
Pathway abundances
LEfSe: LDA Effect Size
Metagenomic biomarker discovery
Nicola Segata
http://huttenhower.sph.harvard.edu/lefse
6
HMP: Metabolic reconstruction
100 subjects
1-3 visits/subject
~7 body sites/visit
10-200M reads/sample
100bp reads
HUMAnN: HMP Unified Metabolic Analysis Network
http://huttenhower.sph.harvard.edu/humann
Functional seq.: KEGG + MetaCyc; CAZy, TCDB, VFDB, MEROPS…
BLAST → Genes (KOs):
$$c(g) = \frac{1}{|g|} \sum_{r} \frac{\sum_{a \in a(r)} (1 - p_a)\,\mathbb{1}(a \to g)}{\sum_{a \in a(r)} (1 - p_a)}$$
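The BLAST → Genes weighting on this slide can be read as follows: each read r spreads one unit of evidence over its alignments a(r) in proportion to (1 − p_a), and each gene family's total is divided by its length |g|. A minimal Python sketch under that reading is below. The tuple layout of the hits, the function name `gene_abundances`, and the toy identifiers and lengths are assumptions for illustration only; this is not HUMAnN's actual code or file format.

```python
from collections import defaultdict


def gene_abundances(hits, gene_lengths):
    """Length-normalized, p-value-weighted gene abundances.

    hits: iterable of (read_id, gene_id, p_value) tuples (hypothetical layout).
    gene_lengths: dict mapping gene_id -> length |g| (any consistent unit).
    """
    # Group alignment weights (1 - p) by read.
    by_read = defaultdict(list)
    for read_id, gene_id, p_value in hits:
        by_read[read_id].append((gene_id, 1.0 - p_value))

    # Each read contributes a total of 1, split across its alignments.
    raw = defaultdict(float)
    for alignments in by_read.values():
        total = sum(weight for _, weight in alignments)
        if total <= 0:
            continue  # skip reads whose alignments carry no weight
        for gene_id, weight in alignments:
            raw[gene_id] += weight / total

    # Normalize by gene length.
    return {g: c / gene_lengths[g] for g, c in raw.items()}


# Toy input: read r1 hits two gene families, read r2 hits one.
example_hits = [
    ("r1", "K00001", 0.0),
    ("r1", "K00002", 0.5),
    ("r2", "K00001", 0.0),
]
example_lengths = {"K00001": 2.0, "K00002": 1.0}  # hypothetical lengths
example_abundances = gene_abundances(example_hits, example_lengths)
```

Splitting each read's weight across its hits, rather than keeping only the best hit, is what lets ambiguous reads contribute to every gene family they plausibly come from.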
Genes → Pathways: MinPath (Ye 2009)
WGS reads → Genes (KOs) → Pathways (KEGGs) → Pathways/modules
Taxonomic limitation: remove paths in taxa < ave.
Xipe: distinguish zero/low (Rodriguez-Mueller in review)
Gap filling:
c(g) = max( c(g), median )
Smoothing: Witten-Bell
$$c(g) = \begin{cases} \dfrac{TN}{(V - T)(N + T)} & c(g) = 0 \\[1ex] \dfrac{c(g)\,N}{N + T} & \text{otherwise} \end{cases}$$
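The gap-filling and smoothing rules on this slide can be sketched in a few lines of Python. The symbol meanings are assumptions taken from the standard Witten-Bell formulation, since the slide does not define them: N is the total count, T the number of gene families with a nonzero count, and V the number of gene families considered. The median is assumed to be taken over whatever set of gene families is passed in (for example, the members of one pathway). The function names are hypothetical.

```python
from statistics import median


def gap_fill(counts):
    """Raise every count to at least the median: c(g) = max(c(g), median)."""
    m = median(counts.values())
    return {g: max(c, m) for g, c in counts.items()}


def witten_bell(counts):
    """Witten-Bell smoothing of counts, keeping the total N unchanged.

    Zero counts receive T*N / ((V - T) * (N + T));
    nonzero counts are scaled to c * N / (N + T).
    """
    V = len(counts)                                # gene families considered
    N = sum(counts.values())                       # total count
    T = sum(1 for c in counts.values() if c > 0)   # families actually observed
    if N == 0:
        return dict(counts)  # nothing observed, nothing to redistribute
    unseen = T * N / ((V - T) * (N + T)) if V > T else 0.0
    return {g: (c * N / (N + T) if c > 0 else unseen)
            for g, c in counts.items()}


# Toy counts for four gene families, two of them unobserved.
toy_counts = {"a": 6, "b": 2, "c": 0, "d": 0}
toy_filled = gap_fill(toy_counts)
toy_smoothed = witten_bell(toy_counts)
```

In this sketch the two steps are shown independently; the slide does not say in which order, or on which subsets of gene families, they are applied.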
7
HUMAnN: Metabolic reconstruction
[Figure: heatmaps of pathway coverage and pathway abundance, pathways × samples, with samples grouped by body site: Oral (BM), Oral (SupP), Oral (TD), Gut, Vaginal, Skin, Nares]
8
HUMAnN: Validating gene and pathway
abundances on synthetic data
Validated on individual gene families,
module coverage, and abundance
• 4 synthetic communities:
– Low (20 org.) and high (100 org.) complexity
– Even and lognormal abundances
• Individual gene families: ρ=0.91
• Best-BLAST-hit overshoots on false positives and
undershoots real pathways as a result
• HUMAnN FNs: short genes (<100bp),
taxonomically rare pathways
• HUMAnN FPs: large and multicopy genes
(not many in bacteria)
9
A portrait of the healthy human microbiome:
Who’s there vs. what they’re doing
[Figure: relative abundance of phylotypes and of pathways across samples, by body site: Nares, Oral (BM), Oral (SupP), Oral (TD), Skin, Gut, Vaginal]
10
Niche specialization in human
microbiome function
Metabolic modules in the
KEGG functional catalog
enriched at one or more
body habitats
http://huttenhower.sph.harvard.edu/lefse
Nicola Segata
11
Proteoglycan degradation
by the gut microbiota
Glycosaminoglycans
(Polysaccharide chains)
AA core
12
Proteoglycan degradation:
From pathways to enzymes
Enzyme relative abundance (scale 10^-8 to 10^-3)
• Heparan sulfate degradation
missing due to the absence of
heparanase, a eukaryotic enzyme
• Other pathways not bottlenecked
by individual genes
• HUMAnN links microbiome-wide
pathway reconstructions →
site-specific pathways →
individual gene families
13
Patterns of variation in human
microbiome function by niche
• Three main axes of variation
• Eukaryotic exterior
• Low-diversity vaginal
• Gut metabolism
• Oral vs. tooth hard surface
• Only broad patterns:
every human-associated habitat
is functionally distinct!
15
How do microbes and function vary
within each body site across the population?
16
How do body sites compare between
individuals across the population?
17
HMP: Prevalence of species (OTUs)
across the population
Cumulative prevalence
18
HMP: Prevalence of pathways
across the population
Cumulative prevalence
• 16 (of 251) modules strongly “core” at 90%+ coverage in 90%+ individuals at 7 body sites
– 24 modules at 33%+ coverage
• 71 modules (28%) weakly “core” at 33%+ coverage in 66%+ individuals at 6+ body sites
• Contrast zero phylotypes or OTUs meeting this threshold!
• Only 24 modules (<10%) differentially covered by body site
• Compare with 168 modules (>66%) differentially abundant by body site
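The “core” thresholds above combine three cutoffs: minimum coverage per sample, minimum fraction of individuals, and minimum number of body sites. A minimal Python sketch of that test is below. The nested-dict input layout, the function name `core_modules`, and the toy module and site names are assumptions for illustration; the actual HMP analysis may differ in detail.

```python
def core_modules(coverage, min_coverage, min_prevalence, min_sites):
    """Return modules that are 'core' under the three thresholds.

    coverage: dict mapping body site -> list of per-sample dicts,
              each mapping module id -> coverage in [0, 1] (hypothetical layout).
    A module counts at a site if at least `min_prevalence` of that site's
    samples cover it at `min_coverage` or more; it is core if it counts
    at `min_sites` or more sites.
    """
    modules = {m for samples in coverage.values() for s in samples for m in s}
    core = set()
    for module in modules:
        sites_passing = 0
        for samples in coverage.values():
            if not samples:
                continue  # no samples at this site
            covered = sum(s.get(module, 0.0) >= min_coverage for s in samples)
            if covered / len(samples) >= min_prevalence:
                sites_passing += 1
        if sites_passing >= min_sites:
            core.add(module)
    return core


# Toy data: two body sites, three samples each.
toy_coverage = {
    "gut": [{"M1": 1.0, "M2": 0.5}, {"M1": 1.0, "M2": 0.5}, {"M1": 0.95}],
    "skin": [{"M1": 0.9, "M2": 0.4}, {"M1": 1.0, "M2": 0.6}, {"M1": 1.0}],
}
strong_core = core_modules(toy_coverage, 0.90, 0.90, min_sites=2)
weak_core = core_modules(toy_coverage, 0.33, 0.66, min_sites=2)
```

With the slide's settings, the strong definition would correspond to `core_modules(data, 0.90, 0.90, 7)` and the weak one to `core_modules(data, 0.33, 0.66, 6)`.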
19
Linking function to community composition
← Taxa and correlated metabolic pathways →
← 52 posterior fornix microbiomes →
Plus ubiquitous pathways: transcription, translation,
cell wall, portions of central carbon metabolism…
Lactobacillus crispatus: phosphate and peptide transport
Lactobacillus jensenii: sugar transport
Lactobacillus gasseri: Embden-Meyerhof glycolysis, phosphotransferases
Lactobacillus iners: F-type ATPase, THF
Gardnerella/Atopobium: AA and small molecule biosynthesis
Candida/Bifidobacterium: eukaryotic pathways
20
Linking communities to host phenotype
Top correlates with BMI in stool
[Plots: normalized relative abundance vs. Body Mass Index, and vs. vaginal pH (posterior fornix)]
Vaginal pH, community metabolism, and community
composition represent a strong, direct link between
phenotype and function in these data.
21
Microbial biomolecular function and metabolism
in the human microbiome: the story so far?
• HUMAnN
– Accurate metagenomic metabolic reconstruction
– Sequences → genes → pathways → phenotypes
– Validated on 4x synthetic communities
• Who’s there varies even in health
– What they’re doing doesn’t (as much)
• There are patterns in this variation
– Communities in related environments adapt using related functions
– Function correlates with membership and phenotype
• ~1/3 to 2/3 of human metagenome characterized
– Job security!
22
Ask both what you can do for your microbiome
and what your microbiome can do for you
Thanks!
Human Microbiome Project
Sahar Abubucker
Nicola Segata
Dirk Gevers
Levi Waldron
George Weinstock
Owen White
Rob Knight
Johannes Goll
Makedonka Mitreva
Yuzhen Ye
Erica Sodergren
Beltran Rodriguez-Mueller
Mihai Pop
Jeremy Zucker
Vivien Bonazzi
Mathangi Thiagarajan
Jane Peterson
Brandi Cantarel
Lita Proctor
Qiandong Zeng
Maria Rivera
Barbara Methé
Bill Klimke
Daniel Haft
HMP Metabolic Reconstruction
Ben Ganzfried
Fah Sathira
Alyx Schubert
Pat Schloss
Jacques Izard
Bruce Birren
Ramnik Xavier
Doyle Ward
Eric Alm
Ashlee Earl
Lisa Cosimi
Interested? We’re recruiting
students and postdocs!
http://huttenhower.sph.harvard.edu
Vagheesh Narasimhan
Larisa Miropolsky
http://huttenhower.sph.harvard.edu/humann
http://huttenhower.sph.harvard.edu/lefse
24
HMP: Prevalence of genera (phylotypes)
across the population
Cumulative prevalence
26