UniFrac: Comparing Microbial Communities
Presented by: Donovan Parks

Overview
Motivating example
Quick look at Bray-Curtis Index (BCI)
UniFrac: distance measure, significance test, community tree, sequence jackknifing
Contrast BCI and UniFrac

Why compare microbial communities?
[Figure: two soil samples, one suppressing tree pathogens and one not suppressing tree pathogens]
Is this guy protecting our trees?

Bray-Curtis Index (BCI): Distance Measure
Measures the difference in OTU diversity between microbial communities.
Clusters are defined by taxonomy or sequence similarity.
Community 1: cluster, then normalize: 0.4, 0.3, 0.3
Community 2: cluster, then normalize: 0.2, 0.4, 0.4
Differences per cluster: 0.2, 0.1, 0.1 (sum = 0.4)
BCI = 0.4 / 2 = 0.2
Each community's normalized abundances sum to 1, so the summed differences can be at most 2; dividing by 2 keeps the measure in the range [0, 1].

BCI: Pros and Cons
Pros:
  Simple (both conceptually and experimentally)
  Considers both richness (number of OTUs in a community) and evenness (relative abundance of the OTUs)
Cons:
  All clusters are considered equally different

UniFrac: Distance Measure
Measures the evolutionary distance between microbial communities.
[Figure: example trees for similar communities and for maximally different communities]
UniFrac distance = (unique branch length) / (unique branch length + shared branch length)
Unique branches lead to sequences from only one community; shared branches lead to sequences from both.

UniFrac: Significance Test
Do two communities differ significantly?
Randomize the community labels on the tree and recompute the UniFrac distance for each randomized tree.
[Figure: histogram of number of trees versus UniFrac distance for the randomized trees, with the observed tree marked]
Magic level of significance!

UniFrac: Community Distance Matrix
Pairwise comparison of communities fills a distance matrix; UPGMA clustering turns the distance matrix into a community tree.
[Figure: distance matrix and community tree. Values shown in the figure: 0, 0.6, 0.5, 0.6, 0, 0.5, 1.0, 1.0, 0]

UniFrac: Sequence Jackknifing
How do the number and evenness of sequences from each community affect the results?
Sequence jackknifing followed by UniFrac analysis; replicate 100 to 1000 times.

Weighted UniFrac
What about the relative abundance of sequences (i.e., evenness)?
Since there are twice as many sequences from one community, each of them gets half the weight.

UniFrac: A Practical Look
Remove duplicate sequences if not doing weighted UniFrac.
[Figure: analysis pipeline. For each community: 16S gene sequence library, then filter sequences. The filtered sequences from all communities then go through multiple sequence alignment (e.g., MUSCLE), tree construction (e.g., RAxML), and UniFrac analysis, which yields the community tree. Values shown in the figure: 0.7, 0.8]
Tree must be rooted.
Tree must have meaningful branch lengths.

Final Remarks: BCI vs. UniFrac
They are different, equally valid measures!
BCI: taxon-based method which considers richness and evenness
UniFrac: phylogeny-based method which considers evolutionary divergence
Which to prefer depends on the questions you are trying to answer.

Final Remarks: Many Ways to Compare
Many richness and evenness measures (e.g., Jaccard, Sörensen)
Other phylogenetic distance measures (e.g., TreeClimber)
Many interesting and meaningful ways to compare microbial communities:
  Richness and evenness
  Evolutionary divergence
  Functional diversity

References
UniFrac:
  Lozupone C, Knight R. UniFrac: a New Phylogenetic Method for Comparing Microbial Communities. Appl. Environ. Microbiol., Vol. 71, December 2005.
  Lozupone C, Hamady M, Knight R. UniFrac - An Online Tool for Comparing Microbial Community Diversity in a Phylogenetic Context. BMC Bioinformatics, Vol. 7, August 2006.
  Online Tool: http://bmf2.colorado.edu/unifrac/index.psp (excellent help page)
BCI:
  Bray JR, Curtis JT. An Ordination of the Upland Forest Communities of Southern Wisconsin. Ecological Monographs, Vol. 27, 1957.
TreeClimber (source of soil sample example):
  Schloss PD, Handelsman J. Introducing TreeClimber, a Test to Compare Microbial Community Structures. Appl. Environ. Microbiol., Vol. 72, No. 4, April 2006.
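
Worked Example: BCI and Unweighted UniFrac in Code
The two distance measures from the slides can be sketched in a few lines of Python. This is a minimal illustration, not the UniFrac tool's implementation: the function names, the nested-tuple tree encoding, and the two toy trees are assumptions made up for this example. The BCI input uses the abundances from the BCI slide; the UniFrac function takes the fraction of branch length that leads to sequences from only one of the two communities.

```python
def bray_curtis(p, q):
    """Bray-Curtis index between two abundance profiles over the same clusters."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same clusters")
    differences = sum(abs(a - b) for a, b in zip(p, q))
    # For normalized profiles the denominator is 2, as on the BCI slide.
    return differences / (sum(p) + sum(q))


def unweighted_unifrac(tree, a, b):
    """Unique branch length / (unique + shared) for communities a and b.

    Hypothetical tree encoding: a node is (branch_length, payload), where
    payload is a community label (leaf) or a list of child nodes.
    """
    unique = 0.0
    shared = 0.0

    def visit(node):
        nonlocal unique, shared
        length, payload = node
        if isinstance(payload, str):
            below = {payload}
        else:
            below = set()
            for child in payload:
                below |= visit(child)
        present = below & {a, b}
        if len(present) == 2:
            shared += length   # branch leads to sequences from both communities
        elif len(present) == 1:
            unique += length   # branch leads to sequences from only one
        return below

    visit(tree)
    if unique + shared == 0:
        raise ValueError("tree has no branch length for these communities")
    return unique / (unique + shared)


# Abundances from the BCI slide.
community_1 = [0.4, 0.3, 0.3]
community_2 = [0.2, 0.4, 0.4]

# Toy rooted trees (root branch length 0.0); labels "A" and "B" are invented.
separate = (0.0, [(1.0, [(0.5, "A"), (0.5, "A")]),
                  (1.0, [(0.5, "B"), (0.5, "B")])])
mixed = (0.0, [(1.0, [(0.5, "A"), (0.5, "B")]),
               (1.0, [(0.5, "A"), (0.5, "B")])])

print(round(bray_curtis(community_1, community_2), 3))
print(unweighted_unifrac(separate, "A", "B"))
print(unweighted_unifrac(mixed, "A", "B"))
```

The `separate` tree keeps each community in its own clade, so every branch is unique and the distance is 1, the maximally different case from the distance measure slide. In the `mixed` tree the two internal branches are shared, which lowers the distance.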