Evolutionary and Ecological Bioinformatics
Biology/Computer Science 327, Fall 2013
Professors Fred Cohan and Danny Krizanc
DATE        LECTURER                      LECTURE TITLE
Sept. 3     Cohan                         1. Bioinformatic approaches to ecology and evolution
Sept. 5     Krizanc                       2. Algorithms in everyday life and in research
Sept. 10    Cohan                         3. Approaches to phylogeny through overall similarity of organisms (phenetics vs. cladistics)
Sept. 12    Krizanc                       4. Alignment of DNA and protein sequences
Sept. 17    Krizanc                       5. Distance-based algorithms for estimating relationships (UPGMA and NJ)
Sept. 19    Krizanc                       6. Maximum parsimony approach to phylogeny; search algorithms for finding the best phylogeny
Sept. 24    Krizanc                       7. Models of molecular evolution (including Jukes-Cantor, neutral theory, transition-transversion); incorporating molecular models in maximum likelihood algorithms for phylogeny estimation
Sept. 26    Krizanc                       8. Testing the robustness of a tree
Oct. 1      Krizanc                       9. Bayesian approaches to phylogeny and your own life
Oct. 3      Krizanc                       10. Gene trees vs. species trees; splits trees and phylogenetic networks
Oct. 8      Krizanc                       11. Genome-based trees (based on gene content and on gene order); supertrees
Oct. 10     Cohan                         12. The importance of using phylogeny for testing hypotheses about natural selection; phylogenetic algorithms for testing natural selection
Oct. 15     Cohan                         13. Genome-wide analysis of adaptation through gene acquisition vs. losses of genes
Oct. 17     Cohan                         14. Analyses of adaptation through changes in genome-wide gene expression
Oct. 22                                   Fall break
Oct. 24     Cohan and Krizanc             15. Research projects
Oct. 29     Cohan and Krizanc             16. Genome-wide approaches for finding shared genes under recent positive selection (Theory)
Oct. 31     Cohan and Krizanc             17. Genome-wide approaches for finding shared genes under recent positive selection (Applications)
Nov. 5      Krizanc                       18. Assembly algorithms for genome sequencing: from isolates, metagenomes, and uncultivated single cells
Nov. 7      Cohan                         19. Metagenomics in ecosystems biology: how to find out the physiological processes occurring in an ecosystem even when we don’t know who the organisms are
Nov. 12     Cohan                         20. Metagenomic approaches for characterizing community-wide organismal diversity
Nov. 14     Cohan                         21. Metagenomic approaches to finding out what unidentified genes do (ecological annotation)
Nov. 19     Cohan                         22. The human microbiome: types of communities across humans, functional screening for novel genes, antibiotic holocausts and health consequences
Nov. 21     Cohan                         23. Baseball, biology, and big data
Nov. 26     Cohan and Krizanc             24. (cancelled)
Nov. 28                                   Thanksgiving
Dec. 3      Cohan and Krizanc             25. Molecular approaches for identifying microbial diversity in natural communities: AdaptML and Ecotype Simulation
Dec. 5      Guest lecturer: Sarah Kopac   26. Microbial diversification through adaptations to physical conditions versus organic resources

POSTER SESSION
Thursday, Dec. 12, 2:00-5:00, Zelnick Pavilion

TEXTBOOK READINGS
Ch. 1; Ch. 2; Ch. 3, 4, 12; Ch. 6; Ch. 5, 8; Ch. 9; pp. 82-89, 134-136; Ch. 10; Ch. 15; Ch. 14

HOMEWORK ASSIGNMENTS
1. A pencil and paper phylogenetic problem                       Due Sept. 26
2. Make a tree (with help from computer algorithms)              Due Oct. 15
3. Another pencil and paper phylogenetic problem                 Due Oct. 31
4. Comparing genomes to characterize past natural selection      Due Nov. 26
5. Project abstract                                              Due Nov. 26

TERM PROJECT
Poster on research project                                       Due on Thursday, Dec. 12, 2:00 PM
Paper on (the same) research project                             Due on Thursday, Dec. 12, 2:00 PM

GRADING
Homework assignments    50%
Term project poster     20%
Term project paper      30%
READINGS
Textbook: Phylogenetic Trees Made Easy: A How-To Manual, Fourth Edition, Barry G. Hall,
2011, Sinauer Associates.
Supplementary Readings will be listed on the class WesFiles web site.
CONTACT INFORMATION (Email is the best way to set up an appointment.)
Fred Cohan
207 Shanklin
X3482
fcohan@wesleyan.edu
Office hours: Fridays 1:15-2:15, and by appointment
Danny Krizanc
631 Exley Science Center
X2186
dkrizanc@wesleyan.edu
December 2, 2013
Evolutionary and Ecological Bioinformatics
Biology/Computer Science 327, Fall 2013
Supplementary Reading
Sep. 3    1. Bioinformatic approaches to ecology and evolution
Ginsberg et al. give a really nice example of the Big Data approach, in this case to
predict influenza levels before the CDC can, based on Google search queries
(Ginsberg et al., 2009). Larson et al. provide phylogenetic evidence that wild
pigs were domesticated in six different places around Eurasia (Larson et al.,
2005); similarly, Thalmann et al. show that dogs were domesticated in Europe
(Thalmann et al., 2013). Keeling and Palmer chart phylogenetically the most
significant horizontal transfer events in eukaryotic history (Keeling and
Palmer, 2008). Mikkelsen et al. have identified those genes in the genome
that have been under selection for new adaptations in humans (Mikkelsen et
al., 2005). Merhej et al. compared bacterial genomes to test whether different
lineages evolving independently toward pathogenicity (or mutualism) tend to
lose the same genes convergently (Merhej et al., 2009). (They do!) Christina
Richards et al. explored the circumstances under which gene expression
changed over the course of an organism’s life, in the case of the plant
Arabidopsis (Richards et al., 2012). Fierer et al. explored how the bacterial
community on hands varies between the left and right hands and between
people, and the effects of washing on hands’ bacterial communities (Fierer et
al., 2008). Knight et al. showed, in a meta-analysis across various high-impact
studies from the Earth Microbiome Project, how the similarity of environment
drives the similarity of bacterial communities (Knight et al., 2012).

Sep. 5    2. Algorithms in everyday life and in research
Harel’s Chapter 4 is a "gentle" introduction to the notion of NP-completeness,
or why some problems are hard for computers to solve (Harel, 2000).

Sep. 10   3. Approaches to phylogeny through overall similarity of organisms
Nosenko et al. give a recent phylogeny of animals based on various genes;
they explain how to choose the best set of genes when genes differ in the
phylogenies they yield (Nosenko et al., 2013). Funch and Kristensen present
their discovery of an animal phylum (Funch and Kristensen, 1995). Schloss and
Handelsman present a phylogeny of the bacterial phyla, showing that most of
the phyla do not have even a single cultivated species (Schloss and
Handelsman, 2004). My recent encyclopedia chapter on species gives an
overview of the concepts of species, including the dynamic qualities species
have long been expected to have (Cohan, 2013). Mallet gives a species
concept based on Darwin’s idea that two species should have no or very little
overlap in a set of distinguishing characteristics; his concept does not deal
with the dynamic qualities of cohesion, irreversible separateness, and so on
(Mallet, 1995). Genoways and Choate, from the heyday of numerical
taxonomy, illustrate two ways of presenting data on clustering of organisms
by their overall phenotypic similarity (Genoways and Choate, 1972). Kämpfer
et al. make a case that species of Streptomyces form distinct, justifiable units
when we demarcate species at the 80% similarity level for phenotypic traits
(Kämpfer et al., 1991). Futuyma, in his textbook, explains the limitations of
the phenetic approach to phylogeny (where all characters are used), and why
we should constrain our analyses to those characters that are derived
(Futuyma, 1998).
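
As a rough illustration of the phenetic idea in the readings above (grouping organisms by overall similarity and drawing a line at a chosen similarity level), here is a minimal Python sketch. It is not the procedure of any cited paper: the simple matching coefficient, the single-linkage grouping, the function names, and the trait data are all assumptions chosen for illustration; only the 80% level echoes the text.

```python
from itertools import combinations


def simple_matching(a, b):
    """Fraction of traits on which two strains agree (both 1 or both 0)."""
    if len(a) != len(b):
        raise ValueError("trait vectors must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def clusters_at_threshold(traits, threshold=0.8):
    """Group strains linked by similarity >= threshold (single linkage).

    traits: dict mapping strain name -> list of 0/1 trait scores.
    Returns a sorted list of sorted name lists, one per cluster.
    """
    parent = {name: name for name in traits}

    def find(name):
        # Follow parent links to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    # Merge the clusters of every pair that meets the similarity threshold.
    for a, b in combinations(traits, 2):
        if simple_matching(traits[a], traits[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in traits:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical data: 10 binary phenotypic traits scored for four strains.
example = {
    "strain_A": [1, 1, 0, 1, 0, 1, 1, 0, 1, 1],
    "strain_B": [1, 1, 0, 1, 0, 1, 1, 0, 1, 0],  # differs from A in 1 trait
    "strain_C": [0, 0, 1, 0, 1, 0, 0, 1, 0, 0],
    "strain_D": [0, 0, 1, 0, 1, 0, 0, 1, 1, 0],  # differs from C in 1 trait
}
print(clusters_at_threshold(example))
```

With these made-up scores, strains A and B (90% similar) fall into one group and C and D (90% similar) into another. Note that every trait counts equally here, shared ancestral traits included, which is exactly the limitation of the phenetic approach that the Futuyma reading addresses.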

Sep. 12   4. Sequence Alignment
Sean Eddy gives a biologist’s view of dynamic programming, the central idea
behind a number of bioinformatics algorithms, including pairwise sequence
alignment (Eddy, 2004). I’ve also included the original papers introducing
ClustalW (the most commonly used multiple alignment tool), MUSCLE (a newer tool
recommended by Hall) (Edgar, 2004) and GUIDANCE (a tool for evaluating the
quality of alignments described in Chapter 12 of Hall) (Penn et al., 2010).
Morrison tries to answer the question "Why would phylogeneticists ignore
computerized sequence alignment?" and makes some interesting points along
the way (Morrison, 2009). His conclusion is that the current tools aren’t good
enough.

Sep. 17   5. Distance-based Methods for Phylogeny Construction
I’ve included the original papers describing UPGMA (by Michener and Sokal)
(Michener and Sokal, 1957) and Neighbor-Joining (by Saitou and Nei) (Saitou
and Nei, 1987). Both are pretty heavy going but interesting. For gentler
descriptions of these algorithms I suggest Wikipedia. For a computer science
perspective on this and the next three lectures I have also included Mona
Singh’s notes (from a course she teaches at Princeton) on phylogeny
reconstruction.

Sep. 19   6. Maximum parsimony approach

Sep. 24   7. Models of molecular evolution and maximum likelihood approach
The paper by Bos and Posada is a nice review of different models of DNA
evolution and how they are used in building trees (Bos and Posada, 2005). The
article by Guindon et al. discusses some recent developments in maximum
likelihood algorithms that have had a real impact on how fast they are and
how large a tree they can construct (Guindon et al., 2010). Sumner et al.
discuss why it might not be such a good idea to use the most general model
available when estimating trees (Sumner et al., 2012).

Sep. 26   8. Testing the robustness of trees
The paper by Anisimova and Gascuel introduces an approximate likelihood
ratio test that can be used in conjunction with maximum likelihood methods
to estimate one’s confidence in the clades of a given tree (Anisimova and
Gascuel, 2006). This turns out to be much faster than using non-parametric
approaches such as bootstrapping.

Oct. 1    9. Bayesian methods
McGrayne discusses implicit, embedded use of Bayesian methods in baseball
batting averages and other issues of daily import (McGrayne, 2011, p. 130).
Silver introduces Bayesian analysis using the mysterious panties (or nighty)
story (Silver, 2012, p. 245). Huelsenbeck et al. review the use of Bayesian
methods in phylogeny reconstruction (Huelsenbeck et al., 2001). Ronquist
and Huelsenbeck introduce the third iteration of the program MrBayes
(Ronquist and Huelsenbeck, 2003).

Oct. 3    10. Gene trees vs. species trees
The paper by Degnan and Rosenberg shows how lineage sorting can cause
serious problems when trying to infer the correct species tree from gene trees
(Degnan and Rosenberg, 2006). White et al. study the discordance between
gene trees for three subspecies of mouse (White et al., 2009). The Iwabe et al.
paper uses gene duplication/loss parsimony to root the tree of life (Iwabe et
al., 1989). Zmasek and Eddy describe a straightforward algorithm for inferring
duplication/loss events given a gene tree and its corresponding species tree
(Zmasek and Eddy, 2001).

Oct. 8    11. Genome-based trees (based on gene content and on gene order); supertrees
(no reading assigned)

Oct. 10   12. The importance of using phylogeny for testing hypotheses about natural selection; phylogenetic algorithms for testing natural selection
Donoghue presents the classic case for why every evolutionary biologist
needs to pay attention to phylogeny (Donoghue, 1989). In their book on
comparative biology, Harvey and Pagel explain how phylogeny can be used to
make tests of natural selection (Harvey and Pagel, 1991). Probert et al.
analyze the relationship between seed longevity and various phenotypic and
environmental factors. In one analysis, they perform the tests using a
pre-Donoghue, non-phylogenetic approach, and in another, they make a test
based on phylogenetically independent contrasts (Probert et al., 2009).

Oct. 15   13. Genome-wide analysis of adaptation through gene acquisition vs. losses of genes (part 1)
Zhong et al. give a protocol for discovering young duplicated genes in a
genomic comparison of 12 Drosophila species; they show hot spots for young
duplication (Zhong et al., 2013). Brenner et al. present the first complete
genome sequence of a bacterium, and report the high frequencies of genes
occurring in families (Brenner et al., 1995). Merhej et al. report the convergence of
gene loss in multiple lineages that have evolved pathogenicity (Merhej et al.,
2009). Luo et al. (across several papers) show that various clades of E. coli are
adapted to freshwater living, and they have genes, not shared with gut E. coli,
that adapt them to freshwater (Luo et al., 2011); also, Sarah Kopac and I have
written a commentary on this article (Cohan and Kopac, 2011). Hao and
Golding present evidence that genes entering a lineage by horizontal genetic
transfer are likely to evolve quickly in the context of their new homes (Hao
and Golding, 2006). Bhaya et al. present genomic data that led to
a fuller understanding of environmental differences between upstream and
downstream habitats in a hot spring (that mineral nutrient content was
greater upstream) (Bhaya et al., 2007). I have discussed the various
constraints on transfer of adaptations, including architectural incompatibilities
across distant relatives (Cohan, 2010). Popa et al. analyzed a polarized
network of HGT events, and found the role of sequence similarity in
determining frequencies of HGT; they also found how HGT is related to
function (Popa et al., 2011). Choi et al. analyzed very close relatives of
Streptococcus to discover both the rates of replacement and additive
horizontal transfer events, and discovered asymmetries in recombination
direction (Choi et al., 2012).

Oct. 17   14. Genome-wide analysis of adaptation through gene acquisition vs. losses of genes (part 2); Analyses of adaptation through changes in genome-wide gene expression
Touchon et al. identify HGT events among members of the species taxon E.
coli, and show that among closest relatives, nearly all HGT events involve
genes without a function for the bacteria (Touchon et al., 2009). In an
experimental evolution system, Blount et al. discovered two ecotypes that
were able to coexist based on resource partitioning, one ecotype specialized
on glucose and one on citrate (Blount et al., 2008). Various papers from our
lab discuss the ecotype models (Cohan, 2011a; Cohan, 2013; Cohan and Perry,
2007; Connor et al., 2010; Koeppel et al., 2013).
Herring et al. use a genome “re-sequencing” approach to infer that single
changes in one gene might have manifold effects on gene expression across
the genome (Herring et al., 2006). Ferea et al. present a classic piece of work
showing the hundreds of gene expression changes that yeast undergoes as it
spontaneously evolves to be aerobic (in the absence of competitors) (Ferea et
al., 1999). Sumby et al. use genome-wide gene expression and genome
resequencing to show that passaging a non-pathogenic strain of Strep
through a mouse brings about evolution of virulence: a single change
in a signal-transducing gene brings about massive changes in gene expression,
including dozens of virulence genes (Sumby et al., 2006). Hahne et al. explore
genome-wide in one strain of Bacillus subtilis the various gene expression
changes that respond to a salinity challenge (Hahne et al., 2010). Arendt and
Reznick discuss the diversity of evolutionary responses among closely related
populations to a single selection pressure (Arendt and Reznick, 2008);
Dettman et al. further discuss this issue in the context of bacteria through the
magic of genome-wide gene expression analyses (Dettman et al., 2012). My
collaborators and I have investigated the tendency for different populations
of one species to find unique responses to the same selection challenge
(Cohan, 1984; Cohan and Hoffmann, 1989). Fong et al. have shown through
genome-wide gene expression how different genetic responses can be to the
same environmental challenge (Fong et al., 2005).

Oct. 24   15. Suggestions for term research projects
See our discussion of suggested research topics.

Oct. 29   16. Genome-wide approaches for finding shared genes under recent positive selection (Theory)
Nei and Gojobori describe a simple parsimony-based method for estimating
dN/dS that is implemented in MEGA (Nei and Gojobori, 1986). PAML
implements a maximum likelihood approach (Yang, 2007). Hughes has been
perhaps the most vocal critic of using dN/dS to infer adaptive evolution
(Hughes, 2007). A response to Hughes is given by Zhai et al. (Zhai et al., 2012).
The MUMmer paper describes the basic idea behind one of the most used
algorithms for aligning whole genomes (Delcher et al., 1999).

Oct. 31   17. Genome-wide approaches for finding shared genes under recent positive selection (Applications)
Williamson et al. present a genome-wide analysis of selective sweeps in the
human genome, across the entire species and within ethnic groups
(Williamson et al., 2007). Pavlidis et al. last month presented a new algorithm
(SweeD) for detecting selective sweeps from an input of thousands of
whole-genome sequences (Pavlidis et al., 2013). Here they applied it to detect
several genes that underwent a selective sweep on human chromosome 1.
My students and I have argued why selective sweeps are not limited to a
particular region of the genome in bacteria (Cohan, 2005; Kopac and Cohan,
2012). Clark et al. performed a genome-wide analysis of positive selection in
the human lineage, compared to chimps, and with mouse as the outgroup
(Clark et al., 2003). Note how they identified the individual genes under
selection in the human lineage, and how they identified functional classes of
genes with a particularly high frequency of accelerated evolution in humans.
Vos developed a species concept for bacteria based on each ecotype having
its own unique history of positive selection (Vos, 2011); you might think about
how this idea may yield the same or different demarcations of ecotypes. Vos
et al. present their new computer package ODoSE to find bacterial ecotypes
as units that are different in their histories of positive selection (Vos et al.,
2013).

Nov. 5    18. Assembly algorithms for genome sequencing
Flicek and Birney provide a fairly recent review of the most commonly used
methods of assembly (Flicek and Birney, 2009). The three papers by Waterston
et al., Myers et al., and She et al. give some insight into the battle between
Hierarchical and Whole Genome sequencing as it played out in the early part
of this century (Waterston et al., 2002; Myers et al., 2002; She et al., 2004).
Finally, the paper by Chin et al. discusses some new algorithmic ideas that have
come about due to the most recent advances in sequencing technology (Chin
et al., 2013).

Nov. 7    19. Metagenomics in ecosystems biology: how to find out the physiological processes occurring in an ecosystem even when we don’t know who the organisms are
Bell et al. present evidence that increasing bacterial diversity increases the
productivity of an ecosystem (Bell et al., 2005). Lay et al. investigate the
functional diversity in an extremely cold and salty spring at the top of the
world; they find that certain functions are found redundantly in a great
diversity of organisms, while others are not (Lay et al., 2013). Simon et al. use
a metagenomic approach to studying the microbial organismic diversity on a
glacier; they also discover the genes responsible for protection against the
cold in this community (Simon et al., 2009). McHardy et al. present a package
called PhyloPythia, for identifying organisms from a single metagenomic
sequence, based on nucleotide composition (McHardy et al., 2007). Cecchini
et al. use a metagenomic approach to figure out which organisms provide
certain functions in the environment, in this case the ability to utilize prebiotic
compounds (Cecchini et al., 2013). McMahon et al. present a functional
screen for novel genes that provide a certain function, and they show that the
host in which metagenomic segments are cloned makes a big difference in
their expression (and ability to be screened) (McMahon et al., 2012). Sommer
et al. perform a functional screen for antibiotic resistance genes in human
guts; surprisingly, there are many resistance genes that show only a distant
relationship to those resistance genes isolated from cultured bacteria
(Sommer et al., 2009). Robertson et al. perform a functional screen for novel
nitrilases, and are able to chart the history of evolutionary transitions from
activity on one enantiomer to activity on another (Robertson et al., 2004).
Rinke et al. show how single-cell genomics (i.e., sequencing the entire
genome of one cell we cannot culture) can add to our understanding of the
functional repertoire of an ecosystem (Rinke et al., 2013). (More from Rinke
in the next lecture on the diversity of organisms in bacterial communities.)

Nov. 12   20. Metagenomic approaches for characterizing community-wide organismal diversity
DeSantis et al. present their algorithm and web site, GreenGenes, for
classifying a 16S rRNA sequence to a taxon (DeSantis et al., 2006).
Konstantinidis and Tiedje present evidence for criteria (or a range of criteria)
of 16S rRNA divergence for demarcating taxa of different ranks
(Konstantinidis and Tiedje, 2005). Kim et al. is my foray into discovery of new
genera and species by 16S rRNA analysis of environmental DNA (Kim et al.,
2012). Sogin et al. present the first high-throughput sequencing of
environmental DNA from a marine habitat, providing evidence that there is an
extraordinary diversity of extremely rare organisms (Sogin et al., 2006). We
briefly revisit Simon et al., who gave an example of characterizing the
organismic diversity of a community by assigning protein-coding genes from
the metagenome to taxa (Simon et al., 2009); also, we revisit PhyloPythia
(McHardy et al., 2007). Hess et al. perform the amazing feat of obtaining a
nearly complete genome sequence of various organisms from the
metagenome fragments of a cow’s rumen (Hess et al., 2011); Mackelprang et al.
obtained a similar result from permafrost soil, obtaining the sequence of a
novel methanogen (Mackelprang et al., 2011). Rinke et
al. provide results from single-cell genome sequencing of various phyla that
had never previously been sequenced; this provided evidence for four
previously unknown superphyla (Rinke et al., 2013). Just to show that we
care about the gene-based discovery of phylogenetic supergroups in
non-bacteria, we provide the discovery of superorders of mammals
(Bininda-Emonds et al., 2007).

Nov. 14   21. Metagenomic approaches to finding out what unidentified genes do (ecological annotation)
Here are the references for the metagenome projects discussed in class (Wu
et al., 2009; Turnbaugh et al., 2007; Gilbert et al., 2010; 10K, 2009; Davies et
al., 2012; Tyson et al., 2004). Knight et al. plead for a new standard of
coverage of environmental data in metagenomics studies (Knight et al., 2012).
Plewniak et al. give a nice old-style example of how we can identify the genes
responsible for adaptation to a given geochemical stressor, if we already
know the genes (Plewniak et al., 2013). Inskeep et al. give a nice example of
extremely different sets of geochemical stressors across habitats in a
metagenome study (Inskeep et al., 2010). Biddle et al. give an example of less
extreme variation among environments, where the same phyla are found
everywhere, possibly a good source of ecological annotation (Biddle et al.,
2011). Mackay et al. describe the Drosophila melanogaster genetic reference
panel, which consists of the genome sequences of 168 inbred lines derived
from a single natural population; this is being used to determine the genes
responsible for each of many physiological, behavioral, and ecological traits
(Mackay et al., 2012). (This is something I learned about on my visit to SUNY
Binghamton after I gave this lecture, and so it wasn’t included in the lecture.)

Nov. 19   22. The human microbiome: types of communities across humans, functional screening for novel genes, antibiotic holocausts and health consequences
Our story today begins with the emergence of the germ theory of disease, and
an attitude both within households and in the public health establishment
that the only good germ is a dead germ; I recommend The Gospel of Germs by
Nancy Tomes as a great narrative of this period, from the 1870’s mostly until
the antibiotic revolution of the 1940’s (Tomes, 1998). Pollan and
Velasquez-Manoff have recently written popular accounts of the importance of our gut
microbes in human health (Pollan, 2013; Velasquez-Manoff, 2013)
http://www.nytimes.com/2013/05/19/magazine/say-hello-to-the-100-trillion-bacteria-that-make-up-your-microbiome.html?ref=magazine . The most
direct repercussion of the germ-as-enemy approach, leading to overuse of
antibiotics, has been the emergence of antibiotic resistance. Forslund et al.
present data on the prevalence of antibiotic resistance in different countries,
and the relationship between use of antibiotics for animal agriculture and
resistance in the human gut microbiome (Forslund et al., 2013). More
recently, we have reached an appreciation for the beneficial qualities of our
gut bacteria, and Khosravi and Mazmanian describe the disease-fighting
importance of our resident bacteria (Khosravi and Mazmanian, 2013).
Pérez-Cobas et al. describe the lasting effect of an antibiotic regimen on the composition
of an individual’s gut microbiome (Perez-Cobas et al., 2013). Liping Zhao
presents a proposal for a research field where we use various bioinformatic
approaches to determine the organismal changes correlated with obesity and
leanness, and then perform experiments to test the effects of the implicated
bacteria (Zhao, 2013). Wu et al. make a case that there are two primary
clusters of bacterial gut communities across humanity, one dominated by
Prevotella and associated with a carbohydrate diet, and another dominated
by Bacteroides and associated with a diet high in fat and proteins; they also
show that the microbiome can be changed in the short-term but that it
probably takes a long time to fully change a human’s gut microbiome (Wu et
al., 2011). Lozupone and Knight have developed a very useful algorithm
called UniFrac for clustering bacterial communities by their phylogenetic
differences; it is described in a couple of articles (Lozupone et al., 2006;
Lozupone and Knight, 2005). Muegge et al. find a functional pattern to the
differences in microbiomes of mammalian herbivores vs. carnivores; they find
such interesting things as carnivores tending to have microbiomes with lots of
amino acid degradation enzymes, while herbivores tend to have lots of amino
acid biosynthesis enzymes, which makes sense when you think about it. They
also make the case that the microbiomes of human vegans tend to look more
like those of mammalian herbivores, while microbiomes of human
meat-eaters tend to look like those of mammalian carnivores (Muegge et al., 2011).
A mouse study by Zhang et al. (including Liping Zhao) shows that the
microbiome of a mouse is rapidly changed with the onset of a high-fat diet,
and changes back quickly with resumption of a low-fat diet; they also identify
key phylotypes that change rapidly with the change in diet (Zhang et al.,
2012). The next step in Liping Zhao’s paradigm is to test each of these taxa
for its effect on weight by introducing it into gnotobiotic mice; I supply an
example with a previously suspected effect of the bacterium on reducing
inflammation (Sokol et al., 2008).

Nov. 21   23. Baseball, biology, and big data
I have written a couple of pieces on Sandy Koufax’s perfect game, and what it
taught me about using our imaginations better to have a fuller and more
useful data set (Cohan, 2011b; Cohan, 2011c). I also wrote an article on how
Big Data approaches can be used better in biology, in homage to Moneyball
(Cohan, 2012). Lozupone and Knight wrote their break-out piece on UniFrac,
showing that changes in adaptations to salinity were the most difficult
transitions in bacterial history; they also lamented that the resolution of the
environmental data was such that they could not also investigate the difficulty
of more subtle changes in salinity adaptation (Lozupone and Knight, 2007).
The disappointment of the missing data led to various conferences on what
environmental parameters (and sequencing and assembly tools!) we should
be recording when we spend millions of dollars on genome and metagenome
sequencing (Field et al., 2008). David Toomey writes about how we see only
what we expect to see. This was exemplified by microbiologists’ lack of interest in
exploring what life may exist in Yellowstone’s hot springs, owing to their
“knowledge” that life couldn’t possibly exist at such high temperatures
(Toomey, 2013, pp. 12-13). See also Thomas Brock’s account of his discovery
(Brock, 1995) and Thomas Kuhn’s account of our limitations toward discovery
(Kuhn, 1996, pp. 63-64). Hurwitz and Sullivan have organized the unknown
diversity among marine viral proteins by clustering them, and then trying to
find out what ocean properties each cluster is associated with (Hurwitz and
Sullivan, 2013).

Nov. 26   24. (lecture cancelled)

Dec. 3    25. Molecular approaches for identifying microbial diversity in natural communities: AdaptML and Ecotype Simulation
Mallet presents evidence that speciation is always easy, even for the highly
sexual animals and plants (Mallet, 2008). My “Are species cohesive?” article
presents all the ways that speciation may be even easier in bacteria, and the
ways that we can use molecular and bioinformatic techniques to find the rate
of speciation in bacteria (Cohan, 2011a).

10K, G. (2009). Genome 10K: A proposal to obtain whole-genome sequence for 10,000 vertebrate
species. Journal of Heredity 100: 659-674.
Anisimova, M. &Gascuel, O. (2006). Approximate likelihood-ratio test for branches: A fast, accurate, and
powerful alternative. Syst Biol 55(4): 539-552.
Arendt, J. &Reznick, D. (2008). Convergence and parallelism reconsidered: what have we learned about
the genetics of adaptation? Trends Ecol Evol 23(1): 26-32.
Bell, T., Newman, J. A., Silverman, B. W., Turner, S. L. &Lilley, A. K. (2005). The contribution of species
richness and composition to bacterial services. Nature 436(7054): 1157-1160.
Bhaya, D., Grossman, A. R., Steunou, A. S., Khuri, N., Cohan, F. M., Hamamura, N., Melendrez, M. C.,
Bateson, M. M., Ward, D. M. &Heidelberg, J. F. (2007). Population level functional diversity in a
microbial community revealed by comparative genomic and metagenomic analyses. ISME J 1(8):
703-713.
Biddle, J. F., White, J. R., Teske, A. P. &House, C. H. (2011). Metagenomics of the subsurface BrazosTrinity Basin (IODP site 1320): comparison with other sediment and pyrosequenced
metagenomes. ISME J 5(6): 1038-1047.
Bininda-Emonds, O. R., Cardillo, M., Jones, K. E., MacPhee, R. D., Beck, R. M., Grenyer, R., Price, S. A.,
Vos, R. A., Gittleman, J. L. &Purvis, A. (2007). The delayed rise of present-day mammals. Nature
446(7135): 507-512.
Blount, Z. D., Borland, C. Z. &Lenski, R. E. (2008). Historical contingency and the evolution of a key
innovation in an experimental population of Escherichia coli. Proc Natl Acad Sci U S A 105(23):
7899-7906.
Bos, D. H. &Posada, D. (2005). Using models of nucleotide evolution to build phylogenetic trees. Dev
Comp Immunol 29(3): 211-227.
Brenner, S. E., Hubbard, T., Murzin, A. &Chothia, C. (1995). Gene duplications in H. influenzae. Nature
378(6553): 140.
Brock, T. D. (1995). The road to Yellowstone--and beyond. Annu Rev Microbiol 49: 1-28.
Cecchini, D. A., Laville, E., Laguerre, S., Robe, P., Leclerc, M., Dore, J., Henrissat, B., Remaud-Simeon, M.,
Monsan, P. &Potocki-Veronese, G. (2013). Functional metagenomics reveals novel pathways of
prebiotic breakdown by human gut bacteria. PLoS One 8(9): e72766.
Chin, C. S., Alexander, D. H., Marks, P., Klammer, A. A., Drake, J., Heiner, C., Clum, A., Copeland, A.,
Huddleston, J., Eichler, E. E., Turner, S. W. &Korlach, J. (2013). Nonhybrid, finished microbial
genome assemblies from long-read SMRT sequencing data. Nat Methods 10(6): 563-569.
Choi, S. C., Rasmussen, M. D., Hubisz, M. J., Gronau, I., Stanhope, M. J. &Siepel, A. (2012). Replacing and
additive horizontal gene transfer in Streptococcus. Mol Biol Evol 29(11): 3309-3320.
Clark, A. G., Glanowski, S., Nielsen, R., Thomas, P. D., Kejariwal, A., Todd, M. A., Tanenbaum, D. M.,
Civello, D., Lu, F., Murphy, B., Ferriera, S., Wang, G., Zheng, X., White, T. J., Sninsky, J. J., Adams,
M. D. &Cargill, M. (2003). Inferring nonneutral evolution from human-chimp-mouse orthologous
gene trios. Science 302(5652): 1960-1963.
Cohan, F. M. (1984). Genetic divergence under uniform selection. I. Similarity among populations of
Drosophila melanogaster in their responses to artificial selection for modifiers of ciD. Evolution
38(1): 55-71.
Cohan, F. M. (2005). Periodic selection and ecological diversity in bacteria. In Selective Sweep, 78-93 (Ed
D. Nurminsky). Georgetown, Texas: Landes Bioscience.
Cohan, F. M. (2010). Synthetic biology: now that we're creators, what should we create? Curr Biol
20(16): R675-677.
Cohan, F. M. (2011a). Are species cohesive?--A view from bacteriology. In Bacterial Population Genetics:
A Tribute to Thomas S. Whittam, 43-65 (Eds S. Walk and P. Feng). Washington, DC: American
Society for Microbiology Press.
Cohan, F. M. (2011b). A more perfect numbers game. In Los Angeles Times.
Cohan, F. M. (2011c). Q&A: Frederick Cohan. Current Biology 21(11): R412-R414.
Cohan, F. M. (2012). Science needs more Moneyball. American Scientist 100(3): 182-185.
Cohan, F. M. (2013). Species. In Brenner's Encyclopedia of Genetics, Second Edition, 506-511 (Eds S.
Maloy and K. Hughes). Amsterdam: Elsevier.
Cohan, F. M. &Hoffmann, A. A. (1989). Uniform selection as a diversifying force in evolution: evidence
from Drosophila. Am Nat 134: 613-637.
Cohan, F. M. &Kopac, S. M. (2011). Microbial genomics: E. coli relatives out of doors and out of body.
Curr Biol 21(15): R587-589.
Cohan, F. M. &Perry, E. B. (2007). A systematics for discovering the fundamental units of bacterial
diversity. Current Biology 17: R373-R386.
Connor, N., Sikorski, J., Rooney, A. P., Kopac, S., Koeppel, A. F., Burger, A., Cole, S. G., Perry, E. B.,
Krizanc, D., Field, N. C., Slaton, M. &Cohan, F. M. (2010). The ecology of speciation in Bacillus.
Applied and Environmental Microbiology 76: 1349-1358.
Davies, N., Field, D. &Genomic Observatories Network (2012). Sequencing data: A genomic network to
monitor Earth. Nature 481(7380): 145.
Degnan, J. H. &Rosenberg, N. A. (2006). Discordance of species trees with their most likely gene trees.
PLoS Genet 2(5): e68.
Delcher, A. L., Kasif, S., Fleischmann, R. D., Peterson, J., White, O. &Salzberg, S. L. (1999). Alignment of
whole genomes. Nucleic Acids Res 27(11): 2369-2376.
DeSantis, T. Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E. L., Keller, K., Huber, T., Dalevi, D., Hu, P.
&Andersen, G. L. (2006). Greengenes, a chimera-checked 16S rRNA gene database and
workbench compatible with ARB. Appl Environ Microbiol 72(7): 5069-5072.
Dettman, J. R., Rodrigue, N., Melnyk, A. H., Wong, A., Bailey, S. F. &Kassen, R. (2012). Evolutionary
insight from whole-genome sequencing of experimentally evolved microbes. Mol Ecol 21(9):
2058-2077.
Donoghue, M. J. (1989). Phylogenies and the analysis of evolutionary sequences, with examples from
seed plants. Evolution 43: 1137-1156.
Eddy, S. R. (2004). What is dynamic programming? Nat Biotechnol 22(7): 909-910.
Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput.
Nucleic Acids Research 32: 1792-1797.
Ferea, T. L., Botstein, D., Brown, P. O. &Rosenzweig, R. F. (1999). Systematic changes in gene expression
patterns following adaptive evolution in yeast. Proc Natl Acad Sci U S A 96(17): 9721-9726.
Field, D., Garrity, G., Gray, T., Morrison, N., Selengut, J., Sterk, P., Tatusova, T., Thomson, N., Allen, M. J.,
Angiuoli, S. V., Ashburner, M., Axelrod, N., Baldauf, S., Ballard, S., Boore, J., Cochrane, G., Cole,
J., Dawyndt, P., De Vos, P., DePamphilis, C., Edwards, R., Faruque, N., Feldman, R., Gilbert, J.,
Gilna, P., Glockner, F. O., Goldstein, P., Guralnick, R., Haft, D., Hancock, D., Hermjakob, H., Hertz-Fowler, C., Hugenholtz, P., Joint, I., Kagan, L., Kane, M., Kennedy, J., Kowalchuk, G., Kottmann,
R., Kolker, E., Kravitz, S., Kyrpides, N., Leebens-Mack, J., Lewis, S. E., Li, K., Lister, A. L., Lord, P.,
Maltsev, N., Markowitz, V., Martiny, J., Methe, B., Mizrachi, I., Moxon, R., Nelson, K., Parkhill, J.,
Proctor, L., White, O., Sansone, S. A., Spiers, A., Stevens, R., Swift, P., Taylor, C., Tateno, Y., Tett,
A., Turner, S., Ussery, D., Vaughan, B., Ward, N., Whetzel, T., San Gil, I., Wilson, G. &Wipat, A.
(2008). The minimum information about a genome sequence (MIGS) specification. Nat
Biotechnol 26(5): 541-547.
Fierer, N., Hamady, M., Lauber, C. L. &Knight, R. (2008). The influence of sex, handedness, and washing
on the diversity of hand surface bacteria. Proc Natl Acad Sci U S A 105(46): 17994-17999.
Flicek, P. &Birney, E. (2009). Sense from sequence reads: methods for alignment and assembly. Nat
Methods 6(11 Suppl): S6-S12.
Fong, S. S., Joyce, A. R. &Palsson, B. O. (2005). Parallel adaptive evolution cultures of Escherichia coli
lead to convergent growth phenotypes with different gene expression states. Genome Res
15(10): 1365-1372.
Forslund, K., Sunagawa, S., Kultima, J. R., Mende, D. R., Arumugam, M., Typas, A. &Bork, P. (2013).
Country-specific antibiotic use practices impact the human gut resistome. Genome Res 23(7):
1163-1169.
Funch, P. &Kristensen, R. (1995). Cycliophora is a new phylum with affinities to Entoprocta and
Ectoprocta. Nature 378: 711-714.
Futuyma, D. J. (1998). Evolutionary Biology.
Genoways, H. H. &Choate, J. R. (1972). A multivariate analysis of systematic relationships among
populations of the short-tailed shrew (genus Blarina) in Nebraska. Systematic Zoology 21: 106-116.
Gilbert, J. A., Meyer, F., Jansson, J., Gordon, J., Pace, N., Tiedje, J., Ley, R., Fierer, N., Field, D., Kyrpides,
N., Glockner, F. O., Klenk, H. P., Wommack, K. E., Glass, E., Docherty, K., Gallery, R., Stevens, R.
&Knight, R. (2010). The Earth Microbiome Project: Meeting report of the "1st EMP meeting on
sample selection and acquisition" at Argonne National Laboratory October 6 2010. Stand
Genomic Sci 3(3): 249-253.
Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S. &Brilliant, L. (2009). Detecting
influenza epidemics using search engine query data. Nature 457(7232): 1012-1014.
Guindon, S., Dufayard, J. F., Lefort, V., Anisimova, M., Hordijk, W. &Gascuel, O. (2010). New algorithms
and methods to estimate maximum-likelihood phylogenies: assessing the performance of
PhyML 3.0. Syst Biol 59(3): 307-321.
Hahne, H., Mader, U., Otto, A., Bonn, F., Steil, L., Bremer, E., Hecker, M. &Becher, D. (2010). A
comprehensive proteomics and transcriptomics analysis of Bacillus subtilis salt stress
adaptation. J Bacteriol 192(3): 870-882.
Hao, W. &Golding, G. B. (2006). The fate of laterally transferred genes: life in the fast lane to adaptation
or death. Genome Res 16(5): 636-643.
Harel, D. (2000). Sometimes we just don't know. In Computers Ltd.: What They Really Can't Do, 91-117.
Oxford: Oxford Univ. Press.
Harvey, P. H. &Pagel, M. D. (1991). The Comparative Method in Evolutionary Biology. Oxford: Oxford
University Press.
Herring, C. D., Raghunathan, A., Honisch, C., Patel, T., Applebee, M. K., Joyce, A. R., Albert, T. J., Blattner,
F. R., van den Boom, D., Cantor, C. R. &Palsson, B. O. (2006). Comparative genome sequencing
of Escherichia coli allows observation of bacterial evolution on a laboratory timescale. Nat Genet
38(12): 1406-1412.
Hess, M. (2011). Metagenomic discovery of biomass-degrading genes and genomes from cow rumen.
Science 331: 463-467.
Huelsenbeck, J. P., Ronquist, F., Nielsen, R. &Bollback, J. P. (2001). Bayesian inference of phylogeny and
its impact on evolutionary biology. Science 294(5550): 2310-2314.
Hughes, A. L. (2007). Looking for Darwin in all the wrong places: the misguided quest for positive
selection at the nucleotide sequence level. Heredity (Edinb) 99(4): 364-373.
Hurwitz, B. L. &Sullivan, M. B. (2013). The Pacific Ocean virome (POV): a marine viral metagenomic
dataset and associated protein clusters for quantitative viral ecology. PLoS One 8(2): e57355.
Inskeep, W. P., Rusch, D. B., Jay, Z. J., Herrgard, M. J., Kozubal, M. A., Richardson, T. H., Macur, R. E.,
Hamamura, N., Jennings, R., Fouke, B. W., Reysenbach, A. L., Roberto, F., Young, M., Schwartz,
A., Boyd, E. S., Badger, J. H., Mathur, E. J., Ortmann, A. C., Bateson, M., Geesey, G. &Frazier, M.
(2010). Metagenomes from high-temperature chemotrophic systems reveal geochemical
controls on microbial community structure and function. PLoS One 5(3): e9773.
Iwabe, N., Kuma, K., Hasegawa, M., Osawa, S. &Miyata, T. (1989). Evolutionary relationship of
archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated
genes. Proc Natl Acad Sci U S A 86(23): 9355-9359.
Kämpfer, P., Kroppenstedt, R. M. &Dott, W. (1991). A numerical classification of the genera
Streptomyces and Streptoverticillium using miniaturized physiological tests. Journal of General
Microbiology 137: 1831-1891.
Keeling, P. J. &Palmer, J. D. (2008). Horizontal gene transfer in eukaryotic evolution. Nat Rev Genet 9(8):
605-618.
Khosravi, A. &Mazmanian, S. K. (2013). Disruption of the gut microbiome as a risk factor for microbial
infections. Curr Opin Microbiol 16(2): 221-227.
Kim, J. S., Makama, M., Petito, J., Park, N. H., Cohan, F. M. &Dungan, R. S. (2012). Diversity of Bacteria
and Archaea in hypersaline sediment from Death Valley National Park, California.
MicrobiologyOpen 1(2): 135-148.
Knight, R., Jansson, J., Field, D., Fierer, N., Desai, N., Fuhrman, J. A., Hugenholtz, P., van der Lelie, D.,
Meyer, F., Stevens, R., Bailey, M. J., Gordon, J. I., Kowalchuk, G. A. &Gilbert, J. A. (2012).
Unlocking the potential of metagenomics through replicated experimental design. Nat
Biotechnol 30(6): 513-520.
Koeppel, A. F., Wertheim, J. O., Barone, L., Gentile, N., Krizanc, D. &Cohan, F. M. (2013). Speedy
speciation in a bacterial microcosm: New species can arise as frequently as adaptations within a
species. ISME J 7: 1080-1091.
Konstantinidis, K. T. &Tiedje, J. M. (2005). Towards a genome-based taxonomy for prokaryotes. J
Bacteriol 187(18): 6258-6264.
Kopac, S. M. &Cohan, F. M. (2012). Comment on "Population genomics of early events in the ecological
differentiation of bacteria". In Science, Vol. 336.
Kuhn, T. (1996). The Structure of Scientific Revolutions. Chicago: University of Chicago.
Larson, G., Dobney, K., Albarella, U., Fang, M., Matisoo-Smith, E., Robins, J., Lowden, S., Finlayson, H.,
Brand, T., Willerslev, E., Rowley-Conwy, P., Andersson, L. &Cooper, A. (2005). Worldwide
phylogeography of wild boar reveals multiple centers of pig domestication. Science 307(5715):
1618-1621.
Lay, C. Y., Mykytczuk, N. C., Yergeau, E., Lamarche-Gagnon, G., Greer, C. W. &Whyte, L. G. (2013).
Defining the functional potential and active community members of a sediment microbial
community in a high-arctic hypersaline subzero spring. Appl Environ Microbiol 79(12): 3637-3648.
Lozupone, C., Hamady, M. &Knight, R. (2006). UniFrac--an online tool for comparing microbial
community diversity in a phylogenetic context. BMC Bioinformatics 7: 371.
Lozupone, C. &Knight, R. (2005). UniFrac: a new phylogenetic method for comparing microbial
communities. Appl Environ Microbiol 71(12): 8228-8235.
Lozupone, C. A. &Knight, R. (2007). Global patterns in bacterial diversity. Proc Natl Acad Sci U S A
104(27): 11436-11440.
Luo, C., Walk, S. T., Gordon, D. M., Feldgarden, M., Tiedje, J. M. &Konstantinidis, K. T. (2011). Genome
sequencing of environmental Escherichia coli expands understanding of the ecology and
speciation of the model bacterial species. Proc Natl Acad Sci U S A 108(17): 7200-7205.
Mackay, T. F., Richards, S., Stone, E. A., Barbadilla, A., Ayroles, J. F., Zhu, D., Casillas, S., Han, Y., Magwire,
M. M., Cridland, J. M., Richardson, M. F., Anholt, R. R., Barron, M., Bess, C., Blankenburg, K. P.,
Carbone, M. A., Castellano, D., Chaboub, L., Duncan, L., Harris, Z., Javaid, M., Jayaseelan, J. C.,
Jhangiani, S. N., Jordan, K. W., Lara, F., Lawrence, F., Lee, S. L., Librado, P., Linheiro, R. S., Lyman,
R. F., Mackey, A. J., Munidasa, M., Muzny, D. M., Nazareth, L., Newsham, I., Perales, L., Pu, L. L.,
Qu, C., Ramia, M., Reid, J. G., Rollmann, S. M., Rozas, J., Saada, N., Turlapati, L., Worley, K. C.,
Wu, Y. Q., Yamamoto, A., Zhu, Y., Bergman, C. M., Thornton, K. R., Mittelman, D. &Gibbs, R. A.
(2012). The Drosophila melanogaster Genetic Reference Panel. Nature 482(7384): 173-178.
Mackelprang, R., Waldrop, M. P., DeAngelis, K. M., David, M. M., Chavarria, K. L., Blazewicz, S. J., Rubin,
E. M. &Jansson, J. K. (2011). Metagenomic analysis of a permafrost microbial community reveals
a rapid response to thaw. Nature 480(7377): 368-371.
Mallet, J. (1995). A species definition for the modern synthesis. Trends Ecol. Evol. 10: 294-299.
Mallet, J. (2008). Hybridization, ecological races and the nature of species: empirical evidence for the
ease of speciation. Philos Trans R Soc Lond B Biol Sci 363(1506): 2971-2986.
McGrayne, S. B. (2011). The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted
Down Russian Submarines, & Emerged Triumphant from Two Centuries of Controversy. New
Haven: Yale.
McHardy, A. C., Martin, H. G., Tsirigos, A., Hugenholtz, P. &Rigoutsos, I. (2007). Accurate phylogenetic
classification of variable-length DNA fragments. Nat Methods 4(1): 63-72.
McMahon, M. D., Guan, C., Handelsman, J. &Thomas, M. G. (2012). Metagenomic analysis of
Streptomyces lividans reveals host-dependent functional expression. Appl Environ Microbiol
78(10): 3622-3629.
Merhej, V., Royer-Carenzi, M., Pontarotti, P. &Raoult, D. (2009). Massive comparative genomic analysis
reveals convergent evolution of specialized bacteria. Biol Direct 4: 13.
Michener, C. D. &Sokal, R. R. (1957). A Quantitative Approach to a Problem in Classification. Evolution
11: 130-162.
Mikkelsen, T. S., Hillier, L. W., and many other authors (2005). Initial sequence of the chimpanzee genome and
comparison with the human genome. Nature 437(7055): 69-87.
Morrison, D. A. (2009). Why would phylogeneticists ignore computerized sequence alignment? Syst Biol 58: 150-158.
Muegge, B. D., Kuczynski, J., Knights, D., Clemente, J. C., Gonzalez, A., Fontana, L., Henrissat, B., Knight,
R. &Gordon, J. I. (2011). Diet drives convergence in gut microbiome functions across mammalian
phylogeny and within humans. Science 332(6032): 970-974.
Myers, E. W., Sutton, G. G., Smith, H. O., Adams, M. D. &Venter, J. C. (2002). On the sequencing and
assembly of the human genome. Proc Natl Acad Sci U S A 99(7): 4145-4146.
Nei, M. &Gojobori, T. (1986). Simple methods for estimating the numbers of synonymous and
nonsynonymous nucleotide substitutions. Mol Biol Evol 3(5): 418-426.
Nosenko, T., Schreiber, F., Adamska, M., Adamski, M., Eitel, M., Hammel, J., Maldonado, M., Muller, W.
E., Nickel, M., Schierwater, B., Vacelet, J., Wiens, M. &Worheide, G. (2013). Deep metazoan
phylogeny: when different genes tell different stories. Mol Phylogenet Evol 67(1): 223-233.
Pavlidis, P., Zivkovic, D., Stamatakis, A. &Alachiotis, N. (2013). SweeD: likelihood-based detection of
selective sweeps in thousands of genomes. Mol Biol Evol 30(9): 2224-2234.
Penn, O., Privman, E., Ashkenazy, H., Landan, G., Graur, D. &Pupko, T. (2010). GUIDANCE: a web server
for assessing alignment confidence scores. Nucleic Acids Res 38(Web Server issue): W23-28.
Perez-Cobas, A. E., Gosalbes, M. J., Friedrichs, A., Knecht, H., Artacho, A., Eismann, K., Otto, W., Rojo, D.,
Bargiela, R., von Bergen, M., Neulinger, S. C., Daumer, C., Heinsen, F. A., Latorre, A., Barbas, C.,
Seifert, J., dos Santos, V. M., Ott, S. J., Ferrer, M. &Moya, A. (2013). Gut microbiota disturbance
during antibiotic therapy: a multi-omic approach. Gut 62(11): 1591-1601.
Plewniak, F., Koechler, S., Navet, B., Dugat-Bony, E., Bouchez, O., Peyret, P., Seby, F., Battaglia-Brunet, F.
&Bertin, P. N. (2013). Metagenomic insights into microbial metabolism affecting arsenic
dispersion in Mediterranean marine sediments. Mol Ecol 22(19): 4870-4883.
Pollan, M. (2013). Some of my best friends are germs. In New York Times. New York.
Popa, O., Hazkani-Covo, E., Landan, G., Martin, W. &Dagan, T. (2011). Directed networks reveal genomic
barriers and DNA repair bypasses to lateral gene transfer among prokaryotes. Genome Res
21(4): 599-609.
Probert, R. J., Daws, M. I. &Hay, F. R. (2009). Ecological correlates of ex situ seed longevity: a
comparative study on 195 species. Annals of Botany 104(1): 57-69.
Richards, C. L., Rosas, U., Banta, J., Bhambhra, N. &Purugganan, M. D. (2012). Genome-wide patterns of
Arabidopsis gene expression in nature. PLoS Genet 8(4): e1002662.
Rinke, C., Schwientek, P., Sczyrba, A., Ivanova, N. N., Anderson, I. J., Cheng, J. F., Darling, A., Malfatti, S.,
Swan, B. K., Gies, E. A., Dodsworth, J. A., Hedlund, B. P., Tsiamis, G., Sievert, S. M., Liu, W. T.,
Eisen, J. A., Hallam, S. J., Kyrpides, N. C., Stepanauskas, R., Rubin, E. M., Hugenholtz, P. &Woyke,
T. (2013). Insights into the phylogeny and coding potential of microbial dark matter. Nature
499(7459): 431-437.
Robertson, D. E., Chaplin, J. A., DeSantis, G., Podar, M., Madden, M., Chi, E., Richardson, T., Milan, A.,
Miller, M., Weiner, D. P., Wong, K., McQuaid, J., Farwell, B., Preston, L. A., Tan, X., Snead, M. A.,
Keller, M., Mathur, E., Kretz, P. L., Burk, M. J. &Short, J. M. (2004). Exploring nitrilase sequence
space for enantioselective catalysis. Appl Environ Microbiol 70(4): 2429-2436.
Ronquist, F. &Huelsenbeck, J. P. (2003). MrBayes 3: Bayesian phylogenetic inference under mixed
models. Bioinformatics 19(12): 1572-1574.
Saitou, N. &Nei, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic
trees. Mol Biol Evol 4(4): 406-425.
Schloss, P. D. &Handelsman, J. (2004). Status of the microbial census. Microbiol Mol Biol Rev 68(4): 686-691.
She, X., Jiang, Z., Clark, R. A., Liu, G., Cheng, Z., Tuzun, E., Church, D. M., Sutton, G., Halpern, A. L.
&Eichler, E. E. (2004). Shotgun sequence assembly and recent segmental duplications within the
human genome. Nature 431(7011): 927-930.
Silver, N. (2012). The Signal and the Noise: Why So Many Predictions Fail--but Some Don't. New York:
Penguin.
Simon, C., Wiezer, A., Strittmatter, A. W. &Daniel, R. (2009). Phylogenetic diversity and metabolic
potential revealed in a glacier ice metagenome. Appl Environ Microbiol 75(23): 7519-7526.
Sogin, M. L., Morrison, H. G., Huber, J. A., Mark Welch, D., Huse, S. M., Neal, P. R., Arrieta, J. M. &Herndl,
G. J. (2006). Microbial diversity in the deep sea and the underexplored "rare biosphere". Proc
Natl Acad Sci U S A 103(32): 12115-12120.
Sokol, H., Pigneur, B., Watterlot, L., Lakhdari, O., Bermudez-Humaran, L. G., Gratadoux, J. J., Blugeon, S.,
Bridonneau, C., Furet, J. P., Corthier, G., Grangette, C., Vasquez, N., Pochart, P., Trugnan, G.,
Thomas, G., Blottiere, H. M., Dore, J., Marteau, P., Seksik, P. &Langella, P. (2008).
Faecalibacterium prausnitzii is an anti-inflammatory commensal bacterium identified by gut
microbiota analysis of Crohn disease patients. Proc Natl Acad Sci U S A 105(43): 16731-16736.
Sommer, M. O., Dantas, G. &Church, G. M. (2009). Functional characterization of the antibiotic
resistance reservoir in the human microflora. Science 325(5944): 1128-1131.
Sumby, P., Whitney, A. R., Graviss, E. A., DeLeo, F. R. &Musser, J. M. (2006). Genome-wide analysis of
group A streptococci reveals a mutation that modulates global phenotype and disease
specificity. PLoS Pathog 2(1): e5.
Sumner, J. G., Jarvis, P. D., Fernandez-Sanchez, J., Kaine, B. T., Woodhams, M. D. &Holland, B. R. (2012).
Is the general time-reversible model bad for molecular phylogenetics? Syst Biol 61(6): 1069-1074.
Thalmann, O., Shapiro, B., Cui, P., Schuenemann, V. J., Sawyer, S. K., Greenfield, D. L., Germonpre, M. B.,
Sablin, M. V., Lopez-Giraldez, F., Domingo-Roura, X., Napierala, H., Uerpmann, H. P., Loponte, D.
M., Acosta, A. A., Giemsch, L., Schmitz, R. W., Worthington, B., Buikstra, J. E., Druzhkova, A.,
Graphodatsky, A. S., Ovodov, N. D., Wahlberg, N., Freedman, A. H., Schweizer, R. M., Koepfli, K.
P., Leonard, J. A., Meyer, M., Krause, J., Paabo, S., Green, R. E. &Wayne, R. K. (2013). Complete
mitochondrial genomes of ancient canids suggest a European origin of domestic dogs. Science
342(6160): 871-874.
Tomes, N. (1998). The Gospel of Germs: Men, Women, and the Microbe in American Life. Cambridge,
Mass.: Harvard University Press.
Toomey, D. (2013). Weird Life: The Search for Life that Is Very, Very Different from our Own. New York:
Norton.
Touchon, M., Hoede, C., Tenaillon, O., Barbe, V., Baeriswyl, S., Bidet, P., Bingen, E., Bonacorsi, S.,
Bouchier, C., Bouvet, O., Calteau, A., Chiapello, H., Clermont, O., Cruveiller, S., Danchin, A.,
Diard, M., Dossat, C., Karoui, M. E., Frapy, E., Garry, L., Ghigo, J. M., Gilles, A. M., Johnson, J., Le
Bouguenec, C., Lescat, M., Mangenot, S., Martinez-Jéhanne, V., Matic, I., Nassif, X., Oztas, S.,
Petit, M. A., Pichon, C., Rouy, Z., Ruf, C. S., Schneider, D., Tourret, J., Vacherie, B., Vallenet, D.,
Médigue, C., Rocha, E. P. &Denamur, E. (2009). Organised genome dynamics in the Escherichia
coli species results in highly diverse adaptive paths. PLoS Genet 5(1): e1000344.
Turnbaugh, P. J., Ley, R. E., Hamady, M., Fraser-Liggett, C. M., Knight, R. &Gordon, J. I. (2007). The
human microbiome project. Nature 449(7164): 804-810.
Tyson, G. W., Chapman, J., Hugenholtz, P., Allen, E. E., Ram, R. J., Richardson, P. M., Solovyev, V. V.,
Rubin, E. M., Rokhsar, D. S. &Banfield, J. F. (2004). Community structure and metabolism
through reconstruction of microbial genomes from the environment. Nature 428(6978): 37-43.
Velasquez-Manoff, M. (2013). A cure for the allergy epidemic? In New York Times.
Vos, M. (2011). A species concept for bacteria based on adaptive divergence. Trends Microbiol 19(1): 1-7.
Vos, M., te Beek, T. A., van Driel, M. A., Huynen, M. A., Eyre-Walker, A. &van Passel, M. W. (2013).
ODoSE: a webserver for genome-wide calculation of adaptive divergence in prokaryotes. PLoS
One 8(5): e62447.
Waterston, R. H., Lander, E. S. &Sulston, J. E. (2002). On the sequencing of the human genome. Proc Natl
Acad Sci U S A 99(6): 3712-3716.
White, M. A., Ane, C., Dewey, C. N., Larget, B. R. &Payseur, B. A. (2009). Fine-scale phylogenetic
discordance across the house mouse genome. PLoS Genet 5(11): e1000729.
Williamson, S. H., Hubisz, M. J., Clark, A. G., Payseur, B. A., Bustamante, C. D. &Nielsen, R. (2007).
Localizing recent adaptive evolution in the human genome. PLoS Genet 3(6): e90.
Wu, D., Hugenholtz, P., Mavromatis, K., Pukall, R., Dalin, E., Ivanova, N. N., Kunin, V., Goodwin, L., Wu,
M., Tindall, B. J., Hooper, S. D., Pati, A., Lykidis, A., Spring, S., Anderson, I. J., D'Haeseleer, P.,
Zemla, A., Singer, M., Lapidus, A., Nolan, M., Copeland, A., Han, C., Chen, F., Cheng, J. F., Lucas,
S., Kerfeld, C., Lang, E., Gronow, S., Chain, P., Bruce, D., Rubin, E. M., Kyrpides, N. C., Klenk, H. P.
&Eisen, J. A. (2009). A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea. Nature
462(7276): 1056-1060.
Wu, G. D., Chen, J., Hoffmann, C., Bittinger, K., Chen, Y. Y., Keilbaugh, S. A., Bewtra, M., Knights, D.,
Walters, W. A., Knight, R., Sinha, R., Gilroy, E., Gupta, K., Baldassano, R., Nessel, L., Li, H.,
Bushman, F. D. &Lewis, J. D. (2011). Linking long-term dietary patterns with gut microbial
enterotypes. Science 334(6052): 105-108.
Yang, Z. (2007). PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 24(8): 1586-1591.
Zhai, W., Nielsen, R., Goldman, N. &Yang, Z. (2012). Looking for Darwin in genomic sequences--validity
and success of statistical methods. Mol Biol Evol 29(10): 2889-2893.
Zhang, C., Zhang, M., Pang, X., Zhao, Y., Wang, L. &Zhao, L. (2012). Structural resilience of the gut
microbiota in adult mice under high-fat dietary perturbations. ISME J 6(10): 1848-1857.
Zhao, L. (2013). The gut microbiota and obesity: from correlation to causality. Nat Rev Microbiol 11(9):
639-647.
Zhong, Y., Jia, Y., Gao, Y., Tian, D., Yang, S. &Zhang, X. (2013). Functional requirements driving the gene
duplication in 12 Drosophila species. BMC Genomics 14: 555.
Zmasek, C. M. &Eddy, S. R. (2001). A simple algorithm to infer gene duplication and speciation events on
a gene tree. Bioinformatics 17(9): 821-828.