© Draft materials. Do not circulate. Page 1 Rooting and outgroups

Rooting and outgroups
Rooted trees contain information about the flow of time, which allows them to tell us about recency of common ancestry (monophyly) and the direction of trait evolution. Rooted trees are therefore essential for most downstream uses of phylogenies, from classification to studies of adaptation. However, most methods of phylogenetic inference do not directly yield rooted trees, but unrooted trees. While you can read much of the secondary phylogenetic literature understanding only rooted trees, it is essential to understand unrooted trees (and how they are rooted) if you plan to be engaged in phylogenetic research or if you want to be able to critically read the primary phylogenetic literature.
At one level rooting is simple: there are just a handful of ways to root trees and these all have a sound logical basis. Nonetheless, perhaps because rooting requires a certain level of mental gymnastics, I have observed that even professional biologists have been known to stumble over unrooted trees and to over‐interpret rooted trees. Therefore, I start by introducing unrooted trees and then list the three main methods for converting them into rooted trees. I end by visiting some of the common mistakes made that relate to rooting and provide guidelines as to how to avoid these errors.
The information content of unrooted trees
An unrooted tree is simply a tree that lacks a defined root. In an unrooted tree the lines represent evolutionary lineages, but unlike a rooted tree, we do not know which way evolution proceeded along the lineage. Because a clade is an ancestor and all its descendants, we need temporal information to identify clades. Thus, the internal branches of an unrooted tree do not define clades but rather split the taxa into two sets that are attached (directly or indirectly) to the two ends of the branch.
Let’s work through this. Start by considering a rooted tree, such as that shown for select Archosaurs. After removing the root from this rooted tree it is possible to redraw the unrooted tree in several different formats.
[Figure: three panels labeled Rooted, Unrooted, Unrooted]
The first unrooted tree in this example maintains the general shape of the rooted tree except for an apparent polytomy in the position that used to correspond to the root. Actually this is not a polytomy, just the correct way to draw this unrooted tree when the root node has been removed. This can be seen in the third tree, which has been spread out to avoid even an impression of a root. The tree is binary in that each node has three branches: one corresponding to the ancestral lineage and two to descendant lineages (although the figure doesn’t specify which are ancestral and which derived).
If you find it hard to see that the second and third trees are identical in content, there are three ways to proceed.

(1) You could use mental gymnastics to visualize the changes in shape needed to get from one tree to another. You should imagine that the trees are made of string (that can bend or change length) and ask yourself whether you could rearrange the second tree to yield the third without having to cut the string (the answer is yes).

(2) You could list the tips that are attached to each node and see if the same set of nodes is present in each case. The table lists all 11 nodes and their connected tips. Because the list is the same for both unrooted trees (and the rooted tree), they are the same.
Branch 1 | Branch 2 | Branch 3
Stegosaurus | Ankylosaurus | Diplodocus, Allosaurus, Tyrannosaurus, Velociraptor, Archaeopteryx, Ostrich, Robin, Pterosaur, Crocodile, Triceratops, Iguanodon
Triceratops | Iguanodon | Diplodocus, Allosaurus, Tyrannosaurus, Velociraptor, Archaeopteryx, Ostrich, Robin, Pterosaur, Crocodile, Stegosaurus, Ankylosaurus
Triceratops, Iguanodon | Stegosaurus, Ankylosaurus | Diplodocus, Allosaurus, Tyrannosaurus, Velociraptor, Archaeopteryx, Ostrich, Robin, Pterosaur, Crocodile
Pterosaur | Crocodile | Diplodocus, Allosaurus, Tyrannosaurus, Velociraptor, Archaeopteryx, Ostrich, Robin, Stegosaurus, Ankylosaurus, Triceratops, Iguanodon
Pterosaur, Crocodile | Stegosaurus, Ankylosaurus, Triceratops, Iguanodon | Diplodocus, Allosaurus, Tyrannosaurus, Velociraptor, Archaeopteryx, Ostrich, Robin
Diplodocus | Stegosaurus, Ankylosaurus, Triceratops, Iguanodon, Pterosaur, Crocodile | Allosaurus, Tyrannosaurus, Velociraptor, Archaeopteryx, Ostrich, Robin
Allosaurus | Stegosaurus, Ankylosaurus, Triceratops, Iguanodon, Pterosaur, Crocodile, Diplodocus | Tyrannosaurus, Velociraptor, Archaeopteryx, Ostrich, Robin
Tyrannosaurus | Stegosaurus, Ankylosaurus, Triceratops, Iguanodon, Pterosaur, Crocodile, Diplodocus, Allosaurus | Velociraptor, Archaeopteryx, Ostrich, Robin
Velociraptor | Stegosaurus, Ankylosaurus, Triceratops, Iguanodon, Pterosaur, Crocodile, Diplodocus, Allosaurus, Tyrannosaurus | Archaeopteryx, Ostrich, Robin
Archaeopteryx | Stegosaurus, Ankylosaurus, Triceratops, Iguanodon, Pterosaur, Crocodile, Diplodocus, Allosaurus, Tyrannosaurus, Velociraptor | Ostrich, Robin
Ostrich | Robin | Stegosaurus, Ankylosaurus, Triceratops, Iguanodon, Pterosaur, Crocodile, Diplodocus, Allosaurus, Tyrannosaurus, Velociraptor, Archaeopteryx
(3) You could list the sets of taxa separated by each internal branch, the set of splits. For example, one internal branch (shown to the right) divides Stegosaurus, Ankylosaurus, Triceratops, Iguanodon, Pterosaur, and Crocodile on the one side from Diplodocus, Allosaurus, Tyrannosaurus, Velociraptor, Archaeopteryx, Ostrich, and Robin on the other.

[Figure: the unrooted tree of the 13 taxa with one internal branch highlighted]
Because there are an odd number of taxa in this case, we can define each split by just listing the smaller of the two sets of taxa. Thus, you should be able to see that both trees are composed of the following splits:
Stegosaurus, Ankylosaurus
Iguanodon, Triceratops
Stegosaurus, Ankylosaurus, Iguanodon, Triceratops
Pterosaur, Crocodile
Stegosaurus, Ankylosaurus, Iguanodon, Triceratops, Pterosaur, Crocodile
Allosaurus, Tyrannosaurus, Velociraptor, Archaeopteryx, Ostrich, Robin
Tyrannosaurus, Velociraptor, Archaeopteryx, Ostrich, Robin
Velociraptor, Archaeopteryx, Ostrich, Robin
Archaeopteryx, Ostrich, Robin
Ostrich, Robin
Through any one of these three methods, mental gymnastics, node enumeration, or splits enumeration, you should be able to show that the two unrooted trees above are equivalent.
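Splits enumeration is easy to automate. The following is a minimal sketch in Python, not a published implementation. It assumes trees are written as nested tuples of tip names, and the two encodings (tree_a and tree_b) are my own drawings of the archosaur tree, oriented differently on purpose. The function names tips and splits are likewise just illustrative.

```python
def tips(tree):
    """Return the set of tip names below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tips(child) for child in tree))


def splits(tree):
    """Return the non-trivial splits of the unrooted tree underlying `tree`.

    Each split is stored as an unordered pair of tip sets, so the result
    does not depend on where the nested tuples happen to be "rooted".
    """
    all_tips = tips(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = tips(node)
        other = all_tips - side
        # Skip the top of the drawing and branches leading to single tips.
        if len(side) > 1 and len(other) > 1:
            found.add(frozenset([side, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found


ORN = (("Stegosaurus", "Ankylosaurus"), ("Triceratops", "Iguanodon"))
PC = ("Pterosaur", "Crocodile")
SAUR = ("Diplodocus", ("Allosaurus", ("Tyrannosaurus", ("Velociraptor",
        ("Archaeopteryx", ("Ostrich", "Robin"))))))

# Two different drawings of the same unrooted tree (illustrative encodings).
tree_a = (ORN, PC, SAUR)
tree_b = ("Robin", ("Ostrich", ("Archaeopteryx", ("Velociraptor",
          ("Tyrannosaurus", ("Allosaurus", ("Diplodocus", (PC, ORN))))))))

# A genuinely different tree: Diplodocus and Allosaurus swapped.
tree_c = (ORN, PC, ("Allosaurus", ("Diplodocus", ("Tyrannosaurus",
          ("Velociraptor", ("Archaeopteryx", ("Ostrich", "Robin")))))))

print(len(splits(tree_a)))               # number of internal branches
print(splits(tree_a) == splits(tree_b))  # same unrooted tree?
print(splits(tree_a) == splits(tree_c))  # different unrooted tree?
```

Because each split is stored as an unordered pair of sets, the comparison is indifferent to how the tree happens to be drawn, which is exactly the property we want when asking whether two unrooted trees are equivalent.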
Rooting an unrooted tree
Rooting an unrooted tree involves adding an additional node to one of the branches and reorienting the tree relative to that node. Here are three alternative rootings of the same unrooted tree.
There are as many places to add a root to an unrooted tree as there are branches (internal and external) on that tree. The number of branches on an unrooted tree, and hence the number of distinct ways to root it, is 2N‐3, where N is the number of tips. For example, in this case there are eight tips in the tree and thus 13 distinct ways to root it.
You may notice that adding a root is just like adding an additional taxon. This is why the number of distinct unrooted topologies for N tips is equal to the number of rooted topologies for N‐1 tips (the numbers of rooted trees are given in Chap. 8, page xx).
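These counts are simple enough to check by hand, but a few lines of Python make the relationships explicit. This is only a sketch. The function names are mine, and the formula for the number of unrooted binary topologies (the product of the odd numbers 1 × 3 × ... × (2N‐5)) is the standard result for fully resolved trees rather than something derived in this chapter.

```python
def num_branches(n_tips):
    """Branches on a binary unrooted tree, i.e. the number of ways to root it."""
    return 2 * n_tips - 3


def num_unrooted_topologies(n_tips):
    """Distinct binary unrooted topologies: 1 * 3 * 5 * ... * (2N - 5)."""
    count = 1
    for odd in range(3, 2 * n_tips - 4, 2):
        count *= odd
    return count


def num_rooted_topologies(n_tips):
    """A root behaves like one extra taxon, so count unrooted trees on N + 1 tips."""
    return num_unrooted_topologies(n_tips + 1)


print(num_branches(8))             # rootings of an eight-tip tree
print(num_unrooted_topologies(4))  # unrooted trees for four taxa
print(num_rooted_topologies(4))    # rooted trees for four taxa
```

Note that multiplying the number of unrooted topologies by 2N‐3 gives the number of rooted topologies for the same N tips, which is another way of saying that every branch of every unrooted tree is a possible root position.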
Depending on where the root is added, different sets of taxa are resolved as clades. For example, consider the simple unrooted tree to the right. We do not know from this tree if (AB) is a clade, or if (CDEF) is a clade, or both. The identity of the clades depends on the location of the root. If the root is in the AB half of the split (i.e., on the terminal branch leading to A or that leading to B) then (CDEF) is a clade. If the root is in the CDEF half of the split, then (AB) is a clade, but C+D+E+F is not. Only if the root is on the branch separating AB from CDEF will they both be clades.

[Figure: alternative locations of the root on this tree and the clade that each implies]
While an unrooted tree does not identify clades, it does rule out certain sets of taxa forming a clade. Only a set of tips that is one half of a split in an unrooted tree can be a clade on a rooted tree. For example, the unrooted tree shown is incompatible with a (CE) clade because there is no branch that separates C+E to one side and A+B+D+F to the other.
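The rule that only one half of a split can become a clade can be written as a one‐line test. The sketch below is illustrative only. The split AB | CDEF comes from the text, but the figure is not reproduced here, so the resolution within CDEF (C with D, E with F) is a hypothetical choice of mine that is merely consistent with the statement that no branch separates C+E from the rest.

```python
TIPS = frozenset("ABCDEF")


def split(side):
    """Build a split (an unordered pair of tip sets) from one of its halves."""
    side = frozenset(side)
    return frozenset([side, TIPS - side])


# Internal branches: AB|CDEF is from the text; CD|ABEF and EF|ABCD are assumed.
INTERNAL = {split("AB"), split("CD"), split("EF")}
# External branches: each tip versus all the others.
EXTERNAL = {split(t) for t in TIPS}
SPLITS = INTERNAL | EXTERNAL


def can_be_clade(candidate):
    """True if some rooting of the unrooted tree makes `candidate` a clade."""
    candidate = frozenset(candidate)
    return any(candidate in s for s in SPLITS)


print(can_be_clade("AB"))     # one half of an internal split
print(can_be_clade("CDEF"))   # the other half of the same split
print(can_be_clade("CE"))     # not a half of any split
print(can_be_clade("BCDEF"))  # a clade if the root is on the branch to A
```

The external branches are included deliberately: rooting on the branch leading to a single tip makes all the remaining tips a clade.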
Rooting not only affects which sets of taxa form clades, but also inferences as to the direction of character evolution. Here is an unrooted tree of the monocot clade of flowering plants, all of which are herbaceous except for the palms, which are trees. If the root of this tree were on the branch leading to the date palm, then we would conclude that monocots were ancestrally trees and later evolved into herbs. Actually the root is on the branch with an arrow, which means that the monocots evolved from herbs into palm trees. So, for this reason too, we have good reason to want rooted rather than unrooted trees.
Unrooted trees in parsimony analysis
The main reason that you need to understand unrooted trees is that most phylogenetic analysis methods yield unrooted, not rooted, trees. To see why this is so, let’s revisit parsimony analysis as introduced in the last chapter using this simple, hypothetical data matrix.
     1  2  3  4  5  6  7  8
O    0  0  1  0  1  1  0  0
A    0  1  1  0  1  0  1  0
B    1  1  1  1  0  0  1  1
C    0  0  0  1  1  1  0  0
There are three possible unrooted trees for four taxa, as shown.

[Figure: the three possible unrooted trees for taxa O, A, B, and C]
On each of these trees we can count up the number of steps required to explain each character. In this case a character state change is marked on a tree, but the polarity is unclear. For example, here is the mapping of characters 1, 3 and 4 on tree 1. The bar divides the tree into two regions, one with state 0 and one with state 1. Because we do not have an axis of time defined, it is left unclear whether 0 evolved into 1 or vice versa.
[Figure: characters 1, 3, and 4 mapped on tree 1 (panels Char. 1, Char. 3, Char. 4)]
In the case of character 2 there are two equally parsimonious mappings, each with two steps.
[Figure: the two equally parsimonious mappings of character 2 on tree 1]
If you refer back to this same example in the rooted case you will see that everything is exactly the same: tree 1 can be explained with one step for characters 1, 3, and 4 and two steps for character 2. Indeed, the entire summary matrix is the same.
               1  2  3  4  5  6  7  8   Total length
O              0  0  1  0  1  1  0  0
A              0  1  1  0  1  0  1  0
B              1  1  1  1  0  0  1  1
C              0  0  0  1  1  1  0  0
L on tree 1    1  2  1  1  1  2  2  1   11
L on tree 2    1  2  1  2  1  2  2  1   12
L on tree 3    1  1  1  2  1  1  1  1   9   (most parsimonious tree)
This illustrates that for standard parsimony, the length of a rooted tree and the length of its corresponding unrooted tree are the same. If it takes two steps to explain a character on the unrooted tree it also takes two on the rooted tree. But recall that each unrooted tree corresponds to many (2N‐3) rooted trees. So, standard parsimony can select among unrooted trees but it cannot select among the different rooted versions of the optimal unrooted tree. For example, parsimony allows us to favor unrooted tree 3 for these data, but it is equally compatible with all five rootings of that unrooted tree.
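The step counts in the summary matrix can be reproduced with a short script. The following is a minimal sketch using Fitch's counting procedure, one standard way to count steps for unordered characters, and not necessarily how the numbers above were produced. Because the figure is not reproduced here, the identities of the three trees are an inference from the step counts: tree 1 separates O+A from B+C, tree 2 separates O+B from A+C, and tree 3 separates O+C from A+B. Trees are written as nested tuples, which forces an arbitrary rooting on each one.

```python
MATRIX = {
    "O": "00101100",
    "A": "01101010",
    "B": "11110011",
    "C": "00011100",
}


def fitch(tree, i):
    """Return (possible states, steps) for character i on a rooted binary tree."""
    if isinstance(tree, str):
        return {MATRIX[tree][i]}, 0
    (left, left_steps), (right, right_steps) = fitch(tree[0], i), fitch(tree[1], i)
    common = left & right
    if common:
        return common, left_steps + right_steps
    # No shared state: one change is needed somewhere below this node.
    return left | right, left_steps + right_steps + 1


def length(tree):
    """Total parsimony length of a tree over all eight characters."""
    return sum(fitch(tree, i)[1] for i in range(8))


# Each unrooted tree, drawn with the root on its internal branch.
TREES = {
    1: (("O", "A"), ("B", "C")),
    2: (("O", "B"), ("A", "C")),
    3: (("O", "C"), ("A", "B")),
}

# All five rootings (2N - 3 with N = 4) of unrooted tree 3.
ROOTINGS_OF_TREE_3 = [
    (("O", "C"), ("A", "B")),
    ("O", ("C", ("A", "B"))),
    ("C", ("O", ("A", "B"))),
    ("A", ("B", ("O", "C"))),
    ("B", ("A", ("O", "C"))),
]

for number, tree in TREES.items():
    print("tree", number, "length", length(tree))
print({length(tree) for tree in ROOTINGS_OF_TREE_3})
```

The last line prints a set containing a single length, which is the point of this section: under standard parsimony every rooting of the same unrooted tree has the same score.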
Parsimony and almost all other methods for estimating phylogenies select among unrooted trees but are agnostic among the many rooted trees that correspond to each unrooted tree. But what we desire is a rooted tree. Rooting is achieved by one of three main methods, as outlined in the following sections.
Tree‐based rooting
The most widely used rooting method is usually called outgroup rooting. Outgroup rooting is an example of a more general approach, tree‐based rooting. In tree‐based rooting we use prior phylogenetic information to orient a new analysis.
There are many aspects of phylogenetic history that are well known. By including taxa whose phylogenetic placement is already well known, we can identify either an exact root or a subset of branches to which the root could attach. Here are four hypothetical unrooted trees that include familiar taxa whose phylogenetic relationships are well known. Based on established phylogenetic knowledge I have marked the branches to which the roots should attach (by drawing them especially thick). For the upper two trees a single rooted tree is implied. I will leave it to you to draw it out and see what clades are implied.
For the lower trees there is sufficient uncertainty in the rooting that a single rooted tree is not determined. The normal way to handle such situations is to provide an unrooted tree that is oriented in such a way that it is easy to read off the unambiguously supported clades. Examples are shown below.
It is important to emphasize that the sampling of taxa in these examples is atypical. A phylogenetic analysis does not usually include so few and so divergent taxa. Usually a study is focused on resolving relationships in some particular clade for which many accessions are included. Indeed, as will be discussed further in Chap X, phylogenetic inference can yield erroneous results when taxon density is low, that is, when taxa are separated by long branches.
Suppose you wanted to study the relationships among maples (Acer). The taxon sampling would typically include sequences of one or more target genes from as many distinct maple species as you can obtain. If you already knew that maples were divided completely into two sister clades, then you would not need any more taxa. You would just place the root of the unrooted tree on the internal branch that yields these two clades. The study would not be testing this rooting, but accepting it as established.
But what if you are not sure how Acer is rooted? The well‐established methodology is to include in the data matrix taxa that are known to be outside of the maple clade: outgroups. The trees will then be rooted either between the ingroup and outgroup (if there is only one outgroup or if all the outgroup taxa form a clade) or somewhere within the outgroup.
In principle, all that is needed to root a tree is one outgroup taxon, and it can be anything that is not in the ingroup (the taxa whose relationships are being studied): a fly could serve as the outgroup for our study of maples. However, in practice it is advisable to use quite closely related outgroups, for example horse‐chestnuts. The reason for using closely related outgroups is that parsimony and other methods of analysis are most reliable when branch lengths are short (taxon density is high). Also, close relatives are more likely to have comparable characters and genes and to have more easily aligned sequence data.
It is also recommended to include multiple outgroups. First, this increases taxon density so as to make the analysis more reliable. Second, this provides insurance against inadvertently picking an incorrect outgroup. This point is worth exploring with an example.
Suppose you studied artiodactyls and used a whale as the sole outgroup. You might get a well supported tree that suggested that the hippopotamus was sister to all other artiodactyls, as shown (relationships from O’Leary and Gatesy 2008). In fact your rooting would be incorrect because it has become established that whales are within the “artiodactyls.” If you had included additional outgroups such as horses and elephants you would have obtained a tree in which the supposed ingroup could not form a clade, meaning that one of the outgroups is actually embedded within the ingroup. This would alert you to the fact that your prior assumption of artiodactyl monophyly was mistaken and would allow you to use other information to correctly root the tree.

[Figure: two panels labeled Incorrect rooting and Correct rooting]
Tree‐based rooting is the preferable method, but it depends upon prior phylogenetic knowledge, and that knowledge in turn depended on prior rooting. However, an infinite regress is avoidable because we can use other methods to root trees.
Character‐based rooting
Imagine that you were convinced that you knew the ancestral state of a particular character and that all taxa have the derived state, 1, except taxa A, B, and C, which have the ancestral state, 0. This assertion of character polarity is absolute: it does not allow for homoplasy in the character. As a result it is equivalent to treating taxa A, B, and C as outgroups. Thus, this way of using characters to root a tree blurs into tree‐based rooting.
Real character‐based rooting applies when some or all of the characters have polarity in their expected evolution, but we do not wish to rigidly identify taxa as being outgroups. In a parsimony framework, character‐based rooting occurs when one or more characters have an asymmetric step‐matrix. In this case the length of a tree varies with the tree’s root, meaning that parsimony is now selecting among rooted trees.
Consider a simple, although extreme example. Using simple flat‐weighted parsimony, these data support the unrooted tree shown (length = 7).

A    0  0  0  0  0  0  0
B    1  0  0  0  0  0  0
C    1  1  0  0  0  0  0
D    1  1  1  0  0  0  0
E    1  1  1  1  0  0  0
F    1  1  1  1  1  0  0
G    1  1  1  1  1  1  0
H    1  1  1  1  1  1  1

Suppose that you thought that the rate of evolving from 0 to 1 is lower than the rate of evolving in the reverse direction. You could capture this with a 2:1 gain:loss step‐matrix, as shown.

         To:
From:    0    1
0        0    2
1        1    0

This implies that the correct rooting of this tree is on the branch leading to taxon H. This is because such a rooting implies only changes from 1 to 0, which have a cost of one, resulting in a parsimony score of 7. A tree rooted on taxon A, for example, is less parsimonious because it implies changes from 0 to 1 (for all except character 1), and an overall cost of 13.
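The two costs quoted above can be checked with a small dynamic‐programming routine of the kind usually attributed to Sankoff. This is a sketch under stated assumptions, not the procedure of any particular program. Since the figure is not reproduced here, I assume the unrooted tree is the fully pectinate ("caterpillar") tree implied by the nested pattern of the matrix, with taxa attached in the order A, B, ..., H. The state at the root is left free, and the names COST, sankoff, and tree_cost are mine.

```python
INF = float("inf")

# Step-matrix: COST[(from_state, to_state)]; a gain (0 -> 1) costs 2, a loss 1.
COST = {(0, 0): 0, (0, 1): 2, (1, 0): 1, (1, 1): 0}

MATRIX = {
    "A": "0000000", "B": "1000000", "C": "1100000", "D": "1110000",
    "E": "1111000", "F": "1111100", "G": "1111110", "H": "1111111",
}


def sankoff(tree, i):
    """Return {state: cheapest cost of the subtree if its top node has that state}."""
    if isinstance(tree, str):
        observed = int(MATRIX[tree][i])
        return {s: (0 if s == observed else INF) for s in (0, 1)}
    child_costs = [sankoff(child, i) for child in tree]
    return {
        s: sum(min(COST[(s, t)] + costs[t] for t in (0, 1)) for costs in child_costs)
        for s in (0, 1)
    }


def tree_cost(tree):
    """Total cost over all seven characters, choosing the best root state for each."""
    return sum(min(sankoff(tree, i).values()) for i in range(7))


# The same (assumed) unrooted tree, rooted on the branch to H or to A.
ROOT_ON_H = ("H", ("G", ("F", ("E", ("D", ("C", ("B", "A")))))))
ROOT_ON_A = ("A", ("B", ("C", ("D", ("E", ("F", ("G", "H")))))))

print(tree_cost(ROOT_ON_H))
print(tree_cost(ROOT_ON_A))
```

With a symmetric step‐matrix (both off‐diagonal costs set to 1) the two calls return the same value, which is the unrooted situation described in the previous section.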
[Figure: the two rootings of this tree, labeled Cost = 7 and Cost = 13]

Parsimony can also select a rooted tree when some characters are judged to show irreversible evolution. This means that once an ancestor acquires character state ‘1’ all descendants retain state ‘1.’ Irreversible characters can be thought of as having an infinitely asymmetric step‐matrix for those parts of the tree that manifest the irreversible state. This helps explain why they yield a rooted tree. For example, if we concluded that all the characters in the matrix were irreversible, then the optimal rooted tree has state ‘0’ at the base for all characters, which would result in rooting the tree on taxon A.

The Dollo assumption (page XX) asserts that a more complex character state, ‘1’, can only arise once, whereas the simpler character state, ‘0,’ can arise many times. This may sound asymmetric, but actually, Dollo parsimony does not select a rooted tree. The Dollo assumption basically holds that all nodes on an unrooted tree that are on the path between tips that have state ‘1’ must have state ‘1.’ Having mapped a character this way, all rootings will have the same length.
This discussion should serve to show that you can infer the root of a tree without using prior phylogenetic knowledge if you are willing to assert that characters have asymmetric probabilities of changing in the two directions. But when should you assert that?
An asymmetric step‐matrix captures the assumption that the frequency of different character states has changed over time. For example, the 2:1 gain:loss step‐matrix given above implies that lineages acquire state 0 more frequently than they shift back from state 0 to state 1. This means that the number of lineages having state 1 will tend to decrease over time. Such an assumption is called non‐stationarity. It contrasts with the more normal assumption of stationarity, that the frequency of different states (and the probabilities of change) are constant throughout the phylogeny.
So, the decision to use an asymmetric step‐matrix should be based on evidence that there is likely to have been non‐stationary evolution in a group.
I can imagine cases where non‐stationarity is defensible. For example, if you know that a group has evolved in a period of steady climate warming you might expect a steady increase in traits associated with warmer conditions. In that case it might make sense to root the unrooted tree at the point that has the fewest warm weather adaptations. However, I think you will agree that such an inference is rarely going to be convincing. So while it is worth knowing about the theoretical possibility of character‐based rooting, you can hopefully see why it is very rarely used in practice.
Rate‐based rooting
As discussed in chapter 2, we often have reason to expect the rate of evolution to be reasonably similar in different lineages. This means that the length of different branches of a tree provides some information as to the likely location of the root. The tree shown is an unrooted phylogram for five living taxa: the length of each branch is proportional to the amount of change inferred to have occurred on that branch based on a particular data set.
Below are two alternative rootings of this tree: one on the branch leading to taxon A, and one on the internal branch separating D+E from A+B+C. While the parsimony score would be the same for these two trees, unless we have contrary evidence, the tree on the right is more likely to be true.
To see why, consider what the left‐hand tree implies. On the left‐hand tree the rate of evolution has varied greatly among branches. The lineage leading to A has evolved at about one‐tenth the rate of the lineage that leads to D and E. Even if the last common ancestor of A and E had a rate of evolution intermediate between A and E, this rooting implies that the rate of evolution itself must have evolved significantly.
The rate at which evolution happens is determined by features of the biology of organisms (see page xx): the efficiency with which mutations are corrected, the generation time, population size, the intensity of selection, etc. While these certainly do change, they probably change slowly. Therefore, we have reason to favor the right‐hand tree, which is compatible with near constancy of the rate of evolution, over the left‐hand tree, which implies dramatic changes in the rate of evolution.
In effect this is an extension of the principle of parsimony. Just as we favor a tree with the fewest character state changes, we also favor the tree with the fewest changes in the rate of evolution.
Rate‐based rooting is best implemented in the framework of statistical models of evolution, for example using maximum likelihood (see chaps X and XX). Nonetheless, even in a parsimony framework, when we lack good outgroup information, we can root trees using the principle that the rate of evolution is unlikely to be very variable among lineages.
The obvious procedure would be to take the most parsimonious tree and search for a position to place the root such that the variation in the rate of evolution among branches is minimized. This is possible, but a simpler strategy is usually used instead: midpoint rooting.
In midpoint rooting we calculate the patristic distance between each pair of taxa. Recall that the patristic distance is the sum of the length of all the branches on the path between two tips (page x). Then we place the root at the midpoint of the longest path.
Below is a tree with estimated branch lengths. To the right are the patristic distances between each pair of tips. The longest path is between taxa D and F, so the root is placed on the midpoint of that path, as shown in the rooted tree below. This example illustrates that we can often make reasonable rooting inferences even without outgroups, provided we are willing to assume that the rate of evolution is roughly similar among lineages.
[Figure: unrooted tree of taxa A to J with estimated branch lengths, and the same tree rooted at the midpoint (54.5) of the path between D and F]

Patristic distances between each pair of tips:

     A    B    C    D    E    F    G    H    I
A    -
B    18   -
C    25   23   -
D    94   96   94   -
E    17   3    22   95   -
F    97   99   106  109  98   -
G    20   8    25   96   7    101  -
H    21   9    26   97   8    102  3    -
I    18   6    22   94   5    99   2    3    -
J    19   7    24   97   6    100  9    10   7
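For readers who like to see a procedure spelled out, here is a minimal Python sketch of midpoint rooting. The branch lengths of the chapter's ten‐taxon tree cannot be recovered from the distance table alone, so the sketch uses a small hypothetical four‐tip tree of my own (tips A to D, internal nodes n1 and n2). The function and variable names are also mine. It returns the branch that should carry the root and how far along that branch the root lies.

```python
from itertools import combinations


def path_between(adj, start, goal):
    """Return the list of nodes on the unique path from start to goal."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for nxt in adj[node]:
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    raise ValueError("nodes are not connected")


def midpoint_root(edges):
    """Locate the midpoint of the longest tip-to-tip path.

    `edges` is a list of (node, node, length) for an unrooted tree.
    Returns ((x, y), offset, longest): the root sits on branch x-y,
    `offset` away from x, and `longest` is the longest patristic distance.
    """
    adj = {}
    for u, v, length in edges:
        adj.setdefault(u, {})[v] = length
        adj.setdefault(v, {})[u] = length
    tips = sorted(n for n in adj if len(adj[n]) == 1)

    best = None
    for a, b in combinations(tips, 2):
        path = path_between(adj, a, b)
        dist = sum(adj[x][y] for x, y in zip(path, path[1:]))  # patristic distance
        if best is None or dist > best[0]:
            best = (dist, path)

    longest, path = best
    half = longest / 2
    walked = 0.0
    for x, y in zip(path, path[1:]):
        if walked + adj[x][y] >= half:
            return (x, y), half - walked, longest
        walked += adj[x][y]


# Hypothetical tree: ((A:1, B:2):3, C:4, D:10), written as an edge list.
EDGES = [("A", "n1", 1), ("B", "n1", 2), ("n1", "n2", 3),
         ("C", "n2", 4), ("D", "n2", 10)]

print(midpoint_root(EDGES))
```

In this toy tree the longest path runs from B to D (length 15), so the root falls on the long branch leading to D, 2.5 units from the internal node n2. Applied to the chapter's table, the same logic picks the pair D and F (109) and places the root 54.5 from each.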
Avoiding common mistakes
Before leaving the topic of rooting it is important to highlight a few of the more common mistakes that arise from a misunderstanding of how trees are rooted in practice. Three problems have been most obvious to me in my interactions with students of the field.
The first mistake is to infer that a study supports ingroup monophyly when actually ingroup monophyly was assumed at the outset. For example, suppose that you conducted a phylogenetic analysis of animals, including two ascomycete fungi, yeast and Penicillium. Suppose that the tree to the right was obtained following bootstrap analysis. Would it be correct to state that there is 97% bootstrap support for monophyly of the animals?

[Figure: bootstrap tree with a bracket labeled Animals]
The answer is no. The 97% bootstrap indicates that for 97% of the bootstrap data sets, the unrooted tree contains an internal branch that separates yeast and Penicillium from the other six taxa. This tells us that either the ascomycete fungi or the animals are monophyletic, but does not tell us which. These data do not, therefore, support animal monophyly (although they do not contradict it).
The second mistake that I have seen is related to the previous, but usually occurs when there is a single outgroup taxon (and when the tree is drawn in a rooted format). To illustrate this consider this tree, which I generated from some real data (subsampled from Qiu et al. 2006).
[Figure: rooted tree with a bracket labeled Clade 1 and one branch marked with a question mark]
In analogous cases, I have observed students asking the question: why is there no bootstrap score on the branch marked with a question mark? An equivalent question is: what is the support for Clade 1? I have even reviewed papers submitted to scholarly journals whose authors were so sure that there should be a bootstrap score for clade 1 that they looked at how many bootstrap data sets had this clade, and found that the answer is 100%.
If you are wondering what is wrong with this reasoning, recall that whereas internal branches correspond to the meaningful splits that define the tree topology, external branches do not. External branches do not contain topological information; they just show where each taxon attaches to the unrooted tree. Every tree includes each taxon, so the same external branches must be present on all trees. The point to note is that the branch with a question mark is an external, not an internal, branch: it shows where the outgroup (the liverwort, Marchantia) attaches to the broader unrooted tree. Any rooted tree will have Marchantia sister to the entirety of clade 1. Thus the data do not contain any information on this relationship and, consequently, measures of support for this branch have no meaning.

[Figure: phylogeny of L. marmoratus (aquatic larvae) and the direct‐developing species D. ochrophaeus, D. apalachicola, D. monticola, D. fuscus, D. santeetlah, D. conanti, D. welteri, D. brimleyorum, and D. auriculatus]
The final kind of mistake involves incorrect inferences of trait evolution that are made on a rooted tree. This can be illustrated by considering the following hypothetical example (but involving a real situation). Suppose you generated the following phylogeny (based on Titus and Larson, 1996) for species of the salamander genus Desmognathus, and rooted the tree with a species of Leurognathus. You know that Leurognathus has aquatic larvae whereas these Desmognathus species are direct developers, forming adult salamanders directly from land‐laid eggs. What might you conclude about the evolution of life history?
You might be tempted to infer that, because the outgroup has aquatic larvae, direct development is a derived feature of Desmognathus. This inference is invalid.
Based on this unrooted tree, we can explain life history evolution with one change on the external branch separating Leurognathus and Desmognathus, the same branch to which the root attaches. What we cannot tell without more information is whether the change in life history lies on the Desmognathus side of the root, implying that direct development is derived, or on the Leurognathus side of the root, implying that aquatic larvae are derived in Leurognathus. In fact, examination of additional taxa suggests that Leurognathus evolved aquatic larvae from a direct developing ancestor (Titus and Larson, 1996).
[Figure: two alternative reconstructions for Leurognathus (aquatic larvae) and Desmognathus (direct development)]
Major points
Most methods of phylogenetic analysis specify an unrooted tree but do not select among the various possible rooted versions of that tree.

To avoid mistakes in the interpretation of rooted trees, it is important to understand the methods used to arrive at that rooting.

The most common way to root a tree is to use prior phylogenetic information. It is usual to include in a phylogenetic study additional taxa, outgroups, that are known to be outside the group whose relationships are being studied, so that the trees can be rooted at the end of the study.

Where prior knowledge does not allow trees to be rooted, information on the position of the root may (rarely) come from including characters with asymmetric probabilities of change. More commonly, a tree’s root is inferred by assuming that the overall rate of evolution is similar in different lineages.