Distribution of Passenger Mutations in Exponentially Growing Wave 0
Cancer Population
Yifei Chen1; Mentor: Rick Durrett2
1Duke University ’13; 2Department of Mathematics, Duke University
Abstract
We examined the occurrence of neutral, or “passenger,”
mutations in an exponentially growing population of type 0
cancer cells. Assuming mutations follow the infinite sites
model, we derived equations for the site frequency
spectrum of the population and of a sample of size n.
These depend on ν, the rate of passenger mutations per
cell division; γ, the rate at which families that do not die
out arise; x, the fraction of the population carrying a
mutation; and m, the number of individuals in the sample
with the mutation. To test these equations, we used Matlab
to simulate the population growth and mutations. A sample
of individuals was then drawn at random from the
population, and the site frequency spectrum was computed
for both the sample and the population as a whole. The
simulated data fit the derived equations well, except for
small x in the population frequency spectrum and the m=1
case of the sample site frequency spectrum. This
observation prompted the derivation of a more accurate
formula for the m=1 case (not shown here).
Introduction
Cancer cells accumulate mutations that confer selective advantages
in the microenvironment of the body, so they outcompete normal
cells and grow uncontrollably, invading neighboring tissues and
metastasizing to distant locations. Somatic mutations that give
cancer cells increased fitness are called driver mutations, while those
that do not affect fitness are called passenger mutations.
Identification of driver mutations in order to find treatments is an
important goal of cancer research. Current investigations involve
sequencing and analyzing mutations in a sample of 10-20 examples
of a particular cancer. However, since many genes are sequenced, it is
difficult to tell whether a mutation is a driver or a passenger. A better
understanding of the frequency distribution of passenger mutations will
help to distinguish between the two types.
Mathematical Model: Site Frequency Spectrum for Type 0 Cells
The cancer cells are modeled by a branching process Z0(t) in which
individuals give birth at rate a0 and die at rate b0 < a0. When Z0(t) is
conditioned not to die out and Y0(t) denotes the number of
individuals whose families do not die out, Y0(t) is a Yule process
in which births occur at rate γ = λ0/a0, where λ0 = a0 − b0.
In investigating the frequencies of neutral mutations in the branching
process under the infinite sites model, it is only necessary to
look at those that occur in Y0(t). In the Yule process, the waiting times
for mutations and for branching of lineages are both exponential, so the
number of mutations that occur while there is a given number of lineages
has a shifted geometric distribution with p = γ/(γ+ν), where ν is the
passenger mutation rate. Also, in a Yule process the limiting fraction of
the population descended from an individual has a beta
distribution. Using these properties, the site frequency spectrum for
the population, F(x), and for a sample of size n, Eηn,m, is derived to be:
\[ F(x) = \frac{\nu}{\gamma x} \]

\[ E\eta_{n,m} =
\begin{cases}
\dfrac{n\nu}{\gamma}\,\log(N\gamma) & \text{when } m = 1, \\[1ex]
\dfrac{n\nu}{\gamma}\cdot\dfrac{1}{m(m-1)} & \text{when } 2 \le m \le n-1.
\end{cases} \]

F(x) is the expected number of mutations present in more than a
fraction x of the population.
Eηn,m is the expected number of sites in a sample of size n with m
mutants.
ν is the passenger mutation rate.
γ is the rate of branching/birth of a new family that does not die out.
Nγ is the size of the population of the Yule process.

The mathematical model predicts that many mutations occur in a very
small portion of the population, with few occurring in more than 20% of
the population. This is expected because genealogies in an exponentially
growing population tend to be star-shaped, so for a mutation to be
present in a large proportion of the population, it would have to have
occurred very early on.

FIGURE 2. Site frequency spectrum results for a sample of size 10
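As a rough consistency check between the population and sample formulas (a sketch, not necessarily the derivation behind the poster), take F(x) = ν/(γx) and assume that a mutation at population frequency x is carried by each of the n sampled individuals independently with probability x. Because F(x) counts mutations above frequency x, −F′(x) = ν/(γx²) acts as a frequency density, and for 2 ≤ m ≤ n−1:

```latex
\begin{aligned}
E\eta_{n,m}
  &= \int_0^1 \binom{n}{m} x^{m} (1-x)^{n-m}\,\bigl(-F'(x)\bigr)\,dx
   = \frac{\nu}{\gamma}\binom{n}{m}\int_0^1 x^{m-2}(1-x)^{n-m}\,dx \\
  &= \frac{\nu}{\gamma}\cdot\frac{n!}{m!\,(n-m)!}\cdot\frac{(m-2)!\,(n-m)!}{(n-1)!}
   = \frac{n\nu}{\gamma}\cdot\frac{1}{m(m-1)} .
\end{aligned}
```

For m = 1 the same integral diverges at x = 0, which is consistent with the singleton case needing a separate term, log(Nγ), that grows with the population size.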
Results for Site Frequency Spectrum of Population
To simulate the genealogy of an exponentially growing
population of cancer cells, we used Matlab’s built-in random number
generator for the uniform (0, 1) distribution to create distinct
lineages, since the fraction of individuals descended from one half of
a branching lineage is uniformly distributed. The level of the
branching is determined by the number of breakpoints, k, which
create k+1 distinct lineages, and the interval between adjacent
breakpoints indicates the proportion of the population in that lineage.
At each level, the number of mutations is given by a shifted geometric
distribution with p = γ/(γ+ν), so to simulate the accumulation of
passenger mutations in the population, we used Matlab’s built-in
geometric random number generator. The simulation records where
each mutation occurred on the genealogy, and a histogram is
generated to show how many mutations are present in more than a
fraction x of the population.
The simulation was run 100 times with γ=0.01 and ν=0.01, down to
level 1000. The number of mutations that occurred in each interval
was averaged and graphed along with the expected
result, F(x) = ν/(γx) = 0.01/(0.01*x).
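The procedure described above could be sketched as follows. This is a minimal Python illustration, not the Matlab code used for the poster, and all function names are hypothetical. Several details are assumptions: the shifted geometric is taken to count failures before the first success (support 0, 1, 2, ...), level k is taken to have k lineages starting from a single lineage, each mutation is placed on a lineage chosen uniformly at random, and sampled individuals are represented as uniform points on (0, 1).

```python
import bisect
import math
import random


def geometric_failures(p, rng):
    """Failures before the first success, support 0, 1, 2, ... (assumed convention)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - p))


def simulate_mutations(levels, nu, gamma, rng):
    """Return one (lo, hi) interval per mutation; hi - lo is its population fraction."""
    p = gamma / (gamma + nu)
    cuts = [0.0, 1.0]  # sorted interval endpoints; level k has k lineages
    mutations = []
    for k in range(1, levels + 1):
        for _ in range(geometric_failures(p, rng)):
            i = rng.randrange(k)  # lineage hit by this mutation (assumed uniform)
            mutations.append((cuts[i], cuts[i + 1]))
        bisect.insort(cuts, rng.random())  # new breakpoint gives k + 1 lineages
    return mutations


def tail_count(mutations, x):
    """Number of mutations present in more than a fraction x of the population."""
    return sum(1 for lo, hi in mutations if hi - lo > x)


def sample_sfs(mutations, n, rng):
    """eta[m] = mutations carried by exactly m of n sampled individuals (eta[0] = unseen)."""
    sample = sorted(rng.random() for _ in range(n))
    eta = [0] * (n + 1)
    for lo, hi in mutations:
        m = bisect.bisect_left(sample, hi) - bisect.bisect_left(sample, lo)
        eta[m] += 1
    return eta


def expected_tail(x, nu, gamma):
    """F(x) = nu / (gamma * x)."""
    return nu / (gamma * x)


def expected_sample_sfs(n, m, nu, gamma, n_gamma=None):
    """Formula for E eta_{n,m}; the m = 1 case needs the Yule population size n_gamma."""
    if m == 1:
        if n_gamma is None:
            raise ValueError("m = 1 needs the Yule population size n_gamma")
        return n * nu / gamma * math.log(n_gamma)
    return n * nu / gamma / (m * (m - 1))


if __name__ == "__main__":
    rng = random.Random(0)
    nu = gamma = 0.01
    runs, levels, n = 100, 1000, 10  # settings quoted in the text
    tail_total, eta_total = 0, [0] * (n + 1)
    for _ in range(runs):
        muts = simulate_mutations(levels, nu, gamma, rng)
        tail_total += tail_count(muts, 0.2)
        for m, count in enumerate(sample_sfs(muts, n, rng)):
            eta_total[m] += count
    print("mean count above x=0.2:", tail_total / runs,
          "F(0.2):", expected_tail(0.2, nu, gamma))
    print("mean eta[2]:", eta_total[2] / runs,
          "formula:", expected_sample_sfs(n, 2, nu, gamma))
```

Storing each mutation as an interval rather than just a length lets the same simulated genealogy serve both the population count and the sample count, since a sampled point carries a mutation exactly when it falls inside that mutation's interval.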
FIGURE 1. Site frequency spectrum results for the population. The
count at x includes the count of all those greater than x.
As seen in figure 1, the formula fits the simulated data very well
except for small values of x.
Results for Site Frequency Spectrum of Sample Size n
The simulation begins with the same steps as for the population
frequency. Then, Matlab samples n individuals from the population
and counts how many, m, of those individuals have each mutation.
The simulation was run 100 times with γ=0.01 and ν=0.01, down to level
1000, with a sample of size 10. The number of sites at which m
individuals in the sample have the mutation was averaged and
plotted along with the expected value at each m based on the
equation:
\[ E\eta_{n,m} =
\begin{cases}
\dfrac{n\nu}{\gamma}\,\log(N\gamma) & \text{when } m = 1, \\[1ex]
\dfrac{n\nu}{\gamma}\cdot\dfrac{1}{m(m-1)} & \text{when } 2 \le m \le n-1.
\end{cases} \]
From figure 2, it is evident that for m>1 the fit between theory
and simulation is very close. However, for m=1, the
formula for Eηn,m predicts many more singletons than are observed.
After this work was completed, an improved formula was
derived, which predicts 36.6 for m=1.
Summary
1. We studied an exponentially growing population of type 0 cancer
cells with mutations following the infinite sites model and derived
equations for the site frequency spectrum for the population and
for a sample of size n.
2. We used Matlab to produce simulations to test the theoretical
predictions about the site frequency spectrum.
3. The fit of the mathematical model to the simulated data
is generally very good, except for small x in
the population frequency and for m=1 in the site frequency spectrum
of the sample.
References
Durrett, R. Population genetics of neutral mutations in exponentially
growing cancer cell populations. In preparation