Document 10910775

advertisement
Hindawi Publishing Corporation
International Journal of Stochastic Analysis
Volume 2013, Article ID 826321, 23 pages
http://dx.doi.org/10.1155/2013/826321
Review Article
Simulating the Emergence of Mutations and
Their Subsequent Evolution in an Age-Structured Stochastic
Self-Regulating Process with Two Sexes
Charles J. Mode,1 Candace K. Sleeman,2 and Towfique Raj3
1
Department of Mathematics, Drexel University, Philadelphia, PA 19104, USA
Nokia Corporation, 5 Wayside Road, Burlington, MA 01803, USA
3
Division of Genetics, Harvard Medical School, Brigham and Women’s Hospital, Boston, MA 02115, USA
2
Correspondence should be addressed to Charles J. Mode; cjmode@comcast.net
Received 28 August 2012; Revised 30 November 2012; Accepted 3 December 2012
Academic Editor: Peter Olofsson
Copyright © 2013 Charles J. Mode et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The stochastic process under consideration is intended to be not only part of the working paradigm of evolutionary and population
genetics but also that of applied probability and stochastic processes with an emphasis on computer intensive methods. In particular,
the process is an age-structured self-regulating multitype branching process with a genetic component consisting of an autosomal
locus with two alleles for females and males. It is within this simple context that mutation will be quantified in terms of probabilities
that a given allele mutates to the other per meiosis. But, unlike many models that are currently being used in mathematical population genetics, in which natural selection is often characterized in terms of parameters called fitness by genotype or phenotype, in this
paper the parameterization of submodules of the model provides a framework for characterizing natural selection in terms of some
of its components. One of these modules consists of reproductive success that is quantified in terms of the total expected number
of offspring a female contributes to the population throughout her fertile years. Another component consists of survival probabilities
that characterize an individual’s ability to compete for limited environmental resources. A third module consists of a parametric
function that expresses the probabilities of survival in a birth cohort of individuals by age for both females and males. A forth module
of the model as an acceptance matrix of conditional probabilities such female may show a preference for the genotype or phenotype
as her male sexual partner. It is assumed that any force of natural selection acts at the level of the three genotypes under consideration
for each sex. By assigning values of the parameters in each of the modules under consideration, it is possible to conduct Monte
Carlo simulation experiments designed to study the effects of each component of selection separately or in any combination on a
population evolving from a given initial population over some specified period of time.
1. Introduction
Comparatively little attention has been given to models on
the evolution of age-structured populations in the extensive
literature on evolutionary and population genetics. Much of
the work on this class of models has, however, been reviewed
and extended in the book by Charlesworth [1] on evolution
in age structured populations. This book also contains a long
list of references citing papers that are quite well known by,
among others, Norton, Haldane, and Fisher, but it is beyond
the scope of this paper to provide an overview of the content
of these early works. It is of interest to note, however, that
Norton began work on models of age structured populations
as early as 1910, soon after Mendel’s laws of inheritance had
become widely known. For the most part, models on the
evolution of age structured populations that have been presented in this literature belong to the deterministic paradigm,
but in this paper a special class stochastic models on the
evolution of age structured populations will be the focus of
attention.
This special class of stochastic processes is rooted in ideas
from branching processes that also have an extensive literature dating back to about 100 years. Rather than attempting to
cite papers from this extensive literature, books on branching
processes will be cited which do contain extensive lists of
literature. During the second half of the 20th century, a book
2
by Harris [2] provided a stimulus for other workers enter the
field of branching processes. In the period that followed the
publication of this book, other books on this subject were
published. Included in these books are those of Athreya and
Ney [3], Jagers [4], and Mode [5]. In particular, the class of
branching process under consideration in this paper is an
extension of the general class of branching processes presented in Jagers, that contains a one type formulation of this
process and Mode, which contains a multitype version of the
general branching process. The multitype case of the general
processes is the most useful when formulating processes
applicable to evolutionary and population genetics, because
this case accommodates not only mutations among the types
but also differences in reproductive success among the types
as well as the age structure of an evolving population. Other
more recent books on branching processes are Asmussen and
Hering [6], Kimmel and Axelrod [7], and Haccou et al. [8].
Each of these books has an extensive list of references that an
interested reader may wish to explore.
The working paradigm underlying the class of branching
processes to be formulated and studied in this paper differs
markedly from that used in the books cited above. In these
books, as well as the literature on branching processes
published elsewhere, emphasis was placed on finding
threshold conditions under which the population either
became extinct or its total number, total population size,
would increase without bound. Perhaps the most significant
departure from the working paradigm of classical branching
processes considered in this paper is that the process is
self-regulating in the sense that its total size is dependent on
the carrying capacity of the environment expressed in terms
of parameters of survival functions, in which the parameter
values may differ among genotypes. Because the survival of
individuals during each time period under consideration is a
non-linear function of total population size, the formulation
being considered has nonlinear properties which arise
in connection with the interaction of individuals in the
population, as they compete for food and other resources.
There are also other non-linear properties of the formulation
under consideration that arise in connection with females
selecting male sexual partners by genotypes that leads to the
production of offspring that are necessary for the long-term
survival of a population. Given a formulation that accommodates selection of mates by genotype, it is possible to study
the effects of sexual selection on the long-term evolution of
an age-structured population. For the most part, interactions
among individuals were not accommodated in the formulations of classical branching processes in the books cited
above, but the more recent book by Haccou et al. [8], where a
one type a density dependent process is formulated in terms
of, among other things, the well-known logistic function that
played a role in the development of chaos theory that may be
of interest to readers, but it is beyond the scope of this paper
to present further details here. It should also be mentioned
that, unlike the classical analysis of branching processes that
often focuses on limit theorems, it is possible to study the long
transient period of evolution a process before the process
converges to some limit by using Monte Carlo simulation
methods.
International Journal of Stochastic Analysis
There is another aspect of the formulation to be considered in this paper which is also a significant departure
from the working paradigm of classical branching processes,
namely, the embedding of systems of nonlinear difference
equations in the stochastic processes, which will be discussed
in detail in a subsequent section and will be referred to as
the embedded deterministic model. After software had been
written to provide a computer implementation of the embedded deterministic model, given a set of numerical assignments to the parameters, it was possible to conduct computer
simulation experiments, such that 10,000 to 20,000 years
of evolution of an age structured population may be simulated
with a computer running time from one to four minutes.
Given this rapid computer response time, it was easy to determine whether a particular set of assigned numerical values
of the parameters belong to a set of points in the parameter
space, such that the population becomes extinct or grows
to a total size where environmental resources limit further
growth. Consequently, in this paper the task of finding criteria that partition a multidimensional parameter space into
regions such that, according to the embedded deterministic
model, the population either becomes extinct or its total size
approaches some limit determined by the environment will
not be under taken, but it is hoped that some analysts may
be stimulated to undertake this task. After interesting regions
of the parameter space have been found while using the
embedded deterministic model, a Monte Carlo simulation
experiment is conducted in which sample realizations of the
sample functions of the stochastic process are computed
using the same parameter values as those used in an experiment with embedded deterministic model to test the extent
to which the predictions of deterministic are consistent with
those of the process. As will be shown by examples, the
predictions of the embedded deterministic process are not
always consistent with those of the process, which is another
facet of how the working paradigm of this paper differs from
that of classical branching processes. It should also be mentioned that the formulation of an age structured process is
essentially a complete revision of the age structured process
considered in chapter 12 of Mode and Sleeman [9], where all
reported computer experiments were based on applications
of the embedded deterministic model.
2. Genotypes, Gametes, Mutation, and Types of
Matings in an Age Structured Population
Only three genotypes with respect to two alleles, 𝐴 and π‘Ž, at
some autosomal locus will be considered, and let
G = {𝐴𝐴, π΄π‘Ž, π‘Žπ‘Ž}
(1)
denote the set of three genotypes. To simplify the notation,
elements of G will be denoted by the symbol 𝜏, and to distinguish the sexes, female and male genotypes will be denoted
by πœπ‘“ and πœπ‘š , respectively. In what follows, the genotypes in
the set G will sometimes be denoted by 1, 2, and 3 and will
be made clear what sex is under consideration. The number
of age classes under consideration is π‘Ÿ for both sexes, and the
ages of females will be denoted by π‘₯ = 0, 1, 2, . . . , π‘Ÿ, where
International Journal of Stochastic Analysis
3
π‘₯ = 0 denotes infants or newborns. Similarly, ages of males
will be denoted by 𝑦 = 0, 1, 2, . . . , π‘Ÿ. Let π‘₯π‘š denote the
minimum age at which fertile females may bear children, and
let π‘₯max denote the maximum age at which they fertile. Let
𝑦 = π‘¦π‘š , . . . , π‘Ÿ denote the ages of males, such that they may
sire offspring. From now on such males will be referred to
as fertile. It will be assumed for the sake of simplicity that
π‘₯π‘š = π‘¦π‘š .
Mutations among the alleles 𝐴 and π‘Ž, which will be
assumed to occur during meiosis in both sexes, will be
formulated in terms of conditional probabilities in the 2 × 2
matrix
πœ‡ πœ‡
M = ( 11 12 ) ,
πœ‡21 πœ‡22
(2)
where πœ‡12 is the conditional probability that allele 𝐴 mutates
to allele π‘Ž per meiosis. The conditional probability that allele
does not mutate is πœ‡11 = 1 − πœ‡12 . Similarly, πœ‡21 is the conditional probability that allele π‘Ž mutates to 𝐴 per meiosis and
πœ‡22 = 1 − πœ‡21 . It will be assumed that these conditional probabilities hold for both sexes as well as throughout the fertile
ages for both females and males.
The next step in the formulation of the model is to
derive a gametic distribution for each of the three genotypes,
which will be symbolized as follows. Let 𝑖 = 1 denote allele
𝐴, and let 𝑗 = 2 denote allele π‘Ž, then the set of two alleles
under consideration may be denoted by A = {1, 2}. Similarly,
the set of three genotypes will be denoted by G = {(1, 1),
(1, 2), (2, 2)} = {1, 2, 3}. When there are mutations among the
two alleles, each of these genotypes may produce a gamete
containing an allele 𝑗 ∈ A. Let 𝑝𝑔 (𝑖; 𝑗) denote the conditional
probability that genotype 𝑖 ∈ G produces the gamete 𝑗 ∈
A. By way if an illustration, a formula for the conditional
probability 𝑝𝑔 (1; 1) will be derived. It will be assumed that
the allele on the left in the symbol 𝐴𝐴 was contributed by
the maternal parent of an individual and the one on the
right was contributed by the paternal parent. The conditional
probability that the allele on the left in the genotype 𝐴𝐴 is
contributed to the gamete pool of the population is πœ‡11 /2.
Similarly, the conditional probability that the allele on the
right is contributed to the gamete pool is πœ‡11 /2. By adding
these probabilities it follows that 𝑝𝑔 (1; 1) = πœ‡11 . By a similar
argument, it can be shown that 𝑝𝑔 (1; 2) = πœ‡12 . Furthermore,
by continuing this line of reasoning, it can be shown that
𝑝𝑔 (2; 1) = (πœ‡11 + πœ‡21 )/2, 𝑝𝑔 (2; 2) = (πœ‡12 + πœ‡22 )/2, 𝑝𝑔 (3; 1) =
πœ‡21 , and 𝑝𝑔 (3; 2) = πœ‡22 .
A mating between a female and male of genotypes 𝑖 and
𝑗, respectively, will be denoted by the mating type πœ… = (𝑖, 𝑗).
Because there are three genotypes of each sex, it follows that
there are 9 types of matings. Throughout this paper the set of
mating types will be denoted by
T = {(1, 1) , (1, 2) , (1, 3) , (2, 1) , (2, 2) ,
(2, 3) , (3, 1) , (3, 2) , (3, 3)}
(3)
in the order indicated. For every mating type πœ… ∈ T, let 𝑝(πœ…; 𝜈)
denote the conditional probability that a mating of type πœ… =
(𝑖1 , 𝑖2 ) produces an offspring of genotype 𝜈 = (𝑖, 𝑗) ∈ G. By
assumption females and males have the same gametic distributions, and it also assumed that female and male gamete
combine independently. From these assumptions, it follows
that
𝑝 (πœ…; 𝜈) = 𝑝𝑔 (𝑖1 , 𝑖) 𝑝𝑔 (𝑖2 , 𝑗) ,
(4)
for all couple types πœ… = (𝑖1 , 𝑖2 ) ∈ T and genotypes 𝜈 =
(𝑖, 𝑗) ∈ G. In computer implementations of the model under
consideration, these conditional probabilities are computed
in the order indicated in the set T.
As discussed in Mode and Sleeman [9] in chapter 12,
a module for couple formation, depending on the ages of
the female and male, is an integral part of the formulation
of an age dependent stochastic population process, then
the number of couple types may become very large, which
leads to the necessity of processing large arrays in computer
implementations of the process. The processing of large arrays
in a computer, in turn, becomes problematic, unless some
scheme is adopted to reduce the number of couple types. It is
of interest, therefore, to formulate a two sex age dependent
population process in which sexual contacts may occur
between females and males but without partnerships or couples which may last for long time periods. Accordingly, the
purpose of this section is to formulate a process that includes
sexual contacts between females and males but not partnerships of females and males of long duration but are transient
and may vary from year to year among age groups of fertile
females and males. This approach mating system modelled
by this approach may be justified by assuming this type of
polygamy was part of the evolution of the human species
before the idea of long-term marital partnerships evolved.
3. Births in a Two Sex Age
Dependent Population Process
without Couple Formation
When an age structured population is under consideration,
individuals in a population will also be classified by type. If
for example, a female of genotype πœπ‘“ = 𝑖 is of age π‘₯, then
her type will be denoted by t𝑓 = (𝑖, π‘₯). Similarly, a male type
will be denoted by tπ‘š = (𝑗, 𝑦), where 𝑦 is his age and 𝑖 and 𝑗
belong to the set G of genotypes. At this point in defining the
components of the model, it should be mentioned that age
and time will expressed in terms of a lattice: 𝑇 = (0, β„Ž, 2β„Ž,
3β„Ž, . . .), where β„Ž is some time unit. For example, if the
evolution of a human population is under consideration, then
usually β„Ž will be a year. Hence, from now on β„Ž = 1. For every
𝑑 ∈ 𝑇, let X(𝑑) denote 3 by π‘Ÿ + 1 matrix valued function with 3
rows and π‘Ÿ + 1 columns, where 𝑋(𝑑; 𝑖, π‘₯) denotes the number
of females to type t𝑓 = (𝑖, π‘₯) at time 𝑑 ∈ 𝑇 in a population.
The 3 by π‘Ÿ + 1 matrix Y(𝑑) is defined similarly for males. To
reduce the size of the arrays to be processed in a computer,
it will be assumed that sexual contacts between females and
males do not depend on the age of the male but only on
the frequency of the male’s genotype. For 𝑦 = π‘¦π‘š , . . . , π‘Ÿ, let
the random function π‘Œ(𝑑; 𝑖, 𝑦) denote the number of males
of type tπ‘š = (𝑖, 𝑦) in the population at time 𝑑. Observe that
the random function π‘Œ(𝑑; 𝑖, 𝑦) is the number of individuals of
4
International Journal of Stochastic Analysis
genotype 𝑖 and age 𝑦 in the males population at time 𝑑 ∈ 𝑇.
Then it follows that, the random function
sexual contacts with a males of genotype 𝜈 during the time
interval [𝑑, 𝑑 + β„Ž). Then, let
π‘Ÿ
Z (𝑑; 𝑖, π‘₯) = (𝑍 (𝑑; 𝑖, π‘₯, 𝜈) | 𝜈 = 1, 2, 3)
π‘Œ (𝑑; 𝑖) = ∑ π‘Œ (𝑑; 𝑖, 𝑦)
𝑦=π‘¦π‘š
(5)
is the total number of fertile males of genotype 𝑖 in the
population at time 𝑑. Therefore, the frequency of fertile males
of genotype 𝑖 in the population at time 𝑑 is given by the
random function
π‘Œ (𝑑; 𝑖)
,
π‘ˆπ‘š (𝑑; 𝑖) =
π‘Œ (𝑑; ∘)
(6)
π‘Œ (𝑑; ∘) = ∑π‘Œ (𝑑; 𝑖)
(7)
for π‘Œ(𝑑; ∘) > 0, where
𝑖
𝛼𝑓 (1, 1) 𝛼𝑓 (1, 2) 𝛼𝑓 (1, 3)
A𝑓 = (𝛼𝑓 (2, 1) 𝛼𝑓 (2, 2) 𝛼𝑓 (2, 3))
𝛼𝑓 (3, 1) 𝛼𝑓 (3, 2) 𝛼𝑓 (3, 3)
π‘ˆπ‘š (𝑑; 𝑗) 𝛼𝑓 (𝑖, 𝑗)
∑𝑗 π‘ˆπ‘š (𝑑; 𝑗) 𝛼𝑓 (𝑖, 𝑗)
(8)
(9)
is the conditional probability that a female of genotype 𝑖 has a
sexual contact with a male of genotype 𝑗 during the any time
interval [𝑑; 𝑑 + 1), and let
𝛾𝑓 (𝑑; 𝑖) = (𝛾𝑓 (𝑑; 𝑖, , 𝑗) | 𝑗 = 1, 2, 3)
Z (𝑑; 𝑖, π‘₯) ∼ CMultinom (𝑋 (𝑑; 𝑖, π‘₯) , 𝛾𝑓 (𝑑; 𝑖)) ,
(10)
denote a 1×3 vector of these conditional probabilities. At time
𝑑, for π‘₯ = π‘₯π‘š , . . . , π‘₯max , let the random function 𝑋(𝑑; πœπ‘“ , π‘₯)
denote the number of fertile females of type t𝑓 = (𝑖, π‘₯) in the
population at time 𝑑, and let the random function 𝑍(𝑑; t𝑓 , πœπ‘š )
denote the number of females of type t𝑓 = (𝑖, π‘₯) who have a
(12)
for π‘₯ = π‘₯π‘š , . . . , π‘₯max and 𝑖 = 1, 2, 3. More precisely, the
components of the vector are
(13)
Observe for each 𝑖 = 1, 2, 3 the elements in the vector
𝑍(𝑑; 𝑖, π‘₯, 𝑗) for 𝑗 = 1, 2, 3 are the random number of females
of genotype 𝑖 and age π‘₯ that have sexual contacts with a males
of genotype 𝑗. Altogether there are 9 types of sexual contacts
when there are 3 genotypes of each sex as was shown in the
previous section. The random numbers of these 9 types of
contacts during any time interval [𝑑, 𝑑 + 1) is given by the
random vector
Z (𝑑; π‘₯) = (Z (𝑑; 𝑖, π‘₯) | 𝑖 = 1, 2, 3) .
denote the 3×3 matrix of acceptance probabilities for females.
For example, 𝛼𝑓 (1, 1) is the conditional probability that a
female of genotype 1 finds a male of genotype 1 acceptable as
a sexual partner, and in general, given a female of genotype
𝑖, 𝛼𝑓 (𝑖, 𝑗) is the conditional probability that she finds a male
of genotype 𝑗 acceptable as a sexual partner. For the sake of
simplicity, it will be assumed that this matrix does not depend
on the age π‘₯ of a fertile female. If 𝛼𝑓 (𝑖, 𝑗) = 1 for all pairs
(𝑖, 𝑗), then, by definition, females select male genotypes at
random for sexual contacts. But, if, for example, 𝛼𝑓 (𝑖, 3) is
greater than the other two probabilities for all 𝑖 = 1, 2, 3, then
all females prefer males of genotype 3 as sexual partners. This
is an example of what is called sexual selection.
Given these definitions and an application of Bayes’s rule,
it follows that
𝛾𝑓 (𝑑; 𝑖, 𝑗) =
denote a 1×3 random vector whose elements are the indicated
random functions. It will be assumed that this random
vector has a conditional multinomial distribution with index
𝑋(𝑑; 𝑖, π‘₯) and probability vector 𝛾𝑓 (𝑑; 𝑖). In symbols,
Z (𝑑; 𝑖, π‘₯) = (𝑍 (𝑑; 𝑖, π‘₯, 1) , 𝑍 (𝑑; 𝑖, π‘₯, 2) , 𝑍 (𝑑; 𝑖, π‘₯, 3)) .
and π‘ˆπ‘š (𝑑; 𝑖) = 0 if π‘Œ(𝑑; ∘) = 0, where 𝑖 = 1, 2, 3.
With a view towards minimizing the size of arrays that
need to be processed in a computer, it will be assumed that
for each fertile female of age π‘₯ expresses preferences for males
sexual partners by genotype. To simplify the notation, denote
the three genotypes under consideration by 1 = 𝐴𝐴, 2 = π΄π‘Ž,
and 3 = π‘Žπ‘Ž, and let
(11)
(14)
Sexual selection, whereby females of all genotypes prefer
males of one genotype over the others, is thought to be one
of the driving forces of natural selection. The process just
described accommodates this component of natural selection. It should also be mentioned that this simple mating process just described not only simplifies the computer implementation of the two-sex model under consideration, but it
may also be a plausible model for the evolution of a population prior to the time long-term monogamous sexual
partnerships evolved.
Reproductive success is also thought to be a component
of natural selection. For each fertile female, who has sexual
contacts with a fertile male, this component will be characterized by the parameters πœ†(𝑖, π‘₯) denoting the expected number
of offspring each fertile female type t𝑓 = (𝑖, π‘₯) contributes to
the population during a time interval [𝑑, 𝑑 + 1) for 𝑖 = 1, 2, 3.
Human females are most fertile during their twenties, and the
value of πœ†(𝑖, π‘₯) starting with πœ†(𝑖, π‘₯π‘š ) will increase with π‘₯ up
to about age 25 and then decreases to some value πœ†(𝑖, π‘₯max )
at age π‘₯max , the age at which a female is no longer fertile. Let
πœ† 𝑖 denote the expected number of offspring produced by a
female of genotype 𝑖 throughout her fertile ages. Then, for
every 𝑖 = 1, 2, 3,
π‘₯max
∑ πœ† (𝑖, π‘₯) = πœ† 𝑖 .
π‘₯=π‘₯π‘š
(15)
When assigning values to the parameters of a model in the
process of desiging a computer experiment, suppose one
assigns a value to each πœ† 𝑖 for 𝑖 = 1, 2, 3. For example, for the
case of a population with high fertility and no selection
among the three genotypes, then one could assign the value
International Journal of Stochastic Analysis
5
πœ† 𝑖 = 10 for 𝑖 = 1, 2, 3. Whereas, by way of an illustrative
example, if it is assumed that selection is such that πœ† 𝑖 increases
with 𝑖, then the assigned values of the parameters could be
πœ† 1 = 10, πœ† 2 = 11, and πœ† 3 = 12, so that, from the point of
view of natural selection, genotype 3 would have a selective
advantage over the other two genotypes.
Given (15), a question that naturally arises is how should
one distribute the assigned value πœ† 𝑖 over the fertile ages, such
that (15) holds? One approach to answer to this question is to
find a probability density 𝑓(π‘₯), such that
π‘₯max
∑ 𝑓 (π‘₯) = 1.
(16)
π‘₯=π‘₯π‘š
Then, if πœ†(𝑖, π‘₯) = 𝑓(π‘₯)πœ† 𝑖 , it follows that
π‘₯max
∑ πœ† (𝑖, π‘₯) = πœ† 𝑖 ,
(17)
π‘₯=π‘₯π‘š
for all 𝑖 = 1, 2, 3, so that (15) is satisfied. The density function
𝑓(π‘₯) may be called the fertility distribution of females during
the fertile ages.
Among the many choices of the density function 𝑓(π‘₯) is
a truncated Poisson distribution with parameter πœ† 0 > 0. It
is well known that the mode of a Poisson density function
is at the value π‘₯ = πœ† 0 when πœ† 0 is a positive integer. If, for
example, a female becomes fertile at age π‘₯π‘š = 15, then, under
the assumption that her peak in fertility is at age 25, it follows
that one should choose πœ† 0 = 10, so that the mode of the
distribution is age at age π‘₯ = 15 + 10 = 25. Next suppose that
the maximum age a female is fertile is π‘₯max = 40. Then the
number of numbers between π‘₯π‘š and π‘₯max starting at π‘₯π‘š = 15
and ending in 40 is 40 − 14 = 26. To create a truncated
Poisson distribution, compute the numerical values of the
density function
𝑦
(πœ† )
𝑔 (𝑦) = exp [−πœ† 0 ] 0
𝑦!
(18)
for 𝑦 = 0, 1, 2, . . . , 25. The next is to compute the normalizing
constant
25
𝐢 = ∑ 𝑔 (𝑦) .
(19)
𝑦=0
Let 𝑓(π‘₯; πœ† 0 , 40) denote the probability density function of
Poisson distribution truncated at age 40. Then, for π‘₯ = 15 + 𝑦
𝑓 (π‘₯; πœ† 0 , 40) =
𝑔 (𝑦)
𝐢
(20)
for 𝑦 = 0, 1, . . . , 25. Other choices of the density of the distribution of the age of child bearing could be made, but, for the
sake of simplicity, in all the computer experiments reported
in this paper, this distribution will be that of the truncated
Poisson described above.
In what follows, a well-known property of the family
of Poisson distributions will be used repeatedly. Let 𝑋𝑖 for
𝑖 = 1, 2, . . . , 𝑛 be a collection of independent Poisson random
variables with expectations 𝛼𝑖 > 0 for all 𝑖. Furthermore,
suppose each of these random variables takes values in the
set of nonnegative integers {π‘₯ | π‘₯ = 0, 1, 2, 3 . . .}. Then, the
random variable π‘Œ = 𝑋1 + 𝑋2 + ⋅ ⋅ ⋅ + 𝑋𝑛 has a Poisson
distribution with expectation (parameter) 𝛽 = 𝛼1 + 𝛼2 + ⋅ ⋅ ⋅ +
𝛼𝑛 . It should be remarked that this property is easy to prove
using probability generating functions. In what follows, it
will be assumed that the number of females 𝑍(𝑑, 𝑖, π‘₯, 𝑗) of
age π‘₯ with sexual contacts of type (𝑖, 𝑗) produce offspring
independently, and, moreover, females of different ages also
produce offspring independently. It should also be mentioned
that all conditional distributions and expectations are conditioned on the state of (X(𝑑), Y(𝑑)) at time 𝑑, so that the idea of
conditional independence is in force.
In principle, under the assumption that the number 𝑁𝑖
of offspring produced by an individual female of genotype
𝑖 follows a Poisson distribution with expectation πœ† 𝑖 , then
given a realization of the random variable 𝑍(𝑑; 𝑖, π‘₯, 𝑗) and
the expected value πœ†(𝑖, π‘₯) = πœ† 𝑖 𝑓(π‘₯; πœ† 0 , 40), it follows that
the random number of offspring produced by females of
genotype 𝑖 and age π‘₯ that had contacts with a males of genotype 𝑗 would follow a Poisson distribution with expectation
𝑍(𝑑; 𝑖, π‘₯, 𝑗)πœ†(𝑖, π‘₯). Then, because it is assumed that females
of different ages also produce offspring independently, it
follows that the random π‘Š(𝑑; 𝑖, 𝑗) denoting the total number
of offspring produced by females of genotype 𝑖 having male
sexual partners of genotype 𝑗 follows a Poisson distribution
with parameter
π‘₯max
𝛾 (𝑑; 𝑖, 𝑗) ∑ 𝑍 (𝑑; 𝑖, π‘₯, 𝑗) πœ† (𝑖, π‘₯)
π‘₯=π‘₯π‘š
π‘₯max
(21)
= ( ∑ 𝑍 (𝑑; 𝑖, π‘₯, 𝑗) 𝑓 (π‘₯; πœ† 0 , 40)) πœ† 𝑖
π‘₯=π‘₯π‘š
during the time interval [𝑑, 𝑑 + 1) for all nine combinations of
sexual partnerships (𝑖, 𝑗) ∈ T.
To simplify the notation, denote the sum in parentheses
by 𝑉(𝑑; 𝑖, 𝑗). Then 𝛾(𝑑; 𝑖, 𝑗) has the simpler form
𝛾 (𝑑; 𝑖, 𝑗) = 𝑉 (𝑑; 𝑖, 𝑗) πœ† 𝑖 .
(22)
From this formula, it can be seen that if 𝑉(𝑑; 𝑖, 𝑗) > 1, then
the expected number of total number of offspring produced
by partnerships of type (𝑖, 𝑗) during the time interval [𝑑, 𝑑 +
1) would exceed πœ† 𝑖 , indicating that if this relationship held
forward in time, then the expected number of offspring produced by partnerships of type (𝑖, 𝑗) would increase over time.
On the other hand, if 0 ≤ 𝑉(𝑑; 𝑖, 𝑗) < 1 and if this relationship
held forward in time, then eventually mating types of the
form (𝑖, 𝑗) would disappear from the population. It should
also be noted that if there are no individuals of genotype 𝑖 in
the population at time 𝑑, then 𝑉(𝑑; 𝑖, 𝑗) = 0 for all 𝑗 = 1, 2, 3. It
is of interest to observe that if the initial population consists
only of individuals of genotype 𝑖 = 1 = 𝐴𝐴 for both sexes,
then initially 𝑉(𝑑; 𝑖, 𝑗) = 0 for 𝑖 = 2, 3, until a sufficient numbers of mutant genotypes π΄π‘Ž and π‘Žπ‘Ž have arisen as a result
of mutations before 𝑉(𝑑; 𝑖, 𝑗) > 0 for 𝑖 = 2, 3 and 𝑗 = 1, 2, 3.
Moreover, if the initial number of females of genotype 𝐴𝐴 is
6
International Journal of Stochastic Analysis
small, then population may become extinct before the mutant
genotypes π΄π‘Ž and π‘Žπ‘Ž appear in the population. It is also of
interest to observe that if for some 𝑑 ≥ 0 all individuals, the
female and male population, are of genotype 1 = 𝐴𝐴, then
𝛾(𝑑; 𝑖, 𝑗) = 0 except for the case 𝑖 = 𝑗 = 1.
Therefore, when writing software to simulate realizations
of Poisson random variables, zero valued parameters need to
be accommodated. Symbolically, let CPois(𝛼) denote a procedure for simulating a realization of a Poisson random variable
with parameter 𝛼 ≥ 0, given the state of the population at
time 𝑑. Let the random variable π‘Š(𝑑; 𝑖, 𝑗) denote the number
of matings of a female of genotype 𝑖 with a male of genotype
𝑗 during the time interval [𝑑, 𝑑 + 1). Then, if 𝛾(𝑑; 𝑖, 𝑗) = 0, then
π‘Š(𝑑; 𝑖, 𝑗) = 0, but if 𝛾(𝑑; 𝑖, 𝑗) > 0, then
π‘Š (𝑑; 𝑖, 𝑗) ∼ CPois (𝛾 (𝑑; 𝑖, 𝑗)) ,
(23)
indicating that the random variable π‘Š(𝑑; 𝑖, 𝑗) has a conditional Poisson distribution with parameter 𝛾(𝑑; 𝑖, 𝑗). When
the parameter 𝛾(𝑑; 𝑖, 𝑗) is large, then it is well known that the
distribution of π‘Š(𝑑; 𝑖, 𝑗) may be approximated by a normal
distribution with expectation 𝛾(𝑑; 𝑖, 𝑗) and variance 𝜎2 =
𝛾(𝑑; 𝑖, 𝑗).
In general, consider a random variable π‘Š with a Poisson
distribution with parameter πœ† > 0, and suppose πœ† is large.
Then,
π‘Š ∼ 𝑆 = πœ† + √πœ†π‘,
(24)
where 𝑍 is a standard normal random variable with expectation 0 and variance 1 and ∼means approximately in distribution. When 𝑍 > 0, the sum on the right is always positive, but
if 𝑍 < 0, then this sum may be negative, and, therefore, π‘Š
may be negative. But, every realization of a Poisson random
variable must take a value in the set nonnegative integers. A
question that naturally arises is if we set π‘Š = [πœ† + √πœ†π‘],
where [∘] is the greatest integer function, then for what values
of πœ† will the sum 𝑆 = πœ† + √πœ†π‘ > 0 with sufficiently high
probability? Observe that 𝑆 = √πœ†(√πœ†+𝑍). Thus, 𝑆 < 0, if and
only if 𝑍 < −√πœ†.
For any value of 𝑧 > 0, it follows that
𝑃 [𝑍 < −𝑧] = 1 − 𝑃 [𝑍 ≤ 𝑧] = 𝑃 [𝑍 > 𝑧] ,
(25)
because the standard normal distribution is symmetric about
0. Let 𝑧 = 2.8 be a trial value. Then, it can be shown by either
computing 𝑃[𝑍 ≤ 2.8] or by consulting a reliable table for the
standard normal distribution that 𝑃[𝑍 ≤ 2.8] = 0.9974. Thus,
𝑃 [𝑍 < −2.8] = 1 − 0.9974 = 0.0026 = 𝑃 [𝑍 > 2.8] .
(26)
Therefore, for all πœ† > 0 such that √πœ† > 2.8, it follows that
𝑃 [𝑍 > √πœ†] < 𝑃 [𝑍 > 2.8] = 0.0026.
(27)
It seems reasonable, therefore, for all πœ† > 7.84 = (2.8)2 to use
the central limit theorem approximation when simulating a
realization of a Poisson random variable. When simulating
realizations of many Poisson random variables, it is most
efficient to use the normal approximation whenever it is
reasonable, because each realization requires only one call to
a procedure for simulating a realization of a standard normal
random variable. Consequently, the procedure just described
was implemented in the software, but, of course any investigator would be free to choose any trial value of 𝑧 that suits his
goals.
For those readers who may be uncomfortable with this
choice of the Poisson distribution for biological reasons, it
would be straight forward if, for example, an investigator
chooses to use negative binomial distribution for the offspring distribution in a branching process formulation. A
procedure for simulating realizations of random variables
would be more complicated using this two-parameter distribution than that described for the one parameter Poisson
distribution. It could, however, be accomplished with a little
more effort from some investigator by writing the software to
accomplish the task in a programming language of his choosing. But, just as in the case of using a Poisson distribution, at
some point in the development of the software, a central limit
theorem would need to be invoked to simulate realizations
of sums of conditionally independent random variables. If
a reader is interested in a review of parametric offspring
distributions that have been used extensively in demographic
studies of human populations, it is suggested that the book
Mode [10] be consulted.
Given a realization of the random variable π‘Š(𝑑; 𝑖, 𝑗), let
the 1 × 3 random vector
O (𝑑; 𝑖, 𝑗) = (𝑂 (𝑑; 𝑖, 𝑗; π‘˜) | π‘˜ = 1, 2, 3)
(28)
denote the random number of offspring produced partnerships of type (𝑖, 𝑗) of each of the genotypes π‘˜ = 1, 2, 3. Then,
the vector O(𝑑; 𝑖, 𝑗) has a conditional multinomial distribution with index, sample size, π‘Š(𝑑; 𝑖, 𝑗), and probability vector
p(𝑖, 𝑗) = (𝑝(𝑖, 𝑗; π‘˜) | π‘˜ = 1, 2, 3) derived in the previous section. In symbols, for every mating type πœ… = (𝑖, 𝑗)
O (𝑑; 𝑖, 𝑗) ∼ CMultinom (π‘Š (𝑑; 𝑖, 𝑗) , p (𝑖, 𝑗)) .
(29)
Let the random variable
𝑂 (𝑑; ∘, π‘˜) = ∑ 𝑂 (𝑑; 𝑖, 𝑗, π‘˜)
(𝑖,𝑗)∈T
(30)
denote the total number of offspring of genotype π‘˜ produced
by females during the time interval [𝑑, 𝑑 + 1) for π‘˜ = 1, 2, 3.
Given that the probability that an offspring is a girl is 𝑝𝑓 , then
𝐡𝑓 (𝑑; π‘˜) ∼ CBinom (𝑂 (𝑑; ∘, π‘˜) , 𝑝𝑓 ) ,
(31)
and the number of females with genotype π‘˜ born during
this time interval [𝑑.𝑑 + 1). Therefore, the number of boys of
genotype π‘˜ born during this time interval is
π΅π‘š (𝑑; π‘˜) = 𝑂 (𝑑; ∘, π‘˜) − 𝐡𝑓 (𝑑; π‘˜, 0)
(32)
for π‘˜ = 1, 2, 3.
In general, for π‘₯ = 1, 2, . . . , π‘Ÿ and 𝑦 = 1, 2, . . . , π‘Ÿ at time 𝑑,
let the random functions 𝑋(𝑑; 𝑗, π‘₯) and π‘Œ(𝑑; 𝑗, 𝑦) denote,
International Journal of Stochastic Analysis
7
respectively, the number of females of type 𝑑𝑓 = (𝑗, π‘₯) and
the number of males of type π‘‘π‘š = (𝑗, 𝑦) in the population at
time 𝑑 for 𝑗 = 1, 2, 3. Let 𝑋(𝑑; 0) denote a 3 × 1 vector with
the components 𝐡𝑓 (𝑑; 𝑗) for 𝑗 = 1, 2, 3 for females, and define
the 3 × 1 vector π‘Œ(𝑑; 0) similarly for males. These 3 × 1 column
vectors contain the number of females and males by genotype,
respectively, born during the time interval [𝑑, 𝑑 + 1). Then let
𝑋(𝑑; π‘₯ ≥ 1) denote a 3 × π‘Ÿ matrix with the column vectors
𝑋(𝑑; π‘₯) for π‘₯ = 1, 2, . . . , π‘Ÿ, and define the vector π‘Œ(𝑑, 𝑦 ≥ 1)
for males similarly. Then, the 3 × (π‘Ÿ + 1) matrix for females at
time 𝑑 + 1 is given by the matrix
X (𝑑) = X (𝑑; 0) , X (𝑑; π‘₯ ≥ 1) ,
(33)
where the, indicates that the 3 × π‘Ÿ matrix is catenated onto
the 3 × 1 vector X(𝑑; 0) to construct a 3 × (π‘Ÿ + 1) matrix.
The matrix π‘Œ(𝑑) is constructed for males in the same way.
Observe that at time 𝑑 females of age π‘Ÿ are in column π‘Ÿ + 1
of the 3 × (π‘Ÿ + 1) matrix X(𝑑). Thus, in the construction of the
3 × (π‘Ÿ + 1) matrix X(𝑑), females of age π‘Ÿ at time 𝑑 are dropped
from the population. But, because very few females will
survive to age π‘Ÿ, this loss will be negligible. This remark also
applies to males. In the next section, a parametric model of
a birth cohort survival function, governing the survival of
individuals in a birth cohort as a function of their age will
be described.
The next component of the age structured process under
consideration is that of taking mortality into account in terms
of condition survival probabilities by age. Before moving on
to an overview of the survival component of the age structured process under consideration, it is appropriate to mention that the book Mode [10] also contains a description of
stochastic models of human reproduction expressed in terms
monthly estrous cycle of human females along with such factors as abortions, spontaneous and induced age of marriage
that other factors involving the control of the number of
births experienced by a female during her fertile period. Even
though the inclusion of these details was useful in the quantifying of the short-term effects of family planning programs
to include them in age structured evolutionary process under
consideration with a long-term evolutionary perspective
seems inappropriate for model that is in an early stage of
development.
4. Survival of Individuals in
a Birth Cohort as a Function of Age
Another component of natural selection that can be accommodated in the evolution of the age structured population
under consideration is that of mortality of individuals. Let
𝑆(π‘₯) denote the probability that an individual in a birth
cohort, who by definition was age π‘₯ = 0 at birth, survives to
an age of at least π‘₯ > 0. In symbols, if 𝑋 is a random variable
taking values in the set R+ = {π‘₯ ∈ R | π‘₯ ≥ 0} of nonnegative
real numbers, where R is the set of real numbers, denoting
the life span of an individual, then
𝑆 (π‘₯) = 𝑃 [𝑋 > π‘₯] .
(34)
This function has the property that 𝑆(0) = 1 and 𝑆(π‘₯) is a
monotone decreasing function, as π‘₯ increases π‘₯ ↑ ∞ and in
the limit
lim 𝑆 (π‘₯) = 0.
π‘₯↑∞
(35)
The purpose of this section is to derive a formulas for a
survival function in terms of parametric risk functions.
For many species of animals, offspring are at high risk
of death following hatching or birth, but as age increases the
risk of death decreases. Therefore, the risk function for deaths
of individuals for whom this risk function applies will be
assumed to have a latent risk function of the exponential form
as follows:
πœƒ0 (π‘₯) = 𝛼0 𝛽0 exp [−𝛽0 π‘₯] ,
(36)
where π‘₯ ≥ 0, and 𝛼0 and 𝛽0 are positive parameters. The
integral of this latent risk function is
π‘₯
𝐻0 (π‘₯) = ∫ πœƒ0 (𝑠) 𝑑𝑠 = 𝛼0 (1 − exp [−𝛽0 π‘₯])
0
(37)
for π‘₯ ≥ 0, and, by using well known methods for expressing a
survival function in terns of an integral of the risk function, it
follows that the survival function corresponding to this risk
function is
𝑆0 (π‘₯) = exp [−𝐻0 (π‘₯)] .
(38)
In what follows, it will be assumed that this survival function
is in force for ages π‘₯ = 0, 1, 2, . . . , 30, because after age 30 the
risk of death function will be assumed to be increasing. From
now on all parameters of survival functions will depend on
genotypes and sex of individuals, but simply the notation the
genotype and sex of individuals will be suppressed except in
cases when clarity is an issue.
Observe that 𝐻0 (0) = 0, so that 𝑆(0) = 0 as it should. The
integral 𝐻0 (π‘₯) will need to be modified to accommodate the
conditions that π‘₯ = 0, 1, 2, . . . , 30. Let 𝐹0 (π‘₯) = 1 − exp(−𝛽0 π‘₯)
and let the modification of 𝐻0 (π‘₯) be defined as
𝐻0∗ (π‘₯) =
𝛼0 𝐹0 (π‘₯)
.
𝐹0 (30)
(39)
Let 𝑆∗ (π‘₯) = exp(−𝐻0∗ (π‘₯)) denote the modified survival function. Then it follows that 𝐻0∗ (30) = 𝛼0 , so that 𝑆∗ (30) =
exp(−𝛼0 ), which is the probability that an individual born at
age π‘₯ = 0 survives to at least age 30. Thus, if one has some
prior knowledge of the probability 𝑆∗ (30) = 𝑝(30), then 𝛼0 =
− ln(𝑝(30)). Given this estimate of 𝛼0 , a task that remains is
that of finding a value for the parameter 𝛽0 . Next observe that
𝑆∗ (1) = exp(−𝐻0∗ (1)) is the conditional probability that an
infant born at π‘₯ = 0 survives to age π‘₯ = 1 year. Therefore,
if one has some prior knowledge of the probability 𝑆∗ (1) =
exp(−𝐻0∗ (1)) = 𝑝(1), by knowing 𝛼0 the parameter 𝛽0 could,
in principle, be determined by finding a value of 𝛽0 > 0 that
would satisfy the equation 𝑆∗ (1) = exp(−𝐻0∗ (1)) = 𝑝(1). This
is a nonlinear equation in the unknown parameter 𝛽0 , and it
may be possible by plotting 𝑆∗ (1; 𝛽0 ) as a function of 𝛽0 > 0
to find a value 𝛽0∗ , such that 𝑆∗ (1; 𝛽0∗ ) = 𝑝(1).
8
International Journal of Stochastic Analysis
But, there is at least one other approach to find a value of
𝛽0 . Observe that
for π‘₯ ≥ 0. Therefore, the latent survival function for this component is
𝐹0 (π‘₯) = 1 − exp [−𝛽0 π‘₯]
𝑆2 (π‘₯) = exp [−𝛼2 (exp [𝛽2 π‘₯] − 1)]
(40)
is the distribution function of a exponentially distributed
random variable 𝑋0 with expectation
𝐸 [𝑋0 ] =
1
.
𝛽0
(41)
Thus, the parameter 𝛽0 will determine the speed at which
deaths occur. Small values of 𝛽0 will correspond to longer
infant survival times, while large values of 𝛽0 will correspond
to shorter survival times. Such ideas may be quantified by
assigning trial values to 𝐸[𝑋0 ] to find an estimate of the
form 𝛽0 = 1/𝐸[𝑋0 ]. For example, a plausible value of this
expectation may be chosen as 𝐸[𝑋0 ] = π‘₯π‘š /2, which yields
𝛽0 = 2/π‘₯π‘š as a preliminary estimate of 𝛽0 . The parameter 𝛽0
may also depend on the sex and genotype of an individual
and will be denoted by the symbols 𝛽0 (πœπ‘“ ) and 𝛽0 (πœπ‘š ) for
females and males of genotypes πœπ‘“ and πœπ‘š , respectively.
Given a value of 𝛽0 , one would have to check whether
𝑆∗ (1; 𝛽0∗ ) = 𝑆∗ (1; 𝛽0∗ ) = 𝑝(1) is a plausible estimate of 𝑝(1),
the conditional probability that an infant born at age π‘₯ = 0
survives to age π‘₯ = 1 year. If the estimate of 𝑝(1) is not
plausible, too low for example, then it would be necessary to
try another estimate of 𝛽0 .
To accommodate accidents that may occur throughout
the life span of an individual, the latent risk function for this
component of a survival function will have the simple form
πœƒ1 (π‘₯) = 𝛼1 for all π‘₯ ≥ 0, where 𝛼1 is a positive constant. In
this case, the integral of the risk function is
π‘₯
𝐻1 (π‘₯) = ∫ πœƒ1 (𝑠) 𝑑𝑠 = 𝛼1 π‘₯
0
(42)
for π‘₯ ≥ 0. Therefore, for π‘₯ ≥ 0, the latent survival function
has the simple form
𝑆1 (π‘₯) = exp [−𝛼1 π‘₯] .
(43)
A useful trial value of 𝛼1 , based on period studies of human
mortality, is about 0.001. This component of the model is
often referred to as the Makeham component and is named
after a 19-century investigator that introduced this component.
A third two-parameter latent risk function, due to Gompertz (19th century), deals with risks of deaths at the older
ages. Let 𝛼2 and 𝛽2 be positive parameters. Then, it will be
assumed that for π‘₯ ≥ 0 the latent risk function πœƒ2 (π‘₯) has the
form
πœƒ2 (π‘₯) = 𝛼2 𝛽2 exp [𝛽2 π‘₯] .
π‘₯
𝐻2 (π‘₯) = ∫ πœƒ2 (𝑠) 𝑑𝑠 = 𝛼2 (exp [𝛽2 π‘₯] − 1)
0
for π‘₯ ≥ 0.
By applying a general but standard formula that a density
is the risk function times the survival function, it can be
seen that the probability density function of the Gompertz
distribution has the form
𝑓2 (π‘₯) = πœƒ2 (π‘₯) 𝑆2 (π‘₯)
= 𝛼2 𝛽2 exp [𝛽2 π‘₯] exp [−𝛼2 (exp [𝛽2 π‘₯] − 1)]
(45)
(47)
for π‘₯ ≥ 0. Although this distribution may be derived
from intuitively appealing assumptions, it is more difficult to
handle from a mathematical point of view than some other
distributions that arise in probability and statistics. Nevertheless, because many advanced mathematical functions are
now available to research workers in such software packages
as MAPLE, MATHEMATICA., and MATLAB, an outline of
the mathematics used in analyzing the Gompertz distribution
seems appropriate.
Because the parameters 𝛼2 and 𝛽2 do not have obvious
statistical interpretations, such as an expectation or variance,
it is difficult to assign tentative values to them. Quite often,
however, there is some feeling about the modal age of death
for those who survive to old age. Let π‘š2 denote the mode
of the Gompertz distribution. Then by using elementary
calculus to find the maximum of the density 𝑓2 (π‘₯), it can be
shown that the equation
𝛼2 = exp [−𝛽2 π‘š2 ]
(48)
formalizes a connection among the parameters 𝛼2 , 𝛽2 , and
π‘š2 . In particular, if π‘š2 is assigned a value and 𝛽2 is known,
then 𝛼2 is determined. But, to find a plausible value of 𝛽2 ,
more input is needed.
Let 𝑋2 denote a random variable with a Gompertz distribution. Then, after considerable analysis, it can be shown that
the exact formula for the expectation is
𝐸 [𝑋2 ] = 𝑒𝛼2 [π‘š2 −
𝜈 𝜈+1
𝐢
1 ∞ (−1) 𝛼2
+ ∑
],
𝛽2 𝛽2 𝜈=0 𝜈!(𝜈 + 1)2
(49)
where 𝐢 ≃ 0.57721 ⋅ ⋅ ⋅ is Euler’s constant. Furthermore, the
exact formula for the second moment is
𝐸 [𝑋22 ] = 𝑒𝛼2 [(π‘š2 −
(44)
It will be assumed that this risk function is in force for ages
π‘₯ = 31, 32, . . . , π‘Ÿ. Observe that, as it should, this risk function increases as age π‘₯ of an individual increases, and, by
assumption, the risk of death increases exponentially with
increasing age. The integral of this risk function has the form
(46)
𝜈 𝜈+1
2 ∞ (−1) 𝛼2
𝐢 2 πœ‹2
) + 2 − 2∑
].
𝛽2
𝛽2 6 𝛽2 𝜈=0 𝜈!(𝜈 + 1)3
(50)
When 𝛼2 > 0 is small, then exp[𝛼2 ] ≃ 1 and the above
infinite series may be neglected. Thus, the approximations
𝐸 [𝑋2 ] ≃ π‘š2 −
𝐸 [𝑋22 ]
𝐢
,
𝛽2
𝐢 2 πœ‹2
≃ (π‘š2 − ) + 2
𝛽2
𝛽2 6
(51)
International Journal of Stochastic Analysis
9
hold for small 𝛼2 . Therefore, a formula for the approximate
variance of the Gompertz distribution is
𝜎22 ≃
πœ‹2
.
𝛽22 6
(52)
πœ‹
.
𝜎2 √6
(53)
Equivalently,
𝛽2 ≃
It will be instructive to present a simple numerical example, illustrating the use of these approximations. Suppose,
for example, one wished to construct a Gompertz distribution
with a mode π‘š2 = 60 years, and suppose the standard
deviation of this distribution is 𝜎2 = 10. Then, this formula
yields the approximation 𝛽2 ≃ 0.12825498301686. Furthermore, by using the formula connecting 𝛼2 , 𝛽2 , and π‘š2 , it
can be seen that this formula yields the approximation 𝛼2 =
4.54960943585548 × 10−4 . Moreover,
exp (4.54960943585548 × 10−4 ) = 1.00045506445401 ≃ 1.
(54)
Because of the approximation of the variance is 𝜎22 ≃ πœ‹2 /𝛽22 6,
it appears that the parameter 𝛽2 will belong to the interval
(0, 1) for plausible values of 𝜎2 , so that it can be seen that
when the mode π‘š2 of the distribution is sufficiently large, the
parameter 𝛼2 > 0 will be small. As can be seen form the above
discussion, when 𝛼2 is small, the approximation for 𝜎22 will
yield good results. A more extensive account of parametric
forms of risk functions for death as well as historical references has be given in Mode and Sleeman [11, Section 13.2],
but the details will be omitted here.
In the class of self regulating branch process under
consideration, the effects on the mortality of individuals are
formulated as a function of total population size 𝑍tot (𝑑) at
time 𝑑. The formula for computing a realization of the random
function 𝑍tot (𝑑) will depend on the class of stochastic process
under consideration. For the class of branching processes
under consideration, as stated in a previous section, let
𝑋(𝑑; πœπ‘“ , π‘₯) denote the number of females of type t𝑓 =
(πœπ‘“ , π‘₯) in the population at time 𝑑, and let the random function π‘Œ(𝑑; πœπ‘š , 𝑦) be defined similarly for males. Then, let the
random function
π‘Ÿ
𝑋 (𝑑; ∘, ∘) = ∑ ∑ 𝑋 (𝑑; πœπ‘“ , π‘₯)
πœπ‘“ ∈Tπ‘₯=0
(55)
denote the total number of females in the population at
time 𝑑, and let π‘Œ(𝑑; ∘, ∘) denote the corresponding random
function for males. Then, 𝑍tot (𝑑) = 𝑋(𝑑; ∘, ∘) + π‘Œ(𝑑; ∘, ∘) is total
population size at time 𝑑.
It will be assumed that the probability for all members of
the population at time 𝑑 survive to time 𝑑 + 1 is governed by a
Weibull type survival function depending on total population
size of the form
𝑆3 (𝑍tot (𝑑)) = exp (− (𝛽tot 𝑍tot (𝑑)𝛼tot )) .
(56)
At this point in the discussion, it is appropriate to discuss
the rational for choosing values of the parameter 𝛽tot . For
example, suppose the carrying capacity of the environment is
107 individuals. Then, 𝛽tot may be chosen as 𝛽tot = 10−7 . As
indicated with regard to all parameters under consideration,
the positive parameters 𝛼tot and 𝛽tot will depend on sex and
genotype. It follows, therefore, if one genotype has a smaller
value of the parameter 𝛽tot , it will be able to survive in larger
populations and will thus have a selective advantage over
the others. From now on the function (𝛽tot 𝑍tot (𝑑))𝛼tot will
play a role similar to that of an integral of a risk function as
illustrated above.
But, just as the total fertility of a female πœ† was distributed
over her reproductive ages, a density function will needed
to distribute (𝛽tot 𝑍tot (𝑑))𝛼tot over the ages of individuals π‘₯ =
1, 2, . . . , π‘Ÿ in the population at time 𝑑 > 0, which gives rise to
the question as to how such a distribution may be chosen. A
plausible answer to this question is that it may be constructed
from the risks functions as functions of age as illustrated by
the parametric risk functions introduced above. Observe that
the larger the values of a total risk functions of an individual
at these ages, the greater will be the proportion assigned to
the total contribution of the Weibull component.
Thus, if the Makeham component is included in the risk
function for the ages π‘₯ = 1, 2, . . . , 30, then the total risk function for these ages is
πœƒtot (π‘₯) = 𝛼0 𝛽0 exp [−𝛽0 π‘₯] + 𝛼1 .
(57)
For ages π‘₯ > 30 the total risk function is
πœƒtot (π‘₯) = 𝛼2 𝛽2 exp [𝛽2 π‘₯] + 𝛼1 .
(58)
Let πœƒtot be defined by the sum
π‘Ÿ
πœƒtot = ∑ πœƒtot (π‘₯) .
(59)
π‘₯=1
Then, the density for distributing (𝛽tot 𝑍tot (𝑑))𝛼tot over the ages
may be chosen as
𝑓tot (π‘₯) =
πœƒtot (π‘₯)
πœƒtot
(60)
for π‘₯ = 1, 2, . . . , π‘Ÿ. Given these definitions, the proportion,
𝐻(𝑍tot (𝑑), π‘₯), for age π‘₯ of the total (𝛽tot 𝑍tot (𝑑))𝛼tot will be
chosen as
𝐻 (𝑍tot (𝑑) , π‘₯) = 𝑓tot (π‘₯) × (𝛽tot 𝑍tot (𝑑))
𝛼tot
(61)
for π‘₯ = 1, 2, . . . , π‘Ÿ.
The next step in the formulation of the mortality component of the process is to express the final survival function
in terms of integrals of risk functions. Thus, for ages π‘₯ =
1, 2, . . . , 30 the sum of these integrals is
𝐻tot (𝑍tot (𝑑) , π‘₯) = 𝐻0∗ (π‘₯) + 𝐻1 (π‘₯) + 𝐻 (𝑍tot (𝑑) , π‘₯) . (62)
Recall that 𝐻1 (π‘₯) is the integral of the constant Makeham risk
function, namely, 𝛼1 π‘₯. For ages π‘₯ > 30,
𝐻tot (𝑍tot (𝑑) , π‘₯) = 𝐻1 (π‘₯) + 𝐻2 (π‘₯) + 𝐻 (𝑍tot (𝑑) , π‘₯) . (63)
10
International Journal of Stochastic Analysis
Therefore, the survival function for all ages π‘₯ = 1, 2, . . . , π‘Ÿ is
𝑆 (𝑍tot (𝑑) , π‘₯) = exp (−𝐻tot (𝑍tot (𝑑) , π‘₯)) .
(64)
The last step in formulating the mortality component of
the process is to derive a formula for the conditional probability that an individual that is alive at age π‘₯ survives to age
π‘₯ + 1. Let 𝑝(𝑍tot (𝑑), π‘₯) denote this probability. Then,
𝑝 (𝑍tot (𝑑) , π‘₯)
=
𝑆 (𝑍tot (𝑑) , π‘₯ + 1)
𝑆 (𝑍tot (𝑑) , π‘₯)
= exp (− ((𝐻tot (𝑍tot (𝑑) , π‘₯ + 1) − 𝐻tot (𝑍tot (𝑑) , π‘₯))) .
(65)
From this formula, it follows that in order that 0
𝑝(𝑍tot (𝑑), π‘₯) < 1, the condition
𝐻tot (𝑍tot (𝑑) , π‘₯ + 1) − 𝐻tot (𝑍tot (𝑑) , π‘₯) > 0
≤
𝑋 (𝑑 + 1; 𝑖, π‘₯ + 1) ∼ CBinom (𝑋 (𝑑; 𝑖, π‘₯) ; 𝑝𝑓 (𝑍tot (𝑑) ; 𝑖, π‘₯)) .
(70)
(66)
Similarly, if π‘Œ(𝑑; 𝑖, π‘₯) is the number of males of type t = (𝑖, π‘₯)
alive at time 𝑑, then
must be satisfied for all π‘₯ = 1, 2, . . . , π‘Ÿ. Equivalently, the function 𝐻tot (𝑍tot (𝑑), π‘₯) must be strictly monotone increasing as a
function of π‘₯. It seems likely that because all parameters 𝛽 will
be small, the component 𝐻(𝑍tot (𝑑), π‘₯) for population density
will be small in relation to the others in 𝐻tot (𝑍tot (𝑑), π‘₯).
It seems reasonable, therefore, that if an investigator in a
preliminary experiment ignores the component for population density, then it is likely that strict monotonicity of
this function will be preserved when the population density
component is included. To complete of the definition of
survival function 𝑆(𝑍tot (𝑑), π‘₯) for π‘₯ = 0, by definition
𝐻tot (𝑍tot (𝑑), 0) = 0, so that 𝑆(𝑍tot (𝑑), 0) = 1.
When checking the plausibility of a conditional survival
probability
𝑝 (𝑍tot (𝑑) , π‘₯)
(67)
in a preliminary experiment, the probabilities 𝑝(𝑍tot (𝑑), 0)
and 𝑝(𝑍tot (𝑑), 30) require special attention. For example,
𝑝 (𝑍tot (𝑑) , 0) =
𝑆 (𝑍tot (𝑑) , 1)
= exp (−((𝐻tot (𝑍tot (𝑑) , 1) ))
𝑆 (𝑍tot (𝑑) , 0)
(68)
so that the probability on the right should be consistent with
a plausible probability that an infant dies in his first year of
life. Similarly, because the risk function is increasing for all
ages π‘₯ ≥ 31, conditional probability
𝑝 (𝑍tot (𝑑) , 30)
= exp (− ((𝐻tot (𝑍tot (𝑑) , 31) − 𝐻tot (𝑍tot (𝑑) , 30) ))
The component of natural selection that needs to be
considered in formulating the mortality component of the
stochastic process under consideration is that of describing a
procedure for simulating the number of individuals classified
by age, sex, and genotype that are alive at time 𝑑 and survive
to time 𝑑 + 1. For example, let 𝑝𝑓 (𝑍tot (𝑑); 𝑖, π‘₯) denote the
conditional probability that a female of genotype 𝑖 and age π‘₯ at
time 𝑑 survives to age π‘₯ + 1 at time 𝑑 + 1. At time 𝑑 as before
let 𝑋(𝑑; 𝑖, π‘₯) denote the number of females of t = (𝑖, π‘₯), and
let 𝑋(𝑑 + 1; 𝑖, π‘₯ + 1) denote the number of these females who
survive to age 𝑋 + 1 at time 𝑑 + 1. Then, by assumption, 𝑋(𝑑 +
1; 𝑖, π‘₯ + 1)has a conditional binomial distribution with index,
sample size 𝑋(𝑑; 𝑖, π‘₯), and probability 𝑝𝑓 (𝑍tot (𝑑); 𝑖, π‘₯). In
symbols,
(69)
should satisfy the condition 0 < 𝑝(𝑍tot (𝑑), 30) < 1, so that
the condition (𝐻tot (𝑍tot (𝑑), 31)−𝐻tot (𝑍tot (𝑑), 30)) > 0 should
also be satisfied. If this condition holds when the component
for population density is ignored, then it seems likely, for
reasons stated above, that this condition will also hold when
the population density factor is taken into account.
π‘Œ (𝑑 + 1; 𝑖, π‘₯ + 1) ∼ CBinom (π‘Œ (𝑑; 𝑖, π‘₯) ; π‘π‘š (𝑍tot (𝑑) ; 𝑖, π‘₯))
(71)
for π‘₯ = 0, 1, 2, . . . , π‘Ÿ.
5. An Embedded Deterministic Model in
an Age-Structured Stochastic Process
As demonstrated in recent publications, Mode and Sleeman
[9], Mode et al. (2011), [12, 13], by deriving formulas for the
conditional expectation of any random variable at time 𝑑,
given the evolution of the process prior to 𝑑, it is possible to
derive a set of recursive nonlinear difference equations, such
that, given the initial values of the random functions at time
𝑑 = 0, estimates of the sample functions of the process can be
derived for all 𝑑 > 1. In previous publications dating back at
least a decade, this derived set of equations has been called
the deterministic model embedded in the stochastic process
under consideration. The purpose of this section is to derive
formulas for a deterministic model embedded in the age
structured stochastic process under consideration. As will be
seen in what follows, the derivation of formulas for the
embedded deterministic model is closely related to procedures for computing Monte Carlo realizations of the process.
To fix ideas, suppose at time 𝑑 = 0 the elements of the
random 3 × (π‘Ÿ + 1) matrices X(0) and Y(0), denoting, respectively, the assigned numbers of females and males classified
by age and genotype. All these initial numbers are nonnegative integers. Next consider the matrix valued conditional
expectation
𝐸 [X (1) | X (0)] .
(72)
Because X(0) is known, 𝐸[X(1) | X(0)] is known. Thus,
Μ‚ (1) = 𝐸 [X (1) | X (0)]
X
(73)
Μ‚
is, as is well known that X(1),
the best estimate of X(1) in the
sense of minimum mean square error. This procedure may be
International Journal of Stochastic Analysis
11
continued recursively by considering the conditional matrix
valued conditional expectation
𝐸 [X (𝑑) | X (𝑑 − 1)] .
(74)
But, this conditional matrix valued expectation is a random
valuable, because X(𝑑 − 1) is a random variable. However, if
one has the estimate
Μ‚ (𝑑 − 1)
X
(75)
that has been calculated recursively, then
Μ‚ (𝑑) = 𝐸 [X (𝑑) | X
Μ‚ (𝑑 − 1)]
X
(76)
is a reasonable estimate of the matrix valued random function
X(𝑑) for 𝑑 = 1, 2, . . .. This idea is the essential concept underling the procedure for embedding a recursive deterministic
system in a stochastic process whose sample functions take
values in matrices of non-negative integers. Deterministic
estimates of the matrix valued random functions for males
Y(𝑑) may be derived by an identical procedure for 𝑑 ≥ 1.
Observe that the stochastic process and the embedded deterministic equations have the same parameter space, which
is usually multidimensional.
Given these definitions, suppose that all the random
matrix valued functions for males in (5), (6), and (7) are all
Μ‚ 𝑖, 𝑦) starting in (5).
replaced by estimates of the form Y(𝑑;
Then, the estimate of the vector of the conditional probabilities in (9) have the form
𝛾̂𝑓 (𝑑; 𝑖, 𝑗) =
Μ‚π‘š (𝑑; 𝑗) 𝛼𝑓 (𝑖, 𝑗)
π‘ˆ
Μ‚π‘š (𝑑; 𝑗) 𝛼𝑓 (𝑖, 𝜈)
∑𝜈 π‘ˆ
(77)
for 𝑗 = 1, 2, 3. Observe that, because 𝛼𝑓 (𝑖, 𝑗) is constant for
all nine pairs (𝑖, 𝑗) of matings types, the 𝛼𝑓 (⋅, ⋅) parameters do
not need to be estimated. Thus, the estimate of the probability
vector in (10) is
𝛾𝑓 (𝑑; 𝑖, 𝑗) | 𝑗 = 1, 2, 3) .
𝛾̂𝑓 (𝑑; 𝑖) = (Μ‚
(78)
The next step in deriving the embedded deterministic
model is to describe a procedure for estimating the random
functions 𝑍(𝑑; 𝑖, π‘₯, 𝑗), which by definition is the number of
matings of females of age π‘₯ and genotype 𝑖 with males of
genotype 𝑗, during the time interval [𝑑, 𝑑 + 1). According to
(11), the random vector
Z (𝑑; 𝑖, π‘₯) = (𝑍 (𝑑; 𝑖, π‘₯, 𝑗) | 𝑗 = 1, 2, 3)
(79)
has a conditional multinomial distribution index, sample size,
𝑋(𝑑; 𝑖, π‘₯), and probability vector 𝛾𝑓 (𝑑; 𝑖) in (10). As is well
known, given the state of the population S(𝑑) = (X(𝑑), Y(𝑑))
at time 𝑑, the conditional expectation of the vector Z(𝑑; 𝑖, π‘₯) is
the vector
is an estimate of the random function 𝑍(𝑑; 𝑖, π‘₯, 𝑗) for all pairs
Μ‚ 𝑖, π‘₯) and
of mating types (𝑖, 𝑗) of females of age π‘₯, where 𝑋(𝑑;
𝛾̂(𝑑; 𝑖, 𝑗) are estimates, respectively, of the random functions
𝑋(𝑑; 𝑖, π‘₯) and 𝛾𝑓 (𝑑; 𝑖, 𝑗) at time 𝑑. From these definitions it
follows that
π‘₯max
Μ‚ (𝑑; 𝑖, π‘₯, 𝑗) πœ† (𝑖, π‘₯)
𝛾̂ (𝑑; 𝑖, 𝑗) = ∑ 𝑍
π‘₯π‘š
(82)
is an estimate of the random function 𝛾(𝑑, 𝑖, 𝑗) at time 𝑑, see
(21). Let
π‘₯max
Μ‚ (𝑑; 𝑖, π‘₯, 𝑗) 𝑓 (π‘₯; πœ† 0 , 40)) ,
Μ‚ (𝑑; 𝑖, 𝑗) = ( ∑ 𝑍
𝑉
π‘₯π‘š
(83)
where 𝑓(π‘₯; πœ† 0 , 40) is the density of the Poisson distribution
defined in (20). Then, at time 𝑑
Μ‚ (𝑑; 𝑖, 𝑗) πœ† 𝑖
𝛾̂ (𝑑; 𝑖, 𝑗) = 𝑉
(84)
is an estimate of 𝛾(𝑑, 𝑖, 𝑗) for all mating pairs of type (𝑖, 𝑗),
where πœ† 𝑖 is the expected potential number of total offspring
produced by a female of genotype 𝑖 throughout her life span.
Let the random function π‘Š(𝑑; 𝑖, 𝑗) denote the number of
offspring produced by matings of type (𝑖, 𝑗) during the time
interval [𝑑, 𝑑 + 1). When it is not zero, then according to (23)
this random function has a conditional Poisson distribution
with parameter 𝛾𝑓 (𝑑; 𝑖, 𝑗) and sample size 1. Therefore, the
estimate of this random function is
Μ‚ (𝑑; 𝑖, 𝑗) = 𝛾̂ (𝑑; 𝑖, 𝑗) .
π‘Š
(85)
The next step in deriving formulas for the embedded
deterministic model is to estimate the random functions
𝑂(𝑑; 𝑖, 𝑗; π‘˜) defined in the vector (28). Because, by assumption
(29), the elements of this vector have a conditional multinomial distribution with index π‘Š(𝑑; 𝑖, 𝑗) and probability vector
p(𝑖, 𝑗) = (𝑝(𝑖, 𝑗; π‘˜) | π‘˜ = 1, 2, 3) derived in a previous section,
it follows that an estimate of the random function 𝑂(𝑑; 𝑖, 𝑗; π‘˜)
is
Μ‚ (𝑑; 𝑖, 𝑗; π‘˜) = 𝛾̂ (𝑑; 𝑖, 𝑗) 𝑝 (𝑖, 𝑗; π‘˜)
𝑂
(86)
for all mating types (𝑖, 𝑗) ∈ T and genotypes π‘˜ ∈ G. Observe
that the probability 𝑝(𝑖, 𝑗; π‘˜) depends only on the probabilities of mutation, which, by assumption, are constant and,
therefore, there is no need to estimate them during any time
interval. From these results, it follows that
Μ‚ (𝑑; 𝑖, 𝑗; π‘˜)
Μ‚ (𝑑; ∘, π‘˜) = ∑ 𝑂
𝑂
(𝑖,𝑗)∈T
(87)
Μ‚ 𝑖, π‘₯, 𝑗) denote the estimate of the random
Therefore, let 𝑍(𝑑;
function 𝑍(𝑑; 𝑖, π‘₯, 𝑗). Then,
is an estimate of the total number of offspring of genotype π‘˜
produced by females during the time interval [𝑑, 𝑑 + 1).
Let 𝑝𝑓 denote the conditional probability that an offspring
is a female. Then, according to (31) an estimate of 𝐡𝑓 (𝑑; π‘˜),
the number of females of genotype π‘˜ born during the time
interval [𝑑, 𝑑 + 1), is
Μ‚ (𝑑; 𝑖, π‘₯, 𝑗) = 𝑋
Μ‚ (𝑑; 𝑖, π‘₯) 𝛾̂ (𝑑; 𝑖, 𝑗)
𝑍
Μ‚ (𝑑; ∘, π‘˜) 𝑝𝑓 .
𝐡̂𝑓 (𝑑; π‘˜) = 𝑂
𝐸 [Z (𝑑; 𝑖, π‘₯) | S (𝑑)] = 𝑋 (𝑑; 𝑖, π‘₯) 𝛾𝑓 (𝑑; 𝑖) .
(80)
(81)
(88)
12
International Journal of Stochastic Analysis
From this result it follows that
Μ‚ (𝑑; ∘, π‘˜) − 𝐡̂𝑓 (𝑑; π‘˜)
π΅Μ‚π‘š (𝑑; π‘˜) = 𝑂
(89)
is an estimate of the number of males if genotype π‘˜ born
during the time interval [𝑑, 𝑑 + 1). The estimate for the 3 × (π‘Ÿ +
1) matrix X(𝑑) at time 𝑑 is constructed by using the formula
Μ‚ (𝑑) = X
Μ‚ (𝑑; 0) , X
Μ‚ (𝑑; π‘₯ ≥ 1) ,
X
(90)
Μ‚ 0) is a 3 × 1 vector with elements 𝐡̂𝑓 (𝑑; π‘˜) for π‘˜ =
where X(𝑑;
1, 2, 3. If a reader is interested in more details, the discussion
pertaining to (33) should be consulted, because the procedure
for constructing estimates of the random matrix X(𝑑) involves
replacing a call for random numbers by a conditional expectation. The construction of an estimate of the 3 × (π‘Ÿ + 1)
random matrix for males Y(𝑑) is identical of that for females.
The last step in constructing estimates of the state of the
population at time 𝑑 + 1, given estimates at time 𝑑, is to
replace the calls for random variables in (70) and (71) with
conditional expectations of random variables with binomial
distributions for those individuals who survive from 𝑑 to 𝑑+1.
Thus,
Μ‚tot (𝑑) ; 𝑖, π‘₯) ,
Μ‚ (𝑑 + 1; 𝑖, π‘₯ + 1) = 𝑋
Μ‚ (𝑑; 𝑖, π‘₯) 𝑝𝑓 (𝑍
𝑋
Μ‚tot (𝑑) ; 𝑖, π‘₯)
Μ‚ (𝑑 + 1; 𝑖, π‘₯ + 1) = π‘Œ
Μ‚ (𝑑; 𝑖, π‘₯) π‘π‘š (𝑍
π‘Œ
(91)
for all genotypes 𝑖 ∈ G and ages π‘₯ = 0, 1, 2, . . . , π‘Ÿ.
6. Overview of Procedures and Issues Involved
in Designing Computer Experiments
Among the issues that arise as one begins the process of
designing a computer simulation experiment is that of choosing a numerical value for each of the parameters in the multidimensional parameter space of the age structured stochastic
population process under consideration. As can be seen from
Section 2, the minimal age π‘₯π‘š at which a female becomes
fertile and may bear children as well as the maximum age
π‘₯max that she is fertile must be assigned numbers. For most
human populations, the age at which females become fertile
varies around 15 years. In all computer experiments reported
in this paper, however, variation among the ages when a
female becomes fertile will be ignored, and for the sake of
simplicity that parameter π‘₯π‘š will be assigned the value π‘₯π‘š =
15. For similar reasons, the maximum fertile age for females
will be assigned the value π‘₯max = 40 years. Another parameter
that must be assigned a numerical value is π‘Ÿ, the greatest age
an individual, female or male, may attain. For most of the
computer experiments reported in this paper, this parameter
was assigned the value π‘Ÿ = 50 years. Part of the rationale
for choosing this value was the thought that in ancient
populations the life span of individuals was short when compared with those of some modern populations, so that age 50
years would be a plausible maximum life span for an ancient
human population. The other issue underlying this choice
was the time required to complete a computer experiment
covering an evolutionary time span of thousands of years.
Simply put, the larger the value chosen for the parameter π‘Ÿ,
the greater the time span required to complete a Monte Carlo
simulation experiment consisting of some number 𝑀 ≤ 100
of replications. It was found in preliminary experiments with
the assigned value π‘Ÿ = 50 could be completed in reasonable
lengths of time, given personal computer resources that were
available to conduct the experiments.
The evolutionary driving force of mutation will play a
basic role in all computer experiments reported in this paper,
and, as can be seen from (2), the probability πœ‡12 that allele 𝐴
mutates to allele π‘Ž per meiosis must be assigned a numerical
value. Similarly, the probability πœ‡21 that allele π‘Ž mutates to
allele 𝐴 must be assigned a numerical value. The assigned
values for these mutations will differ among the experiments
reported in this paper; thus, in what follows, the values used
in the experiments reported in subsequent sections will be
sated in the initial computer input for each experiment.
In Section 3 on simulating the number of births that occur
in a population in any year, it was shown that the birth process
depends on two sets of parameters. The first set of these
parameters is set forth equation (8) where a 3 × 3 matrix of
acceptance probabilities 𝛼𝑓 (𝑖, 𝑗) displayed that characterizes
preferences of females of genotype 𝑖 for males of genotype 𝑗
as sexual partners for all pairs (𝑖, 𝑗) of sexual partnerships.
The second set of parameters is (πœ† 1 , πœ† 2 , and πœ† 3 ), where πœ† 𝑖
the expected total number of offspring produced by a female
of genotype 𝑖 = 1, 2, 3 during her fertile period from ages
π‘₯π‘š = 15 up to and including age π‘₯max = 40. The values
chosen for these two sets of parameters will also vary among
experiments, so that no specific values will be given here. In
all experiments reported in this paper, it was assumed that
the probability that a baby was a girl was assigned the value
𝑝𝑓 = 100/205, and by assumption, the probability a baby was
a boy was assigned the value π‘π‘š = 1 − 𝑝𝑓 .
In Section 4, which is devoted to the survival of individuals in a population by age and from year to year, the
two-parameter Weibull distribution was chosen to provide a
framework to model the regulationof total population size.
As may be seen from (61), a function of the form (𝛽𝑍tot (𝑑))𝛼 ,
where 𝑍tot (𝑑) is total population size in year 𝑑, and 𝛼 and 𝛽 are
positive parameters that may differ by sex and genotype of an
individual, must be assigned numerical values. Because two
sexes and three genotypes are under consideration, the total
number of alpha and beta parameters is 12. For the sake of
simplicity, in all experiments reported in this paper, the values
of all alpha parameters were chosen as 𝛼 = 1, which was
equivalent to assuming that all survival functions had a simple exponential form. This simplification of the formulation
reduced the dimension of the parameter space to 6, but in
order to provide a framework for simulating the evolution
of a population that was, in part, dependent on the carrying
capacity of the environment, it would still be necessary to
assign to six beta parameters to accommodate the notion that
one genotype may have a selective advantage over others by
being a superior competitor for resources and was able to
survive and reproduce in higher population densities. To take
into account that individuals of genotypes may be more
competitive users of environmental resources, and in all
computer experiments reported in this paper, the values of
International Journal of Stochastic Analysis
the beta parameters will be given. Even though the survival
capabilities may differ by sex, it was assumed, for the sake of
simplicity, in all experiments reported in this paper that the
assigned value of each parameter in the parameter space was
equal for males and females.
In the component of survival function described in the
first paragraphs of Section 4 for ages π‘₯ = 0, 1, 2, . . . , 30, the
parameter 𝛼0 , which may differ among the three genotypes
and the two sexes, was among the parameters that must be
assigned numerical values when designing a computer simulation experiment. As indicated in Section 4, this parameter
has the interpretation that in a birth cohort for any year 𝛼0
may be interpreted as potential probability that an individual
in this cohort would survive to age 30. In all computer experiments reported in the experiments reported in this paper,
this parameter was assigned the value 𝛼0 = 0.99 for each sex
and genotype. For the sake of simplicity, it was also assumed
that the parameter 𝛽0 had the same value, 2/30, for both
sexes and genotypes within sexes. Thus, by assumption, in the
computer experiments reported in this paper, no genotype
or sex had a selective advantage over others with respect to
the component of natural selection expressed in terms of this
survival component of the formulation for ages π‘₯ = 0, 1,
2, . . . , 30.
Another aspect of the formulation governing the survival
of individuals was that of the one-parameter Makeham component, taking into account deaths due to accidents at all ages.
The positive parameter of this component was denoted by 𝛼1 ,
and for each sex and genotypes within sexes, this parameter
was assigned numerical value 𝛼1 = 0.001, which has also been
used by others in connection with parametric models of
human mortality. The technical details concerning this component are expressed in (42) and (43). The survival component in the formulation for ages π‘₯ = 31, . . . , π‘Ÿ was described
in terms of the two parameter risk function in (44). As was
shown in the discussion following this equation, the positive
parameters 𝛼2 and 𝛽2 of the Gompertz distribution may be
expressed in terms of the mode and standard deviation of
this distribution. For each genotype within sexes, the mode
and standard deviation of this distribution were assigned the
values π‘š2 = 50 and 𝜎2 = 10 years, respectively. In summary,
the procedure just described, entailing the assignment of
values of the parameters that were the same for both sexes
and genotypes within each sex, was the implementation of
an approach to quantifying the idea that forces governing the
evolution of the age structured population were neutral with
respect to survival by age.
The last component of the formulation that requires
assignments of values, expressed in terms of non-negative
integers, is the elements of the 3 by π‘Ÿ + 1 matrices X(0) and
Y(0), denoting the number of individuals by genotype 𝑖 and
age π‘₯ in the initial population at time 𝑑 = 0. In all experiments
reported in this paper, it was assumed that the initial
population for both sexes consisted only of individuals of
genotype 1 = 𝐴𝐴, so that the genotypes 2 = π΄π‘Ž and 3 = π‘Žπ‘Ž
could arise in an evolving population only as a result of
the allele 𝐴 mutating to allele π‘Ž, see (2). According to the
“Out of Africa Hypothesis,” the people who now make up
13
the current populations of Europe, Asia, Australia, and the
Americas are descendents of a group of our species who
emigrated out of Africa about 60,000 years ago. If a reader
is interested in a detailed account of this hypothesis by a
paleoanthropologist, it is suggested that the video course
and book are Hawks [14] of consulted. It is also thought by
many that this group of emigrants consisted of a rather small
number of individuals with a total population size ranging
from the hundreds to at most a few thousands. To simulate
this initial founder population, whose size was uncertain, a
random number generator was used to estimate the number
of individuals of each age and sex for genotype 1, and the
number 0 was assigned to all ages for individuals of genotypes
2 and 3. A broad objective of the computer simulation experiments reported in this paper is to study the evolution of the
mutant genotypes π΄π‘Ž and π‘Žπ‘Ž in an age structured population
evolving from the ancestral genotype 𝐴𝐴.
Various procedures will be used to statistically summarize
data generated in Monte Carlo simulation experiments. These
methods will not be described here, but if a reader is interested in a detailed overview of these methods, it is suggested
that the paper of Mode and Gallop [15] is consulted. With
a view towards complete openness as to the generation of
random numbers used in the Monte Carlo simulation experiments reported in this paper, it is appropriate to mention the
random number generator used in these experiments is that
for 32-bit word computers designed by Deng and described
in Mode and Gallop, see Section 2 of that paper. Further discussion of issues concerning random number generator may
be found in the papers by Deng and Lin [16], L’Ecuyer and
Touzin [17], and Deng [18]. But, unlike the 32-bit word
computers that the Deng random number generator was
designed for, the word length for computer used in the Monte
Carlo simulation experiments reported in this paper was a
64-bits. Because the developers of the version 11 of the APL
2000 software used to write programs to implement the age
structured process under consideration did not include a
default 64-bit word random number generator, it was necessary for users to implement such a generator. However,
to implement one of the 64-bit random number generators
described in Deng would have required a rather long period
of time to develop the necessary APL code in way, such that
the times taken to complete Monte Carlo simulation experiments would be satisfactory. Consequently, this developmental task will be postponed to some future date. It is felt,
however, that even if the software for a 64 bit random number
generator would have been developed, from a qualitative
point of view, the results of the Monte Carlo simulation experiments reported in this paper would not have been significantly different, due to, among other things, the long period of
the random number generator used in these experiments, and
it verified randomness properties.
Collections of human mortality data are also sometimes
useful in choosing numerical input to age structured stochastic processes. In this connection, the collection of data in
the book by Alderson [19] may be useful. A search for historical human mortality data on the internet may also yield
informative information.
14
7. On the Emergence of
Rare Mutations-Differing Deterministic
and Stochastic Predictions
The objective of the computer simulation experiment
reported in this section was to compare the predictions of the
embedded deterministic model with those of the stochastic
process with respect to the emergence of rare mutations. To
simulate the occurrence of rare mutations, it was assumed
that the conditional probability that the ancestral allele 𝐴
mutated to allele π‘Ž per meiosis was πœ‡12 = 10−10 , and the
conditional probability of the back mutation π‘Ž → 𝐴 was
assigned the value πœ‡21 = 10−11 . These assignments were
based on anecdotal evidence gleaned in conversations with
microbiologists, who studied the evolution of a bacterial population for thousands of generations. The mutation studied
in the experiment were rare in the sense that in previous
experiments it was assumed that probabilities of mutation
were in the interval (10−8 , 10−5 ) per cell division or, in the case
of diploids, per meiosis. It was also assumed that genotypes
of individuals carrying the mutant allele had a reproductive
advantage over individuals with the ancestral genotype. To
quantify this idea, the potential number of offspring of
females of each of the three genotypes (1 = 𝐴𝐴, 2 = π΄π‘Ž,
and 3 = π‘Žπ‘Ž) throughout their fertile years were assigned the
values (πœ† 1 , πœ† 2 , πœ† 3 ) = (6, 8, 10). Given these assignments, it
was expected that the population would grow rapidly so as
to increase the likelihood of a rare mutation being realized in
a Monte Carlo simulation experiment.
It was also assumed that mating was random in the sense
that females of any genotype when choosing a male sexual
partner had no preferences with regard to the genotype of
her male partner. To quantify this assumption it was assumed
that all the elements on the 3 × 3 matrix A𝑓 of acceptance
probabilities in (8) were assigned value 1, but it may be noted
that any constant probability would suffice for the case that
female mating preferences were neutral with respect to the
genotype of the male. With regard to the carrying capacity of
the environment, it was assumed that the beta parameters for
each genotype within each sex had the constant value 𝛽−7 . It is
of interest to note that this value is greater than the probability
πœ‡12 = 10−10 of a rare mutation, which raised the question
as to whether the carrying capacity of the environment was
sufficiently large to assure that at least one rare mutation
would actually be observed in a Monte Carlo simulation
experiment.
Given these assigned values of the parameters of the
model, a preliminary experiment was conducted using the
embedded deterministic model to simulate the evolution of
an age structured population for 5,000 years. The numerical
operations entailed in an experiment with the embedded
deterministic model consist essentially of multiplications and
additions, so that even for mutations with small probabilities
of occurrence, the number of mutant genotypes will eventually exceed one for the case of high fertility considered in the
experiment. As presented in Figure 1 there are the trajectories
of total population size for each of the three genotypes in the
female population for 5,000 years of evolution.
International Journal of Stochastic Analysis
As can be seen form this figure, the total size of the initial
ancestral population of females, consisting only of individuals
of ancestral genotype 1 = 𝐴𝐴, reached the carrying capacity
of the environment within 500 years of evolution. But, after
about 3,000 years of evolution, the total size of population of
individuals of the ancestral genotype began a steep decline as
the total numbers of the females of the mutant genotypes 2 =
π΄π‘Ž and 3 = π‘Žπ‘Ž grew to significant numbers. Sometime before
4,500 years of evolution, the total number of individuals of
genotype 2 starts a decline, as the total number of individuals
of genotypes 3 rose to predominance in the population. This
rise was due to the assumption that females of this genotype
had a selective advantage over the other genotypes due to a
higher level of reproductive success as quantified in terms
of the numerical values of the parameters (πœ† 1 , πœ† 2 , πœ† 3 ) =
(6, 8, 10). That the trajectory for the total number of individuals of genotype 3 that was still rising near the carrying capacity
of the environment suggests that this limit would be reached
shortly after 5,000 years of evolution.
To check whether the predictions of a sample of the
process would be such that the trajectories computed using
the embedded deterministic model could be interpreted as
measures of central tendency for the sample functions of the
stochastic processes, a Monte Carlo simulation experiment
was run with an evolutionary time span of 2,000 years with 50
replications. In preliminary experiments with 5 replications,
it had been observed that within 2,000 years of evolution
genotypes carrying rare mutant alleles had been realized in
the simulated data. A decision was, therefore, made to run
a confirmatory Monte Carlo simulation experiment with 50
replications of 2,000 years of evolution, which 28 took about
30 hours of computer time to complete the experiment. The
reason for computing only 50 realizations of the process was
purely practical, for it had been decided to compute 100
realizations of the process, and it would have taken about
60 hours of computing time to complete the experiment.
Moreover, the risk of power outage, resulting in the loss of
simulated data, during such a time period was significant
for the home computer used to carry out the experiments
reported in this paper. If a reader is interested in a more in
depth discussion of the issues that arise in choosing the number of replications for a Monte Carlo simulation experiment,
the paper by Mode et al. [13] may be consulted.
Whenever a Monte Carlo simulation experiment is carried out based on an age structured framework with several
genotypes, a plethora of simulated data is produced, which
gives rise to problems of presenting the results of such
experiments in an informative manner. In this section,
attention will be focused on the evolution of the population
of individuals of genotype 3 = π‘Žπ‘Ž in which each individual
carries two copies of the rare mutant allele π‘Ž. As presented in
Figure 2 there are the Min, 𝑄50 , and Max trajectories of the
total number of individuals of genotype 3 for 2,000 years
of evolution from a founder population consisting only of
individuals of the ancestral genotype 1 = 𝐴𝐴.
As can be seen from this figure, significant numbers of
individuals of mutant genotype 3 = π‘Žπ‘Ž appeared between
1,000 and 1,200 years of evolution, but, as can be seen
from Figure 1, significant numbers of this genotype were not
International Journal of Stochastic Analysis
15
×107
18
Number of individuals
16
14
12
10
8
6
4
2
0
0
500 1000 1500 2000 2500 3000 3500 4000 4500 5000
Years
Genotype 1
Genotype 2
Genotype 3
Figure 1: Trajectories of the total population size of the three female genotypes as computed using the embedded deterministic model.
×107
14
Number of individuals
12
10
8
6
4
2
0
0
200
400
600
800 1000 1200 1400 1600 1800 2000
Years
Max
Min
Q50
Figure 2: Estimated quantile trajectories of the total number of females of mutant genotype 3.
realized in the population until somewhere between 3,000
and 4,000 years of evolution according to the predictions of
the embedded deterministic model. Consequently, this is a
clear cut case in which the predictions of the timing of the
emergence of a mutant genotype differs significantly as can be
seen by comparing the trajectories of the stochastic process
and the embedded deterministic model. If a reader is interested in other examples in which the predictions of the
stochastic process differ significantly from those of the
embedded deterministic model, the papers of Mode et al.
(2011) [12, 13] may be consulted.
When an age structured model is under consideration,
most of the simulated data arising in a Monte Carlo simulation experiment is in the form of arrays in which the number
of individuals of each age is recorded. Furthermore, when
many replications of realizations of the stochastic process are
computed, it becomes necessary to statistically summarize
the data in an informative manner. In the experiment under
consideration the mean number of individuals for each
age group and its standard deviation were computed. For
example, let πœ‡Μ‚π‘“ (𝑑; π‘₯) denotethe estimated mean number of
individuals of age π‘₯ in the female population in year 𝑑 of a
Monte Carlo simulation experiment, and let πœŽΜ‚π‘“ (𝑑; π‘₯) denote
the estimated standard deviation. Then for year 𝑑 = 300, the
graph of the estimates πœ‡Μ‚π‘“ (𝑑; π‘₯) is plotted in the upper panel of
Figure 3 for ages π‘₯ = 0, 1, 2, . . . , 50 and females of genotype 3.
Similarly, in the lower panel of this figure the estimates of the
standard deviations by age for this genotype are plotted.
From these two graphs, it can be seen that the graph
of number of individuals by age declines rapidly as age π‘₯
increases, which is a signature of rapidly growing population
with quite high levels of mortality. It is also of interest to note
that the number of individuals of the mutant genotype 3 = π‘Žπ‘Ž
was present in the population, with numbers ranging from
16
International Journal of Stochastic Analysis
Year 300 age
Number of individuals
Number of individuals
25
20
15
10
5
0
0
5
10
15
20
25
30
35
40
45
50
9
8
7
6
5
4
3
2
1
0
Year 300 standard deviation
0
5
10
15
20
Age
(a)
25 30
Age
35
40
45
50
(b)
Figure 3: Graphs of the mean and standard deviation by age for females of genotype 3 in year 300.
the low single digits to 25, within at least 300 years of simulated evolution, which illustrates that if one replied solely on
the observing the graph of the total number of females with
this genotype, this earlier appearance of the mutant genotype
would have been missed. From the lower panel of this figure,
it can be seem from the graph of the standard deviation by
age is quite irregular for those ages representing the young,
but these fluctuations level off age increases. The magnitude
of the values of the standard deviation by age are indicators
of the variations among the 50 replications of the process, particularly for ages less than 10 years. As presented in Figure 4
there are the graphs for the mean number of individuals by
age along with the corresponding standard deviation for year
𝑑 = 2, 000 in the Monte Carlo simulation experiment for
females of mutant genotype 3.
From these graphs, it can be seen in the upper panel of
the figure that, in year 2,000 of the Monte Carlo simulation
experiment, the population of infants, π‘₯ = 0, and ages near
that of infants had reached a population size of nearly 9 × 106 ,
that is, a size approaching the carrying capacity of the
environment. In the lower panel, it can be seen that the
standard deviations for these early years of life were near the
large value 6 × 105 , which is indicative of a rather high level of
variation among the 50 replications of Monte Carlo realizations of the process for these ages. As age increases, the
numerical values depicted in the graphs of the mean and standard deviations for the number of individuals decrease rather
rapidly, which again was a signature of high levels of mortality in the population across the ages. It should be noted,
however, that there is a steep drop in the graphs for both the
means and standard deviations at age 30, where it was
assumed that mortality beyond this age mortality was governed by a Gompertz type risk function. There are methodological reasons for this drop. A reader may recall that the
potential probability that an infant survives to age 30 was
assigned number 0.99. It seems plausible, therefore, that this
potential probability should be lowered. On the other hand,
this drop also suggests that different values for the two parameters of the Gompertz distribution may also help level of
f the steep decline in these graphs at age 30. While developing
the formulation of the age structured stochastic process
described in this paper, the most difficult problem encountered in this initial period of model development was the
choices made in constructing and choosing the parametric
components of the model governing the survival or mortality
of individuals. It should also be noted that for ancient populations of man considered in the experiments reported in this
paper, there is little or no data on mortality of individuals
apart from the fossil record that suggest through measurements of radioactive that maximum ages of those individuals
whose skeletons survived was in the range from 40 to 50
years. In passing, it should be mentioned that it is doubtful
whether the changes in mortality parameters just suggested
would have changed the differences in estimated timings for
the emergence of rare mutations as observed in the realized
predictions of the embedded deterministic and the stochastic
process used in the simulation experiments reported in this
section.
8. Sexual Selection in Evolving
Age-Structured Populations
Among the many questions addressed by Darwin in his
seminal work on evolution was that of the impact of sexual
selection on the evolution of two-sex-species such as man,
other mammals, and birds. A historical sketch of these ideas
is beyond the scope of this paper, but if a reader is interested
in the subject, a search of the internet may yield useful results,
or one could also consult text books on population and evolutionary genetics, which are available in university libraries as
well as for sale on the internet. For example, the book on population genetics by Hartl and Clarke [20] devotes some space
to sexual selection. Essentially all the literature on population
genetics in which models of sexual selection are presented
are confined to deterministic paradigm, but in the book
by Mode and Sleeman [9] a formulation of a two sex selfregulating branching process is the subject of chapter 11 along
with reported results of some Monte Carlo simulation experiments. A limitation of the model considered in that chapter
was that the age structure of an evolving population was
not included in the formulation. Also this chapter contains
the results of a Monte Carlo simulation experiment designed
International Journal of Stochastic Analysis
Year 2000 age
×105
6
Number of individuals
Number of individuals
×106
9
8
7
6
5
4
3
2
1
0
0
17
5
10
15
20
25
30
35
40
45
50
Age
(a)
Year 2000 standard deviation
5
4
3
2
1
0
0
5
10
15
20
25 30
Age
35
40
45
50
(b)
Figure 4: Graphs of mean and standard deviation by age for females of genotype 3 in year 2,000.
to study the rise of a recessive mutant allele in an evolving
population when the character or trait it expressed was
subject to sexual selection. In that chapter, sexual selection
was characterized in terms of matrices of acceptance probabilities regrading both females and males as in (8). In Monte
Carlo simulation experiments, it was shown that when it was
assumed that the mutant genotype 3 = π‘Žπ‘Ž was preferred as a
sexual partner over the other two genotypes, it would eventually arise to predominance in a population, but its steep rise
to predominance would not occur until the total population
size of individuals of genotype π‘Žπ‘Ž had reached a threshold. In
this experiment the embedded deterministic model was a fair
predictor of the central tendencies of Monte Carlo realization
of the sample functions of the process, but in a related experiment, Mode et al. [12], the embedded deterministic model
predicted a rare mutation that had sexual selective advantage would eventually become predominant in an evolving
population, but in the corresponding Monte Carlo simulation
experiment, the mutant recessive genotype did not appear,
so that the predictions of the deterministic and stochastic
models were not consistent.
In the Monte Carlo experiment reported in this section,
the number of years of evolution under consideration was
10,000. Among the reasons underlying this choice was that
about 10,000 years ago, in geographical regions of the world,
such as Egypt and Mesopotamia, man began to change from
a hunter-gather means of living to a more sedentary form of
life, the development of agriculture that resulted in a more
dependable source of food. At the outset it was known that if
50 Monte Carlo replication of an experiment were computed,
the span if time taken to complete the experiment would be
too long, given the private computer facilities available to
carry out the experiment. So a decision was made to compute
only 10 replications of 10,000 years of simulated evolution,
which took about 30 to 40 hours of computer time. Because
it was observed in some experiments not reported in this
section that a significant number of individuals may be alive
at age 50, a decision was made to set the maximum age at 80
years, which required the computer to process lager arrays
that in turn would increase the expected time needed to
complete an experiment. To take into account that sexual
selection was operative in the experiment, the matrix of
acceptance probabilities for females was chosen as
0.1 0.9 0.9
A𝑓 = (0.1 0.9 0.9) .
0.1 0.9 0.9
(92)
According to this selection of numerical values, the lower
case allele π‘Ž, which arose from the mutation 𝐴 → π‘Ž, was
dominant with respect to sexual selection. Interestingly, in
preliminary experiment with the embedded deterministic
model, it was observed that if the values of the matrix A𝑓 were
chosen, such that the values in the first and second columns
were assigned the constant 0.1, and the values in the third
column were assigned the constant value 0.9, then individuals
of the mutant genotype π‘Žπ‘Ž would eventually become predominate in the population, but if these values were used
in a Monte Carlo simulation experiment, then the computer
time required to complete an experiment would be too long.
Thus, for this reason the values in the matrix A𝑓 were chosen,
because the study of the emergence of a dominant mutation
would also be of interest in its own right.
By assumption, selection was neutral with respect to
reproductive success so that the potential expected number
of offspring contributed to the population by each female
through out her lifespan was assigned the number 4 for all
three genotypes. Observe that under this assumption the pace
of evolution was expected to be slower than in the experiment
reported in Section 6. To assure that an individual with a
mutant gene would appear in a simulated population within
an acceptable number of years, the probability of the mutation
𝐴 → π‘Ž per meiosis was assigned the value πœ‡12 = 10−5 , and
the probability of the back mutation π‘Ž → 𝐴 per meiosis was
assigned the value πœ‡21 = 10−6 . The carrying capacity of the
environment for each genotype for both sexes was assigned
the value, expressed in terms of the beta parameters, and had
the constant value 10−9 . All parameters not mentioned in
the above discussion were assigned the same values as those
in the experiment reported in Section 6. Moreover, it was
assumed that the initial population of females and males was
all of genotype 𝐴𝐴 and assigned the same initial numbers as
those used in the experiment reported in Section 7.
18
International Journal of Stochastic Analysis
Number of individuals
×109
7
6
5
4
3
2
1
0
0
100
200
300
400
500
600
700
800
900 1000
Years
Genotype 1
Genotype 2
Genotype 3
Figure 5: Deterministic trajectories of total numbers for each of the three male genotype under sexual selection.
As displayed in Figure 5 there are the trajectories of
the three genotypes, expressed in terns of the numbers of
individuals per genotype as functions of time, for the first
1,000 years as computed using the embedded deterministic
model.
As can be seen from these trajectories, from an initial
population of 2,693 males of genotype 𝐴𝐴, within 500 years of
simulated evolution the mutant genotype π‘Žπ‘Ž starts to become
predominant in the population, and by 1,000 years the
number of individuals of this genotype has risen to over 6 ×
109 but has not yet reached the maximum carrying capacity
of the environment. As one would expect, the number of
individuals of genotype π΄π‘Ž occurs in significant numbers in
the time interval from 500 to 1,000 years but is far below
the number of individuals of genotype 3 throughout this
time period. An inspection of the simulated data in the years
preceding 10,000 years indicates that a balanced polymorphism had been reached in which the genotype π‘Žπ‘Ž was
predominant with genotypes 𝐴𝐴 and π΄π‘Ž present in smaller
numbers. Unlike the cases in which the recessive genotype π‘Žπ‘Ž
is favored by sexual selection presented in Mode and Sleeman
[9] and Mode et al. [12], there is gradual rise but no steep rise
to predominance in the population of individuals of genotype
π‘Žπ‘Ž.
From now on, due to limitations on space, attention will
be focused on the predominant genotype 3 with respect to
total population size for males of this genotype. As presented
in Figure 6 there are graphs of the Min, 𝑄50 , and Max
trajectories as estimated from the Monte Carlo realizations
of the process as well as the deterministic trajectory (DET)
for the total number of individuals of mutant genotype π‘Žπ‘Ž for
the first 200 years of 10,000 years of simulated evolution.
As can be seen from this graph, after about 140 years
of evolution, the number of individuals of genotype 3 =
π‘Žπ‘Ž had risen to significant numbers, but after 150 years of
evolution, the variation among the 10 realizations of the
process increased as can be seen from the distances among
the Min, 𝑄50 , and Max trajectories at year 200. Interestingly,
however, at year 200, there is a large distance between the
deterministic trajectory (DET) and the Max of the stochastic
process. This observation demonstrates that even with the
high probabilities of mutation used in the experiment under
consideration, the embedded deterministic model tends to
underestimate the number of mutations that actually arise
among the simulated realizations of the process in the first
200 years of simulated evolution. But, as shown in the next
figure, after a sufficient amount of evolutionary time, the four
trajectories in Figure 6 tend to coalesce.
As can be seen from this figure on the scale of the vertical
axis, the distance between the Min and DET trajectories is
close up to about 2,500 years of evolution. But, in the years following 2,500, the four trajectories coalesce and remain near
one another at 4,000 years at evolution at least on the scale
of the vertical axis. In the years following 4,000, it was seen,
by inspecting the statistical summarizations of the Monte
Carlo realizations of the process as well as that for embedded
deterministic model, that the pattern observed at 4,000 years
of evolution continued to 10,000 years. For the class of
stochastic processes under consideration, the trajectory of the
embedded deterministic model was quite typical in the sense
that in the early years of an experiment the predictions of
the deterministic model may differ significantly from those
of the process, but in the long term, with the exception of a
few cases, the trajectories of the stochastic process and that
of the deterministic model tend to coalesce. But, the waiting
time to coalescence may indeed be very long when expressed
in terms of human life spans (Figure 7).
Because we are dealing with an age structured process, it
is appropriate to view graphs of a simulated age distribution
at some selected year of the experiment. As presented in the
upper panel of Figure 8 ther is a graph for males of the mutant
genotype 3 = π‘Žπ‘Ž, at year 200 of the experiment, and the lower
panel contains the graph of the standard deviations for each
age as estimated from the simulated data.
International Journal of Stochastic Analysis
19
×104
2
1.8
Number of individuals
1.6
1.4
1.2
1
0.8
0.6
0.4
0.2
0
0
20
40
60
80
100
120
140
160
180
200
Years
DET
Max
Min
Q50
Figure 6: Estimated Min, 𝑄50 and Max trajectories of the process and the deterministic trajectory (DET) for the total number males of
genotype 3.
×109
7.1
Number of individuals
7
6.9
6.8
6.7
6.6
6.5
6.4
6.3
6.2
1000
1500
2000
2500
3000
3500
4000
Years
Min
Q50
DET
Max
Figure 7: Stochastic and deterministic trajectories for the total numbers of males of genotype 3 for years from 1,000 to 4,000 of evolution.
250
Number of individuals
Number of individuals
600
500
400
300
200
100
0
10
20
30
40
50
60
70
80
200
150
100
50
0
10
20
30
Age
(a)
40
50
60
70
80
Age
(b)
Figure 8: Estimated mean age distribution and standard deviations by age for the number of males of genotype 3 in year 200.
International Journal of Stochastic Analysis
Number of individuals
20
600
500
400
300
200
100
0
0
200
10 20
30
40 50
60 70
Ages
80 0
100
50
150
Years
Figure 9: Three dimensional view of the evolution of the mean age distribution for males of genotype 3.
From the upper panel of the figure, it can be seen that
by year 200 of the experiment the number of males of the
mutant genotype 3 had risen to sufficiently high numbers to
form a nearly monotone decreasing mean age distribution
that one would expect in a population evolving with high
levels of mortality and fertility as realized in a large number of
births and deaths each year. But as can be seen from the lower
panel for the graph of standard deviations by age, there was
a considerable amount of variation by age around the mean
age distribution.
It is also of interest to view a three-dimensional version
of the evolving mean age distribution for males of mutant
genotype 3 = π‘Žπ‘Ž. As presented in Figure 9 there is a three
dimensional view of the evolving age distribution for males
of genotype 3 for the first 200 years of evolution.
As can be seen from this figure, the π‘₯-axis denotes age,
the 𝑦-axis denoted time in years, and the 𝑧-axis denotes the
number of individuals. Unlike the conventional displays of an
evolving age distribution, in which the numbers are normalized such that they all are in the interval [0, 1], in this figure
the number of individuals for each age is depicted. From an
inspection of this figure, it can be seen that for the first of
about 140 years of evolution the age distribution is a flat plane,
indicating that the number of individuals of mutant genotype
3 was either zero at each group or a number too small to depict
in a graph. However, sometime after 140 years of evolution the
number of individuals of mutant genotype 3 increases in the
population, and by year 200 the age distribution is typical of
a fast growing population with high levels of mortality as is
also shown in the upper panel of Figure 4.
A question that naturally arises is what is the threedimension form of the three dimension standard deviation
surface corresponding to the graph presented in Figure 9?
Presented in Figure 10 is a graph of this evolving surface.
The π‘₯- and 𝑦-axes are the same as those in Figure 9, but
on the 𝑧-axis the standard deviations are expressed in terms
of units of individuals. Observe that shape of the standard
deviation surface is very similar to that for the age distribution in Figure 9, but the range of values for the 𝑧 in Figure 10
is from 0 to 200; whereas in Figure 9 the range of the 𝑧 axis is
from 0 to 600. Thus, at its maximum, the 𝑧 axis in Figure 10
is about only one-third as that for that axis in Figure 9.
A question that naturally arises is to what extent would
the above figures change, if 50 to 100 realizations of the
process had computed rather than the small sample of ten? It
seems likely, based on other Monte Carlo simulation experiments, that from a qualitative perspective the estimated trajectories, as estimated from a Monte Carlo simulation experiment, would not change significantly, but if a larger number
of realizations of the process were computed, it is likely
more outliers would appear in the simulated data. Thus, for
example, the Max at 200 years may have risen to a larger
number than that displayed in Figure 6. Moreover, the
standard deviations by age in the lower panel of Figure 8
would very likely be lager, indicating a greater range of
variation among the simulated realizations of the process.
Another question that arises is what is the mathematical
basis underlying the apparent stochastic-balanced polymorphism that was observed in the latter years of 10,000 years
of simulated evolution? It can be shown that the two-sex age
structured process under consideration is a Markov chain in
discrete time with a state space consisting of pairs (π‘₯, 𝑦) if
3 × (π‘Ÿ + 1) matrices of non-negative integers, where the rows
represent genotypes and columns ages of individuals and π‘₯
and 𝑦 females and males, respectively. In the absence of immigration, the state (0, 0) is an absorbing state, indicating extinction of the population. Given that extinction that did not
occur, it appears that the process had converged to a quasistationary distribution of the Markov chain. No attempt
will be made here to prove this assertion, but if the reader is
interested in a derivation of a quasi-stationary distribution for
a Wright-Fisher process with two absorbing sates, chapters 5
and 6 of Mode and Sleeman [9] may be consulted.
9. Discussion, Further Reading,
and Future Experiments
In the two experiments reported in this paper dealing with
selection, only one component of selection was dealt with
in each experiment. For example, in one experiment dealing
with the component of reproductive success, if females of one
genotypes contributed to average a greater expected number
of offspring than females of other genotypes, this selective
advantage was sufficient for individuals of this genotype to
become predominant in the population in the long run. On
the other hand, if individuals of one genotype or a dominant
phenotype in males was preferred by females as sexual
partners for the case of sexual selection, this preference was
Number of individuals
International Journal of Stochastic Analysis
250
200
150
100
50
0
21
200
15
150
0
10
20 30 40
50
Ages
50
60 70 80 0
100
Yea
Years
Figure 10: Three dimensional view of the evolving standard deviation surface for males of genotype 3.
sufficient for that genotype or phenotype to become predominant in a population in the long run. In both these
experiments, all other components of selection were neutral
in the sense that for each component not under consideration
each genotype was assigned the same parameter values. In
an experiment not reported in this paper, it was also shown
that if individuals of genotype 3 = π‘Žπ‘Ž for either sex had
a competitive advantage over the other two with respect
to competition for resources, then individuals of genotype
3 would eventually become predominant in an evolving
population.
A component of selection that has so far not been studied
using the formulation under consideration is that of different
survival patterns by age as formulated in terms of the parametric birth cohort survival function described in a previous
section. For example, if females of one of the genotypes were
very good mothers in the sense that most of their children
survived to reproduce, then individuals of that genotype
would have a selective advantage over individuals with other
genotypes. It would be of interest to study this component
of selection, but such computer experiments will be deferred
to future research. Experiments dealing with combinations of
components of selection would also be of interest. There is at
least one other class of experiments that would be of considerable interest, namely, a case in which a human population
evolves during a phase of evolution in which life is maintained
by hunting animals, fishing, and gathering plant foods. Under
such conditions the carrying capacity of the environment
would be less than that considered in the experiments
reported in this paper, in which it was tacitly assumed that
agriculture had advanced to the point at which the environment could support population sizes of the order 109 . In a
hunter-gather society, however, a plausible estimate of the
carrying capacity of the environment may be one or two
people per square mile. As an illustrative example, suppose
that the environment could support two people, a female and
male, per square mile, then a population of 1,000 females and
males would require an area of 500 hundred square miles to
support such a population. Under such conditions, one would
expect that the pace of evolution would be very slow, because
a population would not attain a total population size that
would make it probable that rare beneficial mutations would
occur and become predominant in a population.
Even though it was mentioned that genetic evidence suggests that the present world human population are all descendants of a small band of our species that migrated out of
Africa about 60,000 years ago, little attention in this paper was
given to the evolution of the world population over time. In
this connection, the book by McEvedy and Jones [21] would
be of interest in which estimates of population size are given
from about 400 years before the common era (BCE) to 1975.
Another book of interest is that of McKeown [22], which
traces the evolution of the European population, and in
particular the population of England and Wales, where the
rise modern population did not start to occur until about
1750. As pointed out in this book, the evolution of modern
medicine played a significant role recent rise of population.
The book by Hostetler [23] is also of interest, because it
provides a documentation of the Hutterite society over a little
more than a century, since they migrated to South Dakota,
USA, in the 19th century and have expanded their population
to various communities in Montana and in neighboring
prairie provinces of Canada. The Hutterites are of particular
interest, because, as a population with records spanning over
a century, they experienced one of the highest levels of
fertility in recorded human history.
As this paper is being written, the estimated size of the
world human population is at about 7 billion, and as this population grows, a question that naturally arises is how many
people can the earth support? In a definitive book with this
title, Cohen [24] addresses this question in depth. Among the
many issues addressed in this book are questions regarding
the carrying capacity of the earth. It is very difficult to provide
usable answers to this question, but it may turn out that the
approach used to formulate the notion of a carrying capacity
suggested in this paper for an age structured population may
be useful in obtaining experimental answers to this question.
But it should also be mentioned that other approaches yet to
be invented may also be useful in an experimental approach
to estimate the carrying capacity of the earth.
It should also be mentioned that climate change has also
played an important role in human evolution, and moreover,
catastrophic events during the past have led to an abrupt
climate change. An example of such a change was the eruption
of the mega Toba volcano on the island of Sumatra about
70,000 to 75,000 years ago. According to Rampino and
22
Ambrose [25] during this mega eruption vast quantities of
materials emanating from the earth’s crust were thrust into
the earth’s atmosphere resulting in a “nuclear winter” that
cooled the earth to the extent that there were frosts in the
tropics for five to ten years that resulted in population crash
of animals and plants, including populations of humans.
According to this view, all human on the earth today are
descendants of a relatively small population that survived
the Toba eruption. A common term used by evolutionists to
describe small population size during evolution of a species
is “bottleneck.”
The short list of references cited after that are informative
but not current. It is, therefore, of interest to search for more
recent publications on the evolution of world population.
When this phrase was typed into an internet search engine,
a total of about 29,100,000 sites were listed. Within this large
listing of sites, a large number of technical papers were also
listed that were beyond the scope of this paper. But, there were
two significant papers that were within the scope of this paper.
One on these papers was that of Hawks et al. [26] in which
evidence for a population bottleneck was investigated during
the Pleistocene period of human evolution that included the
Toba eruption. Another paper was that of Stearns et al. [27] in
which the measurement of selection was investigated, using
data from contemporary populations. In comparison with
an average human lifespan, evolution of a population occurs
over very lager periods of time expressed in terms of generations. Therefore, when a contemporary set of data is used in
attempts to measure selection, it is necessary to investigate the
heritability of a trait measured from generation to generation,
and as this paper is read it is recommended that a reader be
mindful of the concept of heritability. For if an expression
genotype or phenotype is not governed by some region (or
regions) of a genome, the trait cannot be passed on from
generation to generation and thus cannot be subject to natural
selection.
In the evolving literature on evolutionary and population
genetics, two themes have been received considerable attention, namely, genome wide searches for signatures of selection
and methods for simulating models of genomes. In genome
wide searches, a problem that often arises is that of developing
statistical criteria to differentiate among regions of a genome,
such that selection has acted in the recent evolutionary past
from regions that presumedly have evolved under neutral
evolution. Two papers describing methods for detecting signatures of positive selection are those of Sabeti et al. [28] and
Grossman et al. [29]. Additional references on this subject
may be found in the cited literature in Mode and Gallop [15].
As yet methods have not been developed to model the evolution region of a genome with respect to mutation and selection within the framework of the age structured process
under consideration, but if a reader is interested in a brief
review of published papers dealing with simulating the
evolution of a genomic region, the papers cited in chapter
14 of Mode and Sleeman [9] may be consulted. Among these
papers there are Chadeau-Hyam et al. [30], Hoggart et al. [31],
and Hoggart et al. [32]. Preliminary algorithms for simulation
various types of mutations, such as nucleotide substitutions,
International Journal of Stochastic Analysis
deletions, insertions, and repeats are also included in chapter
14 of Mode and Sleeman.
References
[1] B. Charlesworth, Evolution in Age-Structured Populations, vol.
1 of Cambridge Studies in Mathematical Biology, Cambridge
University Press, Cambridge, Mass, USA, 1980.
[2] T. E. Harris, The Theory of Branching Processes, Die Grundlehren
der Mathematischen Wissenschaften, Bd. 119, Springer, Berlin,
Germany, 1963.
[3] K. B. Athreya and P. E. Ney, Branching Processes, Springer, New
York, NY, USA, 1972.
[4] P. Jagers, Branching Processes with Biological Applications, John
Wiley & Sons, London, UK, 1975.
[5] C. J. Mode, Multitype Branching Processes: Theory and Applications, American Elsevier, New York, NY, USA, 1971.
[6] S. Asmussen and H. Hering, Branching Processes, vol. 3 of
Progress in Probability and Statistics, Birkhäuser, Boston, Mass,
USA, 1983.
[7] M. Kimmel and D. E. Axelrod, Branching Processes in Biology,
vol. 19 of Interdisciplinary Applied Mathematics, Springer, New
York, NY, USA, 2002.
[8] P. Haccou, P. Jagers, and V. A. Vatutin, Branching Processes:
Variation, Growth, and Extinction of Populations, Cambridge
Studies in Adaptive Dynamics, Cambridge University Press,
Cambridge, UK, 2007.
[9] C. J. Mode and C. K. Sleeman, Stochastic Processes in Genetics
and Evolution: Computer Experiments in the Quantification of
Mutation and Selection, World Scientific Publishing, Hackensack, NJ, USA, 2012.
[10] C. J. Mode, Stochastic Processes in Demography and Their Computer Implementation, vol. 14 of Biomathematics, Springer,
Berlin, Germany, 1985.
[11] C. J. Mode and C. K. Sleeman, Stochastic Processes in Epidemiology, HIV/AIDS, Other Infectious Diseases and Computers, World
Scientific, Singapore, 2000.
[12] C. J. Mode, T. Raj, and C. K. Sleeman, “Monte Carlo implementations of two sex density dependent branching processes and
their applications in evolutionary genetics,” in Applications of
Monte Carlo Methods in Biology, Medicine and Other Fields of
Science, C. J. Mode, Ed., pp. 273–296, INTECH, 2011.
[13] C. J. Mode, T. Raj, and C. K. Sleeman, “Simulating the emergence and survival of mutations using a self regulating multitype branching processes,” Journal of Probability and Statistics,
vol. 2011, Article ID 867493, 39 pages, 2011.
[14] J. Hawks, The Rise of Humans: Great Scientific Debates, The
Teaching Company, Chantilly, Va, USA, 2011, http://www.thegreatcourses.com/.
[15] C. J. Mode and R. J. Gallop, “A review on Monte Carlo
simulation methods as they apply to mutation and selection as
formulated in Wright-Fisher models of evolutionary genetics,”
Mathematical Biosciences, vol. 211, no. 2, pp. 205–225, 2008.
[16] L. Deng and J. Lin, “Random number generation of a new
century,” The American Statistician, vol. 54, no. 2, pp. 145–150,
2000.
[17] P. L’Ecuyer and R. Touzin, “On the Deng-Lin random number
generators and related methods,” Statistics and Computing, vol.
14, no. 1, pp. 5–9, 2004.
[18] L. Deng, “Efficient and portable multiple recursive generators
of large order,” ACM Transactions on Modeling and Computer
Simulation, vol. 15, no. 1, pp. 1–13, 2005.
International Journal of Stochastic Analysis
[19] M. Alderson, International Mortality Statistics, Facts on File,
Inc., New York, NY, USA, 1981.
[20] D. L. Hartl and A. G. Clark, Principles of Population Genetics,
Sinauer Associates, INC., Sunderland, Mass, USA, 1989.
[21] C. McEvedy and R. Jones, Atlas of World Population History,
Facts on File, New York, NY, USA, 1979.
[22] T. McKeown, The Modern Rise of Population, Academic Press,
New York, NY, USA, 1976.
[23] J. Hostetler, Hutterite Society, The Johns Hopkins University
Press, Baltimore, Md, USA, 1974.
[24] J. E. Cohen, How Many People Can the Earth Support? W.W.
Norton & Company, New York, NY, USA, 1995.
[25] M. Rampino and S. Ambrose, “Volcanic winter in the Garden of
Eden: the Toba supereruption and the late Pleistocene human
population crash,” in Symposium: Catastrophism, Natural Disasters and Cultural Change, World Archaeological Congress 4,
University of Cape Town, 1999.
[26] J. Hawks, K. Hunley, S. Lee, and M. Wolpoff, “Population
Bottlenecks and Pleistocene evolution,” Molecular Biology and
Evolution, vol. 17, no. 1, pp. 2–22, 2000.
[27] S. C. Stearns, S. G. Byars, D. R. Govindaraju, and D. Ewbank,
“Measuring selection in contemporary human populations,”
Nature Reviews Genetics, vol. 11, pp. 611–622, 2010.
[28] P. Sabeti et al., “Positive selection in the human linage,” Science,
vol. 312, pp. 1614–1620, 2006.
[29] S. Grossman et al., “A composite of multiple signals distinguishes causal variants in regions of positive selection,” Science,
vol. 327, pp. 883–886, 2010.
[30] M. Chadeau-Hyam et al., “Fregene: simulation of realistic
sequencelevel-data in populations and ascertained samples,”
BMC Bioinformatics, vol. 9, article 364, 2008.
[31] C. Hoggart et al., “Sequence level population simulations over
large genomic regions,” Genetics, vol. 177, pp. 1725–1731, 2007.
[32] C. Hoggart et al., “Genome-wide significance for dense SNP and
resequencing data,” Genetic Epidemiology, vol. 32, pp. 179–185,
2008.
23
Advances in
Operations Research
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Advances in
Decision Sciences
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Mathematical Problems
in Engineering
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Journal of
Algebra
Hindawi Publishing Corporation
http://www.hindawi.com
Probability and Statistics
Volume 2014
The Scientific
World Journal
Hindawi Publishing Corporation
http://www.hindawi.com
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
International Journal of
Differential Equations
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Volume 2014
Submit your manuscripts at
http://www.hindawi.com
International Journal of
Advances in
Combinatorics
Hindawi Publishing Corporation
http://www.hindawi.com
Mathematical Physics
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Journal of
Complex Analysis
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
International
Journal of
Mathematics and
Mathematical
Sciences
Journal of
Hindawi Publishing Corporation
http://www.hindawi.com
Stochastic Analysis
Abstract and
Applied Analysis
Hindawi Publishing Corporation
http://www.hindawi.com
Hindawi Publishing Corporation
http://www.hindawi.com
International Journal of
Mathematics
Volume 2014
Volume 2014
Discrete Dynamics in
Nature and Society
Volume 2014
Volume 2014
Journal of
Journal of
Discrete Mathematics
Journal of
Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com
Applied Mathematics
Journal of
Function Spaces
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Optimization
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Download