Modeling Population Growth and Extinction It sometimes seems that there are daily news reports about the threats of global warming leading to the extinction of species. However, most of the introductory treatments of models for population dynamics in mathematics classes typically consider only population growth models. In this article, we will construct a pair of more general models that encompass the possibility of both growth and extinction. Population Growth Models We begin with a brief review of the two most common population growth models. The simplest model that arises frequently across the mathematics curriculum at many different levels is the exponential growth model for a population P. At the calculus and higher levels, it is expressed as the differential equation P’ = aP, a > 0. At the precalculus or algebra level, it is expressed discretely using the difference equation ∆Pn = aPn , a > 1, where ∆Pn = Pn+1 - Pn. However, because no population can continue to grow in an exponential pattern indefinitely, we typically consider the inhibited growth or logistic model, which is expressed either as the differential equation P’ = aP - bP2, or discretely as the difference equation ∆Pn = aPn - bPn2. In both cases, a > 1 and b is a positive constant that is considerably smaller than a. In either form, the solution has the S-shaped logistic pattern shown in Figure 1. The horizontal asymptote L is called the limit to growth, the maximum sustainable population, or the carrying capacity of the environment. We solve for L algebraically from either the differential equation or the difference equation to get L = a/b. We can use this value for L to rewrite both models if we factor the parameter b from the equation: P’ = aP - bP2 = bP(L – P) ∆Pn = aPn - bPn2 = b Pn (L - Pn). Both model highlight the fact that the change in the population depends on both the size of the population P and the difference between the current population and the maximum sustainable level, which we can think of as the space remaining for additional members of the species. Furthermore, the inflection point seen in Figure 1, where the population is growing most rapidly, turns out to occur at a height of P = ½L = ½(a/b) because both P’ and ∆P are quadratic functions of P with negative leading coefficients. Figure 1: Graph of a logistic function Figure 1 Both models can be solved algebraically when either P’ = 0 or ∆Pn = 0 to give P = 0 and P = L. These solutions divide the t-P plane or the n-Pn plane into three distinct regions: P > L, 0 < P < L, and P < 0. Whenever P > L, both P’ < 0 and ∆Pn < 0, so that any solution that starts above the maximum sustainable population must decay down toward it. For any solution that starts between P = 0 and P = L, both P’ > 0 and ∆Pn > 0, so that the solution must be increasing. Also, whenever P < 0, both P’ and ∆Pn are also negative, and so any solution that starts below the horizontal axis must decay toward -∞ as time passes. We illustrate all of these possibilities in Figure 2. Figure 2: Different solution patterns for the logistic equation The two horizontal asymptotes, including the horizontal axis, in Figure 2 are the equilibrium levels for the solutions. The upper one at the height of L = a/b is a stable equilibrium since any solution that starts near that level converges to that Figure 2 equilibrium; the lower equilibrium at P = 0 is an unstable equilibrium because any solution that starts near it diverges from that level. Modeling the Extinction of Species We now turn to the question of how to model the extinction of species. We tend to think that, whenever a species is present in an environment, it will grow. Even if there is a catastrophic change in the local conditions – say a major volcanic eruption (such as Mount Saint Helen’s), or a significant rise in temperature to make the habitat uncomfortably warm, or even a meteor impact such as the one that apparently brought an end to the age of the dinosaurs – our image of population dynamics would suggest that the survivors will continue to breed and so expand the population. Obviously, since species do become extinct, there must be an additional biological mechanism that comes into play to change the dynamics of such a situation. In addition to a maximum sustainable population, biologists also talk about a minimum sustainable population – a level K below which there are not enough members of the species to sustain it. Thus, if the population drops below this level for whatever reason, we should expect the population to begin to decay toward zero. At a very simplistic level, one could simply assume that the process follows an exponential decay model. However, as we discuss below, this is not a reasonable pattern for realistic populations. We now attempt to extend the logistic growth model to take this decay toward extinction into account as well. To do so, we will take the above development of the logistic model and essentially reverse the steps to create the model. We now need three equilibrium levels, one for a zero population, another corresponding to the maximum sustainable population L, and a third corresponding to the minimum sustainable population K, as illustrated in Figure 3. These three equilibria divide the plane into four distinct regions. Figure 3 Figure 3: Different regions of the t-P plane for logistic-type growth and extinction Let’s think about the behavior of the solutions in each of these regions. In Region IV, where P < 0, we should expect that the solutions decay toward -∞. In Region I, where P > L, we should expect that the solutions decay toward L, as we have with the logistic model. Similarly, in Region II, where K < P < L, we should expect that the solutions rise toward L, eventually in an asymptotic manner, much as they behave with the logistic model. Finally, in Region III, where 0 < P < K, we should expect that the solutions decay toward zero. But, what should be the actual pattern for the decay in Region III? Should it be a purely concave up pattern, as with exponential decay? Picture what happens in the logistic model with solutions that start on either side of the equilibrium level L. If the initial population is significantly above L¸ the solution will drop very rapidly at first and eventually slow down as it approaches L asymptotically. If the initial population is very close to, but above, L, it will decay very slowly toward L. If the initial population is equal to L, it will remain at that level indefinitely. If the initial population is slightly below L, and above the inflection point at ½L, it rises slowly in a concave down pattern as it approaches L asymptotically. Finally, if the initial population is well below L, and also below the inflection point, it rises ever more rapidly until it passes the inflection point and then begins to slow. The change from one behavior pattern to the next happens continuously. Consequently, we should expect that, in the extended logistic model we are trying to create, there should also be a continuous change in the behavior of solutions depending on where they start with regard to the new equilibrium level at the minimum sustainable population. In particular, the assumption that the behavior of the solutions in Region III automatically follows an exponential decay pattern seems inappropriate since that presumes a sudden drop in the population if the value is slightly below K. Rather, as Figure 4 shown in Figure 4, it makes more sense to expect that the solutions will display a vertically reversed logistic shape and that there will be an inflection point level in this region as well. Figure 4: Solution patterns for logistic-type growth and for extinction We show the right-half of the t-P plane in Figure 5 along with the signs of P’ or ∆Pn needed to produce the patterns of behavior we would expect in each region. In particular, we need to analyze the signs of the factors needed for either P’ or ∆Pn in each of the four regions that will lead to these behavior patterns. Clearly, in Region I, we need P’ < 0 or ∆Pn < 0. In Region II, we need P’ > 0 or ∆Pn > 0. In Region III, we need P’ < 0 or ∆Pn < 0. Finally, in Region IV, we need P’ < 0 or ∆Pn < 0. Figure 5: Signs of P’ in each region of the t-P plane We next see how we might get these sign combinations based on our experience above with the logistic model. The extra equilibrium level of P = K suggests that we might include an additional linear factor of the form K – P in either the differential equation or the difference equation. If we do so, then in Region I, we have P > 0, L – P < 0, and K – P < 0 and their product will be positive, which does not Figure 5 give the correct behavior. We might “fix” this problem, however, by introducing a negative leading coefficient, so that we can try a differential equation of the form P’ = -αP(L – P)(K – P), where α is a positive constant. In Region I, this expression must be negative because P > 0, L – P < 0, and K – P < 0. Next, in Region II, we have P > 0, L – P > 0, and K – P < 0 and their product times –α will be positive, as desired. Similarly, in Region III, we have P > 0, L – P > 0, and K – P > 0, so that -αP(L – P)(K – P) < 0, as desired. Finally, in Region IV, we have P < 0, L – P > 0, and K – P > 0, so that -αP(L – P)(K – P) > 0, which does not give the behavior pattern we expect. We note that this cubic model is known to ecologists as the logistic model with Allee Effect [1, 2] after the biologist Warder Clyde Allee who developed this cubic model in the 1930’s and 1940’s. In order to get the signs to match up appropriately, we need to extend the Allee model by using a quartic model instead. In particular, we use P2 instead of P as the first factor in the differential equation. When we do so, we have the differential equation P’ = -αP2(L – P)(K – P) and it is easy to verify that the sign of P’ is the desired one in each of the four regions. The same is true of the corresponding difference equation model, ∆Pn = -αPn2(L – Pn)(K – Pn). Finding the Inflection Points With the logistic model, we know that the inflection point occurs when the population passes a height of ½L. We now consider the possible inflection points that arise in our quartic growth and extinction model. We expand the expression for P’ as a fourth degree polynomial in P to get P’ = -α[P4 –(K + L)P3 + KLP2]. The inflection points correspond to the points where the derivative of this quartic function of P achieves its maximum or minimum, so we need to solve where the derivative is zero: -α[4P3 - 3(K + L)P2 + 2KLP] = 0. (Note that this is not P”, which is the second derivative with respect to t.) One immediate solution is P = 0 and we note that the concavity of P’ as a function of P changes on either size of the horizontal axis, as seen in Figure 4. The remaining two inflection points correspond to the roots of the quadratic equation 4P2 - 3(K + L)P + 2KL = 0. Using the quadratic formula, we find that P 3( K L) 9 K 2 14 KL 9 L2 . 8 Unfortunately, this expression provides little insight into the locations of the inflection points. To gain a better feel, let’s consider an example with specific values for the parameters. Suppose that the population in question has a maximum sustainable level of L = 2000 and a minimum sustainable level of K = 200. With these values, the above quadratic formula gives P = 131.73 and P = 1518.27 as the two roots, correct to two decimal P' P 1.00E+08 places. We see that the first inflection point corresponds to a height of roughly ⅔ -200 -4.00E+08 Figure 6 800 300 1300 1800 -9.00E+08 of the minimum sustainable population and the -1.40E+09 second inflection point corresponds to a height -1.90E+09 of roughly 3/4 of the maximum sustainable Figure 6 population. We show the graph of the corresponding cubic function of P’ versus P in Figure 6 without the –α coefficient. Figure 6: Graph of P´ versus P with L = 2000 and K = 200 Some Solution Curves We next consider some specific solution curves based on different initial values for the population. Doing this is considerably simpler using the discrete formulation, because we can rewrite ∆Pn = Pn+1 - Pn = -αPn2(L – Pn)(K – Pn) as Pn+1 = Pn - αPn2(L – Pn)(K – Pn) and calculate all successive values given any desired starting value for P0. Alternatively, if we attempted to use the continuous formulation, we could integrate the differential equation in closed form using partial fractions, but that would produce a complicated combination of transcendental expressions and a rational expression in P that is equal to a multiple of t. Since it is not possible to solve the resulting expression for P as a function of t, we would face a complicated implicit function and the best that could be done with it is to approximate the coordinates of a large number of points along each solution curve. Consequently, in what follows, we work with the discrete models. A little exploration shows that, in order to produce reasonable values for Pn based on the difference equation model, it is necessary for the parameter α to take on extremely small values, roughly on the order of 10-10. If we use values for α that are larger than this, the successive values generated by the difference equation for Pn+1 become extremely large very quickly; rather than falling into any reasonably smooth pattern, the successive values typically jump from one of the four regions far into a different region and may bounce around dramatically. Suppose we then select α = 10-10 and 2000 again use K = 200 and L = 2000, as above. We 1600 show the results of starting with initial values of 1200 P0 = 500 and P0 = 1200 in Figure 7. Notice that 800 both curves approach the horizontal asymptote 400 of 2000, which is the maximum sustainable 0 0 population level, very quickly – the upper curve 5 10 15 20 25 30 Figure 7 starting at P0 = 1200 essentially reaches it in about n = 8 time periods and the lower curve starting at P0 = 500 gets there in about n = 28 time periods. Figure 7: Solutions starting from P0 = 500 and P0 = 1200 Moreover, as we indicated above, 2000 the inflection point for the increasing part of the solution curve occurs at a height of about P = 1518. If we consider the the 1600 initial condition P0 = 1600, the solution starts at above this level and its shape is 1200 0 5 10 exclusively concave down, as shown in Figure 8. On the other hand, the solution Figure 8 starting at P0 = 500 begins below this level and its shape, as seen in Figure 7 above, follows a roughly logistic pattern. 15 Figure 8: Solution starting from P0 = 1600 In comparison, in Figure 9, we show 200 the results of using initial values of P0 = 180, P0 = 75, and P0 = -50. The inflection point in Region III occurs at a height of 0 50 100 150 200 250 300 350 roughly P = 132. The solution starting at P0 = 75 begins below this level and its shape -200 follows a decreasing, concave up pattern. In contract, the solution starting at P0 = 180 Figure 9 begins above this level and its shape is obviously decreasing and concave down at first, but clearly changes to decreasing and concave up at around n = 250, where P is roughly 130, as predicted. Finally, the solution starting at P0 = -50 is obviously decreasing and concave down in the view shown; beyond about n = 250 it drops extremely rapidly as it decreases toward -∞. Figure 9: Solutions starting from P0 = 180, P0 = 75 and P0 = -50 Estimating the Parameter Values In any practical setting, one would normally start with a set of observed or experimental data and seek to fit a function of the desired form to that data. When working with data that falls into a logistic pattern, obtaining a logistic function that fits the data extremely well is simple using the built-in logistic routine in most graphing calculators. Failing in that, one could apply either of the following two approaches using Excel, say, which does not have a logistic function routine. We begin with the logistic difference equation ∆Pn = aPn - bPn2, where a and b are two parameters whose values we need to estimate. If we have a set of data consisting of n and Pn values, we can form a table in column form, say, and extend it with an additional column containing the differences ∆Pn of successive values of Pn. Since the difference equation basically says that ∆Pn is a quadratic function of Pn, we can “hit” these two columns with the quadratic polynomial fit and find the best quadratic function in the regression sense to match the data. This is one approach to estimating the values for a and b. Alternatively, we could divide both sides of the difference equation by Pn to get ∆Pn/Pn = a - bPn, which is a linear function of Pn. Consequently, we could create still another column in the table with values for the expression ∆Pn/Pn and “hit” that column and the column for Pn with the linear regression routine to produce a different pair of estimates for a and b. We note that both approaches typically lead to functions that are not as good as the one obtained by the calculator’s logistic regression routine. Both of the above approaches easily generalize to methods for estimating the parameters in either the logistic model with the Allee effect or the difference equation model we developed here. First, the model with the Allee effect is the differential equation P’ = -αP(L – P)(K – P) and we can create the comparable difference equation model ∆Pn = -αPn (L – Pn)(K – Pn). If we multiply this out, we have the equivalent discrete model ∆Pn = -α [KLPn - (K – L) Pn 2 + Pn3], which clearly shows that ∆Pn is a cubic function of Pn. Therefore, we could simply use the cubic polynomial fit routine of Excel or of any calculator applied to appropriate columns of data values to estimate the values of the associated parameters. In turn, this would allow us to estimate the values for both the maximum sustainable population L and the minimum sustainable population K, assuming that these values are not evident from the original data. Alternatively, we could divide both sides of the difference equation by Pn to get ∆Pn/Pn = -α [KL - (K – L) Pn + Pn2], and apply the quadratic regression routine to estimate the parameters. This is probably a preferable approach because we have three parameters to estimate and the quadratic function has three coefficients, while the cubic function has four. Next, we consider the difference equation we have developed here, ∆Pn = -αPn2(L – Pn)(K – Pn). If we multiply this out, we get the equivalent difference equation ∆Pn = -α [KLPn2 - (K – L) Pn 3 + Pn4], so that ∆Pn is now a quartic function of Pn . We can therefore simply apply the quartic polynomial routine of a calculator or Excel to estimate the values for the parameters and, again, to estimate the minimum and maximum sustainable population levels. Alternatively, we could divide both sides of the above difference equation by Pn to get ∆Pn/ Pn = -α [KLPn - (K – L) Pn 2 + Pn3], which says that ∆Pn/ Pn is a cubic function of Pn . We would then apply cubic regression to estimate the parameters. Alternatively, we could divide the difference equation by Pn 2 to get ∆Pn/ Pn 2 = -α [KL - (K – L) Pn + Pn2], so that ∆Pn/ Pn 2 is a quadratic function of Pn . We could then apply quadratic regression to estimate the parameters. Again, since there are three parameters to determine, it probably makes the most sense to go with the quadratic fit to ∆Pn/ Pn 2 rather than the quartic fit to ∆Pn/ Pn or the cubic fit to ∆Pn. References 1. Allee, W. C. (1931). Animal Aggregations. A Study in General Sociology. University of Chicago Press, Chicago. ISBN 0-404-14501-9. 2. Allee, W. C. (1949). Principles of Animal Ecology. W.B. Saunders Co., Philadelphia. ISBN 0-7216-1120-6. 3. Clark, Colin W. (2005). Mathematical Bioeconomics: The Optimal Management of Renewable Resources, 2nd Ed, Wiley-Interscience, New York, ISBN 0-471-75152-9 Acknowledgment The work described in this article was supported by the Division of Undergraduate Education of the National Science Foundation under grants DUE0310123 and DUE-0442160. However, the views expressed are those of the author and do not necessarily reflect those of the Foundation. Keywords Population growth, logistic model, extinction, Allee factor, differential equation, difference equation Abstract The exponential growth model and the logistic model typically introduced in the mathematics curriculum presume that a population grows exclusively. In reality, species can also die out and more sophisticated models that take the possibility of extinction into account are needed. In this article, two extensions of the logistic model are considered, one known as the logistic model with Allee effect, which is a cubic model, and the other a further generalization, which is a quartic model. The approach used is designed to illustrate some techniques of model-building.