Math Modeling: Getting Started & Getting Solutions
K. M. Bliss, K. R. Fowler, B. J. Galluzzo

Publisher
Society for Industrial and Applied Mathematics (SIAM)
3600 Market Street, 6th Floor
Philadelphia, PA 19104-2688 USA
www.siam.org

Funding provided by The Moody’s Foundation in association with the Moody’s Mega Math Challenge, the National Science Foundation (NSF), and the Society for Industrial and Applied Mathematics (SIAM).

Authors
Karen M. Bliss, Department of Math & Computer Science, Quinnipiac University, Hamden, CT
Kathleen R. Fowler, Department of Math & Computer Science, Clarkson University, Potsdam, NY
Benjamin J. Galluzzo, Department of Mathematics, Shippensburg University, Shippensburg, PA

Design & Connections to Common Core State Standards
PlusUs
www.plusus.org

Production
First Edition 2014
Printed and bound in the United States of America

No part of this guidebook may be reproduced or stored in an online retrieval system or transmitted in any form or by any means without the prior written permission of the publisher. All rights reserved.

CONTENTS
1. Introduction
2. Defining the Problem Statement
3. Making Assumptions
4. Defining Variables
5. Building Solutions
6. Analysis and Model Assessment
7. Putting It All Together
Appendices & Reference

1. INTRODUCTION
The world around us is filled with important, unanswered questions. What effect will rising sea levels have on the coastal regions of the United States? When will the world’s human population surpass 10 billion? How much will it cost to go to college in 10 years? Who will win the next U.S. Presidential election?

There are also other phenomena we wish to understand better. Is it possible to study crimes and identify a burglary pattern [1, 10]? What is the best way to move through the rain and not get soaked [7]? How feasible is invisibility cloaking technology [6]?
Can we design a brownie pan so the edges do not burn but the center is cooked [2]?

Possible answers to these questions are being sought by researchers and students alike. Will they be able to find the answers? Maybe. The only thing one can say with certainty is that any attempt to find a solution requires the use of mathematics, most likely through the creation, application, and refinement of mathematical models.

A mathematical model is a representation of a system or scenario that is used to gain qualitative and/or quantitative understanding of a real-world problem and to predict future behavior. Models are used in a variety of disciplines, such as biology, engineering, computer science, psychology, sociology, and marketing. Because models are abstractions of reality, they can lead to scientific advances, provide the foundation for new discoveries, and help leaders make informed decisions.

This guide is intended for students, teachers, and anyone who wants to learn how to model. The aim of this guide is to demystify the process of how a mathematical model can be built. Building a useful math model does not necessarily require advanced mathematics or significant expertise in any of the fields listed above. It does require a willingness to do some research, brainstorm, and jump right in and try something that may be out of your comfort zone.

Math Modeling vs. Word Problems

Modeling problems are entirely different from the types of word problems most of us encountered in math classes. In order to understand the difference between math modeling and word problems, consider the following questions about recycling.

1. The population of Yourtown is 20,000, and 35% of its citizens recycle their plastic water bottles. If each person uses 9 water bottles per week, how many bottles are recycled each week in Yourtown?
2. How much plastic is recycled in Yourtown?
The solution to the first question is straightforward:

0.35 × 20,000 people × 9 bottles/(person × week) = 63,000 bottles/week

This type of question might appear in a math textbook to reinforce the concept that we translate the phrase “35% of” to the mathematical computation “0.35 times.” It is an example of what we would call a word problem: the problem explicitly gives you all the information you need. You need only determine the appropriate math computation(s) in order to arrive at the one correct answer. Word problems can be used to help students understand why we might want to study a particular mathematical concept and reinforce important mathematical skills.

The second question is quite different. When you read a question like this, you might be thinking, “I don’t have enough information to answer this question,” and you’re right! This is exactly the key point: we usually don’t have complete information when trying to solve real-world problems. Indeed, such situations demand that we use both mathematics and creativity. When we encounter such situations where we have incomplete information, we refer to the problem as open-ended. It turns out that mathematical modeling is perfect for open-ended problems. This question, for example, might have been conceived because we saw garbage cans overflowing with water and soda bottles and then wondered how many bottles were actually being thrown out and why they were not being recycled. Modeling allows us to use mathematics to analyze the situation and propose a solution to promote recycling.

In the word problem example above, it is assumed that each person in town uses 9 plastic water bottles per week and that 35% of the 20,000 people recycle their water bottles every time they use one. Are these reasonable assumptions? The number 20,000 is probably an estimate of Yourtown’s population, but where is the other information coming from? Is it likely that every person in the town uses exactly 9 water bottles every week?
Is it likely that 35% of people recycle every water bottle they use while 65% of people never recycle any of their water bottles? Probably not, but maybe this is an average value, based on other data. The first problem doesn’t invite us to determine whether the scenario is realistic; it is assumed that we will accept the given information as true and make the appropriate computations.

In order to answer the second (modeling) problem above, you would need to research the situation for yourself, making your own (reasonable) assumptions and strategies for answering the question. The question statement doesn’t provide specific details about Yourtown.

You will have to determine what factors about Yourtown contribute to the amount of plastic that gets recycled. It seems reasonable to believe that the population of Yourtown is an important factor, but what else about the city affects the recycling rate? The question statement failed to mention what types of plastic you should be taking into account. It would be hard to quantify all plastic thrown away. Is it a reasonable assumption to consider only the plastics from food and beverage containers if you believe those are the primary plastic waste sources? You would have to do some research and make some assumptions in order to make any progress on this problem.

If, after your research, you distill the original problem into something very specific, such as “Determine the volume of plastic waste Yourtown sent to landfills last year,” then there is exactly one correct answer. However, it’s unlikely that you will ever have sufficient information to find that answer. In light of this, you will develop a model that best estimates the answer given the available information. Since no one knows the true answer to the question, your model is at least as important as the answer itself, as is your ability to explain your model.

In contrast to word problems, we often use the phrase “a solution” (as opposed to “the solution”) when we talk about modeling problems. This is because people who look at the same modeling problem may have different perspectives on its resolution and can certainly come up with different, valid alternative solutions. It is worth noting that word problems can actually be thought of as former modeling problems. That is to say, someone has already determined a simple model and provided you with all the relevant pieces of information. This is very different from a modeling problem, in which you must decide what’s important and how to piece it all together.

Mathematical modeling questions allow you to research real-world problems, using your discoveries to create new knowledge. Your creativity and how you think about this problem are both highly valuable in finding a solution to a modeling question. This is part of what makes modeling so interesting and fun!

Overview of the Modeling Process

This guide will help you understand each of the components of math modeling. It’s important to remember that this isn’t necessarily a sequential list of steps; math modeling is an iterative process, and the key steps may be revisited multiple times, as we show in Figure 1.

Figure 1. The modeling process. Labels in the diagram: real-world problem; building the model; defining the problem; research & brainstorming; making assumptions; defining variables; getting a solution; analysis & model assessment; reporting results; repeat as needed or as time allows.

• Defining the Problem Statement
Real-world problems can be broad and complex. It’s important to refine the conceptual idea into a concise problem statement which will indicate exactly what the output of your model will be.

• Making Assumptions
Early in your work, it may seem that a problem is too complex to make any progress. That is why it is necessary to make assumptions to help simplify the problem and sharpen the focus. During this process you reduce the number of factors affecting your model, thereby deciding which factors are most important.

• Defining Variables
What are the primary factors influencing the phenomenon you are trying to understand? Can you list those factors as quantifiable variables with specified units? You may need to distinguish between independent variables, dependent variables, and model parameters. In understanding these ideas better, you will be able both to define model inputs and to create mathematical relationships, which ultimately establish the model itself.

• Getting a Solution
What can you learn from your model? Does it answer the question you originally asked? Determining a solution may involve pencil-and-paper calculations, evaluating a function, running simulations, or solving an equation, depending on the type of model you developed. It might be helpful to use software or some other computational technology.

• Analysis and Model Assessment
In the end, one must step back and analyze the results to assess the quality of the model. What are the strengths and weaknesses of the model? Are there certain situations when the model doesn’t work? How sensitive is the model if you alter the assumptions or change model parameter values? Is it possible to make (or at least point out) possible improvements?

• Reporting the Results
Your model might be awesome, but no one will ever know unless you are able to explain how to use or implement it. You may be asked to provide unbiased results or to be an advocate for a particular stakeholder, so pay attention to your point of view. Include your results in a summary or abstract at the beginning of your report.

We will address the components in more detail one by one, but we note again that this should not be thought of as a checklist for modeling. Throughout the process of building your model, you’ll likely move back and forth among the components.
Take careful notes as you go; it’s easy to get caught up in the modeling process and forget what you’ve done along the way!

Primary Examples Used Throughout This Guide

We demonstrate the modeling process by looking at three modeling questions in detail. We state those problems directly below and then explore them throughout the remainder of this guide.

Waste Not, Want Not: Putting Recyclables in Their Place
(A selection from Moody’s Mega Math Challenge: 2013 Problem. The full question and a solution paper submitted by Team 1356 from Montgomery Blair High School, Silver Spring, Maryland, coached by David Stein and with student members Alexander Bourzutschky, Alan Du, Tatyana Gubin, Lisha Ruan, and Audrey Shi, is included as Appendix B.)

Plastics are embedded in a myriad of modern-day products, from pens, cell phones, and storage containers to car parts, artificial limbs, and medical instruments; unfortunately, there are long-term costs associated with these advances. Plastics do not biodegrade easily. There is a region of the Northern Pacific Ocean, estimated to be roughly the size of Texas, where plastics collect to form an island and cause serious environmental impact. While this is an international problem, in the U.S. we also worry about plastics that end up in landfills and may stay there for hundreds of years. To gain some perspective on the severity of the problem, the first plastic bottle was introduced in 1975 and now, according to some sources, roughly 50 million plastic water bottles end up in U.S. landfills every day.

Plastics aren’t the only problem. So many of the materials we dispose of can be recycled. Develop a mathematical model that a city can use to determine which recycling method it should adopt. You may consider, but are not limited to:
• Providing locations where one can drop off pre-sorted recyclables
• Providing single-stream curbside recycling
• Providing single-stream curbside recycling in addition to having residents pay for each container of garbage collected

Your model should be developed independent of current recycling practices in the city and should include some information about the city of interest and some information about the recycling method. Demonstrate how your model works by applying it to each of the following cities: Fargo, North Dakota; Price, Utah; Wichita, Kansas.

Outbreak? Epidemic? Pandemic? Panic?
We all dread getting sick. Years ago, illness didn’t spread very quickly because travel was difficult and expensive. Now thousands of people travel via trains and planes across the globe for work and vacation every day. Illnesses that were once confined to small regions of the world can now spread quickly as a result of one infected individual who travels internationally. The National Institutes of Health and the Centers for Disease Control and Prevention are interested in knowing how significant the outbreak of illnesses will be in the coming year in the U.S.

Will It Thrill Me?
Amusement parks are typically open during the summer months, when the heat and humidity are almost unbearable. The lines for the most popular rides can sometimes be hours long, leaving you to decide whether you should spend your limited time at the park waiting to ride the newest, most popular roller coaster (with the longest line) or instead riding several, possibly less exciting, roller coasters. Unfortunately there is no real metric for scoring roller coasters, although an extensive database exists with information about many rides (see rcdb.com). Innovative roller coaster engineers certainly set out to design a thrilling roller coaster, but what makes a roller coaster exciting and fun? Create a mathematical model that ranks roller coasters according to a thrill factor that you define.

2. DEFINING THE PROBLEM STATEMENT
Modeling problems are often open ended. Some math modeling problems are clearly defined, while others are ambiguous. This means there is an opportunity for creative problem solving and interpretation. In some cases, it is up to the modeler to define the outputs of the model and which key concepts will be quantified. Defining the problem statement requires some research and brainstorming. The goal is a concise statement that explains what the model will predict.

To see how a math modeling question can be interpreted in different ways, consider the roller coaster problem proposed earlier: rank roller coasters according to how thrilling they are. The word “thrilling” here is open to several interpretations. There are many reasonable possibilities in defining and quantifying “thrilling.” For example, one student’s definition of a thrilling ride may be a combination of the maximum height and the number of loops, while another student values a combination of length of a ride and the maximum speed. If these individuals ranked the same list of roller coasters, their ranking systems would likely produce different results, neither of which would be “the” correct ranking. The modeler has room to be creative in deciding how to define “thrilling” but must make sure that no matter what definition she decides upon, there is a systematic ranking that incorporates quantifiable (i.e., measurable) aspects of a roller coaster.

Perhaps you’re thinking that the reason the students above didn’t come up with “the” one correct ranking with either of the previous models is that neither of those models incorporates sufficiently sophisticated mathematics. Suppose that we can leverage tools from mathematics and physics to help answer this question. Given the design of a particular roller coaster, we might compute, among other things, velocities and g-forces a rider would experience.
Even with this information in hand, it’s not obvious how to use that information to rank roller coasters. Consider four different roller coasters (A, B, C, and D). Coaster A has a larger maximum velocity than B, but B has a higher average velocity. Which is more thrilling? How would these two rank against roller coaster C, which attains a g-force twice as large as A’s or B’s but only does so for 10 seconds of the entire ride? Suppose that roller coaster D never reaches that g-force but sustains g-forces only 0.5 g less for more than 50 seconds. Which is more thrilling? The modeler must choose a definition for thrilling. Eventually, when communicating the results, a modeler will need to explain why decisions were made and will discuss the strengths and weaknesses of the model.

In the previous discussion we mentioned just a few measurable aspects of roller coasters that one could use to define “thrilling,” including maximum height, the number of loops, the maximum velocity, or some combination of these. Where does one get a list like this? It comes from a process we refer to as brainstorming. Brainstorming is part of the problem-solving process where spontaneous ideas are allowed to flow without evaluation and interruption.

Figure 2. Example of mind map to explore the definition of “best.” Center: “best” recycling method. Branches: most participation; least overall cost to city; processes the most recyclables.

The roller coaster example demonstrates that brainstorming at the beginning of a project is an essential process that helps reveal different directions that the math model can take. A brainstorming session may include listing all of the things that make a roller coaster thrilling and then digging deeper to see how those properties are measured. At the beginning of the process, however, one may want to just let the ideas flow and then prune the list later after determining what resources are available.
This process is related to making assumptions, which we will talk about in more detail in the next section.

We’ll look at the brainstorming process in detail by showing how it might work within the context of the recycling problem. In this problem, we want to determine which recycling method would be best for a city to adopt. The word “best” needs to be clearly defined, and there are multiple ways to do that. Let’s imagine that we are on a team that works together to discuss this, and we think of three possible ways to define “best” in this problem. In order to organize our thoughts, we might use a mind map, as in Figure 2, to give us a visual representation of our initial round of brainstorming.

A mind map is a tool to visually outline and organize ideas. Typically a key idea is the center of a mind map and associated ideas are added to create a diagram that shows the flow of ideas. In Figure 2, we focus on the definition of “best,” with three possible definitions branching off to be further explored. From here, we can focus our attention on one of the three branches at a time.

Let’s think about the least-cost option first. We probably can’t determine how much any recycling program costs without knowing more about the recycling program, so a good place to start is to ask the question “What kinds of recycling programs exist?” If we aren’t familiar with different types of recycling, we might need to do some research to see what kinds of programs exist.

If you are working on a long-term modeling project and you have lots of time, you’ll want to do an extensive search to learn everything you can about the problem. You’ll also want to find out if others have considered modeling this situation. If you are working on a problem and you have a fairly short time frame, you’ll need to be careful to not spend all of your time on the internet researching the problem.
Instead, do a quick, preliminary internet search to get a broad perspective (without getting too far into the “weeds”). Suppose that the list of recycling methods consists of drop-off center, curbside single-stream, curbside (presorted), and pay-as-you-throw.

Next, we need to consider the costs. Let’s focus on one of the branches, say single-stream curbside pick-up of recyclables. We then ask ourselves, “What contributes to cost for this method?” Then we ask, “For each of those costs, what is the dependence on the properties of the city?”

A possible final mind map for the least-cost approach is shown in Figure 3. Although we will not include the details here, you can imagine that we could proceed in a similar fashion for each of the three definitions of “best.” We would then choose one of the three possibilities, define the problem statement in terms of this choice, and move forward from there to develop a model.

During the brainstorming process, explore the problem from different perspectives as if you had access to all the data you could ever need. In the next section, Making Assumptions, we’ll discuss exactly what you can do if you can’t find all of the data you need. Don’t discount any idea simply because you don’t think you’ll be able to find sufficient data.

One of the most important aspects of brainstorming is to let the ideas flow freely, especially if done in a group. It is best at this initial phase to stay positive and be open-minded. This part of the modeling process is about creativity, so it is important that there is no criticism of anyone’s ideas or suggestions. What seems like a ridiculous approach may later seem innovative after some more thought, so make note of everything! Also, even if your idea isn’t perfect, it might inspire someone else to come up with an even better suggestion.

Figure 3. Possible mind map under the assumption that “best” means least cost.
[Figure 3 mind map contents. Center: “best” recycling method, with branches “processes the most recyclables,” “least overall cost to city,” and “most participation.” The least-cost branch lists four methods: drop-off center, curbside single-stream, curbside (presorted), and pay-as-you-throw. Each method branches into operational cost and likelihood of participation, with further nodes such as start-up (fixed) cost, processing cost per recyclable, how many centers or trucks are needed, truck capacity, maintenance, gas, mileage, employees and wages, area and population of the city, distance to a center, how far people are willing to drive, incentives/refunds, and whether a probability can be based on data.]

After you’ve explored the problem and considered several possible approaches, you can step back and look at the possible ways a model might be constructed. Your intuition will help you analyze your brainstorming results and decide on a reasonable problem statement.

In Summary
1. Often math modeling questions are worded in ways that allow for multiple approaches, so you should develop a concise restatement of the question at hand.
2. Focus on subjective words that can be interpreted in different ways. Also, identify words that are not easily quantified. Examples include best, thrilling, efficient, robust, and optimal.
3. Explore the problem by doing a combination of research and brainstorming, keeping in mind your time constraints.
4. Keep an open mind and a positive attitude; you can prune out ideas later that are not realistic.
5. Brainstorming should be approached as if you had access to any data you need.
6. Visual diagrams, such as mind maps, can be a powerful tool leading to the structure of the model. Consider using a mind-mapping tool such as FreeMind (http://freemind.sourceforge.net/wiki/index.php/Main_Page) [5].
7. In the end, you should have a concise statement that explains what the model will measure or predict.

Activity
Create a mind map for the disease-spreading problem.

3. MAKING ASSUMPTIONS
In presenting any scientific work to others, you need to explain how the results were achieved with explicit details so that they can be repeated. If you are explaining a chemistry experiment, for example, you need to list (among other things) which chemicals were used in what quantities and in what order. Other chemists would expect similar results only when they used the same chemicals and procedure. The list of assumptions for a mathematical model is as critical as the experimental procedure in performing a chemistry experiment. The assumptions tell the reader under what conditions the model is valid.

Making assumptions can be one of the most intimidating parts of the modeling process for a novice, but it need not be! Assumptions are necessary and help you make a seemingly impossible question much more tractable. Many assumptions will follow quite naturally from the brainstorming process. For the recycling problem, some of our assumptions follow directly from the questions we asked during the brainstorming session, as shown in the list of brainstorming questions and assumptions later in this section.

Let’s further examine the assumption about how many people would make use of drop-off centers (termed “likelihood of participation”).
The two extremes would be to assume that 100% of the people near a recycling center would use it or that none would use it. Neither of these seems like a reasonable assumption, so what would be a better assumption? The students whose solution to this problem appears in Appendix B decided they would do some investigation and see if there has been any successful research on participation rates in drop-off centers. They found a study that had been done in Ohio that estimated about 15% of households participated in drop-off center recycling, and made an assumption that this rate would hold in every city across the U.S.

One might ask if it is safe to assume that across the U.S. 15% of households will participate in drop-off center recycling if it is available. Is it true that residents of Arizona will behave the same way residents of Ohio do? Certainly some cities would garner a participation rate much higher than 15%, while other cities would have a significantly lower participation rate. In fact, what are the chances that any city would actually have a participation rate of exactly 15%? In some sense, one might say that assigning one participation rate to every city across the U.S. is a ridiculous assumption.

In response to that line of thinking, remember two things. First, remember that one must make assumptions in order to make a model. It is not practical or feasible to poll every citizen of every city to determine who will bring recyclables to a drop-off center. If we had to rely on data with that level of certainty at every juncture of the modeling process, we would never get any work done. It’s practical and important to make reasonable assumptions when we cannot find data.

Second, you are developing a model that is intended to help one understand some complex behavior or assist in making a complex decision. It is not likely to predict the exact outcome of a situation, only to help provide insight and predict likely outcomes. When you provide a list of your assumptions, you’ve done your part to inform anyone who might use your model. They can decide whether they think your assumption is or is not appropriate to model the behavior they are interested in predicting. In the Analysis and Model Assessment section, we’ll discuss in more detail ways in which you can examine some of the impacts of your assumptions.

Brainstorming questions and the assumptions that follow from them:

Brainstorming question: What is meant by the “best” recycling method?
Assumption: The best recycling method will be interpreted to mean the least cost to the city.

Brainstorming question: What recycling methods should we consider?
Assumption: We consider only four recycling programs: drop-off centers, single-stream curbside, presorted curbside, and pay-as-you-throw.

Brainstorming question: What contributes to cost for the drop-off center method?
Assumption: The cost of drop-off centers depends only on the number of drop-off centers, the quantity of recyclables that pass through each center, and the costs to operate each center.

Brainstorming question: What is the dependence on the properties of the city?
Assumption: The number of drop-off centers needed depends on the area of the city, the population of the city, and the likelihood of participation.

It’s entirely possible that you may search and search and never find the data you need to make an “educated” guess about a parameter in your model. That’s fine; simply make a note in your write-up that future work might include further investigation in that area. If Team 1356 had not found any estimates for recycling rates, they might have assumed that the recycling rate was 50% in the absence of other data (since it’s the mean of the two extreme cases). That would have been a better assumption than either of the extremes (all residents recycle or no residents recycle). They also might have determined that 25% seemed reasonable (based on their own experiences or intuition) and moved forward with that number.
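To get a feel for how much the choice of participation rate matters, the following Python sketch compares the three candidate rates mentioned above (15%, 25%, and 50%). The household count of 40,000 describes a made-up city and is an assumption for illustration only; it does not come from this guide or from Team 1356’s paper.

```python
# Hypothetical city size; not taken from the guide or from any real city.
HOUSEHOLDS = 40_000

# Candidate participation rates discussed in the text.
CANDIDATE_RATES = {
    "Ohio study estimate": 0.15,
    "intuition-based guess": 0.25,
    "midpoint of the extremes": 0.50,
}


def participating_households(households: int, rate: float) -> float:
    """Estimate how many households would use a drop-off center."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return households * rate


# One estimate per assumption, so the assumptions can be compared side by side.
estimates = {
    label: participating_households(HOUSEHOLDS, rate)
    for label, rate in CANDIDATE_RATES.items()
}

for label, value in estimates.items():
    print(f"{label}: about {value:,.0f} households")
```

With these made-up numbers the estimate ranges from about 6,000 to about 20,000 households, which is one reason the chosen rate belongs in the list of assumptions.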
All of these are appropriate as long as they are included as assumptions.

Choice of assumptions may also be dictated by the mathematical tools available. Both the National Institutes of Health and the Centers for Disease Control and Prevention use mathematical modeling to help them understand the spread of infectious diseases. While their models may be quite sophisticated, they are actually built upon many of the simple principles we will discuss here, which evolve from relatively few assumptions. Let’s focus on determining the number of people who have the disease over time by considering models at multiple math levels.

One of the simpler models for disease propagation can be created if we assume that the disease spreads at a constant rate. For example, we might assume that each person who has the disease will spread the disease to 3 people per day or that each person spreads the disease to just 1 person every 5 days. As we move forward, we will refer to this as the constant-rate disease model.

Transmission rate drives the spread of disease, and the assumption that it remains constant over time seems unlikely to hold for the duration of the disease. If we have knowledge of calculus and differential equations, we can arrive at another model that accounts for varying transmission rate.

In order to decide how the transmission rate should change with time, it might be helpful to think about the mechanism behind disease transmission: infected people somehow come into contact with susceptible individuals. It makes sense, then, to believe that the transmission rate depends on how many people are infected and how many people are susceptible. We might assume that the transmission rate is directly proportional to the product of the number of people infected and the number of people who are susceptible. We will refer to this model as the varying-rate disease model. We will revisit both disease models in later sections.

Some assumptions are made at the beginning of the modeling process, while others are made as you proceed through the modeling process. The modeling process is iterative; it is legitimate to make a reasonable assumption, determine how it affects the model moving forward, and make adjustments to improve the outcome. You can see an example of this later in the Analysis and Model Assessment section. Make a careful list of all the assumptions you make along the way; a good modeling paper includes a list of assumptions in the write-up. Additionally, keep track of all the resources used so that you can create a bibliography.

With all of these options, how does one know which is the best assumption to make? There is no easy answer to this question; the most important thing is to acknowledge the assumptions you’ve made and, when appropriate, discuss the limitations that might arise from your assumptions.

In Summary
1. Assumptions often come naturally from the process of brainstorming and defining the problem statement.
2. You should do some preliminary research and may find data to help you make assumptions. In the absence of relevant data, make a reasonable assumption and justify the assumption in your write-up.
3. Different assumptions can lead to different, equally valid models at different mathematical levels.
4. Not all assumptions are made during the initial brainstorming. Some come as you move through the modeling process. Keep track of the assumptions you make and include a list of assumptions in your write-up of the model.

Activity
Build on the brainstorming from the previous section about the roller coaster model. Certainly we did not uncover all the ways in which roller coasters are considered thrilling. Define the problem statement in your own words, based on your understanding of the problem. Finally, take your work one step further and list the assumptions on which you could build your model.

4. Defining Variables

With the problem statement clearly defined and an initial set of assumptions made (a list that will likely get longer), you are ready to start to define the details of your model. Now is the time to pause to ask what is important that you can measure. Identifying these notions as variables, with units and some sense of their range, is key to building the model.

The purpose of a model is to predict or quantify something of interest. We refer to these predictions as the outputs of the model. Another term we use for outputs is dependent variables. We will also have independent variables, or inputs to the model. Some quantities in a model might be held constant, in which case they are referred to as model parameters. Let’s look at a few simple examples that will help you distinguish between these concepts. We’ll also see how they depend on your viewpoint and the problem statement.

Example 1: Painting a House
Suppose that we plan to paint a house. We’ll need to know the dimensions of the house so that we can find the surface area, SA (ft²). We also need to know the efficiency, E (ft²/gal), of the paint, which tells us how many square feet a gallon of the paint can cover. Keep in mind that the efficiency varies from brand to brand. We let V (gal) be the volume of paint that we need.
Here, knowing the units for efficiency can help reveal the relationship between the variables:

$$E = \frac{\text{surface area}}{\text{volume}} = \frac{SA}{V}$$

Notice that we can rewrite this relationship and use any of the following three equivalent relationships:
(1) E = SA / V
(2) SA = E · V
(3) V = SA / E

Whether something is a dependent or independent variable or a parameter often relies on the perspective of the modeler and the problem statement. Imagine that you own a painting company and always use CoversItAll brand paint. When a client hires you, you take measurements of the house, and then you want to know how much paint you’ll need to complete the job. In this scenario, the efficiency of CoversItAll paint is constant, so that is a model parameter. You would use equation (3), with the surface area of the home as the input and the volume of paint needed as an output. Therefore, SA is the independent variable, V is the dependent variable, and E is a constant model parameter.

Suppose instead that you are a homeowner and want to choose from among five brands of paint to buy to minimize the amount necessary to paint your house. Under this scenario, the surface area of your house is a constant, so this is treated like a model parameter. If we know the efficiency of each of the five brands of paint, we would again use equation (3), but this time with E as the input variable and volume as the output. Thus, in this case, E is the independent variable and V is the dependent variable.

Example 2: Constant-Rate Disease Model
For another example, recall the constant-rate disease model discussed in the previous section, wherein we want to determine the number of infected individuals at any given time. Using this problem statement, let t denote the time in days. This will be the input (independent variable).
Let I(t), the number of infected individuals at time t, be the output of the model (the dependent variable). In this constant-rate disease model, we assume that each person who has the disease will spread the disease to a certain number of people in a given fixed time period. We define the parameter τ to be the time period during which each infected person will transmit the disease to r other people (so r is also a parameter). Further, we might define a parameter $I_0$ to be the initial number of infected individuals. In other words, $I(0) = I_0$. For example, if we have $I_0 = 10$, τ = 2 days, and r = 3, then we are considering a population starting with 10 infected individuals, where each infected person transmits the disease to 3 uninfected individuals every 2 days.

These simple examples show that the problem statement will guide what the dependent variable (i.e., your model output) is going to be. Dependent variables and parameters will often be determined by both your assumptions and the availability of information. The key idea is that independent variables cause a change in the dependent variables. Let’s look at a more complex model that demonstrates how submodels may be needed to provide input to the overarching model.

Example 3: Determining Recycling Costs
For the recycling problem, we seek a model that will predict the cost for a city to implement and run a recycling program. Hence, the output of the model should be in dollars. What will the inputs to the model be? We can use the results of our brainstorming session to help us. Let’s consider the model to determine the cost of a drop-off center. Based on the brainstorming shown in Figure 3, the cost of using drop-off centers in a given city could depend on how many are needed, the operational cost, the likelihood that people will participate, and the possibility of refunds or incentives.
Let’s consider the approach taken in the solution provided in this guide. These students calculated the cost of a drop-off center based on the cost to maintain the center as well as the revenue the center would create. The revenue depends directly on the participation rate of the local population, so the probability of participation by a single household was needed first. Determining the single-household probability is an example of how a submodel can be used to generate input to the main model.

Continuing this line of thought, the students based the likelihood that a household would participate on its distance from a drop-off center. Thus, the distribution of houses throughout the city and the location of drop-off centers must be understood. This team assumed that each city was a square and that houses were aligned on grids. To determine the placement of the drop-off centers within the city grid (which will actually determine how many will be needed), students calculated a maximum distance, d, that citizens would be willing to drive per week to a drop-off center. Using that distance, drop-off centers were placed on the grid so that their coverage areas did not overlap yet the entire city would be covered.

Thus, for this approach, note that d is an input to determine where to place the drop-off centers, yet d itself needs to be determined first (because it certainly isn’t clear what a reasonable value of d might be, nor is there one known value that suits everyone in the United States). Therefore, ultimately we can use a math modeling approach to determine d based on some assumptions as well. This team decided that d depends on the cost to get to a drop-off center, and this would depend on the price of gas, gas mileage for a typical car, and the amount that a household would be willing to pay to recycle per month. Values for these model parameters were found in the literature through some digging into available resources.
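The chain of dependencies just described (d depends on what a household will pay, on gas mileage, and on the price of gas) can be sketched as a short function. This is only an illustration: the function and argument names are hypothetical, the numbers are the team's parameter values as summarized in this example, and reading the final division by 2 as an out-and-back trip is an assumption rather than something the guide states.

```python
def max_drive_distance(weekly_budget_usd, miles_per_gallon, gas_price_usd_per_gallon):
    """Miles per week a household would drive to a drop-off center (submodel for d)."""
    if gas_price_usd_per_gallon <= 0:
        raise ValueError("gas price must be positive")
    # Dollars per week -> gallons per week -> miles per week.
    miles_affordable = weekly_budget_usd / gas_price_usd_per_gallon * miles_per_gallon
    # The team's formula divides by 2; treating that as a round trip is an assumption.
    return miles_affordable / 2


# Team's parameter values: $0.53 per week, 23.8 miles/gallon, $3.784 per gallon.
d = max_drive_distance(0.53, 23.8, 3.784)
print(f"d = {d:.3f} miles/week")

# Same calculation with the monthly amount ($2.29) converted to weeks without rounding.
d_unrounded = max_drive_distance(2.29 * 12 / 52, 23.8, 3.784)
print(f"d (unrounded weekly budget) = {d_unrounded:.3f} miles/week")
```

Using the rounded $0.53 gives a value slightly above the 1.66 miles/week reported by the team, while the unrounded weekly budget gives a value that rounds to 1.66. That the team carried the unrounded figure is an inference, but the comparison shows how a small rounding choice in one parameter propagates into a submodel's output.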
To summarize their approach, they assumed that the following were model parameters to find d:
• People would be willing to pay $2.29 to recycle per month or $0.53 per week.
• People would make biweekly trips to the center.
• The average price of a gallon of gas is $3.784.
• The average mileage of a passenger car is 23.8 miles/gallon.

$$d = \frac{(\$0.53/\text{week}) \cdot (23.8\ \text{miles/gallon})}{\$3.784/\text{gallon}} \Big/ 2 = 1.66\ \text{miles/week}.$$

With a value of d in place, students were able to take into consideration the size of the city grid and then partition the city to place the appropriate number of recycling drop-off centers to cover the entire city. Next, the team wanted to propose a method for predicting the likelihood of participation, but this couldn’t be done until the previous submodels were developed.

Note that the inputs, or independent variables, for their cost model for drop-off centers were the area of the city, the population, the average number of people in a household, the maximum distance citizens would be willing to drive, and the number of drop-off centers needed. However, more modeling was needed to determine values for many of these inputs, as we demonstrated with the input d. Notice that these are specific details that are implied from the brainstorming shown in the mind map in Figure 2, but at this phase in the process, more detail (and additional brainstorming and assumptions) is needed.

In Summary
1. The problem statement should determine the output of the model. The output variables themselves will be dependent variables.
2. The results of the initial brainstorming can provide insight into which variables should be independent variables and which should be fixed model parameters.
3. Keep track of units because they can reveal relationships between variables.
4. You will likely need to do some research and make additional assumptions to obtain values for necessary model parameters.
5. Submodels or multiple models may be needed to generate some of the model input.

Activity
Determine the dependent and independent variables for ranking roller coasters based on how thrilling they are. What are some possible model parameters?

5. Building Solutions

Now that you have an initial mathematical model, you will need to use that model to generate preliminary answers to the question at hand. The approach you take of course depends on the type of model you have and your background in mathematics. It may involve simply considering some different values of certain parameters to see how the output changes, it may involve techniques from calculus or differential equations, or it may involve using graphs to understand trends in data. In this chapter we will give you some strategies for choosing how to solve your problem.

When you first approach any mathematical problem, you often look into your personal tool kit for a mathematical technique to use. Sometimes, if we start with the incorrect approach, a better approach will naturally emerge. So, the important thing is to just tackle it and see what happens! The following questions may help you.
• Have I seen this type of problem before?
• If so, how did I solve it? If not, how is this problem different?
• Do I have a single unknown, or is this a multivariable problem with many interdependent variables?
• Is the model linear or nonlinear?
• Am I solving a system of equations simultaneously, or can I solve sequentially?
• What software or computational tools are available to me?
• Would a graph or other visual schematic help provide insight?
• Could I approximate my complicated model with a simpler one?
• Can I hold some values constant and allow others to vary to see what is going on?
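The last question in the list can be tried on even a very small model. As a hedged sketch, the Python below takes the painting relationship V = SA / E from the Defining Variables chapter, holds the surface area constant, and lets the paint efficiency vary across five brands. The surface area, the brand names, and the efficiency values are all made up for illustration.

```python
def paint_volume(surface_area_ft2, efficiency_ft2_per_gal):
    """Gallons of paint needed, from V = SA / E."""
    if efficiency_ft2_per_gal <= 0:
        raise ValueError("efficiency must be positive")
    return surface_area_ft2 / efficiency_ft2_per_gal


# Hypothetical values: one fixed house and five made-up brands.
SURFACE_AREA_FT2 = 2400.0          # held constant (model parameter)
brand_efficiency = {               # varied (independent variable), in ft^2/gal
    "Brand A": 250.0,
    "Brand B": 300.0,
    "Brand C": 350.0,
    "Brand D": 400.0,
    "Brand E": 450.0,
}

volumes = {
    brand: paint_volume(SURFACE_AREA_FT2, efficiency)
    for brand, efficiency in brand_efficiency.items()
}
for brand, gallons in volumes.items():
    print(f"{brand}: {gallons:.2f} gal")

# The brand needing the least paint for this fixed surface area.
best_brand = min(volumes, key=volumes.get)
print("Least paint needed:", best_brand)
```

Swapping the roles (fixing one efficiency and varying the surface area) reuses the same function, which mirrors how the choice of input and parameter depends on the modeler's perspective.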
Certainly there are times when it is clear what mathematical technique is required (for example, factoring, finding zeros of a polynomial or a function, integrating a function, simulating a model over time to understand how output evolves, etc.). Other times, when it is not clear how to proceed, it may be helpful to analyze simple examples, special cases, or related problems. Even a “guess-and-check” approach can sometimes provide some deep insight. Mathematical experiments may be facilitated with a graphing calculator or computational software such as Excel, Mathematica, or Maple.

Multiple approaches can be taken to build a solution. We will show several approaches for each of the following examples, which appear in order of increasing mathematical level. We also show how to leverage software to assist you in finding solutions.

Example 1: Acid Rain
Let’s consider approaching a general problem by building different models that may lead to different results. Suppose we want to determine how acid rain is impacting the water resources in your town. We already know from our previous section that this open-ended statement needs to be refined into a concise problem statement. Let’s suppose we developed the following problem statement after some brainstorming and mind-mapping: Measure the levels of sulfur dioxide (SO₂) and nitrogen monoxide (NO) at four different locations and use the results to determine the best place for water to be extracted.

Approach: Ranking
Given the measurements for each location, rank the locations as safest, second safest, third safest, and least safe. This allows a qualitative approach but incorporates at least some quantitative analysis, since it requires us to assess the importance of SO₂ vs. NO and define the term safe.

Approach: Equation-Based Solution
We can assess the difference in importance between SO₂ and NO, and create an equation that assigns a score to each location. The location with the highest score wins. This requires algebraic modeling and manipulation as well as ideas of proportionality.

Approach: Qualitative Comparison
We can decide if any of the sites is too polluted. For example, if one site has the highest levels of both SO₂ and NO, then that site should not be used. This reduces the problem to choosing among three sites, and the same ideas can be applied to reduce them to two and then to a final one.

Example 2: Constant-Rate Disease Model
Often, when modeling real-world phenomena, we are interested in forecasting future values. That is, we want to explore how the value of something we care about changes over time. We will discuss several approaches that we could take to find a solution to the constant-rate disease model (in which we want to determine the number of people who are infected with a disease under the assumption that each person afflicted with a disease spreads it to exactly r people over a given time period τ). For this example, we’ll let $I_0 = 10$, τ = 2 days, and r = 5. In other words, we have a situation in which 10 people were infected with a disease that they, and all other future individuals who become infected, will each transmit to exactly 5 susceptible individuals every 2 days.

Approach 1. Computation by Hand
We start by doing some simple computations by hand to determine whether a pattern emerges. After 2 days, we have the original 10 people who have the disease, but we also have 50 more, because each of the 10 infected individuals has spread the disease to 5 others. Hence, when t = 2, I = 60. After 2 more days, we have I = 60 + (5 × 60) = 360, and so on. We could organize our numbers by putting them in a table, as in Table 1.

Approach 2. Computation via Technology
Straightforward iterative calculations such as those needed for this problem are easily “programmable” in spreadsheet software such as Microsoft Excel.
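A spreadsheet is not the only tool that can repeat a calculation. As a hedged sketch, the same step-by-step update (each infected person infects r more people every τ days) can be written in a few lines of Python. The function name and the tuple layout are illustrative choices; the parameter values are the ones used in this example.

```python
def constant_rate_table(i0, r, tau, steps):
    """Return (time in days, infected count) pairs for the constant-rate model."""
    rows = [(0, i0)]
    infected = i0
    for n in range(1, steps + 1):
        # Everyone already infected stays infected and each infects r more people.
        infected = infected + r * infected
        rows.append((n * tau, infected))
    return rows


# Parameters from this example: I0 = 10, r = 5, tau = 2 days, six time steps.
table = constant_rate_table(i0=10, r=5, tau=2, steps=6)
for t, infected in table:
    print(f"{t:>3} {infected:>8}")
```

The rows produced this way can be compared entry by entry with the hand computations in Table 1.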
Excel is a useful tool for visualizing step-by-step discretization, and it can allow you to see the value of each variable at each time step in table form. If you don’t know how to use Excel, you might find resources or videos by doing an internet search on performing iterative calculations in Excel. If we’ve done our computations in Excel, we can also create a plot of our solution easily, as in Figure 4.

Table 1. Constant-rate disease model computations, by hand

t (days)    I (infected population)
0           10
2           60
4           360
6           2160
8           12960
10          77760
12          466560

Figure 4. Graph of constant-rate disease model output with r = 5, τ = 2, and $I_0 = 10$. (Axes: time in days; infected population.)

Approach 3. Pattern Identification
We might notice as we perform computations by hand or by using technology that if we know the number of infected individuals at some time step, then we can get the infected population at the next time step by multiplying by 6. Let’s see why this happens. At time t = 0, we have 10 infected individuals. At time t = τ = 2 days, we have the original 10 infected individuals plus 50 more, for a total of 60.

For this set of parameters: $I(\tau) = 10 + 5 \cdot 10 = (1 + 5) \cdot 10 = 6 \cdot 10$.
For a generic set of parameters: $I(\tau) = I_0 + r I_0 = (1 + r) I_0$.

Continuing, at time t = 2τ = 4 days, we have the 60 from the previous time step, plus 5 times 60 more.

For this set of parameters: $I(2\tau) = 60 + 5 \cdot 60 = (1 + 5) \cdot 60 = 6 \cdot 60$.
For a generic set of parameters: $I(2\tau) = I(\tau) + r I(\tau) = (1 + r) I(\tau)$.

What does the formula look like for I(3τ)? (Try it!) Now we see where multiplication by 6 comes from, but we might be able to see an even deeper formula emerge. We can actually substitute our expression for I(τ) into the equation for I(2τ), as below.

For this set of parameters: $I(2\tau) = 6 \cdot 60 = 6 \cdot (6 \cdot 10) = 6^2 \cdot 10$.
For a generic set of parameters: $I(2\tau) = (1 + r) I(\tau) = (1 + r)(1 + r) I_0 = (1 + r)^2 I_0$.

You should verify that for I(3τ) we have the following equations.

For this set of parameters: $I(3\tau) = 6^3 \cdot 10$.
For a generic set of parameters: $I(3\tau) = (1 + r)^3 I_0$.

You might see a pattern emerging that would lead you to find a closed-form solution for the infected population after n time periods (that is, after nτ days) have passed.

For this set of parameters: $I(n\tau) = 6^n \cdot 10$.
For a generic set of parameters: $I(n\tau) = (1 + r)^n I_0$.

This result, an exponential model, is consistent with both the values in Table 1 and the corresponding graph in Figure 4. The results of each of the three approaches are perfectly valid for our stated assumptions, but we now see that the model may be limited in its ability to accurately describe some of the real-world characteristics of disease propagation. In the next section we discuss these questions, and we’ll revisit this model as a part of model assessment.

Example 3: Varying-Rate Disease Model
We haven’t defined the variables yet in the varying-rate disease model, so let’s do that now and set up the differential equation which is to be solved. We define the total population to be P, and each member of the population must belong to exactly one of two classes: susceptible, S, or infected, I. Hence, P = S + I. We assume that the total population remains constant, but the values of S and I change over time, so we might choose to write P = S(t) + I(t).

Recall that in the varying-rate disease model, we consider a population in which disease transmission rate is directly proportional to the product of the number of people infected and the number of people who are susceptible. We can think of transmission rate as the rate at which people become infected, or the rate of change of population I(t). Readers familiar with calculus may recognize this as the derivative. We will denote this rate of change of I(t) with respect to time as $\frac{dI}{dt}$. Then we can translate the assumption “disease transmission rate is directly proportional to the product of the number of people infected and the number of people who are susceptible” into the equation

$$\frac{dI}{dt} = k I(t) S(t), \qquad (4)$$

where k is a (positive) proportionality constant. The larger the value for k, the larger $\frac{dI}{dt}$ is. So a large k-value indicates a highly contagious disease.

In order to find a solution to the differential equation, it will be helpful if we can get down to only one dependent variable. Since we assumed that the population, P, is constant, we can take advantage of the relationship P = I(t) + S(t) to write S as S(t) = P − I(t). Then we can rewrite equation (4) as follows:

$$\frac{dI}{dt} = k I(t) \bigl(P - I(t)\bigr). \qquad (5)$$

Approach 1. Analytic Solution
If you’ve studied differential equations before, you might recognize that this particular differential equation can be solved analytically using a technique called separation of variables. (See any standard calculus text.) Using this technique and assuming that the initial condition is $I(0) = I_0$, you can find the solution to be

$$I(t) = \frac{P I_0}{I_0 + (P - I_0) e^{-Pkt}}. \qquad (6)$$

Alternately, if you have access to a symbolic computation tool such as Maple, you can use technology to generate this analytic solution.

With the analytic solution in hand, we can demonstrate how the model behaves by choosing some parameter values to generate a plot. (See Figure 5.) Notice that this model exhibits the same sort of exponential growth in the initial stages of disease propagation as the constant-rate disease model, but the rate slows when there are fewer people left who have not yet contracted the illness. We see that I approaches 1000 as time increases. In generating the plot, we used total population P = 1000 (a parameter), so our model predicts that over time the entire population contracts the disease.

Figure 5. Analytic solution of the varying-rate disease model output with k = 0.0006, P = 1000, and $I_0 = 20$. (Axes: time in days; infected population.)

Approach 2. Approximate Solution
While we can find an analytic solution for the differential equation above, many differential equations, as well as other equations and systems of equations that describe real-world phenomena, cannot be solved directly. In these situations, it is common to approximate a solution, often referred to as using a “numerical method.” Numerical methods are a powerful tool in modeling, especially if you have not yet had formal training in techniques from calculus and differential equations.

As an example, consider equation (5). Let’s suppose that we did not have the analytic solution (equation (6)) to evaluate at any time we choose. One numerical method would be to try to calculate approximate values of the infected population at specified later times. We know the initial population, $I(0) = I_0$, and we have a model describing the rate of change of the population. Intuitively we should be able to predict the number of infected people at a later time, say t = 1. To do this, we can express the change in the infected population over that time frame as

$$\frac{dI}{dt} \approx \frac{I(1) - I(0)}{1 - 0}$$

and plug what we know into the right-hand side of equation (5). We can then solve for I(1) in terms of $I_0$, which gives $I(1) \approx I_0 + k I_0 (P - I_0)$. We should be careful, however. What we really have is an approximation to I(1) because we approximated $\frac{dI}{dt}$, but our goal was to determine approximate values of the infected population, and that is what we’ve accomplished. We could then use our value of I(1) in the same way to predict I(2), and so on. This numerical method, called the forward Euler method, is easily implemented in Excel.

Figure 6. Numerical solution of the varying-rate disease model output with k = 0.0006, P = 1000, $I_0 = 20$, and time step Δt = 1 day. (Axes: time in days; infected population.)

We show a plot of the approximated values of the infected population at different points in time in Figure 6, and you can see the details of how we obtained these results in Appendix A. Notice that the plot of the numerical solution looks much like the plot of the analytic solution we saw in Figure 5. While we are pleased that the numerical solution does a good job of approximating the analytic solution, we should note that the graphs are not identical. Every numerical method introduces some error, and the forward Euler method is no exception. The study of error is a complicated issue, and we will not attempt to address it here.

Summary of Building Solutions
Finding a solution to your model may be achieved by various means, depending on your background knowledge in mathematics and software. Solving by ranking is an excellent method for students who do not have the mathematical training to produce algebraic formulas. Even with an equation, however, there are often multiple ways to arrive at a final answer. Software tools such as Excel can facilitate obtaining solutions. If you do use a numerical approximation technique, it is important to be aware of error that might be introduced. If nothing else seems to help, try guess-and-check.

In Summary
1. How you build a solution may depend on what mathematical tools are available to you.
2. There is often more than one way to tackle a problem, so just start and see what happens.
3. If you don’t immediately know how to solve the problem at hand, ask yourself the provided set of questions to help you get started.
4. Different solution methods can lead to solutions of different natures. This is perfectly acceptable.

Activity
Continue working with your model for a ranking system for roller coasters depending on your definition of “thrilling.” Use your ranking system to rank at least 10 roller coasters. You may find data you need at rcdb.com.

6. Analysis and Model Assessment

We separate this section into two subsections. The first subsection, Does My Answer Make Sense?, gives some quick checks to determine whether your solution is at all reasonable. The second subsection, Model Assessment, gives more in-depth techniques to analyze the model.

Often we are so excited that we built and solved a mathematical model that we forget to step back and carefully examine the results. While this is understandable since it took hard work to get to that point, it is essential to ask yourself, “Does my answer make sense?” Sometimes, the results may indicate a mistake in the calculations. Other times you may find that additional or alternate assumptions are needed for the solution to be realistic. If the results do make sense, then further analysis is needed to assess the quality of the model. Recall that open-ended questions may have more than one solution and that the results depend on the assumptions made and the level of sophistication of the mathematics used. An honest evaluation is necessary to explain when the model is applicable and when it is not. In this section we will talk about ways to analyze your results and how to assess the quality of your model.

Does My Answer Make Sense?
During the modeling process, you may gather some intuition about what the output will be like.
Here we provide some pointers on how to answer the question “Does my answer make sense?” by analyzing the output of the model.

• Is the sign of the answer correct? For example, if your disease model is supposed to calculate the number of infected people at a certain time, then clearly an answer of −1000 would not make sense. Carefully check your calculations, especially if you are using software. For example, in Excel it is easy to select the wrong cell when defining a formula. Your model may be correct, but its implementation may be at fault.

• Is the magnitude of the answer reasonable? If you are trying to estimate the speed of a car, for example, then it wouldn’t make sense if your model predicts a value of 1000 miles per hour. Sometimes, when the magnitude of a number is off, incorrect units may have been used somewhere in the process.

• Does the model behave as expected? If the output of your model is visualized with a graph or plot of any kind, then carefully look at the intercepts, the maximum or minimum values, or the long-term behavior. Were you expecting a horizontal asymptote, yet your graph just increases without bound? If you have a data set and believe there is a relationship between two variables, plot the data. A mathematical error in the sign of the slope will be immediately obvious. If the behavior is not what you expected, it could be that some assumptions were neglected, that units were erroneous, that parameters were unrealistic, or that the software was used incorrectly.

• Can you validate the model? Sometimes it is possible to validate your model using available data. For example, if you used your roller coaster ranking model on the Top Thrill Dragster at Cedar Point, which held the record for the tallest roller coaster in the world and goes 120 mph, and the output said it was only mediocre, then likely your model is not doing what you want it to.
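Several of these checks can be automated once a model is implemented. As a hedged sketch, the Python below re-implements the forward Euler update for the varying-rate disease model with the parameter values quoted for Figure 6 (k = 0.0006, P = 1000, I0 = 20, Δt = 1 day) and then tests sign, magnitude, and long-term behavior. The function names and the tolerance of one person are illustrative assumptions, and this is not the spreadsheet from Appendix A.

```python
import math


def euler_infected(k, population, i0, dt, steps):
    """Forward Euler values for dI/dt = k * I * (P - I), starting from I(0) = i0."""
    values = [i0]
    infected = i0
    for _ in range(steps):
        infected = infected + dt * k * infected * (population - infected)
        values.append(infected)
    return values


def analytic_infected(k, population, i0, t):
    """Closed-form solution of the same model (equation (6) in the guide)."""
    return population * i0 / (i0 + (population - i0) * math.exp(-population * k * t))


K, P, I0, DT, STEPS = 0.0006, 1000.0, 20.0, 1.0, 25
approx = euler_infected(K, P, I0, DT, STEPS)

# Sign: an infected count should never be negative.
sign_ok = all(value >= 0 for value in approx)
# Magnitude: the infected count should never exceed the total population.
magnitude_ok = all(value <= P for value in approx)
# Behavior: with no recovery in the model, the count should never decrease ...
never_decreases = all(b >= a for a, b in zip(approx, approx[1:]))
# ... and should level off near P (tolerance of one person is an assumption).
levels_off = abs(approx[-1] - P) < 1.0
# Validation against a known answer: largest gap to the analytic solution.
max_gap = max(
    abs(value - analytic_infected(K, P, I0, n * DT)) for n, value in enumerate(approx)
)

print(sign_ok, magnitude_ok, never_decreases, levels_off)
print(f"largest gap between Euler and analytic values: {max_gap:.1f} people")
```

The first four checks pass for these parameters, while the printed gap is a reminder that the numerical and analytic curves are not identical; with a one-day step the Euler values lag the analytic curve most during the fast-growth phase.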
model assessment

Now that you have verified that your model is correct, it is time to step back and consider the validity of your model. This includes identifying the strengths and weaknesses of your model and understanding at a deeper level the behavior of the model. Performing a sensitivity analysis, wherein you analyze how changes in the input and parameters impact the output, can contribute to understanding the behavior of your model.

Identifying Strengths and Weaknesses

Once you are convinced that the output is correct and the model is achieving what you want, assess the quality of the model. This assessment needs to be included in the write-up about your model to help people understand the conditions under which your model is applicable, which is strongly linked to the assumptions that were made along the way. It is necessary for you to provide an honest, exact assessment of the capabilities of your model.

This is also a chance to highlight the strengths of the model. For example, even if a model was formulated using simple physics, it might require very little expert knowledge in order to provide meaningful insight. This can be a huge advantage over a more complex model that requires the user to program and run software, research other model parameters in order to fit the model to his or her own needs, or sort through complicated output to be able to draw a conclusion. It is also a strength if the people who might use your model can understand it and have faith in it.

Let’s examine this process by looking at the constant-rate disease model. Recall our assumption that each infected person transmits the disease to r people every τ days. From this, we found the following exponential function to describe the infected population after nτ days:

\[ I(n\tau) = (1 + r)^n I_0. \]

Figure 7 shows the graph of the model output for I0 = 20 infected people, τ = 2 days, and r = 5.

Figure 7. Graph of constant-rate disease model output with r = 5, τ = 2, and I0 = 20 (vertical axis: Infected Population; horizontal axis: Time (days)).

This model has some valuable strengths:

• This model is easy to describe to others, which means that they might have increased confidence in its output. Allowing full understanding of all parts of the model can be valuable when trying to understand the significance of output given different input values.

• We found an analytic solution. If we have values for I0, τ, and r, the function I(nτ) = (1 + r)^n I0 can be used to solve for the number of infected individuals at any time.

• Our model is consistent with our assumptions. Our primary assumption is that the disease is spread at a constant rate. Our model accurately describes this behavior, and therefore we have developed a meaningful solution.

This model also has a few notable weaknesses:

• Our model is quite simple. In addition to being a strength, it is also a weakness. We may be concerned with the constant-rate assumption, especially since it is not consistent with our intuition regarding the spread of disease.

• This model is not valid for all types of disease. Our model only has two categories for individuals: susceptible and infected. In real life, after being infected with some diseases a person recovers and then acquires immunity to that specific disease. This model wouldn’t accurately describe that situation.

• This model predicts the disease spreads to everyone. In other words, under this set of assumptions, the disease will propagate through a population until every individual has the disease. This (hopefully) is not a reasonable scenario.

In this case, we have a model based on sound mathematics that provides us with good insight into disease propagation but also leads to some questionable real-world outcomes if taken as the final answer.
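Because the constant-rate model has an analytic solution, its output is easy to tabulate. The sketch below evaluates I(nτ) = (1 + r)^n I0 for the Figure 7 parameters; the function name `constant_rate_infected` is our own choice for this illustration.

```python
def constant_rate_infected(n, r, initial_infected):
    """Infected population after n periods of length tau: I(n*tau) = (1 + r)**n * I0."""
    return (1 + r) ** n * initial_infected


# Parameter values used for Figure 7.
r, tau, I0 = 5, 2, 20

for n in range(5):
    print(f"t = {n * tau} days: {constant_rate_infected(n, r, I0)} infected")
```

Note that nothing in the function caps the result at the size of the population, which is exactly the weakness listed above: the predicted number of infected people grows without bound.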
In general, awareness of your model’s capabilities leads to better overall solutions. With such awareness, you know when it is appropriate to use your model, and you also have a starting point from which to build future (more specific and/or realistic) models. Identifying your model’s weaknesses does not detract from the hard work you’ve done; it is always preferable for the modeler to acknowledge weaknesses in the model. If the reader identifies weaknesses that the modeler has missed, then the reader’s assessment of the model and the modeler suffers.

sensitivity analysis

When doing model assessment it is also vital to consider the model’s sensitivity to changes in the assumptions and parameters used to build it. A model is considered sensitive with respect to a parameter if small changes in that parameter lead to significant changes in the output. There are several ways to conduct a sensitivity analysis. A simple approach would be to consider a range of values for a certain parameter while keeping all other parameters fixed, calculate the output, and then determine the impact on the output.

For example, let’s take a look at the effect of changes to the transmission rate in the variable-rate disease model with P = 1000 and I0 = 20. Looking at Figure 8, we see how incremental changes to the transmission rate affect the output of our model. In particular, we note that when 0.0006 ≤ k ≤ 0.001, the disease has infected most (if not all) of the population after 15 days, but it takes just over 20 days for the disease to reach most of the population when k = 0.0004. When k = 0.0002, a good proportion of the population still has not been infected after 15 days.

Figure 8. Analytic solution of the varying-rate disease model output with P = 1000, I0 = 20, and k ranging from 0.0002 to 0.001 in increments of 0.0002 (vertical axis: Infected Population; horizontal axis: Time (days); one curve for each of k = 0.0002, 0.0004, 0.0006, 0.0008, and 0.001).

What does this mean? Well, that depends on the problem you are solving. In this example, we learned that changes to k speed up (or slow down) the spread of the disease through the entire population. If a population were infected with such a disease, then this model could demonstrate that a drug that lowers the disease transmission rate could buy the time needed to develop an effective course of treatment.

In our sensitivity analysis, the changes in k were fairly small (0.0002). How do you know how big (or small) your variation should be? In some cases you may have real-world data to help you make a decision. If that’s not available, use common sense as your guide and play around with the numbers to develop intuition. For example, in the variable-rate disease model, we could also hold k constant and investigate whether changes to the initial infected population I0 affect our outcome. In this instance we will definitely vary I0 in increments larger than 0.0002.

model refinement

If time allows, assessment and sensitivity analysis can lead to improvements in the model. Modeling, as pointed out earlier, is an iterative process, and refinements can almost always be made to develop more realistic scenarios. If the modeling is being done in a timed setting, such as for a competition or a homework assignment, then this may not be possible (although it is generally possible to indicate the type of refinements that would improve the model). However, for long-term projects, the model assessment really is an intermediate step before (possibly) starting the modeling loop over again.

Discussing possible modifications to the model, even if you cannot make them, demonstrates that you are able to think beyond just the first approach. We demonstrate how this could be done with the variable-rate disease model. In our previous disease model, the population was split into two classes: infected and not infected. If, however, we consider something such as an influenza outbreak, we know that people are capable of transmitting the flu for a short period of time, but eventually they develop immunity and will no longer spread the disease. This dynamic is certainly not captured with our initial disease models.

With our new considerations in mind, we want infected individuals to be able to move out of the infected population after their bout with the flu is over. It doesn’t make sense to put them back into the susceptible population because we know that they have developed immunity. So we need to create another class, R, which represents those who have recovered from the disease. We begin by defining the following:

P = the total number of people in the population,
S = the number of people in the population who are susceptible,
I = the number of people in the population who are infected and transmitting the disease,
R = the number of people in the population who have “recovered” (i.e., they are not susceptible and no longer transmit the disease).

Notice that for any time, P = S + I + R. We will assume that each individual who becomes infected takes the same amount of time to recover. That is, our model will not take into account the possibility that one person recovers in 3 days while another takes 5 days to stop being a carrier of the disease (although a later refinement of our model might include a random distribution of recovery intervals).

Let’s move forward looking at a specific example. Suppose at time t = 0, 12 people of a total population of 1000 become infected with a novel disease. We assume that each infected person recovers after exactly 4 days and that each infected person transmits the disease to 2 others during that 4-day period. So we will let τ = 4 days. At time t = 0, then, we have I0 = 12, S0 = 988, and R0 = 0. After a 4-day time period, i.e., when t = τ, all 12 of the infected individuals move to recovered status. In the meantime, they have infected 24 susceptible individuals. So we have

\[ I(\tau) = 2 \cdot I_0 = 2 \cdot 12 = 24, \]
\[ S(\tau) = 988 - I(\tau) = 988 - 24 = 964, \]

and

\[ R(\tau) = R_0 + I_0 = 0 + 12 = 12. \]

We can continue as shown in Table 2 and Figure 9. (In the last row of the table, only 244 susceptible people remain, so all of them become infected and S drops to 0.) Our refined model now shows a situation more consistent with our intuition regarding a disease such as influenza; it starts out slowly, spikes, and then dies out as the remaining infected population recovers. This model is not perfect, but it does provide us with insight into the dynamic interaction of the affected populations.

Table 2. Infected, susceptible, and recovered populations over time for parameter values P = 1000, r = 2, τ = 4, and I0 = 12

t     S(t)   I(t)   R(t)   P(t)
0     988    12     0      1000
τ     964    24     12     1000
2τ    916    48     36     1000
3τ    820    96     84     1000
4τ    628    192    180    1000
5τ    244    384    372    1000
6τ    0      244    756    1000

Figure 9. Graph of constant-rate disease SIR model output for susceptible (S), infected (I), and recovered (R) populations for parameter values P = 1000, r = 2, τ = 4, and I0 = 12 (vertical axis: Population; horizontal axis: Time (days)).

Activity

Read the solution to the recycling problem. What are the strengths and limitations of that model? What are parameters in the recycling model that could be examined for sensitivity? How might the sensitivity of the recycling model be analyzed? Write a few paragraphs assessing this model.

in summary

1 Be sure to allocate time to analyze your results since it is indeed a critical part of the entire modeling process.

2 Always examine the output you get from your model and ask yourself if it makes sense. If your answer doesn’t make sense, verify that you haven’t made a mistake in implementing your model.
3 If your solution is consistent with your assumptions but not consistent with the real-world phenomenon you are trying to describe, you may need to refine your model by adjusting your assumptions.

4 List strengths and weaknesses of your model.

5 Try to determine how sensitive your model is to parameters and assumptions.

6 Include specific improvements you would have incorporated given more time.

7. putting it all together

Now that a model has been created, solved, and assessed, the time comes to write everything up as a polished solution paper. This step is just as important as the effort necessary to get to this point. Keep in mind that you are the expert about the problem, and now your role is to explain what you did in detail to people unfamiliar with your solution approach. To this end, it is critical that you take good notes from the initial brainstorming process through the final analysis to be sure you have kept track of all the assumptions you made. Good writing also takes time, so be sure to allocate a period of time to step back from the math modeling and focus on quality writing. In this section we discuss how to structure your report and some key points for successful technical writing.

structure

A technical report typically starts with a summary page, also called an executive summary or an abstract, that is one page or shorter. This is not an introduction; it’s actually a place to summarize how the problem was solved and to provide a brief description of the results. It might seem strange to put the conclusion at the beginning, but this “bottom-line up front” approach is convenient for those reading your report. The abstract or summary page should restate the problem, briefly describe the chosen solution methods, and provide the final results and conclusions.
You should describe your results in complete sentences that can stand on their own, without using variables. The summary lets the reader know what to expect in the report but does not overwhelm him or her with unfamiliar mathematical notation. Imagine that a reader will decide whether to continue reading the rest of your paper based on this abstract. As an example, see the solution to the recycling problem.

After the summary, the paper should include a formal introduction that includes a restatement of the underlying real-world application as if the reader does not have any prior knowledge. This section usually contains some motivation or relevant background information as well but should not include a lengthy history lesson. Both the general modeling question and the concise problem statement that you developed should be at the forefront of this section. This section must provide a paragraph that describes how you approached the problem. For example, if we consider the task of ranking roller coasters based on how thrilling they are, then it would help to define “thrilling” up front. One such statement might be, “Our model is based on the notion that rides with high accelerations, inversions, and significant heights are thrilling.” Note that this statement doesn’t exactly explain how those features are quantified or implemented for a mathematical ranking system for thrilling roller coasters, but it does give the reader an idea of what will be showing up later in the document.

It may seem counterintuitive, but the introduction and abstract are typically written last. This is because, after all else has been written, the author has a complete picture of the manuscript and may then best tailor these sections accordingly.

The body of the solution paper will likely be several pages long and split up into sections about assumptions, variables, the model, the solution process, analysis, and overall conclusions.
Let the reader know about the overarching assumptions you made to make the problem solvable. Some specific assumptions may need to be included again later, within the paper’s main text, in order to clarify certain ideas as they are developed. However, the most important message here is that all assumptions are included and listed at some point in your write-up. You should be sure to justify why those assumptions are reasonable and include citations as needed. Plagiarism of any kind is never acceptable.

When you next begin to describe your model and how you solved it, clearly state the variables you will be using and the corresponding units. If there are relationships between variables, explain where they come from and, if needed, refer to any necessary assumptions. Mathematical equations and formulas should be centered, each occurring on its own line. We provide some more specific details on this in the following section.

Finally, the paper must have a conclusion section that recaps the important features of the model. It is critical that this section includes an analysis of your results, as described in the previous section. An honest assessment of the strengths and weaknesses is important. In particular, you can comment on how the model can be verified and how sensitive the model is to the assumptions. We proceed by giving some tips about technical writing.

TECHNICAL WRITING DOS & don’ts

1. Do not write a book report. It is critical that the narrative of your solution doesn’t read like a story about how you came up with your model. For example, consider the following two excerpts about an assumption made for the recycling model.

Example 1: A study of drop-off recycling participation in Ohio found that 15.5% of citizens who do not have access to curbside recycling use drop-off recycling [8]. We have assumed that this data is representative of the U.S.

Example 2: We were stuck because we did not know how many people in the U.S.
recycle. We googled and found an article that Ohio’s participation in drop-off recycling was 15.5% for people who did not have access to curbside recycling, so we used that number in our model.

The second example is written in a way that makes the assumption sound invalid or that it was chosen only because no other information could be found. However, the first one sounds as though some research was done and a useful and legitimate source was identified, which provided an applicable statistic.

2. Do not use words when using mathematics would be more appropriate. Which of the following is more effective?

Example 1: For an ideal gas, we have

\[ P = \frac{nRT}{V}, \]

where P is the absolute pressure of the gas, V is the volume of the gas, n is the amount of substance of gas (measured in moles), T is the absolute temperature of the gas, and R is the ideal, or universal, gas constant.

Example 2: For an ideal gas, the absolute pressure is directly proportional to the product of the number of moles of the gas and the absolute temperature of the gas and inversely proportional to the volume of the gas, with proportionality constant R, the ideal, or universal, gas constant.

In this case, the mathematics is easier to follow, and you can imagine that the more complicated the calculations, the harder it would be to try to describe in words.

3. Do use proper sentence structure when explaining mathematics. Communicating mathematics requires proper punctuation, such as periods at the end of a computation if the computation ends the sentence, as in Example 1 below. Use commas when appropriate.

Example 1: If s is the length of the side of the square box, then the area of a side is given by

\[ A = s^2, \]

and the volume is given by

\[ V = s^3. \]

Example 2: If s is the length of the side of a square box then we can find the area and volume.

\[ A = s^2 \]
\[ V = s^3 \]

4. Do NOT substitute mathematical symbols for words within sentences, as in the second of the following two examples.
Example 1: For this work L is the length of the side of a rectangle.

Example 2: For this work L = the length of the side of the rectangle.

5. Do pay attention to significant figures. For example, your calculator might read a value of 27.3416927482, but you may not need to report all of those digits unless you are trying to show accuracy in the later decimal places.

6. Do use scientific notation when numbers vary by orders of magnitude, meaning that the exponent is really what matters in understanding the significance of the value. For example, the diameter of the sun is 1.391e6 km, while the circumference of a baseball is about 2.290e−4 km.

7. Do label figures and use a large enough font so that the axes are clearly readable.

8. Do not forget to include units as appropriate.

9. Do check carefully for spelling and grammar mistakes, especially those that spell check might miss. For example, it’s easy to confuse their, there, and they’re.

10. Do give credit where credit is due. This means including citations and building your bibliography as you go.

Technical writing takes practice, but the end result should be something of which to be extremely proud. In reviewing your final paper, you can step back and look at all you have accomplished throughout the modeling process. Ultimately your model can lead to the creation of new knowledge and provide deeper insight about the world we live in. Remember that modeling also takes practice, so the next time you tackle an open-ended problem you will already have a new set of tools that will make the entire experience go more smoothly.

in summary

1 Take notes throughout your entire modeling process so that you do not leave out anything important, especially assumptions made along the way.

2 Give yourself enough time to focus on the writing process and to proofread the report.

3 Keep in mind that this is a technical document, not a story about your modeling experience.
4 Follow the guidelines for technical writing.

5 Some additional references on technical writing can be found at [3].

6 Pat yourself on the back for your accomplishments.

Appendices & References

A. FORWARD EULER METHOD

Let’s explore how the forward Euler method works in the context of the varying-rate disease model. As a reminder, our model is described with the differential equation

\[ \frac{dI}{dt} = kI(t)\,(P - I(t)), \]

where I is the number of infected individuals, P is the total population, and k is a positive constant. We’ll use the same parameter values we used to plot the analytic solution: k = 0.0006, P = 1000, and I0 = 20.

The method we will use here is called the forward Euler method, which takes advantage of the fact that the derivative is the same thing as the slope of the tangent line. We’ll demonstrate the method on this example, but do not want to overwhelm the reader with too many details. We point you toward [9, 4] for a deeper look into this very powerful approximation method.

We start with the initial population, I0 = 20. That is, when t = 0 days, I = 20. In other words, we know that the point (0, 20) is on the graph of the solution. We also know what the slope of the solution curve is at that point because we have an equation for dI/dt:

\[ \left.\frac{dI}{dt}\right|_{t=0} = k\,I(0)\,(P - I(0)) = k I_0 (P - I_0) = (0.0006) \cdot 20 \cdot (1000 - 20) = 11.76. \]

Now we can write the equation of the line through the point (0, 20) with slope of 11.76 as follows. Recall that the independent variable is t and the dependent variable is I.

\[ I - 20 = 11.76(t - 0), \]
\[ I = 11.76(t - 0) + 20. \]

Figure 10. Numerical solution of the varying-rate disease model output with k = 0.0006, P = 1000, I0 = 20, and time step Δt = 1 day (vertical axis: Infected Population; horizontal axis: Time (days)).

Hence, we have a line with slope of 11.76 and y-intercept (or I-intercept, in this case) of 20.
We can use this linear function as an approximation for the true solution. We assume that the solution to the differential equation is approximately the same as this line for points nearby. Perhaps we’ll assume it’s a good enough approximation through time t = 1. We can find the number of infected individuals at t = 1:

\[ I(1) = 11.76(1 - 0) + 20 = 31.76. \]

(Note: It’s not possible to have a fractional person, or more specifically 0.76 of an infected individual, but that doesn’t deter us from continuing to approximate the solution using this method. We do suggest noting that something unrealistic has occurred and encourage you to re-examine it later when assessing your model.)

Thus we have a line segment from (0, 20) to (1, 31.76). Now you can imagine us starting the process over. In other words, we assume that the point (1, 31.76) is on the solution curve, and we can use the derivative to give us the slope at that point:

\[ \left.\frac{dI}{dt}\right|_{t=1} = k\,I(1)\,(P - I(1)) = (0.0006) \cdot 31.76 \cdot (1000 - 31.76) = 18.45. \]

As before, we can find the equation of the line through the point (1, 31.76) with slope of 18.45.

\[ I - 31.76 = 18.45(t - 1), \]
\[ I = 18.45(t - 1) + 31.76. \]

We will assume that this makes a good enough approximation for the solution through t = 2. Thus we estimate the population at time t = 2 to be

\[ I(2) = 18.45(2 - 1) + 31.76 = 50.21. \]

Once again, we have identified a solution method requiring multiple iterative calculations, which can be easily performed using technology such as Excel. As before, now that we have a table of values in Excel, we can generate a plot of our numerical solution. (See Figure 10.)
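The same iteration can be written as a short loop instead of a spreadsheet. The following is a minimal sketch of the forward Euler steps worked out above, using the same parameter values; the function name `forward_euler_logistic` and its argument names are our own choices for this illustration, and the code is not the authors’ Excel implementation.

```python
def forward_euler_logistic(k, P, I0, dt, n_steps):
    """Approximate dI/dt = k*I*(P - I) with the forward Euler method.

    Returns two lists: the times and the estimated infected population at each time.
    """
    times = [0.0]
    infected = [float(I0)]
    for _ in range(n_steps):
        current = infected[-1]
        slope = k * current * (P - current)    # derivative at the current point
        infected.append(current + slope * dt)  # follow the tangent line for one step
        times.append(times[-1] + dt)
    return times, infected


# Parameter values from the worked example: k = 0.0006, P = 1000, I0 = 20, dt = 1 day.
times, infected = forward_euler_logistic(0.0006, 1000, 20, 1.0, 25)
for t, value in zip(times[:3], infected[:3]):
    print(f"t = {t:.0f} days: I = {value:.2f}")
```

The first two steps reproduce the hand calculations for I(1) and I(2), up to the rounding of the slope to 18.45 in the text. Calling the function again with a different k, for example each of the values 0.0002, 0.0004, 0.0006, 0.0008, and 0.001, is one possible way to carry out the kind of one-parameter sensitivity analysis described in Section 6.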
B. The 2013 M3 Challenge problem and the solution paper from team 1356

Moody’s Mega Math Challenge 2013
A contest for high school students

SIAM
Society for Industrial and Applied Mathematics
3600 Market Street, 6th Floor
Philadelphia, PA 19104 USA
M3Challenge@siam.org
M3Challenge.siam.org

Waste Not, Want Not: Putting Recyclables in Their Place

Plastics are embedded in a myriad of modern-day products, from pens, cell phones, and storage containers to car parts, artificial limbs, and medical instruments; unfortunately, there are long-term costs associated with these advances. Plastics do not biodegrade easily. There is a region of the Northern Pacific Ocean, estimated to be roughly the size of Texas, where plastics collect to form an island and cause serious environmental impact. While this is an international problem, in the U.S. we also worry about plastics that end up in landfills and may stay there for hundreds of years. To gain some perspective on the severity of the problem, the first plastic bottle was introduced in 1975 and now, according to some sources, roughly 50 million plastic water bottles end up in U.S. landfills every day.

The United States Environmental Protection Agency (EPA) has asked your team to use mathematical modeling to investigate this problem.

How big is the problem? Create a model for the amount of plastic that ends up in landfills in the United States. Predict the production rate of plastic waste over time and predict the amount of plastic waste present in landfills 10 years from today.

Making the right choice on a local scale. Plastics aren’t the only problem. So many of the materials we dispose of can be recycled. Develop a mathematical model that a city can use to determine which recycling methods it should adopt. You may consider, but are not limited to:

• providing locations where one can drop off pre-sorted recyclables
• providing single-stream curbside recycling
• providing single-stream curbside recycling in addition to having residents pay for each container of garbage collected

Your model should be developed independent of current recycling practices in the city and should include some information about the city of interest and some information about the recycling method. Demonstrate how your model works by applying it to each of the following cities: Fargo, North Dakota; Price, Utah; Wichita, Kansas.

How does this extend to the national scale? Now that you have applied your model to cities of varying sizes and geographic locations, consider ways that your model can inform the EPA about the feasibility of recycling guidelines and/or standards to govern all states and townships in the U.S. What recommendations does your model support? Cite any data used to support your conclusions.

Submit your findings in the form of a report for the EPA.

The following references may help you get started:
http://www.epa.gov/epawaste/nonhaz/municipal/index.htm
http://5gyres.org/what_is_the_issue/the_problem/

Moody’s Mega Math Challenge supports Mathematics of Planet Earth (MPE2013). www.mpe2013.org

Organized by SIAM, Society for Industrial and Applied Mathematics

Team #1356, Page 1 of 20

Analysis of Plastic Waste Production and Recycling Methods

EXECUTIVE SUMMARY

In 2010 alone, the U.S. generated approximately 250 million tons of trash [1]. Much of this waste consisted of plastics, which build up in landfills and flow into oceans through storm drains and watersheds [2], breaking up into little pieces and absorbing contaminants in the process. A major method to reduce waste is recycling, where materials like glass, paper, and plastic are reformed to create new products. There are many different methods of collection of recyclable materials, including drop-off centers, where citizens transport their recyclables; single stream curbside collection, where the city collects the recyclables of each household; and dual stream curbside collection, where the city collects recyclables that are pre-sorted by each household. To encourage or subsidize recycling programs, some cities may implement a Pay-As-You-Throw (PAYT) program, where citizens pay a fee based on the amount of garbage they throw away.
The EPA tasked us to analyze the production and discard rate of plastic waste over time. We were also asked to create a model of possible methods for recycling collection to determine which methods are appropriate for what cities. Using a linear regression model over years passed since 2000, we estimated that 35.1 million tons of plastic waste will be discarded in 2023. We also modeled the use of drop-off centers, single stream curbside collection, and dual stream curbside collection to calculate the total amount of recyclables collected as well as the cost to the city using each recycling method.

For collection using drop-off centers, we developed a simulation that randomly simulated the number of households who would recycle when drop-off centers were placed around the city. The simulation took into account the area, population, average household, maximum distance citizens are willing to travel, and number of drop-off centers. Using these data, we calculated the amount recycled and then calculated the net cost to the city by subtracting operating costs of Material Recovery Facilities (MRFs) from revenue generated by selling recycled products.

For curbside collection, we calculated the number of trucks needed to service a given city, based on population density. Based on labor, upkeep, and fuel, we calculated the costs of a curbside collection program. Again, using the calculated amount of collected material, we determined the net revenue generated by these products.

We determined that by using a drop-off center method, Fargo and Wichita would generate profits, while Price would incur costs that could be partially covered by using Pay-As-You-Throw. Using a curbside collection method, Fargo and Price would incur costs that could be partially covered by Pay-As-You-Throw, while Wichita would generate profits using either single or dual stream collection.
Thus, either drop-off or curbside collection methods may be feasibly implemented in cities around the U.S., depending on population and area of each city. We concluded that small cities tend to incur net costs from recycling programs, while larger cities like Wichita may profit from using a dual stream curbside collection program. To assess use of recycling programs on a national level, we programmed a computer simulation generating an image of all the counties of the U.S., where blue dots on the U.S. map represented counties where at least one of our three proposed recycling programs earned a net profit. In general, we recommend that the EPA extend recycling program guidelines to the national level.

I. INTRODUCTION

1. Background

Each year, the U.S. consumes billions of bags and bottles. However, of the plastics that the U.S. produces, only 5% is recovered [2]. Unrecycled plastics present a growing hazard because they contain dangerous chemicals like polycarbonate, polystyrene, PETE, LDPE, HDPE, and polypropylene, which accumulate over time and build up in our oceans and landfills. As such, it is important to assess the scale of our plastic waste production problem over time.

Our foremost method of reducing wastes like plastics is through recycling, where useful materials including glass, plastic, paper, and metals are recovered so that they may be used to create new products [3]. There exist several methods of recycling collection; in general, cities may use either drop-off centers or curbside collection. With drop-off centers, the residents carry the burden of transporting their recyclable waste, while curbside collection places this burden on the city. If a city implements curbside collection, it may choose to use single stream, dual stream, or pre-sorted methods; in single stream, all recyclables are collected as one unit, whereas in dual stream, recyclables are separated into two streams: paper, and commingled glass, cans, and plastic [4].
Further separation exists with the pre-sorted collection method, where recyclables are fully separated by material type [5]. There are advantages and disadvantages associated with each method of collection, and in choosing the type of recycling program to implement, cities must consider, among other factors, the practicality of individual household collection, as well as the volume of recyclables that would be collected using each program [6]. Some communities may use Pay-As-You-Throw (PAYT) programs, which encourage residents to recycle their waste so as to avoid fees dependent on the weight of their trash [7]. We assess in this analysis whether it is more efficient to use drop-off centers or curbside collection, depending on the city where the recycling program is being implemented, as well as the effect of using PAYT programs to generate additional revenue for the city.

2. Restatement of the Problem

In this analysis, we were requested by the EPA to create a model to predict the change in plastic production rate over time, as well as the amount of plastic waste in landfills in the year 2023. We were further asked to look at various recycling methods, not limited to the recycling of plastics, and to analyze the recycling method a city should develop, using as sample points the cities of Fargo, North Dakota; Price, Utah; and Wichita, Kansas. Finally, the EPA requested that we provide recommendations for developing recycling methods on the national level based on the model we designed.

3. Global Assumptions

Throughout our analysis, we will make the following assumptions:
1 A city’s population is approximately evenly distributed. Population mostly varies on a large scale: in the small microcosm of a city, the population density will not vary much.
2 A city’s shape is approximately square. Most cities are shaped like this, as are the three sample cities we were provided with.
3 A city’s roads are laid out in a grid plan.
The popularity of the grid plan is pervasive, dating back to Ancient Rome, and most cities are organized as such, like our three sample cities [8, 9, 10].
4 A household’s recycling stance is consistent. That is, a household that recycles always recycles, and a household that does not will never recycle. Recycling is a habit, and households that recycle tend to recycle consistently.

II. ANALYSIS OF THE PROBLEM AND THE MODEL

1. Plastic Waste Production

Assumptions
1 We used data collected from the past ten years because the first plastic bottle was introduced in 1975 [11], and recycling has only become important recently. In other words, values used before 2000 would not adequately take into account the recycling methods which have now become widespread.

Approach
We created our model by performing linear and logistic regressions on the amount of plastic waste discarded per year for the last decade, in thousands of tons, as provided by the EPA [1].

Model
Discarded Plastic Waste (thousands of tons) = 463.27 * (years since 2000) + 24443.6
R^2 = .803; S = 801

The R^2 value of .803 means that 80.3% of the variability in the amount of plastic waste discarded is explained by the linear relationship between years passed since 2000 and plastic waste amount. The standard deviation of the residuals was 801. Based on this model, the amount of unrecycled plastic waste discarded in 2023 will be 463.27 * 23 + 24443.6 = 35098.81 thousands of tons, or 35.1 million tons.

Discarded Plastic Waste (thousands of tons) = 23733.8 + (28014.1 - 23733.8) / (1 + exp((Years since 2000 - 2.71325) / -0.752611))
S = 422

The previously mentioned R^2 value only makes sense under the assumption that the linear model was appropriate. Since there is a prominent bend in the data, we fit them with a logistic curve as well. Statistical software found the four parameters using a successive approximation method and produced the model above.
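The two fitted curves quoted above can be evaluated directly. The following is a minimal Python sketch, not the team's own software: the coefficients are copied from the text, while the function names are hypothetical and chosen only for illustration.

```python
import math

# Coefficients are taken from the two fitted models in the text.
# Function names are illustrative assumptions, not the team's code.

def linear_discarded(years_since_2000):
    """Linear fit: discarded plastic waste in thousands of tons."""
    return 463.27 * years_since_2000 + 24443.6

def logistic_discarded(years_since_2000):
    """Logistic fit: discarded plastic waste in thousands of tons."""
    lower, upper = 23733.8, 28014.1
    midpoint, scale = 2.71325, -0.752611
    exponent = (years_since_2000 - midpoint) / scale
    return lower + (upper - lower) / (1 + math.exp(exponent))

t = 2023 - 2000
linear_2023 = linear_discarded(t)
logistic_2023 = logistic_discarded(t)
print(f"linear:   {linear_2023:.2f} thousand tons")
print(f"logistic: {logistic_2023:.2f} thousand tons")
```

By 2023 the exponential term in the logistic fit is negligible, so the logistic prediction sits essentially at its upper plateau, while the linear fit keeps growing.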
The standard deviation of the residuals in this model is only 422, roughly half that of the linear model. Unfortunately, this model assumes that the tonnage of discarded plastic waste will level off, which is not entirely reasonable. It does, however, give a best-case result (e.g. if recycling initiatives work perfectly). The projected value of discarded waste for 2023 is 28014.1 thousand tons (within 4 decimal places), which is the maximal value according to the model.

In summary, the linear model (which seems to overpredict the later values) yields a value of 35.1 million tons, while the logistic model predicts that discarded waste will level off at 28.0 million tons. The US population has been increasing linearly since 2000 [12], so the linear model gives a more plausible value for the next ten years.

2. Recycling Methods

Assumptions
1 City shape can be approximated as a square or diamond. Most cities in the U.S. are square-like in shape, including Fargo, North Dakota; Price, Utah; and Wichita, Kansas.
2 The streets of the city are laid out in a grid. Many large cities have streets following a grid, and Fargo, North Dakota; Price, Utah; and Wichita, Kansas all use grid systems.
3 There is no overlap between use of drop-off centers and curbside collection.
4 The composition of recyclables in the MSW stream is fixed over the entire planning horizon.

Model 1: Drop-off Centers

Approach

Assumptions
1 Each household makes a collective decision on whether or not to recycle because it is convenient for a household to transport all of their recyclables to a drop-off center together.
2 The probability of a household’s deciding to recycle varies linearly with the household’s distance to the nearest drop-off center.
3 Recycling households recycle all recyclable waste.
To assess the amount of recyclables collected by a recycling program dependent on drop-off centers, we created a computer simulation where we assumed uniform population density and where we place equally spread drop-off centers around the city, as many as would fit without overlapping coverage. To determine whether each household would recycle, a random number from 0 to 1 is generated, and if the number is less than the household’s probability of recycling, which we assumed varies linearly with the household’s distance to the nearest drop-off center, the household recycles. We also determined the cost of maintaining each recycling center and the revenue the center would generate, and used these data to calculate the total cost to the city of the drop-off center program. In our simulation, we accepted as inputs the area of the city, population of the city, average number of people in a household, maximum distance citizens are willing to travel, and number of drop-off centers.

Taxicab Distance

Because streets are assumed to be organized in a grid, we calculate distance as “taxicab distance”, or distance in which the only path allowed consists of horizontal and vertical lines. In other words, given px and py as the coordinates of the drop-off center, and x and y as the coordinates of the household, the distance between them, d, can be calculated as:

d = |y - py| + |x - px|

A Household’s Maximum Distance to a Drop-off Center

Assumption
1 Recycling households make biweekly trips to a drop-off center.

We recommend that cities conduct a survey to determine the distance their citizens are willing to travel in order to recycle, though we calculated this distance in our model. U.S. citizens are willing for their household to pay $2.29 a month for curbside collection [13].
Since this is the amount that they are willing to pay to recycle at greatest convenience, we can assume that it is equivalent to the maximum amount they are willing to pay as the driving cost to a drop-off center. The average price of a gallon of gas is $3.784 [14] and the average mileage of a passenger car in 2010 was 23.8 mpg [15]. The cost of traveling a distance d is:

Cost = ($3.784/gallon) * d / (23.8 miles/gallon)

The distance citizens are willing to travel each week is:

($2.29/month) / (4.35 weeks/month) = $0.53/week
$0.53/week = ($3.784/gallon) * (d miles/week) / (23.8 miles/gallon)
d miles/week = ($0.53/week) * (23.8 miles/gallon) / ($3.784/gallon) = 3.33 miles/week

Since citizens must drive to the drop-off center and back, the maximum distance driven to the drop-off center is 1.665 miles/week. Assuming that households make biweekly recycling trips, the maximum distance from a household to a drop-off center for the household to consider recycling is 1.665 miles/week * 2 weeks = 3.33 miles. A study of drop-off recycling participation in Ohio supports our model, finding that the functional usage area of a full-time urban drop-off center is about 3.5 miles [16].

Number of Recycling Households Covered by a Drop-off Center

Assumption
1 The available data from Ohio are representative of the U.S. as a whole.

Each drop-off center will receive recyclables from households up to 3.33 miles away. Using taxicab distance, which only allows horizontal and vertical movement, the area within 3.33 miles is bounded by a diamond (a square rotated 45°). The diagonal of the diamond is twice the distance from the center to a corner, or 2 * 3.33 miles = 6.66 miles. Since the diamond is a square, diagonal length = square root(2) * side length, so the side length is 4.71 miles. The area of the diamond is side length ^ 2 = 22.18 sq. mi.
This is the coverage area of the drop-off center, which contains all the households that will consider using the drop-off center. The number of households in the drop-off center’s coverage area is:

Households = 22.18 sq. mi. * (population / land area) / (average household size)

A study of drop-off recycling participation in Ohio found that 15.5% of citizens who do not have access to curbside recycling use drop-off recycling [16]. Assuming that these data are representative of the U.S. as a whole, the number of recycling households covered by each drop-off center is:

Recycling households = 22.18 sq. mi. * (population / land area) / (average household size) * .155

In our simulation, we assigned 15.5% as the median household probability of recycling. The closer a household is to the drop-off center, the more likely it is to recycle. Within the drop-off center coverage area, the closer half of households has a greater than 15.5% recycling probability and the farther half of households has a less than 15.5% recycling probability. The distance from the center to the boundary of the closer half of households is the diagonal of the square with half the area of the entire coverage area, which is:

Halfway distance = square root(22.18 sq. mi. / 2) * square root(2) = 4.71 miles

We assumed that the probability of a household recycling varies linearly with the distance to the nearest recycling center. At a distance of 4.71 miles, the probability is 15.5%. At the boundary distance of 6.66 miles, the probability is 0%. Extending the line through these points, at the center, with a distance of 0 miles, the probability is 52.9%. In our simulation, the number of recycling households covered by each drop-off center is approximately the same as that calculated using the formula previously given.
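The geometry and the recycling decision rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the team's simulation: the constants (3.33, 4.71, 6.66 miles and 15.5%) are taken from the text, while every function name and the clamping of the probability at zero beyond 6.66 miles are assumptions made for the example.

```python
import math
import random

# Distances (miles) and the 15.5% median probability come from the text.
MAX_DISTANCE = 3.33      # farthest household that considers recycling
MEDIAN_DISTANCE = 4.71   # distance at which the text sets p = 15.5%
ZERO_DISTANCE = 6.66     # distance at which the text sets p = 0%
MEDIAN_PROBABILITY = 0.155

def taxicab_distance(x, y, px, py):
    """Grid distance between a household (x, y) and a center (px, py)."""
    return abs(y - py) + abs(x - px)

def coverage_area(max_distance):
    """Area of the diamond of points within max_distance (taxicab)."""
    diagonal = 2 * max_distance
    side = diagonal / math.sqrt(2)
    return side ** 2

def recycle_probability(distance):
    """Line through (4.71, 0.155) and (6.66, 0); clamped at 0 (assumption)."""
    slope = MEDIAN_PROBABILITY / (ZERO_DISTANCE - MEDIAN_DISTANCE)
    return max(0.0, slope * (ZERO_DISTANCE - distance))

def household_recycles(distance, rng=random):
    """Draw a number in [0, 1); the household recycles if it falls below p."""
    return rng.random() < recycle_probability(distance)

print(f"coverage area: {coverage_area(MAX_DISTANCE):.2f} sq. mi.")
print(f"probability at the center: {recycle_probability(0):.3f}")
```

Passing a seeded `random.Random` instance as `rng` makes individual runs repeatable, which is convenient when comparing runs with slightly different inputs.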
Drop-off Center Placement

In our simulation, we placed as many drop-off centers as possible in each city so that none of the coverage areas overlap, with at least one drop-off center in each city. The cost efficiency of drop-off centers decreases when their coverage areas overlap.

Annual Amount Recycled

The average American generates 4.5 pounds of waste per day [17], about 75% of which is recyclable [18]. Thus, the average American generates 4.5 pounds * 0.75 = 3.375 pounds of recyclable waste per day.

Annual Amount Recycled (tons) = (recycling households) * (average household size) * 3.375 lb/day * 365 days/year * 0.0005 tons/lb * (# drop-off centers)

This formula can be used in place of our simulation to calculate annual amount recycled, as long as there is no overlap between drop-off center coverage areas and the drop-off center coverage area is entirely contained within the city. For example, because the drop-off center coverage area (22.18 sq. mi.) is much larger than the area of Price, Utah (4.2 sq. mi.), this formula cannot be used in place of our simulation for Price, Utah.

Using our simulation, we were able to calculate the annual amount recycled for Fargo, North Dakota; Price, Utah; and Wichita, Kansas, as well as to visualize the households contributing recyclables to each city. In the screenshots below, the white dots represent the drop-off centers; the green dots represent households that are recycling; and the red dots represent households that are not recycling.

Fargo, North Dakota
Annual Amount Recycled (tons) = 5209.66
Number of people who recycle = 8458
Land Area = 48.82 sq. mi.
Population = 105,549 people
Average household size = 2.15 people

Price, Utah
Annual Amount Recycled (tons) = 1876.88
Number of people who recycle = 3047
Land Area = 4.2 sq. mi.
Population = 8,402 people
Average household size = 2.60 people

Wichita, Kansas
Annual Amount Recycled (tons) = 16929.56
Number of people who recycle = 27486
Land Area = 159.29 sq. mi.
Population = 382,368 people
Average household size = 2.48 people

Drop-off Center Cost

A report by design engineering company R.W. Beck, Inc. recommends front load dumpsters as the most cost-effective type of drop-off center. Under this plan, front load dumpsters would be set up at each drop-off center site and recyclables would be collected in two streams, commingled containers and paper. The annual cost of a front load dumpster site is about $5,575 [19]. Thus, the total annual cost of drop-off centers is:

Annual cost of drop-off centers = $5,575 * (# drop-off centers)

Revenue Generated

To calculate the total revenue per ton generated from selling recycled products, we used the following formula, taking into account the market price per ton for each product [20, 21, 22, 23, 24, 25, 26, 27]:

Revenue per ton = Revenue_{Paper} + Revenue_{Glass} + Revenue_{Ferrous Metals} + Revenue_{Aluminum} + Revenue_{Plastic} + Revenue_{Textiles} + Revenue_{Wood}
= (.7012*$112.82) + (.0492*$13) + (.1131*$217.75) + (.0107*$310) + (.0401*$370) + (.031*$100) + (.0362*$296) + ($135*.0180)
= $128.78 per ton of recycled material [1]

Based on a study conducted on recycling collection and processing options in New Hampshire [28], cities can decide between small, medium, and large Materials Recovery Facilities (MRFs) depending on the annual tonnage. The cost per ton using dual stream and the cost per ton using single stream vary depending on the size of the MRF. For drop-off centers, we are assuming that dual stream is used.

Fargo, North Dakota

We calculated that Fargo would collect 5,209.66 tons of recyclables. This suggests that a medium tonnage mini MRF, which has an annual tonnage of 5,283, is sufficient for the city. The cost per ton of a medium mini MRF using dual stream is $124.62.
Since the material revenue per ton was previously found to be $128.78, we can calculate the net cost per ton as:

Net cost = $124.62 - $128.78 = -$4.16

The total cost to the city can then be calculated as:

Total cost = -$4.16 per ton * 5,209.66 tons + $5,575 per drop-off container * 1 container = -$16,097.19 (profit)

Price, Utah

We calculated that Price would collect 1,876.88 tons of recyclables. Price would use a low tonnage mini MRF, and the net cost per ton would be $89.69. Then, the total cost to the city is:

Total cost = $89.69 per ton * 1,876.88 tons + $5,575 per drop-off container * 1 container = $173,912.37

Wichita, Kansas

We calculated that Wichita would collect 16,929.56 tons of recyclables, suggesting that Wichita would require a high tonnage mini MRF, which has an annual tonnage of around 7,500. For a high tonnage mini MRF, the cost per ton for dual stream is $95.40. Since the material revenue is $128.78 per ton, the net cost per ton is:

Net cost = $95.40 - $128.78 = -$33.38

This represents a profit of $33.38 per ton of recycled material. The total cost to the city is then:

Total cost = -$33.38 per ton * 16,929.56 tons + $5,575 per drop-off container * 1 container = -$559,533.71 (profit)

Pay-As-You-Throw

If the city implements a Pay-As-You-Throw (PAYT) program, it will collect revenue from citizens who must pay an amount depending on the volume of waste they generate. We can calculate revenue generated by such a program by using the formula [29]:

Revenue_{PAYT} = (Weight_{Waste} / Volume_{Container} * Price_{Container} - Price_{Startup, Maintenance per day}) * Population

The average American generates 4.5 lbs of waste per day and recycles 1.5 lbs [17]. PAYT programs cost around $0.28 per capita, based on surveys of Wisconsin and Iowa [30]. We also simplified Price_{Container} / Volume_{Container} as Container price/pound, since the containers are meant to hold specific amounts of weight.
Thus, the revenue generated by PAYT for citizens who recycle and for those who do not can be calculated as:

Revenue_{PAYT, Recycle} = ((4.5 lbs - 3.375 lbs) * Container price/pound - $0.28/365) * Population_{Recycle}
Revenue_{PAYT, Don’t recycle} = (4.5 lbs * Container price/pound - $0.28/365) * Population_{Don’t recycle}
Total revenue_{PAYT} = Revenue_{PAYT, Recycle} + Revenue_{PAYT, Don’t recycle}
= ((4.5 lbs - 3.375 lbs) * Container price/pound - $0.28/365) * Population_{Recycle} + (4.5 lbs * Container price/pound - $0.28/365) * Population_{Don’t recycle}

Here x denotes the container price per pound.

Fargo, North Dakota
Total revenue_{PAYT} = (1.125 lbs * x - $0.28/365) * 8,458 people who recycle + (4.5 lbs * x - $0.28/365) * (105,549 - 8,458) people who don’t recycle

Price, Utah
Total revenue_{PAYT} = (1.125 lbs * x - $0.28/365) * 3,047 people who recycle + (4.5 lbs * x - $0.28/365) * (8,402 - 3,047) people who don’t recycle

Wichita, Kansas
Total revenue_{PAYT} = (1.125 lbs * x - $0.28/365) * 27,486 people who recycle + (4.5 lbs * x - $0.28/365) * (382,368 - 27,486) people who don’t recycle

The following table provides the total revenue generated by a PAYT program if the container price per pound were $0.01, $0.05, or $0.10.

Sensitivity Analysis

We tested the sensitivity of our simulation of the annual amount recycled in a city using a drop-off recycling program. We changed population and area by +/- 2%, 5%, and 10% and examined the resulting change in annual amount recycled. For simplicity, we only examined the changes for one of our sample cities: Fargo, North Dakota.

The annual amount recycled responds approximately linearly to both area and population. The response is not precisely linear because the randomness used in the simulation to determine whether each household recycles introduces some variation between different runs of the simulation. Since the slopes are small, a slight error in the initial parameters would not significantly change the simulation’s output.
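The Pay-As-You-Throw revenue formula used for the three cities earlier in this section can be written as a small Python function. This is a hedged sketch rather than the team's code: the 4.5 lb, 3.375 lb, and $0.28 figures and the Fargo population counts come from the text, the function and parameter names are hypothetical, and reading the result as dollars per day is an interpretation based on the per-day waste figures and the $0.28/365 term.

```python
# Figures from the text; names are illustrative assumptions.
WASTE_LB_PER_DAY = 4.5        # waste generated per person per day
RECYCLABLE_LB_PER_DAY = 3.375  # recyclable share removed by recyclers
PROGRAM_COST_PER_CAPITA = 0.28  # per person, divided by 365 in the text

def payt_revenue(price_per_pound, recyclers, population):
    """Total PAYT revenue for one city, read here as dollars per day."""
    daily_program_cost = PROGRAM_COST_PER_CAPITA / 365
    non_recyclers = population - recyclers
    from_recyclers = (
        (WASTE_LB_PER_DAY - RECYCLABLE_LB_PER_DAY) * price_per_pound
        - daily_program_cost
    ) * recyclers
    from_non_recyclers = (
        WASTE_LB_PER_DAY * price_per_pound - daily_program_cost
    ) * non_recyclers
    return from_recyclers + from_non_recyclers

# Fargo under the drop-off program: 8,458 recyclers out of 105,549 people.
for price in (0.01, 0.05, 0.10):
    revenue = payt_revenue(price, recyclers=8458, population=105549)
    print(f"x = ${price:.2f}/lb -> {revenue:,.2f}")
```

Because both terms are linear in the container price, revenue grows in direct proportion to x once the small per-capita program cost is covered.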
Model 2: Curbside Collection

Assumptions
1 Each city has only one recycling processing plant, located at the geographic center, as we found that one large-scale processing center is more than enough to cover one city’s recycling needs.
2 Recycling collection comes biweekly.

Approach

We subdivided the city into zones, with one garbage truck responsible for each zone. Each truck is responsible for driving to its zone, collecting all the recyclable waste it can, and delivering it to the central processing center, which then sorts and processes the recyclable waste.

Recyclable Waste Collected and Cost to City

We divide the cost to the city into three parts: the cost of fuel, the wages of the truck drivers, and the price of truck upkeep. The cost is as follows:

Cost = (Price of diesel fuel in dollars/gallon) * distance / (Truck miles/gallon) + (Num houses) / (Houses/hour) * (Driver wage/hour) + Truck_Upkeep

The number of houses visited per hour varies depending on whether a single stream or dual stream collection method is used; for single stream, 171 households are visited per hour, while for dual stream, 130 households are visited per hour [31]. The mileage of a truck is 5 mpg, with a fuel cost of $4.02 per gallon. The average wage of a truck driver is $16/hour [19]. Thus, the formulas for single stream and dual stream collection costs are as follows:

Single stream cost = ($4.02/gallon) * distance / (5 miles/gallon) + (Num houses) / (171 houses/hour) * ($16/hour)
Dual stream cost = ($4.02/gallon) * distance / (5 miles/gallon) + (Num houses) / (130 houses/hour) * ($16/hour)

We assume that a truck driver can only collect for 7 hours a day (an 8-hour work day, minus an hour for lunch and driving). So, a truck driver has a maximum number of households s/he can visit in a biweekly circuit (171 * 7 * 10 = 11,970 for single stream and 130 * 7 * 10 = 9,100 for dual stream).
When a driver is tasked with more houses than s/he can visit, we simply used this ceiling. To demonstrate, the graphs below show efficiency, in terms of tons of recyclable waste collected per thousand dollars, versus the number of trucks in each city.

Using the model, we calculated the optimum number of trucks for each city for either dual stream or single stream curbside collection depending on the efficiency of the collection, quantified using the tons of recycled material collected per $1,000, and the total amount of recycled waste collected. The results for optimum number of trucks are shown below:

City       Single Stream   Dual Stream
Fargo      5               6
Price      1               1
Wichita    13              17

Given the optimum number of collection trucks, the annual cost and tons of recyclable waste collected can then be determined using our computer simulation.

           Single Stream                      Dual Stream
City       Tons of Waste   Collection Cost    Tons of Waste   Collection Cost
Fargo      25292.85        $205,787.20        25933.39        $319,383.30
Price      2064.37         $24,526.54         2064.37         $27,005.94
Wichita    93947.82        $713,424.24        88719.32        $798,496.11

To calculate the revenues and costs generated or incurred from curbside collection, we needed to determine the cost of recycling and sorting at large-scale MRFs. To calculate the net costs per ton of material processed in an MRF, we used data from Resource Recycling Systems [31] to find operating, capital, and maintenance costs for MRFs of different tonnage capacity. A graphical representation of the processing and operating costs is shown below [31]:

Fargo, North Dakota

Single stream:
Using our model, we calculated that Fargo would generate 25,292.85 tons of recyclables annually using single stream curbside pickup. The operating cost is about $130 per ton for a dual stream MRF of the same tonnage capacity [31]. However, single stream MRFs have greater processing costs, in the range of $10-15 per ton more (averaged at $12.5), because of the greater sorting required [4].
Using the revenue generated from selling recovered material, as calculated in the Drop-Off Center section to be $128.78 per ton, the net cost and total cost are:

Net cost per ton = ($130 + $12.5) - $128.78 = $13.72 per ton
Total cost = 25,292.85 tons * $13.72 per ton + Collection cost = $347,017.90 + $205,787.20 = $552,805.10

Dual stream:
Using our model, we calculated that Fargo would generate 25,933.39 tons of recyclables annually using dual stream. The operating cost is about $130 per ton. Thus, net cost and total cost are:

Net cost per ton = $130 - $128.78 = $1.22 per ton
Total cost = 25,933.39 tons * $1.22 per ton + Collection cost = $31,638.74 + $319,383.30 = $351,022.04

Price, Utah

Single stream:
Using our model, we calculated that Price would generate 2,064.37 tons of recyclables annually using single stream. A mini MRF, with an annual tonnage of 2,649, is sufficient. The operating cost is about $245.62 per ton for single stream. Thus, net cost and total cost are:

Net cost per ton = $245.62 - $128.78 = $116.84
Total cost = 2,064.37 tons * $116.84 per ton + Collection cost = $241,201 + $24,526.54 = $265,727.53

Dual stream:
Using our model, we calculated that Price would generate 2,064.37 tons of recyclables annually using dual stream. A mini MRF is again sufficient. The operating cost is about $218.47 per ton for dual stream. Thus, net cost and total cost are:

Net cost per ton = $218.47 - $128.78 = $89.69
Total cost = 2,064.37 tons * $89.69 per ton + Collection cost = $185,153.35 + $27,005.94 = $212,159.29

Wichita, Kansas

Single stream:
Using our model, we calculated that Wichita would generate 93,947.82 tons of recyclables annually using single stream. Thus, net cost and total cost are:

Net cost per ton = ($95 + $12.5) - $128.78 = -$21.28
Total cost = 93,947.82 tons * -$21.28 per ton + Collection cost = approximately -$1,999,210 + $713,424.24 = -$1,285,785.61

As the cost is negative, the city receives a profit.
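The net-cost arithmetic used in the city calculations above follows one pattern: processing cost per ton minus material revenue per ton, times tonnage, plus collection cost. A minimal Python sketch of that pattern is given here, checked against the Fargo figures; the dollar amounts come from the text, and the function and parameter names are hypothetical.

```python
# Material revenue per ton as used throughout the paper.
REVENUE_PER_TON = 128.78

def curbside_total_cost(tons, operating_cost_per_ton, collection_cost,
                        single_stream_surcharge=0.0):
    """Annual cost to the city in dollars; a negative value is a profit."""
    net_cost_per_ton = (
        operating_cost_per_ton + single_stream_surcharge - REVENUE_PER_TON
    )
    return net_cost_per_ton * tons + collection_cost

# Fargo, single stream: $130/ton plus the $12.5/ton sorting surcharge.
fargo_single = curbside_total_cost(
    tons=25292.85, operating_cost_per_ton=130.0,
    collection_cost=205787.20, single_stream_surcharge=12.5,
)
# Fargo, dual stream: $130/ton with no surcharge.
fargo_dual = curbside_total_cost(
    tons=25933.39, operating_cost_per_ton=130.0,
    collection_cost=319383.30,
)
print(f"Fargo single stream: ${fargo_single:,.2f}")
print(f"Fargo dual stream:   ${fargo_dual:,.2f}")
```

Writing the calculation as one function makes it easy to rerun with a different operating cost or tonnage when checking each city's figures.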
Dual stream:
Using our model, we calculated that Wichita would generate 88,719.32 tons of recyclables annually using dual stream. Thus, net cost and total cost are:

Net cost per ton = $95 - $128.78 = -$33.78
Total cost = 93,947.82 tons * -$33.78 per ton + Collection cost = -$3,173,557.36 + $798,496.11 = -$2,375,061

The city again receives a profit.

Pay-As-You-Throw

We can apply the Pay-As-You-Throw revenue formulas calculated in the Drop-Off Centers section:

Total revenue_{PAYT} = Revenue_{PAYT, Recycle} + Revenue_{PAYT, Don’t recycle}
= ((4.5 lbs - 3.375 lbs) * Container price/pound - $0.28/365) * Population_{Recycle} + (4.5 lbs * Container price/pound - $0.28/365) * Population_{Don’t recycle}

Given that 40% of people to whom curbside recycling is available recycle [16], we calculated the total revenue each city can expect from a Pay-As-You-Throw program alongside curbside recycling. The variable “x” is used to represent the container price per pound, which is up to the city to set.

Fargo, North Dakota
Total revenue_{PAYT} = (1.125 lbs * x - $0.28/365) * (0.40 * 105,549) people who recycle + (4.5 lbs * x - $0.28/365) * (105,549 - 0.40 * 105,549) people who don’t recycle

Price, Utah
Total revenue_{PAYT} = (1.125 lbs * x - $0.28/365) * (0.40 * 8,402) people who recycle + (4.5 lbs * x - $0.28/365) * (8,402 - 0.40 * 8,402) people who don’t recycle

Wichita, Kansas
Total revenue_{PAYT} = (1.125 lbs * x - $0.28/365) * (0.40 * 382,368) people who recycle + (4.5 lbs * x - $0.28/365) * (382,368 - 0.40 * 382,368) people who don’t recycle

The following table provides the total revenue generated by a PAYT program if the container price per pound were $0.01, $0.05, or $0.10.

3. Testing the Models

To test our models for accuracy, cities with a current drop-off recycling, single-stream curbside collection, or dual-stream curbside collection program can be run through the models.
The population, area, and other required attributes of the city will be input into our models, and the accuracy of our models would be confirmed if the model results for annual amount recycled and net annual cost to the city are similar to the values in reality.

4. Recommendations

When designing a recycling program, the city should identify markets for recycled materials. The characteristics of the market determine how recyclables should be collected, processed, and eventually sold [32].

To extend our model to the national level, we took US Census data from 2000, which recorded the population density of each county. We then tested each county. On the national level, the EPA should strongly encourage recycling programs for almost every county or region, particularly in denser, less rural regions. The diagram below marks the centers of all the counties where at least one of our three proposed recycling programs turns a profit for the community, based on the models we proposed earlier.

In general, across the U.S., very small cities such as Price, Utah will incur losses from a recycling program. A drop-off program cannot be used to full advantage because much of the potential coverage area of one drop-off center lies beyond the city limits. In relatively large, densely populated cities such as Wichita, Kansas, dual-stream curbside collection is generally recommended to bring the highest profits. This is due to low participation in drop-off recycling; on average, only 15.5% of potentially covered households participate.

The revenue benefits of a pay-as-you-throw initiative must be balanced against the cost of its unpopularity among citizens. A pay-as-you-throw initiative is generally recommended for small cities such as Price, Utah that seek to adopt a recycling program but incur losses no matter what the program. In these cases, a pay-as-you-throw initiative is recommended to offset losses to the city.

III. CONCLUSION

Effective recycling programs are critical for cities to address the waste accumulation in landfills. Based on our models, we conclude that drop-off centers, curbside collection, and pay-as-you-throw initiatives can all be feasible recycling programs, depending on the population and area of a given city. All the models are resistant to minor changes in the input values and can be applied to any city.

The population growth of the U.S. has a notable effect on the change in the amount of plastic waste discarded in landfills each year. Partly because U.S. population growth has been linear in recent years, we determined that a linear model was most appropriate for predicting the amount of plastic waste discarded. Our linear model predicts that 35.1 million tons of plastic waste will be discarded in 2023, an increase of 13% over 2010.

Using a drop-off program, Fargo, North Dakota, and Wichita, Kansas would both generate profits from the sales of recovered materials. The net profits make an unpopular PAYT initiative unnecessary. In Price, Utah, however, a drop-off program sustains losses because of the very small size of the city. For this reason, we recommend that Price adopt a PAYT initiative to raise revenues and offset costs of the drop-off program.

Using any curbside program, single-stream or dual-stream, Fargo, North Dakota and Price, Utah incur losses. In both cities, a drop-off program is recommended: in Fargo, because a drop-off program generates profits, and in Price, because a drop-off program generates smaller losses than curbside collection. In Wichita, Kansas, however, both single and dual stream curbside collection generate a profit, leaving all three programs feasible. Dual-stream curbside collection is strongly recommended, however, for the highest profits.

Nationally, small cities generally incur losses with any recycling program, as seen in our model results for Price, Utah.
Dual-stream curbside collection is generally recommended for large, densely populated cities, which can take advantage of efficiencies of scale. The revenue benefits of a PAYT initiative must be balanced against the cost of its unpopularity among citizens, though it is recommended for small cities to help offset their losses. Recycling has environmental benefits for any city but is especially important for large, densely populated cities, where it has economic as well as environmental benefits.

BIBLIOGRAPHY

1 "2010 Facts and Figures Fact Sheet (PDF) - US Environmental ..." 2012. 4 Mar. 2013 <http://www.epa.gov/wastes/nonhaz/municipal/pubs/msw_2010_rev_factsheet.pdf>
2 2013. <http://5gyres.org/what_is_the_issue/the_problem/>
3 "Municipal Solid Waste | Wastes | US EPA." 2004. 3 Mar. 2013 <http://www.epa.gov/msw/>
4 "“Dual Stream” vs. “Single Stream” Recycling Programs | UMBC ..." 2012. 3 Mar. 2013 <http://umbcinsights.wordpress.com/2012/08/28/dual-stream-vs-single-stream-recyclingprograms/>
5 "Curbside Collection Study - Eureka Recycling." 2010. 3 Mar. 2013 <http://www.eurekarecycling.org/page.cfm?ContentID=72>
6 "Organizing a Community Recycling Program." 2012. 3 Mar. 2013 <https://www.bae.ncsu.edu/topic/vermicomposting/pubs/ag473-11-communityrecycle.html>
7 "Pay-As-You-Throw Programs | Conservation Tools | US EPA." 2003. 3 Mar. 2013 <http://www.epa.gov/payt/>
8 "Moving to Fargo, ND | Fargo, ND Moving Companies." 2009. 4 Mar. 2013 <http://www.upack.com/moving-companies/north-dakota/fargo/>
9 "Explore Utah - Getting Around in Utah - Navigating Utah's Streets ..." 2011. 4 Mar. 2013 <http://www.exploreutah.com/GettingAround/Navigating_Utahs_Streets.shtml>
10 "Wichita travel guide - Wikitravel." 2005. 4 Mar. 2013 <http://wikitravel.org/en/Wichita>
11 "A Brief History of Plastic - The Brooklyn Rail." 2008. 4 Mar. 2013 <http://www.brooklynrail.org/2005/05/express/a-brief-history-of-plastic>
12 "National Intercensal Estimates (2000-2010) - U.S. Census Bureau." 2011. 4 Mar. 2013 <http://www.census.gov/popest/data/intercensal/national/nat2010.html>
13 "Estimating Consumer Willingness to Supply and Willingness to Pay ..." 3 Mar. 2013 <http://le.uwpress.org/content/88/4/745.refs?related-urls=yes&legid=wple;88/4/745>
14 "Gasoline and Diesel Fuel Update - Energy Information ... - EIA." 2011. 3 Mar. 2013 <http://www.eia.gov/petroleum/gasdiesel/>
15 "Fuel Economy | National Highway Traffic Safety Administration ..." 2010. 4 Mar. 2013 <http://www.nhtsa.gov/fuel-economy>
16 "Ohio EPA 2004 Drop-Off Recycling Study." 2010. 3 Mar. 2013 <http://www.epa.ohio.gov/portals/34/document/general/swmd_drop_off_study_report.pdf>
17 "Municipal Solid Waste Generation, Recycling, and Disposal in - US ..." 2009. 3 Mar. 2013 <http://www.epa.gov/osw/nonhaz/municipal/pubs/msw2008rpt.pdf>
18 "Recycling Stats | GreenWaste Recovery." 2008. 3 Mar. 2013 <http://www.greenwaste.com/recycling-stats>
19 "Enterprise Portal Information - Home." 2006. 4 Mar. 2013 <http://www.portal.state.pa.us/>
20 "SCRAP TIRE RECYCLING - Tennessee." 2008. 4 Mar. 2013 <http://www.tn.gov/environment/swm/pdf/TFscraptires.pdf>
21 "Recycled Wood Products - Sonoma Compost." 2007. 4 Mar. 2013 <http://www.sonomacompost.com/recycle_wood.shtml>
22 "Plywood Thickness and Weights: Sanded Nominal Thickness 1/4 3 ..." 2008. 4 Mar. 2013 <http://parr.com/PDFs/PG_plywoodthickness.pdf>
23 "Recycling and Composting Online." 4 Mar. 2013 <http://www.recycle.cc/freepapr.htm>
24 "Glass." 2010. 4 Mar. 2013 <http://www.ndhealth.gov/wm/recycling/Glass.htm>
25 "Daily Scrap Metal Prices - Maxi Waste Limited." 2007. 4 Mar. 2013 <http://www.maxiwaste.co.uk/scrapdailyprices.php>
26 "Aluminum: How Sustainable is It? | Streamline." 2010. 4 Mar. 2013 <http://www.streamlinemr.com/articles/aluminum-how-sustainable-is-it>
27 "Textile Recycling FAQs - Trans-Americas Trading Co." 2011. 4 Mar. 2013 <http://tranclo.com/recycling-faq.asp>
28 "Solid Waste Publications for Sullivan County - Waste Management ..." 2013. 4 Mar. 2013 <http://waste.uvlsrpc.org/index.php/solid-waste-publications-sullivan-county/>
29 "Georgia Department of Community Affairs." 2007. 4 Mar. 2013 <http://www.dca.state.ga.us/main/quickmenuListing.asp?mnuitem=PROG>
30 Skumatz, LA. "Measuring Source Reduction: Pay-As-You-Throw Variable Rates as ..." 2000. <http://www.epa.gov/osw/conserve/tools/payt/pdf/sera.pdf>
31 "Extending Recycling's Reach with MRF Hub and ... - NC SWANA." 4 Mar. 2013 <http://www.ncswana.org/files/2012%20Fall%20Presentations/FreyNC%20SWANA%20MRF%20Optimization%20presentation%20100212.pdf>
32 "Organizing a Community Recycling Program." 2012. 4 Mar. 2013 <https://www.bae.ncsu.edu/topic/vermicomposting/pubs/ag473-11-communityrecycle.html>

references

[1] S. Chaturapruek, J. Breslau, D. Yazdi, T. Kolokolnikov, and S. McCalla. Crime modeling with Lévy flights. SIAM Journal on Applied Mathematics, 73(4):1703–1720, 2013.
[2] Consortium for Mathematics and Its Applications. 2013 MCM problem A: The Ultimate Brownie Pan. http://www.comap.com/undergraduate/contests/mcm/contests/2013/problems/.
[3] A. Crannell. A Guide to Writing in Mathematics Classes. http://www.math.wisc.edu/~miller/m221/annalisa.pdf.
[4] P. Dawkins. Paul’s Online Math Notes: Euler’s Method. http://tutorial.math.lamar.edu/Classes/DE/EulersMethod.aspx.
[5] FreeMind. http://freemind.sourceforge.net/wiki/index.php/Main_Page.
[6] A. Greenleaf, Y. Kurylev, M. Lassas, and G. Uhlmann. Cloaking devices, electromagnetic wormholes, and transformation optics. SIAM Review, 51:3–33, 2009.
[7] D. Hailman and B. Torrents. Keeping dry: Running in the rain. Mathematics Magazine, 8(24):266–277, 2009.
[8] Ohio EPA 2004 Drop-Off Recycling Study. 2010. 3 Mar. 2013 http://www.epa.ohio.gov/portals/34/document/general/swmd_drop_off_study_report.pdf.
[9] M. Stubna, W. McCullough, and G. L. Gray. Euler’s Method Tutorial. http://www.esm.psu.edu/courses/emch12/IntDyn/course-docs/Euler-tutorial/.
[10] Y. van Gennip, B. Hunter, R. Ahn, P. Elliott, K. Luh, M. Halvorson, S. Reid, M. Valasik, J. Wo, G. Tita, A. Bertozzi, and P. Brantingham. Community detection using spectral clustering on sparse geosocial data. SIAM Journal on Applied Mathematics, 73(1):67–83, 2013.

motivation and acknowledgments

We came to this project as members of the Moody’s Mega Math Challenge’s Problem Development Committee, where we have responsibility for soliciting, writing, editing, and vetting potential problems to use in this math modeling scholarship contest. It was clear from the start that many high school students are unfamiliar with how to do modeling, how to get started, and how to reach a solution. In addition, we have taken part in NSF workshops where the focus is on engaging math education, motivating students to study and pursue careers in STEM, and keeping them in those disciplines when the going gets rough through better understanding and enthusiasm for the subjects.

The support of several individuals and organizations was critical to making the handbook happen: First, Frances G. Laserson, President of The Moody’s Foundation, which has funded the Moody’s Mega Math Challenge since its inception in 2006. Her commitment to education, specifically in applied math, economics, and finance, has been instrumental in enabling us to work on the content contained here. Dr.
Peter Turner, Dean of Arts and Sciences at Clarkson University and Vice President for Education and Chair of the Education Committee for the Society for Industrial and Applied Mathematics (SIAM), has supported both the Challenge and SIAM’s increased attention to the high school and undergraduate communities of students interested in STEM and, in particular, math modeling. Peter has also been principal investigator, along with Dr. James C. Crowley, executive director of SIAM, on the “Modeling across the Curriculum” workshops funded by the National Science Foundation (NSF) and through which the content of this handbook was also enriched. And finally, Michelle Montgomery, SIAM Director of Marketing and Outreach and Project Director for Moody’s Mega Math Challenge, and her staff team at SIAM. Michelle’s group organizes the Moody’s Mega Math Challenge, participates enthusiastically in the NSF workshops, and seeks to optimize outreach to young people through multiple projects and channels to fulfill one of SIAM’s key goals: increasing the pipeline of individuals going into applied mathematics and computational science programs and careers.

#WeDidIt
Karen, Katie and Ben

This handbook was made possible with enthusiasm and funding from the following organizations:

National Science Foundation

3600 Market Street, 6th Floor
Philadelphia, PA 19104-2688 USA
www.siam.org