Final Project
Mathematical Modeling with Spreadsheets, Spring 2020

Instructions

• You should organize all your work into a single spreadsheet with appropriately named tabs.
• When ready, you should complete the Final Project Quiz, available in the Projects folder.
• You should then share your sheet with Elaine (erudel@g.harvard.edu) and Eric (econnall@g.harvard.edu).
• Your grade will be based on:
  – Completeness
    ∗ Did you answer all the questions?
  – Correctness
    ∗ Do your formulas show the correct values?
  – Design
    ∗ Are your sheets clear and well organized?
    ∗ Do you use relative, absolute, and mixed references where appropriate?
    ∗ Do you use appropriate built-in functions?
    ∗ Do you use appropriate formatting for text and numbers?
• You should work on your project alone. This is not a group assignment.
• The Extension School’s academic integrity policy: https://www.extension.harvard.edu/resources-policies/student-conduct/academic-integrity
• If you have any questions, please let Eric or Elaine know!

Problem 1

Goal

In this problem you will model two interacting animal populations, Canada lynx and snowshoe hares. You will use many of the math modeling skills involving growth rates we have worked on this term. Once finished with your model, you should be able to create a chart like this:

Figure 1: Snowshoe hare and Canada lynx populations over time

Background

Interactions between these populations are generally harmful to the hares (who get eaten) and beneficial to the lynx (who do the eating).

• In the absence of interactions with lynx, the hare population would grow by 32% every year, because they aren’t being eaten as often.
• In the absence of interactions with hares, the lynx population would fall by 30% every year, because they aren’t eating as much.
• The number of possible lynx-hare interactions is given by:
  Possible interactions = (Number of hares) × (Number of lynx).
  – For instance, if there were 3 lynx and 8 hares, there would be 3 · 8 = 24 possible lynx-hare interactions, since each lynx might possibly eat up to 8 hares.
  – Of course, only a certain percentage of these interactions will ever take place.
  – The interactions that do take place are bad for the hare population, which is reduced at an annual rate given by 0.28 times the number of possible interactions.
  – On the other hand, these interactions are good for the lynx population, which is increased at an annual rate given by 0.35 times the number of possible interactions.
  – In all of this, the numbers 32%, 30%, 0.28, and 0.35 are all parameters of the model. In principle, they could be determined by ecologists studying these populations. (In practice, they have been made up by your math teacher.)

Steps

1. Show a column for time (in years), starting at t = 0 and taking steps of ∆t = 0.02 years. (Note that 0.02 years is 0.02 × 365 days, or about 7.3 days.)
2. Show columns for the number of hares and the number of lynx. Assume that in year t = 0 there are 2 million hares and 1 thousand lynx.
   • Enter a 2 for the number of hares (not 2,000,000).
   • Enter a 1 for the number of lynx (not 1,000).
   • Note. Counting this way (by thousands of lynx and millions of hares) makes the calculations more manageable.
3. Setting aside the harm/benefit of lynx-hare interactions, show columns for the base annual growth rates of the two populations.
   • At t = 0, the growth rate for hares is 32%. Initially there are 2 million hares, so this cell should display 0.64, since 0.32 × 2 = 0.64.
   • Likewise, the growth rate for lynx is −30%. Initially there are 1 thousand lynx, so this cell should display −0.30, since −0.30 × 1 = −0.30.
   • Remember that we are counting by millions of hares and thousands of lynx, so the 0.64 stands for 640,000 hares/year and the −0.30 stands for −300 lynx/year.
4. Show a column for the total number of possible lynx-hare interactions.
   • In year t = 0, the number in this column should be 2 × 1 = 2.
   • Note. This number actually stands for 2,000,000 × 1,000 = 2,000,000,000 possible interactions, but displaying it as a 2 is more manageable. This is similar to how someone might write, "There are between 8 and 9 billion people in the world," instead of writing, "There are between 8,000,000,000 and 9,000,000,000 people in the world."
5. Show columns for the interaction effect on these two populations.
   • For year t = 0, the harm to the hare growth rate is:
     0.28 × (Number of interactions) = 0.28 × 2 = 0.56.
   • Likewise, the benefit to the lynx growth rate is:
     0.35 × (Number of interactions) = 0.35 × 2 = 0.70.
6. Show columns for the net change in these two populations.
   • For year t = 0, the net growth rate in the hare population is:
     Base growth rate − Harm = 0.64 − 0.56 = 0.08.
     – This amounts to an overall growth rate of 0.08 million hares/year, or 80,000 hares/year.
     – At this rate, in ∆t = 0.02 years, the hare population goes up by 0.02 × 0.08 = 0.0016 million, which amounts to 1,600 hares.
   • The net growth rate in the lynx population is:
     Base growth rate + Benefit = −0.30 + 0.70 = 0.40.
     – This amounts to 0.40 thousand lynx/year, or 400 lynx/year.
     – At this rate, in ∆t = 0.02 years, the lynx population goes up by 0.02 × 0.40 = 0.008 thousand, or by 8 lynx.
7. We can now determine the new populations at time t = 0.02 years.
   • The new hare population is 2 + 0.0016 = 2.0016 million hares, or 2,001,600 hares.
   • The new lynx population is 1 + 0.008 = 1.008 thousand lynx, or 1,008 lynx.
8. Now you can repeat these calculations row by row for 50 years (2500 rows).
   • So that you can check your work, in year t = 0.04 you should see about 2.003112 (million) hares and 1.016075 (thousand) lynx.
   • Your calculations should change if any of the following parameters are changed:
     – The initial number of hares (2 million) and lynx (1 thousand).
     – The base growth rate of hares (32%) and lynx (−30%).
     – The harm factor to hares (0.28) and the benefit factor to lynx (0.35).
9. Make a chart like the one shown at the beginning of this problem of the two populations over a 50-year time period. Your chart should be linked to your data so that any changes in your calculations are reflected.

Problem 2

Goal

In this problem you will build a small dashboard to explore a World Bank data set (https://data.worldbank.org/indicator/sp.pop.grow). You will use many of the data tools involving references and lookup functions we have worked on this term. When you are done, your dashboard should look something like this, although you might format it differently.

Figure 2: A dashboard for exploring World Bank population data.

Steps

1. Download and import the data set. Use the CSV link at the lower-right corner of the page. The download will include three separate files:
   (a) API_SP.POP.GROW_DS2_en_csv_v2_989327.csv
      • The main data file, to be imported into its own tab.
      • This tab shows growth rates for different countries from 1960 to 2018.
      • For instance, we see that in 1960 Aruba (code ABW) had a growth rate of about 3.148%.
   (b) Metadata_Country_API_SP.POP.GROW_DS2_en_csv_v2_989327.csv
      • A metadata file, to be imported into its own separate tab.
      • This tab groups countries, identified by country code, into seven different regions (such as Latin America & Caribbean, South Asia, and Sub-Saharan Africa) and four different income groups (high, low, lower middle, and upper middle).
      • For instance, we see that Aruba (code ABW) is classified as a high income Latin America & Caribbean country.
   (c) Metadata_Indicator_API_SP.POP.GROW_DS2_en_csv_v2_989327.csv
      • You can disregard this file. We won’t be needing it.
2. Clean the data set.
   (a) Country rows in the two tabs should line up properly, except there is one extra row on the main tab.
      • This row is labeled Not classified (code INX).
      • Delete the row and verify (by looking at country codes) that rows on both tabs now agree.
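The check in step 2(a) compares the country codes on the two tabs position by position. Outside the spreadsheet, the same idea can be sketched in Python. This is only an illustration: the function name `first_mismatch` and the four-code sample lists are hypothetical stand-ins for the real columns, which are much longer.

```python
from itertools import zip_longest


def first_mismatch(main_codes, meta_codes):
    """Return (row number, main code, metadata code) for the first row
    where the two columns disagree, or None if they agree everywhere.
    Row numbers start at 1; a missing entry shows up as None."""
    pairs = zip_longest(main_codes, meta_codes)
    for row, (main_code, meta_code) in enumerate(pairs, start=1):
        if main_code != meta_code:
            return (row, main_code, meta_code)
    return None


# Hypothetical miniature columns: the main tab still has the extra INX row.
main_tab = ["ABW", "INX", "ROU", "SRB"]
meta_tab = ["ABW", "ROU", "SRB"]

print(first_mismatch(main_tab, meta_tab))   # points at the INX row

# After deleting the INX row, the columns should agree.
main_tab.remove("INX")
print(first_mismatch(main_tab, meta_tab))   # None means every row agrees
```

In the spreadsheet itself, a helper column that compares the two code columns cell by cell would play the same role.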
   (b) Initially, the metadata tab has five columns. Our work will be made easier if we add a sixth column to the metadata tab called Country Names, copied over from the main tab.
      • At first, you might think we could use the country names in the TableName column (Column E) on the metadata tab.
      • However, you will see that entries in these two columns are not always the same. Compare, for instance, entries for Côte d’Ivoire, Curaçao, and São Tomé and Principe.
      • These discrepancies would throw off our calculations.
   (c) So, carefully copy data from the Country Names column on the main tab to the sixth column (Column F) of the metadata tab.
3. Create a region dropdown like the one shown in the screenshot, where Europe & Central Asia has been selected.
   • Add a Data Validation from the Data menu, populating it with values from the Region column of the metadata tab.
   • Note that the Validation menu is smart enough to build a dropdown using unique, non-blank values from the Region column, so you shouldn’t need to do anything special to make it work.
   • Your finished dropdown should show seven different entries.
4. Create a country dropdown like the one shown in the screenshot, where Romania has been selected. This will be a multi-step process.
   (a) We need to retrieve all the countries corresponding to the region selected (for instance, all the countries that correspond to Europe & Central Asia). We can use the QUERY function for this.
      • The QUERY function should select names of countries on the metadata tab from the new column you inserted there when cleaning the data set.
      • QUERY should select only those countries matching the selected region (for instance, Europe & Central Asia).
      • Your formula might look something like: QUERY(metadata, statement).
         – Here, metadata is a named range pointing to the data on your metadata tab.
         – Here, statement might look like this: select F where B='Europe & Central Asia'.
            ∗ In this case, column F is a column on the metadata tab that has country names in it.
            ∗ Column B, also on the metadata tab, has country regions in it.
            ∗ Therefore, this statement says something like: Select country names for countries whose region is 'Europe & Central Asia'.
            ∗ It is vital that your statement changes when you pick different regions using the region dropdown. For instance, if you select South Asia from the dropdown, your statement would change to something like: select F where B='South Asia'.
   (b) Once you have a QUERY statement that retrieves country names corresponding to a given region, you can add a Data Validation.
      • Your QUERY statement will return a list of country names.
      • These names are what the Validation menu should use to populate your country dropdown.
5. Create a year dropdown like the one shown in the screenshot.
   • Add a Data Validation from the Data menu, populating it with values from the metadata tab (1965, 1966, 1967, ..., 2018).
6. Retrieve the data values you need. For example, for Romania in 1973, you need a sentence that says:
   In 1973, Romania, an upper middle income country in Europe & Central Asia, had a 0.857% growth rate.
   To build this sentence, you already have (from the dropdowns) these pieces: Romania, 1973, and Europe & Central Asia. You will need to retrieve the remaining pieces, including the phrase upper middle income from the metadata tab and the value 0.857% from the main tab.
   (a) In this case, you can use the MATCH function to find that Romania is entry 199 in the list of countries.
   (b) If 1960 is year 0, you know that 1973 is year 13.
   (c) Now you can use the values 199 and 13 to retrieve the text upper middle income and the value 0.857% from their respective tabs.
      • The OFFSET and INDEX functions could be useful here.
7. Now you can build the required statement about the selected country and year. Here are some suggestions and reminders.
   • LOWER can convert a phrase like Upper middle income to upper middle income.
   • TEXT can represent 0.8566377661 as 0.857%.
   • SWITCH or IF can help build statements like an upper middle income country versus a low income country.
   • IF and ISNUMBER can help build statements like: In 1980, Serbia, an upper middle income country in Europe & Central Asia, had an unknown growth rate.
   • JOIN can merge pieces like an and upper middle income into a single statement like an upper middle income.
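
The lookup-and-assemble logic of steps 6 and 7 can be sketched in Python as a way to reason about the pieces before writing spreadsheet formulas. This is a rough analogue, not the spreadsheet solution: the three-row tables below are hypothetical miniatures that contain only values quoted in this handout, cells the handout does not quote are left as None, and the names `describe` and `article` are made up for the illustration.

```python
# Hypothetical miniature tabs. None marks a cell with no number in it.
COUNTRIES = ["Aruba", "Romania", "Serbia"]
YEARS = [1960, 1973, 1980]           # the real main tab has one column per year
GROWTH = [
    [3.148, None, None],             # Aruba
    [None, 0.8566377661, None],      # Romania
    [None, None, None],              # Serbia (1980 is blank in the handout example)
]
INCOME = ["High income", "Upper middle income", "Upper middle income"]
REGION = ["Latin America & Caribbean", "Europe & Central Asia", "Europe & Central Asia"]


def article(phrase):
    """Pick 'a' or 'an' from the first letter (the SWITCH/IF idea)."""
    return "an" if phrase[0].lower() in "aeiou" else "a"


def describe(country, year):
    row = COUNTRIES.index(country)         # position lookup, like MATCH
    col = YEARS.index(year)                # position lookup, like MATCH
    value = GROWTH[row][col]               # cell retrieval, like INDEX
    income = INCOME[row].lower()           # like LOWER
    if isinstance(value, (int, float)):    # like IF with ISNUMBER
        rate = f"a {value:.3f}% growth rate"   # like TEXT with three decimals
    else:
        rate = "an unknown growth rate"
    # Joining the pieces into one sentence, like JOIN.
    return (f"In {year}, {country}, {article(income)} {income} country "
            f"in {REGION[row]}, had {rate}.")


print(describe("Romania", 1973))
print(describe("Serbia", 1980))
```

With these toy tables, the two printed sentences match the Romania and Serbia examples given in steps 6 and 7. Each Python step is annotated with the spreadsheet function it loosely corresponds to, so the sketch can serve as a checklist when building the actual formula.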