Uploaded by Hai Nguyen

Final Project
Mathematical Modeling with Spreadsheets, Spring 2020
Instructions
• You should organize all your work into a single spreadsheet with appropriately named tabs.
• When ready, you should complete the Final Project Quiz, available in
the Projects folder.
• You should then share your sheet with Elaine (erudel@g.harvard.edu)
and Eric (econnall@g.harvard.edu).
• Your grade will be based on:
– Completeness
∗ Did you answer all the questions?
– Correctness
∗ Do your formulas show the correct values?
– Design
∗ Are your sheets clear and well organized?
∗ Do you use relative, absolute, and mixed references where
appropriate?
∗ Do you use appropriate built-in functions?
∗ Do you use appropriate formatting for text and numbers?
• You should work on your project alone. This is not a group assignment.
• The Extension School’s academic integrity policy:
https://www.extension.harvard.edu/resources-policies/student-conduct/academic-integrity
• If you have any questions, please let Eric or Elaine know!
Problem 1
Goal
In this problem you will model two interacting animal populations, Canada
lynx and snowshoe hares. You will use many of the math modeling skills
involving growth rates we have worked on this term. Once finished with your
model, you should be able to create a chart like this:
Figure 1: Snowshoe hare and Canada lynx populations over time
Background
Interactions between these populations are generally harmful to the hares
(who get eaten) and beneficial to the lynx (who do the eating).
• In the absence of interactions with lynx, the hare population would
grow by 32% every year, because they aren’t being eaten as often.
• In the absence of interactions with hares, the lynx population would
fall by 30% every year, because they aren’t eating as much.
• The number of possible lynx-hare interactions is given by:
Possible interactions = (Number of hares) × (Number of lynx) .
– For instance, if there were 3 lynx and 8 hares, there would be
3 · 8 = 24 possible lynx-hare interactions, since each lynx might
possibly eat up to 8 hares.
– Of course, only a certain percentage of these interactions will ever
take place.
– The interactions that do take place are bad for the hare population,
which is reduced at an annual rate given by 0.28 times the number
of possible interactions.
– On the other hand, these interactions are good for the lynx population,
which is increased at an annual rate given by 0.35 times the number of
possible interactions.
– In all of this, the numbers 32%, 30%, 0.28, and 0.35 are all parameters
of the model. In principle, they could be determined by ecologists
studying these populations.[1]
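The rules above can be collected into two rate formulas. The following is a minimal Python sketch for checking your spreadsheet, not part of the assignment itself; the function name `annual_rates` and the constant names are illustrative assumptions, and populations are in whatever units the model counts in.

```python
# Parameters from the Background section (the names are illustrative).
HARE_GROWTH = 0.32    # hares grow 32% per year without lynx
LYNX_DECLINE = 0.30   # lynx fall 30% per year without hares
HARM = 0.28           # harm to hares per possible interaction
BENEFIT = 0.35        # benefit to lynx per possible interaction


def annual_rates(hares, lynx):
    """Return (hare_rate, lynx_rate), the net annual change of each population."""
    interactions = hares * lynx  # number of possible lynx-hare interactions
    hare_rate = HARE_GROWTH * hares - HARM * interactions
    lynx_rate = -LYNX_DECLINE * lynx + BENEFIT * interactions
    return hare_rate, lynx_rate


# The example from the text: 3 lynx and 8 hares give 24 possible interactions.
print(3 * 8)
```

Changing any of the four constants changes both rates, which is the behavior your spreadsheet parameters should also have.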
Steps
1. Show a column for time (in years), starting at t = 0 and taking steps
of ∆t = 0.02 years.[2]
2. Show columns for the number of hares and the number of lynx. Assume
that in year t = 0 there are 2 million hares and 1 thousand lynx.
• Enter a 2 for the number of hares (not 2,000,000).
• Enter a 1 for the number of lynx (not 1,000).
• Note. Counting this way (by thousands of lynx and millions of
hares) makes the calculations more manageable.
3. Setting aside the harm/benefit of lynx-hare interactions, show columns
for the base annual growth rates of the two populations.
• At t = 0, the growth rate for hares is 32%. Initially there are 2
million hares, so this cell should display 0.64, since 0.32 × 2 = 0.64.
• Likewise, the growth rate for lynx is −30%. Initially there are 1
thousand lynx, so this cell should display −0.30, since −0.30 × 1 =
−0.30.
• Remember that we are counting by millions of hares and thousands
of lynx, so the 0.64 stands for 640,000 hares/year and the −0.30
stands for −300 lynx/year.
4. Show a column for the total number of possible lynx-hare interactions.
• In year t = 0, the number in this column should be 2 × 1 = 2.
• Note. This number actually stands for
2,000,000 × 1,000 = 2,000,000,000 possible interactions,
but displaying it as a 2 is more manageable. This is similar to how
someone might write "There are between 8 and 9 billion people in
the world" instead of writing "There are between 8,000,000,000 and
9,000,000,000 people in the world."
[1] In practice, the model parameters have been made up by your math teacher.
[2] We see that 0.02 years is 0.02 × 365 days, or about 7.3 days.
5. Show columns for the interaction effect on these two populations.
• For year t = 0, the harm to the hare growth rate is:
0.28 × (Number of interactions) = 0.28 × 2 = 0.56.
• Likewise, the benefit to the lynx growth rate is:
0.35 × (Number of interactions) = 0.35 × 2 = 0.70.
6. Show columns for the net change in these two populations.
• For year t = 0, the net growth rate in the hare population is:
(Base growth rate) − (Harm) = 0.64 − 0.56 = 0.08.
– This amounts to an overall growth rate of 0.08 million hares/year,
or 80,000 hares/year.
– At this rate, in ∆t = 0.02 years, the hare population goes
up by 0.02 × 0.08 = 0.0016 million, which amounts to 1,600
hares.
• The net growth in the lynx population is
(Base growth rate) + (Benefit) = −0.30 + 0.70 = 0.40.
– This amounts to 0.40 thousand lynx/year, or 400 lynx/year.
– At this rate, in ∆t = 0.02 years, the lynx population goes up
by 0.02 × 0.40 = 0.008 thousand, or by 8 lynx.
7. We can now determine the new populations at time t = 0.02 years.
• The new hare population is 2 + 0.0016 = 2.0016 million hares, or
2,001,600 hares.
• The new lynx population is 1 + 0.008 = 1.008 thousand lynx, or
1008 lynx.
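Steps 3 through 7 together make up one update of the model. As a rough cross-check on the arithmetic, here is a minimal Python sketch of a single step; the function name `one_step` is an assumption made for illustration, and the spreadsheet remains the required deliverable.

```python
DT = 0.02  # time step in years


def one_step(hares, lynx, dt=DT):
    """Advance both populations by one time step.

    Hares are counted in millions and lynx in thousands, as in the text.
    """
    interactions = hares * lynx                      # step 4
    hare_rate = 0.32 * hares - 0.28 * interactions   # steps 3, 5, 6
    lynx_rate = -0.30 * lynx + 0.35 * interactions   # steps 3, 5, 6
    return hares + dt * hare_rate, lynx + dt * lynx_rate  # step 7


hares, lynx = one_step(2.0, 1.0)
print(round(hares, 6), round(lynx, 6))  # 2.0016 1.008
```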
8. Now you can repeat these calculations row by row for 50 years (2500
rows).
• So that you can check your work, in year t = 0.04 you should see
about 2.003112 (million) hares and 1.016075 (thousand) lynx.
• Your calculations should change if any of the following parameters
are changed:
– The initial number of hares (2 million) and lynx (1 thousand).
– The base growth rate of hares (32%) and the base decline rate of lynx (30%).
– The harm factor for hares (0.28) and the benefit factor for lynx (0.35).
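To check the row-by-row calculation independently of the spreadsheet, the whole table can be sketched in Python. This is only an illustration under the stated parameters; the function name `simulate` and its argument names are assumptions, and every parameter is an argument so that changing it changes the results, as your sheet should.

```python
def simulate(hares0=2.0, lynx0=1.0, hare_growth=0.32, lynx_decline=0.30,
             harm=0.28, benefit=0.35, dt=0.02, years=50):
    """Return a list of (t, hares, lynx) rows, one per time step."""
    steps = round(years / dt)  # 2500 steps for 50 years
    hares, lynx = hares0, lynx0
    rows = [(0.0, hares, lynx)]
    for i in range(1, steps + 1):
        interactions = hares * lynx
        hare_rate = hare_growth * hares - harm * interactions
        lynx_rate = -lynx_decline * lynx + benefit * interactions
        # Both new values are computed from the previous row's populations.
        hares, lynx = hares + dt * hare_rate, lynx + dt * lynx_rate
        rows.append((round(i * dt, 2), hares, lynx))
    return rows


rows = simulate()
t, hares, lynx = rows[2]
print(t, round(hares, 6), round(lynx, 6))  # 0.04 2.003112 1.016075
```

The third row reproduces the check values given for t = 0.04.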
9. Make a chart like the one shown at the beginning of this problem of
the two populations over a 50-year time period. Your chart should
be linked to your data so that any changes in your calculations are
reflected.
Problem 2
Goal
In this problem you will build a small dashboard to explore a World Bank
data set, available here:
https://data.worldbank.org/indicator/sp.pop.grow
You will use many of the data tools involving references and lookup
functions we have worked on this term. When you are done, your dashboard
should look something like this, although you might format it differently.
Figure 2: A dashboard for exploring World Bank population data.
Steps
1. Download and import the data set. Use the CSV link at the lower-right
corner of the page. The download will include three separate files:
(a) API_SP.POP.GROW_DS2_en_csv_v2_989327.csv
• The main data file to be imported into its own tab.
• This tab shows growth rates for different countries from 1960
to 2018.
• For instance, we see that in 1960 Aruba (code ABW) had a
growth rate of about 3.148%.
(b) Metadata_Country_API_SP.POP.GROW_DS2_en_csv_v2_989327.csv
• A metadata file to be imported into its own separate tab.
• This tab groups countries, identified by country code, into seven
different regions (such as Latin America & Caribbean, South
Asia, and Sub-Saharan Africa) and four different income groups
(high, low, lower middle, and upper middle).
• For instance we see that Aruba (code ABW) is classified as a
high income Latin America & Caribbean country.
(c) Metadata_Indicator_API_SP.POP.GROW_DS2_en_csv_v2_989327.csv
• You can disregard this file. We won’t be needing it.
2. Clean the data set.
(a) Country rows in the two tabs should line up properly, except there
is one extra row on the main tab.
• This row is labeled Not classified (code INX).
• Delete the row and verify (by looking at country codes) that
rows on both tabs now agree.
(b) Initially, the metadata tab has five columns. Our work will be
made easier if we add a sixth column to the second tab called
Country Names, copied over from the main tab.
• At first, you might think we could use the country names in
the TableName column (Column E) on the metadata tab.
• However, you will see that entries in these two columns are
not always the same. Compare, for instance, entries for Côte
d’Ivoire, Curaçao, and São Tomé and Principe.
• These discrepancies would throw off our calculations.
(c) So, carefully copy data from the Country Names column on the
main tab to the sixth column (Column F) of the metadata tab.
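The row-agreement check in step 2(a) amounts to comparing the two lists of country codes position by position. A minimal Python sketch of that idea follows; the function name `first_mismatch` is hypothetical, and the codes `AAA` and `ZZZ` are placeholders (only `ABW` and `INX` come from the text).

```python
def first_mismatch(main_codes, meta_codes):
    """Return the index of the first row where the codes differ, or None if all agree."""
    for i, (a, b) in enumerate(zip(main_codes, meta_codes)):
        if a != b:
            return i
    if len(main_codes) != len(meta_codes):
        # One list is a prefix of the other; the mismatch starts where it ends.
        return min(len(main_codes), len(meta_codes))
    return None


# Toy data: "AAA" and "ZZZ" are placeholder codes, not real entries.
main_codes = ["ABW", "AAA", "INX", "ZZZ"]
meta_codes = ["ABW", "AAA", "ZZZ"]

print(first_mismatch(main_codes, meta_codes))  # 2, the extra INX row
main_codes.remove("INX")
print(first_mismatch(main_codes, meta_codes))  # None, the rows now agree
```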
3. Create a region dropdown like the one shown in the screenshot, where
Europe & Central Asia has been selected.
• Add a Data Validation from the Data menu, populating it with
values from the Region column of the metadata tab.
• Note that the Validation menu is smart enough to build a dropdown using unique, non-blank values from the Region column, so
you shouldn’t need to do anything special to make it work.
• Your finished dropdown should show seven different entries.
4. Create a country dropdown like the one shown in the screenshot, where
Romania has been selected. This will be a multi-step process.
(a) We need to retrieve all the countries corresponding to the region
selected (for instance, all the countries that correspond to Europe
& Central Asia). We can use the QUERY function for this.
• The QUERY function should select names of countries on the
metadata tab from the new column you inserted there when
cleaning the data set.
• QUERY should select only those countries matching the selected
region (for instance, Europe & Central Asia).
• Your formula might look something like: QUERY(metadata,
statement).
– Here, metadata is a named range pointing to the data on
your metadata tab.
– Here, statement might look like this: select F where
B='Europe & Central Asia'.
∗ In this case, column F is a column on the metadata tab
that has country names in it.
∗ Column B, also on the metadata tab, has country regions in it.
∗ Therefore, this statement says something like: Select
country names for countries whose region is ‘Europe &
Central Asia’.
∗ It is vital that your statement changes when you pick different
regions using the region dropdown. For instance, if you select
South Asia from the dropdown, your statement would change to
something like: select F where B='South Asia'.
(b) Once you have a QUERY statement that retrieves country names
corresponding to a given region, you can add a Data Validation.
• Your QUERY statement will return a list of country names.
• These names are what the Validation Menu should use to
populate your country dropdown.
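The key idea in step 4 is that the statement is assembled from the selected region rather than typed by hand. Here is a minimal Python sketch of that assembly and of the filtering the query performs; the function names are hypothetical, and the toy rows use only the country and region pairings mentioned in this document. In the spreadsheet, the analogous move is to concatenate the region dropdown's cell into the statement text.

```python
def build_statement(region):
    """Build the query text for the selected region."""
    return "select F where B='" + region + "'"


def countries_in(region, metadata_rows):
    """Mimic the query: return names (column F) whose region (column B) matches."""
    return [name for (row_region, name) in metadata_rows if row_region == region]


# Toy rows of (region, country name); the real metadata tab is much longer.
metadata_rows = [
    ("Latin America & Caribbean", "Aruba"),
    ("Europe & Central Asia", "Romania"),
    ("Europe & Central Asia", "Serbia"),
]

print(build_statement("South Asia"))
print(countries_in("Europe & Central Asia", metadata_rows))
```

The returned list plays the role of the range that the country dropdown's Data Validation points to.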
5. Create a year dropdown like the one shown in the screenshot.
• Add a Data Validation from the Data menu, populating it with
values from the year headings on the main tab (1965, 1966, 1967, ..., 2018).
6. Retrieve the data values you need. For example, for Romania in 1973,
you need a sentence that says:
In 1973, Romania, an upper middle income country in Europe
& Central Asia, had a 0.857% growth rate.
To build this sentence, you already have (from the dropdowns) these
pieces: Romania, 1973, and Europe & Central Asia. You will need to
retrieve the remaining pieces, including the phrase upper middle income
from the metadata tab and the value 0.857% from the main tab.
(a) In this case, you can use the MATCH function to find that Romania
is entry 199 in the list of countries.
(b) If 1960 is year 0, you know that 1973 is year 13.
(c) Now you can use the values 199 and 13 to retrieve the text upper
middle income and the value 0.857% from their respective tabs.
• The OFFSET and INDEX functions could be useful here.
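The two-index lookup in step 6 can be sketched in Python as follows. This is an illustration only: the function name `lookup` and the three-country toy table are assumptions, and Python counts positions from 0, whereas MATCH counts from 1 (so Romania is entry 199 in the real list but index 1 here).

```python
FIRST_YEAR = 1960
N_YEARS = 2018 - FIRST_YEAR + 1

# Toy tables; the real tabs are much longer. None marks a value not used here.
countries = ["Aruba", "Romania", "Serbia"]
income = ["High income", "Upper middle income", "Upper middle income"]
growth = [[None] * N_YEARS for _ in countries]
growth[1][1973 - FIRST_YEAR] = 0.8566377661  # Romania, 1973 (value from the text)


def lookup(country, year):
    """Return (income group, growth rate) for a country and year."""
    row = countries.index(country)  # plays the role of MATCH
    col = year - FIRST_YEAR         # 1960 is year 0, so 1973 is year 13
    return income[row], growth[row][col]  # plays the role of INDEX or OFFSET


print(lookup("Romania", 1973))  # ('Upper middle income', 0.8566377661)
print(lookup("Serbia", 1980))   # ('Upper middle income', None)
```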
7. Now you can build the required statement about the selected country
and year. Here are some suggestions and reminders.
• LOWER can convert a phrase like Upper middle income to upper
middle income.
• TEXT can represent 0.8566377661 as 0.857%.
• SWITCH or IF can help build statements like an upper middle income country versus a low income country.
• IF and ISNUMBER can help build statements like:
In 1980, Serbia, an upper middle income country in Europe
& Central Asia, had an unknown growth rate.
• JOIN can merge pieces like an and upper middle income into a
single statement like an upper middle income.
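The suggestions above fit together as sketched below in Python. The function names `article` and `describe` are hypothetical, and the first-letter rule for choosing "a" or "an" is a rough assumption that happens to work for the four income groups and for the two kinds of rate text.

```python
def article(phrase):
    """Choose 'a' or 'an' from the first character of the phrase (a rough rule)."""
    return "an" if phrase[0].lower() in "aeiou" else "a"


def describe(year, country, income_group, region, rate):
    """Build the dashboard sentence; rate may be a number or missing (None)."""
    income = income_group.lower()               # like LOWER
    if isinstance(rate, (int, float)):          # like ISNUMBER
        rate_text = f"{rate:.3f}%"              # like TEXT with three decimals
    else:
        rate_text = "unknown"
    return (f"In {year}, {country}, {article(income)} {income} country "
            f"in {region}, had {article(rate_text)} {rate_text} growth rate.")


print(describe(1973, "Romania", "Upper middle income",
               "Europe & Central Asia", 0.8566377661))
print(describe(1980, "Serbia", "Upper middle income",
               "Europe & Central Asia", None))
```

The two calls reproduce the Romania and Serbia example sentences given in this problem.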