ES 25 Quantitative Thinking
Lab 3: Let's Practice and Review Graphs
Due: Tuesday, April 24th, before 12 noon (e-mail to your facilitator)

DOUBLE-check your answers, and make sure to always include UNITS. Label your graphs clearly. Neatness and effort are positively correlated with high homework scores!

Bacteria Population Dynamics (pg 21 Workbook; let's try it in Excel, and verify by hand later)

Given: p(t) = 5000 + 3000t - 2000t^2

1. Set up the data table in Excel, and use the equation given to calculate p(t). Transpose, to have data in columns. (Copy, paste into Excel. Then select again, cut, paste special, check “transpose”.)

t (time, in hours)   p(t) (individuals)   Avg. rate of change (slope between points; change in # of individuals/hr)
0                    5000
0.5                  6000                 2000
1                    6000                 0
1.5                  5000                 -2000
2                    3000                 -4000
2.5                  0                    -6000
3                    -4000                -8000

Note: the slope of 2000 is for the time period between 0 and 0.5 hours.

2. Graph the data.

3. Estimate the equation of an appropriate trendline. Copy and paste your graph, with trendline equation and associated R^2 value, below.

[Graph: Bacteria Population Dynamics. y-axis: Population size (# of individuals); x-axis: Time (hours). Trendline: y = -2000x^2 + 3000x + 5000, R^2 = 1]

a. In a new column, calculate the average rate of change between 0 and 0.5 hours. Repeat for each half-hour thereafter (replace the data table above with your completed data table from Excel).

b. Offer an explanation for changes that you observe in the slope (i.e. explain the population dynamics).
The slope starts off positive, meaning that individuals are added to the population in that time period (in the first 1/2 hour, the growth rate is 2000 individuals per hour, and since we are only looking at 1/2 of an hour, only 1000 individuals are added to the population). In the next time period (0.5 to 1 hour), the slope is zero, so there is no growth (no individuals are added, per unit time).
From 1 hour onward, the slope is increasingly negative, meaning that the death rate increases with time. I think that maybe the population was growing initially, but then a predator discovered the bacteria and began consuming individuals. Then the predators may have beckoned their relatives for help eating the bacteria, which would explain why the death rate increases. Certainly, you could probably come up with a better explanation than this, but mine does make logical sense (which will give it full credit).

c. During what time period is the magnitude (absolute value) of the slope largest? What does your answer mean, for this problem?
The absolute value of the slope is largest between 2.5 and 3 hours after the “start.” During this time period, the death rate is 8000 individuals/hour (or, the growth rate is -8000 indv/hr), which means that 8000 individuals are removed from the population each hour. Since the time period is only 1/2 of an hour, the population should decrease by 4000 individuals between 2.5 and 3 hours after the start (I verified this using the population numbers given, 0 to -4000 individuals). However, I can’t make sense of a “negative population,” so perhaps a death rate of 6000 indv/hr is the largest logical rate of change, which occurs between 2 and 2.5 hours.

d. What does a slope of 0 mean, for this problem?
It means that the average rate of change is zero (for the time period from 0.5 hours to 1 hour after the start). Thus, no individuals are added or subtracted, on average, from the population.

e. Evaluate p(0.75), and interpret your answer in words.
p(0.75) = 6125 individuals. This means that 45 minutes after the start (45 minutes is 0.75 of 60 minutes), the population of bacteria is 6125 individuals.

f. Report and interpret the y-intercept of the graph in terms of this problem.
p(t) = 5000 + 3000t - 2000t^2
p(0) = 5000 individuals; at the start of the experiment there are 5000 individuals (this is the initial condition).

g.
Report and interpret the x-intercept of the graph.
0 = 5000 + 3000t - 2000t^2
The above equation can be solved using the quadratic formula to get t = 2.5 or t = -1; only t = 2.5 hours falls within the experiment. Or, by looking at the graph, we can find that the curve “intercepts” the x-axis at 2.5 hours. This means that after 2.5 hours, all of the bacteria were dead.

h. Why does the R^2 value equal 1?
Since we actually used the polynomial “given” to generate our y-values, it is no surprise that the equation “predicts” the y-values perfectly (in other words, all of the variance in y is explained by our model).

i. Over what range of x values do you think the equation is valid? Explain.
The equation seems logical for times between 0 and 2.5 hours, after which there is a negative amount of bacteria, which does not make intuitive sense to me.

Ant Population Dynamics (pg 22 Workbook; let's check our work in Excel)

Given: p(t) = 3000 - 2000/t^2

4. Set up the data table in Excel, and use the equation given to calculate p(t).

t (time, in days)   p(t) (individuals)   Avg. rate of change (slope between points; change in # of individuals/day)
0                   undefined
0.5                 -5000                undefined
1                   1000                 12000.00
1.5                 2111.11              2222.22
2                   2500                 777.78

5. Graph the data. (Note: do NOT use the data point at t = 0, where p(0) is undefined, in your graph.)

6. Estimate the equation of an appropriate trendline. Copy and paste your graph, with trendline equation and associated R^2 value, below.

[Graph: Insect Population Growth. y-axis: Population size (individuals); x-axis: Time (days). Equation written in by hand: p(t) = 3000 - 2000/t^2, R^2 = 1]

a. Estimate the equation of the trendline. Why does the R^2 NOT equal 1?
Because Excel can’t come up with the exact equation that we used to generate the y-values, so the curve doesn’t explain all of the variance in y anymore.

b.
Let's remove the trendline, make Excel connect the dots (by changing the Graph Type to x-y scatter with the dots connected), and write in our own trendline equation and R^2 value (using view-toolbars-drawing-‘text box’).

c. In a new column, calculate the average rate of change between 0 and 0.5 days (using the equation “given”). Repeat for each half-day thereafter (replace the data table above with your completed data table from Excel).
Note: we can’t get a slope for 0 to 0.5 because the function is undefined (it approaches negative infinity) at time = 0.

d. Is the absolute value of the slope larger between 1 and 1.5 days, or between 1.5 and 2 days? What does your answer mean, for this problem?
The population is growing faster between 1 and 1.5 days (with an average growth rate of 2222 indv/day) than between 1.5 and 2 days (when the average growth rate is 778 indv/day). A growth rate of 2222 indv/day means that during this time period, individuals are added at a rate of 2222 per day. Since the time period is only 1/2 of a day, only 1111 individuals are added (confirm with population values: 2111 indv. - 1000 indv. = 1111 indv.). This is clear on the graph because the slope (which represents the average rate of change) is steeper between 1 and 1.5 days than between 1.5 and 2 days.

e. Evaluate p(0.75), using the given equation, and interpret your answer in words.
p(0.75) = 3000 - 2000/(0.75)^2 = -555.56
So, at 3/4 of a day (18 hours) past the start of the experiment, the population was -555.56 individuals. Somehow the insect population increased from this negative number to become positive after 0.816 days (maybe insect eggs were incubating, and hatched??).

f. Report and interpret the y-intercept of the graph in terms of this problem.
By definition, the curve intercepts the y-axis when x = 0. It answers the question, “when time is zero (the initial condition), what is the population, p(t)?”
p(0) = 3000 - 2000/(0)^2, which is UNDEFINED.
From the graph, and from intuition, as t gets smaller and smaller, 2000/t^2 gets larger and larger. Since p(t) = 3000 minus an increasingly large number, the function approaches negative infinity as t approaches zero.

g. Report and interpret the x-intercept of the graph.
To find out where the curve intercepts the x-axis, we need to realize that y is zero when x is equal to the x-intercept. Replacing y, or “p(t)” in this case, with 0,
p(t) = 0 = 3000 - 2000/t^2
we can solve for t to find t = sqrt(2/3). So, the x-intercept is 0.816 days. For this problem, after 0.816 days, the population goes from being negative (hard to imagine) to being positive.

h. Over what range of x values do you think the equation is valid?
From 0.816 days onward.

PCBs: As popularized in Rachel Carson’s Silent Spring, high levels of PCB (polychlorinated biphenyl, an industrial pollutant) in the environment can damage pelicans’ eggs by thinning the shells. The data table below shows the relationship between the concentration of PCB in the eggshells and the thickness of the eggshell [1].

Concentration, c, in parts per million (ppm)   87     147    204    289    356    452
Thickness, h, in millimeters (mm)              0.44   0.39   0.28   0.23   0.22   0.14

1. Copy and paste the data table into Excel. Transpose, to have data in columns (or you can type it in by hand). (Transpose: Copy, paste into Excel. Then select again, cut, click on a new cell, edit, paste special, check “transpose”.)

2. Decide which column should be y, and which should be x (include units).
y = Thickness, h, in millimeters (mm)
x = Concentration, c, in parts per million (ppm)

3. Predict whether the relationship will be positive or negative: negative
and whether it will be linear or non-linear: linear
Explain your logic in making these predictions:
I expect that as PCB concentrations increase, the thickness of pelicans’ eggshells should decrease.
I think it should be linear, because I expect that each additional increase in concentration (ppm) will cause the same amount of thinning of eggshells (change in thickness). If I thought the rate of change in thickness, with respect to concentration, was not constant, I would predict a non-linear relationship.

4. Graph the data.

5. Estimate the equation of an appropriate trendline. Copy and paste your graph, with trendline equation and associated R^2 value, below.

[1] Risebrough, R. W., “Effects of environmental pollutants upon animals other than man.” Proceedings of the 6th Berkeley Symposium on Mathematics and Statistics, VI, p. 443-463 (Berkeley: University of California Press, 1972).

[Graph: PCBs. y-axis: Eggshell thickness (mm); x-axis: Concentration of PCB (ppm). Trendline: y = -0.0008x + 0.488, R^2 = 0.9362]

a. Report and interpret the slope of the graph in terms of this problem.
The slope = -0.0008 mm/ppm. This means that we expect the eggshell thickness to decrease by 0.0008 mm for every 1 ppm increase in concentration of PCB. Notice that this rate of change is the same for all concentrations of PCBs (as opposed to the previous two problems, where the average rate of change varied depending on what range of x’s we were looking at).

b. Report and interpret the y-intercept of the graph in terms of this problem.
When the concentration of PCBs is 0, y(0) = 0.488 mm. This means that without any PCBs, the average pelican eggshell is 0.488 mm thick. We could think of this as the “baseline” thickness for pelican eggshells.

c. Report and interpret the x-intercept of the graph in terms of this problem.
The x-intercept will tell us “at what concentration the eggshell thickness will be equal to zero.” We can solve by setting y equal to zero, and solving for x.
x = 0.488/0.0008 ppm = 610 ppm.

d. Report the R^2 value for your trendline: 0.9362

i. Interpret the R^2 value, in terms of eggshells and PCBs.
93.6% of the variance (a measure of spread) in eggshell thickness is explained by the model (the relationship between thickness and concentration of PCBs).

ii. What could account for the “unexplained variance” in y?
Genetic differences in pelicans, which lead to naturally occurring differences in eggshell thickness.

e. Calculate f(204) using your trendline equation. Interpret your answer, in terms of eggshells and PCBs.
y = -0.0008x + 0.488
Remember y = f(x), y is a function of x:
y(204) = -0.0008(204) + 0.488 = 0.325 mm

f. Calculate the residual associated with f(204). Interpret your answer, in terms of eggshells and PCBs.
Our predicted value of eggshell thickness for a PCB concentration of 204 ppm was 0.325 mm. Looking at our data table, the actual thickness associated with this PCB concentration was 0.28 mm. Thus, the residual (y_actual - y_predicted) is 0.28 - 0.325 = -0.045 mm. We overpredicted the eggshell thickness by 0.045 mm.

g. Do you think that the trendline will continue to make accurate predictions for y as x gets larger (in other words, do you feel confident extrapolating for this data set)? Why/why not?
It can’t go on forever, because the eggshell thickness cannot become negative. Thus, the function only makes sense for concentrations between 0 and 610 ppm.

Explore Global Warming

In Lab 2, we looked at trends in global temperature anomaly through time, which scientists believe is a result of increases in CO2 (primarily from burning fossil fuels). Let's look at the relationship between the temperature anomaly and CO2. Open the Lab3.xls spreadsheet, ExploreGW.

*Note: since CO2 varies dramatically between summer and winter, I took the average of the two values to compare to the yearly temperature anomaly data (this may not be the best approach, but it was convenient). I matched the CO2 and temperature data by years (I only had CO2 data for 1959 through 2002).

6. Decide which column should be y, and which should be x (include units).
y = average temperature anomaly (deg C)
x = CO2 (ppm)

7. Predict whether the relationship will be positive or negative: positive
and whether it will be linear or non-linear: non-linear
Explain your logic in making these predictions:
Based on my ES classes, I believe that increasing levels of CO2 will lead to greater average temperature anomalies. I think it will be non-linear, because many complex, dynamic systems are governed by feedbacks and thresholds, which lead to non-linear responses by the dependent variable. Since I think it will be non-linear, I graphed it as follows:

8. Graph the data.

[Graph: Does CO2 Cause Global Warming? Data for years 1959-2002. y-axis: Average temperature anomaly (deg C); x-axis: CO2 (ppm). Non-linear model: y = 1E-05x^3 - 0.0108x^2 + 3.6104x - 402.47, R^2 = 0.4972. Linear model: y = 0.0063x - 2.035, R^2 = 0.3164]

9. Estimate the equation of an appropriate trendline. Copy and paste your graph, with trendline equation and associated R^2 value, below.

a. What does each data point on the graph represent?
Each point represents a year between 1959 and 2002. Note, however, that the points are not plotted in chronological order; they are plotted at coordinates of (CO2, temp. anomaly). Thus, the pair of values represents the “average” CO2 and temperature values for a year.

b. Report and interpret the slope of the graph in terms of this problem.
Since I used a non-linear plot (which I feel reflects the underlying pattern in the data more accurately), the slope is changing (not constant). A positive slope indicates an increase in temperature anomaly when CO2 increases, whereas a negative slope corresponds to a decrease in temp. anomaly for an increase in CO2.
For a linear plot, you would have gotten the equation y = 0.0063x - 2.035. The slope of 0.0063 deg C/ppm means that for a one-unit (1 ppm) increase in CO2, we would expect a 0.0063 deg C increase in the average temperature anomaly.
If you got an equation of y = 0.125x - 0.815, convince yourself that neither the slope nor the intercept makes sense (try evaluating it at CO2 = 0 ppm and at CO2 = 315 ppm).

c. Report and interpret the y-intercept of the graph in terms of this problem.
For the non-linear plot, y = 1E-05x^3 - 0.0108x^2 + 3.6104x - 402.47 with R^2 = 0.4972. When x (CO2) = 0, y (temperature anomaly) = -402.47 deg C. Thus, when there is no CO2 in the atmosphere, the planet is a very, very cold ice-ball. I would want to check and see what the minimum historic CO2 levels have been (and associated temperatures) before deciding how far out to extrapolate.
Linear plot: y = 0.0063x - 2.035. The y-intercept is -2.035 deg C, which means that when the concentration of CO2 is zero ppm, the temperature anomaly is -2.035 deg C.

d. Report and interpret the x-intercept of the graph in terms of this problem.
My non-linear plot looks like it crosses the x-axis around 317 ppm. This is the concentration at which the temperature anomaly changes from negative to positive.
My linear plot looks like it crosses the x-axis around 325 ppm. This is the concentration at which the temperature anomaly changes from negative to positive. I could also solve for this value by setting y = 0 and solving for x.
non-linear answer:

e. Report the R^2 value for your trendline: 0.4972

i. Interpret the R^2 value, in terms of this problem.
49.7% of the variance (a measure of spread) in the global average temperature anomaly was explained by my model (3rd degree polynomial with CO2 as the independent variable).

ii. What could account for the “unexplained variance” in y?
Many factors are not accounted for in my model, such as deforestation, creation of urban heat islands, etc. My model might account for factors such as ocean uptake of heat, since that factor might be correlated with CO2 presence in the atmosphere.

f. The Intergovernmental Panel on Climate Change reports that the carbon dioxide concentration is rising by 1.5 ppm per year.

i.
Based on the data provided, do you agree with this value? Show your calculation clearly.
This is kind of tricky. We are asked for the rate of change of CO2, relative to time. Thus, we need to graph CO2 vs. time, and estimate the average rate of change (slope).

[Graph: CO2 increase from 1959 to 2002. y-axis: CO2 (ppm); x-axis: Year (for linear equation, 1959 is year 0). Trendline: y = 0.8534x + 321.37]

According to my linear model, CO2 is rising by 0.85 ppm each year (slope is average rate of change). I don’t know why my graph has two high values and only one low value, which seems weird. I investigated (using the link to the original data on the spreadsheet), and found that when yearly averages were taken, there was a miscalculation. The graph should look like:

[Graph: CO2 increase from 1959 to 2002 (corrected yearly averages). y-axis: CO2 (ppm); x-axis: Year (for linear equation, 1959 is year 0). Trendline: y = 1.3653x + 312.41]

According to my new linear model, CO2 is rising by about 1.37 ppm each year (slope is average rate of change). This is close to the IPCC value. Interestingly, some students figured out the slope (rate of change) between each pair of years, and then averaged those twenty or so measurements, to arrive at an average slope of 1.3 ppm/yr.

Therefore, the original data were flawed and I cannot trust my original predictions. I changed my “Source Data” to the properly calculated CO2 averages (with corresponding years and temperatures) to revise my models as follows:

Does CO2 Cause Global Warming?
Revised data for years (1959-2002)

[Graph: y-axis: Average temperature anomaly (deg C); x-axis: CO2 (ppm). Non-linear equation: y = 0.0001x^2 - 0.0917x + 14.26, R^2 = 0.7655. Linear equation: y = 0.0095x - 3.1495, R^2 = 0.7282]

This model has a much better R^2 value (it fits the data better), and does not have those strange vertical groupings of data on the left-hand side. I should re-interpret my intercepts with this new model.

ii. If the amount of CO2 increase per year does, in fact, continue at 1.5 ppm/year, what will be the concentration of CO2 in the year 2100? (Notice that the CO2 concentration in 2002 was 372.55 ppm.)
If the average rate of change is constant at 1.5 ppm/year (note: this corresponds to a constant rate of increase, a linear model with slope = 1.5 ppm/year), we will have added (1.5 ppm/year)*(98 years) = 147 ppm to the 2002 value of 372.55 ppm. Thus, the new concentration of CO2 will be 519.55 ppm.

iii. According to your model, what will be the temperature anomaly in the year 2100, based on the concentration of CO2 calculated above? (You can do a quick internet search for temperature predictions of more sophisticated models to see if your estimate is in the ballpark.)
According to my revised linear model, the new temperature anomaly should be:
y(519.55) = 0.0095(519.55) - 3.1495 = 1.79 deg C
(I have heard the temperature will rise by about 2-2.5 deg C this century, so my “back-of-the-envelope” calculation doesn’t seem too bad.)
According to the old linear model (this is the one you may have used), the new temp. anomaly should be:
y = 0.0063x - 2.035 = 1.24 deg C.
You could also calculate the predicted temperature anomaly using the non-linear equation:
y(x) = .001x^2 - 0.0917x + 14.26
y(519.55) = 0.001(519.55)^2 - 0.0917(519.55) + 14.26 = 88.84 deg C (check out this curve on the graph to see why it gets so hot).

iv.
Do you feel confident extrapolating for this data set? Why/why not?
I am not extremely confident, because I know that temperature change is governed by other cycles besides just CO2. There could be negative feedbacks that act to regulate the effect of CO2 on climate. Also, some students mentioned that they feel our society might undergo a major paradigm shift which will change our CO2 and climate future. Sounds like an interesting idea...

g. Offer a reasonable explanation for the vertical group of data points on the left-hand side of the graph.
I found the actual error: it was a data entry problem (as is often the case... think about how this relates to the outliers in the student-reported data... are they really consuming that much more water?). You could have offered any reasonable explanation, such as any effect that would magnify the heating effect of CO2 (perhaps El Nino/La Nina events?).

**Disclaimer: temperature change is a very complicated process, with feedbacks, thresholds, and non-linearities. This model is meant to provide a “back-of-the-envelope” look at the relationship between CO2 and temperature.
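
Optional cross-check: the lab itself uses Excel, but the main hand calculations (average rate of change between points, the least-squares trendline and R^2 for the PCB data, the residual at 204 ppm, and the linear extrapolation to 2100) can also be verified with a short script. The following is only a minimal Python sketch under stated assumptions: the helper names average_rates and linear_fit are made up for this illustration, the trendline is assumed to be an ordinary least-squares line (which is what Excel's linear trendline computes), and the rounded coefficients printed on the graphs are used for the residual and the 2100 estimate.

```python
# Illustrative cross-check of the lab's calculations (not part of the original lab).
# average_rates and linear_fit are hypothetical helper names chosen for this sketch.

def average_rates(p, times):
    """Average rate of change (slope) of p between consecutive times."""
    return [(p(t1) - p(t0)) / (t1 - t0) for t0, t1 in zip(times, times[1:])]

def linear_fit(xs, ys):
    """Ordinary least-squares line y = m*x + b, returned with its R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    m = sxy / sxx
    b = mean_y - m * mean_x
    r2 = sxy ** 2 / (sxx * syy)  # squared correlation = R^2 for a straight-line fit
    return m, b, r2

def bacteria(t):
    """Bacteria model given in the lab: p(t) = 5000 + 3000t - 2000t^2 (t in hours)."""
    return 5000 + 3000 * t - 2000 * t ** 2

def ants(t):
    """Ant model given in the lab: p(t) = 3000 - 2000/t^2 (t in days, t > 0)."""
    return 3000 - 2000 / t ** 2

# Slopes for each half-hour (bacteria) and each half-day (ants; t = 0 is skipped
# because p(0) is undefined).
bacteria_rates = average_rates(bacteria, [0, 0.5, 1, 1.5, 2, 2.5, 3])
ant_rates = average_rates(ants, [0.5, 1, 1.5, 2])

# PCB data from the lab's table.
pcb_ppm = [87, 147, 204, 289, 356, 452]
thickness_mm = [0.44, 0.39, 0.28, 0.23, 0.22, 0.14]
slope, intercept, r2 = linear_fit(pcb_ppm, thickness_mm)

# Residual at 204 ppm, using the rounded trendline shown on the graph.
predicted_204 = -0.0008 * 204 + 0.488
residual_204 = 0.28 - predicted_204  # actual minus predicted

# CO2 in 2100 at a constant 1.5 ppm/year, then the revised linear model.
co2_2100 = 372.55 + 1.5 * (2100 - 2002)
anomaly_2100 = 0.0095 * co2_2100 - 3.1495

print("Bacteria slopes (indv/hr):", bacteria_rates)
print("Ant slopes (indv/day):", [round(r, 2) for r in ant_rates])
print("PCB fit: slope=%.4f mm/ppm, intercept=%.3f mm, R^2=%.4f" % (slope, intercept, r2))
print("Residual at 204 ppm (mm):", round(residual_204, 3))
print("CO2 in 2100 (ppm):", round(co2_2100, 2))
print("Anomaly in 2100, revised linear model (deg C):", round(anomaly_2100, 2))
```

Because the script works from the given equations and the rounded trendline coefficients, small differences from Excel's unrounded values are to be expected, especially for the polynomial fits, which are not reproduced here.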