Data Literacy Project

Is it berry season yet?

Background: The Blue Hill Observatory, near Boston, Massachusetts, is the longest-running weather observatory in the U.S. There, they have been measuring weather variables like annual temperature, amount of rain, etc. since 1885. They have also kept track of the date when there were ripe blueberries each year (perhaps the site operator liked to eat them!).

The graph below shows the date of first ripe blueberries each year since 1886 at the Blue Hill Observatory. The Y-axis is given as “Day of Year”: the number you get if you start with one on January first and just keep counting, without stopping and restarting the count each month. So, February 1st is actually the 32nd day of the year. The black line going across at Day of Year 172 is around the first day of summer, about June 21. The black line that stair-steps up and down is the date of first ripe blueberries, and the red and blue lines show the 10-year and 30-year means, which let you see the pattern a bit more clearly by smoothing out some of the bumps.

Data Source: Blue Hill Observatory; http://www.bluehill.org/climate/climate.html
Note: a Julian Day (day of year) converter: http://landweb.nascom.nasa.gov/browse/calendar.html

1. Describe what the graph shows about how the date of first ripe blueberries is changing. (Purpose here is to elicit a description of what the graph shows. Sample response: The date of first ripe blueberries generally varied around day 172 until around 1970 or so. There is a lot of year-to-year variability, ranging from around day 160 to day 185 or higher.)

2. I interpret the graph to mean… (Purpose here is to elicit an explanation (e.g. of the pattern or variability) or an interpretation of the meaning in terms of the context of the question. Sample response: Although there is some variability from year to year, the date of first ripe blueberries has gotten earlier, especially since about 1960. It was the earliest on record just a couple of years ago.)
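For classes that want to check the “Day of Year” counting from the background section, here is a minimal Python sketch. The helper name day_of_year is made up for illustration, and the years used are arbitrary examples (one ordinary year, one leap year), not values from the Blue Hill record.

```python
from datetime import date


def day_of_year(year, month, day):
    """Count days from January 1 (day 1) without restarting each month."""
    return date(year, month, day).timetuple().tm_yday


# February 1 comes right after the 31 days of January.
print(day_of_year(2023, 2, 1))   # 32

# June 21 in an ordinary (non-leap) year.
print(day_of_year(2023, 6, 21))  # 172

# In a leap year, every date after February 29 lands one day later.
print(day_of_year(2024, 6, 21))  # 173
```

This is why the handout says Day of Year 172 is "about" June 21: the exact match depends on whether the year has a February 29.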
Additional discussion: Why are there two different lines for the “10-year mean” and “30-year mean”? What does each signify, and how different is the story that they each tell?

These are each a “moving average”. Essentially, for each year, you take the mean of the years surrounding that year (about 10 of them, in the case of the 10-year mean). So, if you wanted to calculate the 10-year mean for 1990, you would add up the data for 1990 and the five years before and after it (1985-1995) and divide by n, the number of years you added up. (Counted strictly, 1985-1995 contains 11 values, so n = 11 for that window.) A moving average smooths out the year-to-year variability and makes it easier to see a long-term pattern.

[Note: moving averages are used a lot in financial data as well as in science. Often for financial data, the 10 years up to and including the year of interest are used in the calculation, not the years surrounding it; so in the example above, we’d use 1981-1990.]

Using 30 years smooths out the data even more than 10 years. What would a 5-year mean look like?

A bit more description and detail: http://en.wikipedia.org/wiki/Moving_average
Here’s the math answer, with the formula for calculation: http://mathworld.wolfram.com/MovingAverage.html
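The two kinds of moving average described above (one centered on the year of interest, one using only that year and the years before it) can be sketched in a few lines of Python. This is an illustration, not the observatory's own method: the function names centered_mean and trailing_mean are made up here, and the short list of day-of-year values is invented for the example rather than taken from the Blue Hill data.

```python
def centered_mean(values, half_width):
    """Mean of each value plus half_width neighbors on each side.

    Returns None where the window would run off either end of the data.
    """
    result = []
    for i in range(len(values)):
        lo, hi = i - half_width, i + half_width
        if lo < 0 or hi >= len(values):
            result.append(None)
        else:
            window = values[lo:hi + 1]
            result.append(sum(window) / len(window))
    return result


def trailing_mean(values, window):
    """Mean of each value and the (window - 1) values before it.

    Returns None until enough earlier values are available.
    """
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)
        else:
            chunk = values[i - window + 1:i + 1]
            result.append(sum(chunk) / len(chunk))
    return result


# Invented day-of-year values for five consecutive years (not real data).
berry_days = [170, 174, 168, 176, 172]

# Centered: one year on each side, so each mean uses 3 values.
print(centered_mean(berry_days, 1))

# Trailing: the current year plus the one before it, so 2 values each.
print(trailing_mean(berry_days, 2))
```

With half_width set to 5, centered_mean uses 11 values per year, which matches the 1985-1995 window for 1990. Trying a larger window on the same data is a quick way to see why the 30-year line looks smoother than the 10-year line.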