Name:__________________________________

Using Data from Public Sources

Data are the actual measurements that scientists take during an experiment, the concrete values that they then investigate to draw conclusions about their research. Data are what scientists use to convince other people that their theories are correct. They generate graphs from their data to make the data easier to understand. Many scientists collect their own data, but there are also many sources of publicly available data, collected by large institutions and programs, that scientists use to test hypotheses. We will use publicly available climate data to make our own graphs and think about the topic of climate change. Bear in mind that a hypothesis is just an educated guess; the data may or may not agree with it. When the data agree with the hypothesis, scientists say the data support the hypothesis. When the data don’t agree, they say the data refute the hypothesis.

After opening the Excel file with the climate data, notice several things:

The first row of every column tells you what the variable in that column is.
You have many variables to work with, and you have data for many years.
Some of the variables have long names and do not fit in the cell. You can read the entire name by clicking on the cell.
Most rows have values missing for some of the dates. This is normal; scientists have to work with incomplete datasets all the time.
If you scroll down in the sheet, there are more numbers than you can view all at once. This is also very common.
There are many numbers! Don’t be overwhelmed; make graphs and charts to make the numbers easier to understand.

Part I: Using Excel to organize numbers and make graphs

1. Each row (horizontal) is identified by a number, and each column (vertical) is identified by a letter. Which column contains the variable Year? What is the range of years included in your dataset (you will have to scroll down)?
    Column A, years 1955-2013

2. How many different variables (including year) do you have in this dataset?
    13

3. Some variable names are too long to fit in one cell. You can read the entire name by clicking on the cell that contains it. Which column contains the variable Avg Temp Burlington VT (F)?
    Column B

4. If we are interested in both year and temperature in Burlington, which is the independent variable and which is the dependent variable?
    Year: Independent, Temperature: Dependent

5. Form a hypothesis about Avg Temp Burlington VT and Year that you could test by graphing these variables.
    Many possible answers, but an example: With what we know about climate change, average temperature in Burlington may have increased over time since 1955.

6. Using the “Charts” section, make a pie chart of the values in Avg Temp Burlington VT (F).
a. Is this an appropriate graph for these values?
    No
b. Why or why not? What does this graph tell you?
    This chart shows that each temperature value occurs once, with no information (unless you hover over it) about what the temperature actually was.

7. Make a column graph of the values in Avg Temp Burlington VT (F).
a. Is this an appropriate graph for these values?
    Yes
b. Why or why not? What does this graph tell you?
    This graph shows what the temperature value was in each year, so by looking at where the values are on the y-axis we can get an idea of what the average temperatures are. The large
c. What is another variable you could include in this graph to make it more informative?
    Including year in this visualization on the x-axis would help us understand the temperature trends over time.

8. Make a line graph of the values in Avg Temp Burlington VT (F).
a. Is this an appropriate graph for these values?
    Yes.
b. Why or why not? What does this graph tell you?
    This graph connects consecutive values in the column (which in this case are consecutive years). This helps us see the trend in the data, even though the x-axis is only labeled as column values, not years.
c. What is another variable you could include in this graph to make it more informative?
    Including year in this visualization on the x-axis would help us understand the temperature trends over time.

9. Which of the graphs you created do you think is the best for displaying these data?
    Line graph.

Part II: Investigating your own hypothesis

10. What is a hypothesis you could test by graphing one or more of these variables?
    Many possibilities. One example: The temperature in one part of the country may affect the precipitation somewhere else because of distant weather and cloud sources. The temperature in Seattle, WA may correlate with the precipitation in Burlington, VT.

11. What is your independent variable? What is your dependent variable?
    Temperature in Seattle: Independent
    Precipitation in Burlington: Dependent

12. Make the appropriate graph to test this hypothesis. Make sure you label the axes, give it a title, and make the values in the axes appropriate ranges. Print this and staple it to the back of this worksheet.
    See last page

13. Explain why the graph you chose is the best option for the variable(s) you are interested in (it may be easier to explain why other graphs aren’t the best option).
    I plotted temperature (the independent variable) on the x-axis and precipitation (the dependent variable) on the y-axis. This will show whether the precipitation changes in response to increasing temperature. I used an XY scatterplot because it shows how two variables correlate with each other, and I added a trendline to see whether there was any positive or negative association. This trendline is flat, indicating no relationship between the two variables.

14. Summarize the message your graph says about the data in one sentence (if this were your science fair project, you would put this in the text of your results).
    There is no (statistical) relationship between the average annual temperature in Seattle, WA and the precipitation in Burlington, VT.

15. Based on what you said in question 13, do the data in your graph support or refute your hypothesis in question 10?
    No, the data do not support my hypothesis. I hypothesized there would be a relationship between these two variables and there was not.
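
Optional extension: the trendline check described in questions 13 and 14 can also be sketched outside Excel. The Python below is only an illustration and is not part of the worksheet's method. It assumes a least-squares straight line, which may not match the trendline type selected in Excel. The function name `trendline` is hypothetical, and the temperature and precipitation values are made-up placeholders, not values from the climate dataset.

```python
from math import sqrt


def trendline(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the points.

    xs holds the independent variable and ys the dependent variable.
    r is the Pearson correlation coefficient.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Placeholder values for illustration only (NOT taken from the Excel file).
seattle_temp_f = [51.0, 52.0, 53.0, 54.0, 55.0]        # independent variable
burlington_precip_in = [36.0, 30.0, 38.0, 30.0, 36.0]  # dependent variable

slope, intercept, r = trendline(seattle_temp_f, burlington_precip_in)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r = {r:.2f}")
```

A slope and r close to zero correspond to the flat trendline described in question 13. Values far from zero would suggest a positive or negative association worth a closer look.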