GVN330 Climate Data Analysis Assignment 2.3: Histograms

Histograms are one of the most common graphical tools for displaying data
distributions. By looking at a histogram, you can easily get a feel for the data's
location, spread, skewness, the presence of outliers, and modality.
Reading
Wilks 3.3.5 – Histograms (part of reading quiz 3)
Exercises:
Ass2.3 Create seasonal histograms of daily minimum temperature for the Saeve
station. Then compare the seasons by estimating summary statistics from the
histograms. Finally, test your understanding by calculating the summary
statistics.
a) Create the plots
Use the daily data from DataFiles/Saeve_daily.csv. That is, create histograms
based on the daily data itself, NOT the monthly data you have been working with
in Ass1.1/2.1.
Use data for years 1961:2002, to avoid having too many NaNs.
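As a minimal sketch of restricting data to a range of years, the snippet below builds a logical mask with relational operators and counts the NaNs that remain. The variable names (year, tmin) and all values are made up for illustration; the actual column layout of Saeve_daily.csv is not assumed here.

```matlab
% Hypothetical year and temperature vectors (NOT the Saeve data).
year = [1959 1960 1961 1975 2002 2003];
tmin = [-3.0 1.5 NaN 4.2 -0.5 2.0];

% Logical mask: true where the year lies in 1961..2002 (inclusive).
inPeriod = (year >= 1961) & (year <= 2002);

% Logical indexing keeps only the values inside the period.
tminPeriod = tmin(inPeriod);

% Count how many missing values (NaN) remain in the selection.
nMissing = sum(isnan(tminPeriod));
```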
Create 4 histograms of daily minimum temperature, one for each season:
• Summer – Jun, Jul, Aug
• Fall – Sep, Oct, Nov
• Winter – Dec, Jan, Feb
• Spring – Mar, Apr, May
Use the hist function to create a histogram.
I suggest you create all the histograms on the same Figure, but on separate axes,
using the subplot command.
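The layout suggested here could look roughly like the following sketch, which places four histograms on a 2-by-2 grid of axes in one figure. The data are random stand-ins, and the names seasonNames and fakeData are hypothetical; substitute your own seasonal selections.

```matlab
% Hypothetical stand-in data: four vectors of made-up temperatures (deg C).
% Replace these with the seasonal selections from your own daily data.
seasonNames = {'Summer', 'Fall', 'Winter', 'Spring'};
fakeData = {10 + 3*randn(300,1), 5 + 4*randn(300,1), ...
            -3 + 6*randn(300,1), 2 + 5*randn(300,1)};

figure;
for k = 1:4
    subplot(2, 2, k);      % 2-by-2 grid of axes; make the k-th one current
    hist(fakeData{k});     % histogram using hist's default bins
    title(seasonNames{k});
    xlabel('Daily minimum temperature (deg C)');
    ylabel('Number of days');
end
```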
Because the seasons have different numbers of days, you will probably find it
easier to select the data for a season and plot it, rather than dividing the data up
into seasons first.
You know how to select all data for any one particular month. But how will you
select all data from more than one month? Review logical operators in MATLAB;
http://www.weizmann.ac.il/matlab/techdoc/ref/logicaloperators_01.html is a
good short summary.
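To illustrate how logical operators combine several conditions, here is a small sketch on made-up values. The variable names (month, tmin) are hypothetical, and this is only one of several ways to build such a mask.

```matlab
% Hypothetical month numbers (1-12) and temperatures for ten days.
month = [12 12 1 2 3 6 7 8 9 11];
tmin  = [-5 -7 -9 -4 0 9 12 11 6 -1];

% Element-wise OR (|) combines several single-month tests into one mask.
isWinter = (month == 12) | (month == 1) | (month == 2);

% ismember builds the same logical mask in a more compact form.
isWinterAlt = ismember(month, [12 1 2]);

% Logical indexing keeps only the days where the mask is true.
winterTmin = tmin(isWinter);
```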
Discussion points:
A histogram counts the number of values which fall into "bins", where each bin
has a lower and upper value. With MATLAB's hist function, you can either
• specify the number of bins,
• specify a vector of bin centers, or
• let MATLAB choose the bins (by default, hist uses 10 equally spaced bins).
I suggest that you specify the bin centers. Why might this be a good idea?
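As a sketch of the bin-center option, the snippet below passes the same vector of centers to hist for two made-up samples, so the resulting counts refer to identical bins. The centers and data values are arbitrary choices for illustration, not a recommendation for the Saeve data.

```matlab
% Hypothetical bin centers in deg C, spaced 2 degrees apart.
centers = -30:2:20;

% Two made-up samples standing in for two different seasons.
a = [-12.5 -3.1 0.4 0.9 7.8];
b = [1.2 3.3 3.9 10.1 14.6];

% With an output argument, hist returns the counts instead of plotting.
% Each value is counted in the bin whose center is nearest to it.
countsA = hist(a, centers);
countsB = hist(b, centers);
```

Because both count vectors share the same centers, they can be compared bin by bin, or plotted with identical x-axes.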
b) Estimate distribution parameters from histograms
For your exam, you will need a conceptual understanding of measures of central
tendency, spread and symmetry. Test your understanding by attempting to
answer the following questions by examining your histograms of minimum
temperature:
• Estimate the mode for each season.
• Are the distributions positively or negatively skewed?
• Which of the seasons has the lowest standard deviation (i.e. lowest
spread)?
• For one season, the mean is greater than the mode. Which season? OK,
this is hard! You should be able to say that it is one of two seasons. Note
that you cannot determine the mean exactly from a histogram, but you
can make a rough estimate.
Discuss your answers with others!
c) Check your understanding by calculating the mean, mode, skewness and
standard deviation for minimum temperature for each season.
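A minimal sketch of these calculations on a made-up sample follows. The skewness here is computed by hand using one common moment-based definition (mean cubed deviation divided by the 1.5 power of the mean squared deviation); this is an assumption, since the course text may normalise with n-1 instead, so check the definition you are expected to use.

```matlab
% Hypothetical sample in deg C, including one missing value (NOT Saeve data).
x = [-4 -2 -2 -2 NaN -1 0 1 2 4 9];

% Drop missing values first; mean and std would otherwise return NaN.
x = x(~isnan(x));

m  = mean(x);    % arithmetic mean
s  = std(x);     % sample standard deviation (normalised by n-1)
md = mode(x);    % most frequent value in the sample

% Moment-based skewness (assumed definition, normalised by n):
% positive when the long tail is on the high side.
dev  = x - m;
skew = mean(dev.^3) / mean(dev.^2)^1.5;
```

Note that mode returns the most frequently repeated value, which depends on how the data are rounded; for finely resolved data, the center of the tallest histogram bin may be a more useful estimate.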