S519: Evaluation of Information Systems
Social Statistics
Ch3: Difference

This week
- Range
- Standard deviation
- Variance
- Using Excel to calculate them

The whole story
- Descriptive statistics
  - Central tendency (average)
  - Measures of variability (spread)
- Average + variability = describe the characteristics of a set of data

Measures of variability
- Variability: how scores differ from one another
- Three sets of data:
  7, 6, 3, 3, 1
  3, 4, 4, 5, 4
  4, 4, 4, 4, 4
- Variability = the difference from the mean

Measures of variability: three ways
- Range
- Standard deviation
- Variance

Range
- The most general measure of variability
- How far apart scores are from one another
- Range = highest score - lowest score
- What is the range for 98, 86, 77, 56, 48?

Standard deviation
- Standard deviation (SD): the average deviation from the mean (average distance from the mean)
- Represents the average amount of variability

$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$

where x is each score, \bar{x} is the mean, and n is the number of scores.

Lab Exercise
- Calculate the standard deviation of 5, 8, 5, 4, 6, 7, 8, 8, 3, 6
  - By hand
  - Using Excel (STDEV())

STDEV and STDEVP
- STDEV is the standard deviation for a sample (divides by n-1; the unbiased estimate)
- STDEVP is the standard deviation for a population (divides by n; a biased estimate if applied to a sample)
- If your dataset is the whole population, use STDEVP to calculate the standard deviation
- If your dataset is a sample of something, use STDEV to calculate the standard deviation

STDEV:
$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$

STDEVP:
$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} $$

Why n or n-1?
- To be conservative
- STDEV is the standard deviation for a sample
- Dividing by n-1 instead of n makes STDEV a bit larger than it would otherwise be
- If we err, we err on the side of overestimating the standard deviation

Why n or n-1?
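To make the choice of denominator concrete, here is a minimal Python sketch (Python is not part of the original slides, which use Excel). It computes the standard deviation of the lab exercise scores from the definition, dividing first by n-1 and then by n, and compares the results with `statistics.stdev` and `statistics.pstdev` from the standard library, which play the roles of Excel's STDEV and STDEVP. The function name `sd_by_hand` is an illustrative choice.

```python
import math
import statistics

def sd_by_hand(scores, sample=True):
    """Standard deviation from the definition: n-1 for a sample, n for a population."""
    n = len(scores)
    mean = sum(scores) / n
    # Sum of squared deviations from the mean (the numerator of the formula)
    sum_sq = sum((x - mean) ** 2 for x in scores)
    denominator = n - 1 if sample else n
    return math.sqrt(sum_sq / denominator)

scores = [5, 8, 5, 4, 6, 7, 8, 8, 3, 6]  # lab exercise data

sample_sd = sd_by_hand(scores, sample=True)       # counterpart of Excel STDEV
population_sd = sd_by_hand(scores, sample=False)  # counterpart of Excel STDEVP

print(round(sample_sd, 2), round(population_sd, 2))  # prints 1.76 1.67

# The library functions should agree with the hand calculation
print(math.isclose(sample_sd, statistics.stdev(scores)))
print(math.isclose(population_sd, statistics.pstdev(scores)))
```

With only ten scores the n-1 version is visibly larger (1.76 versus 1.67), which is the conservative overestimate described above.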
Effect of dividing by n versus n-1 as the sample size grows:

Sample size | Numerator in SD formula | STDEVP: population SD (dividing by n) | STDEV: sample SD (dividing by n-1) | Difference between STDEVP and STDEV
10          | 500                     | 7.07                                  | 7.45                               | 0.38
100         | 500                     | 2.24                                  | 2.25                               | 0.01
1000        | 500                     | 0.7071                                | 0.7075                             | 0.0004

The larger the sample, the smaller the difference between the two.

What to remember
- Standard deviation (SD) = the average distance from the mean
- The larger the SD, the more the data differ from one another
- Because the mean is sensitive to extreme scores, so is the SD
- If SD = 0, there is no variability in the set of scores (they are all identical in value); this happens very rarely

Variance
- Variance = (standard deviation)^2

$$ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} $$

Lab Exercise
- Calculate the variance in Excel for 8, 8, 8, 7, 6, 6, 5, 5, 4, 3
- VAR() corresponds to STDEV (sample, n-1)
- VARP() corresponds to STDEVP (population, n)

SD vs. variance
- Both often appear in the "Results" sections of journals
- They are quite different: variance is the squared SD

[Figure: the ten scores plotted against the mean; vertical axis 0 to 9, horizontal axis 1 to 10]
- Average distance to mean = (2+2+2+1+1+1+2+3)/10 = 1.4
- SD = 1.76
- Variance = 3.1

Lab Exercise 1 (S-p78-problem2)
- Calculate the range, STDEV, STDEVP, and variance by hand or with a calculator for 31, 42, 35, 55, 54, 34, 25, 44, 35
- Then use Excel to do the same.
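The measures asked for in these lab exercises can also be checked outside Excel. The following Python sketch is offered as an illustration, not as part of the course materials. It computes the range, both standard deviations, and both variances with the standard library's `statistics` module; the helper name `variability_summary` is hypothetical. It is applied to the variance lab scores (8, 8, 8, 7, 6, 6, 5, 5, 4, 3), for which the slides report SD = 1.76 and variance = 3.1.

```python
import statistics

def variability_summary(scores):
    """Return the range plus sample and population SD and variance."""
    return {
        "range": max(scores) - min(scores),
        "stdev": statistics.stdev(scores),     # sample SD, n-1 (Excel STDEV)
        "stdevp": statistics.pstdev(scores),   # population SD, n (Excel STDEVP)
        "var": statistics.variance(scores),    # sample variance (Excel VAR)
        "varp": statistics.pvariance(scores),  # population variance (Excel VARP)
    }

scores = [8, 8, 8, 7, 6, 6, 5, 5, 4, 3]  # variance lab exercise data
summary = variability_summary(scores)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The scores for Lab Exercise 1 can be passed to the same function to check hand calculations.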
Lab Exercise 2 (S-p79-problem4)
- Problem 4 in S-p79
- Calculate the variation measures for height and weight

Height  Weight
53      156
46      131
54      123
44      142
56      156
76      171
87      143
65      135
45      138
44      114
57      154
68      166
65      153
66      140
54      143
66      156
51      173
58      143
49      161
48      131

Lab Exercise 3 (S-p79-problem5)
Western Airlines Flight Report
Columns: Morning Flights | Number of passengers | Evening Flights | Number of passengers
Thursday To Kansas Friday To Kansas 258 Thursday To Kansas Friday Thursday Friday To Philadelphia To Providence To Providence 303 312 166 176 Thursday To Philadelphia Friday Thursday Friday To Philadelphia To Providence To Providence 321 331 210 274 251 Friday To Kansas 312 Thursday To Philadelphia 331

Lab Exercise 3 (S-p79-problem5)
- Look at problem 5
- Write a half-page summary report to your boss
- Form a group to discuss it