Numerical Summaries of Quantitative Data: Means, Standard Deviations, z-scores

Warmup
Six people in a room have a median age of 45 years and mean age of 45 years. One person who is 40 years old leaves the room.
Questions:
1. What is the median age of the 5 people remaining in the room?
2. What is the mean age of the 5 people remaining in the room?

2 characteristics of a data set to measure
center: measures where the "middle" of the data is located
variability: measures how "spread out" the data is

Measure of the "middle"
Sample mean
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Population mean (value typically not known); N = population size
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

Recall: Warmup
1. What is the median age of the 5 people remaining in the room? Can't answer: the median depends on the individual ages, which are not given.
2. What is the mean age of the 5 people remaining in the room? 46, because 45 × 6 = 270; 270 − 40 = 230; 230/5 = 46.

Connection Between Mean and Histogram
A histogram balances when supported at the mean.
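The warmup arithmetic can be checked with a short Python sketch. The two rooms below (`room_a`, `room_b`) are hypothetical ages invented for illustration, not data from the slides; both have mean 45 and median 45, yet their medians differ once a 40-year-old leaves, which is why question 1 cannot be answered.

```python
import statistics

# Mean after one person leaves: only the total and the count are needed.
n_before = 6
mean_before = 45
age_leaving = 40

total_before = n_before * mean_before        # 45 * 6 = 270
total_after = total_before - age_leaving     # 270 - 40 = 230
mean_after = total_after / (n_before - 1)    # 230 / 5 = 46.0
print("mean after:", mean_after)

# Hypothetical rooms (assumed ages), each with mean 45 and median 45.
room_a = [40, 40, 45, 45, 50, 50]
room_b = [30, 40, 44, 46, 50, 60]

for room in (room_a, room_b):
    remaining = list(room)
    remaining.remove(40)                     # the 40-year-old leaves
    print(statistics.mean(remaining), statistics.median(remaining))
```

Both rooms end with mean 46, but the medians of the remaining ages are 45 and 46 respectively.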
[Figure: histogram of frequency vs. absences from work; mean $\bar{x}$ = 140.6]

Mean: balance point
Median: 50% of the area in each half
Right histogram: mean 55.26 yrs, median 57.7 yrs

Mean, Median, Maximum Baseball Salaries 1985-2014
[Figure: Baseball Salaries: Mean, Median and Maximum, 1985-2014. Horizontal axis: year; left axis: mean and median salary; right axis: maximum salary]

DESCRIBING VARIABILITY OF QUANTITATIVE DATA

The Sample Standard Deviation, a measure of spread around the mean
Square the deviation of each observation from the mean; find the square root of the "average" of these squared deviations.
Deviations: $(x_i - \bar{x})$; squared deviations: $(x_i - \bar{x})^2$. Find the "average"
$$\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1},$$
then take the square root of the average:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
called the sample standard deviation.

Calculations …
Women height (inches)

 i    x_i    mean   x_i - mean   (x_i - mean)^2
 1    59     63.4     -4.4           19.0
 2    60     63.4     -3.4           11.3
 3    61     63.4     -2.4            5.6
 4    62     63.4     -1.4            1.8
 5    62     63.4     -1.4            1.8
 6    63     63.4     -0.4            0.1
 7    63     63.4     -0.4            0.1
 8    63     63.4     -0.4            0.1
 9    64     63.4      0.6            0.4
10    64     63.4      0.6            0.4
11    65     63.4      1.6            2.7
12    66     63.4      2.6            7.0
13    67     63.4      3.6           13.3
14    68     63.4      4.6           21.6
Mean  63.4           Sum 0.0       Sum 85.2

Mean = 63.4
Sum of squared deviations from mean = 85.2
(n − 1) = 13; (n − 1) is called the degrees of freedom (df)
$s^2$ = variance = 85.2/13 = 6.55

We'll never calculate these by hand, so make sure to know how to get the standard deviation using your calculator or software.

Mean ± 1 s.d.
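As a software check of the height calculation, here is a minimal Python sketch that follows the same steps (deviations, squared deviations, divide by n − 1, square root). The variable names are illustrative choices; `statistics.variance` and `statistics.stdev` are the standard-library functions that use the n − 1 divisor.

```python
import math
import statistics

# Women's heights in inches, from the calculation table.
heights = [59, 60, 61, 62, 62, 63, 63, 63, 64, 64, 65, 66, 67, 68]

n = len(heights)
mean = sum(heights) / n                       # about 63.4
deviations = [x - mean for x in heights]      # these sum to (essentially) zero
sum_sq_dev = sum(d ** 2 for d in deviations)  # about 85.2

variance = sum_sq_dev / (n - 1)               # divide by df = 13
sd = math.sqrt(variance)                      # sample standard deviation

print(round(mean, 1), round(sum_sq_dev, 1), round(variance, 2), round(sd, 2))
# prints: 63.4 85.2 6.55 2.56

# The library functions give the same values without the intermediate steps.
print(statistics.variance(heights), statistics.stdev(heights))
```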
1. First calculate the variance $s^2$.
$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
2. Then take the square root to get the standard deviation s.
$$s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Population Standard Deviation
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
The value of the population standard deviation $\sigma$ is typically not known; use s to estimate the value of $\sigma$.

Remarks
1. The standard deviation of a set of measurements is an estimate of the likely size of the chance error in a single measurement.

Remarks (cont.)
2. Note that s and $\sigma$ are always greater than or equal to zero.
3. The larger the value of s (or $\sigma$), the greater the spread of the data.
When does s = 0? When does $\sigma$ = 0? When all data values are the same.

Remarks (cont.)
4. The standard deviation is the most commonly used measure of risk in finance and business: stocks, mutual funds, etc.
5. Variance
$s^2$: sample variance
$\sigma^2$: population variance
Units are squared units of the original data: square $, square gallons ??

Remarks 6): Why divide by n − 1 instead of n?
Degrees of freedom: each observation has 1 degree of freedom. However, when you estimate an unknown population parameter like $\mu$, you lose 1 degree of freedom.
In the formula for s, we use $\bar{x}$ to estimate the unknown value of $\mu$:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Remarks 6) (cont.): Why divide by n − 1 instead of n?
Example
Suppose we have 3 numbers whose average is 9.
Choose ANY values for $x_1$ and $x_2$: $x_1$ = ___, $x_2$ = ___; then $x_3$ must be ___.
Since the average (mean) is 9, $x_1 + x_2 + x_3$ must equal 9 × 3 = 27, so $x_3 = 27 - (x_1 + x_2)$.
Once we selected $x_1$ and $x_2$, $x_3$ was determined since the average was 9.
3 numbers but only 2 "degrees of freedom"

Class pulse rates
53 64 67 67 70 76 77 77 78 83 84 85 85 89 90 90 90 90 91 96 98 103 140
n = 23
$\bar{x}$ = 84.48
m = 85
$s^2$ = 290.26 (beats per minute)$^2$
s = 17.037 beats per minute

Review: Properties of s and $\sigma$
s and $\sigma$ are always greater than or equal to 0
When does s = 0? $\sigma$ = 0?
The larger the value of s (or $\sigma$), the greater the spread of the data.
The standard deviation of a set of measurements is an estimate of the likely size of the chance error in a single measurement.

Summary of Notation

SAMPLE                          POPULATION
$\bar{y}$  sample mean          $\mu$  population mean
m  sample median                m  population median
$s^2$  sample variance          $\sigma^2$  population variance
s  sample stand. dev.           $\sigma$  population stand. dev.

Using the Mean and Standard Deviation Together

Z-scores: Standardized Data Values
Measures the distance of a number from the mean in units of the standard deviation.
z-score corresponding to y:
$$z = \frac{y - \bar{y}}{s}$$
where
y = original data value
$\bar{y}$ = the sample mean
s = the sample standard deviation
z = the z-score corresponding to y

If data has mean $\bar{y}$ and standard deviation s, then standardizing a particular value of y indicates how many standard deviations y is above or below the mean $\bar{y}$.

Exam 1: $\bar{y}_1$ = 88, $s_1$ = 6; exam 1 score: 91
Exam 2: $\bar{y}_2$ = 88, $s_2$ = 10; exam 2 score: 92
Which score is better?
$$z_1 = \frac{91 - 88}{6} = \frac{3}{6} = 0.5 \qquad z_2 = \frac{92 - 88}{10} = \frac{4}{10} = 0.4$$
91 on exam 1 is better than 92 on exam 2.

Comparing SAT and ACT Scores
SAT Math: Eleanor's score 680; SAT mean = 500, sd = 100
ACT Math: Gerald's score 27; ACT mean = 18, sd = 6
Eleanor's z-score: z = (680 − 500)/100 = 1.8
Gerald's z-score: z = (27 − 18)/6 = 1.5
Eleanor's score is better.

Z-scores add to zero
Student/Institutional Support to Athletic Depts for the 9 Public ACC Schools: 2013 ($ millions)

School       Support   y - ybar   Z-score
Maryland      15.5       6.4       1.79
UVA           13.1       4.0       1.12
Louisville    10.9       1.8       0.50
UNC            9.2       0.1       0.03
VaTech         7.9      -1.2      -0.34
FSU            7.9      -1.2      -0.34
GaTech         7.1      -2.0      -0.56
NCSU           6.5      -2.6      -0.73
Clemson        3.8      -5.3      -1.47
                       Sum = 0    Sum = 0
Mean = 9.1000, s = 3.5697

In a recent year the mean tuition at 4-yr public colleges/universities in the U.S. was $6185 with a standard deviation of $1804. In NC the tuition was $4320. What is NC's z-score?
1. 1.03
2. -1.03
3. 2.39
4. 1865
5. -1865
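The z-score examples above can be reproduced with a short Python sketch. The helper name `z_score` and the dictionary layout are illustrative assumptions; the data are the SAT/ACT figures and the ACC support table from the slides, and `statistics.stdev` is the standard-library sample standard deviation (n − 1 divisor).

```python
import statistics

def z_score(y, mean, sd):
    """Number of standard deviations that y lies above (+) or below (-) the mean."""
    return (y - mean) / sd

# Comparing scores measured on different scales.
eleanor = z_score(680, 500, 100)   # SAT Math
gerald = z_score(27, 18, 6)        # ACT Math
print(round(eleanor, 2), round(gerald, 2))   # prints: 1.8 1.5

# Support to athletic departments, 2013 ($ millions), from the table.
support = {
    "Maryland": 15.5, "UVA": 13.1, "Louisville": 10.9, "UNC": 9.2,
    "VaTech": 7.9, "FSU": 7.9, "GaTech": 7.1, "NCSU": 6.5, "Clemson": 3.8,
}

ybar = statistics.mean(support.values())
s = statistics.stdev(support.values())
z = {school: z_score(y, ybar, s) for school, y in support.items()}

for school, value in z.items():
    print(f"{school:<11}{value:6.2f}")

# Deviations from the mean sum to zero, so the z-scores do too
# (up to floating-point rounding).
print(abs(sum(z.values())) < 1e-9)
```

Because each z-score is a deviation from the mean divided by the same s, the zero sum follows directly from the deviations summing to zero.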