Statistics Quarter 1: Exploratory Data Analysis Section 3: Sample Variance, Sample Standard Deviation, Sample Standard Error A sample mean, denoted x (pronounced “x-bar”), is an average of n observations. It measures the center of the observed data values. n o The formula is x x i 1 i . n A sample standard deviation, denoted s x , is an average deviation of n observations. It measure the spread or dispersion of the observed data values. x n o The formula is s x i 1 x 2 i n 1 is called a deviation. x x or for simplification s x SSX . n 1 x x 2 is called a squared deviation. SSX is the Sum of Squared X deviations Here is an example: Let X = the time, in minutes, it takes Joe to drive from Budd Lake, NJ, to Allentown, PA. Here are data from a sample of his driving times. x1 = 65 x2 = 61 x3 = 59 x4 = 70 x5 = 55 5 o The sample mean is x x i 1 5 i 65 61 59 70 55 310 62 min 5 5 o The sample standard deviation is a bit more involved so it may be helpful to set up a table. i xi x xi x xi x 2 1 2 3 4 5 65 61 59 70 55 62 62 62 62 62 +3 -1 -3 +8 -7 +9 +1 +9 +64 +49 132 5 Here: SSX = xi x 132 i 1 2 sx SUM = SSX SSX 132 132 33 5.74 min n 1 5 1 4 In summary, the sample mean x 62 and the sample standard deviation s x 5.74 . Joe’s sample of driving times had an average of 62 minutes and each observation was, on average, 5.74 minutes from the mean. Statistics Quarter 1: Exploratory Data Analysis Section 3: Sample Variance, Sample Standard Deviation, Sample Standard Error The sample variance measures the same thing as standard deviation: the spread or dispersion of the data. The sample variance is simply the squared standard deviation. In fact, one actually finds the sample variance before finding the standard deviation, that is, the square root of the sample variance is taken to obtain the sample standard deviation. o Sample variance = sx2 SSX n 1 Standard deviation = sx SSX n 1 Although sample variance is harder to interpret in context, the main idea is the same as sample standard deviation: The larger the variance, the larger the data observations are spread from one another. Sample Standard Error Sample standard error is a bit more complicated than the statistics described above. Standard error describes the spread or dispersion between sample averages opposed the dispersion between single observations. o Sample standard error = SE Mean = sx n In the driving times examples we found that the mean was x 62 min and the sample standard 5.74 deviation s x 5.74 min. The standard error here is SE mean = 2.57 min. 5 Here is how we interpret this value. Values of sample means with a sample size of 5, will vary, on average, 2.57 min from the mean of 62 minutes. So single observations will vary on average 5.74 min, but sample means will vary less. In fact, as the sample size increases, the value of the sample standard error will decrease.