Answers

1. When might the reported median of data be more appropriate than the mean of data? What is your biggest challenge in deciding which is more appropriate? Add an example in order to better illustrate your answer.

The mean and the median are both measures of central tendency. The mean is the sum of the observations divided by the number of observations. Consider the observations
4 5 7 5 9 5 7
Number of observations = 7
Sum of the observations = 4+5+7+5+9+5+7 = 42
Mean = 42/7 = 6

The median is the middle observation when the observations are arranged in order of magnitude. Arranging the given observations in order, we get
4 5 5 5 7 7 9
The middle observation is the 4th, so the median is 5.

When the distribution of the observations is symmetric, the mean and the median are equal. When the distribution is moderately asymmetric, the two measures are approximately equal. When the distribution is highly skewed (asymmetric), the two measures can be very different. The mean is strongly affected by extreme values; the median is not. The following example illustrates this. Suppose that in the above data the last observation, 7, is replaced by 25. The data become
4 5 7 5 9 5 25
The mean is (4+5+7+5+9+5+25)/7 = 60/7 ≈ 8.57. Arranged in order, the data are 4 5 5 5 7 9 25, so the median is still 5. The mean changed from 6 to about 8.57, but the median is unaltered. The median is therefore the more appropriate measure when the data are highly skewed or contain extreme values.

When the magnitudes of the observations are not known and only their relative positions or ranks are known, the mean cannot be computed. In this case we can still compute the median.

2. What is the significance of the standard deviation in the empirical rule? Please add an example to bolster your answer.

The standard deviation is a measure of dispersion. It is defined as the square root of the variance. The variance is the mean of the squares of the deviations of the observations from the mean. Consider the data
4 5 7 5 9 5 7
The mean is 6.
The deviations of these observations from the mean are
-2 -1 1 -1 3 -1 1
The squares of the deviations are
4 1 1 1 9 1 1
The sum of the squared deviations is 4+1+1+1+9+1+1 = 18
Variance = 18/7 = 2.571
Standard deviation = √2.571 = 1.604

The most important distribution studied in statistics is the normal distribution. It is characterised by two parameters, μ and σ, where μ is the mean and σ is the standard deviation. This distribution is symmetric about the mean μ. The empirical rule states how much of a normal distribution lies within a given number of standard deviations of the mean:
Approximately 68.27% of the observations lie within 1 standard deviation of the mean.
Approximately 95.45% of the observations lie within 2 standard deviations of the mean.
Approximately 99.73% of the observations lie within 3 standard deviations of the mean.
The standard deviation is thus the unit in which distance from the mean is measured.

The z score corresponding to a value x is defined as (x - μ)/σ, where μ is the mean and σ is the standard deviation. If the variable x follows the normal distribution with mean μ and standard deviation σ, then the z score follows the standard normal distribution, which is the normal distribution with mean 0 and standard deviation 1.

The normal distribution is very important because of the central limit theorem. The central limit theorem says that whatever the distribution of the population (provided its variance is finite), the sample mean is approximately normally distributed if the sample size is sufficiently large.

3. Can a z score be negative? Why or why not?

Yes. The z score, z = (x - μ)/σ, can take any real value. Since σ is positive, the sign of z is the sign of x - μ. If x is less than the mean μ, x - μ is negative and z is negative. If x is equal to the mean μ, x - μ is 0 and z is 0. If x is greater than the mean μ, x - μ is positive and z is positive.

4. When is it more appropriate to use the standard deviation of data rather than the variance of data? Is one a better measure of dispersion than the other? Explain and provide an example.

The standard deviation and the variance are not really two different measures of dispersion. The variance is the square of the standard deviation. The standard deviation is the square root of the variance.
In our example given above, variance = 2.571 and standard deviation = 1.604. The square of 1.604 is approximately 2.571, and the square root of 2.571 is approximately 1.604. If we know the value of the variance, then we know the value of the standard deviation, and vice versa.

Since the variance is the mean of the squared deviations, the unit of the variance is the square of the unit of the data, whereas the unit of the standard deviation is the same as the unit of the data. For example, if the data are given in inches, the variance is in squared inches and the standard deviation is in inches. For this reason the standard deviation is usually the more convenient of the two when describing the spread of the data, because it can be compared directly with the observations and the mean.
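
The calculations in these answers can be checked with a short Python sketch. It uses the standard library's statistics module and assumes, as the answers do, that the variance divides by the number of observations (the population variance). The variable names are illustrative only.

```python
import statistics

# The data used throughout the answers.
data = [4, 5, 7, 5, 9, 5, 7]

mean = statistics.mean(data)            # sum of observations / number of observations
median = statistics.median(data)        # middle value of the sorted data
variance = statistics.pvariance(data)   # population variance: divides by n
std_dev = statistics.pstdev(data)       # square root of the population variance

# Replace the last observation (7) with the extreme value 25.
skewed = data[:-1] + [25]
skewed_mean = statistics.mean(skewed)
skewed_median = statistics.median(skewed)

# z score of each observation: (x - mean) / standard deviation.
z_scores = [(x - mean) / std_dev for x in data]

print("mean:", mean, "median:", median)
print("variance:", round(variance, 3), "standard deviation:", round(std_dev, 3))
print("with extreme value -> mean:", round(skewed_mean, 2), "median:", skewed_median)
print("z scores:", [round(z, 3) for z in z_scores])
```

Observations below the mean (such as 4 and 5) get negative z scores, and observations above the mean (such as 7 and 9) get positive ones.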