Review

Center of the distribution:
• Mean: the sum of the observations divided by the number of observations
  $\bar{x} = \frac{\sum x}{n}$
• Median: the midpoint of the observations when they are ordered from the smallest to the largest (or from the largest to the smallest)

Spread of the distribution:
For quantitative variables, do the observations take similar values, or are they quite spread out?
• Range: the difference between the largest and the smallest observations. The range is not resistant to outliers, and it ignores the numerical values of nearly all the data.
• Deviation from the mean: the difference between an observation and the sample mean
  $d = x - \bar{x}$
• The sum of the deviations from the sample mean ALWAYS equals 0!
• Variance: the average of the squared deviations
  $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\text{sum of squared deviations}}{\text{sample size} - 1}$
  Attention: the denominator is n − 1, not n!
• Standard deviation: the square root of the variance
  $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
• We'll regard the standard deviation s as a typical distance of an observation from the mean.
• The larger s is, the greater the spread of the data.

The first homework is graded on a scale of 0 to 10. The mean score is 8.2. Which value is most plausible for the standard deviation s: 0, 1.2, 4.9 or -1?

Properties of the Standard Deviation, s:
• The greater the spread of the data, the larger the value of s.
• s = 0 only when all observations take the same value.
• s can be influenced by outliers.

Empirical Rule for bell-shaped distributions:
• Bell-shaped distribution: a distribution that is unimodal and approximately symmetric.
• About 68% of the observations fall within 1 standard deviation of the mean, that is, between x̄ − s and x̄ + s (denoted x̄ ± s).
• About 95% of the observations fall within 2 standard deviations of the mean (x̄ ± 2s).
• All or nearly all observations fall within 3 standard deviations of the mean (x̄ ± 3s).
• The Empirical Rule doesn't work for highly skewed distributions.
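
The definitions above can be checked numerically. The following is a minimal Python sketch, not part of the course material: the data set `data` is made up for illustration, and the helper names `summarize` and `fraction_within` are hypothetical. It computes the mean, median, range, variance (with the n − 1 denominator) and standard deviation, confirms that the deviations sum to 0, and counts the share of observations within k standard deviations of the mean.

```python
import math
import statistics


def summarize(data):
    """Compute the center and spread measures for a list of numbers."""
    n = len(data)
    mean = sum(data) / n
    # Deviation of each observation from the sample mean: d = x - mean
    deviations = [x - mean for x in data]
    # Sample variance: sum of squared deviations divided by n - 1 (not n)
    variance = sum(d ** 2 for d in deviations) / (n - 1)
    return {
        "mean": mean,
        "median": statistics.median(data),
        "range": max(data) - min(data),
        "sum_of_deviations": sum(deviations),
        "variance": variance,
        "std_dev": math.sqrt(variance),
    }


def fraction_within(data, k):
    """Fraction of observations within k standard deviations of the mean."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # also uses the n - 1 denominator
    inside = [x for x in data if mean - k * s <= x <= mean + k * s]
    return len(inside) / len(data)


# Hypothetical data set, chosen only so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]

summary = summarize(data)
for name, value in summary.items():
    print(f"{name}: {value}")

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(data, k):.0%}")
```

With only eight observations, the shares within 1, 2 and 3 standard deviations will match the Empirical Rule only roughly. The rule describes large, bell-shaped data sets rather than small samples like this one.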