
center of the distribution:
• Mean: the sum of the observations divided by the number of
observations
$\bar{x} = \frac{\sum x}{n}$
• Median: the midpoint of the observations when they are
ordered from the smallest to the largest (or from the largest
to the smallest)
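As a minimal sketch of the two definitions above, the following Python snippet computes the mean and the median of a small sample. The scores are hypothetical (they do not come from the slides), and averaging the two middle values when the number of observations is even is the usual convention, assumed here.

```python
# Hypothetical sample of seven homework scores (illustrative only).
scores = [3, 7, 8, 8, 9, 10, 10]

# Mean: sum of the observations divided by the number of observations.
x_bar = sum(scores) / len(scores)


def sample_median(values):
    """Midpoint of the ordered observations.

    For an even number of observations, the two middle values are
    averaged (a common convention, assumed here).
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print("mean:", round(x_bar, 2))
print("median:", sample_median(scores))
```

In this sample the low score of 3 pulls the mean below the median, which is one reason to report both.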
For quantitative variables,
• Do the observations take similar values, or are they quite
spread out?
spread of the distribution
• Range: the difference between the largest and the smallest
observations
The range is not resistant to outliers, and it ignores the numerical
values of nearly all the data.
• deviation from the mean:
the difference between the observation and the sample mean
d = x − x̄
• The sum of the deviations from the sample mean ALWAYS
equals 0, because $\sum (x - \bar{x}) = \sum x - n\bar{x} = 0$.
• variance: the average of the squared deviations
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\text{sum of squared deviations}}{\text{sample size} - 1}$
Attention: the denominator is n − 1, not n!
• standard deviation: square root of the variance
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
• We’ll regard the standard deviation s as a typical distance of
an observation from the mean.
• The larger s is, the greater the spread of the data.
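The measures of spread above can be traced step by step in a short Python sketch. The five data values are hypothetical, chosen so that the arithmetic is easy to follow by hand. The call to `statistics.stdev` is included only as a cross-check, since it also uses the n − 1 denominator.

```python
import math
import statistics

# Hypothetical sample (illustrative only).
data = [4, 6, 7, 9, 14]
n = len(data)
x_bar = sum(data) / n  # sample mean: 40 / 5 = 8.0

# Range: largest observation minus smallest observation.
data_range = max(data) - min(data)

# Deviations from the mean; they always sum to 0.
deviations = [x - x_bar for x in data]

# Variance: sum of squared deviations divided by (n - 1), not n.
sum_sq = sum(d ** 2 for d in deviations)
variance = sum_sq / (n - 1)

# Standard deviation: square root of the variance.
s = math.sqrt(variance)

print("range:", data_range)                    # 10
print("sum of deviations:", sum(deviations))   # 0.0
print("variance:", variance)                   # 14.5
print("standard deviation:", round(s, 2))      # 3.81
print("matches statistics.stdev:", math.isclose(s, statistics.stdev(data)))
```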
The first homework is graded on a scale of 0 to 10. The mean
score is 8.2.
Which value is most plausible for the standard deviation s: 0, 1.2,
4.9 or -1?
Properties of the Standard Deviation, s:
• The greater the spread of the data, the larger is the value of s.
• s = 0 only when all observations take the same value.
• s can be influenced by outliers.
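A small Python sketch can illustrate two of these properties: s is 0 when every observation is the same, and a single outlier can inflate s considerably. All three data sets are hypothetical.

```python
from statistics import stdev

# Hypothetical data sets (illustrative only).
same = [7, 7, 7, 7, 7]           # every observation takes the same value
typical = [6, 7, 7, 8, 8, 9]     # modest spread
with_outlier = typical + [40]    # the same data plus one extreme value

s_same = stdev(same)
s_typical = stdev(typical)
s_outlier = stdev(with_outlier)

print("all values equal:", s_same)
print("typical data:", round(s_typical, 2))
print("with one outlier:", round(s_outlier, 2))
```

Because s is built from squared deviations, one observation far from the mean contributes heavily to the total.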
• bell-shaped distribution: a distribution that is unimodal and
approximately symmetric
Empirical Rule for bell-shaped distributions:
• About 68% of the observations fall within 1 standard deviation of
the mean, that is, between x̄ − s and x̄ + s (denoted x̄ ± s).
• About 95% of the observations fall within 2 standard deviations of
the mean (x̄ ± 2s).
• All or nearly all observations fall within 3 standard deviations
of the mean (x̄ ± 3s).
• The Empirical Rule does not apply to highly skewed distributions.
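The Empirical Rule can be checked by simulation. The sketch below is only illustrative: it assumes a normal distribution with mean 50 and standard deviation 10 as the bell-shaped example, and an exponential distribution as the highly skewed example. Neither choice comes from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

# Assumed examples: normal data for "bell-shaped", exponential for "skewed".
bell = [random.gauss(50, 10) for _ in range(10_000)]
skewed = [random.expovariate(1.0) for _ in range(10_000)]


def share_within(sample, k):
    """Fraction of observations within k standard deviations of the mean."""
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)
    lo, hi = x_bar - k * s, x_bar + k * s
    return sum(lo <= x <= hi for x in sample) / len(sample)


for k in (1, 2, 3):
    print(f"bell-shaped, within {k}s:", round(share_within(bell, k), 3))

# For the skewed sample the share within 1s is well above 68%.
print("skewed, within 1s:", round(share_within(skewed, 1), 3))
```

For the bell-shaped sample the three shares should land close to 68%, 95%, and nearly 100%, while the skewed sample departs noticeably from the 68% figure.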