1.3 Describing Quantitative Data with Numbers
Measuring Center: The Mean
1. The most common measure of center is the arithmetic average
2. NOT a resistant measure
3. Interpreted as the “average” value
\[ \bar{x} = \frac{\text{sum of observations}}{n} = \frac{\sum x_i}{n} \]
Measuring Center: The Median
1. The median M is the midpoint of the distribution
2. To find the median of a distribution:
a. Arrange all the observations from smallest to largest
b. If the number of observations is odd, the median M is the center observation in the
ordered list
c. If the number of observations is even, the median M is the average of the two center
observations in the ordered list
3. The median is a resistant measure
4. Interpreted as the “typical” value
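The steps for finding the median can be sketched in Python. This is only an illustration: the function name `median_by_hand` and the sample lists are made up for the example and are not part of the notes.

```python
def median_by_hand(values):
    """Median via the procedure above: sort, then take the center observation(s)."""
    ordered = sorted(values)                        # step a: smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                                  # step b: odd count, one center value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2    # step c: even count, average the two center values

# Hypothetical data, chosen only to exercise both cases
odd_example = median_by_hand([7, 1, 5])        # ordered: 1, 5, 7
even_example = median_by_hand([7, 1, 5, 3])    # ordered: 1, 3, 5, 7
print(odd_example, even_example)               # 5 4.0
```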
Comparing the Mean and Median
1. The mean and median of a roughly symmetric distribution are close together
2. If the distribution is exactly symmetric, the mean and median are exactly the same
3. In a skewed distribution, the mean is usually farther out in the long tail than is the median
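A small experiment illustrates why the mean is not resistant while the median is. The data are hypothetical; the sketch uses Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical, exactly symmetric data
symmetric = [2, 4, 5, 6, 8]
# The same data with one large value added, creating a long right tail
right_skewed = symmetric + [40]

# Symmetric case: mean and median are the same
print(mean(symmetric), median(symmetric))
# Skewed case: the mean is pulled toward the tail much more than the median
print(mean(right_skewed), median(right_skewed))
```

Here a single extreme observation more than doubles the mean, while the median moves only from 5 to 5.5.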
Example #9
Here are data for the amount of fat (in grams) in McDonald’s beef sandwiches:
Sandwich                               Fat (g)
Hamburger                                 9
Cheeseburger                             12
Double Cheeseburger                      23
McDouble                                 19
Quarter Pounder                          19
Quarter Pounder with Cheese              26
Double Quarter Pounder with Cheese       42
Big Mac                                  29
Big N’ Tasty                             24
Big N’ Tasty with Cheese                 28
McRib                                    26
Mac Snack Wrap                           19
Angus Bacon & Cheese                     39
Angus Deluxe                             39
Angus Mushroom & Swiss                   40
a) Create a stemplot of the data
b) Find the mean amount of fat for all 15 beef sandwiches
c) The three Angus burgers are relatively new additions to the menu. By how much did the average
increase when they were added?
d) Find and interpret the median
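Parts b) through d) can be checked with a short Python sketch. It assumes the fat values are paired with the sandwiches in the order listed, so that the last three values belong to the Angus burgers; the variable names are illustrative. The stemplot in part a) is not covered here.

```python
from statistics import mean, median

# Fat (g) in table order; the last three entries are the Angus burgers
fat = [9, 12, 23, 19, 19, 26, 42, 29, 24, 28, 26, 19, 39, 39, 40]

mean_all = mean(fat)                       # part b: 394 / 15
mean_without_angus = mean(fat[:-3])        # the 12 older sandwiches: 276 / 12
increase = mean_all - mean_without_angus   # part c
median_fat = median(fat)                   # part d: 8th of the 15 ordered values

print(round(mean_all, 2), mean_without_angus, round(increase, 2), median_fat)
```

The mean is about 26.27 g for all 15 sandwiches against 23 g without the Angus burgers, an increase of about 3.27 g. The median is 26 g: about half of the sandwiches have less fat than that and about half have more.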
Measuring Spread: The Interquartile Range (IQR)
1. Measures the range of the middle 50% of the data
2. How to calculate Quartiles Q1 and Q3 and the IQR:
a. Arrange the observations in increasing order and locate the median M
b. The first quartile Q1 is the median of the observations to the left of M
c. The third quartile Q3 is the median of the observations to the right of M
d. Important: Be sure to leave out the median when you locate the quartiles
IQR = Q3 - Q1
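The quartile procedure can be sketched as a small Python function. The name `quartiles` and the sample data are hypothetical. Note that library routines such as `statistics.quantiles` use interpolation rules that can give slightly different quartiles from the method in these notes, which is why the halves are split by hand here.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, M, Q3); the median is left out of both halves when n is odd.

    Assumes at least two observations.
    """
    ordered = sorted(values)
    half = len(ordered) // 2      # size of each half; the middle value is skipped for odd n
    lower = ordered[:half]        # observations to the left of M
    upper = ordered[-half:]       # observations to the right of M
    return median(lower), median(ordered), median(upper)

# Hypothetical data with an odd count: the median 7 belongs to neither half
q1, m, q3 = quartiles([1, 3, 5, 7, 9, 11, 13])
iqr = q3 - q1
print(q1, m, q3, iqr)   # 3 7 11 8
```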
Identifying Outliers:
1. Call an observation an outlier if it falls more than 1.5 × IQR above the third quartile or below the
first quartile.
Example #10
a) Refer to Example #9 to find and interpret Q1, Q3, and the interquartile range (IQR)
b) Determine whether a beef sandwich with 68 grams of fat is an outlier
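A sketch of both parts in Python, using the fat data from Example #9 and the 1.5 × IQR rule. The variable names are illustrative.

```python
from statistics import median

fat = sorted([9, 12, 23, 19, 19, 26, 42, 29, 24, 28, 26, 19, 39, 39, 40])
half = len(fat) // 2            # 7 values on each side of the median
q1 = median(fat[:half])         # median of the 7 smallest values
q3 = median(fat[-half:])        # median of the 7 largest values
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
candidate = 68                  # the sandwich asked about in part b
is_outlier = candidate < lower_fence or candidate > upper_fence

print(q1, q3, iqr, lower_fence, upper_fence, is_outlier)
```

With Q1 = 19 and Q3 = 39, the middle half of the sandwiches spans an IQR of 20 g. The upper cutoff is 39 + 1.5(20) = 69 g, so a sandwich with 68 g of fat falls just short of being called an outlier.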
The Five Number Summary:
1. Consists of the smallest observation, the first quartile, the median, the third quartile, and the largest
observation. Combine all five numbers to get a quick summary of both center and spread
Minimum, Q1, M, Q3, Maximum
Boxplots – graphical representation of the five number summary
1. How to make a boxplot
a. A central box is drawn from the first quartile to the third quartile
b. A line in the box marks the median
c. Whiskers extend from the box out to the smallest and largest observations that are NOT
outliers
d. Outliers are marked with stars
Common Errors:
1. Many students refer to the box in a boxplot as the IQR. This is incorrect since the IQR is a number,
not the box itself.
2. When describing the shape of a boxplot, many students will describe a symmetric boxplot as
“approximately Normal.” This is incorrect since a boxplot does not reveal where the modes of a
distribution are.
Example #11
Here are the number of home runs that Hank Aaron hit in each of his 23 seasons:
13  27  26  44  30  39  40  34  45  44  24  32  44
39  29  44  38  47  34  40  20  12  10
a) Make a boxplot for these data
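Before drawing the boxplot, it helps to compute its ingredients: the five-number summary, the outlier cutoffs, and the whisker endpoints. The following Python sketch does this for the home run data; the variable names are illustrative, and the drawing itself is left to paper or a plotting tool.

```python
from statistics import median

home_runs = sorted([13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32, 44,
                    39, 29, 44, 38, 47, 34, 40, 20, 12, 10])
half = len(home_runs) // 2
q1, m, q3 = median(home_runs[:half]), median(home_runs), median(home_runs[-half:])
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [x for x in home_runs if x < low_fence or x > high_fence]
inside = [x for x in home_runs if low_fence <= x <= high_fence]
whiskers = (min(inside), max(inside))   # whiskers stop at the most extreme non-outliers
five_number = (home_runs[0], q1, m, q3, home_runs[-1])

print(five_number, whiskers, outliers)
```

The five-number summary is 10, 26, 34, 44, 47. The cutoffs are 26 - 27 = -1 and 44 + 27 = 71, so there are no outliers and the whiskers run all the way to 10 and 47.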
Measuring Spread: The Standard Deviation
Standard Deviation – measures the typical distance of the observations from their mean
Variance – measures the general dispersion of the data; it is equal to s_x^2, the square of the standard deviation
\[ s_x = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2} \]
s_x - sample
σ_x - population
1. Properties
a. Measures spread about the mean and should only be used when the mean is chosen as
the measure of center
b. Is always greater than or equal to zero; it equals zero only when all observations have the same value
c. Has the same units of measurement as the original observations
d. Standard deviation is not resistant
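The formula and two of the properties can be illustrated with a short Python sketch. The function name `sample_sd` and the data are hypothetical.

```python
from math import sqrt

def sample_sd(values):
    """s_x = sqrt( sum((x_i - mean)^2) / (n - 1) ); needs at least two observations."""
    n = len(values)
    x_bar = sum(values) / n
    squared_deviations = [(x - x_bar) ** 2 for x in values]
    variance = sum(squared_deviations) / (n - 1)
    return sqrt(variance)

# Hypothetical data
no_spread = sample_sd([4, 4, 4, 4])          # identical observations give zero spread
base = sample_sd([4, 5, 6, 7, 8])
with_outlier = sample_sd([4, 5, 6, 7, 80])   # one extreme value replaces the 8
print(no_spread, round(base, 3), round(with_outlier, 3))
```

Replacing a single observation with an extreme value inflates the standard deviation many times over, which is what "not resistant" means in practice.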
Example #12
Here are the foot lengths (in cm) for a random sample of seven 14-year-olds from the UK:
25, 22, 20, 25, 24, 24, 28
a) Calculate and interpret the standard deviation of these data.
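The calculation can be followed step by step in Python. This sketch computes the standard deviation from the definition and cross-checks it against `statistics.stdev`, which also divides by n - 1.

```python
from statistics import mean, stdev

foot_lengths = [25, 22, 20, 25, 24, 24, 28]   # cm

x_bar = mean(foot_lengths)                                # 168 / 7
squared_devs = [(x - x_bar) ** 2 for x in foot_lengths]   # squared deviations from the mean
variance = sum(squared_devs) / (len(foot_lengths) - 1)    # 38 / 6
s_x = variance ** 0.5

assert abs(s_x - stdev(foot_lengths)) < 1e-9              # cross-check against the library
print(x_bar, squared_devs, round(variance, 2), round(s_x, 2))
```

The mean is 24 cm and the standard deviation is about 2.52 cm, so the foot lengths in this sample typically differ from the mean by roughly 2.5 cm.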
Choose a Measure of Center and Spread:
Median and IQR
1. describing skewed data
2. describing data with strong outliers
Mean and standard deviation
1. reasonably symmetric distributions
2. no outliers
Plot your data:
Dotplot, stemplot, histogram
Interpret what you see:
Shape, center, spread, outliers
Choose numerical summary:
Mean and standard deviation, median and IQR?
Example #13
The following data show the number of contacts that a sample of high school students had in their cell phones.
Male     124   41   29   27   44   87   85
         260  290   31  168  169  167  214
         135  114  105  103   96  144
Female    30   83  116   22  173  155  134
         180  124   33  213  218  183  110
a) Do the data give convincing evidence that one gender has more contacts than the other?
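One way to start is to compute a numerical summary for each group. The sketch below rests on an assumption about the data layout: the values are assigned to groups by reading the original table column by column, with three rows of male values (20 observations) and two rows of female values (14 observations). The helper name `summary` is illustrative.

```python
from statistics import mean, median

# ASSUMPTION: group membership recovered by reading the table column by column
male = [124, 41, 29, 27, 44, 87, 85, 260, 290, 31,
        168, 169, 167, 214, 135, 114, 105, 103, 96, 144]
female = [30, 83, 116, 22, 173, 155, 134, 180, 124, 33, 213, 218, 183, 110]

def summary(values):
    """Return (min, Q1, median, Q3, max), leaving the median out of the halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return (ordered[0], median(ordered[:half]), median(ordered),
            median(ordered[-half:]), ordered[-1])

for label, group in (("Male", male), ("Female", female)):
    print(label, len(group), round(mean(group), 2), summary(group))
```

Under this reading, the female median (129) is higher than the male median (109.5), but the middle halves of the two groups overlap heavily (64.5 to 167.5 for males, 83 to 180 for females). The numerical summaries alone therefore do not make a convincing case that either group has more contacts; plotting both distributions is the natural next step.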