Study Guide for Chapter 4

AP Statistics – Chapter 4 Topics
Textbook Assignments:
a. 1, 3, 6, 9, 11, 12
b. 17, 19, 22, 23, 26, 28
c. 29, 32, 33, 35
d. 37, 39, 42, 43, 44, 50 (38 in class)
Objectives for chapter 4:
1. Find measures of central tendency.
2. Combine means from multiple samples to get the mean for the combined sample.
3. Find the trimmed mean.
4. Know whether the mean or median is the better measure of center for a given situation.
5. Know what resistance and sensitivity mean.
6. Find the standard deviation and the variance and interpret these values.
7. Determine how “spread out” a distribution is using the mean and standard deviation.
8. Empirical Rule – Use it to find percentiles and percentages between values of a normal
distribution.
9. Chebyshev’s Formula – Use it to estimate percentages between values of any
distribution.
10. Know the requirements for using Empirical Rule.
11. Know what values are considered unusually high or unusually low for normally
distributed data.
12. Create a boxplot and a modified boxplot.
13. Know what the numbers in the 5 number summary mean and interpret them.
14. Draw comparative boxplots and write comparisons between two sets of data.
15. Know the IQR rule for identifying outliers and use it.
16. Know which plots are good for numerical univariate data vs. categorical univariate data
vs. numerical bivariate data.
17. Know the relationship between quartiles and percentiles and Empirical Rule.
18. Given a set of data find approximate percentiles.
19. Don’t forget C.U.S.S.
Exploratory Data Analysis:
Using Descriptive Statistics and Graphs to Draw Conclusions About Data and Make
Comparisons Between Multiple Data Sets
Measures of Central Tendency
Mean – sample mean vs. population mean (x̄ vs. μ)
Median – sample median vs. population median
Mode
Midrange and Trimmed Mean
Measures of Spread
Standard deviation is the square root of the variance. (s and s² vs. σ and σ²)
Range
Measures of Position
Quartiles
Minimum and Maximum
Z-scores
City Temperatures
Average monthly temperatures for Raleigh and San Francisco
               Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Raleigh         39   42   50   59   67   74   78   77   71   60   51   43
San Francisco   49   52   53   56   58   62   63   64   65   61   55   49
                     Raleigh:     San Francisco:
Mean                 ________     ________
Median               ________     ________
Standard Deviation   ________     ________
Minimum              ________     ________
Q1                   ________     ________
Q3                   ________     ________
Maximum              ________     ________
Which descriptive statistic tells the most about how these two sets of data compare? What
conclusions could be drawn?
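One way to fill in the table above is sketched below with Python's standard library. The function names are illustrative, and the quartiles assume the "median of each half" convention common in introductory texts; calculators and software may use other conventions and give slightly different Q1 and Q3.

```python
from statistics import mean, median, stdev

raleigh = [39, 42, 50, 59, 67, 74, 78, 77, 71, 60, 51, 43]
san_francisco = [49, 52, 53, 56, 58, 62, 63, 64, 65, 61, 55, 49]


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when n is odd, the middle value belongs to neither half.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[-half:])


def summarize(values):
    """Collect the descriptive statistics asked for in the table."""
    q1, q3 = quartiles(values)
    return {
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation (divides by n - 1)
        "min": min(values),
        "q1": q1,
        "q3": q3,
        "max": max(values),
    }


for city, temps in (("Raleigh", raleigh), ("San Francisco", san_francisco)):
    print(city, summarize(temps))
```

Comparing the two standard deviations (or the two interquartile ranges) is a reasonable starting point for the question above, since the centers are close but the spreads are not.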
Five Number Summaries and the Box-and-Whisker Plot:
A quick way to compare center, spread and distribution between multiple sets of data, the Five
Number Summary consists of:
•Minimum (Min)
•Quartile 1 (Q1)
•Median (Med), also called “Quartile 2”
•Quartile 3 (Q3)
•Maximum (Max)
Behold! The Box-and-Whisker Plot
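[Figure: box-and-whisker plot drawn above an axis running from 30 to 70, with Min, Q1, Med, Q3 and Max labeled. Each of the four sections (Min to Q1, Q1 to Med, Med to Q3, Q3 to Max) contains 25% of the data. Values printed with the figure: 30 38 40 42 45 54 62]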
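A minimal sketch of computing a five number summary follows. It assumes the seven values printed with the plot (30 38 40 42 45 54 62) are the plotted data, which the handout does not state, and it uses the median-of-halves convention for quartiles. The function name is illustrative.

```python
from statistics import median


def five_number_summary(values):
    """Return (Min, Q1, Med, Q3, Max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    with the middle value left out of both halves when n is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return (
        ordered[0],
        median(ordered[:half]),
        median(ordered),
        median(ordered[-half:]),
        ordered[-1],
    )


# Assumed to be the data behind the plot
values = [30, 38, 40, 42, 45, 54, 62]
print(five_number_summary(values))  # (30, 38, 42, 54, 62)
```

The box of the plot spans Q1 to Q3, the line inside the box marks the median, and the whiskers reach out to the minimum and maximum.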
Word Lengths (Adapted from Workshop Statistics: Discovery with Data; Rossman and
Chance)
The first sentence of the first chapter of one of my favorite books reads, “You are
starting the study of one of the most interesting and useful branches of
mathematics.” The following table lists the lengths of the 16 words used in that
sentence.
3   3   3   4   8   11   3   3   5   6   2   8   3   2   2   11
Compare the distribution of word lengths in the first sentence of my favorite book
with the distribution of word lengths in a sentence or two from one of your favorite
writers.
Write a paragraph describing your findings. Use appropriate descriptive statistics
and graphical displays.
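As a starting point for the comparison, the sketch below tallies the 16 word lengths from the table, prints a simple text dotplot, and computes the measures of center. The variable names are illustrative; a second list for your own sentence would be summarized the same way.

```python
from collections import Counter
from statistics import mean, median, mode

# Word lengths from the table above
word_lengths = [3, 3, 3, 4, 8, 11, 3, 3, 5, 6, 2, 8, 3, 2, 2, 11]

# Text dotplot: one star per word of each length
counts = Counter(word_lengths)
for length in range(min(word_lengths), max(word_lengths) + 1):
    print(f"{length:>2} | {'*' * counts[length]}")

print("mean:", mean(word_lengths))      # 4.8125
print("median:", median(word_lengths))  # 3.0
print("mode:", mode(word_lengths))      # 3
```

The mean sits well above the median here because a few long words (8 and 11 letters) pull it upward, which is worth mentioning in the paragraph.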
Z-scores
A z-score is a measure of the number of standard deviations between a value in a set of data and the mean. A positive z-score indicates the value is above the mean; a negative z-score indicates it is below.

z = (x − x̄) / s

x = your score
x̄ = sample mean score
s = sample standard deviation
For San Francisco, the average temperature for June, 62, has a z-score of

z = (62 − 57.25) / 5.75 ≈ 0.826

This indicates that the average temperature for June was above the average temperature for the year by 0.826 standard deviations.
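The same calculation can be sketched in Python. The helper name `z_score` is illustrative. Because the code keeps the unrounded standard deviation (about 5.754), it gives roughly 0.825 rather than the 0.826 obtained with s rounded to 5.75.

```python
from statistics import mean, stdev

san_francisco = [49, 52, 53, 56, 58, 62, 63, 64, 65, 61, 55, 49]


def z_score(x, values):
    """Number of sample standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean(values)) / stdev(values)


june = san_francisco[5]  # June is the sixth month in the list
z = z_score(june, san_francisco)
print(f"z = {z:.3f}")
```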
Empirical Rule
If a data set is approximately normal, about 95% of the data will be within 2 standard deviations of the mean. A data set is approximately normal when its distribution is roughly bell-shaped and symmetrical, or when its boxplot is roughly symmetrical with relatively short whiskers in comparison to the size of the box (long whiskers in relation to the box indicate outliers). In this case, z-scores greater than 2 or less than -2 are associated with values of x that are considered unusual.
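The percentages behind the Empirical Rule can be checked against an exact normal model. The sketch below uses `statistics.NormalDist` (Python 3.8+); the helper names `share_within` and `is_unusual` are illustrative, and the 2-standard-deviation cutoff is the one stated above.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def is_unusual(x, mean, sd):
    """Flag x as unusual when its z-score is beyond +/-2 (the cutoff used above)."""
    return abs((x - mean) / sd) > 2


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The printed shares are close to the familiar 68%, 95% and 99.7%, which is why the rule only applies when the data are approximately normal.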
Commuting Times (Adapted from Introduction to Statistics and Data Analysis; Peck,
Olsen, Devore)
A teacher, who lives a few miles outside of Mustang, records the time he takes to
drive to the school each morning. Given below are the commute times for 20
mornings.
7.92    8.42    8       7.75    8.08
7.42    8.42    6.75    8.75    7.42
8.08    8.5     9.75    8.67    8.33
10.17   7.83    7.83    7.92    8.58
a) Give the five number summary for this data.
b) Draw the boxplot for this data set. Is it roughly symmetric? Are any
outliers indicated?
c) The three unusual observations can be explained. The low time was a day
the seniors were not on campus (less traffic) and the two high times reflect
days when the highway was being resurfaced. Remove these values and
recalculate the mean and standard deviation.
d) Based on this sample, would a commute time of 8.5 minutes be considered
unusual for a typical day? What about a commute of 7 minutes?
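The IQR rule for identifying outliers (values more than 1.5 × IQR below Q1 or above Q3) can be sketched on the commute times as follows. The quartiles assume the median-of-halves convention, and the function and variable names are illustrative.

```python
from statistics import mean, median, stdev

commute_times = [
    7.92, 8.42, 8, 7.75, 8.08, 7.42, 8.42, 6.75, 8.75, 7.42,
    8.08, 8.5, 9.75, 8.67, 8.33, 10.17, 7.83, 7.83, 7.92, 8.58,
]


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])   # assumption: median of the lower half
    q3 = median(ordered[-half:])  # assumption: median of the upper half
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in ordered if x < low_fence or x > high_fence]


outliers = iqr_outliers(commute_times)
print("outliers:", outliers)

# Part (c): recompute the mean and standard deviation without the outliers
typical = [x for x in commute_times if x not in outliers]
print("mean:", round(mean(typical), 3), "sd:", round(stdev(typical), 3))
```

With these data the rule flags one low time and two high times, matching the three unusual observations described in part (c).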
Who did better?
Z-scores allow us to compare relative position within a distribution
between two or more sets of data. However, this only works well if both
sets of data are at least approximately normally distributed.
Exam Day! (1997 College Board AP Statistics exam question)
At a college the scores on the chemistry final exam are approximately normally
distributed, with a mean of 75 and a standard deviation of 12. The scores on the
calculus final are also approximately normally distributed, with a mean of 80 and a
standard deviation of 8. A student scored 81 on the chemistry final and 84 on the
calculus final. Relative to the students in each respective class, in which subject
did this student do better?
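A comparison like this comes down to one z-score per exam. The sketch below uses the means and standard deviations given in the question; the function name is illustrative.

```python
def z_score(x, mean, sd):
    """Standard deviations between x and the mean of its distribution."""
    return (x - mean) / sd


chemistry_z = z_score(81, mean=75, sd=12)
calculus_z = z_score(84, mean=80, sd=8)

print("chemistry:", chemistry_z)  # 0.5
print("calculus:", calculus_z)    # 0.5
```

Whichever exam has the larger z-score is the one where the student stood higher relative to classmates; equal z-scores indicate the same relative standing.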
DIRECTIONS: For multiple-choice questions, circle the best response. For the Free Response questions, be explicit in your answers.
A survey was conducted to gather ratings of the quality of service at local restaurants at a nearby
mall. Respondents were to rate overall service using values between 0 (terrible) and 100
(excellent). The following stem plot represents the data.
Stem | Leaves
3 |3 4
4 |0 1 3 4 7 7 7 9
5 |0 1 1 3 4 4 5
6 |2 4 5 5 8
7 |3 7
8 |
9 |5
1. What percent of the respondents rated quality as moderate to very poor (rating of 60 to 0)?
a) 32%
b) 50%
c) 68%
d) 75%
e) none of these
2. The median response was
a) 49
b) 50
c) 51
d) 52
e) 53
3. The mean of these data is
a) equal to the median.
b) less than the median.
c) greater than the median.
d) an integer.
e) cannot be determined.
4. The value of 33 is
a) the minimum but not an outlier.
b) the minimum and an outlier.
c) one of three outliers.
d) not a data value.
e) none of these.
5. Sketch and label the five number summary on a box plot of these data.
6. In skewed-left distributions, what is most frequently the relationship of the mean, median,
and mode?
a) mean > mode > median
b) median > mode > mean
c) mode > mean > median
d) mode > median > mean
e) mean > median > mode
7. If the mean of 60 values is 52.6 pounds and the mean of 40 values is 48.4 pounds, find the
mean of all 100 values.
8. Of 1000 high school students whose mean height is 67.8 inches, 350 were girls. If the mean
height of the girls was 63.0 inches, what is the mean height of the boys?
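Combining group means works by weighting each mean by its group size, and the same relationship can be run backwards to recover a missing group mean. The sketch below applies both directions to the numbers in the two questions above; the function names are illustrative.

```python
def combined_mean(groups):
    """Mean of the pooled data, given (count, mean) pairs for each group."""
    total_count = sum(n for n, _ in groups)
    total_sum = sum(n * m for n, m in groups)  # n * mean recovers each group's sum
    return total_sum / total_count


def missing_group_mean(total_n, overall_mean, known_n, known_mean):
    """Mean of the remaining group, given the overall mean and one group's mean."""
    remaining_sum = total_n * overall_mean - known_n * known_mean
    return remaining_sum / (total_n - known_n)


overall = combined_mean([(60, 52.6), (40, 48.4)])
boys = missing_group_mean(1000, 67.8, 350, 63.0)

print(round(overall, 2))  # 50.92
print(round(boys, 2))     # 70.38
```

Simply averaging the two group means would be wrong whenever the groups differ in size, which is the point of the weighting.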
9. Which grade is better: a 78 on a test whose mean is 70 and standard deviation is 8.5, or an
83 on a test whose mean is 77 and standard deviation is 6.4? Justify your answer.
Use the table of test grades for two classes A and B to answer the following questions.
A
B
A
B
A
B
A
B
97
100
75
78
91
74
95
75
86
65
82
98
65
82
45
98
74
75
81
65
62
83
10. Calculate the five-number summary and the mean and standard deviation for each set of
grades.
11. Describe the benefits of each set of summary statistics for these data.
12. Construct parallel box plots for the data from the two classes.
Which class did better based on your plots and calculations? Defend your opinion.
Match each histogram with its boxplot, by writing the letter of the boxplot in the space provided.
[Figures: histograms 1 through 5 and boxplots A through E]
1. _______
2. _______
3. _______
4. _______
5. _______