AP Statistics Project Part II – Exploring Relationships Between

AP Statistics Project Part I – Exploring and Understanding Data (MODEL ANSWERS)
This project explores the numerical and visual display and analysis of data. It assesses students’ ability to use a graphing calculator to:
Enter and manipulate data;
Complete basic numerical computations (5-number summaries, mean and standard deviation);
Create properly formatted visual displays of data (tables, bar graphs, histograms, box plots, stem-and-leaf plots and normal distribution plots);
Analyze and interpret numerical summaries and visual displays of data.
Scenario and Data
Scores (in percentages) for the first test of the year for two AP Statistics classes are provided below. The
class “First” meets first period of the day, every day of the school year. The second class “Last” meets last
period of the day, every day of the school year. There are 21 students in the First class and 30 students in
the Last class:
First Scores: 98, 52, 92, 92, 60, 66, 90, 86, 86, 70, 72, 84, 82, 82, 82, 82, 74, 74, 76, 80, and 80
(checksum: 1,660).
Last Scores: 96, 95, 92, 54, 57, 58, 86, 85, 82, 82, 82, 60, 66, 66, 66, 68, 80, 80, 77, 76, 76, 75, 74, 74, 74,
73, 72, 72, 71, and 70 (checksum: 2,239).
The AP Statistics teacher would like to conduct an analysis that compares the grades of the classes to
determine if there is a difference in skill level between the students in the two classes.
Section 1: Graphing Calculator (round all decimals to the nearest tenth)
a) Enter the data sets into two columns in the list processor of your graphing calculator. Order the data
from highest to lowest.
b) Use the checksum numbers above to ensure that you entered your data correctly. The sum of each
data set should equal its checksum.
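Steps a) and b) can also be cross-checked away from the calculator. The following is a minimal Python sketch, not part of the original project; the variable names are illustrative assumptions and the scores are copied from the Scenario and Data section.

```python
# Scores copied from the Scenario and Data section.
first = [98, 52, 92, 92, 60, 66, 90, 86, 86, 70, 72,
         84, 82, 82, 82, 82, 74, 74, 76, 80, 80]
last = [96, 95, 92, 54, 57, 58, 86, 85, 82, 82, 82, 60, 66, 66, 66,
        68, 80, 80, 77, 76, 76, 75, 74, 74, 74, 73, 72, 72, 71, 70]

# Step a): order each list from highest to lowest.
first_sorted = sorted(first, reverse=True)
last_sorted = sorted(last, reverse=True)

# Step b): the count and sum should match the class size and checksum.
print(len(first), sum(first))  # 21 1660
print(len(last), sum(last))    # 30 2239
```

If either sum differs from its checksum, at least one score was mistyped.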
c) Use your calculator to find the mean and standard deviation of each class. Insert your answers here:
First Mean: 79.0%
Last Mean: 74.6%
First Standard Deviation: 10.8
Last Standard Deviation: 10.3
Which standard deviation did you record above, s or σ? Explain why: Students should select σ because each data
set is the full population of a class rather than a sample. If they selected s, their answers will be FIRST: 11.0 and LAST: 10.5.
Their answers should include a sentence or two explaining the difference between a population standard
deviation (σ) and a sample standard deviation (s).
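The σ and s values above can be reproduced with a short Python sketch. It is an illustration only and assumes Python's standard `statistics` module in place of the calculator: `pstdev` divides by n (population, σ) and `stdev` divides by n − 1 (sample, s).

```python
from statistics import mean, pstdev, stdev

first = [98, 52, 92, 92, 60, 66, 90, 86, 86, 70, 72,
         84, 82, 82, 82, 82, 74, 74, 76, 80, 80]
last = [96, 95, 92, 54, 57, 58, 86, 85, 82, 82, 82, 60, 66, 66, 66,
        68, 80, 80, 77, 76, 76, 75, 74, 74, 74, 73, 72, 72, 71, 70]

for name, scores in (("First", first), ("Last", last)):
    # mean, population SD (sigma), sample SD (s), each rounded to the tenth
    print(name, round(mean(scores), 1),
          round(pstdev(scores), 1), round(stdev(scores), 1))
# First 79.0 10.8 11.0
# Last 74.6 10.3 10.5
```

The sample value is always slightly larger because the same sum of squared deviations is divided by a smaller number.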
d) Create a 5-number summary for each class, round to 1 decimal place & record your results here:
        Max      Q3       Median   Q1       Min
First   98.0%    86.0%    82.0%    73.0%    52.0%
Last    96.0%    82.0%    74.0%    68.0%    54.0%
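The 5-number summaries can be checked with the sketch below. It assumes the median-of-halves convention for quartiles (Q1 and Q3 are the medians of the lower and upper halves, leaving out the middle value when n is odd), which is the convention graphing calculators commonly use. Spreadsheet functions that interpolate between data points can report slightly different quartiles. The function name is a hypothetical helper.

```python
from statistics import median

first = [98, 52, 92, 92, 60, 66, 90, 86, 86, 70, 72,
         84, 82, 82, 82, 82, 74, 74, 76, 80, 80]
last = [96, 95, 92, 54, 57, 58, 86, 85, 82, 82, 82, 60, 66, 66, 66,
        68, 80, 80, 77, 76, 76, 75, 74, 74, 74, 73, 72, 72, 71, 70]

def five_number_summary(scores):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(scores)
    half = len(data) // 2
    lower = data[:half]              # values below the median position
    upper = data[len(data) - half:]  # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

print(five_number_summary(first))  # min 52, Q1 73, median 82, Q3 86, max 98
print(five_number_summary(last))   # min 54, Q1 68, median 74, Q3 82, max 96
```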
e) Identify any outlier(s) in these data sets. Explain why they’re outliers.
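Using the 1.5 × IQR rule, 52% from FIRST is the only outlier in either data set: the FIRST lower fence is Q1 − 1.5 × IQR = 73 − 1.5(13) = 53.5, and 52 falls below it. No LAST score falls outside its fences.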
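The outlier check can be sketched as follows. This is an illustration that assumes the median-of-halves quartiles a graphing calculator typically reports; `outliers` is a hypothetical helper name.

```python
from statistics import median

first = [98, 52, 92, 92, 60, 66, 90, 86, 86, 70, 72,
         84, 82, 82, 82, 82, 74, 74, 76, 80, 80]
last = [96, 95, 92, 54, 57, 58, 86, 85, 82, 82, 82, 60, 66, 66, 66,
        68, 80, 80, 77, 76, 76, 75, 74, 74, 74, 73, 72, 72, 71, 70]

def outliers(scores):
    """Return the scores lying more than 1.5 * IQR beyond Q1 or Q3."""
    data = sorted(scores)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[len(data) - half:])
    step = 1.5 * (q3 - q1)
    low_fence, high_fence = q1 - step, q3 + step
    return [x for x in data if x < low_fence or x > high_fence]

print(outliers(first))  # [52]
print(outliers(last))   # []
```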
f) Create modified box plots of the data for each class. Sketch the box plots, side-by-side, here:
See EXCEL output.
How is a “modified” box plot different from a regular box plot?
In a modified box plot, outliers are plotted as individual points (e.g. ‘X’s) and each whisker ends at
the most extreme value that is not an outlier, rather than at the min or max.
Explain in a sentence or two the usefulness of a box plot when analyzing data.
A box plot is particularly effective when comparing the spreads of two sets of data.
g) Create (by hand) back-to-back stem-and-leaf plots for the data sets; split your stems if necessary.
See EXCEL output.
Explain in a sentence or two the particular usefulness of a stem-and-leaf plot in analyzing data.
A stem-and-leaf plot is particularly effective because it allows you to look at the entire set of
data (each individual data point is exposed).
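For checking a hand-drawn plot, a back-to-back stem-and-leaf display can be generated with the sketch below. It is an illustration only: it assumes stems are the tens digits and does not split stems, and `back_to_back` is a hypothetical helper name.

```python
first = [98, 52, 92, 92, 60, 66, 90, 86, 86, 70, 72,
         84, 82, 82, 82, 82, 74, 74, 76, 80, 80]
last = [96, 95, 92, 54, 57, 58, 86, 85, 82, 82, 82, 60, 66, 66, 66,
        68, 80, 80, 77, 76, 76, 75, 74, 74, 74, 73, 72, 72, 71, 70]

def back_to_back(left, right):
    """Build a back-to-back stem-and-leaf plot; stems are the tens digits."""
    everything = left + right
    lines = []
    for stem in range(min(everything) // 10, max(everything) // 10 + 1):
        # Left-hand leaves are sorted descending so they grow outward from the stem.
        left_leaves = "".join(
            str(x % 10) for x in sorted(left, reverse=True) if x // 10 == stem)
        right_leaves = "".join(
            str(x % 10) for x in sorted(right) if x // 10 == stem)
        lines.append(f"{left_leaves:>10} | {stem} | {right_leaves}")
    return lines

print("     First |   | Last")
for line in back_to_back(first, last):
    print(line)
```

Every leaf is a single score, so the row lengths show the shape of each class while keeping each data point visible.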
h) Make a frequency table of As (90-100), Bs (80-89), Cs (70-79), Ds (60-69) and Fs (<60) for each
class, then create a histogram for each class.
Grades       FIRST   LAST
A (90 +)     4       3
B (80-89)    9       7
C (70-79)    5       12
D (60-69)    2       5
F (< 60)     1       3
See EXCEL output for histograms.
Explain in a sentence or two the particular usefulness of frequency tables and histograms when
analyzing data.
Frequency tables tabulate the count in each class interval, which is what a histogram displays. Histograms are
particularly effective when analyzing data because they give an excellent visual of the shape of the
distribution.
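The frequency counts in part h) can be tallied with the sketch below. It is an illustration only; `letter_grade` is a hypothetical helper that applies the grade bands stated in the question.

```python
from collections import Counter

first = [98, 52, 92, 92, 60, 66, 90, 86, 86, 70, 72,
         84, 82, 82, 82, 82, 74, 74, 76, 80, 80]
last = [96, 95, 92, 54, 57, 58, 86, 85, 82, 82, 82, 60, 66, 66, 66,
        68, 80, 80, 77, 76, 76, 75, 74, 74, 74, 73, 72, 72, 71, 70]

def letter_grade(score):
    """Map a percentage score to the letter bands used in part h)."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"

first_counts = Counter(letter_grade(s) for s in first)
last_counts = Counter(letter_grade(s) for s in last)
for grade in "ABCDF":
    print(grade, first_counts[grade], last_counts[grade])
```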
i) Create a table that displays marginal grade summaries and marginal distributions.
Grades       FIRST Frequency   LAST Frequency   Totals   Marginal Distribution (%)
A (90 +)     4                 3                7        14%
B (80-89)    9                 7                16       31%
C (70-79)    5                 12               17       33%
D (60-69)    2                 5                7        14%
F (< 60)     1                 3                4        8%
Totals       21                30               51       100%
In a sentence or two, comment on the marginal distribution percentages that you calculated.
The marginal distributions reveal that 14% of the students overall received As, 31% received
Bs, 33% received Cs, 14% received Ds and 8% failed.
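The marginal percentages are each grade's combined count divided by all 51 students. A minimal sketch of that arithmetic, using the frequency counts from part h) (the dictionary names are illustrative assumptions):

```python
# Frequency counts from part h).
first_counts = {"A": 4, "B": 9, "C": 5, "D": 2, "F": 1}
last_counts = {"A": 3, "B": 7, "C": 12, "D": 5, "F": 3}

total = sum(first_counts.values()) + sum(last_counts.values())  # 51 students

# Marginal distribution: share of all students earning each grade.
marginal = {
    grade: round(100 * (first_counts[grade] + last_counts[grade]) / total)
    for grade in first_counts
}
print(marginal)  # {'A': 14, 'B': 31, 'C': 33, 'D': 14, 'F': 8}
```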
j) Compare the conditional probabilities of the grades for FIRST and LAST (e.g. what is the
conditional probability that a student will get a grade of B or better given that s/he is in the FIRST
PERIOD class). Complete the table and write a few sentences explaining your findings.
Conditional Distribution Table
Grades       First Period Conditional   Last Period Conditional
A (90 +)     19%                        10%
B (80-89)    43%                        23%
C (70-79)    24%                        40%
D (60-69)    10%                        17%
F (< 60)     5%                         10%
A or B       62%                        33%
A, B or C    86%                        73%
C, D or F    38%                        67%
D or F       14%                        27%
The conditional probabilities show that FIRST PERIOD students were more likely to receive
As (19% vs. 10%) or Bs (43% vs. 23%). They were also more likely to receive an A or B (62%
vs. 33%) or an A, B or C (86% vs. 73%). LAST PERIOD students were more likely to receive
Cs (40% vs. 24%), Ds (17% vs. 10%) and Fs (10% vs. 5%). They were also more likely to
receive a D or F (27% vs. 14%) and a C, D or F (67% vs. 38%).
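Each conditional percentage divides a count by its own class size (21 or 30) rather than by all 51 students. A minimal sketch of that calculation, using the part h) counts (`conditional` is a hypothetical helper name):

```python
# Frequency counts from part h).
first_counts = {"A": 4, "B": 9, "C": 5, "D": 2, "F": 1}
last_counts = {"A": 3, "B": 7, "C": 12, "D": 5, "F": 3}

def conditional(counts, grades):
    """P(grade is one of `grades` | class), as a whole-number percentage."""
    favorable = sum(counts[g] for g in grades)
    return round(100 * favorable / sum(counts.values()))

for grades in ("A", "B", "C", "D", "F", "AB", "ABC", "CDF", "DF"):
    print(grades, conditional(first_counts, grades),
          conditional(last_counts, grades))
```

For example, `conditional(first_counts, "AB")` is 13/21, or about 62%, which answers the sample question in part j).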
Section 2: Summary Questions. Using the data, tables, summaries and visual displays you created,
answer the following questions. Your answers should be typed on a separate sheet of paper.
(1) Describe the shape of the data for each data set (shape, center, and spread).
FIRST PERIOD data: The shape is unimodal and skewed left. The center is around 80 (mean of 79%,
median of 82%). The spread is fairly wide, with scores from 52% to 98% (a range of 46 percentage
points). There is one low outlier.
LAST PERIOD data: The shape is unimodal and roughly symmetric; the distribution appears to be
close to normal. The center is around 74 (mean of 74.6%, median of 74%). The spread is fairly wide,
with scores from 54% to 96% (a range of 42 percentage points). There are no outliers.
(2) Discuss your numerical findings in general, comparing the data of these two classes. What
conclusions can you make?
The test scores from FIRST PERIOD are generally higher than those from LAST PERIOD.
Two measures of center (mean and median) are higher in FIRST PERIOD than in LAST PERIOD
(mean: 79% vs. 74.6%; median: 82% vs. 74%). The third measure of center (mode) does not
separate the classes: 82% is a mode of both data sets, although LAST PERIOD also has modes at
66% and 74%. Given the data (test scores), this measure may not mean much and may be partially
due to grading (e.g. individual question values, etc.). FIRST PERIOD has substantially more As
and Bs as a percentage of the total number of tests in the class than LAST PERIOD (FIRST: 62%
vs. LAST: 33%). It appears that there is more ability in the FIRST PERIOD class than in the
LAST PERIOD class.
****OTHER CONJECTURES WILL VARY****
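The mode comparison can be checked with a short sketch. It assumes Python's standard `statistics.multimode`, which returns every value tied for the highest frequency; it is offered as a cross-check only, not as part of the original project.

```python
from statistics import multimode

first = [98, 52, 92, 92, 60, 66, 90, 86, 86, 70, 72,
         84, 82, 82, 82, 82, 74, 74, 76, 80, 80]
last = [96, 95, 92, 54, 57, 58, 86, 85, 82, 82, 82, 60, 66, 66, 66,
        68, 80, 80, 77, 76, 76, 75, 74, 74, 74, 73, 72, 72, 71, 70]

print(multimode(first))         # [82]
print(sorted(multimode(last)))  # [66, 74, 82]
```

A score of 82 is a most-frequent value in both classes, but in the Last class it is tied with 66 and 74, which is one reason the mode is a weak basis for comparing these classes.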
(3) Should the AP Statistics teacher conclude that there is a difference in the level of abilities between
the students in the two classes? Support your answers in 3-5 sentences using your data.
****Answers here will vary; however, it is reasonable for students to conclude that FIRST
PERIOD has a stronger group of students than LAST PERIOD.****
(4) Are there factors besides student ability that might be affecting this data? Using your experiences as
a student, identify some possible factors and support your arguments in 3-5 sentences.
****Answers here will vary, but students SHOULD point out that there are more students in
LAST PERIOD (30) than in FIRST PERIOD (21) and that students and teachers are probably
less effective at the end of the day than at the beginning of the day.****
(5) What recommendations would you make to the AP Statistics teacher regarding these two classes?
Be specific in your recommendations and support your answers.
****Answers here will vary; most reasonable conclusions are acceptable and should be
praised.****