Activity 13 “I Didn’t Get Enough Blues!”

Questions to keep in mind:
i. What is the purpose of this new procedure? What is it called?
ii. How do I calculate the chi-square number?
iii. How do I calculate df?
iv. What does the new distribution look like?
v. How do I calculate the P-value using the table?
vi. How do I calculate the P-value using the calculator?
Note: Create a document named YI_Activity13.doc containing Tables 1 and 2, the density
curve graphs, and your answers to questions i through vi. Submit your work on edmodo.com.
Materials needed: One 1.69-ounce bag of plain M&M's per student.
The M&M/Mars Company, headquartered in Hackettstown, New Jersey, makes plain and
peanut chocolate candies. In 1995, they decided to replace the tan-colored M&M’s with a
new color. After conducting an extensive national preference survey, they decided to
replace the tan M&M's with blue M&M's. The company's Consumer Affairs Department
announced:
On average, the new mix of colors of M&M's Plain Chocolate
Candies will contain 30 percent browns, 20 percent each of yellows
and reds and 10 percent each of oranges, greens, and blues.
They explained:
While we mix the colors as thoroughly as possible, the above ratios
may vary somewhat, especially in the smaller bags. This is because
we combine the various colors in large quantities for the last
production stage (printing). The bags are then filled on high-speed
packaging machines by weight, not by count.
The purpose of this activity is to compare the color distribution of M&M's in your
individual bag with the advertised distribution. We will want to see if there is sufficient
evidence to dispute the company's claim for their distribution. In order to use as random a
sample as possible, it is best if the bags of M&M's are purchased at different stores and
not obtained from one or a few sources of supply.
According to the M&M's website, each package of Milk Chocolate M&M's should contain
24% blue, 14% brown, 16% green, 20% orange, 13% red, and 14% yellow M&M's.
1. Open your bag and carefully count the number of M&M's of each color (brown, yellow,
red, orange, green, and blue), as well as the total number of M&M's in the bag.
2. Fill in the counts, by color, and the total number of M&M's in the "Observed"
row of Table 1:
Color          Brown    Yellow   Red      Orange   Green    Blue     Total
Observed       _____    _____    _____    _____    _____    _____    _____
Expected       _____    _____    _____    _____    _____    _____    _____
(O − E)²/E     _____    _____    _____    _____    _____    _____    χ² = _____
3. To obtain the expected counts, multiply the total number of M&M's in your bag
by the company's stated percentages (expressed in decimal form) for each of the
colors.
4. For each color, perform this calculation:
(observed − expected)² / expected
and enter the result in the last row of the table. Then add up all of these calculated
values, and name the sum χ². Keep this number handy; you will use it later in
the chapter.
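Steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration: the observed counts below are made up for a hypothetical bag, and the proportions are the ones quoted from the company's 1995 announcement. Substitute your own counts from Table 1.

```python
# Hypothetical counts for one bag (made-up numbers); use your own "Observed" row.
observed = {"brown": 17, "yellow": 10, "red": 11,
            "orange": 6, "green": 7, "blue": 4}

# Proportions from the 1995 announcement quoted in this activity, in decimal form.
claimed = {"brown": 0.30, "yellow": 0.20, "red": 0.20,
           "orange": 0.10, "green": 0.10, "blue": 0.10}

total = sum(observed.values())

# Step 3: expected count = total number in the bag * stated proportion.
expected = {color: total * p for color, p in claimed.items()}

# Step 4: (observed - expected)^2 / expected for each color, then add them up.
terms = {color: (observed[color] - expected[color]) ** 2 / expected[color]
         for color in observed}
chi_square = sum(terms.values())

for color in observed:
    print(f"{color:<7} O={observed[color]:>3}  E={expected[color]:6.2f}  "
          f"(O-E)^2/E={terms[color]:.3f}")
print(f"total={total}  chi-square={chi_square:.3f}")
```

The printed rows line up with the Observed, Expected, and (O − E)²/E rows of the table, which makes it easy to check hand calculations.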
5. If your sample reflects the distribution advertised by the M&M/Mars Company,
then there should be very little difference between the observed counts and the
expected counts. Hence the calculated values making up the sum χ² should be
very small. Are the entries in the last row all about the same, or do any of the
quantities stand out because they are "significantly" larger? Did you get more of a
particular color than you expected? Did you get fewer of a particular color than
you expected?
6. Combine the counts obtained by all the students in your class to obtain a total
count of M&M's of each color. Record the results in Table 2:
Color          Brown    Yellow   Red      Orange   Green    Blue     Total
Observed       _____    _____    _____    _____    _____    _____    _____
Expected       _____    _____    _____    _____    _____    _____    _____
(O − E)²/E     _____    _____    _____    _____    _____    _____    χ² = _____
You will need this data in the exercises.
7. Record the total number of M&M's in each student's bag in your class. How did
your bag compare with those of your classmates?
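Pooling the class data for steps 6 and 7 amounts to adding the counts color by color and listing each bag's total. A minimal sketch, assuming three hypothetical bags with made-up counts (a real class would have one entry per student):

```python
from collections import Counter

# Hypothetical bags, one dictionary per student (made-up counts).
bags = [
    {"brown": 17, "yellow": 10, "red": 11, "orange": 6, "green": 7, "blue": 4},
    {"brown": 15, "yellow": 12, "red": 9, "orange": 8, "green": 5, "blue": 7},
    {"brown": 18, "yellow": 11, "red": 12, "orange": 4, "green": 6, "blue": 3},
]

# Step 6: add the counts color by color to get the class "Observed" row.
class_counts = Counter()
for bag in bags:
    class_counts.update(bag)

# Step 7: total number of M&M's in each student's bag.
bag_sizes = [sum(bag.values()) for bag in bags]

print(dict(class_counts))
print("bag sizes:", bag_sizes, " class total:", sum(bag_sizes))
```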
Test for Goodness of Fit
You could use the z test described in the last chapter to test the hypotheses
H0: p = 0.10
Ha: p < 0.10
where p is the proportion of blue M&M's. You could then perform additional tests of
significance for each of the remaining colors. But this would be inefficient. More
important, it wouldn't tell us how likely it is that six sample proportions differ from the
values stated by M&M/Mars as much as our sample does. There is a single test that can
be applied to see if the observed sample distribution is significantly different from the
hypothesized population distribution.
It is called the chi-square (χ²) test for goodness of fit.
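For comparison, the single-color z test mentioned above can be sketched as follows. The count of 14 blues out of 165 is a made-up example, not class data, and the one-sided P-value uses the standard normal curve. Repeating this for every color is the inefficiency that the goodness of fit test avoids.

```python
import math

# Hypothetical data (made-up numbers): 14 blue M&M's out of 165.
blue_count = 14
n = 165
p0 = 0.10  # proportion of blues under H0: p = 0.10

p_hat = blue_count / n
standard_error = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / standard_error

# For Ha: p < 0.10 the P-value is the standard normal area to the left of z.
p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(f"p_hat={p_hat:.4f}  z={z:.3f}  P-value={p_value:.3f}")
```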
The Chi-Square (χ²) distributions
• The chi-square distributions are a family of distributions that take only positive
values and are skewed to the right. A specific chi-square distribution is specified by
one parameter, called the degrees of freedom.
• Since we are working with percentages, five of the six percentages are free to vary,
but the sixth is not, since all six have to add to 100.
• In this case, we say that there are 6 − 1 = 5 degrees of freedom.
Draw the density curves for three members of the chi-square family of distributions.
a. Adjust your WINDOW settings as shown
b. Enter functions Y1, Y2, and Y3 in the y-editor as illustrated below. χ²pdf can be
found in the DISTR menu (2nd VARS), option 6, on the TI-83 and in the
CATALOG under Flash Apps on the TI-89.
c. Change the graph style on Y2 to a thick line.
d. Graph the chi-square density curves for df=1, 4, and 8 on the same axes.
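If no calculator is at hand, the same three curves can be tabulated from the chi-square density formula. The sketch below uses only the Python standard library; the function name chi2_pdf and the x values are illustrative choices, not part of the activity.

```python
import math

def chi2_pdf(x, df):
    """Chi-square density at x for the given degrees of freedom (x > 0)."""
    if x <= 0:
        return 0.0  # simplification: the density is only tabulated for x > 0
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))

# Tabulate the three curves from step d: df = 1, 4, and 8.
print("   x     df=1    df=4    df=8")
for x in (0.5, 1, 2, 4, 6, 8, 12):
    row = "  ".join(f"{chi2_pdf(x, df):.4f}" for df in (1, 4, 8))
    print(f"{x:>5}  {row}")
```

Plotting these values by hand gives the same shapes as the calculator graph: the df = 1 curve falls steeply from the left, while the df = 4 and df = 8 curves rise to a peak and then tail off to the right.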
• As the degrees of freedom increase, the density curves become less skewed and
larger values become more probable.
• Table E in the back of the book gives critical values for chi-square distributions.
• You can use Table E if software does not give you P-values for a chi-square test.
The chi-square density curves have the following properties:
i. The total area under a chi-square curve is equal to 1.
ii. Each chi-square curve (except when df = 1 or 2) begins at 0 on the
horizontal axis, increases to a peak, and then approaches the
horizontal axis asymptotically from above.
iii. Each chi-square curve is skewed to the right. As the number of
degrees of freedom increases, the curve becomes more and more
symmetrical and looks more like a normal curve.
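Properties i and ii can be checked numerically. The sketch below approximates the area under the curve with the trapezoid rule and locates the peak on a fine grid; the helper names, the upper limit of 60, and the grid size are arbitrary choices made for illustration. For df greater than 2 the peak falls at df − 2.

```python
import math

def chi2_pdf(x, df):
    """Chi-square density at x for the given degrees of freedom (x > 0)."""
    if x <= 0:
        return 0.0
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))

def area_and_peak(df, upper=60.0, steps=60000):
    """Trapezoid-rule area on [0, upper] and the grid point where the curve peaks."""
    h = upper / steps
    xs = [i * h for i in range(steps + 1)]
    ys = [chi2_pdf(x, df) for x in xs]
    area = h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    peak_x = xs[ys.index(max(ys))]
    return area, peak_x

results = {df: area_and_peak(df) for df in (4, 8)}
for df, (area, peak_x) in results.items():
    print(f"df={df}: area ~ {area:.5f}, peak near x = {peak_x:.2f}")
```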
We use the chi-square density curve with n − 1 degrees of freedom, where n is the number
of outcome categories, to calculate the P-value in a goodness of fit test.
A goodness of fit test is used to help determine whether a population has a certain
hypothesized distribution, expressed as proportions of population members falling into
various outcome categories. To test the hypothesis
H0: the actual population proportions are equal to the hypothesized
proportions
first calculate the chi-square test statistic
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
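Then the statistic χ² has approximately a chi-square distribution with n − 1 degrees of freedom.
For a test of H0 against the alternative hypothesis
Ha: the actual population proportions are different from the hypothesized
proportions
the P-value is P(χ² ≥ X²), where X² is the value of the test statistic calculated from the sample.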
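The P-value is the area under the chi-square curve to the right of the calculated statistic. When neither Table E nor a calculator is available, that area can be approximated with the standard series for the incomplete gamma function. This is a sketch only: the function name chi2_sf is an illustrative choice, and the statistic 0.97 with df = 5 is a made-up example for six colors.

```python
import math

def chi2_sf(x, df, max_terms=500):
    """Upper-tail area P(chi-square >= x) for the given degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2, x / 2
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

# Hypothetical statistic (made-up value) for six colors, so df = 6 - 1 = 5.
statistic = 0.97
df = 5
p_value = chi2_sf(statistic, df)
print(f"chi-square={statistic}, df={df}, P-value={p_value:.3f}")
```

A large P-value means the observed counts are consistent with the hypothesized proportions; a small one is evidence against them.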