# Chi-Square w/Skittles ```AP BIOLOGY
Chi-square Analysis with Skittles
INTRODUCTION
The Chi-square test (χ²) compares the data collected in an experiment with the data you
expected to find. It is used beyond genetics studies and can be used whenever you want to
compare the differences between expected results and experimental
data. In the case of genetics (and coin tosses) the expected results can be calculated using the
Laws of Probability (and possibly the help of a Punnett square). The Chi-square test helps you to
decide if the difference between your observed results and your expected results is probably due
to random chance alone, or if there is some other factor influencing the results.
- Is the variation in your data probably due to random chance alone and therefore your
hypothesis about the genetics of a trait is supported by the data?
- Are the differences between the observed and expected results probably not due to
random chance alone, and your hypothesis about the genetics of a trait is thereby not
supported by the data?
- Should you consider an alternative inheritance mechanism to explain the results?
The Chi-square test will not, in fact, prove or disprove whether random chance is the only thing causing
observed differences, but it will give an estimate of the likelihood that chance alone is at work.
In classical genetics research where you are trying to determine the inheritance pattern of
a phenotype, you establish your predicted genetic explanation and the expected phenotype ratios
in the offspring as your hypothesis. For example, you think a mutant trait in fruit flies shows
simple dominant inheritance. To test this you would set up a cross between 2 true-breeding flies:
white-eyed female x wild-type male. You would then predict the ratios of phenotypes you would
expect from this cross. This then establishes a hypothesis that any difference from these results
will not be significant and will be due to random chance alone. This is referred to as your “null
hypothesis”. It, in essence, says that you propose that nothing else (no other factors) is
creating the variation in your results except for random chance differences.
Have you ever wondered why the package of Skittles you just bought never seems to
have enough of your favorite color? Or, why is it that you always seem to get the package of
mostly Lemon or Lime? What’s going on at the Mars Company? Is the number of the different
colors of Skittles in a package really different from one package to the next, or does the Mars
Company do something to ensure that each package gets the correct number of each color of
Skittle? Personally, I’m more of an M&M person but I have opened my mind to
chocolate/peanut allergy people so we can do Skittles too! After researching Skittles data (sad but
true, I spent 37 minutes doing this), I found that the Mars Company claims each bag is filled at
random, but approximately 20% of each color should be in each package, for a total of 42 pieces. We
can perform a Chi-square test on individual packages and then compare those results with a larger
population, the large package. I couldn’t find data on how many were supposed to be in a
large bag.
OBJECTIVES
1. write a null hypothesis that pertains to this investigation
2. determine the degrees of freedom (df) for an investigation
3. calculate the χ² value for a given set of data
4. use the critical values table to determine if the calculated value is equal to or less than the
critical value
5. determine if the Chi-square value exceeds the critical value and if the hypothesis is accepted
or rejected
MATERIALS
Individual bag of Skittles
Large bag of Skittles to be used by the whole class
Paper towels
Calculator
PROCEDURE
Null Hypothesis:
__________________________________________________________________
__________________________________________________________________
NOTE: You cannot eat the Skittles until told to do so!!
1. Wash hands BEFORE starting the lab.
2. Place the contents of your individual bag of Skittles on a paper towel.
3. Record the number of Skittles of each color (class) in Table 1 and Table 2 as “Number Observed”
(o).
4. Calculate the number of each color expected in Table 1 and Table 2 as “Number Expected”
(e). Hint: You must count all the colors and add up the total number of Skittles before you can
calculate the expected number of each color. Round to the nearest whole number.
5. When everyone has completed the lab portion of their individual bag, work as a class to count
and record the large bag. Record numbers in Table 3 and Table 4.
6. If you still have time after the lab portion of both the individual bag and the class bag, you
may eat Skittles while you calculate the Chi-square.
7. Leftover Skittles should be placed in a Ziploc bag. Don’t worry, you will get the rest at a
later date!
Table 1- Individual
Color of Candy        | Number Observed (o) | Percentage Expected | Number Expected
Grape (purple)        |                     | 20%                 |
Lemon (yellow)        |                     | 20%                 |
Strawberry (pink/red) |                     | 20%                 |
Orange (uh, orange!)  |                     | 20%                 |
Lime (green)          |                     | 20%                 |
Total # Candies = 42
Table 2- Individual Chi-square
Class (colors)        | Expected (e) | Observed (o) | o-e | (o-e)² | (o-e)²/e
Grape (purple)        |              |              |     |        |
Lemon (yellow)        |              |              |     |        |
Strawberry (pink/red) |              |              |     |        |
Orange (uh, orange!)  |              |              |     |        |
Lime (green)          |              |              |     |        |
Table 3- Class Data
Color of Candy        | Number Observed (o) | Percentage Expected | Number Expected
Grape (purple)        |                     | 20%                 |
Lemon (yellow)        |                     | 20%                 |
Strawberry (pink/red) |                     | 20%                 |
Orange (uh, orange!)  |                     | 20%                 |
Lime (green)          |                     | 20%                 |
Total # Candies =
Table 4- Class Chi-square
Class (colors)        | Expected (e) | Observed (o) | o-e | (o-e)² | (o-e)²/e
Grape (purple)        |              |              |     |        |
Lemon (yellow)        |              |              |     |        |
Strawberry (pink/red) |              |              |     |        |
Orange (uh, orange!)  |              |              |     |        |
Lime (green)          |              |              |     |        |
DATA ANALYSIS
With the Chi-square calculation table completed, look up your Chi-square
value on the Chi-square Distribution table at the bottom of this lab. But to know which column
and row to use on that chart, you must first determine the degrees of freedom to be used and the
probability level you will accept that the differences you observed are caused by chance alone
rather than by other factors. The following two steps will help you to determine the degrees of
freedom and the probability.
The rows in the Chi-square Distribution table refer to degrees of freedom. The degrees of
freedom are calculated as one less than the number of possible results (classes) in your experiment.
df = n - 1
example: 5 classes of colors - 1 = 4 degrees of freedom
QUESTIONS
1. What is the χ² value for your data?
2. What is the critical value (p = 0.05) for your data?
3. Is your null hypothesis accepted or rejected? Explain why or why not.
4. If the null hypothesis is rejected, propose an alternate hypothesis and recalculate the χ² value.
5. How many degrees of freedom are there?
6. Compare your individual data with the class data. What differences or similarities do you see?
Explain which “population” is a better study.
```
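
If you want to check your hand calculations from Tables 2 and 4, the arithmetic in the handout above can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: the observed counts are made-up numbers for one 42-piece bag (not real data), the function and variable names are invented for illustration, and the critical values are the p = 0.05 column of a standard Chi-square distribution table. The expected counts are left unrounded here, so your table values may differ slightly if you rounded to whole numbers as the procedure asks.

```python
# Critical values at p = 0.05 from a standard chi-square distribution table,
# keyed by degrees of freedom (df).
CRITICAL_P05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49}


def expected_counts(total, proportions):
    """Number Expected (e) for each class: total candies times expected share."""
    return [total * p for p in proportions]


def chi_square(observed, expected):
    """Sum of (o - e)^2 / e over all classes, as in the last table column."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for one individual bag (made up for illustration).
colors = ["grape", "lemon", "strawberry", "orange", "lime"]
observed = [10, 6, 9, 8, 9]

total = sum(observed)                                # total # candies
expected = expected_counts(total, [0.20] * len(colors))  # 20% of each color
x2 = chi_square(observed, expected)                  # chi-square value
df = len(colors) - 1                                 # df = n - 1
critical = CRITICAL_P05[df]                          # critical value at p = 0.05
reject_null = x2 > critical                          # reject only if x2 exceeds it

print(f"chi-square = {x2:.2f}, df = {df}, critical value = {critical}")
print("reject the null hypothesis" if reject_null else "do not reject the null hypothesis")
```

For these made-up counts the Chi-square value is well below the critical value for 4 degrees of freedom, so the null hypothesis (equal numbers of each color, with differences due to chance alone) would not be rejected. Swap in your own counts from Table 1 or Table 3 to compare.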