Methods for a Single Categorical Variable – Goodness of Fit
The analyses completed up to this point in the semester were for a single categorical variable with only
_____ possible outcomes of interest. For instance, in the bone density example, the possible outcomes
were whether the postmenopausal woman was diagnosed with low bone density or not. For the drug
example, the possible outcomes were whether an adult experienced relief or not. In this set of notes we’ll
consider categorical variables which have ________________________ two possible outcomes of interest.
Example: Melanoma is a rare form of skin cancer that accounts for the great majority of skin cancer
fatalities. UV exposure is a major risk factor for melanoma. Some body parts are regularly more exposed to
the sun than others. A random sample of 224 men diagnosed with melanoma was classified according to
the known location of the melanoma on their body (head/neck, trunk, upper limbs, lower limbs).
Question:
1. Identify the single categorical variable of interest.
Step 0: Defining the research question
Is there evidence to suggest that melanoma does not occur equally on the body (head/neck, trunk,
upper limbs, lower limbs)?
Step 1: Determining the null and alternative hypotheses
H0: Melanoma occurs equally on the body
Ha: Melanoma does not occur equally on the body
What would the hypotheses look like if we used symbol notation?
$p_{\text{head/neck}}$ =
$p_{\text{trunk}}$ =
$p_{\text{upper limbs}}$ =
$p_{\text{lower limbs}}$ =
Questions:
2. If melanoma does occur equally on the body, how many men would you expect to see having
melanoma for each part of the body?
| Body part | Head/neck | Trunk | Upper Limbs | Lower Limbs |
|---|---|---|---|---|
| Men with Melanoma Expected | | | | |
3. A statistician would argue that we must allow for some slight variations in the number of men with
melanoma over the various parts of the body because we should not expect the numbers to come
out exactly at the expected number for each part of the body. Do you agree? Explain.
Previously, we’ve used a “reference” distribution (predictable pattern) to determine what values seem
likely or unlikely/surprising to occur assuming an infant has no preference, a person can’t hear, the new
drug isn’t more effective, etc. For this example, we have four outcomes that we’re interested in
(head/neck, trunk, upper limbs, lower limbs) and must therefore compare all four simultaneously to
determine what outcomes seem likely or unlikely/surprising to occur assuming that melanoma does occur
equally on the body (the null hypothesis). One possible simulation of 100 trials for this scenario is given
below.
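A simulation like this could be produced along the following lines. This is only a minimal sketch in Python (the notes themselves use JMP); it assumes each of the 224 men is equally likely to have melanoma at any of the four locations, as the null hypothesis states. The function name `simulate_counts` and the seed value are made up for illustration.

```python
import random

BODY_PARTS = ["head/neck", "trunk", "upper limbs", "lower limbs"]
N_MEN = 224      # sample size from the example
N_TRIALS = 100   # number of simulated samples


def simulate_counts(rng, n=N_MEN, categories=BODY_PARTS):
    """Assign each of n men to a body part with equal probability and tally the counts."""
    counts = {c: 0 for c in categories}
    for _ in range(n):
        counts[rng.choice(categories)] += 1
    return counts


rng = random.Random(1)  # arbitrary fixed seed so the run is repeatable
trials = [simulate_counts(rng) for _ in range(N_TRIALS)]

# Smallest and largest simulated count for each body part across the trials
for part in BODY_PARTS:
    values = [t[part] for t in trials]
    print(f"{part:12s} min={min(values)} max={max(values)}")
```

Counts that fall well outside the printed range for a body part are the ones that would seem unlikely/surprising if the null hypothesis were true.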
Questions:
4. Looking at the above simulation, what outcomes seem unlikely/surprising to occur? Is it the same
for each body part?
5. The data below shows the breakdown of melanoma by part of body collected from the 224 men.
| Part of body | Melanoma Cases | Expected Cases |
|---|---|---|
| Head/Neck | 36 | 56 |
| Trunk | 139 | 56 |
| Upper Limbs | 17 | 56 |
| Lower Limbs | 32 | 56 |
| Total | 224 | 224 |
Do you think there is evidence that the number of men with melanoma differs depending on the
part of body under consideration? Explain.
Since it is not easy to judge by eye whether there is a significant difference overall, we need a method
that takes all four parts of the body into account at once. We can use what’s called the Chi-square test
to accomplish this. The Chi-square test looks at how close the observed data are to what would be expected
(under the null hypothesis) and then finds the probability of observing results at least as extreme
(in terms of total difference) as those actually observed.
Step 2: Checking the assumptions, finding the test statistic and p-value
$$\text{Test statistic} = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
Once computed, the test statistic indicates how large the overall difference is between what was
observed and what was expected.
$$\text{Test Statistic} = \frac{(36 - 56)^2}{56} + \frac{(139 - 56)^2}{56} + \frac{(17 - 56)^2}{56} + \frac{(32 - 56)^2}{56} = 167.6071$$
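The arithmetic above can be checked with a few lines of Python. This is a sketch only; the function name `chi_square_statistic` is ours, not something from JMP or the notes.

```python
observed = [36, 139, 17, 32]   # head/neck, trunk, upper limbs, lower limbs
n = sum(observed)              # 224 men in total
expected = [n * 0.25] * 4      # 56 per body part if melanoma occurs equally


def chi_square_statistic(obs, exp):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


stat = chi_square_statistic(observed, expected)
print(round(stat, 4))  # 167.6071
```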
Questions:
6. What would the value of the test statistic be if what was expected matched the counts observed?
7. Therefore, the bigger the test statistic the _________________________ the difference there is
between what was observed and what was expected.
Once we have the test statistic, we can determine how unusual/surprising it would be to observe, assuming the
null hypothesis (equal distribution, i.e., all probabilities equal to 0.25) is true. The results of a simulation
with 1,000 trials are given below.
The estimated p-value is the proportion of simulated test statistics that are at least as extreme as
our observed test statistic, i.e., as large or larger!
p-value = (# dots at 167.6071 and larger) / 1000 =
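The estimated p-value can be sketched in Python as follows. This is a rough stand-in for the simulation, not the tool used to make the plot in these notes: each trial assigns 224 men to the four body parts with equal probability, computes the test statistic, and we then count how many of the 1,000 simulated statistics are at least 167.6071. The function names and the seed are assumptions made for illustration.

```python
import random


def chi_square_statistic(obs, exp):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def simulate_statistic(rng, n, k):
    """Test statistic for one simulated sample of n subjects over k equally likely categories."""
    counts = [0] * k
    for _ in range(n):
        counts[rng.randrange(k)] += 1
    expected = [n / k] * k
    return chi_square_statistic(counts, expected)


rng = random.Random(2024)   # arbitrary fixed seed so the run is repeatable
observed_stat = 167.6071
n_trials = 1000

sim_stats = [simulate_statistic(rng, 224, 4) for _ in range(n_trials)]

# Proportion of simulated statistics at least as large as the observed one
p_value_est = sum(s >= observed_stat for s in sim_stats) / n_trials
print("largest simulated statistic:", round(max(sim_stats), 2))
print("estimated p-value:", p_value_est)
```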
Step 3: Reporting the conclusion in context of the research question
In order to find the exact p-value, we’ll have to use what’s called the Chi-square distribution. We will again
use JMP to do this. Thus, we must first enter the observed data into JMP as follows.
Next choose Analyze > Distribution (like we did for confidence intervals). Put Part of Body in the Y,
Columns box and Count in the Freq box as shown below.
Click OK, and you should get the following output.
Next, click on the little red arrow next to Part of Body and choose Test Probabilities. Then enter the
hypothesized probabilities as shown below.
After clicking Done you should get the following output.
You’ll want to use the test statistic and p-value given for the Pearson Chi-square test (circled above).
Question:
8. Give the values for the test statistic and p-value for this scenario.
Warning: Make sure the Chi-square test is valid!
Recall that the chi-square distribution is used to approximate p-values. This approximation may not be
very good with small sample sizes. One rule of thumb suggests that most (at least _______) of the
expected counts should be 5 or more; otherwise, the chi-square approximation may not be reliable.
Note: We already found the expected counts back in Question 2 and all 4 were greater than 5.
Let’s look at the formal hypothesis test now.
Step 0: Defining the research question
Is there evidence to suggest that melanoma does not occur equally on the body (head/neck, trunk,
upper limbs, lower limbs)?
Step 1: Determining the null and alternative hypotheses
H0: $p_{\text{head/neck}} = 0.25$, $p_{\text{trunk}} = 0.25$, $p_{\text{upper limbs}} = 0.25$, $p_{\text{lower limbs}} = 0.25$
Ha: Two or more differ
Step 2: Checking the assumptions, finding the test statistic and p-value
Expected Counts:
Head/neck: 224(0.25) = 56 ≥ 5 ✓
Trunk: 224(0.25) = 56 ≥ 5 ✓
Upper Limbs: 224(0.25) = 56 ≥ 5 ✓
Lower Limbs: 224(0.25) = 56 ≥ 5 ✓
All expected counts are ≥ 5, so the condition for the Chi-square test has been satisfied.
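The same check can be written as a short Python sketch. The helper names `expected_counts` and `condition_met` are hypothetical, and the code checks the stricter version used here (every expected count at least 5) rather than the "most" version of the rule of thumb.

```python
def expected_counts(n, probabilities):
    """Expected count per category: sample size times hypothesized probability."""
    return {name: n * p for name, p in probabilities.items()}


def condition_met(counts, minimum=5):
    """True when every expected count is at least `minimum`."""
    return all(c >= minimum for c in counts.values())


hypothesized = {
    "head/neck": 0.25,
    "trunk": 0.25,
    "upper limbs": 0.25,
    "lower limbs": 0.25,
}

counts = expected_counts(224, hypothesized)
for name, c in counts.items():
    print(f"{name}: {c:g} (at least 5: {c >= 5})")
print("Chi-square condition satisfied:", condition_met(counts))
```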
Test Statistic = 167.6071
p-value < 0.0001
Step 3: Reporting the conclusion in context of the research question
There is evidence (since the p-value is less than 0.0001, which is below 0.05) that melanoma does not
occur equally on the body (head/neck, trunk, upper limbs, lower limbs).
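For readers without JMP, the exact p-value can be approximated with the Python sketch below. It relies on a fact not stated in these notes: with four categories, the goodness-of-fit statistic is compared to a chi-square distribution with 4 - 1 = 3 degrees of freedom, whose upper-tail probability has a closed form. The function name is ours.

```python
import math


def chi_square_upper_tail_df3(x):
    """P(X >= x) for a chi-square variable with 3 degrees of freedom (closed form)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


test_statistic = 167.6071
p_value = chi_square_upper_tail_df3(test_statistic)

print(p_value < 0.0001)  # True
print(p_value < 0.05)    # True
```

The p-value is so small that software such as JMP typically reports it only as being below 0.0001.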
Example: A drug company is interested in investigating whether the color of their packaging has any impact
on sales. To test this, they used five different colors (blue, green, orange, red, and yellow) for the boxes of
an over-the-counter pain reliever, instead of the traditional white box. The following table shows the
number of boxes sold over the first month.
| Blue | Green | Orange | Red | Yellow | Total |
|---|---|---|---|---|---|
| 310 | 292 | 280 | 216 | 296 | 1394 |
Research Question – Is there evidence that the number of boxes sold is not equally likely for the
different colors?
Step 0: Define the research question
Is there evidence that the number of boxes sold is not equally likely for the different colors?
Step 1: Determine the null and alternative hypotheses
H0:
Ha:
Step 2: Checking the assumptions, finding the test statistic and p-value
Step 3: Reporting the conclusion in context of the research question
Example: A Gallup poll asked 1010 adults ages 18 and over about their ideal weight. The survey found that
112 interviewees thought that they were currently under their ideal weight, 180 thought that they were at
about their ideal weight, and 718 thought that they were over their ideal weight. The National Health
Interview Survey estimates, based on people’s actual body mass index (computed from both weight and
height), that 1.8% of the U.S. adult population is underweight, 36.7% has a healthy weight, and 61.5% is
either overweight or obese.
Step 0: Defining the research question
Do the self-perceptions of the adults from the Gallup poll differ from the reality of the weight
distributions in the United States?
Step 1: Determining the null and alternative hypotheses
H0:
Ha:
Step 2: Checking the assumptions, finding the test statistic and p-value
Step 3: Reporting the conclusion in context of the research question
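For Step 2 of this example, note that the hypothesized probabilities are not all equal, so the expected counts differ by category. The Python sketch below shows one way to set up the calculation. It assumes the three Gallup responses line up with the three National Health Interview Survey categories, and it uses the fact (not stated in these notes) that with three categories the chi-square distribution has 2 degrees of freedom, for which the upper-tail probability is exp(-x/2). The category labels are ours.

```python
import math

# Observed counts from the Gallup poll (self-perception)
observed = {"under": 112, "about right": 180, "over": 718}
# Hypothesized proportions from the National Health Interview Survey
hypothesized = {"under": 0.018, "about right": 0.367, "over": 0.615}

n = sum(observed.values())  # 1010 adults
expected = {k: n * p for k, p in hypothesized.items()}

# Validity check: every expected count should be at least 5
all_at_least_5 = all(e >= 5 for e in expected.values())

stat = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
p_value = math.exp(-stat / 2)  # chi-square upper tail with 2 degrees of freedom

for k in observed:
    print(f"{k}: observed={observed[k]}, expected={expected[k]:.2f}")
print("all expected counts >= 5:", all_at_least_5)
print("test statistic:", round(stat, 4))
print("p-value below 0.05:", p_value < 0.05)
```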
Example: The Frizzle fowl is a striking variety of chicken with curled feathers. In a 1930 experiment,
Landauer and Dunn crossed Frizzle fowls with a Leghorn variety exhibiting straight feathers. The first
generation (F1) produced all slightly frizzled chicks. When the F1 was interbred, the following
characteristics were observed in the F2 generation:
| Frizzled | Slightly Frizzled | Straight | Total |
|---|---|---|---|
| 23 | 50 | 20 | 93 |
The most likely genetic model for these results is that of a single gene locus with two codominant alleles.
Under such a model, we would expect a 1:2:1 ratio in the F2 generation.
Step 0: Defining the research question
Is there evidence the codominant model of inheritance for this feather phenotype is not
appropriate?
Step 1: Determining the null and alternative hypotheses
H0:
Ha:
Step 2: Checking the assumptions, finding the test statistic and p-value
Step 3: Reporting the conclusion in context of the research question
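For this example the null hypothesis is given as a ratio rather than as probabilities. The Python sketch below shows one way to turn the 1:2:1 ratio into hypothesized probabilities and expected counts before computing the test statistic. As an assumption beyond these notes, the p-value line uses the closed form exp(-x/2) for a chi-square distribution with 3 - 1 = 2 degrees of freedom. Variable names are ours.

```python
import math

ratio = {"frizzled": 1, "slightly frizzled": 2, "straight": 1}
observed = {"frizzled": 23, "slightly frizzled": 50, "straight": 20}

# Convert the ratio to probabilities: each part divided by the total number of parts
total_parts = sum(ratio.values())
probabilities = {k: r / total_parts for k, r in ratio.items()}

n = sum(observed.values())  # 93 chicks
expected = {k: n * p for k, p in probabilities.items()}

stat = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
p_value = math.exp(-stat / 2)  # chi-square upper tail with 2 degrees of freedom

print("hypothesized probabilities:", probabilities)
print("expected counts:", expected)
print("test statistic:", round(stat, 4))
print("p-value:", round(p_value, 4))
```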