Statistics
MINITAB - Lab 15
1. CHI-SQUARED TESTS FOR UNIVARIATE CATEGORICAL DATA
Experimental or survey work often produces data consisting of counts of responses that
fall into a number of classifications or categories. Such data do not lend themselves to much
of the data analysis we have looked at in earlier classes, because count data cannot be assumed
to be normally distributed. Univariate categorical data are most conveniently summarised in a
one-way frequency table.
EXAMPLE:
A study was carried out to identify the causes of Adverse Drug Effects (ADEs). The table below
summarises the causes of 95 ADEs that resulted from the wrong dose being prescribed
or dispensed. Using α = 0.01, determine whether the true percentages of ADEs in the five
categories are different.
Wrong Dosage Cause           Number of ADEs
Lack of Knowledge of Drug    29
Rule Violation               17
Faulty Dose Checking         13
Slips                        9
Other                        27
Calculating the expected number of outcomes
Under the null hypothesis we assume that the wrong-dosage ADEs are equally divided among all
of these causes. There were 95 ADEs and five categories, so we would expect 95/5 = 19 in
each.
Wrong Dosage Cause           Observed    Expected
Lack of Knowledge of Drug    29          19
Rule Violation               17          19
Faulty Dose Checking         13          19
Slips                        9           19
Other                        27          19
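The expected counts can be cross-checked outside MINITAB. The following is a minimal Python sketch (Python is not part of this lab, and the variable names are illustrative), assuming the equal-proportions null hypothesis described above.

```python
# Observed counts of wrong-dosage ADEs, in the order listed in the table.
observed = [29, 17, 13, 9, 27]

# Assumption: under Ho every category is equally likely,
# so each expected count is (total count) / (number of categories).
n_total = sum(observed)
n_categories = len(observed)
expected = [n_total / n_categories] * n_categories

print("Total ADEs:", n_total)
print("Expected count per category:", expected)
```

If the null hypothesis specified unequal proportions, each expected count would instead be the total multiplied by that category's hypothesised proportion.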
The χ² Test in MINITAB
Ho: π1 = 19/95, π2 = 19/95, π3 = 19/95, π4 = 19/95, π5 = 19/95
Ha: Ho is not true
TO CALCULATE A TEST STATISTIC AND P-VALUE
1. Enter the observed and expected values.
2. Name the columns by typing Observed and Expected in the name cells.
3. Calculate the test statistic

   $$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

   by first enabling commands and then entering commands at the MTB prompt to build up the
   test statistic equation.
   Hints:
   (a) Store the (Observed − Expected) values in C3
   (b) Store the (Observed − Expected)² values in C4
   (c) Store the (Observed − Expected)² / Expected values in C4
   (d) Then sum the values in C4 and store the result under some name, e.g. k1
   (e) Print k1; this is the value of your test statistic
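As a cross-check on the MINITAB result, hints (a) to (e) can be mirrored in a few lines of Python. This is only a sketch under the assumption that the observed and expected values are those in the table for this example; the list names are illustrative and simply stand in for the worksheet columns.

```python
observed = [29, 17, 13, 9, 27]
expected = [19, 19, 19, 19, 19]

# (a) Observed - Expected (the role of C3)
diff = [o - e for o, e in zip(observed, expected)]

# (b) (Observed - Expected)^2 (the role of C4)
diff_sq = [d ** 2 for d in diff]

# (c) (Observed - Expected)^2 / Expected
contrib = [d2 / e for d2, e in zip(diff_sq, expected)]

# (d) Sum the contributions (the role of k1)
chi_sq = sum(contrib)

# (e) Print the test statistic
print("Chi-squared test statistic:", round(chi_sq, 4))
```

The value printed here should agree with k1 in MINITAB, up to rounding.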
4. Calculate the p-value by choosing Calc > Probability Distributions > Chi-Square.
   - Choose Cumulative probability.
   - Enter the degrees of freedom for the test.
   - Choose Input constant, and enter the name you gave the test statistic.
   - In Optional storage, enter a name for the cumulative probability, e.g. k2.
5. Look this value up in the tables. How would you calculate the p-value from this figure?
________________________________________________________________________
6. The p-value is 1 minus the cumulative probability. Calculate this using an MTB
   command.
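The "1 minus the cumulative probability" step can also be checked by hand for this example. The sketch below assumes the degrees of freedom are the number of categories minus one (5 − 1 = 4) and uses a closed form for the chi-squared upper tail that holds only when the degrees of freedom are even. The function name is hypothetical, and the test statistic is taken as 304/19 = 16 from the observed and expected counts in the example.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-squared variable with an even number of df.

    Uses exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!,
    which is valid only for even, positive df.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs an even, positive df")
    half = x / 2
    total = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * total

chi_sq = 16.0          # test statistic for the ADE example (304 / 19)
df = 5 - 1             # assumption: categories minus one

p_value = chi2_upper_tail_even_df(chi_sq, df)
cumulative = 1 - p_value   # the quantity MINITAB stores (e.g. in k2)

print("Cumulative probability:", round(cumulative, 5))
print("p-value:", round(p_value, 5))
```

For an odd number of degrees of freedom this shortcut does not apply, and the MINITAB menu or the tables should be used instead.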
Using α = 0.05, summarise this analysis. You should state Ho, Ha, α, the value of the
test statistic, the p-value and your conclusions.
2. NON-PARAMETRIC TESTS - 1 SAMPLE SIGN TEST
Often we are required to analyse small-sample data that cannot be considered to be
normally distributed. In these cases a t or F test (ANOVA) cannot be performed. However,
there is a class of tests that do not require data to follow any particular probability
distribution; these tests are called non-parametric tests.
One such test, which is based on the binomial distribution, is called the Sign Test. The Sign
Test tests hypotheses concerning the population median using a small sample. We know from the
definition of a median that 50% of the distribution should lie below and 50% above the true
median. Therefore, if we specify a median under the null hypothesis, we can analyse the
sample as a binomial experiment, with an observation below the hypothesised median
defined as a success and p = 0.5 being the probability of a success. (We can also define a
success in the binomial experiment as an observation above the hypothesised median.)
Sign Test for a Population Median η
Note: η (the Greek letter eta) will be the symbol used here for the population median.
ONE-TAILED TEST
Ho: η = η0
Ha: η < η0 (or Ha: η > η0)
Test statistic: S = number of sample measurements less than η0 [or S = number of
measurements greater than η0]
Observed significance level: p-value = P(x ≥ S)

TWO-TAILED TEST
Ho: η = η0
Ha: η ≠ η0
Test statistic: S = larger of S1 and S2, where S1 is the number of measurements less than η0
and S2 is the number of measurements greater than η0
Observed significance level: p-value = 2P(x ≥ S)

where x has a binomial distribution with parameters n and p = 0.5.
P(x ≥ S) = 1 − P(x < S)
The p-value is computed as:

$$\text{p-value} = 1 - \sum_{i=0}^{S-1} \binom{n}{i} (0.5)^n$$

The quantity

$$\sum_{i=0}^{S-1} \binom{n}{i} (0.5)^n$$

may be computed in the MINITAB session window as follows:
MTB > cdf S-1;            (where S-1 is replaced by its actual integer value)
SUBC > binomial n 0.5.    (where n is replaced with the actual number of trials)
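The same binomial sum can be evaluated directly as a check on the session-window output. The sketch below is a plain Python version of the formula above; the function names are illustrative, and the numbers n = 5 and S = 4 are made up purely to show the call, not taken from the lab data.

```python
from math import comb

def binomial_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    if k < 0:
        return 0.0
    k = min(k, n)
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def sign_test_p_value(s, n):
    """One-tailed sign test p-value: P(X >= s) = 1 - P(X <= s - 1), p = 0.5."""
    return 1 - binomial_cdf(s - 1, n)

# Illustrative numbers only: 4 of 5 measurements on one side of the median.
print(sign_test_p_value(4, 5))
```

For a two-tailed test, this one-tailed value would be doubled, as in the summary of the test above.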
Rejection region: reject Ho if p-value ≤ α
Assumption: The sample is selected randomly from a continuous probability distribution [Note: No
assumptions need to be made about the shape of the probability distribution.]
Note: when the sample size is 10 or more the normal approximation to the binomial may be used - refer
to lecture notes and text book for more information on these formulae.
The following data are the survival times (in months) of heart and lung transplant
patients at a particular hospital. Test the hypothesis that the median survival time is 10
months versus the alternative that it is less than 10 months.
Survival times (months): 8.4, 16.9, 15.8, 12.5, 10.3, 4.9, 12.9, 9.8, 23.7, 7.3
Go to Stat > Nonparametrics > 1-Sample Sign...
1. Select the test variable.
2. Enter the median under Ho.
3. Select the correct alternative hypothesis.
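To check the MINITAB output for the survival-time data, the sign test can be worked through directly. This is a minimal sketch for Ho: η = 10 against Ha: η < 10, so S is taken as the number of observations below 10. It assumes that any observation exactly equal to the hypothesised median would be dropped from n (none occur in this data set), and the variable names are illustrative.

```python
from math import comb

# Survival times (months) from the example.
times = [8.4, 16.9, 15.8, 12.5, 10.3, 4.9, 12.9, 9.8, 23.7, 7.3]
eta0 = 10  # hypothesised median under Ho

s_below = sum(1 for t in times if t < eta0)
s_above = sum(1 for t in times if t > eta0)

# Assumption: values equal to eta0 are excluded from the count of trials.
n = s_below + s_above

# Ha: eta < 10, so S = number of observations below 10 and
# p-value = P(x >= S) with x ~ Binomial(n, 0.5).
p_value = sum(comb(n, i) for i in range(s_below, n + 1)) / 2 ** n

print("Below:", s_below, "Above:", s_above, "n:", n)
print("One-tailed p-value:", p_value)
```

The counts and the p-value printed here can be compared with the Below, Above and P columns of the MINITAB sign test output.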
Summarise your analysis and conclusion here.
Is there any evidence that the median survival time is greater than 20 months? Summarise
your analysis and conclusions.
Assignment:
An experiment was carried out to investigate whether the different colours of M&M's are
equally distributed in the large bags. The observed counts were 9, 15, 13, 17, 26, 4 for the
red, green, orange, brown, yellow and blue M&M's respectively. Perform an appropriate
hypothesis test on these results using α = 0.01. Report your answers here:
REVISION SUMMARY
After this lab you should be able to :
- Perform a chi-squared test for univariate categorical data (to do this you
  must be familiar with the test statistic for this test)
- Perform a sign test
END