Lab Exercise No. 2

The goal is to generate the permutation distribution of a test statistic. Work with a partner if you wish.
Two-Sample Comparisons: Paste the SpeedTicket data into a Minitab worksheet. Put the data in
columns 3 and 4, then unstack the data into columns 1 and 2 for this exercise.
Under the Editor menu, make sure Enable Commands is checked.
If you have never used the let command, type at the Minitab prompt: help let
We are going to use let in a couple of places.
At the prompt type: let k1=median(c1)-median(c2) and hit return.
Then type: print k1. You should get -1.
Research Hypothesis: people with tickets tend to claim higher max speeds than those without tickets.
Null Hypothesis: Distributions (of max speeds) are identical
Research Hypothesis: Distributions are different
We then pick a feature of the distributions that distinguishes them.
We will concentrate on the population median. Hence, we use the sample medians, and our statistic is the
difference in sample medians.
We have observed from the data that the difference is -1.
The p-value of the test is a measure of evidence against the null hypothesis.
The p-value is the probability (assuming the distributions are identical) of observing a difference at least as
extreme as the one we have observed.
Rule: Reject the null hypothesis if the p-value is sufficiently small.
If D is the difference in sample medians (without ticket minus with ticket),
then we want P( D <= -1 )
Here is the permutation basis for the sampling distribution of D under the assumption that the populations are
identical (null hypothesis). There are 44 observations from the no-ticket group and 65 from the ticket group. Since under
the null hypothesis the populations are identical, we really have 109 observations from one population. Any permutation of
the data is equally likely. So we permute the data, split it into groups of 44 and 65, and recompute D. Do this
many times and then find the proportion of the time that D <= -1. This is an estimate of the p-value.
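For readers who want to check the logic outside Minitab, here is a minimal Python sketch of the procedure just described. It is not part of the lab. The function name perm_pvalue and the speeds below are hypothetical and are NOT the SpeedTicket data.

```python
import random
from statistics import median

def perm_pvalue(no_ticket, ticket, n_perm=1000, seed=0):
    """Estimate P(D <= observed D) under the null hypothesis,
    where D = median(no_ticket) - median(ticket)."""
    rng = random.Random(seed)
    observed = median(no_ticket) - median(ticket)
    pooled = list(no_ticket) + list(ticket)   # one pooled sample under the null
    n1 = len(no_ticket)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                   # a random permutation of the pooled data
        d = median(pooled[:n1]) - median(pooled[n1:])   # split and recompute D
        if d <= observed:
            hits += 1
    return observed, hits / n_perm

# Hypothetical speeds for illustration only (not the SpeedTicket data).
no_ticket = [85, 90, 95, 100, 100, 105, 110]
ticket = [90, 95, 100, 105, 110, 110, 120, 125]
observed, p = perm_pvalue(no_ticket, ticket)
print("observed D:", observed, "estimated p-value:", p)
```

The estimate changes slightly with the seed and with n_perm, just as repeated runs of the Minitab macro will give slightly different p-values.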
We will write a macro to carry this out.
In Minitab a macro must have a certain form: You must have the following:
1. macro
2. name with designated columns or constants
3. mcolumn and mconstant declarations
4. (body of commands)
5. endmacro
We will write 3 macros in increasing order of complexity.
You should write them in Notepad and save them as name.mac files.
Do them one at a time. For example, save the first one as permmeds1.mac. Save them in the macros
folder in Minitab if you can.
To execute the macro, type at the prompt: %permmeds c1 c2 c3
If you can't save it in the macros folder in Minitab then you have to specify the path,
e.g. %a:\permmeds c1 c2 c3
The following macro permutes the data once and puts D into row 1 of column x1 which you designate.
See if you can get it to work. The other two macros automate the generation of permutations of the
combined data.
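macro
permmeds1 c1 c2 x1
mcolumn c1 c2 c3 c4 c5 c6 c7 c8 x1
mconstant k1
# Pool both samples into c3; c4 records which group each row came from.
stack c1 c2 c3;
subs c4.
# Permute the pooled data: sample all k1 rows without replacement.
let k1=count(c3)
sample k1 c3 c5
# Split the permuted data back into two groups of the original sizes.
unstack c5 c6 c7;
subs c4.
# Store D for this permutation in row 1 and copy it to the output column.
let c8(1)=median(c6)-median(c7)
copy c8 x1
endmacro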
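As a cross-check on what one pass of permmeds1 does, the following Python sketch mirrors the stack, sample, and unstack steps. It is an illustration only. The function name and the data are hypothetical, and the comments name the Minitab command each line is meant to imitate.

```python
import random
from statistics import median

def one_permuted_difference(group1, group2, rng):
    """Return D for one random permutation, plus the two regrouped samples."""
    stacked = list(group1) + list(group2)              # stack c1 c2 c3
    labels = [1] * len(group1) + [2] * len(group2)     # subs c4
    permuted = rng.sample(stacked, len(stacked))       # sample k1 c3 c5
    new1 = [v for v, g in zip(permuted, labels) if g == 1]   # unstack c5 c6 c7
    new2 = [v for v, g in zip(permuted, labels) if g == 2]
    return median(new1) - median(new2), new1, new2

rng = random.Random(42)
g1 = [70, 80, 90]            # hypothetical data
g2 = [75, 85, 95, 100]
d, new1, new2 = one_permuted_difference(g1, g2, rng)
print(d, new1, new2)
```

The group labels stay fixed while the values are shuffled, so the regrouped samples always keep the original sizes.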
------------------------------------------------
macro
permmeds2 c1 c2 x1
mcolumn c1 c2 c3 c4 c5 c6 c7 c8 x1
mconstant k1 i
stack c1 c2 c3;
subs c4.
let k1=count(c3)
do i=1:100
sample k1 c3 c5
unstack c5 c6 c7;
subs c4.
let c8(i)=median(c6)-median(c7)
enddo
copy c8 x1
endmacro
--------------------------------------------------
macro
permmeds c1 c2 x1;
iterations B.
mcolumn c1 c2 c3 c4 c5 c6 c7 c8 x1
mconstant k1 i B
default B=1000
stack c1 c2 c3;
subs c4.
let k1=count(c3)
do i=1:B
sample k1 c3 c5
unstack c5 c6 c7;
subs c4.
let c8(i)=median(c6)-median(c7)
enddo
copy c8 x1
endmacro
-----------------------------------------------------
If you get the last one to run you will have 1000 Ds in x1. You need to determine the proportion of them
that are <= -1.
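Suppose you have set x1 to be c10.
Then type at the prompt: let c11=c10<=-1 and then type mean c11. Column c11 holds 1 where D <= -1 and 0
otherwise, so its mean is the proportion of Ds that are <= -1. This will be the p-value.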
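The reason the mean of an indicator column gives a proportion can be seen in a few lines of Python. The D values below are made up for illustration and are not output from the macro.

```python
# Hypothetical D values (not macro output).
ds = [-3, -1, 0, 2, -1, 1, 0, -2]

# 1 where D <= -1, else 0 (the analogue of: let c11=c10<=-1)
indicator = [1 if d <= -1 else 0 for d in ds]

# The mean of a 0/1 column is the fraction of ones (the analogue of: mean c11)
p_value = sum(indicator) / len(indicator)
print(indicator, p_value)
```

Here four of the eight values are <= -1, so the proportion is 0.5.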
You should also get the histogram of Ds. This histogram is the estimate of the sampling distribution of D
under the null hyp.
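A quick way to see the shape of such a distribution without a plotting tool is a text tally of the D values. This Python sketch is an illustration only. The helper name text_histogram and the values are hypothetical.

```python
from collections import Counter

def text_histogram(ds):
    """Return one line per distinct D value, with one star per occurrence."""
    counts = Counter(ds)
    return [f"{value:>6}: {'*' * counts[value]}" for value in sorted(counts)]

# Hypothetical D values (not macro output).
lines = text_histogram([-1, 0, 0, 1, 0, -1, 2])
for line in lines:
    print(line)
```

Because medians of whole-number data take only a few distinct values, the permutation distribution of D is typically lumpy rather than smooth.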
Due Wednesday:
1. Copy and paste the histogram into Microsoft Word along with the output of the describe command on
the Ds and the p-value.
Type a short paragraph describing the conclusion you draw from the permutation test based on the
difference of sample medians.
2. Modify the last macro so that you get two columns: one for the difference in means and one for the
difference in medians.
Repeat question 1 for the difference in means and include a copy of your new macro.