#STT 430/530   R#4   Fall, 2007

#let's first do example 2.6.1 on page 43
#calculate the Wilcoxon rank-sum statistic to compare the distributions
#of battery lifetimes (in hours) of two brands of laptops
brand1=c(3.6,3.9,4.0,4.3)
brand2=c(3.8,4.1,4.5,4.8)
bothbrands=c(brand1,brand2) #combine to rank
W=sum(rank(bothbrands)[1:4]) ; W
#so W=14
#compare this to the critical values in Table A.3, p.340
#note they are 11 (lower) and 25 (upper) and that 14 falls
#in between, so at the 10% level we would not reject the
#null hypothesis of no difference in distributions of battery lifetimes
#RECALL the table gives 5% in each tail. Thus our p-value in this
#problem is >.10
#
#
#now check the results with the built-in function wilcox.test
wilcox.test(brand1,brand2)
#note that W is given as 4 and p=.3429 (two-sided alternative is the default)
#
#So what's going on here?? It turns out this is actually an equivalent
#statistic called the Mann-Whitney U statistic and U = W - n(n+1)/2
#where n=the size of the group whose ranks were summed to get W.
#Here W=14 and n=4, so U = 14 - 4(5)/2 = 4, the value wilcox.test reports.
#
#define U=# of pairs (Xi,Yj) s.t. Xi < Yj
#first compute all the pairwise differences (pwds) between the X's & Y's
#Then see how many are negative to get the value of U
#create a 4x4 matrix to contain all the pwds and fill it with 0's
k=matrix(0, nrow=4, ncol=4) ; k
#now compute the pairwise differences and store them in k
#(row i holds brand1[i] minus every element of brand2)
for (i in 1:4) {
  k[i,]=brand1[i]-brand2
}
#NOTE: be careful about this subtraction - why is there a [i] attached to
#brand1 and not to brand2??
#now see how many are negative
length(k[k<0])
#note this gives U=12, agreeing with Table 2.6.1, p.44
#Why is this not equal to the value of W=4 we found in wilcox.test?
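#
#A quick check on that last question (a sketch, not part of the textbook example;
#the names U.greater and U.less are made up here for illustration).
#The formula W - n(n+1)/2 counts the pairs with brand1 > brand2, while the U
#defined above counts the pairs with brand1 < brand2. With no ties, the two
#counts must add up to the total number of pairs, n*m = 16.
brand1=c(3.6,3.9,4.0,4.3)
brand2=c(3.8,4.1,4.5,4.8)
n=length(brand1) ; m=length(brand2)
W=sum(rank(c(brand1,brand2))[1:n]) ; W     #rank sum for brand1
U.greater=W-n*(n+1)/2 ; U.greater          #pairs with brand1 > brand2
U.less=n*m-U.greater ; U.less              #pairs with brand1 < brand2
#the statistic that wilcox.test labels W should match U.greater, not U.less
unname(wilcox.test(brand1,brand2)$statistic)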
#
#it turns out that there is an existing function in R that will also do all the
#matrix computations called the outer function with "-" as an argument
#Check help(outer)
pwds=outer(brand1,brand2,"-")
#gives the pwds and now to count all the ones with brand1<brand2
#i.e., count the pwds < 0
length(pwds[pwds<0])
#
#find the p-value for a U=12 from Table A.4: U=12 is in between 1 and 15
#so we don't reject the null hypothesis of no difference in battery
#life distributions between the two brands, again p>.1
#
#Often, if you reject the null hypothesis, then you will want to be able to
#estimate the difference between the distributions (since you're saying
#there is one!). Use the "shift alternative" and go over the steps on page 46
#for constructing a CI for delta ...
#
#
#Now let's look at example 2.6.2 on pages 46-47
granite=c(33.63,39.86,69.32,42.13,58.36,74.11)
basalt=c(26.15,18.56,17.55,9.84,28.29,34.15)
#first check that you're rejecting the null hypothesis
wilcox.test(granite,basalt,alternative="g")
#note p=.002165 so we reject the null hypothesis of equal distributions.
#let's get the pwds either by doing your own matrix computations or by
pwds=outer(granite,basalt,"-")
#the Hodges-Lehmann estimate of the shift delta is the median of these pwds
median(pwds)
#now get the quantiles of the Mann-Whitney distribution in order to
#compute CI's for delta
qwilcox(.05,6,6); qwilcox(.95,6,6)
#or you may get the H-L estimate and the confidence interval directly
wilcox.test(granite,basalt,conf.int=TRUE,conf.level=.90)
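#
#A sketch of how the quantiles can be turned into the 90% CI by hand.
#ASSUMPTION: the indexing below follows what wilcox.test appears to do for its
#exact interval; the steps on page 46 of the text may index slightly differently.
#The names sorted.pwds, kl, ku and ci are made up here for illustration.
granite=c(33.63,39.86,69.32,42.13,58.36,74.11)
basalt=c(26.15,18.56,17.55,9.84,28.29,34.15)
n=length(granite) ; m=length(basalt)
sorted.pwds=sort(outer(granite,basalt,"-"))   #all n*m pairwise differences, ordered
kl=qwilcox(.05,n,m)                           #lower 5% quantile of the M-W distribution
if (kl==0) kl=1                               #guard: an index of 0 would select nothing
ku=n*m-kl+1                                   #mirror-image position from the top
ci=c(sorted.pwds[kl],sorted.pwds[ku]) ; ci    #kl-th smallest and kl-th largest pwd
#the H-L estimate should fall inside the interval
median(sorted.pwds)
#and the interval should agree with the one wilcox.test reports
wilcox.test(granite,basalt,conf.int=TRUE,conf.level=.90)$conf.int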