
Lab Exercise 9 Part 1 (Two-Way Layout)

Goals: Introduce a two-way layout with one observation per cell. Develop a statistic for testing a general hypothesis.

We have been working with one-way layouts recently in class and lab (see, for example, Lab Exercise 8). In a one-way layout you need only one label to say where an observation belongs. For example, in the deer data you only need to know which months (Jan-Feb, March-April, etc.) the observation came from.

Consider the data called ascorbic from the 464 web site.

                   Storage facility
Packaging      1      2      3      4      5      6      7
A          15.54  20.50  21.34  17.52  16.34  17.86  20.88
B          21.31  21.89  19.25  20.56  20.72  21.27  19.63
C          18.04  24.03  19.75  20.79  20.37  23.01  21.26

The data are the ascorbic acid content of frozen broccoli after a month of storage. Three different packaging methods (A, B, C) and seven different storage facilities were used.

Hence, you need two labels to identify an observation: which packaging method and which storage facility. For example, 15.54 is from packaging A and facility 1.

The researchers believe that there is a difference in how well the different packaging methods work in terms of preserving ascorbic acid. However, they also are concerned that there may be differences in ascorbic acid content after a month depending on which facility was used for storage.

The facility is called a blocking variable. Each facility is one block, so we have 7 blocks of data.

Within each block we have 3 observations: the ascorbic acid content for the 3 packaging methods.
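The two-label idea can be sketched in a few lines of Python. This is only an illustration, not part of the Minitab workflow used in the lab. The variable names (`ascorbic`, `stacked`, `lookup`) are made up for this sketch, and it assumes that the seven values in each packaging row of the table are listed in facility order 1 through 7.

```python
# Ascorbic acid content copied from the table in this lab.
# Assumption: each row lists facilities 1..7 in order.
ascorbic = {
    "A": [15.54, 20.50, 21.34, 17.52, 16.34, 17.86, 20.88],
    "B": [21.31, 21.89, 19.25, 20.56, 20.72, 21.27, 19.63],
    "C": [18.04, 24.03, 19.75, 20.79, 20.37, 23.01, 21.26],
}

# Stacked form: one row per observation, carrying its two labels
# (packaging = treatment, facility = block).
stacked = [
    (value, packaging, facility)
    for packaging, row in ascorbic.items()
    for facility, value in enumerate(row, start=1)
]

# Each (packaging, facility) pair identifies exactly one observation.
lookup = {(packaging, facility): value for value, packaging, facility in stacked}

print(lookup[("A", 1)])  # the observation from packaging A, facility 1
print(len(stacked))      # 3 packaging methods x 7 facilities
```

The stacked form, with one column of data and one column for each label, is the same shape that Minitab expects for a two-way layout.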

1. State the research hypothesis, and get the ascorbic data from the web page.

2. In order to test hypotheses concerning the packaging methods we need to remove the effects due to the facilities. If we leave the facility effects in the data then the facility effects may obscure differences due to packaging, and we will not be able to detect the differences due to packaging.

3. To remove the facility effects we will use 1 iteration (½ cycle) of median polish. (See step 4 below for a description of the macro that does this.) Then apply the Kruskal-Wallis test to the residuals. Hence, by removing the facility (blocking) effect we reduce the problem from a two-way layout to a one-way layout.

4. Next analyze the ascorbic data:

Get the macro medpolishwr from the web page. (wr means ‘with residuals’.)

Apply the macro to the ascorbic data. You will need to use the subcommand ‘iterations’ and specify 1. That will give you ½ cycle or 1 iteration of median polish.

Then apply the Kruskal-Wallis test to the residuals. This is called an aligned rank test because we align the data by removing the block effects before carrying out the test.

5. Get the 85% CI-Boxplots and discuss where you think the significance is coming from. What are your recommendations for packaging? In this part you are carrying out informal multiple comparisons without setting overall error and comparison error rates. You could, of course, carry out more formal multiple comparisons if you want to.

6. Finally, apply the Kruskal-Wallis test to the original raw data ignoring the blocking factor (facility). What do you find?
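For readers who want to see the arithmetic behind steps 3 through 6, here is a minimal Python sketch. It is not the medpolishwr macro. It assumes that one iteration (½ cycle) of median polish here means subtracting each facility's median from the three observations in that facility, and that the table columns are facilities 1 through 7 in order. The helper names (`midranks`, `kruskal_wallis_3`) are hypothetical, and the p-value shortcut is valid only for three groups (chi-square with 2 degrees of freedom).

```python
import math
from statistics import median

# Ascorbic acid data from this lab (columns assumed to be facilities 1..7).
ascorbic = {
    "A": [15.54, 20.50, 21.34, 17.52, 16.34, 17.86, 20.88],
    "B": [21.31, 21.89, 19.25, 20.56, 20.72, 21.27, 19.63],
    "C": [18.04, 24.03, 19.75, 20.79, 20.37, 23.01, 21.26],
}

def midranks(values):
    """Ranks 1..n; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def kruskal_wallis_3(groups):
    """Tie-corrected Kruskal-Wallis H and chi-square p-value for exactly 3 groups."""
    assert len(groups) == 3, "the p-value shortcut below assumes 2 degrees of freedom"
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    tie_sum = sum(t ** 3 - t for t in (pooled.count(v) for v in set(pooled)))
    h /= 1.0 - tie_sum / (n ** 3 - n)
    return h, math.exp(-h / 2.0)  # chi-square upper tail with 2 df

methods = list(ascorbic)
n_blocks = len(ascorbic["A"])

# Half cycle of median polish: subtract each facility's median (block effect).
block_medians = [median(ascorbic[m][j] for m in methods) for j in range(n_blocks)]
residuals = {
    # Rounded to 2 decimals so floating-point noise cannot hide or create ties.
    m: [round(ascorbic[m][j] - block_medians[j], 2) for j in range(n_blocks)]
    for m in methods
}

h_aligned, p_aligned = kruskal_wallis_3([residuals[m] for m in methods])
h_raw, p_raw = kruskal_wallis_3([ascorbic[m] for m in methods])

print("aligned rank test: H =", round(h_aligned, 3), "p =", round(p_aligned, 4))
print("raw data:          H =", round(h_raw, 3), "p =", round(p_raw, 4))
```

Because each facility's median observation has residual zero, the residuals contain ties, which is why the sketch uses midranks and a tie correction. Minitab's output may differ slightly depending on how it reports the tie-adjusted statistic.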

Part 2 (Repeated Measures)

The goal of this exercise is to contrast the randomized block design with the repeated measures design. We will introduce the Friedman Rank Test for repeated measures along with multiple comparisons.

In a randomized block design (two-way layout with one observation per cell) we randomly assign independent subjects to the treatments. The assignment is made within blocks in order to compare similar subjects. You worked with this type of data in Part 1, where we developed an aligned rank test: the Kruskal-Wallis test computed on the residuals after ½ cycle (1 iteration) of median polish. The experiment entailed randomly assigning broccoli to packaging types, with the facilities as the blocks.

In repeated measures we randomly order the treatments that are repeatedly given to a subject. The subject constitutes a block. The data collected are dependent and correlated within a subject.

To decide whether you have a randomized block design or a repeated measures design think about whether subjects are assigned to treatments or whether treatments are assigned to the subject over time.

The aligned rank test is not appropriate for analyzing data in a repeated measures design. However, the Friedman test introduced next can be used in a randomized block design, but it is not as powerful as the aligned rank test.

Now we consider an example of a repeated measures design and introduce its statistical analysis. See section 4.5 of the text for more discussion along with the formula for the Friedman statistic.

Example: Twenty-two professional baseball players participated in a study to determine if there is a best way to run from home plate to second base. In particular, they wanted to minimize the time. They tried three methods of rounding first base: round out (RO), narrow angle (NA), and wide angle (WA). Each player (subject) ran six times, twice for each method. The methods (treatments) were assigned in random order, and each data value is the average of the two times for one method. Hence, the data are a set of three repeated measures on each of the 22 players. Get the data called baseball.txt from the 464 web site.

1. State the research hypothesis.

2. To prepare the data for analysis you will need to stack the three columns of data and then create two columns of subscripts (subs): one column for treatment (method of running the bases) and one column for blocks (players). You may use the twdesign macro discussed in class or enter the subs by hand.

3. Once you have the data in proper form, use the menu Stat>Nonparametrics>Friedman to apply the Friedman test. The main difference between the aligned rank (Kruskal-Wallis) statistic from Part 1 and the Friedman statistic is that with the Friedman statistic you rank only within blocks and you do not remove the block effect. With the aligned rank statistic you first remove the block effect and then rank all the data together.

4. You will need to compute the multiple comparisons by hand. You will need the fact that

$$\frac{\bar{R}_i - \bar{R}_j}{\sqrt{\dfrac{k(k+1)}{6b}}}$$

can be referred to a standard normal table in order to check the p-value for the comparison between treatment i and treatment j. Here $\bar{R}_i$ is the rank average for the ith treatment, k is the number of treatments, and b is the number of blocks. The expression under the square root is the variance of the difference.

5. Write a report on the baseball data. You should use Minitab to apply the Friedman test. Then you should carry out multiple comparisons based on the formula in 4 above. Finally, you should do ½ cycle (1 iteration) of median polish on the data and get the CI-Boxplots for the residuals. Make a case for which method you would recommend for the coaches to teach for running the bases based on your analysis.
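The Friedman statistic and the pairwise comparison in step 4 can be sketched in Python as a check on hand calculations. This is only an illustration under stated assumptions. Since baseball.txt is not reproduced here, the sketch reuses the ascorbic data from Part 1 as a stand-in, with facilities as blocks. The function names (`friedman_3`, `pairwise_z`) are hypothetical. The Friedman statistic is written in its usual rank-sum form and assumes no ties within a block, and the p-value for each comparison is taken as two-sided, which is an assumption about how the normal table is to be used.

```python
import math
from itertools import combinations

def friedman_3(blocks):
    """Friedman statistic, 2-df chi-square p-value, and rank averages.

    `blocks` holds one row per block, each with k = 3 treatment values.
    Assumes no ties within a block.
    """
    b, k = len(blocks), len(blocks[0])
    assert k == 3, "the p-value shortcut below assumes 2 degrees of freedom"
    rank_sums = [0] * k
    for row in blocks:
        for j, x in enumerate(row):
            rank_sums[j] += 1 + sum(y < x for y in row)  # rank within the block
    s = 12.0 / (b * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * b * (k + 1)
    return s, math.exp(-s / 2.0), [r / b for r in rank_sums]

def pairwise_z(rank_means, b):
    """z and two-sided normal p-value for every pair of treatments."""
    k = len(rank_means)
    se = math.sqrt(k * (k + 1) / (6.0 * b))  # sqrt of the variance of the difference
    results = {}
    for i, j in combinations(range(k), 2):
        z = abs(rank_means[i] - rank_means[j]) / se
        results[(i, j)] = (z, math.erfc(z / math.sqrt(2)))
    return results

# Stand-in data: ascorbic acid values from Part 1, one row per facility,
# columns in the order (A, B, C). Replace with the 22 x 3 baseball times.
blocks = [
    (15.54, 21.31, 18.04),
    (20.50, 21.89, 24.03),
    (21.34, 19.25, 19.75),
    (17.52, 20.56, 20.79),
    (16.34, 20.72, 20.37),
    (17.86, 21.27, 23.01),
    (20.88, 19.63, 21.26),
]

s, p, rank_means = friedman_3(blocks)
comparisons = pairwise_z(rank_means, len(blocks))

print("Friedman S =", round(s, 3), "p =", round(p, 4))
for (i, j), (z, p_ij) in comparisons.items():
    print("treatments", i + 1, "vs", j + 1, ": z =", round(z, 3), "p =", round(p_ij, 4))
```

These per-comparison p-values are not adjusted for the number of comparisons, so an overall error rate still has to be chosen separately.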
