1 Introduction: STAT 401 Week 5 Lab 1

In this lab, we will learn how to run two-sample and paired t-tests in JMP and then practice running them. We will also work through an example of one-way ANOVA.

2 Example Problems

2.1 Demonstration: Monkey Nerve Cell Data

Scientists selected 8 rhesus monkeys and randomly assigned each monkey to have nerves severed in either the right or the left arm. Creatine phosphate concentrations (in mg per 100 g of tissue) were measured for each arm during the regenerative process. The data set has the following format:

Subject # | severed nerve | intact nerve

1. To start, treat the two treatments as if they were independent populations; i.e., we will investigate with a two-sample t-test. We are interested in whether the population means are equal. All of the following can be read from the JMP output:

   • What is the difference in sample means?
   • What is the sample variance of the difference?
   • How many degrees of freedom are there?
   • What is the standard error for the mean difference?
   • We want to test whether the two populations have the same mean. What is the null hypothesis?
   • What is the test statistic?
   • What is the p-value?

2. The above is an incorrect analysis. Since each pair of measurements is taken on the same monkey, the two measurements are correlated. We should perform a paired t-test. Using a paired t-test:

   • What is the difference in sample means?
   • What is the sample variance of the difference?
   • How many degrees of freedom are there?
   • What is the standard error for the mean difference?
   • We want to test whether the two populations have the same mean. What is the null hypothesis?
   • What is the test statistic?
   • What is the p-value?

The consequences of performing the wrong test are dire. Inference will be less powerful, results may be wrong, everything will be less efficient, your research won't get published, and the monkeys will be angry at you.
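To see where JMP gets these numbers, here is a minimal Python sketch of both analyses. The measurements are made up for illustration (they are not the monkey data), the function names are hypothetical, and the two-sample version assumes the pooled (equal-variance) form of the test. The p-value is left out because it requires a t distribution with the stated degrees of freedom, which JMP reports for you.

```python
import math
import statistics

# Made-up measurements for 8 subjects (NOT the lab's monkey data).
severed = [10.0, 12.0, 9.0, 15.0, 11.0, 14.0, 13.0, 16.0]
intact = [8.0, 11.0, 7.0, 12.0, 10.0, 11.0, 12.0, 13.0]


def two_sample_pooled(x, y):
    """Two-sample t statistic, assuming equal variances (pooled form)."""
    n1, n2 = len(x), len(y)
    diff = statistics.mean(x) - statistics.mean(y)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = (
        (n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)
    ) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return {"diff": diff, "se": se, "df": n1 + n2 - 2, "t": diff / se}


def paired(x, y):
    """Paired t statistic: a one-sample t-test on within-subject differences."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    diff = statistics.mean(d)
    # Standard error uses the sample variance of the differences.
    se = math.sqrt(statistics.variance(d) / n)
    return {"diff": diff, "se": se, "df": n - 1, "t": diff / se}


two_sample_result = two_sample_pooled(severed, intact)
paired_result = paired(severed, intact)

for name, res in [("two-sample", two_sample_result), ("paired", paired_result)]:
    print(
        f"{name}: diff = {res['diff']:.2f}, SE = {res['se']:.3f}, "
        f"df = {res['df']}, t = {res['t']:.2f}"
    )
```

With these made-up numbers, both analyses report the same difference in means (2.0), but the paired standard error is much smaller, so the paired t statistic is much larger even though it has fewer degrees of freedom. That is the gain in power from accounting for the pairing.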
2.2 Demonstration: ANOVA on mouse diet data

One-way ANOVA can be used in place of a pooled-variance two-sample t-test and gives the same answer: with two groups, the F statistic is the square of the t statistic, and the p-value matches that of the two-sided test. The advantage of ANOVA is that it can compare more than two groups. The file diet.csv contains data about mouse longevity under 5 different diets. There should be 49 different mice in each diet.

2.2.1 How to perform ANOVA on this data set

1. Download the data and load it into JMP. There are two variables, longevity and diet. Make sure longevity is continuous numeric.

2. There are two ways to get to ANOVA: Fit Model or Fit Y by X. Fit Model is more versatile, so that is what I will use for these notes. Go to Analyze > Fit Model.

3. The response variable should go in Y and the explanatory variable(s) should go in the big box in Construct Model Effects. To do this, you can click-and-drag, but there are other methods that work.

4. In the top-right, next to the Personality dropdown, there is a dropdown menu labeled Emphasis. Select Minimal Report.

5. Run the analysis. A new window should pop up with a lot of output. If the output you want is not there, clicking on the red triangle should display options for more output.

   • What is the F statistic?
   • What is the p-value associated with the F statistic?
   • What is the d.f. error?
   • What is the mean square error?
   • What are the estimates for the coefficients?

JMP does odd stuff with the coefficients, so I will explain what is going on with them.

          NP    N/N85   N/R50   N/R40
NP         1      0       0       0
N/N85      0      1       0       0
N/R50      0      0       1       0
N/R40      0      0       0       1
R/R50     -1     -1      -1      -1

JMP creates a set of four dummy variables, one per column of the table above. Each row shows the values those dummy variables take for observations in that diet group. The coefficients in the ANOVA output correspond to those dummy variables.

If you don't get this right now, it is okay; we will cover this again later in more detail. Right now I am just trying to show you how to get the results from JMP.
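The coding table can be made concrete with a small Python sketch. The longevity values below are made up (three mice per diet, not the contents of diet.csv), and the helper names are hypothetical. In a one-way layout the least-squares estimates under this coding have a closed form, which the sketch uses instead of fitting a regression: the intercept is the unweighted average of the group means, and each coefficient is a group mean minus the intercept. The same data are then used to compute the d.f. error, mean square error, and F statistic.

```python
import statistics

# Made-up longevity values (NOT the contents of diet.csv).
groups = {
    "NP": [26.0, 28.0, 27.0],
    "N/N85": [32.0, 33.0, 34.0],
    "N/R50": [42.0, 44.0, 43.0],
    "N/R40": [45.0, 46.0, 44.0],
    "R/R50": [41.0, 43.0, 42.0],
}
levels = list(groups)
k = len(levels)


def effect_code(level):
    """Row of the coding table: k - 1 dummy values for one diet level."""
    if level == levels[-1]:
        return [-1] * (k - 1)  # last level gets -1 in every column
    return [1 if level == other else 0 for other in levels[:-1]]


means = {lvl: statistics.mean(vals) for lvl, vals in groups.items()}

# Closed-form least-squares estimates under this coding (one-way layout).
intercept = statistics.mean(list(means.values()))
coefs = {lvl: means[lvl] - intercept for lvl in levels[:-1]}


def fitted_mean(level):
    """Intercept plus the coded combination of the k - 1 coefficients."""
    codes = effect_code(level)
    return intercept + sum(c * coefs[lvl] for c, lvl in zip(codes, levels[:-1]))


# ANOVA quantities computed from the same data.
n = sum(len(vals) for vals in groups.values())
grand_mean = statistics.mean([y for vals in groups.values() for y in vals])
sse = sum((y - means[lvl]) ** 2 for lvl, vals in groups.items() for y in vals)
ssb = sum(len(vals) * (means[lvl] - grand_mean) ** 2 for lvl, vals in groups.items())
df_error = n - k
mse = sse / df_error
f_stat = (ssb / (k - 1)) / mse

for lvl in levels:
    print(f"{lvl}: code = {effect_code(lvl)}, fitted mean = {fitted_mean(lvl):.1f}")
print(f"d.f. error = {df_error}, MSE = {mse:.2f}, F = {f_stat:.1f}")
```

Every fitted mean matches the corresponding sample mean, including R/R50, which has no coefficient of its own and is recovered as the intercept minus the sum of the other four coefficients.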
Here is how it works: suppose you want to estimate the mean for group N/N85. Its row in the coding table has a 1 in the N/N85 column and 0 everywhere else, so the estimated mean is the intercept plus the N/N85 coefficient. For R/R50, every entry in the row is -1, so the estimated mean is the intercept minus the sum of the four coefficients.

2.3 Running Lizards

Hopefully, after seeing how t-tests are done with monkeys, you will be able to work through this problem about lizards. (Note: no ANOVA required.)

In northern California, western fence lizards, Sceloporus occidentalis, are frequently infected with malaria. Scientists selected samples from two populations, one infected and one non-infected, and measured how far the lizards ran in 2 minutes. They were interested in whether the healthy lizards ran farther than the infected lizards. The data are in lizard.dat. The first column is the lizard number, the second column indicates whether the lizard is infected (1 = yes, 2 = no), and the third column is distance.

All of the following questions can be answered from JMP, though you should have some understanding of where JMP is getting the answers.

1. What are the sample means of the two populations?
2. What are the sample standard deviations of the two populations?
3. What is the difference in sample means?
4. What is the sample variance of the difference?
5. How many degrees of freedom are there?
6. What is the standard error for the mean difference?
7. We want to test whether the two populations have different means. What is the null hypothesis?
8. What is the test statistic?
9. What is the p-value?
10. What is the conclusion?

3 References

• Rhesus Monkey
• Lizards Infected with Malaria
• STAT 401 Page