1 Introduction: STAT 401 Week 5 Lab 1

In this lab, we will learn how to run two-sample and paired t-tests in JMP and
then apply them to example problems.
We will also work through an example of one-way ANOVA.
2 Example Problems
2.1 Demonstration: Monkey Nerve Cell Data
The monkey data set.
Scientists selected 8 rhesus monkeys and randomly assigned each monkey
to have the nerves severed in either its right or left arm. Creatine phosphate
concentrations (in mg per 100 g of tissue) were measured in each arm during
the regenerative process.
Each row of the data set has the following format:
Subject # | severed nerve | intact nerve
1. To start, treat the two treatments as if they were independent
populations, i.e., analyze the data with a two-sample t-test. We are
interested in whether the population means are equal. All of the following
can be read from the JMP output:
What is the difference in sample means?
What is the sample variance of the difference?
How many degrees of freedom are there?
What is the standard error for the mean difference?
We want to test whether the two populations have the same mean. What
is the null hypothesis?
What is the test statistic?
What is the p-value?
2. The above is an incorrect analysis. Because both measurements in a pair
come from the same monkey, they are correlated, so the two samples are not
independent. We should perform a paired t-test. Using a paired t-test:
What is the difference in sample means?
What is the sample variance of the difference?
How many degrees of freedom are there?
What is the standard error for the mean difference?
We want to test whether the two populations have the same mean. What
is the null hypothesis?
What is the test statistic?
What is the p-value?
The consequences of performing the wrong test are dire. Inference will be less
powerful, results may be wrong, everything will be less efficient, your research
won’t get published, and the monkeys will be angry at you.
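To make the contrast between the two analyses concrete, here is a minimal Python sketch (the lab itself uses JMP, not Python). The numbers are made up for illustration and are not the monkey data. The two-sample version assumes the pooled-variance form, which may differ from the option you choose in JMP, and the function names are hypothetical.

```python
import math
import statistics as st

# Made-up measurements for 8 subjects; NOT the lab's monkey data.
severed = [12.1, 14.3, 11.8, 13.5, 15.0, 12.9, 13.8, 14.6]
intact = [10.9, 13.2, 10.5, 12.6, 13.9, 11.4, 12.9, 13.1]


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value: integrate the t density over [0, |t|] (Simpson's rule)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps
    area = density(0) + density(abs(t))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * density(i * h)
    return max(0.0, 1 - 2 * area * h / 3)


def two_sample_pooled(x, y):
    """Pooled-variance two-sample t-test (treats the samples as independent)."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * st.variance(x) + (n2 - 1) * st.variance(y)) / df
    diff = st.mean(x) - st.mean(y)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = diff / se
    return {"diff": diff, "se": se, "df": df, "t": t, "p": t_two_sided_p(t, df)}


def paired(x, y):
    """Paired t-test: a one-sample t-test on the within-subject differences."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    diff = st.mean(d)
    se = st.stdev(d) / math.sqrt(n)
    t = diff / se
    return {"diff": diff, "se": se, "df": n - 1, "t": t, "p": t_two_sided_p(t, n - 1)}


independent_result = two_sample_pooled(severed, intact)
paired_result = paired(severed, intact)
for name, res in [("two-sample", independent_result), ("paired", paired_result)]:
    print(name, {k: round(v, 4) for k, v in res.items()})
```

On these made-up numbers both analyses report the same mean difference, but the paired standard error is much smaller because subject-to-subject variation cancels in the differences.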
2.2 Demonstration: ANOVA on mouse diet data
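One-way ANOVA can be performed in place of a pooled-variance two-sample t-test
and gives the same answer: with two groups, the F statistic is the square of the
t statistic and the p-values agree. The advantage of ANOVA is that it can also
compare more than two groups.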
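As a quick check of the link between the two methods, the following Python sketch computes a pooled-variance t statistic and a one-way ANOVA F statistic for two groups. The data are made up (not from diet.csv) and the function names are hypothetical.

```python
import math
import statistics as st

# Made-up values for two groups; NOT taken from diet.csv.
group_a = [27.4, 31.2, 29.8, 33.1, 30.5, 28.9]
group_b = [32.7, 35.9, 31.4, 36.8, 34.2]


def pooled_t(x, y):
    """t statistic of the pooled-variance two-sample t-test."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * st.variance(x) + (n2 - 1) * st.variance(y)) / (n1 + n2 - 2)
    return (st.mean(x) - st.mean(y)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def one_way_f(groups):
    """F statistic of one-way ANOVA: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = st.mean(all_values)
    ss_between = sum(len(g) * (st.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - st.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


t_stat = pooled_t(group_a, group_b)
f_stat = one_way_f([group_a, group_b])
print(round(t_stat ** 2, 6), round(f_stat, 6))  # the two values agree
```

The agreement holds only for the pooled (equal-variance) t-test; an unequal-variance t-test will generally give a slightly different answer.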
In diet.csv, there is data about mouse longevity under 5 different diets.
There should be 49 different mice in each diet.
2.2.1 How to perform ANOVA on this data set
1. Download the data and load it into JMP. There are two variables, longevity
and diet. Make sure longevity is continuous numeric.
2. There are two ways to get to ANOVA, either Fit Model or Fit Y by X.
Fit Model is more versatile, so that is what I will use for these notes. Go
to Analyze > Fit Model ....
3. The response variable should go in Y and the explanatory variable(s)
should go in the big box in Construct Model Effects. To do this, you
can click-and-drag, but there are other methods that work.
4. In the top-right, there is a dropdown menu labeled Personality. Select
Minimal Report.
5. Run the analysis. A new window should pop up with tons of output. If the
output you want is not there, clicking on the red triangle should display
options for more output.
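For orientation on where the numbers in the ANOVA report come from, here is a minimal Python sketch of the one-way ANOVA table. It uses the diet labels from this lab, but the longevity values are made up and are not the contents of diet.csv; the function and key names are hypothetical and do not match JMP's labels exactly.

```python
import statistics as st

# Made-up longevity values keyed by the diet labels used in this lab.
# These are NOT the values in diet.csv.
longevity = {
    "NP": [27.0, 29.5, 25.8, 28.1],
    "N/N85": [32.4, 33.9, 31.0, 34.6],
    "N/R50": [41.2, 43.5, 40.1, 44.0],
    "N/R40": [45.3, 44.1, 46.8, 43.9],
    "R/R50": [42.0, 43.8, 41.5, 44.4],
}


def anova_table(groups):
    """Sums of squares, degrees of freedom, mean square error, and F."""
    values = [v for g in groups.values() for v in g]
    grand_mean = st.mean(values)
    # Variation of the group means around the grand mean.
    ss_model = sum(len(g) * (st.mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of observations around their own group mean.
    ss_error = sum((v - st.mean(g)) ** 2 for g in groups.values() for v in g)
    df_model = len(groups) - 1
    df_error = len(values) - len(groups)
    ms_error = ss_error / df_error
    return {
        "df_model": df_model,
        "df_error": df_error,
        "ss_model": ss_model,
        "ss_error": ss_error,
        "ms_error": ms_error,
        "F": (ss_model / df_model) / ms_error,
    }


table = anova_table(longevity)
for key, value in table.items():
    print(key, round(value, 4))
```

The error degrees of freedom are the number of observations minus the number of groups, and the mean square error is the error sum of squares divided by that count.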
What is the F statistic?
What is the p-value associated with the F statistic?
What is the d.f. error?
What is the mean square error?
What are the estimates for the coefficients? JMP does odd stuff with the
coefficients, so I will explain what is going on with them.
Diet  | NP | N/N85 | N/R50 | N/R40
NP    |  1 |   0   |   0   |   0
N/N85 |  0 |   1   |   0   |   0
N/R50 |  0 |   0   |   1   |   0
N/R40 |  0 |   0   |   0   |   1
R/R50 | -1 |  -1   |  -1   |  -1
JMP creates a set of four dummy variables and codes them as in the table
above. An observation from one of the first four diets gets a 1 on its own
dummy variable and 0 on the others; an observation from R/R50 gets -1 on all
four. The coefficients in the ANOVA output correspond to those dummy variables.
If you don’t get this right now, that is okay; we will cover it again later in
more detail. Right now I am just trying to show you how to get the results from
JMP.
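Here is how it works: suppose you want to estimate the mean for group
N/N85. An N/N85 observation has a 1 on the N/N85 dummy variable and 0 on the
others, so its estimated mean is the intercept plus the N/N85 coefficient.
R/R50 has -1 on all four dummy variables, so its estimated mean is the
intercept minus the sum of the four coefficients.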
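The arithmetic behind this coding can be sketched in Python. This is an illustration under stated assumptions, not JMP's implementation: the data are made up (not diet.csv), the model is assumed to include an intercept, and the function names are hypothetical. For a one-way model with the coding in the table, least squares gives an intercept equal to the average of the group means, and each coefficient equal to that group's mean minus the intercept.

```python
import statistics as st

DIETS = ["NP", "N/N85", "N/R50", "N/R40", "R/R50"]

# Made-up longevity values with unequal group sizes; NOT the values in diet.csv.
longevity = {
    "NP": [27.0, 29.5, 25.8],
    "N/N85": [32.4, 33.9, 31.0, 34.6],
    "N/R50": [41.2, 43.5, 40.1, 44.0, 42.7],
    "N/R40": [45.3, 44.1, 46.8],
    "R/R50": [42.0, 43.8, 41.5, 44.4],
}


def effect_code(diet):
    """Row of the coding table: the four dummy values for one diet."""
    if diet == DIETS[-1]:
        return [-1, -1, -1, -1]
    return [1 if diet == d else 0 for d in DIETS[:-1]]


group_means = {d: st.mean(longevity[d]) for d in DIETS}

# Intercept: unweighted average of the five group means.
intercept = st.mean(list(group_means.values()))
# One coefficient per dummy variable: group mean minus intercept.
coef = {d: group_means[d] - intercept for d in DIETS[:-1]}


def fitted_mean(diet):
    """Intercept plus the coded combination of the four coefficients."""
    codes = effect_code(diet)
    return intercept + sum(c * b for c, b in zip(codes, coef.values()))


for d in DIETS:
    print(d, effect_code(d), round(fitted_mean(d), 4), round(group_means[d], 4))
```

For every diet, including R/R50, the fitted mean printed in the third column matches the sample mean in the fourth.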
2.3 Running Lizards
Hopefully, after seeing how t-tests are done with monkeys, you will be able to
work through this problem about lizards. (Note: no ANOVA required)
In northern California, western fence lizards, Sceloporus occidentalis,
are frequently infected with malaria. Scientists selected samples from two populations, one infected and one non-infected, and measured how far the lizards
ran in 2 minutes. They were interested in whether the healthy lizards ran further
than the infected. The data are in lizard.dat.
The first column is lizard number, the second column indicates whether the
lizard is infected (1 = yes, 2 = no), and the third column is distance.
All of the following questions can be answered from JMP, though you should
have some understanding of where JMP is getting the answers.
1. What are the sample means of the two populations?
2. What are the sample standard deviations of the two populations?
3. What is the difference in sample means?
4. What is the sample variance of the difference?
5. How many degrees of freedom are there?
6. What is the standard error for the mean difference?
7. We want to test whether the two populations have different means. What
is the null hypothesis?
8. What is the test statistic?
9. What is the p-value?
10. What is the conclusion?
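To see where JMP's numbers could come from, here is a minimal Python sketch that reads rows in the three-column layout described above and computes the pieces of an unequal-variance (Welch) two-sample t-test. The rows are made up and are not the contents of lizard.dat; whitespace-separated columns are an assumption about the file; the function names are hypothetical; and JMP's report may use the pooled form instead, depending on the option chosen.

```python
import math
import statistics as st

# Made-up rows: lizard number, infected (1 = yes, 2 = no), distance.
# These are NOT the contents of lizard.dat.
SAMPLE = """\
1 1 16.4
2 1 29.4
3 1 37.1
4 1 23.0
5 2 22.2
6 2 34.8
7 2 42.1
8 2 32.9
9 2 26.4
"""


def read_groups(text):
    """Split distances into (infected, healthy) using the second column."""
    groups = {1: [], 2: []}
    for line in text.splitlines():
        if not line.strip():
            continue
        _, infected_flag, distance = line.split()
        groups[int(infected_flag)].append(float(distance))
    return groups[1], groups[2]


def welch(x, y):
    """Difference in means, standard error, Satterthwaite df, and t statistic."""
    v1 = st.variance(x) / len(x)
    v2 = st.variance(y) / len(y)
    diff = st.mean(x) - st.mean(y)
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (len(x) - 1) + v2 ** 2 / (len(y) - 1))
    return {"diff": diff, "se": se, "df": df, "t": diff / se}


infected, healthy = read_groups(SAMPLE)
print("means:", round(st.mean(healthy), 3), round(st.mean(infected), 3))
print("sds:", round(st.stdev(healthy), 3), round(st.stdev(infected), 3))
result = welch(healthy, infected)
print({k: round(v, 4) for k, v in result.items()})
```

The difference is taken as healthy minus infected, so a positive t statistic points in the direction the scientists were interested in.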