Lab 5

advertisement
Lab 5: JMP instructions
Goals:
1.
2.
3.
4.
How to read .csv files
How to do a Wilcoxon rank sum test
How to compute means for groups of data
How to store residuals from an analysis
How to read .csv files:
Nothing special needed, use file/open to get started. .csv files are included in the default list of
potential files. If you only want to see a subset of the potential files, choose Text Files, which includes
.csv and .txt files. Because of the special structure of a .csv file, JMP can usually read it without problem.
If it struggles, use Text Import Preview (instead of Text Import Default) to provide some guidance to JMP
Wilcoxon rank sum test: uses hamburger.csv.
Read in the data set. You will note that treatment is a nominal variable (categories, red bars) and cfu is
a continuous variable (numbers, blue ramp) by default. If you read a data set and the treatment is a
continuous variable, you need to change the modeling type for that variable (described in detail in an
earlier lab).
Start the same way you would for a t-test: Analyze / Fit Y by X. Treatment in the X, Factor box and cfu in
the Y, response box, then click OK.
Choose the analysis by clicking the red triangle, select Nonparametric then Wilcoxon test. The default
output includes the Normal approximation to the p-value (appropriate for large samples).
You are supposed to be able to get the exact p-value when the X variable has two levels. After running
the Wilcoxon test, a new entry, Exact test, is supposed to be added to the Nonparametric menu. This
runs the exact (permutation or randomization) test, appropriate for small samples. I say ‘you are
supposed to’ because that didn’t work on my copy of JMP.
Computing means for groups of observations: uses patty.txt.
The patty data set has 3 plate counts for each hamburger patty. The bacterial concentration was
measured three times for each patty. In the language to be used shortly, the experimental unit is the
patty; the observational unit (in the original data) is plate count. This is a violation of independence. A
more appropriate analysis is based on the average count per patty. After averaging, the observational
unit is the patty (because one row of data per patty), which is the same as the experimental unit (good).
The strategy is to calculate the averages for each patty and save them in a new data set. Then, we will
analyze that new data set.
Read the patty.txt file. You should have a data table with three columns: trt, rep and cfu. There are
three cfu values for each rep. We want to average each set of three to get a single mean for each
combination of trt and rep.
Select Tables / Summary from the main menu. Define what you want JMP to do in two steps:
1. Put trt and rep into the Group box. This specifies what combination of values identifies each
unique group. It doesn’t matter whether a variable is continuous (blue ramp) or nominal (red
bars).
2. Select cfu (the variable that will be averaged), left click the black arrow by Statistics to open the
Statistics menu, choose mean. At this point, the Summary dialog box should look like:
Click OK. You get a new data table with 12 observations, one for each hamburger patty (i.e.
combination of treatment and rep). You then use data set for subsequent plots or analyses.
Options:
in the dialog, you give a specific name (Output table name) for the data table in case you don’t like
the name JMP generates for you.
After you get the new data table, you can change variable names.
Calculating and storing residuals:
Some details depend on the analysis you’re doing. Some analysis platforms provide multiple ways to
extract and manipulate residuals. Here’s what works in Fit Y by X and will work in most other analysis
procedures.
Run the desired analysis. These instructions are based on a t-test comparing treatment means in the
hamburger patty data set after averaging plate counts (so there are 6 observations per treatment).
After you get the analysis, click the red triangle by ‘Oneway analysis …’. Select Save / Save Residuals. A
column is added to the data table. That column has the residuals. If you use Save / Save Predicted, you
get a column with the predicted values. These can be used for any subsequent plot or analysis.
Download