Lab 5: JMP instructions Goals: 1. 2. 3. 4. How to read .csv files How to do a Wilcoxon rank sum test How to compute means for groups of data How to store residuals from an analysis How to read .csv files: Nothing special needed, use file/open to get started. .csv files are included in the default list of potential files. If you only want to see a subset of the potential files, choose Text Files, which includes .csv and .txt files. Because of the special structure of a .csv file, JMP can usually read it without problem. If it struggles, use Text Import Preview (instead of Text Import Default) to provide some guidance to JMP Wilcoxon rank sum test: uses hamburger.csv. Read in the data set. You will note that treatment is a nominal variable (categories, red bars) and cfu is a continuous variable (numbers, blue ramp) by default. If you read a data set and the treatment is a continuous variable, you need to change the modeling type for that variable (described in detail in an earlier lab). Start the same way you would for a t-test: Analyze / Fit Y by X. Treatment in the X, Factor box and cfu in the Y, response box, then click OK. Choose the analysis by clicking the red triangle, select Nonparametric then Wilcoxon test. The default output includes the Normal approximation to the p-value (appropriate for large samples). You are supposed to be able to get the exact p-value when the X variable has two levels. After running the Wilcoxon test, a new entry, Exact test, is supposed to be added to the Nonparametric menu. This runs the exact (permutation or randomization) test, appropriate for small samples. I say ‘you are supposed to’ because that didn’t work on my copy of JMP. Computing means for groups of observations: uses patty.txt. The patty data set has 3 plate counts for each hamburger patty. The bacterial concentration was measured three times for each patty. In the language to be used shortly, the experimental unit is the patty; the observational unit (in the original data) is plate count. This is a violation of independence. A more appropriate analysis is based on the average count per patty. After averaging, the observational unit is the patty (because one row of data per patty), which is the same as the experimental unit (good). The strategy is to calculate the averages for each patty and save them in a new data set. Then, we will analyze that new data set. Read the patty.txt file. You should have a data table with three columns: trt, rep and cfu. There are three cfu values for each rep. We want to average each set of three to get a single mean for each combination of trt and rep. Select Tables / Summary from the main menu. Define what you want JMP to do in two steps: 1. Put trt and rep into the Group box. This specifies what combination of values identifies each unique group. It doesn’t matter whether a variable is continuous (blue ramp) or nominal (red bars). 2. Select cfu (the variable that will be averaged), left click the black arrow by Statistics to open the Statistics menu, choose mean. At this point, the Summary dialog box should look like: Click OK. You get a new data table with 12 observations, one for each hamburger patty (i.e. combination of treatment and rep). You then use data set for subsequent plots or analyses. Options: in the dialog, you give a specific name (Output table name) for the data table in case you don’t like the name JMP generates for you. After you get the new data table, you can change variable names. Calculating and storing residuals: Some details depend on the analysis you’re doing. Some analysis platforms provide multiple ways to extract and manipulate residuals. Here’s what works in Fit Y by X and will work in most other analysis procedures. Run the desired analysis. These instructions are based on a t-test comparing treatment means in the hamburger patty data set after averaging plate counts (so there are 6 observations per treatment). After you get the analysis, click the red triangle by ‘Oneway analysis …’. Select Save / Save Residuals. A column is added to the data table. That column has the residuals. If you use Save / Save Predicted, you get a column with the predicted values. These can be used for any subsequent plot or analysis.