Lab 13 – Fall 2015

Goals: This lab demonstrates:
1) how to fit a logistic regression
2) how to reshape contingency table information into an analyzable form
3) how to set JMP defaults
4) how to get JMP help

The first part uses donner.csv, the Donner party data set. The second uses Vit C 2.csv, the Vitamin C study data set in a new format.

Logistic regression:

Read the donner.csv data file into JMP. You will notice that both sex and status are categorical variables (red bars).

If you want the unadjusted analysis (just comparing sexes, no adjustment for age), use Analyze / Fit Y by X to get the contingency table and the Chi-square test of equal probability of death (or, equivalently, equal probability of survival) for the two sexes. You can also get the odds ratio by clicking the red triangle by Contingency Analysis and selecting Odds Ratio. The first column in the contingency table is died, so the odds ratio is the odds of death in the first row (FEMALE) divided by the odds of death in the second row (MALE). The female odds are (1/3) / (2/3) = 0.5, while the male odds are (2/3) / (1/3) = 2. Hence, the odds ratio is 0.5 / 2 = 0.25. JMP automatically calculates the 95% confidence interval for the odds ratio.

To get the adjusted analysis (comparing sexes at the same age), we need to shift to the Analyze / Fit Model platform. To match lecture results, we first need to create an indicator variable that is 1 for female and 0 for male. Create a new column, which I’ll call Ifemale, and open the formula dialog. Select Sex in the Table Columns box and Conditional in the Functions box, then choose Match (2nd in the list). You should get a new menu with two items. Select Add Match Arguments from Data. You will (or should) see three lines: “FEMALE” with a box labelled then clause, “MALE” with a box, and a line with two blank boxes. Since we want the variable Ifemale to be 1 when the sex is FEMALE, put a 1 in the then clause box to the right of FEMALE. Then click on the box to the right of MALE and put 0 into it.
The third line is for observations with a missing value. Leave that line blank.

[Screenshot: the completed Match formula dialog.]

Click OK, and you will see that the Ifemale column has values of 0 or 1, depending on the Sex value, as desired.

To actually fit the logistic regression: select Analyze / Fit Model, make sure that the response variable is categorical (red bars), then put that variable in the Y box. You will see the Personality change from Standard Least Squares to Nominal Logistic. Put Age and Ifemale into the Model Effects box and click the Run button.

From top to bottom, the default output includes:

Whole Model Test: Tests whether all slopes = 0 (the equivalent of the model F test in multiple regression). For the Donner party data, it tests whether beta for age = 0 and beta for Ifemale = 0. This block also includes the AICc and BIC statistics for the model. It is followed by various measures that compare observations and predictions.

Lack of Fit: A comparison of the specified regression model to a model with a different probability for each unique X value. This is the equivalent of the ANOVA lack of fit test for continuous responses. Stat 301 hasn’t discussed these tests.

Parameter Estimates: The estimates and standard errors for each parameter, along with tests of whether each parameter = 0. These are called Wald tests and are the equivalent of a t-test.

Effect Likelihood Ratio Tests: Tests for each effect. These are likelihood ratio (drop in deviance) tests, which are more reliable than the tests in the Parameter Estimates box. They are the equivalent of F tests for each effect. Unlike least squares (where t tests and F tests of a single parameter have the same p-value), p-values from the likelihood ratio tests are not the same as the p-values from the Wald tests.

JMP will calculate the odds ratios for each effect if you click the red triangle by Nominal Logistic Fit and select Odds Ratios.
You get two boxes of output: one for the odds ratio when each variable increases by 1 (Unit Odds Ratios) and the other that compares the odds at the largest value of X to the odds at the smallest value of X. For both, you also get the 95% confidence interval. The value labelled reciprocal is 1 / (the odds ratio), which is useful if JMP calculated the ratio “the wrong way round”. The odds of a female surviving are 4.9 times those of a male of the same age.

Reshape contingency table information:

Lab 12 showed how to analyze contingency table data that came as one row for each cell in the table, with the count for that cell (the Vit C 1 data), and data that came as one row for each subject (the Vit C 3 data). You may also get data formatted as a table, where the columns are the group (treatment) and each of the possible responses (no cold and cold). Each value is the number of observations. To analyze these data, we need to reshape the table into a form like Vit C 1, with one row for each cell and a column of counts. To do this:

1) Click on the data window.
2) Select Tables / Stack. We will combine the two columns of numbers (no cold and cold) into one column of numbers and one column of labels (cold or no cold).
3) Select the names of the data columns and click Stack Columns.
4) Type in names for the new output table (you will get a new data window), the column that will contain the data values, and the column that will contain the labels. Columns not involved in the stacking (e.g. treatment) will be duplicated as needed.
5) Click OK.

[Screenshot: the Stack dialog just before doing the stacking.]

The result is a new data window with four rows of data, one for each cell in the contingency table, with the count for that cell. The order of the groups (Placebo first or Vit C first) depends on the order in the original data set. That order is irrelevant for the analysis.

Set JMP defaults:

You have probably noticed that JMP has many default behaviours. For example, Fit Model centers polynomials by default when you include square or product terms in the model.
If you don’t want this, you can turn it off each time, or you can change the default so that “Center Polynomials” is not selected.

To change defaults, select File / Preferences from the main menu in any JMP window. There is a huge number of options, organized by category. All the statistical options are in the Platforms dialog, which opens a long list of analyses as a menu of platforms. The ones that we have worked with are:

Distribution: to change default plots produced by Analyze / Distribution
Distribution Summary Statistic: to change default numerical summaries produced by Analyze / Distribution
Overlay Plot: to change default characteristics of the Graph / Overlay Plot graphs
Scatterplot Matrix: to change default characteristics of the scatterplot matrix (Analyze / Multivariate Methods / Multivariate)
Fit Least Squares: to change default output for Fit Y by X and Fit Model when Y is continuous
Contingency: to change default options for Fit Y by X and Fit Model when Y is discrete

Help for JMP:

Help is available in various places. The 301 lab pages will be kept available until the next time I teach 301. They are revised each year.

The Help menu in JMP provides various resources:

Help Contents / Search / Index: typical help file information that can be searched for specific things.
Books: electronic versions of the books that were (long ago) the printed documentation for JMP. These are up-to-date for the current version of JMP. The books that are most relevant for what we’ve covered are:
Basic Analysis: for Distribution and Fit Y by X.
Fitting Linear Models: for multiple regression, logistic regression and ANOVA (the Fit Model stuff).
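Checking the unadjusted odds ratio by hand:

Going back to the unadjusted analysis in the Logistic regression section, the odds arithmetic can be checked outside JMP. The following is a minimal Python sketch, not part of the lab itself. It assumes only the death probabilities quoted above (1/3 for females, 2/3 for males); the function name odds and the variable names are made up for illustration. It also shows the reciprocal, which is the same comparison made "the other way round" (male relative to female).

```python
from fractions import Fraction


def odds(p):
    """Convert a probability p (with 0 <= p < 1) into odds, p / (1 - p)."""
    return p / (1 - p)


# Probabilities of death quoted in the lab text (assumed inputs, not read from donner.csv).
p_died_female = Fraction(1, 3)
p_died_male = Fraction(2, 3)

odds_female = odds(p_died_female)      # (1/3) / (2/3) = 1/2
odds_male = odds(p_died_male)          # (2/3) / (1/3) = 2

# First row (female) divided by second row (male), matching the text above.
odds_ratio = odds_female / odds_male   # (1/2) / 2 = 1/4

# The reciprocal compares male to female instead of female to male.
reciprocal = 1 / odds_ratio            # 4

print("female odds of death:", float(odds_female))
print("male odds of death:  ", float(odds_male))
print("odds ratio (F / M):  ", float(odds_ratio))
print("reciprocal (M / F):  ", float(reciprocal))
```

Exact fractions are used so that 1/3 and 2/3 do not pick up rounding error. The confidence interval that JMP reports needs the cell counts, so it is not reproduced here.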