Stat 401B Lab 1


Overview

In this lab you will be introduced to the statistical software package JMP (pronounced “jump”) and some of its features. For this lab you need to be sitting in front of a Windows PC that has both an Internet connection and JMP software.

Why use JMP?

JMP is designed to enable users to perform statistical analyses correctly with graphical displays that aid data analysis. JMP was chosen to be the campus-wide package for introductory statistics courses at ISU. JMP should be available at any computer lab on campus and registered students can download a copy for free from Information Technology.

Computer Exercises

The objective for these exercises is to introduce you to JMP's Analyze + Distribution platform for analyzing univariate data.

1. We will work with a JMP data file called “Hospital Length of Stay”, which contains the length of stay (days) for each of 40 normal newborns. These are the same data we have been looking at in class. Go to the Stat 401B web page, www.public.iastate.edu/~wrstephe/stat401.html, and open the file by double clicking on the link.

2. This file contains a single column (labeled Days) that has 40 rows of values.

3. Select the Days column and choose Column Info from the Cols menu. (You can also Right + Click on the column name to get this dialog.) Here, you can change the name, data type, modeling type, and format for this variable. During the course of the semester we will see that the data type and modeling type are very important for having JMP do the correct analysis and produce the correct output. For now, click OK to close the dialog.

4. From the Analyze menu, choose Distribution. In the resulting dialog, select the Days column and cast this column into the Y, Columns role (the response variable role) by clicking the Y, Columns button. Click OK to begin the analysis.

5. Depending on how the default preferences for JMP have been configured, your initial output may not be exactly what you want. To select the specific graphical and numerical output you desire, use the pop-up menu to the left of the variable name Days (the menu is accessed by clicking the little red, downward-pointing triangle next to the word Days).

This menu lets you select/deselect the various univariate analysis items that JMP provides for the Days variable. Note: If you hold down the Alt key and then click on the red triangle, you will get a dialog box of options rather than a menu. The dialog box is easier to use if you plan to make many changes to the display at the same time. Select/deselect available analysis items so that you end up with:

• a histogram of Days with a count axis and a horizontal layout

• an outlier box plot of Days

• a Normal quantile plot of Days

• quantiles, moments, and “more” moments for the Days data

6. The histogram and the box plot show, generally, how the values are distributed on the number line. One can Right + Click on graphs and numerical summaries to alter various display options associated with each. The Normal Quantile Plot (also sometimes called the Normal Probability Plot) is used to investigate whether the sample data could have come from a normal model population.
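As a rough illustration of what the Normal Quantile Plot in step 6 displays, the following Python sketch pairs each sorted observation with a standard normal quantile. The data values are hypothetical (they are not the Hospital Length of Stay file), and the plotting position i/(n + 1) is an assumption made for this sketch; JMP's exact choice may differ.

```python
from statistics import NormalDist


def normal_quantile_points(values):
    """Return (normal quantile, observed value) pairs for a Normal quantile plot."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position: i / (n + 1) for the i-th smallest value.
    return [(std_normal.inv_cdf(i / (n + 1)), y)
            for i, y in enumerate(ordered, start=1)]


# Hypothetical lengths of stay (days), for illustration only.
days = [2, 3, 3, 4, 4, 5, 5, 7, 9]
for z, y in normal_quantile_points(days):
    print(f"{z:6.2f}  {y}")
```

If the printed pairs fall close to a straight line when plotted, a normal model is plausible; marked curvature suggests a skewed or heavy-tailed population.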

7. The quantiles of the data set include the “five number summary” (maximum, upper (75%) quartile, median, lower (25%) quartile, minimum). Note that JMP may calculate slightly different values than what you would calculate by hand. The moments include the sample mean ($\bar{y}$), sample standard deviation ($s$), the standard error of the mean ($s/\sqrt{n}$), a 95% confidence interval (CI) for the population mean, and the sample size ($n$). “More” moments gives you the sum of the observation weights (which is just $n$ unless you specify other weights in a second column), the sum of the observations, the variance ($s^2$), the skewness, the kurtosis, and the coefficient of variation (CV).
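The quantities in step 7 can be checked by hand with a short Python sketch. The data below are hypothetical, and the two quartile conventions shown are the ones offered by Python's `statistics` module, not necessarily the rule JMP uses; they are included only to show why hand-calculated quartiles can differ slightly from software output.

```python
import math
import statistics

# Hypothetical lengths of stay (days), for illustration only.
days = [2, 4, 4, 4, 5, 5, 7, 9]

# Quartiles under two standard conventions. They agree on the median
# for these data but give different upper quartiles.
q_exclusive = statistics.quantiles(days, n=4, method="exclusive")
q_inclusive = statistics.quantiles(days, n=4, method="inclusive")

n = len(days)                          # sample size
total = sum(days)                      # sum of the observations
mean = total / n                       # sample mean
variance = statistics.variance(days)   # s^2, using the n - 1 divisor
std_dev = math.sqrt(variance)          # sample standard deviation s
std_err = std_dev / math.sqrt(n)       # standard error of the mean
cv_percent = 100 * std_dev / mean      # coefficient of variation, in percent

print("min, max:", min(days), max(days))
print("quartiles (exclusive):", q_exclusive)
print("quartiles (inclusive):", q_inclusive)
print("n, sum, mean:", n, total, mean)
print("variance, s, SE, CV%:", variance, std_dev, std_err, cv_percent)
```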

8. You can obtain CIs other than 95%, as well as tests of hypotheses, with JMP's Analyze + Distribution platform. From the pop-up menu, choose Confidence Interval and specify the desired confidence level (try 99%). The results give CIs for both the population mean and the population standard deviation (something that we will not be using much this semester). Also, choose Test Mean and enter a hypothesized value for the population mean (try 4) to see whether your sample supports this hypothesized value. The result is a test statistic as well as several probability values, one of which is the appropriate P-value for the null and alternative hypotheses that you wish to test.
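To see where the numbers in step 8 come from, here is a minimal Python sketch of a one-sample t test and a t-based confidence interval for the mean. It is not JMP's implementation: the data are hypothetical, the function names are made up for this example, and the t distribution is evaluated by simple numerical integration so that the sketch needs only the standard library. The three probabilities returned correspond to the two-sided, upper-tail, and lower-tail alternatives, which is why only one of them is the P-value for a given pair of hypotheses.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return (math.exp(log_c) / math.sqrt(df * math.pi)
            * (1 + x * x / df) ** (-(df + 1) / 2))


def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Quantile for 0.5 < p < 1, found by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def one_sample_t(values, mu0):
    """Return t and the two-sided, upper-tail, and lower-tail probabilities."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)
    t = (statistics.mean(values) - mu0) / se
    upper = 1 - t_cdf(t, n - 1)          # P(T > t)
    lower = 1 - upper                    # P(T < t)
    return t, 2 * min(upper, lower), upper, lower


def mean_ci(values, level):
    """t-based confidence interval for the population mean."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)
    t_star = t_quantile(1 - (1 - level) / 2, n - 1)
    m = statistics.mean(values)
    return m - t_star * se, m + t_star * se


# Hypothetical lengths of stay (days), for illustration only.
days = [2, 4, 4, 4, 5, 5, 7, 9]
print(one_sample_t(days, 4))   # hypothesized mean of 4, as in step 8
print(mean_ci(days, 0.99))     # 99% CI, as in step 8
```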

9. Any output window can be printed. If you try to save an output window, JMP does not actually save the output but instead saves the commands you used to produce the output. The resulting .JRP file, when opened on the computer on which it was created, will reproduce the output window. The .JRP file cannot be opened on any computer other than the one on which it was created.

10. In order to save an output window that can be opened on another computer, or attached to an email that the recipient can open, you must first “Journal” the output. By using Edit + Journal, you add a copy of the selected output to a file that can be saved. Once a JMP Journal is open, every time you go to Edit + Journal, additional output is added to the current journal. Be careful: JMP Journals can become very large, with lots of output you eventually will not use. A JMP Journal can contain the output of many analyses and can be saved in a word processor format (like *.RTF, Rich Text Format, or *.DOC). You can also just Copy/Paste JMP output into a Word document.

11. Close the “Hospital Length of Stay” data table and open the “Cola Contents” data table. This data table has a column for Weight and a column for Type. Use the Column Info option to see the difference between these two columns.

12. To look at the two types of cola separately, use Analyze + Distribution and cast Weight in the role of the response variable. Then cast Type in the role of a By variable. You should get two separate analyses, one for each type of cola. For each Distribution, choose Uniform Scaling and Stack.

13. Another way to compare the two types of cola is to use the Analyze + Fit Y by X platform. Cast Weight in the role of Y, Response and Type in the role of X, Factor. Click on OK. Because Weight is a continuous modeling type and Type is a nominal modeling type, JMP will produce output with side-by-side dot plots. You can change the display to side-by-side box plots and test for a difference in population means for the two types by selecting Means/ANOVA/t test.
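The comparison in steps 12 and 13 can be sketched in Python as follows: the rows are split by Type (as a By variable would do), and a pooled-variance two-sample t statistic is computed for the difference in mean Weight. The rows, the labels "Regular" and "Diet", and the function names are hypothetical; the real Cola Contents values are not reproduced here, and the pooled-variance form is assumed for illustration.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical (Type, Weight) rows, for illustration only.
rows = [("Regular", 368.0), ("Regular", 370.0), ("Regular", 372.0),
        ("Diet", 351.0), ("Diet", 352.0), ("Diet", 353.0)]


def split_by_type(rows):
    """Group the weights by cola type, like casting Type as a By variable."""
    groups = defaultdict(list)
    for label, weight in rows:
        groups[label].append(weight)
    return dict(groups)


def pooled_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(sample_a)
                  + (nb - 1) * statistics.variance(sample_b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / se
    return t, df


groups = split_by_type(rows)
for label, weights in groups.items():
    print(label, statistics.mean(weights), statistics.stdev(weights))

t_stat, df = pooled_t(groups["Regular"], groups["Diet"])
print("t =", t_stat, "on", df, "degrees of freedom")
```

The P-value would then be read from a t distribution with the printed degrees of freedom.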
