JMP Tutorial #1 - Review of Basic Statistical Inference

JMP Handout #1 – Review of Basic Statistical Inference

EXAMPLE 1 - Problem 2-5 in your text

Entering Data

Select New Data Table from the JMP Starter window. A new spreadsheet will appear with only one column (labeled Column 1). As a first example we will work with the data from Problem 2-5 in your text. Because there is only one variable of interest in this study, shelf life, we need only a single column to enter these data. Enter the 10 shelf life times in the column, hitting return after each entry. To change the title of the column to Shelf Life, double-click on the column title, which is currently Column 1, and change it.

To begin our analysis, select Distribution from the Analyze menu and place Shelf Life in the right-hand box by double-clicking on the column name. This opens a window containing a histogram, boxplot, and basic summary statistics for the shelf life of the sampled pop bottles, which is shown on the following page.

In addition to the default display, the following options have been selected:
• Distribution > Stack (lays the histogram horizontally)
• Shelf Life > Display Options > More Moments (gives the variance, skewness, kurtosis, and coefficient of variation (CV))
• Shelf Life > Histogram Options > Prob Axis (adds a relative frequency axis to the histogram)
• Shelf Life > Normal Quantile Plot (adds a normal quantile plot, which is used to assess the normality of the variable of interest)
• Shelf Life > Fit Distribution > Normal (superimposes a normal reference curve on the histogram and gives 95% CI's for the mean and standard deviation of the population distribution)

The normal quantile plot shows no obvious violations of normality, although with so few observations it is very difficult to assess normality.
The Moments box contains basic summary statistics for these data, including the mean, standard deviation, standard error, 95% CI for the population mean, variance, kurtosis, skewness, and CV.

Confidence Intervals for the Population Mean, Variance, and Standard Deviation

We can see the sample mean for shelf life was $\bar{y} = 131$ days, with a standard error of $S/\sqrt{n} = 6.18$ days. The variance and standard deviation are $S^2 = 382$ and $S = 19.54$, respectively.

$100(1-\alpha)\%$ CI for the Population Mean Shelf Life has the form:

$$\bar{y} \pm t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}$$

By default JMP gives a 95% CI for the population mean $\mu$, which in the case of the shelf life study is the interval (117.02 days, 144.98 days). We are 95% confident that this interval covers the true mean shelf life of the beverage. You can get 90% and 99% confidence intervals instead by selecting .90 and .99, respectively, from the Shelf Life > Confidence Intervals pull-out menu. Selecting any of the confidence levels from Shelf Life > Confidence Intervals produces a confidence interval for both the population mean ($\mu$) and the standard deviation ($\sigma$).

$100(1-\alpha)\%$ CI for the Population Variance and Standard Deviation ("by hand")

Using formula (2-46) on pg. 52 of your text we have:

$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$$

For 95% confidence ($\alpha = .05$) we have $\chi^2_{1-\alpha/2,\,n-1} = \chi^2_{.975,\,9} = 2.70$ and $\chi^2_{\alpha/2,\,n-1} = \chi^2_{.025,\,9} = 19.02$. Thus for the confidence limits we have

$$\text{LCL for } \sigma^2 = \frac{(10-1)\,382}{19.02} = 180.76, \qquad \text{UCL for } \sigma^2 = \frac{(10-1)\,382}{2.70} = 1273.33$$

We are 95% confident that this interval covers the true population variance of the shelf life. To obtain a CI for $\sigma$, the population standard deviation, simply take the square root of the endpoints of the interval for the variance. This gives (13.44 days, 35.68 days) as a 95% CI for the standard deviation of the shelf life. 90% and 99% CI's can be obtained in similar fashion.
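As a check on the intervals above, the following Python sketch recomputes them from the summary statistics reported in the Moments box (n = 10, mean 131, variance 382). It is not part of the JMP workflow. The function names are hypothetical, and the critical values are hard-coded from standard tables rather than computed: 19.02 and 2.70 are the chi-square values used above, while the t value 2.262 (upper .025 point, 9 df) is an assumed table value that the handout does not quote.

```python
import math

def mean_ci(ybar, s2, n, t_crit):
    """CI for the population mean: ybar +/- t_crit * S / sqrt(n)."""
    se = math.sqrt(s2 / n)  # standard error of the mean
    return ybar - t_crit * se, ybar + t_crit * se

def variance_ci(s2, n, chi2_upper, chi2_lower):
    """CI for the population variance.

    chi2_upper is the alpha/2 upper-tail critical value (the larger one),
    chi2_lower is the 1 - alpha/2 critical value (the smaller one).
    """
    ss = (n - 1) * s2  # (n - 1) * S^2
    return ss / chi2_upper, ss / chi2_lower

# Summary statistics from the Moments box
n, ybar, s2 = 10, 131.0, 382.0

# Table values for 9 degrees of freedom (assumed, not computed here)
T_025_9 = 2.262
CHI2_025_9 = 19.02
CHI2_975_9 = 2.70

mean_lo, mean_hi = mean_ci(ybar, s2, n, T_025_9)
var_lo, var_hi = variance_ci(s2, n, CHI2_025_9, CHI2_975_9)
# CI for the standard deviation: square roots of the variance limits
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"95% CI for mean:     ({mean_lo:.2f}, {mean_hi:.2f})")
print(f"95% CI for variance: ({var_lo:.2f}, {var_hi:.2f})")
print(f"95% CI for std dev:  ({sd_lo:.2f}, {sd_hi:.2f})")
```

Swapping in the table values for another confidence level (for example the .05 and .95 points for a 90% interval) gives the other intervals in the same way.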
[JMP output: 95% CI's and 99% CI's]

Hypothesis Tests for the Mean and Standard Deviation

Population Mean ($\mu$)

We are asked to determine whether there is evidence at the $\alpha = .05$ level to suggest the mean shelf life of the carbonated beverage exceeds 120 days. To do this in JMP select Shelf Life > Test Mean. This opens a dialog box for entering the hypothesized mean value for performing a t-test. Here interest centers on whether the mean shelf life for the pop exceeds 120 days. The resulting output is shown on the following page.

Test statistic:

$$t = \frac{\bar{y} - \mu_0}{S/\sqrt{n}} \sim t\text{-distribution with df} = n - 1$$

The test statistic is t = 1.7798. The reported p-values are for a two-tailed ($H_1: \mu \ne 120$), upper-tailed ($H_1: \mu > 120$), and lower-tailed ($H_1: \mu < 120$) test, in that order. Here we are interested in determining whether there is sufficient evidence to suggest the mean shelf life exceeds 120, i.e. we have an upper-tail test, so we have p-value = .0544 > .05. Thus we fail to reject the null hypothesis and conclude there is insufficient evidence to indicate the mean exceeds 120.

Testing the Population Variance ($\sigma^2$) and Standard Deviation ($\sigma$)

Is there evidence to suggest that the standard deviation of the shelf life exceeds 15 days?

$$H_0: \sigma = 15 \qquad H_1: \sigma > 15$$

First compute the test statistic,

$$\chi^2_0 = \frac{(n-1)S^2}{\sigma_0^2} = \frac{(10-1)\,382}{15^2} = 15.28 \sim \text{Chi-square distribution with df} = 9$$

Using the Chi-square table on pg. 641 of your text we find the following bounds on the p-value: .05 < p-value < .50. Thus there is insufficient evidence to conclude the variance exceeds 225 or, equivalently, that the standard deviation exceeds 15.

We can use JMP to perform the standard deviation test by selecting the Shelf Life > Test Std Dev option.

[JMP output: Test Std Dev results; the p-value of interest is the one for $H_1: \sigma > 15$]

EXAMPLE 2 - Problem 2-11 in your text

Entering These Data

We need two columns in our spreadsheet to enter this table, one for the type of flare and one for the burn time.
To add columns to a spreadsheet simply double-click to the right of the first column. Each time you double-click to the right of the rightmost column another column will be added. Here we need only one additional column, so we double-click once to the right of the first column. The first column, which we will call Flare Type, will contain the type of flare (Type 1 or Type 2) and the second column, Burn Time, will contain the burn times.

[Screenshot: completed data table with the Flare Type and Burn Time columns]

To begin our comparative analysis select Fit Y by X from the Analyze menu and put Flare Type in the X, Factor box and Burn Time in the Y, Response box as shown below. The resulting comparative display is shown below. We next answer the following questions:
• Are the population variances equal?
• Are the population mean burn times equal?

Both inferential procedures used to answer these questions require that the burn times for both flare types are normally distributed. To assess this graphically we can add normal quantile plots for the observed burn times for each flare type. To do this select Normal Quantile Plot > Plot Actual by Quantile from the Oneway Analysis pull-down menu.

Assessing Normality

Both distributions appear slightly kurtotic, but we will still use the standard tests (F- and t-tests) to answer the questions of interest.

Testing Equality of the Population Variances

To test the equality of the population variances, select Unequal Variances from the Oneway Analysis pull-down menu. JMP gives the results of five commonly used tests for comparing population variances. The results of the standard F-test for comparing two population variances discussed in the text are given in the F Test 2-sided line and are reported last. The p-values for these tests (O'Brien = .9595, Brown-Forsythe = 1.00, Levene = .9699, Bartlett = .9743, and F Test 2-sided = .9744) all suggest that there is not a significant difference in the variance of the burn times for these two types of flares.
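The F Test 2-sided line compares the ratio of the two sample variances to an F distribution with (n1 - 1, n2 - 1) degrees of freedom. A minimal Python sketch of that calculation follows. The burn times in it are illustrative values assumed for this sketch, since this handout does not list the Problem 2-11 data; the function name is hypothetical; and the critical value 4.03 (upper .025 point of F with 9 and 9 df) is taken from a standard F table rather than computed.

```python
import statistics

def f_ratio(sample1, sample2):
    """Ratio of sample variances S1^2 / S2^2 and its degrees of freedom."""
    s1_sq = statistics.variance(sample1)  # sample variance, n - 1 denominator
    s2_sq = statistics.variance(sample2)
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1

# Assumed burn times for illustration only; not listed in this handout.
type1 = [65, 81, 57, 66, 82, 82, 67, 59, 75, 70]
type2 = [64, 71, 83, 59, 65, 56, 69, 74, 82, 79]

f0, df1, df2 = f_ratio(type1, type2)

# Two-sided test at alpha = .05: reject equal variances if F0 falls outside
# (1 / F_crit, F_crit). Using the reciprocal for the lower limit is valid
# here only because both samples have the same degrees of freedom.
F_CRIT = 4.03  # table value, upper .025 point of F(9, 9)
reject = f0 > F_CRIT or f0 < 1 / F_CRIT

print(f"F0 = {f0:.3f} on ({df1}, {df2}) df; reject equal variances: {reject}")
```

A ratio close to 1, as these values give, is what a large two-sided p-value such as the .9744 reported by JMP corresponds to.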
Testing Equality of the Population Means

Because we have little evidence of problems with normality and no evidence that the population variances differ, we can use a pooled t-test to compare the population means. To perform a pooled t-test in JMP select the Means/Anova/Pooled t option from the Oneway Analysis pull-down menu.

The results of the test, contained in the t Test box (t = .048, two-sided p-value = .9622), provide no evidence of a difference in the mean burn times of these flares.

Note: JMP also provides sections of output labeled Summary of Fit, Analysis of Variance, and Means for Oneway Anova. We will ignore these sections of output for now; they will be more relevant when we discuss experiments with a single factor and analysis of variance in Chapter 3.

95% CI for the Difference in the Population Means: (-8.55, 8.95). We are 95% confident that this interval covers the true difference in the mean burn times of these flare types. Clearly 0 is contained in this interval (in fact the interval is nearly centered at 0), so the data are consistent with no difference in the mean burn times.

EXAMPLE 3 - Problem 2-15 in your text

Entering These Data

For this experiment a paired comparison or dependent sampling scheme is used because each inspector uses two different types of calipers to measure a ball bearing. Our interest lies in any potential differences in the readings obtained from the two calipers. In JMP we need a separate column for each of the two caliper measurements made by each of the 12 inspectors. Set up the spreadsheet as shown below. To form the paired differences add a third column and right-click at the top to obtain the pull-down menu shown to the left.
Select Formula to open the JMP calculator, which will allow us to find the difference in caliper readings: Caliper 1 - Caliper 2.

[Screenshot: JMP Calculator window used to specify the formula for the paired differences]

To perform a paired t-test select Analyze > Distribution and put the paired differences in the Y, Columns box. The remainder of the analysis proceeds like the test of a single population mean shown in Example 1. Here we are interested in testing whether the mean paired difference ($\mu_d$) is different from 0, i.e.

$$H_0: \mu_d = 0 \qquad H_1: \mu_d \ne 0$$

The summary statistics for the paired differences are $\bar{d} = .00025$, $S_d = .0021$, and $n = 12$, which yield the following t-statistic:

$$t = \frac{\bar{d}}{S_d/\sqrt{n}} = \frac{.00025}{.0021/\sqrt{12}} = .432$$

(JMP computes t from the unrounded summary statistics; plugging in the rounded values shown here gives approximately .41.)

The two-sided p-value = .6742, thus we conclude that the calipers do not significantly differ in their mean ball bearing measurements when used to measure the same ball bearing by the same inspector. The other reported p-values are for determining if Caliper 1 > Caliper 2 and if Caliper 1 < Caliper 2, respectively.

We can also conduct a paired t-test by using the Analyze > Matched Pairs option and placing both sets of measurements in the Y, Paired Response box. The measurement put in this box first will be subtracted from the measurement placed second to form the paired differences. The results of the paired analysis are shown below.

The graphic shows the paired differences plotted versus the average response. The horizontal black line shows the mean under the null hypothesis (i.e. 0) and the horizontal bold red line shows the sample mean paired difference. The horizontal dashed lines show a 95% CI for the mean difference in ball bearing measurements made by the same inspector. The vertical bold red line shows the average measurement made by all 12 inspectors.
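Because the paired t-test is a one-sample t-test on the differences, the same calculation covers both this example and the test of the mean in Example 1. The Python sketch below recomputes both t statistics from the reported summary statistics and approximates the p-values by numerically integrating the t density with Simpson's rule. The function names and the integration approach are assumptions made for illustration, not what JMP does internally. For Example 3 the rounded summary statistics are used, so the result comes out near .41 rather than the .432 JMP reports from the unrounded data, and the p-value differs slightly from .6742 for the same reason.

```python
import math

def t_statistic(mean, sd, n, mu0=0.0):
    """One-sample t statistic: (mean - mu0) / (sd / sqrt(n))."""
    return (mean - mu0) / (sd / math.sqrt(n))

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] (steps even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3   # P(0 < T < |t|)
    tail = 0.5 - area      # P(T > |t|), by symmetry of the t density
    return tail if t >= 0 else 1 - tail

# Example 1: shelf life, H1: mu > 120 (upper-tail test), df = 9
t1 = t_statistic(131.0, math.sqrt(382.0), 10, mu0=120.0)
p1_upper = upper_tail_p(t1, 9)

# Example 3: paired differences, H1: mu_d != 0 (two-sided), df = 11,
# using the rounded summary statistics quoted in the handout
t3 = t_statistic(0.00025, 0.0021, 12)
p3_two_sided = 2 * upper_tail_p(abs(t3), 11)

print(f"Example 1: t = {t1:.4f}, upper-tail p = {p1_upper:.4f}")
print(f"Example 3: t = {t3:.3f}, two-sided p = {p3_two_sided:.3f}")
```

The lower-tail p-value is one minus the upper-tail value, which is how the three p-values JMP lists relate to one another.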
