Using JMP Scripts in Introductory Statistics*
Amy G. Froelich
Iowa State University
William M. Duckworth
Creighton University
Concepts from Introductory Statistics:
• Relationship Between Mean and Median
• Normal Quantile Plots
• Regression and Residual Plots
• Sampling Distributions
• Central Limit Theorem
• Normal Distribution vs. t Distribution
• Confidence Intervals
  • Variability
  • Relationship Between Sample Size and Width of CI
  • Connection Between Coverage Rates and Confidence Level
• Hypothesis Testing
  • Connections Between CIs and Two-Sided Testing
  • Connection Between Rejection Rates and α
  • Relationships Among Testing Conditions and Power
    • Sample Size
    • Alpha Level
    • Difference Between True and Hypothesized Parameter
• Contingency Table
  • Test for Two Proportions
  • Fisher’s Exact Test
Advantages over Java applets available on the web:
• Script output is in the same format as JMP data analysis output.
• Scripts can be written to match and extend different activities.
• Internet access is not required.
Disadvantages compared with Java applets available on the web:
• Programming knowledge of the JMP scripting language is required.
• Scripts depend on the JMP platform.
Some Resources for JMP Scripts:
• JMP Scripting Library
http://www.jmp.com/support/downloads/jmp_scripting_library/
• Statistics Education Materials Repository at Iowa State University
http://stated.stat.iastate.edu/
• Commercially Available Scripts
Predictum Management Sciences (www.predictum.com)
*This material is based upon work supported by the National Science Foundation under Grant No. 0231322.
Inference for the Mean
Hands-On Activity:
Population of 200 Female Heights
      0   1   2   3   4   5   6   7   8   9
00   68  67  72  74  63  68  67  68  68  65
01   67  64  63  67  66  64  67  68  69  66
02   70  69  62  60  65  66  67  64  63  72
03   69  66  73  62  68  72  66  69  66  64
04   64  67  65  69  67  68  61  63  65  70
05   66  66  68  69  62  69  66  62  61  65
06   64  68  62  67  62  65  66  66  61  62
07   62  63  68  64  58  66  71  67  66  67
08   65  64  64  62  64  72  63  71  68  66
09   62  63  68  65  64  67  62  64  64  63
10   67  68  70  64  70  67  63  68  70  63
11   67  67  65  72  70  64  62  66  66  70
12   68  64  70  65  64  69  61  66  62  67
13   62  65  65  60  63  61  67  67  64  65
14   67  65  67  61  63  66  71  64  60  61
15   65  67  65  66  65  63  64  69  69  66
16   60  71  69  62  60  67  66  68  62  67
17   70  67  60  70  63  70  67  61  65  64
18   63  63  64  66  65  66  65  64  68  69
19   65  60  70  62  67  68  63  65  63  63
Random samples from this population
• Sample sizes = 10 and 20
• Two samples of each size per group
• Sample Mean Height
• Calculate 90% CIs for Population Mean Height
• Conduct Hypothesis Test for Population Mean Height Under True Null Hypothesis (α = 0.1)
Learning Outcomes
• Discover variability of CI
• Discover effect of sample size on CI width
• Hypothesize about meaning of confidence
• Hypothesize about Type I error and alpha level
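The hands-on activity can be mimicked in a few lines of Python. The sketch below draws two samples of each size from the 200 heights and computes a 90% CI for each. It assumes a z interval with the population standard deviation treated as known; the poster does not say which interval formula the activity uses, and the function name `z_interval` and the seed are illustrative choices.

```python
import random
from statistics import NormalDist, mean, pstdev

# The 200 female heights from the population table (rows 00-19, columns 0-9).
HEIGHTS = [
    68, 67, 72, 74, 63, 68, 67, 68, 68, 65,
    67, 64, 63, 67, 66, 64, 67, 68, 69, 66,
    70, 69, 62, 60, 65, 66, 67, 64, 63, 72,
    69, 66, 73, 62, 68, 72, 66, 69, 66, 64,
    64, 67, 65, 69, 67, 68, 61, 63, 65, 70,
    66, 66, 68, 69, 62, 69, 66, 62, 61, 65,
    64, 68, 62, 67, 62, 65, 66, 66, 61, 62,
    62, 63, 68, 64, 58, 66, 71, 67, 66, 67,
    65, 64, 64, 62, 64, 72, 63, 71, 68, 66,
    62, 63, 68, 65, 64, 67, 62, 64, 64, 63,
    67, 68, 70, 64, 70, 67, 63, 68, 70, 63,
    67, 67, 65, 72, 70, 64, 62, 66, 66, 70,
    68, 64, 70, 65, 64, 69, 61, 66, 62, 67,
    62, 65, 65, 60, 63, 61, 67, 67, 64, 65,
    67, 65, 67, 61, 63, 66, 71, 64, 60, 61,
    65, 67, 65, 66, 65, 63, 64, 69, 69, 66,
    60, 71, 69, 62, 60, 67, 66, 68, 62, 67,
    70, 67, 60, 70, 63, 70, 67, 61, 65, 64,
    63, 63, 64, 66, 65, 66, 65, 64, 68, 69,
    65, 60, 70, 62, 67, 68, 63, 65, 63, 63,
]


def z_interval(sample, sigma, level):
    """Two-sided z interval for the mean, assuming the population SD is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sigma / len(sample) ** 0.5
    centre = mean(sample)
    return centre - margin, centre + margin


rng = random.Random(1)      # arbitrary seed, used only for repeatability
mu = mean(HEIGHTS)          # true population mean
sigma = pstdev(HEIGHTS)     # true population standard deviation

for n in (10, 20):
    for _ in range(2):      # two samples of each size, as in the activity
        sample = rng.sample(HEIGHTS, n)
        low, high = z_interval(sample, sigma, 0.90)
        print(f"n={n}: mean {mean(sample):.2f}, "
              f"90% CI ({low:.2f}, {high:.2f}), covers mu: {low <= mu <= high}")
```

Because the standard deviation is held fixed, every interval from a sample of 20 is narrower than every interval from a sample of 10, which is the sample-size effect the learning outcomes point to.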
Confidence Interval Script:
Replicates Hands-on Activity
• Sample from Larger Population
• 80%, 90%, 95% CIs
• Population Mean Height of Females
Example Coverage Rates
• 80% CI – 84/100
• 90% CI – 91/100
• 95% CI – 95/100
[Figure: 100 95% CIs for the Population Mean Height; vertical axis Y from 62 to 70, horizontal axis Rows from 0 to 100]
95 out of 100 CIs Contain the True Population Mean Height
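A rough Python analogue of such a coverage script is sketched below. It is not the JMP script itself: the normal population, its mean of 65.5 and standard deviation of 3.0, the sample size of 20, and the name `coverage_count` are all hypothetical stand-ins, and a z interval with known standard deviation is assumed.

```python
import random
from statistics import NormalDist, mean


def coverage_count(mu, sigma, n, level, reps=100, seed=0):
    """Count how many of `reps` z intervals contain the true mean `mu`."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = mean(rng.gauss(mu, sigma) for _ in range(n))
        # The interval xbar +/- margin covers mu exactly when |xbar - mu| <= margin.
        if abs(xbar - mu) <= margin:
            hits += 1
    return hits


# Hypothetical population values, chosen only for illustration.
for level in (0.80, 0.90, 0.95):
    hits = coverage_count(mu=65.5, sigma=3.0, n=20, level=level)
    print(f"{level:.0%} CI: {hits}/100 intervals contain the true mean")
```

With the same seed the three confidence levels are applied to the same 100 samples, so the coverage count can only rise as the level rises.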
Hypothesis Testing Script:
Replicates Hands-on Activity
Type I error
Ho: μ = μTRUE vs. Ha: μ ≠ μTRUE
Vary: sample size (5, 25, 50)
      alpha level (0.1, 0.05, 0.01)
[Figure: histogram of 100 z-test statistics with sample size = 25 and α = 0.05; vertical axis Count, horizontal axis from -2 to 3]
4 out of 100 z-test statistics will reject Ho.
Power
Ho: μ = μFALSE vs. Ha: μ ≠ μFALSE
Vary: sample size (5, 25, 50)
      alpha level (0.1, 0.05, 0.01)
      Value of μFALSE
[Figure: histogram of 100 z-test statistics with sample size = 25 and α = 0.05; vertical axis Count, horizontal axis from -4 to 1]
29 out of 100 z-test statistics will reject Ho.
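The two uses of the hypothesis testing script, Type I error and power, differ only in whether the hypothesized mean equals the true mean. A minimal Python sketch of that idea follows; the normal population, the values 65.5, 66.5 and 3.0, and the function name `rejection_count` are assumptions made for illustration, not values taken from the poster.

```python
import random
from statistics import NormalDist, mean


def rejection_count(mu_true, mu_null, sigma, n, alpha, reps=100, seed=0):
    """Count two-sided z-test rejections of Ho: mu = mu_null over `reps` samples."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        xbar = mean(rng.gauss(mu_true, sigma) for _ in range(n))
        z = (xbar - mu_null) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections


# Type I error: the null value is the true mean (hypothetical values).
print("true Ho: ", rejection_count(65.5, 65.5, sigma=3.0, n=25, alpha=0.05), "of 100 reject")
# Power: the null value is false (here off by one unit, a hypothetical choice).
print("false Ho:", rejection_count(65.5, 66.5, sigma=3.0, n=25, alpha=0.05), "of 100 reject")
```

Changing `n`, `alpha`, or the distance between `mu_true` and `mu_null` reproduces the kinds of comparisons listed under "Vary".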
Inference for the Proportion
Hands-on Activity:
Population of 200 Eye Colors
[Table: Population of 200 Eye Colors, rows 00 to 19 by columns 0 to 9; each cell is Blue, Brown, Green, Hazel, or Other]
Random samples from this population
• Sample sizes = 10 and 20
• Two samples of each size per group
• Proportion of each sample with Blue Eyes
• Calculate 90% confidence intervals for Proportion in population with Blue Eyes
Learning Outcomes
• Discover variability of CI
• Discover effect of sample size on CI width
• Hypothesize about meaning of confidence
Confidence Interval Script:
Replicates Hands-On Activity
• Sample from Larger Population
• 90%, 95%, 99% CI
• Proportion in Population with Blue Eye Color
Example Coverage Rates
• 89/100 – 90% CI
• 97/100 – 95% CI
• 98/100 – 99% CI
[Figure: 100 90% Confidence Intervals for Proportion of Population with Blue Eye Color; vertical axis Y from 0 to 0.8, horizontal axis Rows from 0 to 100]
89 of the 100 Confidence Intervals Contain the True Proportion of Population with Blue Eye Color
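The same coverage idea can be sketched for a proportion. The Python below assumes the traditional large-sample interval p̂ ± z·sqrt(p̂(1 − p̂)/n). The blue-eye proportion of 0.35, the sample size of 20, and the function names are hypothetical, since the poster does not state them.

```python
import random
from statistics import NormalDist


def wald_interval(successes, n, level):
    """Traditional large-sample CI for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


def coverage_count(p, n, level, reps=100, seed=0):
    """Count how many of `reps` intervals contain the true proportion `p`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Each draw is a success (e.g. blue eyes) with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        low, high = wald_interval(successes, n, level)
        if low <= p <= high:
            hits += 1
    return hits


# Hypothetical population proportion and sample size.
for level in (0.90, 0.95, 0.99):
    hits = coverage_count(p=0.35, n=20, level=level)
    print(f"{level:.0%} CI: {hits}/100 intervals contain the true proportion")
```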
Plus 4 Method Confidence Interval Script:
• Sample from Larger Population
• Sample Size = 10
• 95% CI for Proportion in Population with Hazel Eye Color
Compare Two Methods
• Traditional
• Plus 4 Method
Example Coverage Rates
• Traditional: 81/100
• Plus 4 Method: 91/100
[Figure: 100 95% Traditional CIs for Proportion of Population with Hazel Eye Color; vertical axis Y from -0.2 to 1.1, horizontal axis Rows from 0 to 100]
81 of the 100 Traditional CIs Contain the True Proportion of Population with Hazel Eye Color
[Figure: 100 95% Plus 4 Method CIs for Proportion of Population with Hazel Eye Color; vertical axis Y from -0.2 to 1, horizontal axis Rows from 0 to 100]
91 of the 100 Plus 4 Method CIs Contain the True Proportion of Population with Hazel Eye Color
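The gap between the two methods at n = 10 can also be seen without simulation. The sketch below computes exact coverage by summing binomial probabilities over every possible count, instead of drawing 100 samples as the script does. It assumes the usual plus-four definition (add two successes and two failures before applying the traditional formula) and a hypothetical hazel proportion of 0.15, which the poster does not state.

```python
from math import comb, sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)   # critical value for a 95% interval


def traditional_interval(x, n):
    """Traditional CI: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    margin = Z95 * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def plus_four_interval(x, n):
    """Plus-four CI: the same formula after adding 2 successes and 2 failures."""
    p_tilde = (x + 2) / (n + 4)
    margin = Z95 * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - margin, p_tilde + margin


def exact_coverage(interval, p, n):
    """Probability that the interval contains p when X ~ Binomial(n, p)."""
    total = 0.0
    for x in range(n + 1):
        low, high = interval(x, n)
        if low <= p <= high:
            total += comb(n, x) * p ** x * (1 - p) ** (n - x)
    return total


p_hazel = 0.15   # hypothetical population proportion
print(f"traditional: {exact_coverage(traditional_interval, p_hazel, 10):.3f}")
print(f"plus four:   {exact_coverage(plus_four_interval, p_hazel, 10):.3f}")
```

Under these assumptions the traditional interval falls well short of 95% coverage, largely because a sample with no hazel eyes gives the degenerate interval [0, 0], while the plus-four interval stays close to the nominal level.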
Randomization in the Design of Experiments
Hands-on Activity*:
Comparison of Mean Yields of Two Corn Varieties
Convenience Assignment
A 130   A 149   A 139   B 155   B 137   B 145
A 149   A 133   A 152   B 131   B 147   B 136
A 141   A 156   A 137   B 146   B 132   B 148
A 150   A 142   A 155   B 136   B 152   B 133
A 139   A 155   A 139   B 147   B 137   B 153
A 155   A 138   A 150   B 137   B 145   B 136
No significant difference in mean yields between two varieties.
Alternating Assignment
A 130   B 137   A 139   B 155   A 149   B 145
B 137   A 133   B 140   A 143   B 147   A 148
A 141   B 144   A 137   B 146   A 144   B 148
B 138   A 142   B 143   A 148   B 152   A 145
A 139   B 143   A 139   B 147   A 149   B 153
B 143   A 138   B 138   A 149   B 145   A 148
No significant difference in mean yields between two varieties.
The Importance of Random Assignment:
The “True” Yields Per Plot for Each Variety
A = 130, B = 118 | A = 149, B = 137 | A = 139, B = 127 | A = 167, B = 155 | A = 149, B = 137 | A = 157, B = 145
A = 149, B = 137 | A = 133, B = 121 | A = 152, B = 140 | A = 143, B = 131 | A = 159, B = 147 | A = 148, B = 136
A = 141, B = 129 | A = 156, B = 144 | A = 137, B = 125 | A = 158, B = 146 | A = 144, B = 132 | A = 160, B = 148
A = 150, B = 138 | A = 142, B = 130 | A = 155, B = 143 | A = 148, B = 136 | A = 164, B = 152 | A = 145, B = 133
A = 139, B = 127 | A = 155, B = 143 | A = 139, B = 127 | A = 159, B = 147 | A = 149, B = 137 | A = 165, B = 153
A = 155, B = 143 | A = 138, B = 126 | A = 150, B = 138 | A = 149, B = 137 | A = 157, B = 145 | A = 148, B = 136
Variety A > Variety B by 12 bushels in each plot.
One Random Assignment of Varieties to Plots
B 118   A 149   A 139   A 167   A 149   A 157
B 137   B 121   B 140   A 143   B 147   B 136
B 129   B 144   A 137   B 146   B 132   A 160
B 138   A 142   A 155   B 136   A 164   A 145
A 139   A 155   A 139   A 159   B 137   A 165
B 143   B 126   B 138   B 137   B 145   A 148
Significant difference in mean yields between two varieties.
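The three conclusions can be checked directly from the plot data. The Python sketch below rebuilds each layout from the true Variety A yields and the 12-bushel difference, then computes a two-sample t statistic. The two-sided critical value of about 2.032 (t distribution, 34 degrees of freedom, α = 0.05) is a tabulated value, and the poster does not say which t-test variant or alpha level was used in the activity, so treat this as one plausible reading.

```python
from statistics import mean, variance

# True Variety A yield for each plot; Variety B yields 12 bushels less.
A_YIELDS = [
    [130, 149, 139, 167, 149, 157],
    [149, 133, 152, 143, 159, 148],
    [141, 156, 137, 158, 144, 160],
    [150, 142, 155, 148, 164, 145],
    [139, 155, 139, 159, 149, 165],
    [155, 138, 150, 149, 157, 148],
]
ADVANTAGE = 12

# Which variety is planted in each plot, row by row, for the three layouts.
ASSIGNMENTS = {
    "convenience": ["AAABBB"] * 6,
    "alternating": ["ABABAB", "BABABA"] * 3,
    "random": ["BAAAAA", "BBBABB", "BBABBA", "BAABAA", "AAAABA", "BBBBBA"],
}

T_CRIT = 2.032   # two-sided 5% critical value, t with 34 df (tabulated)


def t_statistic(layout):
    """Two-sample t statistic for mean(A) - mean(B) under the given layout."""
    a, b = [], []
    for yields, row in zip(A_YIELDS, layout):
        for a_yield, variety in zip(yields, row):
            if variety == "A":
                a.append(a_yield)
            else:
                b.append(a_yield - ADVANTAGE)
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se


for name, layout in ASSIGNMENTS.items():
    t = t_statistic(layout)
    verdict = "significant" if abs(t) > T_CRIT else "not significant"
    print(f"{name:12s} t = {t:5.2f}  ({verdict})")
```

With 18 plots per variety the pooled and unpooled standard errors coincide, so the choice between the two t-test variants does not change the statistic here.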
Hypothesis Testing Script**:
Replicates Hands-on Activity
• Random Assignments of Varieties to Plots
• Distribution of Sample Mean Differences Between Varieties
• Number of Rejections of Null Hypothesis of Equal Means
• Vary: alpha level (0.05, 0.01)
        true difference between Varieties A and B
Example Rejection Rates (α = 0.05)
• 99/100 – True Difference = 12
• 43/100 – True Difference = 6
• 13/100 – True Difference = 3
[Figure: histogram of 100 t-test statistics when true difference = 12 bushels; vertical axis Count, horizontal axis from 2 to 7]
99 out of 100 t-test statistics will reject Ho.
[Figure: histogram of 100 t-test statistics when true difference = 6 bushels; vertical axis Count, horizontal axis from 0 to 5]
43 out of 100 t-test statistics will reject Ho.
[Figure: histogram of 100 t-test statistics when true difference = 3 bushels; vertical axis Count, horizontal axis from -2 to 3]
13 out of 100 t-test statistics will reject Ho.
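A Python sketch of the same experiment is given below; it is not the JMP script. It assumes that each run assigns 18 of the 36 plots to Variety A at random, that Variety B yields the Variety A value minus the chosen true difference, and that the test is a two-sided two-sample t-test at α = 0.05 with a tabulated critical value of about 2.032 (34 degrees of freedom). The function name and seed are illustrative.

```python
import random
from statistics import mean, variance

# True Variety A yields for the 36 plots, row by row.
A_YIELDS = [
    130, 149, 139, 167, 149, 157,
    149, 133, 152, 143, 159, 148,
    141, 156, 137, 158, 144, 160,
    150, 142, 155, 148, 164, 145,
    139, 155, 139, 159, 149, 165,
    155, 138, 150, 149, 157, 148,
]
T_CRIT = 2.032   # two-sided 5% critical value, t with 34 df (tabulated)


def rejection_count(true_diff, reps=100, seed=0):
    """Count rejections of equal means over `reps` random assignments."""
    rng = random.Random(seed)
    plots = range(len(A_YIELDS))
    rejections = 0
    for _ in range(reps):
        # Randomly pick half of the plots to receive Variety A.
        a_plots = set(rng.sample(plots, len(A_YIELDS) // 2))
        a = [A_YIELDS[i] for i in plots if i in a_plots]
        b = [A_YIELDS[i] - true_diff for i in plots if i not in a_plots]
        se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
        if abs((mean(a) - mean(b)) / se) > T_CRIT:
            rejections += 1
    return rejections


for diff in (12, 6, 3):
    print(f"true difference = {diff:2d}: {rejection_count(diff)} of 100 tests reject Ho")
```

Counts will differ from the poster's example rates because the random assignments differ, but the pattern of fewer rejections for smaller true differences should be the same.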
* Original Activity Developed by W. Robert Stephenson and Hal Stern. See their article in STATS, Spring 2000, No. 28, 23-27.
** Programming Assistance provided by Mark Bailey, SAS Institute, Inc.