Project - plaza - University of Florida

Frank Wang
November 16th, 2006
STA3032 Project
A)
[Figure: Probability Plot of C1 (Normal, 95% CI). x-axis: C1; y-axis: Percent.
Legend: Mean 0.08395, StDev 0.006964, N 10, AD 1.969, P-Value <0.005.]
[Figure: Histogram of C1. x-axis: C1; y-axis: Frequency.]
The distribution of the data is not symmetric. It is skewed to the right, and the data do
not follow a normal distribution (Anderson-Darling P-value < 0.005).
B)
The mean of the data is 0.08395 and the median is 0.08165. The standard deviation is
0.00696. The first quartile is 0.08070 and the third quartile is 0.08330.
Descriptive Statistics: C1
Variable  Mean     StDev    Q1       Median   Q3       IQR
C1        0.08395  0.00696  0.08070  0.08165  0.08330  0.00260
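The summary values above can be cross-checked with a few lines of arithmetic. The sketch below only uses the numbers reported by Minitab (the raw C1 values are not listed in this report); the variable names are illustrative.

```python
# Summary statistics for C1 as reported by Minitab.
mean, median = 0.08395, 0.08165
q1, q3 = 0.08070, 0.08330

# The interquartile range is the distance between the quartiles.
iqr = q3 - q1
print(f"IQR = {iqr:.5f}")

# Two informal signs of right skew: the mean lies above the median,
# and the upper half of the box (Q3 - median) is wider than the lower half.
upper_half = q3 - median
lower_half = median - q1
print("mean > median:", mean > median)
print("upper half wider than lower half:", upper_half > lower_half)
```

Both checks point the same way as the histogram: the data lean to the right.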
C)
The analyst used four methods to find 95% confidence intervals: the t distribution, the
bootstrap confidence interval for the mean, the Wilcoxon signed rank confidence interval,
and the bootstrap confidence interval for the median.
T distribution confidence interval
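This interval is centered on the sample mean: $\bar{x} \pm t_{1-\alpha/2,\,n-1}\,(s/\sqrt{n})$,
where $s$ is the sample standard deviation (StDev), $s/\sqrt{n}$ is the standard error, and
$t_{1-\alpha/2,\,n-1}\,(s/\sqrt{n})$ is the margin of error. Because the sample size is less
than 30, the t distribution is used instead of the normal distribution; the method assumes
the data come from an approximately normal population.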
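As a check on the formula, the interval can be recomputed from the summary statistics in the Minitab output. This is a minimal sketch: the critical value 2.262157 (t with 9 degrees of freedom, upper 2.5% point) is taken from a standard t table and hard-coded as an assumption, and the function name is illustrative.

```python
import math

# Summary statistics reported by Minitab for C1.
n = 10
mean = 0.08395
stdev = 0.00696360

# Assumption: two-sided 95% critical value of t with n - 1 = 9 degrees of
# freedom, read from a t table rather than computed here.
T_CRIT_975_DF9 = 2.262157

def t_interval(mean, stdev, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * stdev / sqrt(n)."""
    standard_error = stdev / math.sqrt(n)
    margin = t_crit * standard_error
    return mean - margin, mean + margin

lower, upper = t_interval(mean, stdev, n, T_CRIT_975_DF9)
print(f"({lower:.6f}, {upper:.6f})")
```

The result should agree closely with the interval (0.078969, 0.088931) in the One-Sample T output.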
Bootstrap confidence interval for mean
This interval is centered on the mean. The analyst believes the sample distribution is not
normal and therefore chose this method to find the 95% confidence interval for the mean.
The analyst generated 700 bootstrap samples, each made of 10 values drawn at random with
replacement from the original sample. The analyst then calculated the mean of each
bootstrap sample and arranged the 700 means in increasing order. The 2.5th percentile is
at position (0.025)*(700+1), and the 97.5th percentile is at position (0.975)*(700+1).
This interval is (0.08108, 0.08285). This method makes no assumption about the shape of
the distribution.
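The resampling steps above can be sketched in a few lines. Because the raw C1 values are not listed in this report, the sketch is illustrated with the ten heights from the worked example in the Wilcoxon section. Two points are assumptions: fractional positions such as 0.025*(700+1) = 17.525 are rounded to the nearest integer, and the fixed seed only makes the run repeatable. The function name and parameters are illustrative, not Minitab's.

```python
import random
import statistics

def bootstrap_percentile_ci(sample, stat=statistics.mean,
                            n_boot=700, level=0.95, seed=0):
    """Percentile bootstrap interval for `stat` of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Draw n_boot resamples of size n with replacement and sort the
    # statistic computed on each one.
    values = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    alpha = 1 - level
    # 1-based positions (alpha/2)*(B+1) and (1-alpha/2)*(B+1).
    # Assumption: round to the nearest integer, clamped to 1..B.
    lo_pos = min(max(round((alpha / 2) * (n_boot + 1)), 1), n_boot)
    hi_pos = min(max(round((1 - alpha / 2) * (n_boot + 1)), 1), n_boot)
    return values[lo_pos - 1], values[hi_pos - 1]

# Heights from the worked example in this report (not the C1 data).
heights = [60, 62, 62.5, 62.5, 63, 63.5, 64, 69, 70, 72]
mean_lo, mean_hi = bootstrap_percentile_ci(heights)
print("bootstrap 95% CI for the mean:", (mean_lo, mean_hi))
```

Passing `stat=statistics.median` gives the corresponding interval for the median with no other change.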
Wilcoxon Signed Rank confidence interval
Another type of confidence interval for the mean can be computed if the original
distribution is not believed to be normal but is symmetric. Since the median equals the
mean for a symmetric distribution, a confidence interval for the mean is also a
confidence interval for the median. Suppose that we have sampled ten people and asked
their heights, and the data are 60, 62, 62.5, 62.5, 63, 63.5, 64, 69, 70 and 72. To
compute the Wilcoxon signed rank confidence interval, take the average of every pair of
heights in the data set, including the average of each height with itself. For instance,
average 60 with itself, then 60 and 62, then 60 and 62.5, and so on until every possible
average has been taken. Then put the paired averages in increasing order. A table and a
formula in the book determine which positions give the lower and upper limits. For the
project data, Minitab computed the limits. This interval is (0.0808, 0.0920), with an
achieved confidence of 94.7%. This method assumes the data are symmetric, and the data do
not meet this assumption.
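The pairwise-average procedure can be sketched for the ten heights in the example. The book's table and formula are not reproduced in this report, so the position rule used here is an assumption: the lower limit is the k-th smallest average, where k - 1 is the largest value c with P(T+ <= c) <= 0.025 under the exact signed rank distribution, and the upper limit is the k-th largest. Minitab may choose slightly different positions, as its reported achieved confidence of 94.7% suggests. All function names are illustrative.

```python
from itertools import combinations_with_replacement

def walsh_averages(data):
    """All pairwise averages, each value paired with itself too, sorted."""
    return sorted((a + b) / 2
                  for a, b in combinations_with_replacement(data, 2))

def lower_position(n, alpha=0.05):
    """1-based position of the lower limit among the sorted averages."""
    total = n * (n + 1) // 2
    # counts[s] = number of subsets of the ranks 1..n that sum to s,
    # which gives the exact null distribution of the signed rank statistic.
    counts = [1] + [0] * total
    for rank in range(1, n + 1):
        for s in range(total, rank - 1, -1):
            counts[s] += counts[s - rank]
    cumulative, c = 0, -1
    for s in range(total + 1):
        cumulative += counts[s]
        if cumulative / 2 ** n > alpha / 2:
            break
        c = s
    if c < 0:
        raise ValueError("sample too small for this confidence level")
    return c + 1

def wilcoxon_ci(data, alpha=0.05):
    """Interval from the k-th smallest to the k-th largest average."""
    averages = walsh_averages(data)
    k = lower_position(len(data), alpha)
    return averages[k - 1], averages[len(averages) - k]

heights = [60, 62, 62.5, 62.5, 63, 63.5, 64, 69, 70, 72]
averages = walsh_averages(heights)
k = lower_position(len(heights))
low, high = wilcoxon_ci(heights)
print(len(averages), "averages; limits at positions", k, "and",
      len(averages) + 1 - k, "->", (low, high))
```

With ten observations there are 10*11/2 = 55 averages, so the interval runs between two of those 55 ordered values.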
Bootstrap confidence interval for median
The analyst used the same procedure as for the bootstrap confidence interval for the mean
to generate 700 bootstrap samples from the original data. The analyst then arranged the
median of each bootstrap sample in increasing order. The 2.5th percentile is at position
(0.025)*(700+1), and the 97.5th percentile is at position (0.975)*(700+1). This interval
is (0.08070, 0.08260). This method makes no assumption about the shape of the
distribution.
D)
The t distribution 95% confidence interval is (0.078969, 0.088931). We are 95% confident
that the true mean is within this interval. However, the data are neither normal nor
symmetric, so this interval may not be accurate.
The bootstrap 95% confidence interval for the mean is (0.08108, 0.08285). We are 95%
confident that the true mean is within this interval. Because this method makes no
assumption about the shape of the distribution, the stated confidence level is credible.
The Wilcoxon signed rank confidence interval for the mean and median is (0.0808, 0.0920).
It is nominally a 95% interval for the center of a symmetric distribution. Because the
symmetry assumption is not met, the analyst does not rely on this interval to contain the
mean or the median.
The bootstrap 95% confidence interval for the median is (0.08070, 0.08260). We are 95%
confident that the true median is within this interval. Because this method makes no
assumption about the shape of the distribution, the stated confidence level is credible.
E)
The median is a better measure of center than the mean because the data are skewed to the
right and not normal.
F)
The bootstrap 95% confidence interval for the median should be used as the interval for
the center of the data, because the t and Wilcoxon methods do not meet their assumptions
and because the median is the better measure of center.
Appendix
————— 11/14/2006 2:14:47 PM ————————————————————
Welcome to Minitab, press F1 for help.
MTB > Save "C:\Documents and Settings\Frank Wang\My Documents\School\University of Florida\STA3032\MINITAB.MPJ";
SUBC>   Project;
SUBC>   Replace.
————— 11/14/2006 2:22:50 PM ————————————————————
Welcome to Minitab, press F1 for help.
Retrieving project from file: 'C:\Documents and Settings\Frank Wang\My Documents\School\University of Florida\STA3032\MINITAB.MPJ'
MTB > Mean C1.
Mean of C1
Mean of C1 = 0.08395
MTB > Median C1.
Median of C1
Median of C1 = 0.08165
MTB > StDev C1.
Standard Deviation of C1
Standard deviation of C1 = 0.00696360
MTB > Describe C1;
SUBC>   Mean;
SUBC>   StDeviation;
SUBC>   QOne;
SUBC>   Median;
SUBC>   QThree;
SUBC>   IQRange.
Descriptive Statistics: C1
Variable  Mean     StDev    Q1       Median   Q3       IQR
C1        0.08395  0.00696  0.08070  0.08165  0.08330  0.00260
MTB > PPlot C1;
SUBC>   Normal;
SUBC>   Symbol;
SUBC>   FitD;
SUBC>   Grid 2;
SUBC>   Grid 1;
SUBC>   MGrid 1.
Probability Plot of C1
MTB > Histogram C1;
SUBC>   Bar.
Histogram of C1
MTB > Onet C1.
One-Sample T: C1
Variable  N   Mean      StDev     SE Mean   95% CI
C1        10  0.083950  0.006964  0.002202  (0.078969, 0.088931)
MTB > Random 700 C3-C12;
SUBC>   Discrete C1 C2.
MTB > RMean C3-C12 C13.
MTB > Sort 'Mean' 'Mean';
SUBC>   By 'Mean'.
MTB > RMedian C3-C12 'Median'.
MTB > Sort C14 C14;
SUBC>   By C14.
MTB > WInterval 95.0 C1.
Wilcoxon Signed Rank CI: C1
                                  Confidence Interval
    N   Estimated Median  Achieved Confidence  Lower   Upper
C1  10  0.0818            94.7                 0.0808  0.0920
MTB > Save "C:\Documents and Settings\Frank Wang\My Documents\School\University of Florida\STA3032\MINITAB.MPJ";
SUBC>   Project;
SUBC>   Replace.
————— 11/16/2006 9:31:12 PM ————————————————————
Welcome to Minitab, press F1 for help.
Retrieving project from file: 'C:\Documents and Settings\Frank Wang\My Documents\School\University of Florida\STA3032\MINITAB.MPJ'
MTB >