nonparametric confidence interval for the median

MEDIANCIBOOT
This macro is designed to calculate bootstrap confidence intervals for a population median.
RUNNING THE MACRO
Calling statement
medianciboot c1 ;
siglev k1 (95) ;
nboot k1 (2000);
medians c1 ;
quantiles c1-c3 ;
tvalues c1.
Input
Input to the macro must be a single column, containing only numerical values. Discrete or continuous data
are allowed. Missing data is allowed.
Subcommands
siglev
The confidence level of the interval, expressed as a percentage (the macro calls this
the significance level). The default is 95, giving a 95% confidence interval; other
standard choices are 90, 98 or 99.
nboot
The number of bootstrap samples used. The default is 2000. Using fewer than 1000 is
not recommended when constructing confidence intervals.
medians
Specify a column in which to store bootstrap sample medians.
quantiles
Specify three columns in which to store ranks corresponding to the lower and upper
confidence interval limits, for the standard percentile method (column 1), the BC method
(column 2) and the BCa method (column 3).
tvalues
Specify a column in which to store bootstrap sample t-statistics.
Output
• Basic information (number of data points, significance level, number of bootstrap samples)
• Sample median, with standard error (assuming a normal distribution)
• Bootstrap standard deviation about the estimated median
• Estimated bias correction (for BC and BCa methods)
• Estimated acceleration (for BCa method)
• Standard nonparametric confidence interval for the median
• Bootstrap confidence intervals using : Estimate -/+ 1.96*bootstrap standard deviation, Bootstrap-t
method, Efron percentile method, Hall percentile method, BC method, BCa method.
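The simpler intervals in this list can be sketched in a few lines of Python. This is an illustration only, not the macro's Minitab code: the function names, the fixed seed, the rank rule used for the percentile limits and the divisor used for the bootstrap standard deviation are all assumptions, and the macro may handle ranks and ties differently.

```python
import math
import random
import statistics


def bootstrap_medians(data, nboot=2000, seed=0):
    """Return nboot medians, each from a resample of data drawn with replacement."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    n = len(data)
    return [statistics.median(rng.choices(data, k=n)) for _ in range(nboot)]


def simple_bootstrap_intervals(data, level=95, nboot=2000, seed=0):
    """Normal-approximation, Efron percentile and Hall percentile intervals."""
    estimate = statistics.median(data)
    meds = sorted(bootstrap_medians(data, nboot, seed))

    # Spread of the bootstrap medians about the sample median
    # (assumed divisor: nboot).
    boot_sd = math.sqrt(sum((m - estimate) ** 2 for m in meds) / nboot)

    alpha = 1 - level / 100
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%

    # Assumed rank rule: e.g. ranks 25 and 976 out of 1000 for a 95% interval.
    lo_rank = max(1, round(nboot * alpha / 2))
    hi_rank = min(nboot, nboot - lo_rank + 1)
    efron = (meds[lo_rank - 1], meds[hi_rank - 1])

    # Hall's method reflects the Efron limits about the estimate.
    hall = (2 * estimate - efron[1], 2 * estimate - efron[0])
    normal = (estimate - z * boot_sd, estimate + z * boot_sd)
    return {"estimate": estimate, "boot_sd": boot_sd,
            "normal": normal, "efron": efron, "hall": hall}
```

Calling simple_bootstrap_intervals(values, level=95, nboot=1000) on a list of numbers returns the three intervals in one dictionary; the bootstrap-t, BC and BCa intervals need extra quantities and are not covered by this sketch.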
Speed of macro : fast
ALTERNATIVE METHODS : Standard methods
sinterval 95 c1.
produces three different 95% nonparametric confidence intervals for the median.
The first and third intervals are based upon exact ranks, and have exact achieved confidence levels.
These confidence levels will not, in general, be equal to 95% : the first interval has the achieved
level closest to 95% from below, the third interval the achieved level closest to 95% from above.
Hence, the 3rd procedure is conservative, the 1st anti-conservative.
The 2nd interval is an approximate confidence interval based upon interpolation.
In the output to our macro, we include the conservative nonparametric confidence interval.
The construction is discussed in "technical details".
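The exact achieved confidence levels follow from the binomial distribution: each observation falls below the median with probability 1/2, so the interval running from the d-th smallest to the d-th largest value misses the median only when at most d - 1 observations lie on one side of it. The Python sketch below (hypothetical function names, continuous data assumed) reproduces the achieved levels 0.8847 and 0.9586 reported for n = 20 in the worked example.

```python
from math import comb


def achieved_confidence(n, d):
    """Exact coverage of the interval from the d-th smallest to the d-th largest value."""
    # One-sided miss probability: P(Binomial(n, 1/2) <= d - 1).
    tail = sum(comb(n, k) for k in range(d)) / 2 ** n
    return 1 - 2 * tail


def bracketing_ranks(n, level=95):
    """Return (d_below, d_above): ranks whose coverage lies just below and just above level.

    Assumes n is large enough that rank 1 already reaches the requested level.
    """
    target = level / 100
    d = 1
    # Coverage falls as d grows, so step inward while the next rank still reaches the target.
    while d + 1 <= n // 2 and achieved_confidence(n, d + 1) >= target:
        d += 1
    return d + 1, d
```

For n = 20, bracketing_ranks(20) gives ranks 7 and 6, matching the positions shown for the first and third intervals in the sinterval output.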
TECHNICAL DETAILS : nonparametric confidence interval for the median
The nonparametric confidence interval for the median is formed by finding the rank order of the lower and
upper limits using :
Lower limit : (n + 1)/2 - (0.9789 * sqrt(n)), rounded down to the nearest integer
Upper limit : (n + 1)/2 + (0.9789 * sqrt(n)), rounded up to the nearest integer,
where n is the sample size.
[Notes: 1. for n > 283, we use 0.9800 instead of 0.9789
2. for n = 17, the formula provides rank values of 4 and 14, but we use 5 and 13.
3. for n = 67, the formula provides rank values of 25 and 43, but we use 26 and 42].
The data points corresponding to these rank orders then form the confidence interval.
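As a rough Python sketch of this rule (not the macro's own code; the function names are hypothetical), the upper rank is taken as (n + 1)/2 + 0.9789 * sqrt(n), which is the form that reproduces the values 4 and 14 quoted in the notes for n = 17. The sketch assumes n is large enough for both ranks to fall between 1 and n.

```python
import math


def median_ci_ranks(n):
    """Lower and upper rank orders for the approximate 95% interval."""
    # Exceptions quoted in the notes.
    if n == 17:
        return 5, 13
    if n == 67:
        return 26, 42
    c = 0.9800 if n > 283 else 0.9789
    half_width = c * math.sqrt(n)
    lower = math.floor((n + 1) / 2 - half_width)
    upper = math.ceil((n + 1) / 2 + half_width)
    return lower, upper


def median_ci(data):
    """Interval formed by the data points at the two rank orders (1-based ranks)."""
    ordered = sorted(data)
    lower, upper = median_ci_ranks(len(ordered))
    if lower < 1 or upper > len(ordered):
        raise ValueError("sample too small for this approximation")
    return ordered[lower - 1], ordered[upper - 1]
```

For n = 20 the ranks are 6 and 15, so the interval runs from the 6th to the 15th ordered value.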
Details of these procedures can be found at :
http://www.umanitoba.ca/centres/mchpe/concept/dict/Statistics/ci_median/
http://www.maths.unb.ca/~knight/utility/MedInt95.htm.
REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 3).
EFRON, B. & TIBSHIRANI, R.J. (1993) An introduction to the bootstrap, Chapman and Hall, London
(Chapters 12-14).
WORKED EXAMPLE FOR MEDIANCIBOOT
Data
EXPONENTIAL (see MEANCIBOOT)
Aims of analysis
To create confidence intervals for the population median.
Standard procedure : Sign confidence interval
MTB > SInterval 95.0 c1.
Sign CI: C1
Sign confidence interval for median
                    Achieved
     N  Median  Confidence  Confidence interval  Position
C1  20   0.705      0.8847  ( 0.500, 1.200)             7
                    0.9500  ( 0.416, 1.208)           NLI
                    0.9586  ( 0.390, 1.210)             6
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Exponential.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Exponential.MTW
# Worksheet was saved on 23/08/01 12:16:52
Results for: Exponential.MTW
MTB > % N:\resampling\library\medianciboot c1 ;
SUBC> siglev 95 ;
SUBC> nboot 1000 ;
SUBC> medians c3 ;
SUBC> quantiles c5-c7 ;
SUBC> tvalues c9.
Executing from file: N:\resampling\library\medianciboot.MAC
General information
Data Display (WRITE)
Number of data values 20
Median of data values 0.70500
Standard error of the median 0.29697
Significance level for confidence intervals 95
Number of bootstrap samples 1000
Bootstrap standard deviation 0.2019
Estimated bias-correction (for BC, BCa) -0.1156
Estimated acceleration (for BCa) -0.0000
Confidence limits
Data Display (WRITE)
Standard non-parametric method   0.3900   1.210
Estimate -/+ 1.96*boot sd        0.3093   1.101
Bootstrap-t method               0.1550   1.082
Efron percentile method          0.4450   1.205
Hall percentile method           0.2050   0.9650
BC percentile method             0.3850   1.200
BCa percentile method            0.3850   1.200
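Two of the printed intervals can be checked by hand from the other figures in the output. The short Python check below takes 0.705 as the estimate, 0.2019 as the bootstrap standard deviation, and 0.4450 and 1.205 as the Efron percentile limits. It is only a consistency check on the printed numbers: the Hall limits happen to agree with reflecting the Efron limits about the estimate, which suggests (but does not prove) that the macro computes them this way.

```python
estimate = 0.705          # median of data values
boot_sd = 0.2019          # bootstrap standard deviation
efron = (0.4450, 1.205)   # Efron percentile limits as printed

# Estimate -/+ 1.96 * bootstrap standard deviation
normal = (estimate - 1.96 * boot_sd, estimate + 1.96 * boot_sd)

# Hall percentile limits: Efron limits reflected about the estimate
hall = (2 * estimate - efron[1], 2 * estimate - efron[0])

print(tuple(round(x, 4) for x in normal))  # (0.3093, 1.1007)
print(tuple(round(x, 4) for x in hall))    # (0.205, 0.965)
```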
Modified worksheet
C3
A column containing 1000 sample medians, one for each bootstrap resample
C5
Upper and lower rank positions for percentile confidence limits using the Efron method
C6
Upper and lower rank positions for percentile confidence limits using the BC method
C7
Upper and lower rank positions for percentile confidence limits using the BCa method
C9
A column containing 1000 t-statistics for sample medians, one for each bootstrap resample
Columns c5 - c7 each contain 2 values.
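For the BC and BCa methods, the rank positions come from percentile levels adjusted by the bias correction and the acceleration. The Python sketch below uses the standard adjustment from Efron & Tibshirani (1993); the function names and the rounding rule that turns a level into a rank are assumptions, and the macro may round differently.

```python
from statistics import NormalDist

_norm = NormalDist()


def bca_levels(z0, accel=0.0, level=95):
    """Adjusted lower and upper percentile levels for BC (accel = 0) and BCa."""
    alpha = 1 - level / 100
    levels = []
    for p in (alpha / 2, 1 - alpha / 2):
        z = _norm.inv_cdf(p)
        levels.append(_norm.cdf(z0 + (z0 + z) / (1 - accel * (z0 + z))))
    return tuple(levels)


def levels_to_ranks(levels, nboot):
    """Assumed conversion of percentile levels to 1-based ranks in the sorted medians."""
    return tuple(min(nboot, max(1, round(p * nboot))) for p in levels)
```

With no bias correction and no acceleration the levels reduce to 0.025 and 0.975, the Efron percentile method. With the bias correction of -0.1156 and zero acceleration reported above, both levels shift downwards, to about 0.014 and 0.958.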