talk - School of Mathematics and Physics

Brownian Bridge and nonparametric
rank tests
Olena Kravchuk
School of Physical Sciences
Department of Mathematics
UQ
Lecture outline

Definition and important characteristics of the
Brownian bridge (BB)

Interesting measurable events on the BB

Asymptotic behaviour of rank statistics

Cramer-von Mises statistic

Small and large sample properties of rank statistics

Some applications of rank procedures

Useful references
Definition of Brownian bridge
Construction of the BB
Varying the coefficients of the bridge
Two useful properties
Ranks and anti-ranks
$$r_i(y) = \#\{\,y\text{'s} \le y_i\,\}, \qquad d_{r_i} = r_{d_i} = i$$
$$\sum_{i=1}^{N} c_i\, a(r_i) = \sum_{i=1}^{N} a_i\, c(d_i)$$
$$R_i = r_i(Y), \qquad D_i = d_i(Y), \qquad T(i) = \sum_{j=1}^{i} c(D_j)$$

             First sample     Second sample
Index        1    2    3      4    5    6
Data         5    7    0      3    1    4
Rank         5    6    1      3    2    4
Anti-rank    3    5    4      6    1    2
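The rank and anti-rank definitions can be checked on the six-observation example from the table. This is a minimal Python sketch, assuming there are no ties; the helper names `ranks` and `anti_ranks` are illustrative and do not come from the talk.

```python
def ranks(y):
    """r_i = number of observations less than or equal to y_i (no ties assumed)."""
    return [sum(1 for v in y if v <= yi) for yi in y]


def anti_ranks(y):
    """d_k = 1-based index of the k-th smallest observation."""
    return [i + 1 for i in sorted(range(len(y)), key=lambda i: y[i])]


data = [5, 7, 0, 3, 1, 4]   # pooled data from the slide
R = ranks(data)
D = anti_ranks(data)

# Ranks and anti-ranks are inverse permutations: d_{r_i} = r_{d_i} = i.
inverse_ok = all(D[R[i] - 1] == i + 1 and R[D[i] - 1] == i + 1
                 for i in range(len(data)))
print(R, D, inverse_ok)
```

The inverse relation is what allows a sum over observations, weighted by scores of ranks, to be rewritten as a sum over ordered positions, weighted by constants of anti-ranks.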
Simple linear rank statistic



Any simple linear rank statistic is a linear combination of the
scores, a’s, and the constants, c’s.
When the constants are standardised, the first moment is zero
and the second moment is expressed in terms of the scores.
The limiting distribution is normal because of a CLT.
$$S = \sum_{i=1}^{N} c_i\, a(R_i) = \sum_{i=1}^{N} a_i\, c(D_i).$$
$$\sum_{i=1}^{N} c_i = 0, \qquad \sum_{i=1}^{N} c_i^2 = 1, \qquad \frac{1}{N}\sum_{i=1}^{N} a_i = \bar{a},$$
$$E(S) = 0, \qquad \operatorname{var}(S) = \frac{1}{N-1}\sum_{i=1}^{N} (a_i - \bar{a})^2.$$
$$\frac{S}{\sqrt{\operatorname{var}(S)}} \xrightarrow{d} N(0,1)$$
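The two moment formulas can be verified exactly for a small pooled sample by enumerating every rank vector. The sketch below assumes that, under the null hypothesis, all N! rank vectors are equally likely; the Wilcoxon-type scores a(i) = i and the function name `permutation_moments` are illustrative choices, not taken from the talk.

```python
from itertools import permutations
from math import sqrt


def permutation_moments(c, a):
    """Exact mean and variance of S = sum_i c_i * a(R_i) over all N! rank vectors."""
    values = [sum(ci * a[r] for ci, r in zip(c, perm))
              for perm in permutations(range(len(a)))]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


m, n = 3, 3
N = m + n
scale = sqrt(m * n / N)
c = [scale / m] * m + [-scale / n] * n      # standardised: sum c = 0, sum c^2 = 1
a = [float(i) for i in range(1, N + 1)]     # illustrative scores a(i) = i

mean, var = permutation_moments(c, a)
a_bar = sum(a) / N
var_formula = sum((ai - a_bar) ** 2 for ai in a) / (N - 1)
print(mean, var, var_formula)
```

With N = 6 the enumeration covers only 720 permutations, so the check is instantaneous; for realistic sample sizes the closed-form variance is the practical route.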
Constrained random walk on pooled data




Combine all the observations from two samples into the pooled
sample, N=m+n.
Permute the vector of the constants according to the anti-ranks of
the observations and walk on the permuted constants, linearly
interpolating the walk Z between the steps.
Pin down the walk by normalizing the constants.
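This random bridge Z converges in distribution to the Brownian
bridge as the size of the smaller sample increases.
$$c_i = \begin{cases} \dfrac{1}{m}\sqrt{\dfrac{mn}{N}}, & i \le m, \\[2ex] -\dfrac{1}{n}\sqrt{\dfrac{mn}{N}}, & i > m; \end{cases} \qquad \sum_{i} c_i = 0, \qquad \sum_{i} c_i^2 = 1.$$
$$T_i = \sum_{j=1}^{i} c(D_j)$$
$$Z(i,t) = \begin{cases} T_i, & i = tN, \\ (i - tN)\,T_{i-1} - (i - 1 - tN)\,T_i, & tN < i < tN + 1. \end{cases}$$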
From real data to the random bridge
$$m = n = 3, \qquad c = 1/\sqrt{6} \approx 0.41$$

                First sample             Second sample
Index, i        1       2       3        4       5       6
Data, X         5       7       0        3       1       4
Constant, c     0.41    0.41    0.41     -0.41   -0.41   -0.41
Rank, R         5       6       1        3       2       4
Anti-rank, D    3       5       4        6       1       2
Bridge, Z       0.41    0       -0.41    -0.82   -0.41   0
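The bridge row of the table can be reproduced step by step: standardise the constants, order them by the anti-ranks, and accumulate. The following Python sketch assumes no ties in the pooled data; the function names are hypothetical, and `interpolate` implements the linear interpolation between steps with Z(0) = 0.

```python
from math import sqrt


def bridge_constants(m, n):
    """Constants with sum c = 0 and sum c^2 = 1 (first sample positive)."""
    s = sqrt(m * n / (m + n))
    return [s / m] * m + [-s / n] * n


def random_bridge(first, second):
    """Knot values T_i = sum_{j<=i} c(D_j) of the walk on the pooled sample."""
    pooled = list(first) + list(second)
    c = bridge_constants(len(first), len(second))
    order = sorted(range(len(pooled)), key=lambda i: pooled[i])  # 0-based anti-ranks
    T, total = [], 0.0
    for idx in order:
        total += c[idx]
        T.append(total)
    return T


def interpolate(T, t):
    """Z at time t in [0, 1], linear between the knots i/N."""
    N = len(T)
    knots = [0.0] + T
    x = t * N
    lo = min(int(x), N - 1)
    frac = x - lo
    return (1.0 - frac) * knots[lo] + frac * knots[lo + 1]


T = random_bridge([5, 7, 0], [3, 1, 4])
print([round(v, 2) for v in T])
print(interpolate(T, 0.25))   # halfway between T_1 and T_2
```

Because the constants sum to zero, the walk is pinned at zero at both ends, which is the defining feature it shares with the Brownian bridge.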
Symmetric distributions and the BB
$$f_{HSD}(y) = \frac{1}{\pi}\operatorname{sech}(y), \qquad f_L(y) = \frac{1}{4}\operatorname{sech}^2\!\left(\frac{y}{2}\right), \qquad f_N(y) = \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{y^2}{2}\right)$$
Random walk model: no difference in distributions
Location and scale alternatives
Random walk: location and scale alternatives
Shift = 2
Scale = 2
Simple linear rank statistic again



The simple linear rank statistic is expressed in terms of the
random bridge.
Although the small sample properties are investigated in the
usual manner, the large sample properties are governed by
the properties of the Brownian Bridge.
It is easy to visualise a linear rank statistic in such a way that
the shape of the bridge suggests a particular type of statistic.
$$S = \sum_{i=1}^{N} \frac{b_i}{N}\, Z_i = \sum_{i=1}^{N} c_i \sum_{j=R_i}^{N} \frac{b_j}{N} = \sum_{i=1}^{N} c_i\, a(R_i).$$
$$S \to \int_0^1 b(t)\,B(t)\,dt$$
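The equality between the bridge form and the rank form of S is a change in the order of summation, and it can be confirmed numerically. The sketch below reuses the six-observation example; the coefficients b_i are an arbitrary illustrative choice (sampled from a sine weight), and the scores are assumed to be defined from them as a(r) = (1/N) times the sum of b_j over j from r to N.

```python
from math import sqrt, sin, pi

data = [5, 7, 0, 3, 1, 4]            # pooled sample, first three = first sample
m, n = 3, 3
N = m + n
s = sqrt(m * n / N)
c = [s / m] * m + [-s / n] * n       # standardised constants

# Anti-ranks (0-based) and ranks (1-based) of the pooled data, no ties assumed.
order = sorted(range(N), key=lambda i: data[i])
R = [0] * N
for position, idx in enumerate(order, start=1):
    R[idx] = position

# Bridge knots Z_i = T_i = sum_{j<=i} c(D_j).
Z, total = [], 0.0
for idx in order:
    total += c[idx]
    Z.append(total)

# Illustrative coefficients b_i and the scores they induce.
b = [pi * sqrt(2) * sin(pi * i / N) for i in range(1, N + 1)]


def a(r):
    """Score of rank r: a(r) = (1/N) * sum_{j=r}^{N} b_j."""
    return sum(b[j - 1] for j in range(r, N + 1)) / N


S_bridge = sum(b[i] * Z[i] for i in range(N)) / N
S_rank = sum(c[i] * a(R[i]) for i in range(N))
print(S_bridge, S_rank)
```

The two numbers agree up to floating-point rounding for any choice of b, which is why the shape of b(t) alone determines what feature of the bridge a given rank statistic measures.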
Trigonometric scores rank statistics

The Cramer-von Mises statistic
$$W^2 = \frac{mn}{(m+n)^2}\sum_{i=1}^{N} \Delta_i^2, \qquad \Delta_i = \frac{n_i}{n} - \frac{m_i}{m}.$$
$$W^2 = \frac{1}{N}\sum_{i=1}^{N} T_i^2, \qquad W^2 \to \int_0^1 B^2(t)\,dt = \sum_{i=1}^{\infty} \frac{X_i^2}{i^2\pi^2}, \qquad X_i \sim N(0,1).$$
The first and second Fourier coefficients:
$$S_1 = \frac{\pi\sqrt{2}}{N}\sum_{i=1}^{N} \sin\!\left(\frac{\pi i}{N}\right) T_i, \qquad S_1 \to \pi\sqrt{2}\int_0^1 \sin(\pi t)\,B(t)\,dt$$
$$S_2 = \frac{2\pi\sqrt{2}}{N}\sum_{i=1}^{N} \sin\!\left(\frac{2\pi i}{N}\right) T_i, \qquad S_2 \to 2\pi\sqrt{2}\int_0^1 \sin(2\pi t)\,B(t)\,dt$$
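The two expressions for the Cramer-von Mises statistic, one through counts in each sample and one through the bridge, can be compared on the small example used earlier in the talk. The Python sketch below assumes no ties; the function names are hypothetical, and the discrete normalisation k·π·√2/N in `fourier_component` is an assumption chosen so that the sum is a Riemann approximation of the stated limit.

```python
from math import sqrt, sin, pi


def bridge_knots(first, second):
    """T_i = sum_{j<=i} c(D_j) for the pooled sample (no ties assumed)."""
    m, n = len(first), len(second)
    s = sqrt(m * n / (m + n))
    c = [s / m] * m + [-s / n] * n
    pooled = list(first) + list(second)
    T, total = [], 0.0
    for idx in sorted(range(m + n), key=lambda i: pooled[i]):
        total += c[idx]
        T.append(total)
    return T


def cvm_from_bridge(T):
    """W^2 = (1/N) * sum T_i^2."""
    return sum(t * t for t in T) / len(T)


def cvm_from_counts(first, second):
    """W^2 = mn/(m+n)^2 * sum (n_i/n - m_i/m)^2 over the pooled order statistics."""
    m, n = len(first), len(second)
    total = 0.0
    for z in sorted(list(first) + list(second)):
        m_i = sum(1 for x in first if x <= z)
        n_i = sum(1 for y in second if y <= z)
        total += (n_i / n - m_i / m) ** 2
    return m * n / (m + n) ** 2 * total


def fourier_component(T, k):
    """k-th trigonometric component of the bridge (assumed normalisation)."""
    N = len(T)
    return k * pi * sqrt(2) / N * sum(sin(k * pi * i / N) * T[i - 1]
                                      for i in range(1, N + 1))


first, second = [5, 7, 0], [3, 1, 4]
T = bridge_knots(first, second)
W2_bridge = cvm_from_bridge(T)
W2_counts = cvm_from_counts(first, second)
S1 = fourier_component(T, 1)
S2 = fourier_component(T, 2)
print(W2_bridge, W2_counts, S1, S2)
```

For this sample both forms of W² equal 7/36, since T_i is the scaled difference of the two empirical distribution functions evaluated at the i-th pooled order statistic.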
Combined trigonometric scores rank statistics

The first and second coefficients are uncorrelated
$$S_{12}^2 = S_1^2 + S_2^2$$

Fast convergence to the asymptotic distribution
$$S_{12}^2 \sim \chi^2_2$$
The Lepage test is a common test of the combined alternative ($S_W$
is the Wilcoxon statistic and $S_{A-B}$ is the Ansari-Bradley statistic,
an adapted Wilcoxon statistic)

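Under the chi-square limit with two degrees of freedom, the upper-tail probability has the closed form exp(-x/2), so an asymptotic p-value for the combined statistic needs no tables. The sketch below is only the large-sample approximation; the input values 1.5 and -2.0 are hypothetical and are not results from the talk.

```python
from math import exp


def combined_statistic(s1, s2):
    """Sum of squares of the first two trigonometric components."""
    return s1 ** 2 + s2 ** 2


def chi2_2df_pvalue(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return exp(-x / 2.0)


stat = combined_statistic(1.5, -2.0)   # hypothetical component values
p_value = chi2_2df_pvalue(stat)
print(stat, p_value)
```

For small samples the exact permutation distribution of the components would be the safer reference, since the closed form above relies on the asymptotic result.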
Percentage points for the first component
(one-sample)
Durbin and Knott – Components of Cramer-von Mises Statistics
Percentage points for the first component
(two-sample)
Kravchuk – Rank test of location optimal for HSD
Some tests of location
Trigonometric scores rank estimators
Location estimator of the HSD (Vaughan)
Scale estimator of the Cauchy distribution (Rublik)
Trigonometric scores rank estimator (Kravchuk)
Optimal linear rank test



An optimal test of location may be found in the class of simple
linear rank tests by an appropriate choice of the score function, a.
Assume that the score function is differentiable.
An optimal test statistic may be constructed by selecting the
coefficients, b’s.
$$\varphi(u, f) = -\frac{f'(F^{-1}(u))}{f(F^{-1}(u))}, \qquad u \in [0,1]$$
$$\lim_{N \to \infty} a(1 + [uN]) = \varphi(u), \qquad b = -\frac{d\varphi}{du}.$$
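To see what the score function looks like for the three symmetric densities introduced earlier in the talk, it can be evaluated directly from -f'/f at the quantile. The Python sketch below assumes the standard forms (1/π)·sech(y) for the HSD, (1/4)·sech²(y/2) for the logistic and the standard normal; the function names are illustrative. The closed forms it checks against, -cos(πu), 2u - 1 and the normal quantile, follow from elementary calculus.

```python
import math
from statistics import NormalDist


def score_hsd(u):
    """HSD: -f'/f = tanh(y), F^{-1}(u) = log(tan(pi * u / 2))."""
    return math.tanh(math.log(math.tan(math.pi * u / 2.0)))


def score_logistic(u):
    """Logistic: -f'/f = tanh(y / 2), F^{-1}(u) = log(u / (1 - u))."""
    return math.tanh(0.5 * math.log(u / (1.0 - u)))


def score_normal(u):
    """Normal: -f'/f = y, F^{-1}(u) = standard normal quantile."""
    return NormalDist().inv_cdf(u)


grid = [0.1, 0.3, 0.5, 0.7, 0.9]
hsd_gap = max(abs(score_hsd(u) + math.cos(math.pi * u)) for u in grid)
logistic_gap = max(abs(score_logistic(u) - (2.0 * u - 1.0)) for u in grid)
print(hsd_gap, logistic_gap, score_normal(0.5))
```

The HSD score -cos(πu) has derivative proportional to sin(πu), which is, up to sign and a normalising constant, the weight in the first trigonometric component; the logistic score 2u - 1 is the familiar Wilcoxon score.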
Functionals on the bridge

When the score function is defined and differentiable, it is easy
to derive the corresponding functional.
$$\psi(t) = \int_0^t b(x)\,dx,$$
$$\int_0^1 b(t)\,B(t)\,dt \sim N(0, \sigma^2), \qquad \sigma^2 = \int_0^1 \psi^2(t)\,dt - \left(\int_0^1 \psi(t)\,dt\right)^2$$
$$\int_0^1 B(t)\,\delta(t - 0.5)\,dt = B(0.5) \sim N(0, 1/4)$$
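As a worked check of the variance formula for a linear functional of the bridge, take the weight of the first trigonometric component, $b(t) = \pi\sqrt{2}\sin(\pi t)$. The symbol $\psi$ below stands for the integral of $b$ from 0 to $t$; this notation is assumed here for the sketch.

```latex
\psi(t) = \int_0^t \pi\sqrt{2}\,\sin(\pi x)\,dx = \sqrt{2}\,\bigl(1 - \cos \pi t\bigr),

\int_0^1 \psi^2(t)\,dt
  = 2\int_0^1 \bigl(1 - 2\cos \pi t + \cos^2 \pi t\bigr)\,dt
  = 2\left(1 - 0 + \tfrac{1}{2}\right) = 3,
\qquad
\int_0^1 \psi(t)\,dt = \sqrt{2},

\sigma^2 = 3 - \bigl(\sqrt{2}\bigr)^2 = 1 .
```

The limit of the first component is therefore standard normal, which agrees with the chi-square limit with two degrees of freedom quoted for the sum of squares of the first two components. The same recipe applied to the point mass at 0.5 gives 1/2 - 1/4 = 1/4, the variance of B(0.5).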
Result 4: trigonometric scores estimators

Efficient location estimator for the HSD

Efficient scale estimator for the Cauchy distribution

Easy to establish exact confidence level

Easy to encode into automatic procedures
Numerical examples: test of location
[Figure: boxplots of Normal 1 and Normal 2 (means are indicated by solid circles)]
1. Normal, $N(500, 100^2)$
2. Normal, $N(580, 100^2)$

            t-test            Wilcoxon          S1
p-value     0.150             0.162             0.154
CI95%       (-172.4, 28.6)    (-185.0, 25.0)    (-183.0, 25.0)
Numerical examples: test of scale
[Figure: boxplots of Normals1 and Normals2]
1. Normal, $N(300, 200^2)$
2. Normal, $N(300, 100^2)$

            F-test    Siegel-Tukey    S2
p-value     0.123     0.064           0.054
Numerical examples: combined test
[Figure: boxplots of Normalc1 and Normalc2]
1. Normal, $N(580, 200^2)$
2. Normal, $N(500, 100^2)$

            F-test    t-test    $S_1^2 + S_2^2$    Lepage    CM
p-value     0.021     0.174     0.018              0.035     0.010
Application: palette-based images
When two colour histograms are compared, nonparametric tests are
required, as a priori knowledge about the colour probability
distribution is generally not available.
A difficulty arises when statistical tests are applied to colour
images: should colour distributions be treated as continuous,
discrete or categorical?
Application: grey-scale images
Application: grey-scale images, histograms
Application: colour images
Useful books
1. H. Cramer. Mathematical Methods of Statistics. Princeton
University Press, Princeton, 19th edition, 1999.
2. G. Grimmett and D. Stirzaker. Probability and Random
Processes. Oxford University Press, N.Y., 1982.
3. J. Hajek, Z. Sidak and P.K. Sen. Theory of Rank Tests.
Academic Press, San Diego, California, 1999.
4. F. Knight. Essentials of Brownian Motion and Diffusion. AMS,
Providence, R.I., 1981.
5. K. Knight. Mathematical Statistics. Chapman & Hall, Boca
Raton, 2000.
6. J. Maritz. Distribution-free Statistical Methods. Monographs on
Applied Probability and Statistics. Chapman & Hall, London,
1981.
Interesting papers
1. J. Durbin and M. Knott. Components of Cramer – von Mises statistics.
Part 1. Journal of the Royal Statistical Society, Series B, 1972.
2. K.M. Hanson and D.R. Wolf. Estimators for the Cauchy distribution.
In G.R. Heidbreder, editor, Maximum entropy and Bayesian methods,
Kluwer Academic Publisher, Netherlands, 1996.
3. N. Henze and Ya.Yu. Nikitin. Two-sample tests based on the integrated
empirical processes. Communications in Statistics – Theory and
Methods, 2003.
4. A. Janssen. Testing nonparametric statistical functionals with application
to rank tests. Journal of Statistical Planning and Inference, 1999.
5. F. Rublik. A quantile goodness-of-fit test for the Cauchy distribution, based
on extreme order statistics. Applications of Mathematics, 2001.
6. D.C. Vaughan. The generalized secant hyperbolic distribution and
its properties. Communications in Statistics – Theory and Methods, 2002.