Brownian Bridge and nonparametric rank tests

Olena Kravchuk
School of Physical Sciences, Department of Mathematics, UQ

Lecture outline
- Definition and important characteristics of the Brownian bridge (BB)
- Interesting measurable events on the BB
- Asymptotic behaviour of rank statistics
- Cramer-von Mises statistic
- Small and large sample properties of rank statistics
- Some applications of rank procedures
- Useful references

Definition of Brownian bridge

Construction of the BB

Varying the coefficients of the bridge

Two useful properties

Ranks and anti-ranks

$$ r_i(y) = \#\{\, y\text{'s} \le y_i \,\}, \qquad d_{r_i} = r_{d_i} = i, $$

$$ \sum_{i=1}^{N} c_i\, a(r_i) = \sum_{i=1}^{N} a(i)\, c(d_i), $$

$$ R_i = r_i(Y), \qquad D_i = d_i(Y), \qquad T(i) = \sum_{j=1}^{i} c(D_j). $$

             First sample    Second sample
Index        1    2    3     4    5    6
Data         5    7    0     3    1    4
Rank         5    6    1     3    2    4
Anti-rank    3    5    4     6    1    2

Simple linear rank statistic
- Any simple linear rank statistic is a linear combination of the scores, a's, and the constants, c's.
- When the constants are standardised, the first moment is zero and the second moment is expressed in terms of the scores.
- The limiting distribution is normal by a central limit theorem.

$$ S = \sum_{i=1}^{N} c_i\, a(R_i) = \sum_{i=1}^{N} a(i)\, c(D_i), $$

$$ \sum_{i=1}^{N} c_i = 0, \qquad \sum_{i=1}^{N} c_i^2 = 1, \qquad \bar{a} = \frac{1}{N} \sum_{i=1}^{N} a_i, $$

$$ E(S) = 0, \qquad \operatorname{var}(S) = \frac{1}{N-1} \sum_{i=1}^{N} (a_i - \bar{a})^2, \qquad \frac{S}{\sqrt{\operatorname{var}(S)}} \to N(0,1). $$

Constrained random walk on pooled data
- Combine all the observations from the two samples into the pooled sample, N = m + n.
- Permute the vector of the constants according to the anti-ranks of the observations and walk on the permuted constants, linearly interpolating the walk Z between the steps.
- Pin down the walk by normalizing the constants.
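The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's code: the function names `random_bridge` and `bridge_at` are hypothetical, the constants are taken as +sqrt(mn/N)/m for the first sample and -sqrt(mn/N)/n for the second (the standardisation that gives a zero sum and a unit sum of squares), and the data are assumed to have no ties. The input is the six-point data set from the ranks and anti-ranks table.

```python
import math


def random_bridge(first, second):
    """Return 1-based anti-ranks D and the walk T_1..T_N on the permuted constants."""
    m, n = len(first), len(second)
    N = m + n
    pooled = list(first) + list(second)
    # Standardised constants (assumed form): sum(c) = 0 and sum(c_i^2) = 1.
    scale = math.sqrt(m * n / N)
    c = [scale / m] * m + [-scale / n] * n
    # Anti-ranks: position j holds the index of the j-th smallest observation.
    order = sorted(range(N), key=lambda i: pooled[i])
    T, total = [], 0.0
    for d in order:
        total += c[d]          # one step of the walk on the permuted constants
        T.append(total)
    return [d + 1 for d in order], T


def bridge_at(T, t):
    """Linearly interpolated bridge Z at t in [0, 1], with Z(0) = 0."""
    N = len(T)
    walk = [0.0] + list(T)
    x = t * N
    left = min(int(x), N - 1)  # index of the knot to the left of t*N
    w = x - left               # fractional distance to the next knot
    return (1.0 - w) * walk[left] + w * walk[left + 1]


D, T = random_bridge([5, 7, 0], [3, 1, 4])
print(D)                         # [3, 5, 4, 6, 1, 2]
print([round(v, 2) for v in T])  # [0.41, 0.0, -0.41, -0.82, -0.41, 0.0]
```

The walk returns to zero at the last step because the constants sum to zero, which is what pins the bridge down at both ends.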
This random bridge Z converges in distribution to the Brownian bridge as the size of the smaller sample increases.

$$ c_i = \begin{cases} \dfrac{1}{m}\sqrt{\dfrac{mn}{N}}, & i \le m, \\[2ex] -\dfrac{1}{n}\sqrt{\dfrac{mn}{N}}, & i > m, \end{cases} \qquad T_i = \sum_{j=1}^{i} c(D_j), \qquad \sum_{i=1}^{N} c_i = 0, \quad \sum_{i=1}^{N} c_i^2 = 1. $$

$$ Z(i,t) = \begin{cases} T_i, & i = tN, \\ (i - tN)\, T_{i-1} + (tN - i + 1)\, T_i, & tN < i < tN + 1. \end{cases} $$

From real data to the random bridge

$m = n = 3$, $|c_i| = 1/\sqrt{6} \approx 0.41$

               First sample          Second sample
Index, i       1      2      3       4      5      6
Data, X        5      7      0       3      1      4
Constant, c    0.41   0.41   0.41   -0.41  -0.41  -0.41
Rank, R        5      6      1       3      2      4
Anti-rank, D   3      5      4       6      1      2
Bridge, Z      0.41   0     -0.41   -0.82  -0.41   0

Symmetric distributions and the BB

$$ f_{HSD}(y) = \frac{1}{\pi} \operatorname{sech}(y), \qquad f_L(y) = \frac{1}{4} \operatorname{sech}^2\!\left(\frac{y}{2}\right), \qquad f_N(y) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{y^2}{2}\right). $$

Random walk model: no difference in distributions

Location and scale alternatives

Random walk: location and scale alternatives
[Figure panels: Shift = 2; Scale = 2]

Simple linear rank statistic again
- The simple linear rank statistic is expressed in terms of the random bridge.
- Although the small sample properties are investigated in the usual manner, the large sample properties are governed by the properties of the Brownian bridge.
- A linear rank statistic is easy to visualise: the shape of the bridge suggests a particular type of statistic.

$$ S = \frac{1}{N} \sum_{i=1}^{N} b_i Z_i = \sum_{i=1}^{N} c_i \sum_{j=R_i}^{N} \frac{b_j}{N} = \sum_{i=1}^{N} c_i\, a(R_i), \qquad S \to \int_0^1 b(t)\, B(t)\, dt. $$

Trigonometric scores rank statistics

The Cramer-von Mises statistic, where $m_i$ and $n_i$ count the observations from the first and the second sample among the $i$ smallest pooled observations:

$$ W^2 = \frac{mn}{(m+n)^2} \sum_{i=1}^{N} \Delta_i^2, \qquad \Delta_i = \frac{n_i}{n} - \frac{m_i}{m}, $$

$$ W^2 = \frac{1}{N} \sum_{i=1}^{N} T_i^2, \qquad W^2 \to \int_0^1 B^2(t)\, dt = \sum_{i=1}^{\infty} \frac{X_i^2}{i^2 \pi^2}, $$
$X_i \sim N(0,1)$, independent.

The first and second Fourier coefficients:

$$ S_1 = \frac{\sqrt{2}\,\pi}{N} \sum_{i=1}^{N} \sin\!\left(\frac{\pi i}{N}\right) T_i, \qquad S_1 \to \sqrt{2}\,\pi \int_0^1 \sin(\pi t)\, B(t)\, dt, $$

$$ S_2 = \frac{2\sqrt{2}\,\pi}{N} \sum_{i=1}^{N} \sin\!\left(\frac{2\pi i}{N}\right) T_i, \qquad S_2 \to 2\sqrt{2}\,\pi \int_0^1 \sin(2\pi t)\, B(t)\, dt. $$

Combined trigonometric scores rank statistics
- The first and second coefficients are uncorrelated: $S_{12} = S_1^2 + S_2^2$.
- Fast convergence to the asymptotic distribution: $S_{12} \sim \chi^2_2$.
- The Lepage test is a common test of the combined alternative ($S_W$ is the Wilcoxon statistic and $S_{A\text{-}B}$ is the Ansari-Bradley, adapted Wilcoxon, statistic).

Percentage points for the first component (one-sample)
[Table from Durbin and Knott, Components of Cramer-von Mises Statistics]

Percentage points for the first component (two-sample)
[Table from Kravchuk, Rank test of location optimal for HSD]

Some tests of location

Trigonometric scores rank estimators
- Location estimator of the HSD (Vaughan)
- Scale estimator of the Cauchy distribution (Rublik)
- Trigonometric scores rank estimator (Kravchuk)

Optimal linear rank test
- An optimal test of location may be found in the class of simple linear rank tests by an appropriate choice of the score function, a.
- Assume that the score function is differentiable. An optimal test statistic may be constructed by selecting the coefficients, b's.

$$ \varphi(u, f) = -\frac{f'(F^{-1}(u))}{f(F^{-1}(u))}, \qquad u \in [0,1], $$

$$ \lim_{N \to \infty} a(1 + [uN]) = \varphi(u), \qquad b = \frac{d\varphi}{du}. $$

Functionals on the bridge

When the score function is defined and differentiable, it is easy to derive the corresponding functional.
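As a worked instance of choosing the coefficients, consider the hyperbolic secant distribution. The derivation below is a sketch under two assumptions that are stated here rather than taken from the slides: the density is parametrised as $f(y) = \frac{1}{\pi}\operatorname{sech}(y)$, and the score function is $\varphi(u,f) = -f'(F^{-1}(u))/f(F^{-1}(u))$.

$$ F(y) = \frac{2}{\pi} \arctan(e^{y}), \qquad F^{-1}(u) = \ln \tan\!\left(\frac{\pi u}{2}\right), \qquad -\frac{f'(y)}{f(y)} = \tanh(y), $$

$$ \varphi(u) = \tanh\!\left(\ln \tan\frac{\pi u}{2}\right) = \frac{\tan^2(\pi u/2) - 1}{\tan^2(\pi u/2) + 1} = -\cos(\pi u), \qquad b(u) = \frac{d\varphi}{du} = \pi \sin(\pi u). $$

Under these assumptions the coefficients are proportional to $\sin(\pi t)$, which is the weight that appears in the first Fourier coefficient $S_1$.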
$$ \psi(t) = \int_0^t b(x)\, dx, $$

$$ \int_0^1 b(t)\, B(t)\, dt \sim N(0, \sigma^2), \qquad \sigma^2 = \int_0^1 \psi^2(t)\, dt - \left( \int_0^1 \psi(t)\, dt \right)^2, $$

$$ \int_0^1 B(t)\, \delta(t - 0.5)\, dt = B(0.5) \sim N(0, 1/4). $$

Result 4: trigonometric scores estimators
- Efficient location estimator for the HSD
- Efficient scale estimator for the Cauchy distribution
- Easy to establish exact confidence level
- Easy to encode into automatic procedures

Numerical examples: test of location
1. Normal, N(500, 100^2)
2. Normal, N(580, 100^2)
[Figure: boxplots of Normal 1 and Normal 2 (means are indicated by solid circles)]

           t-test           Wilcoxon         S1
p-value    0.150            0.162            0.154
CI95%      (-172.4, 28.6)   (-185.0, 25.0)   (-183.0, 25.0)

Numerical examples: test of scale
1. Normal, N(300, 200^2)
2. Normal, N(300, 100^2)
[Figure: boxplots of Normals1 and Normals2]

           F-test   Siegel-Tukey   S2
p-value    0.123    0.064          0.054

Numerical examples: combined test
1. Normal, N(580, 200^2)
2. Normal, N(500, 100^2)
[Figure: boxplots of Normalc1 and Normalc2]

           F-test   t-test   S1^2+S2^2   Lepage   CM
p-value    0.021    0.174    0.018       0.035    0.010

Application: palette-based images
- When two colour histograms are compared, nonparametric tests are required because a priori knowledge about the colour probability distribution is generally not available.
- A difficulty arises when statistical tests are applied to colour images: should colour distributions be treated as continuous, discrete or categorical?
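The statistics compared in the numerical examples can be computed directly from the random bridge. The sketch below is illustrative only: the function name `trig_statistics` is hypothetical, the normalisations S1 = (sqrt(2) pi / N) sum sin(pi i / N) T_i and S2 = (2 sqrt(2) pi / N) sum sin(2 pi i / N) T_i are assumed so that the statistics match their asymptotic N(0,1) forms, the data are assumed to be continuous with no ties, and the p-values use the asymptotic normal and chi-square (2 degrees of freedom) distributions rather than exact small-sample percentage points.

```python
import math


def trig_statistics(first, second):
    """Bridge-based two-sample statistics with asymptotic p-values (a sketch)."""
    m, n = len(first), len(second)
    N = m + n
    pooled = list(first) + list(second)
    # Standardised constants (assumed form): sum(c) = 0 and sum(c_i^2) = 1.
    scale = math.sqrt(m * n / N)
    c = [scale / m] * m + [-scale / n] * n
    # Walk on the constants permuted by the anti-ranks.
    order = sorted(range(N), key=lambda i: pooled[i])
    T, total = [], 0.0
    for d in order:
        total += c[d]
        T.append(total)
    # First and second Fourier coefficients of the bridge (assumed scaling).
    s1 = math.sqrt(2) * math.pi / N * sum(
        math.sin(math.pi * i / N) * T[i - 1] for i in range(1, N + 1))
    s2 = 2 * math.sqrt(2) * math.pi / N * sum(
        math.sin(2 * math.pi * i / N) * T[i - 1] for i in range(1, N + 1))
    # Cramer-von Mises statistic as the mean squared bridge value.
    w2 = sum(v * v for v in T) / N
    combined = s1 ** 2 + s2 ** 2
    return {
        "S1": s1,
        "S2": s2,
        "W2": w2,
        "combined": combined,
        # Two-sided normal tail: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
        "p_S1": math.erfc(abs(s1) / math.sqrt(2)),
        "p_S2": math.erfc(abs(s2) / math.sqrt(2)),
        # Chi-square survival function with 2 degrees of freedom.
        "p_combined": math.exp(-combined / 2),
    }


# Made-up illustrative samples, not the data behind the slides.
sample_1 = [512.0, 431.0, 587.0, 498.0, 620.0, 455.0]
sample_2 = [605.0, 550.0, 701.0, 489.0, 663.0, 574.0]
for name, value in trig_statistics(sample_1, sample_2).items():
    print(f"{name}: {value:.3f}")
```

For small samples the asymptotic p-values are only approximations; the percentage-point tables referred to earlier in the lecture would be the appropriate reference.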
Application: grey-scale images

Application: grey-scale images, histograms

Application: colour images

Useful books
1. H. Cramer. Mathematical Methods of Statistics. Princeton University Press, Princeton, 19th edition, 1999.
2. G. Grimmett and D. Stirzaker. Probability and Random Processes. Oxford University Press, N.Y., 1982.
3. J. Hajek, Z. Sidak and P.K. Sen. Theory of Rank Tests. Academic Press, San Diego, California, 1999.
4. F. Knight. Essentials of Brownian Motion and Diffusion. AMS, Providence, R.I., 1981.
5. K. Knight. Mathematical Statistics. Chapman & Hall, Boca Raton, 2000.
6. J. Maritz. Distribution-free Statistical Methods. Monographs on Applied Probability and Statistics. Chapman & Hall, London, 1981.

Interesting papers
1. J. Durbin and M. Knott. Components of Cramer-von Mises statistics. Part 1. Journal of the Royal Statistical Society, Series B, 1972.
2. K.M. Hanson and D.R. Wolf. Estimators for the Cauchy distribution. In G.R. Heidbreder, editor, Maximum entropy and Bayesian methods, Kluwer Academic Publisher, Netherlands, 1996.
3. N. Henze and Ya.Yu. Nikitin. Two-sample tests based on the integrated empirical processes. Communications in Statistics - Theory and Methods, 2003.
4. A. Janssen. Testing nonparametric statistical functionals with application to rank tests. Journal of Statistical Planning and Inference, 1999.
5. F. Rublik. A quantile goodness-of-fit test for the Cauchy distribution, based on extreme order statistics. Applications of Mathematics, 2001.
6. D.C. Vaughan. The generalized secant hyperbolic distribution and its properties. Communications in Statistics - Theory and Methods, 2002.