Lecture Notes of Chin-Tsang Chiang at NTU

Topic 6: Nonparametric Statistics (Distribution-free methods)

Advantages:
(a) They can be used to test population parameters without the normality assumption.
(b) They are available for nominal or ordinal data.
(c) They can be used to test hypotheses that do not involve population parameters.

Disadvantages:
(a) They are less sensitive than their parametric counterparts when the assumptions of the parametric methods are met.
(b) They tend to use less information than the parametric methods.
(c) They are less efficient than their parametric counterparts when the assumptions of the parametric methods are met.

One sample or matched-paired sample: Sign test – nominal data with two categories

Data: $X_1, \ldots, X_n$, where the $X_i$'s are either 0 or 1 for "−" or "+" signs. Let $p = P(X_i = 1)$.

Estimation of $p$: $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i$

Hypothesis test on $p$:
(a) $H_0: p = p_0$ versus $H_A: p \neq p_0$
(b) $H_0: p = p_0$ versus $H_A: p > p_0$
(c) $H_0: p = p_0$ versus $H_A: p < p_0$

Test statistic: $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i$ (Note that $n\hat{p} = \sum_{i=1}^{n} X_i \sim \mathrm{Binomial}(n, p_0)$ under $H_0$.)

Testing criterion (with significance level $\alpha$):
a) Reject $H_0$ if $\hat{p} \le p_1$ or $\hat{p} \ge p_2$, where $P(n\hat{p} \le np_1 \mid p = p_0) \le \alpha/2$ and $P(n\hat{p} \ge np_2 \mid p = p_0) \le \alpha/2$; otherwise, we do not have strong evidence to reject $H_0$.
b) Reject $H_0$ if $\hat{p} \ge p_2$, where $P(n\hat{p} \ge np_2 \mid p = p_0) \le \alpha$; otherwise, we do not have strong evidence to reject $H_0$.
c) Reject $H_0$ if $\hat{p} \le p_1$, where $P(n\hat{p} \le np_1 \mid p = p_0) \le \alpha$; otherwise, we do not have strong evidence to reject $H_0$.

Under $H_0$, $E[\hat{p}] = p_0$ and $\mathrm{Var}[\hat{p}] = \frac{p_0(1 - p_0)}{n}$. When the sample size $n$ is large enough, the distribution of

$$Z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$

is approximated by $N(0,1)$, i.e. $Z_0 \xrightarrow{d} N(0,1)$. Thus, the testing criterion can be based on the $Z_0$ statistic.

Application: Conduct a test about the population median $M$, i.e.
(a) $H_0: M = M_0$ versus $H_A: M \neq M_0$
(b) $H_0: M = M_0$ versus $H_A: M > M_0$
(c) $H_0: M = M_0$ versus $H_A: M < M_0$
Here the sign test is applied to the indicators $I(X_i > M_0)$ with $p_0 = 1/2$.

Wilcoxon signed-rank test

Hypothesis:
$H_0$: $f_X(x)$ is symmetric about $M_0$
$H_A$: $f_X(x)$ is asymmetric about $M_0$ (i.e. symmetric about another value)

Test statistic: $T = \sum_{i=1}^{n} r_i$, where $r_i = \mathrm{sign}(X_i - M_0)\,\mathrm{rank}(|X_i - M_0|)$.

For a matched-paired sample, let $X_1$ and $X_2$ denote the measurements of populations 1 and 2, respectively, observed as pairs $(X_{i1}, X_{i2})$, $i = 1, \ldots, n$. Assume that $D = X_1 - X_2$ has a symmetric p.d.f. $f_D(d)$.

Hypothesis:
$H_0$: $f_D(d)$ is symmetric about 0
$H_A$: $f_D(d)$ is symmetric about another value

Test statistic: $T = \sum_{i=1}^{n} r_i$, where $r_i = \mathrm{sign}(X_{i1} - X_{i2})\,\mathrm{rank}(|X_{i1} - X_{i2}|)$.

Under $H_0$, the sign attached to rank $i$ is $+1$ or $-1$ with probability $1/2$ each, independently of the ranks. Thus,

$$E[T] = \sum_{i=1}^{n} i\,E[\mathrm{sign}(X_i - M_0)] = \sum_{i=1}^{n} i\left(1 \cdot \tfrac{1}{2} + (-1) \cdot \tfrac{1}{2}\right) = 0,$$

and

$$\mathrm{Var}(T) = \sum_{i=1}^{n} i^2\,\mathrm{Var}(\mathrm{sign}(X_i - M_0)) = \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}.$$

When $n$ is large enough, the distribution of

$$\frac{T}{\sqrt{n(n+1)(2n+1)/6}}$$

can be approximated by $N(0,1)$.

Two independent samples: Mann-Whitney-Wilcoxon test – at least ordinal scale variable

Let $X_1$ and $X_2$ denote separately the measurements of populations 1 and 2, with the corresponding distributions $F_{X_1}(x)$ and $F_{X_2}(x)$.

Random samples (the two samples are independent):
Population 1: $X_{11}, X_{12}, \ldots, X_{1n_1}$
Population 2: $X_{21}, X_{22}, \ldots, X_{2n_2}$

Let $T_1 = \sum_{i=1}^{n_1} \mathrm{rank}(X_{1i})$, where $\mathrm{rank}(X_{1i})$ is the rank of $X_{1i}$ among $\{X_{ij} : i = 1, 2;\ j = 1, \ldots, n_i\}$.

Hypothesis:
$H_0$: $F_{X_1}(x) = F_{X_2}(x)$ for all $x$
$H_A$: $F_{X_1}(x) \neq F_{X_2}(x)$ for at least one $x$

Test statistic: $T_1$

Testing criterion: Reject $H_0$ if $T_1 \le T_L$ or $T_1 \ge T_U$, where $T_L + T_U = n_1(n_1 + n_2 + 1)$. Otherwise, we do not have strong evidence to reject $H_0$.

Remark: Under $H_0$, one has $E[T_1] = \frac{1}{2} n_1(n_1 + n_2 + 1)$ and $\mathrm{Var}(T_1) = \frac{1}{12} n_1 n_2 (n_1 + n_2 + 1)$. When $n_1$ and $n_2$ are large enough,

$$\frac{T_1 - \frac{1}{2} n_1(n_1 + n_2 + 1)}{\sqrt{\frac{1}{12} n_1 n_2 (n_1 + n_2 + 1)}} \xrightarrow{d} N(0,1).$$

k-independent samples ($k > 2$): Kruskal-Wallis test – at least ordinal scale variable

Let $X_1, \ldots, X_k$ denote separately the measurements of populations $1, \ldots, k$, with the corresponding distributions $F_{X_1}(x), \ldots, F_{X_k}(x)$.

Random samples (the $k$ samples are independent):
Population 1: $X_{11}, X_{12}, \ldots, X_{1n_1}$
Population 2: $X_{21}, X_{22}, \ldots, X_{2n_2}$
...
Population k: $X_{k1}, X_{k2}, \ldots, X_{kn_k}$

Hypothesis:
$H_0$: $F_{X_1}(x) = F_{X_2}(x) = \cdots = F_{X_k}(x)$ for all $x$
$H_A$: $F_{X_i}(x) \neq F_{X_j}(x)$ for some $i \neq j$ and some $x$

Let $R(X_{ij})$ be the rank of $X_{ij}$ among $\{X_{ij} : i = 1, \ldots, k;\ j = 1, \ldots, n_i\}$, $R_i = \sum_{j=1}^{n_i} R(X_{ij})$, and $n_T = \sum_{i=1}^{k} n_i$.

Test statistic:

$$W = \frac{SSB}{MST},$$

where

$$SSB = \sum_{i=1}^{k} n_i\left(\frac{R_i}{n_i} - \frac{n_T + 1}{2}\right)^2 \quad\text{and}\quad MST = \frac{1}{n_T - 1}\left(\sum_{i=1}^{k}\sum_{j=1}^{n_i} R^2(X_{ij}) - \frac{n_T (n_T + 1)^2}{4}\right).$$

When there are no ties, $MST = \frac{n_T(n_T + 1)}{12}$ and the statistic reduces to

$$W = \frac{12}{n_T(n_T + 1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n_T + 1).$$

When $n_T$ is large, $W \xrightarrow{d} \chi^2_{k-1}$.
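Returning to the sign test: the exact Binomial tail probabilities and the large-sample $Z_0$ statistic can be computed directly. The following is a minimal Python sketch using only the standard library; the function name `sign_test` and the returned dictionary keys are illustrative choices, not part of the notes, and doubling the smaller tail for a two-sided p-value is one common convention rather than the only one.

```python
from math import comb, sqrt

def sign_test(x, p0=0.5):
    """Sign test for 0/1 data x under H0: p = p0 (illustrative helper)."""
    n = len(x)
    s = sum(x)                      # n * p_hat, Binomial(n, p0) under H0
    p_hat = s / n

    def pmf(k):
        # Binomial(n, p0) probability mass at k
        return comb(n, k) * p0**k * (1 - p0)**(n - k)

    lower = sum(pmf(k) for k in range(0, s + 1))   # P(n p_hat <= s | p = p0)
    upper = sum(pmf(k) for k in range(s, n + 1))   # P(n p_hat >= s | p = p0)
    z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)    # large-sample statistic
    return {
        "p_hat": p_hat,
        "lower_tail": lower,
        "upper_tail": upper,
        # Assumed convention: double the smaller tail, capped at 1.
        "two_sided": min(1.0, 2 * min(lower, upper)),
        "z0": z0,
    }

result = sign_test([1] * 8 + [0] * 2)
```

For the median application, one would pass `[1 if xi > M0 else 0 for xi in data]` with `p0 = 0.5`; the one-sided criteria b) and c) correspond to comparing `upper_tail` or `lower_tail` with $\alpha$.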
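For the signed-rank statistic $T = \sum_i \mathrm{sign}(X_i - M_0)\,\mathrm{rank}(|X_i - M_0|)$, a short Python sketch may help make the construction concrete. The helper names `midrank` and `signed_rank` are hypothetical. Two assumptions go beyond the notes: tied absolute values receive average ranks, and observations equal to $M_0$ are dropped before ranking.

```python
from math import sqrt

def midrank(v, pool):
    """Rank of v within pool; tied values share the average rank."""
    less = sum(u < v for u in pool)
    equal = sum(u == v for u in pool)
    return less + (equal + 1) / 2

def signed_rank(x, m0=0.0):
    """Return (T, Z): the signed-rank sum and its normal approximation."""
    # Assumption: zero differences carry no sign, so they are dropped.
    d = [xi - m0 for xi in x if xi != m0]
    n = len(d)
    abs_d = [abs(v) for v in d]
    t = sum((1 if v > 0 else -1) * midrank(abs(v), abs_d) for v in d)
    # Var(T) = n(n+1)(2n+1)/6 under H0 (no correction for ties here).
    z = t / sqrt(n * (n + 1) * (2 * n + 1) / 6)
    return t, z

t, z = signed_rank([1.5, -0.5, 2.0, 3.0, -1.0], m0=0.0)
```

For a matched-paired sample, the same function can be applied to the differences, for example `signed_rank([a - b for a, b in pairs])` with `m0 = 0`.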
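The Mann-Whitney-Wilcoxon rank sum $T_1$ and its large-sample standardization can be sketched in the same way. This is an illustrative Python version, not the notes' own implementation: `rank_sum` is a hypothetical name, ties are given average ranks by assumption, and the variance formula is the no-ties one stated in the remark.

```python
from math import sqrt

def rank_sum(x1, x2):
    """Return (T1, E[T1], Var(T1), Z) for two independent samples."""
    pool = list(x1) + list(x2)
    n1, n2 = len(x1), len(x2)

    def midrank(v):
        # Rank of v in the pooled sample; ties share the average rank.
        return sum(u < v for u in pool) + (sum(u == v for u in pool) + 1) / 2

    t1 = sum(midrank(v) for v in x1)           # rank sum of sample 1
    mean = n1 * (n1 + n2 + 1) / 2              # E[T1] under H0
    var = n1 * n2 * (n1 + n2 + 1) / 12         # Var(T1) under H0, no ties
    z = (t1 - mean) / sqrt(var)
    return t1, mean, var, z

t1, mean, var, z = rank_sum([1.1, 2.3, 4.0], [0.5, 3.2, 5.1, 6.0])
```

In this small example the pooled ranks of the first sample are 2, 3 and 5, so $T_1 = 10$, against $E[T_1] = 12$ and $\mathrm{Var}(T_1) = 8$. With samples this small, the exact critical values $T_L$ and $T_U$ would normally be used rather than the normal approximation.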
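The Kruskal-Wallis statistic $W = SSB/MST$ can likewise be computed from the pooled ranks. The sketch below is a hedged illustration in Python: the function name `kruskal_wallis` is hypothetical, average ranks for ties are an assumption, and the function returns only $W$ and the degrees of freedom $k - 1$, leaving the comparison with a $\chi^2_{k-1}$ critical value to a table or a statistics library.

```python
def kruskal_wallis(groups):
    """Return (W, df) where W = SSB / MST is computed from pooled ranks."""
    pool = [v for g in groups for v in g]
    n_t = len(pool)

    def midrank(v):
        # Rank of v in the pooled sample; ties share the average rank.
        return sum(u < v for u in pool) + (sum(u == v for u in pool) + 1) / 2

    ranked = [[midrank(v) for v in g] for g in groups]
    center = (n_t + 1) / 2                     # overall mean rank
    # SSB: between-group sum of squares of the mean ranks R_i / n_i.
    ssb = sum(len(r) * (sum(r) / len(r) - center) ** 2 for r in ranked)
    # MST: total mean square of the ranks.
    total_sq = sum(v * v for r in ranked for v in r)
    mst = (total_sq - n_t * (n_t + 1) ** 2 / 4) / (n_t - 1)
    return ssb / mst, len(groups) - 1

w, df = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

Here the rank sums are $R_1 = 6$, $R_2 = 15$, $R_3 = 24$ with $n_T = 9$, giving $SSB = 54$ and $MST = 7.5$. Because there are no ties, the shortcut $\frac{12}{n_T(n_T+1)}\sum_i R_i^2/n_i - 3(n_T+1)$ gives the same value of $W$.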