Classical Hypothesis Testing Theory Adapted from Alexander Senf

7/31/2008

Review
• 5 steps of classical hypothesis testing (Ch. 3)
  1. Declare the null hypothesis H0 and the alternate hypothesis H1
  2. Fix a threshold α for the Type I error (1% or 5%)
     • Type I error (α): reject H0 when it is true
     • Type II error (β): accept H0 when it is false
  3. Determine a test statistic
     • a quantity calculated from the data
  4. Determine which observed values of the test statistic should lead to rejection of H0
     • Significance point K (determined by α)
  5. Test whether the observed data are more extreme than the significance point K
     • If so, reject H0
     • Otherwise, accept H0

Overview of Ch. 9
– Simple Fixed-Sample-Size Tests
– Composite Fixed-Sample-Size Tests
– The -2 log λ Approximation
– The Analysis of Variance (ANOVA)
– Multivariate Methods
– ANOVA: the Repeated Measures Case
– Bootstrap Methods: the Two-sample t-test
– Sequential Analysis

Simple Fixed-Sample-Size Tests

The Issue
• In the simplest case, everything is specified
  – Probability distribution of H0 and H1
    • Including all parameters
  – α (and K)
  – But: β is left unspecified
• It is desirable to have a procedure that minimizes β given a fixed α
  – This would maximize the power of the test
    • 1-β, the probability of rejecting H0 when H1 is true

Most Powerful Procedure
• Neyman-Pearson Lemma
  – States that the likelihood-ratio (LR) test is the most powerful test for a given α
  – The LR is defined as:
    $$LR = \frac{f_1(X_1)\, f_1(X_2) \cdots f_1(X_n)}{f_0(X_1)\, f_0(X_2) \cdots f_0(X_n)}$$
  – where
    • f0, f1 are completely specified density functions for H0, H1
    • X1, X2, …, Xn are iid random variables

Neyman-Pearson Lemma
– H0 is rejected when LR ≥ K
– With a constant K chosen such that: P(LR ≥ K when H0 is true) = α
– Let’s look at an example using the Neyman-Pearson Lemma!
– Then we will prove it.
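The rejection rule above can be tried out numerically. The snippet below is a minimal sketch, assuming normal densities with a common σ and μ1 > μ0; the function names and the numbers used (μ0 = 0, μ1 = 1, σ = 1, n = 4) are hypothetical and chosen only for illustration. Under these assumptions the LR is an increasing function of the sample mean, which is what allows K to be computed in closed form.

```python
from math import exp, sqrt
from statistics import NormalDist


def likelihood_ratio(xs, mu0, mu1, sigma):
    """LR = prod f1(x_i) / prod f0(x_i) for normal densities with common sigma."""
    f0 = NormalDist(mu0, sigma)
    f1 = NormalDist(mu1, sigma)
    lr = 1.0
    for x in xs:
        lr *= f1.pdf(x) / f0.pdf(x)
    return lr


def significance_point(n, mu0, mu1, sigma, alpha):
    """K such that P(LR >= K | H0) = alpha (assumes mu1 > mu0).

    For this normal model, log LR = n (mu1 - mu0)(2 xbar - mu0 - mu1) / (2 sigma^2),
    so the cutoff is found for the sample mean first and then mapped to the LR scale.
    """
    mean_cutoff = NormalDist(mu0, sigma / sqrt(n)).inv_cdf(1 - alpha)
    log_k = n * (mu1 - mu0) * (2 * mean_cutoff - mu0 - mu1) / (2 * sigma ** 2)
    return exp(log_k)


# Hypothetical illustration: four observations, H0: mu = 0 against H1: mu = 1.
sample = [1.2, 0.4, 1.5, 0.9]
lr = likelihood_ratio(sample, mu0=0.0, mu1=1.0, sigma=1.0)
k = significance_point(n=len(sample), mu0=0.0, mu1=1.0, sigma=1.0, alpha=0.05)
decision = "reject H0" if lr >= k else "accept H0"
```

Working with the sample mean is a design choice that applies only to this normal model; for other pairs of densities K would generally have to be found from the H0 distribution of the LR itself.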
Example
• Basketball players seem to be taller than average
  – Use this observation to formulate our hypothesis H1:
    • “Tallness is a factor in the recruitment of KU basketball players”
  – The null hypothesis, H0, could be:
    • “No, the players on KU’s team are just of average height compared to the population in the U.S.”
    • “The average height of the team and of the population in general is the same”
• Setup:
  – Average height of males in the US: 5’9½”
  – Average height of KU players in 2008: 6’4½”
• Assumption: both populations are normally distributed, centered on their respective averages (μ0 = 69.5 in, μ1 = 76.5 in), with σ = 2 in
  $$f_0(x) = \frac{1}{2\sqrt{2\pi}}\, e^{-\frac{(x-69.5)^2}{8}}, \qquad f_1(x) = \frac{1}{2\sqrt{2\pi}}\, e^{-\frac{(x-76.5)^2}{8}}$$
• Sample size: 3
  – Choose α: 5%
• The two populations: [figure: densities f0 and f1; vertical axis p, horizontal axis height (inches)]
– Our test statistic is the likelihood ratio, LR
  $$\Lambda(x) = \frac{f_1(x_1)\, f_1(x_2)\, f_1(x_3)}{f_0(x_1)\, f_0(x_2)\, f_0(x_3)} = \prod_{i=1}^{3} \frac{e^{-(x_i-76.5)^2/8}}{e^{-(x_i-69.5)^2/8}} = e^{\frac{1}{8}\sum_{i=1}^{3}\left[(x_i-69.5)^2-(x_i-76.5)^2\right]}$$
– Now we need to determine a significance point K at which we can reject H0, given α = 5%
  • P(Λ(x) ≥ K | H0 is true) = 0.05, determine K
– So we need the K for which the rejection region has probability 0.05 under H0:
  $$\iiint_{\Lambda(x) \ge K} f_0(x_1)\, f_0(x_2)\, f_0(x_3)\, dx_1\, dx_2\, dx_3 = 0.05$$
• How to solve this?
– Λ(x) depends on the data only through the sample mean x̄ and increases with it:
  $$\Lambda(x) = e^{\frac{1}{8}\sum_{i=1}^{3}(14 x_i - 1022)} = e^{(42\bar{x} - 3066)/8}$$
  so Λ(x) ≥ K exactly when x̄ ≥ c for some cutoff c
– Under H0, x̄ is normal with mean 69.5 and standard deviation 2/√3, so
  $$c = 69.5 + 1.645 \cdot \frac{2}{\sqrt{3}} \approx 71.399$$
– Then we can just plug it in to Λ and calculate K:
  $$K = e^{(42 \cdot 71.399 - 3066)/8} \approx 2.24 \times 10^{-4}$$
– With the significance point K = 2.24×10^-4 we can now test our hypothesis based on observations:
  • E.g.: Sasha = 83 in, Darrell = 81 in, Sherron = 71 in
  $$\Lambda(83, 81, 71) = e^{\frac{1}{8}\sum_{i=1}^{3}\left[(X_i-69.5)^2-(X_i-76.5)^2\right]} = e^{28} \approx 1.446 \times 10^{12}$$
  • 1.446×10^12 > 2.24×10^-4
  • Therefore we reject H0 in favor of H1: the data support the hypothesis that tallness is a factor in the recruitment of KU basketball players.

Neyman-Pearson Proof
• Let A denote the region in the joint range of X1, X2, …, Xn such that LR ≥ K. A is the critical region.
  – If A is the only critical region of size α we are done
    $$\int_A L(H_0) = \int\!\cdots\!\int_A f_0(u_1)\, f_0(u_2) \cdots f_0(u_n)\, du_1\, du_2 \cdots du_n = \alpha$$
  – Let’s assume another critical region of size α, defined by B
    $$\int_B L(H_0) = \int\!\cdots\!\int_B f_0(u_1)\, f_0(u_2) \cdots f_0(u_n)\, du_1\, du_2 \cdots du_n = \alpha$$
– H0 is rejected if the observed vector (x1, x2, …, xn) is in A (for the test using A) or in B (for the test using B).
– Let A and B overlap in region C
– Power of the test: the probability of rejecting H0 when H1 is true
  • The power of the test using A is:
    $$\int_A L(H_1) = \int\!\cdots\!\int_A f_1(u_1)\, f_1(u_2) \cdots f_1(u_n)\, du_1\, du_2 \cdots du_n$$
– Define: Δ = ∫_A L(H1) − ∫_B L(H1)
  • The power of the test using A minus the power using B
  $$\Delta = \int\!\cdots\!\int_A f_1(u_1) \cdots f_1(u_n)\, du_1 \cdots du_n - \int\!\cdots\!\int_B f_1(u_1) \cdots f_1(u_n)\, du_1 \cdots du_n$$
  $$\Delta = \int\!\cdots\!\int_{A \setminus C} f_1(u_1) \cdots f_1(u_n)\, du_1 \cdots du_n - \int\!\cdots\!\int_{B \setminus C} f_1(u_1) \cdots f_1(u_n)\, du_1 \cdots du_n$$
  • Where A\C is the set of points in A but not in C
  • And B\C contains the points in B but not in C
– So, in A\C we have:
  $$\frac{f_1(u_1) \cdots f_1(u_n)}{f_0(u_1) \cdots f_0(u_n)} \ge K \quad\Longrightarrow\quad f_1(u_1) \cdots f_1(u_n) \ge K f_0(u_1) \cdots f_0(u_n)$$
– While in B\C we have:
  $$f_1(u_1) \cdots f_1(u_n) < K f_0(u_1) \cdots f_0(u_n)$$
  Why? Points in B\C are not in A, so LR < K there.
– Thus
  $$\Delta \ge \int\!\cdots\!\int_{A \setminus C} K f_0(u_1) \cdots f_0(u_n)\, du_1 \cdots du_n - \int\!\cdots\!\int_{B \setminus C} K f_0(u_1) \cdots f_0(u_n)\, du_1 \cdots du_n$$
  $$= \int\!\cdots\!\int_{A} K f_0(u_1) \cdots f_0(u_n)\, du_1 \cdots du_n - \int\!\cdots\!\int_{B} K f_0(u_1) \cdots f_0(u_n)\, du_1 \cdots du_n = K\alpha - K\alpha = 0$$
– Which implies that the power of the test using A is greater than or equal to the power using B.

Composite Fixed-Sample-Size Tests

Not Identically Distributed
• In most cases, random variables are not identically distributed, at least not in H1
  – This affects the likelihood function, L
  – For example, H1 in the two-sample t-test is:
    $$L = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_{1i}-\mu_1)^2}{2\sigma^2}} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_{2i}-\mu_2)^2}{2\sigma^2}}$$
  – Where μ1 and μ2 are different

Composite
– Further, the hypotheses being tested do not specify all parameters
– They are composite
– This chapter only outlines aspects of composite test theory relevant to the material in this book.
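The two-sample likelihood shown above can be evaluated directly. The following is a minimal sketch, assuming normal groups with a common variance; the function name and the sample values are hypothetical. It works on the log scale to avoid numerical underflow and compares the likelihood maximized with one common mean against the likelihood maximized with separate group means.

```python
from math import log, pi


def two_sample_log_likelihood(x1, x2, mu1, mu2, sigma2):
    """log L for group 1 ~ N(mu1, sigma2) and group 2 ~ N(mu2, sigma2)."""
    n_total = len(x1) + len(x2)
    ss = sum((x - mu1) ** 2 for x in x1) + sum((x - mu2) ** 2 for x in x2)
    return -0.5 * n_total * log(2 * pi * sigma2) - ss / (2 * sigma2)


# Hypothetical data, for illustration only.
x1 = [5.1, 4.8, 5.6]
x2 = [4.2, 4.9, 4.4, 4.0]
n_total = len(x1) + len(x2)

# Common mean: maximum likelihood estimates of the mean and the variance.
common_mean = (sum(x1) + sum(x2)) / n_total
var_common = sum((x - common_mean) ** 2 for x in x1 + x2) / n_total
ll_common = two_sample_log_likelihood(x1, x2, common_mean, common_mean, var_common)

# Separate means: maximum likelihood estimates of both means and the shared variance.
mean1 = sum(x1) / len(x1)
mean2 = sum(x2) / len(x2)
var_separate = (sum((x - mean1) ** 2 for x in x1)
                + sum((x - mean2) ** 2 for x in x2)) / n_total
ll_separate = two_sample_log_likelihood(x1, x2, mean1, mean2, var_separate)
```

Because the common-mean model is a special case of the separate-means model, `ll_separate` can never be smaller than `ll_common`.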
Parameter Spaces
– The set of values the parameters of interest can take
– Null hypothesis: parameters in some region ω
– Alternate hypothesis: parameters in Ω
– ω is usually a subspace of Ω
  • Nested hypothesis case
    – Null hypothesis nested within alternate hypothesis
    – This book focuses on this case
  • “if the alternate hypothesis can explain the data significantly better we can reject the null hypothesis”

λ Ratio
• Optimality theory for composite tests suggests this as a desirable test statistic:
  $$\lambda = \frac{L_{max}(\omega)}{L_{max}(\Omega)}$$
• Lmax(ω): maximum likelihood when parameters are confined to the region ω
• Lmax(Ω): maximum likelihood when parameters are confined to the region Ω, defined by H1
• H0 is rejected when λ is sufficiently small (→ Type I error)

Example: t-tests
• The next slides calculate the λ-ratio for the two-sample t-test (with the likelihood)
  $$L = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_{1i}-\mu_1)^2}{2\sigma^2}} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_{2i}-\mu_2)^2}{2\sigma^2}}$$
  – t-tests later generalize to ANOVA and T² tests

Equal Variance Two-Sided t-test
• Setup
  – Random variables X11, …, X1m in group 1 are Normally and Independently Distributed, NID(μ1, σ²)
  – Random variables X21, …, X2n in group 2 are NID(μ2, σ²)
  – X1i and X2j are independent for all i and j
  – Null hypothesis H0: μ1 = μ2 (= μ, unspecified)
  – Alternate hypothesis H1: μ1 and μ2 both unspecified
• Setup (continued)
  – σ² is unknown and unspecified in H0 and H1
    • It is assumed to be the same in both distributions
  – Region ω is:
    $$\omega = \{\mu_1 = \mu_2,\ 0 < \sigma^2 < +\infty\}$$
  – Region Ω is:
    $$\Omega = \{-\infty < \mu_1 < +\infty,\ -\infty < \mu_2 < +\infty,\ 0 < \sigma^2 < +\infty\}$$
• Derivation
  – H0: writing μ for the common mean when μ1 = μ2, the maximum of the likelihood over ω is at
    $$\hat{\mu} = \bar{X} = \frac{X_{11} + X_{12} + \cdots + X_{1m} + X_{21} + X_{22} + \cdots + X_{2n}}{m+n}$$
  – And the (common) variance σ² is estimated by
    $$\hat{\sigma}_0^2 = \frac{\sum_{i=1}^{m}(X_{1i}-\bar{X})^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X})^2}{m+n}$$
– Inserting both into the
likelihood function, L:
  $$L_{max}(\omega) = \frac{1}{(2\pi\hat{\sigma}_0^2)^{\frac{m+n}{2}}}\, e^{-\frac{m+n}{2}}$$
– Do the same thing for region Ω
  $$\hat{\mu}_1 = \bar{X}_1 = \frac{X_{11} + X_{12} + \cdots + X_{1m}}{m}, \qquad \hat{\mu}_2 = \bar{X}_2 = \frac{X_{21} + X_{22} + \cdots + X_{2n}}{n}$$
  $$\hat{\sigma}_1^2 = \frac{\sum_{i=1}^{m}(X_{1i}-\bar{X}_1)^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X}_2)^2}{m+n}$$
– Which produces this likelihood function, L
  $$L_{max}(\Omega) = \frac{1}{(2\pi\hat{\sigma}_1^2)^{\frac{m+n}{2}}}\, e^{-\frac{m+n}{2}}$$
– The test statistic λ is then
  $$\lambda = \frac{L_{max}(\omega)}{L_{max}(\Omega)} = \frac{(2\pi\hat{\sigma}_1^2)^{\frac{m+n}{2}}\, e^{-\frac{m+n}{2}}}{(2\pi\hat{\sigma}_0^2)^{\frac{m+n}{2}}\, e^{-\frac{m+n}{2}}} = \left(\frac{\hat{\sigma}_1^2}{\hat{\sigma}_0^2}\right)^{\frac{m+n}{2}}$$
  It’s the same function, just with different variances
– We can then use the algebraic identity
  $$\sum_{i=1}^{m}(X_{1i}-\bar{X})^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X})^2 = \sum_{i=1}^{m}(X_{1i}-\bar{X}_1)^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X}_2)^2 + \frac{mn}{m+n}(\bar{X}_1-\bar{X}_2)^2$$
– To show that
  $$\lambda = \left(\frac{1}{1 + \frac{t^2}{m+n-2}}\right)^{\frac{m+n}{2}}$$
– Where t is (from Ch. 3)
  $$T = \frac{(\bar{X}_1 - \bar{X}_2)\sqrt{mn}}{S\sqrt{m+n}}$$
– t is the observed value of T
– S is defined in Ch. 3 as
  $$S^2 = \frac{\sum_{i=1}^{m}(X_{1i}-\bar{X}_1)^2 + \sum_{i=1}^{n}(X_{2i}-\bar{X}_2)^2}{m+n-2}$$
– We can plot λ as a function of t (e.g.
m+n=10) [figure: λ on the vertical axis plotted against t on the horizontal axis]
– So, by the monotonicity argument, we can use t² or |t| instead of λ as test statistic
– Small values of λ correspond to large values of |t|
– Sufficiently large |t| lead to rejection of H0
– The H0 distribution of t is known
  • t-distribution with m+n-2 degrees of freedom
– Significance points are widely available
  • Once α has been chosen, values of |t| sufficiently large to reject H0 can be determined
  • t-table: http://www.socr.ucla.edu/Applets.dir/T-table.html

Equal Variance One-Sided t-test
• Similar to the Two-Sided t-test case
  – Different region Ω for H1:
    • Means μ1 and μ2 are not simply different, but one is larger than the other: μ1 ≥ μ2
      $$\Omega = \{\mu_1 \ge \mu_2,\ 0 < \sigma^2 < +\infty\}$$
    • If x̄1 ≥ x̄2 then the maximum likelihood estimates are the same as for the two-sided case
• If x̄1 < x̄2 then the unconstrained maximum of the likelihood is outside of Ω
• The unique unconstrained maximum is at (x̄1, x̄2), implying that the maximum in Ω occurs at a boundary point of Ω
• At this point the estimates of μ1 and μ2 are equal (x̄)
• At this point the likelihood ratio is 1 and H0 is not rejected
• Result: H0 is rejected in favor of H1 (μ1 ≥ μ2) only for sufficiently large positive values of t

Example - Revised
• This scenario fits with our original example:
  – H1 is that the average height of KU basketball players is bigger than for the general population
  – One-sided test
  – We could assume that we don’t know the averages for H0 and H1
  – We actually don’t know σ (I just guessed 2 in the original example)
• Updated example:
  – Observations in group 1 (KU): X1 = {83, 81, 71}
  – Observations in group 2: X2 = {65, 72, 70}
  – Pick significance point for t from a table: tα = 2.132
    • t-distribution, m+n-2 = 4 degrees of freedom, α = 0.05
  – Calculate t with our observations
    $$t = \frac{(78.33 - 69)\sqrt{9}}{5.2122\sqrt{6}} = \frac{28}{12.7673} \approx 2.193$$
  – t > tα, so
we can reject H0!

Comments
• Problems that might arise in other cases
  – The λ-ratio might not reduce to a function of a well-known test statistic, such as t
  – There might not be a unique H0 distribution of λ
  – Fortunately, the t statistic is a pivotal quantity
    • Its H0 distribution is independent of the parameters not prescribed by H0
      – e.g. μ, σ
  – For many testing procedures this property does not hold

Unequal Variance Two-Sided t-test
• Identical to the Equal Variance Two-Sided t-test
  – Except: variances in group 1 and group 2 are no longer assumed to be identical
    • Group 1: NID(μ1, σ1²)
    • Group 2: NID(μ2, σ2²)
    • With σ1² and σ2² unknown and not assumed identical
    • Region ω = {μ1 = μ2, 0 < σ1², σ2² < +∞}
    • Ω places no constraints on the values of μ1, μ2, σ1², and σ2²
– The likelihood function of (X11, X12, …, X1m, X21, X22, …, X2n) then becomes
  $$\prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma_1}\, e^{-\frac{(x_{1i}-\mu_1)^2}{2\sigma_1^2}} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_2}\, e^{-\frac{(x_{2i}-\mu_2)^2}{2\sigma_2^2}}$$
– Under H0 (μ1 = μ2 = μ), this becomes:
  $$\prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma_1}\, e^{-\frac{(x_{1i}-\mu)^2}{2\sigma_1^2}} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_2}\, e^{-\frac{(x_{2i}-\mu)^2}{2\sigma_2^2}}$$
– Maximum likelihood estimates μ̂, σ̂1² and σ̂2² satisfy the simultaneous equations:
  $$\frac{\sum_{i}(x_{1i}-\hat{\mu})}{\hat{\sigma}_1^2} + \frac{\sum_{i}(x_{2i}-\hat{\mu})}{\hat{\sigma}_2^2} = 0$$
  $$\hat{\sigma}_1^2 = \frac{\sum_{i}(x_{1i}-\hat{\mu})^2}{m}, \qquad \hat{\sigma}_2^2 = \frac{\sum_{i}(x_{2i}-\hat{\mu})^2}{n}$$
– ⇒ cubic equation in μ̂
– Neither the λ ratio, nor any monotonic function of it, has a known probability distribution when H0 is true!
– This does not lead to any useful test statistic
  • The t-statistic may be used as a reasonably close approximation
  • However, its H0 distribution is still unknown, as it depends on the unknown ratio σ1²/σ2²
  • In practice, a heuristic is often used (see Ch.
3.5)

The -2 log λ Approximation
• Used when the λ-ratio procedure does not lead to a test statistic whose H0 distribution is known
  – Example: Unequal Variance Two-Sided t-test
• Various approximations can be used
  – But only if certain regularity assumptions and restrictions hold true
• Best known approximation:
  – If H0 is true, -2 log λ has an asymptotic chi-square distribution,
    • with degrees of freedom equal to the difference between the numbers of parameters left unspecified by H1 and by H0
    • λ is the likelihood ratio
    • “asymptotic” = “as the sample size → ∞”
  – Provides an asymptotically valid testing procedure
  – Restrictions:
    • Parameters must be real numbers that can take on values in some interval
    • The maximum likelihood estimator is found at a turning point of the function
      – i.e. a “real” maximum, not at a boundary point
    • H0 is nested in H1 (as in all previous slides)
  – These restrictions are important in the proof
    • I skip the proof…
• Instead:
  – Our original basketball example, revised again:
    • Let’s drop our last assumption, that the variance in the population at large is the same as in the group of KU basketball players.
    • All we have left now are our observations and the hypothesis that μ1 > μ2
      – Where μ1 is the average height of basketball players
    • Observations in group 1 (KU): X1 = {83, 81, 71}
    • Observations in group 2: X2 = {65, 72, 70}

Example – Revised Again
– Using the Unequal Variance One-Sided t-Test
– We get: [result shown as a figure in the original slides]

The Analysis of Variance (ANOVA)
• Probably the most frequently used hypothesis testing procedure in statistics
• This section
  – Derives the Sum of Squares
  – Gives an outline of the ANOVA procedure
  – Introduces one-way ANOVA as a generalization of the two-sample t-test
  – Two-way and multi-way ANOVA
  – Further generalizations of ANOVA

Sum of Squares
• New variables (from Ch. 3)
  – The two-sample t-test tests for equality of the means of two groups.
  – We could express the observations as:
    $$X_{ij} = \mu_i + E_{ij}, \qquad i = 1, 2$$
  – Where the Eij are assumed to be NID(0, σ²)
  – H0 is μ1 = μ2
– This can also be written as:
  $$X_{ij} = \mu + \alpha_i + E_{ij}, \qquad i = 1, 2$$
  • μ could be seen as the overall mean
  • αi as the deviation from μ in group i
– This model is overparameterized
  • Uses more parameters than necessary
  • Necessitates the requirement mα1 + nα2 = 0
  • (always assumed imposed)
– We are deriving a test procedure similar to the two-sample two-sided t-test
– Using |t| as test statistic
  • Absolute value of the T statistic
– This is equivalent to using t²
  • Because it’s a monotonic function of |t|
– The square of the t statistic (from Ch.
3)
  $$T = \frac{(\bar{X}_1 - \bar{X}_2)\sqrt{mn}}{S\sqrt{m+n}}$$
– …can, after algebraic manipulations, be written as F
  $$F = \frac{B\,(m+n-2)}{W}$$
– where
  $$\bar{X}_1 = \frac{\sum_{j=1}^{m} X_{1j}}{m}, \qquad \bar{X}_2 = \frac{\sum_{j=1}^{n} X_{2j}}{n}, \qquad \bar{X} = \frac{m\bar{X}_1 + n\bar{X}_2}{m+n}$$
  $$B = \frac{mn}{m+n}(\bar{X}_1 - \bar{X}_2)^2 = m(\bar{X}_1 - \bar{X})^2 + n(\bar{X}_2 - \bar{X})^2$$
  $$W = \sum_{j=1}^{m}(X_{1j} - \bar{X}_1)^2 + \sum_{j=1}^{n}(X_{2j} - \bar{X}_2)^2$$
– B: between (among) group sum of squares
– W: within group sum of squares
– B + W: total sum of squares
  • Can be shown to be:
    $$\sum_{i=1}^{m}(X_{1i} - \bar{X})^2 + \sum_{i=1}^{n}(X_{2i} - \bar{X})^2$$
– Total number of degrees of freedom: m + n – 1
  • Between groups: 1
  • Within groups: m + n - 2
– This gives us the F statistic
  $$F = \frac{B\,(m+n-2)}{W}$$
– Our goal is to test the significance of the difference between the means of two groups
  • B measures the difference
– The difference must be measured relative to the variance within the groups
  • W measures that
– The larger F is, the more significant the difference

The ANOVA Procedure
• Subdivide observed total sum of squares into several components
  – In our case, B and W
• Pick appropriate significance point for a chosen Type I error α from an F table
• Compare the observed components to test our hypothesis

F-Statistic
• Significance points depend on degrees of freedom in B and W
  – In our case, 1 and (m + n – 2)
  – F table: http://www.ento.vt.edu/~sharov/PopEcol/tables/f005.html

Comments
• The two-group case readily generalizes to any number of groups.
• ANOVAs can be classified in various ways, e.g.
  – fixed effects models
  – mixed effects models
  – random effects models
  – The difference is discussed later
  – For now we consider fixed effects models
    • Parameter αi is fixed, but unknown, in group i
      $$X_{ij} = \mu + \alpha_i + E_{ij}$$
• Terminology
  – Although ANOVA contains the word ‘variance’
  – What we actually test for is equality of means between the groups
    • The different mean assumptions affect the variance, though
• ANOVAs are special cases of regression models from Ch. 8

One-Way ANOVA
• One-Way fixed-effect ANOVA
• Setup and derivation
  – Like the two-sample t-test, but for g groups
  – Observations (ni observations in group i, i = 1, 2, …, g)
    $$X_{i1}, X_{i2}, \ldots, X_{in_i}$$
  – Using the overparameterized model for X
    $$X_{ij} = \mu + \alpha_i + E_{ij}, \qquad j = 1, 2, \ldots, n_i, \quad i = 1, 2, \ldots, g$$
  – Eij assumed NID(0, σ²), Σ ni αi = 0, αi fixed in group i
– Null hypothesis H0 is: α1 = α2 = … = αg = 0
– Total sum of squares is
  $$\sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij} - \bar{X})^2$$
– This is subdivided into B and W
  $$B = \sum_{i=1}^{g} n_i(\bar{X}_i - \bar{X})^2, \qquad W = \sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij} - \bar{X}_i)^2$$
– with
  $$\bar{X}_i = \frac{\sum_{j=1}^{n_i} X_{ij}}{n_i}, \qquad \bar{X} = \frac{\sum_{i=1}^{g}\sum_{j=1}^{n_i} X_{ij}}{N}, \qquad N = \sum_{i=1}^{g} n_i$$
– Total degrees of freedom: N – 1
  • Subdivided into dfB = g – 1 and dfW = N - g
– This gives us our test statistic F
  $$F = \frac{B}{W} \cdot \frac{N-g}{g-1}$$
– We can now look in the F-table for these degrees of freedom to pick the significance point for F
– And calculate B and W from the observed data
– And accept or reject H0

Example
• Revisiting the basketball example
  – Looking at it as a One-Way ANOVA analysis
    • Observations in group 1 (KU): X1 = {83, 81, 71}
    • Observations in group 2: X2 = {65, 72, 70}
  – Total sum of squares:
    $$(73.67-83)^2 + (73.67-81)^2 + (73.67-71)^2 + (73.67-65)^2 + (73.67-72)^2 + (73.67-70)^2 = 239.33$$
  – B (between groups sum of squares)
    $$B = \sum_{i=1}^{g} n_i(\bar{X}_i - \bar{X})^2 = 3(78.33 - 73.67)^2 + 3(69 - 73.67)^2 = 130.67$$
  – W (within groups sum of squares)
    $$W = \sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij} - \bar{X}_i)^2 = \left((83-78.33)^2 + (81-78.33)^2 + (71-78.33)^2\right) + \left((65-69)^2 + (72-69)^2 + (70-69)^2\right) = 108.67$$
  – Degrees of freedom
    • Total: N-1 = 5
    • dfB = g – 1 = 2 - 1 = 1
    • dfW = N – g = 6 – 2 = 4
  – Table lookup for df 1 and 4 and α = 0.05:
  – Critical value: F = 7.71
  – Calculate F from our data:
    $$F = \frac{B}{W} \cdot \frac{N-g}{g-1} = \frac{130.67}{108.67} \cdot \frac{6-2}{2-1} = 4.81$$
  – So… 4.81 < 7.71
  – With ANOVA we actually accept H0!
    • F = t², so this F test is the two-sided t-test: 7.71 ≈ 2.776², the square of the two-sided significance point of |t| for 4 degrees of freedom, whereas the one-sided test used the smaller point 2.132

Same Example – with Excel
• Screenshots: [figures not reproduced]

Excel
• Offers most of these tests, built-in

Two-Way ANOVA
• Two-Way Fixed Effects ANOVA
• Overview only (in the scope of this book)
• More complicated setup; example:
  – Expression levels of one gene in lung cancer patients
  – a different risk classes
    • E.g.: ultrahigh, very high, intermediate, low
  – b different age groups
  – n individuals for each risk/age combination
– Expression levels (our observations): Xijk
  • i is the risk class (i = 1, 2, …, a)
  • j indicates the age group (j = 1, 2, …, b)
  • k corresponds to the individual in each group (k = 1, …, n)
– Each group is a possible risk/age combination
  • The number of individuals in each group is the same, n
  • This is a “balanced” design
  • Theory for unbalanced designs is more complicated and not covered in this book
– The Xijk can be arranged in a two-way table; each entry is the number of individuals in that risk/age group (aka “cell”):

                 Risk category i
  Age group j     1    2    3    4
       1          n    n    n    n
       2          n    n    n    n
       3          n    n    n    n
       4          n    n    n    n
       5          n    n    n    n

– The model adopted for each Xijk is
  $$X_{ijk} = \mu + \alpha_i + \beta_j + \delta_{ij} + E_{ijk}, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, b, \quad k = 1, 2, \ldots, n$$
  • Where the Eijk are NID(0, σ²)
  • The mean of Xijk is μ + αi + βj + δij
  • αi is a fixed parameter, additive for risk class i
  • βj is a fixed parameter, additive for age group j
  • δij is a fixed risk/age interaction parameter
    – Should be added if a possible group/group
interaction exists
– These constraints are imposed
  • Σi αi = Σj βj = 0
  • Σi δij = 0 for all j
  • Σj δij = 0 for all i
– The total sum of squares is then subdivided into four groups:
  • Risk class sum of squares
  • Age group sum of squares
  • Interaction sum of squares
  • Within cells (“residual” or “error”) sum of squares
– Associated with each sum of squares
  • Corresponding degrees of freedom
  • Hence also a corresponding mean square
    – Sum of squares divided by degrees of freedom
– The mean squares are then compared using F ratios to test for significance of various effects
  • First – test for a significant risk/age interaction
  • F-ratio used is the ratio of the interaction mean square to the within-cells mean square
  • If such an interaction exists, it may not be reasonable to test for significant risk or age differences
  • Example, μ in two risk classes, two age groups:
    – No evidence of interaction (the risk effect is the same in both age groups)

               Risk 1   Risk 2
      Age 1       4       12
      Age 2       7       15

    – Example of interaction

               Risk 1   Risk 2
      Age 1       4       15
      Age 2      11        6

Multi-Way ANOVA
• One-way and two-way fixed effects ANOVAs can be extended to multi-way ANOVAs
• Gets complicated
• Example: three-way ANOVA model:
  $$X_{ijkm} = \mu + \alpha_i + \beta_j + \gamma_k + \delta_{ij} + \varepsilon_{ik} + \zeta_{jk} + \eta_{ijk} + E_{ijkm}$$

Further generalizations of ANOVA
• The 2^m factorial design
  – A particular form of the one-way ANOVA
    • Interactions between main effects
  – m “factors” taken at two “levels”
    • E.g.
(1) Gender, (2) Tissue (lung, kidney), and (3) status (affected, not affected)
  – 2^m possible combinations of levels/groups
  – Can test for main effects and interactions
  – Need replicated experiments
    • n replications for each of the 2^m experiments
– Example, m = 3, denoted by A, B, C
  • 8 groups, {abc, ab, ac, bc, a, b, c, 1}
  • Write totals of n observations Tabc, Tab, …, T1
  • The total between sum of squares can be subdivided into seven individual sums of squares
    – Three main effects (A, B, C)
    – Three pairwise interactions (AB, AC, BC)
    – One triple-wise interaction (ABC)
    – Example: Sum of squares for A, and for BC, respectively
      $$\frac{(T_{abc} + T_{ab} + T_{ac} + T_{a} - T_{bc} - T_{b} - T_{c} - T_{1})^2}{8n}$$
      $$\frac{(T_{abc} - T_{ab} - T_{ac} + T_{a} + T_{bc} - T_{b} - T_{c} + T_{1})^2}{8n}$$
– If m ≥ 5 the number of groups becomes large
– Then the total number of observations, n·2^m, is large
– It is possible to reduce the number of observations by a process …
  • Confounding
    – Interaction ABC probably very small and not interesting
    – So, prefer a model without ABC, reduce data
    – There are ANOVA designs for that
• Fractional Replication
  – Related to confounding
  – Sometimes two groups cannot be distinguished from each other, then they are aliases
    • E.g. A and BC
  – This reduces the need for experiments and data
  – Ch. 13 talks more about this in the context of microarrays

Random/Mixed Effect Models
• So far: fixed effect models
  – E.g. risk class, age group fixed in previous example
    • Multiple experiments would use the same categories
• But: what if we took experimental data on several random days?
• The days in themselves have no meaning, but a “between days” sum of squares must be extracted
  – What if the days turn out to be important?
  – If we fail to test for it, the significance of our procedure is diminished.
  – Days are a random category, unlike risk and age!
• Mixed Effect Models
  – If some categories are fixed and some are random
  – Symbols used:
    • Greek letters for fixed effects
    • Uppercase Roman letters for random effects
    • Example: two-way mixed effect model with
      – Risk class a and days d and n values collected each day, the appropriate model is written:
        $$X_{ikl} = \mu + \alpha_i + D_l + G_{il} + E_{ikl}$$
• Random effect models have no fixed categories
• The details of the ANOVA analysis depend on which effects are random and which are fixed
• In a microarray context (more in Ch. 13)
  – There tend to be several fixed and several random effects, which complicates the analysis
  – Many interactions are simply assumed to be zero

Multivariate Methods
ANOVA: the Repeated Measures Case
Bootstrap Methods: the Two-sample t-test
All skipped …

Sequential Analysis
• Sequential Probability Ratio
  – Sample size not known in advance
  – Depends on outcomes of successive observations
  – Some of this theory is in BLAST
    • Basic Local Alignment Search Tool
  – The book focuses on discrete random variables
– Consider:
  • Random variable Y with distribution P(y;ξ)
  • Tests usually relate to the value of parameter ξ
  • H0: ξ is ξ0
  • H1: ξ is ξ1
  • We can choose a value for the Type I error α
  • And a value for the Type II error β
  • Sampling then continues while
    $$A < \frac{P(y_1;\xi_1)\, P(y_2;\xi_1) \cdots P(y_n;\xi_1)}{P(y_1;\xi_0)\, P(y_2;\xi_0) \cdots P(y_n;\xi_0)} < B$$
– A and B are chosen to correspond to an α and β
– Sampling continues until the ratio is less than A (accept H0) or greater than B (reject H0)
– Because these are discrete variables, boundary overshoot usually occurs
  • We don’t expect to exactly get values α and β
– Desired values for α and β are approximately achieved by using
  $$A = \frac{\beta}{1-\alpha}, \qquad B = \frac{1-\beta}{\alpha}$$
– It is also convenient to take logarithms, which gives us:
  $$\log\frac{\beta}{1-\alpha} < \sum_i \log\frac{P(y_i;\xi_1)}{P(y_i;\xi_0)} < \log\frac{1-\beta}{\alpha}$$
– Using
  $$S_{1,0}(y) = \log\frac{P(y;\xi_1)}{P(y;\xi_0)}$$
– We can write
  $$\log\frac{\beta}{1-\alpha} < \sum_i S_{1,0}(y_i) < \log\frac{1-\beta}{\alpha}$$
• Example: sequence matching
  – H0: p0 = 0.25 (probability of a match is 0.25)
  – H1: p1 = 0.35 (probability of a match is 0.35)
  – Type I error α and Type II error β both chosen as 0.01
  – Yi: 1 if there is a match at position i, otherwise 0
  – Sampling continues while
    $$\log\frac{1}{99} < \sum_i S_{1,0}(Y_i) < \log 99$$
  – with
    $$S_{1,0}(Y_i) = \log\frac{(0.35)^{Y_i}(0.65)^{1-Y_i}}{(0.25)^{Y_i}(0.75)^{1-Y_i}}$$
– S can be seen as the support offered by Yi for H1
– The inequality can be re-written as
  $$-9.581 < \sum_i (Y_i - 0.2984) < 9.581$$
– This is actually a random walk with step sizes 0.7016 for a match and -0.2984 for a mismatch
• Power Function for a Sequential Test
  – Suppose the true value of the parameter of interest is ξ
  – We wish to know the probability that H1 is accepted, given ξ
  – This probability is the power Ρ(ξ) of the test
    $$P(\xi) \approx \frac{1 - \left(\frac{\beta}{1-\alpha}\right)^{\theta^*}}{\left(\frac{1-\beta}{\alpha}\right)^{\theta^*} - \left(\frac{\beta}{1-\alpha}\right)^{\theta^*}}$$
– Where θ* is the unique non-zero solution to θ in
  $$\sum_{y \in R} P(y;\xi)\left(\frac{P(y;\xi_1)}{P(y;\xi_0)}\right)^{\theta} = 1$$
– R is the range of values of Y
– Equivalently, θ* is the unique non-zero solution to θ in
  $$\sum_{y \in R} P(y;\xi)\, e^{\theta S_{1,0}(y)} = 1$$
– Where S is defined as before
– This is very similar to Ch. 7 – Random Walks
– The parameter θ* is the same as in Ch.
7
– And it will be the same in Ch. 10 – BLAST
– < skipping the random walk part >
• Mean Sample Size
  – The (random) number of observations until one or the other hypothesis is accepted
  – Find an approximation by ignoring boundary overshoot
  – Essentially the same method is used to find the mean number of steps until the random walk stops
– Two expressions are calculated for Σi S1,0(Yi)
  • One involves the mean sample size
  • By equating both expressions, solve for the mean sample size
  $$\sum_i S_{1,0}(y_i) \approx (1 - P(\xi))\log\frac{\beta}{1-\alpha} + P(\xi)\log\frac{1-\beta}{\alpha}$$
  $$E(S_{1,0}(Y_i)) = E\left(\log\frac{P(Y_i;\xi_1)}{P(Y_i;\xi_0)}\right) = \sum_{y \in R} P(y;\xi)\log\frac{P(y;\xi_1)}{P(y;\xi_0)}$$
– So, the mean sample size is:
  $$\frac{(1 - P(\xi))\log\frac{\beta}{1-\alpha} + P(\xi)\log\frac{1-\beta}{\alpha}}{\sum_{y \in R} P(y;\xi)\log\frac{P(y;\xi_1)}{P(y;\xi_0)}}$$
– The numerator depends on Ρ(ξ), and so also on θ*; the denominator depends on ξ directly
– A generalization applies if the true distribution Q(y) of Y differs from both the H0 and the H1 distribution – relevant to BLAST
  $$\frac{(1 - P(\xi))\log\frac{\beta}{1-\alpha} + P(\xi)\log\frac{1-\beta}{\alpha}}{\sum_{y \in R} Q(y)\log\frac{P(y;\xi_1)}{P(y;\xi_0)}}$$
• Example
  – Same sequence matching example as before
    • H0: p0 = 0.25 (probability of a match is 0.25)
    • H1: p1 = 0.35 (probability of a match is 0.35)
    • Type I error α and Type II error β both chosen as 0.01
  – Mean sample size equation is:
    $$\frac{9.190\,P(p) - 4.595}{p\log\frac{7}{5} + (1-p)\log\frac{13}{15}}$$
  – Mean sample size when H0 is true: 194
  – Mean sample size when H1 is true: 182
• Boundary Overshoot
  – So far we assumed no boundary overshoot
  – In practice, there will almost always be some, though
    • Exact Type I and Type II errors differ from α and β
  – Random walk theory can be used to assess how significant the effects of boundary overshoot are
  – It can be shown that the sum of Type I and Type II errors is always less than α + β (also individually)
  – BLAST deals with this in a novel
way; see Ch. 10
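The sequence-matching example from the sequential analysis slides can be checked numerically. The following is a minimal sketch that ignores boundary overshoot, as the slides do; the function names are hypothetical. It assumes the power is Ρ = α when H0 is true and Ρ = 1 − β when H1 is true, which is what the power formula gives for θ* = 1 and θ* = −1 respectively.

```python
from math import log


def sprt_bounds(alpha, beta):
    """Log-scale stopping bounds: log(beta/(1-alpha)) and log((1-beta)/alpha)."""
    return log(beta / (1 - alpha)), log((1 - beta) / alpha)


def step(y, p0, p1):
    """S_{1,0}(y) for one observation y (1 = match, 0 = mismatch)."""
    return log(p1 / p0) if y == 1 else log((1 - p1) / (1 - p0))


def mean_sample_size(p, power, p0, p1, alpha, beta):
    """Approximate mean sample size when the true match probability is p."""
    lower, upper = sprt_bounds(alpha, beta)
    numerator = (1 - power) * lower + power * upper
    denominator = p * step(1, p0, p1) + (1 - p) * step(0, p0, p1)
    return numerator / denominator


def run_sprt(ys, p0, p1, alpha, beta):
    """Accumulate S_{1,0}(y_i) until a bound is reached; return (decision, n)."""
    lower, upper = sprt_bounds(alpha, beta)
    total = 0.0
    for n, y in enumerate(ys, start=1):
        total += step(y, p0, p1)
        if total <= lower:
            return "accept H0", n
        if total >= upper:
            return "reject H0", n
    return "continue sampling", len(ys)


P0, P1, ALPHA, BETA = 0.25, 0.35, 0.01, 0.01
n_under_h0 = mean_sample_size(P0, ALPHA, P0, P1, ALPHA, BETA)      # power = alpha
n_under_h1 = mean_sample_size(P1, 1 - BETA, P0, P1, ALPHA, BETA)   # power = 1 - beta
# Bound expressed in the units of the rewritten inequality sum(Y_i - 0.2984)
walk_bound = sprt_bounds(ALPHA, BETA)[1] / (step(1, P0, P1) - step(0, P0, P1))
```

Rounded to whole observations, `n_under_h0` and `n_under_h1` reproduce the mean sample sizes 194 and 182 quoted in the example, and `walk_bound` reproduces the bound 9.581. `run_sprt` can be applied to any 0/1 match sequence to see where sampling would stop.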
