# Classical Hypothesis Testing Theory Adapted from Alexander Senf

```Classical Hypothesis Testing
Theory
Review
• 5 steps of classical hypothesis testing (Ch. 3)
1. Declare null hypothesis H0 and alternate
hypothesis H1
2. Fix a threshold α for Type I error (1% or 5%)
•
•
Type I error (α): reject H0 when it is true
Type II error (β): accept H0 when it is false
3. Determine a test statistic
•
7/31/2008
a quantity calculated from the data
2
Review
4. Determine what observed values of the test
statistic should lead to rejection of H0
•
Significance point K (determined by α)
5. Test to see if observed data is more extreme
than significance point K
•
•
7/31/2008
If it is, reject H0
Otherwise, accept H0
3
Overview of Ch. 9
– Simple Fixed-Sample-Size Tests
– Composite Fixed-Sample-Size Tests
– The -2 log λ Approximation
– The Analysis of Variance (ANOVA)
– Multivariate Methods
– ANOVA: the Repeated Measures Case
– Bootstrap Methods: the Two-sample t-test
– Sequential Analysis
7/31/2008
4
Simple Fixed-Sample-Size Tests
7/31/2008
5
The Issue
• In the simplest case, everything is specified
– Probability distribution of H0 and H1
• Including all parameters
– α (and K)
– But: β is left unspecified
• It is desirable to have a procedure that
minimizes β given a fixed α
– This would maximize the power of the test
• 1-β, the probability of rejecting H0 when H1 is true
7/31/2008
6
Most Powerful Procedure
• Neyman-Pearson Lemma
– States that the likelihood-ratio (LR) test is the most
powerful test for a given α
– The LR is defined as:
f1 ( X 1 ) f1 ( X 2 )  f1 ( X n )
LR 
f0 ( X 1 ) f0 ( X 2 ) f0 ( X n )
– where
• f0, f1 are completely specified density functions for H0,H1
• X1, X2, … Xn are iid random variables
7/31/2008
7
Neyman-Pearson Lemma
– H0 is rejected when LR ≥ K
– With a constant K chosen such that:
P(LR ≥ K when H0 is true) = α
– Let’s look at an example using the NeymanPearson Lemma!
– Then we will prove it.
7/31/2008
8
Example
• Basketball players seem to be
taller than average
– Use this observation to formulate
our hypothesis H1:
• “Tallness is a factor in the
players”
– The null hypothesis, H0, could be:
• “No, the players on KU’s team are a
just average height compared to the
population in the U.S.”
• “Average height of the team and the
population in general is the same”
7/31/2008
9
Example
• Setup:
– Average height of males in the US: 5’9 ½“
– Average height of KU players in 2008: 6’04 ½”
• Assumption: both populations are normal-distributed
centered on their respective averages (μ0 = 69.5 in, μ1 =
76.5 in) and σ = 2
( x  76.5 ) 2
( x  69.5 ) 2


• Sample size: 3
8
8
e
e
f1 ( x) 
f 0 ( x) 
2 2
2 2
– Choose α: 5%
7/31/2008
10
Example
• The two populations:
f0
f1
p
height (inches)
7/31/2008
11
Example
– Our test statistic is the Likelihood Ratio, LR
e

( x1  76.5 ) 2
8
f1 ( x1 ) f1 ( x2 ) f1 ( x3 )  2 2
 ( x) 
( x  69.5 )

f 0 ( x1 ) f 0 ( x2 ) f 0 ( x3 )
8
e
2 2
( x2  76.5 ) 2
8
e
2 2
2
1
e

e

( x2  69.5 ) 2
8
2 2

( x3  76.5 ) 2
8
2 2
e

( x3  69.5 ) 2
8
2 2
3
e

1
( xi 69.5) 2 ( xi 76.5) 2
8 i1
– Now we need to determine a significance point K
at which we can reject H0, given α = 5%
7/31/2008
• P(Λ(x) ≥ K | H0 is true) = 0.05, determine K
12
Example
– So we just need to solve for K’ and calculate K:
  
f
0
( x1 ) f 0 ( x2 ) f 0 ( x3 )dx1dx2 dx3  0.05
K1' K 2' K 3'
• How to solve this? Well, we only need one set of values
to calculate K, so let’s pick two and solve for the third:
 
  f
0
( x1 ) f 0 ( x2 ) f 0 ( x3 )dx1dx2 dx3  0.05
6871 K 3'
• We get one result: K3’=71.0803
7/31/2008
13
Example
– Then we can just plug it in to Λ and calculate K:
3
K e
e

1
( K i' 69.5) 2 ( K i' 76.5) 2
8 i1

1
( 6869.5) 2 ( 6876.5) 2  ( 7169.5) 2 ( 7176.5) 2  ( 71.080369.5) 2 ( 71.080376.5) 2
8
 1.663 *10 7
7/31/2008
14

Example
– With the significance point K = 1.663*10-7 we can
now test our hypothesis based on observations:
• E.g.: Sasha = 83 in, Darrell = 81 in, Sherron = 71 in
3
 ( X  {83,81,71})  e

1
( X i  69.5 ) 2  ( X i  76.5 ) 2
8 i 1
(83,81,71)  1.446 *1012
• 1.446*1012 > 1.663*10-7
• Therefore, our hypothesis that tallness is a factor in the
recruitment of KU basketball players is true.
7/31/2008
15
Neyman-Pearson Proof
• Let A define region in the joint range of X1, X2,
… Xn such that LR ≥ K. A is the critical region.
– If A is the only critical region of size α we are done
 L(H )  f (u ) f (u ) f (u )du du du
A
0
0
1
0
2
0
n
1
2
n

A
– Let’s assume another critical region of size α,
defined by B
 L(H )  f (u ) f (u ) f (u )du du du
B
7/31/2008
0
0
1
0
2
0
n
1
2
n

B
16
Proof
– H0 is rejected if the observed vector (x1, x2, …, xn)
is in A or in B.
– Let A and B overlap in region C
– Power of the test: rejecting H0 when H1 is true
• The Power of this test using A is:
 L(H )   f (u ) f (u ) f (u )du du du
A
7/31/2008
1
1
1
1
2
1
n
1
2
n
A
17
Proof
– Define: Δ = ∫AL(H1) - ∫BL(H1)
• The power of the test using A minus using B
     f1 (u1 )  f1 (un )du1  dun     f1 (u1 )  f1 (un )du1  dun
A
B
    f1 (u1 )  f1 (un )du1  dun     f1 (u1 )  f1 (un )du1  dun
A\C
B\C
• Where A\C is the set of points in A but not in C
• And B\C contains points in B but not in C
7/31/2008
18
Proof
– So, in A\C we have:

f1 (u1 )  f1 (u n )
K
f 0 (u1 )  f 0 (u n )
f1 (u1 )  f1 (un )  Kf0 (u1 )  f 0 (un )
– While in B\C we have:
f1 (u1 )  f1 (un )  Kf0 (u1 )  f 0 (un )
Why?
7/31/2008
19
Proof
– Thus
     Kf 0 (u1 )  f 0 (un )du1  dun     Kf 0 (u1 )  f 0 (un )du1  dun
A\C
B\C
    Kf 0 (u1 )  f 0 (un )du1  dun     Kf 0 (u1 )  f 0 (un )du1  dun
A
B
 K  K
0
– Which implies that the power of the test using A is
greater than or equal to the power using B.
7/31/2008
20
Composite Fixed-Sample-Size Tests
7/31/2008
21
Not Identically Distributed
• In most cases, random variables are not
identically distributed, at least not in H1
– This affects the likelihood function, L
– For example, H1 in the two-sample t-test is:
m
L
i 1
1
e
2 
 ( x1i  1 ) 2
2 2
n

i 1
1
e
2 
 ( x2 i   2 ) 2
2 2
– Where μ1 and μ2 are different
7/31/2008
22
Composite
– Further, the hypotheses being tested do not
specify all parameters
– They are composite
– This chapter only outlines aspects of composite
test theory relevant to the material in this book.
7/31/2008
23
Parameter Spaces
– The set of values the parameters of interest can take
– Null hypothesis: parameters in some region ω
– Alternate hypothesis: parameters in Ω
– ω is usually a subspace of Ω
• Nested hypothesis case
– Null hypothesis nested within alternate hypothesis
– This book focuses on this case
• “if the alternate hypothesis can explain the data
significantly better we can reject the null hypothesis”
7/31/2008
24
λ Ratio
• Optimality theory for composite tests suggests
this as desirable test statistic:
Lmax ( )

Lmax ()
• Lmax(ω): maximum likelihood when parameters are
confined to the region ω
• Lmax(Ω): maximum likelihood when parameters are
confined to the region Ω, defined by H1
• H0 is rejected when λ is sufficiently small (→ Type I error)
7/31/2008
25
Example: t-tests
• The next slides calculate the λ-ratio for the
two sample t-test (with the likelihood)
m
L
i 1
1
e
2 
 ( x1i  1 ) 2
2 2
n

i 1
1
e
2 
 ( x2 i   2 ) 2
2 2
– t-tests later generalize to ANOVA and T2 tests
7/31/2008
26
Equal Variance Two-Sided t-test
• Setup
– Random variables X11,…,X1m in group 1 are
Normally and Independently Distributed (μ1,σ2)
– Random variables X21,…,X2n in group 2 are NID
(μ2,σ2)
– X1i and X2j are independent for all i and j
– Null hypothesis H0: μ1= μ2 (= μ, unspecified)
– Alternate hypothesis H1: both unspecified
7/31/2008
27
Equal Variance Two-Sided t-test
• Setup (continued)
– σ2 is unknown and unspecified in H0 and H1
• Is assumed to be the same in both distributions
– Region ω is:
  {1  2 ,0   2  }
– Region Ω is:
  {  1  ,  2  ,0   2  }
7/31/2008
28
Equal Variance Two-Sided t-test
• Derivation
– H0: writing μ for the mean, when μ1= μ2, the
maximum over likelihood ω is at
ˆ  X 
X 11  X 12    X 1m  X 21  X 22    X 2 n
mn
– And the (common) variance σ2 is
2
(
X

X
)

(
X

X
)
i1 1i
i1 2i
m
̂ 02 
7/31/2008
2
n
mn
29
Equal Variance Two-Sided t-test
– Inserting both into the likelihood function, L
Lmax ( ) 
7/31/2008
1
(2ˆ )
2
0
m n
2
e

m n
2
30
Equal Variance Two-Sided t-test
– Do the same thing for region Ω
ˆ1  X 1 
X 11  X 12    X 1m
m
ˆ 2  X 2 
2
2
(
X

X
)

(
X

X
)
1
2
i 1 1i
i 1 2i
m
ˆ12 
X 21  X 22    X 2 n
n
n
mn
– Which produces this likelihood Function, L
Lmax () 
7/31/2008
1
(2ˆ )
2
1
m n
2
e

m n
2
31
Equal Variance Two-Sided t-test
– The test statistic λ is then
e
 m2 n
m n
2
 ˆ12 
Lmax ( ) (2ˆ 02 )


  2 
 m2 n
Lmax ()
e
 ˆ 0 
2 m2 n
(2ˆ1 )
mn
2
It’s the same function, just
With different variances
7/31/2008
32
Equal Variance Two-Sided t-test
– We can then use the algebraic identity
m
n
m
n
 ( X 1i  X )   ( X 2i  X )   ( X 1i  X 1 )   ( X 2i  X 2 ) 2 
2
i 1
2
i 1
– To show that
2
i 1
 1 
   1t 2 
 m n2 
– Where t is (from Ch. 3)
7/31/2008
i 1
mn
( X 1  X 2 )2
mn
mn
2
T
( X 1  X 2 ) mn
S mn
33
Equal Variance Two-Sided t-test
– t is the observed value of T
– S is defined in Ch. 3 as
m
n
2
(
X

X
)

(
X

X
)
1
2
 1i
 2i
2
S2 
i 1
i 1
mn2
λ
We can plot λ as a
function of t:
(e.g. m+n=10)
7/31/2008
t
34
Equal Variance Two-Sided t-test
– So, by the monotonicity argument, we can use t2
or |t| instead of λ as test statistic
– Small values of λ correspond to large values of |t|
– Sufficiently large |t| lead to rejection of H0
– The H0 distribution of t is known
• t-distribution with m+n-2 degrees of freedom
– Significance points are widely available
• Once α has been chosen, values of |t| sufficiently large
to reject H0 can be determined
7/31/2008
35
http://www.socr.ucla.edu/Applets.dir/T-table.html
Equal Variance Two-Sided t-test
7/31/2008
36
Equal Variance One-Sided t-test
• Similar to Two-Sided t-test case
– Different region Ω for H1:
• Means μ1 and μ2 are not simply different, but one is
larger than the other μ1 ≥ μ2
  {1  2 ,0   2  }
• If x1  x 2 then maximum likelihood estimates are the
same as for the two-sided case
7/31/2008
37
Equal Variance One-Sided t-test
• If x1  x 2 then the unconstrained maximum of the
likelihood is outside of ω
• The unique maximum is at ( x1 , x 2 ) , implying that the
maximum in ω occurs at a boundary point in Ω
• At this point estimates of μ1 and μ2 are equal ( x)
• At this point the likelihood ratio is 1 and H0 is not
rejected
• Result: H0 is rejected in favor of H1 (μ1 ≥ μ2) only for
sufficiently large positive values of t
7/31/2008
38
Example - Revised
• This scenario fits with our original example:
– H1 is that the average height of KU basketball
players is bigger than for the general population
– One-sided test
– We could assume that we don’t know the
averages for H0 and H1
– We actually don’t know σ (I just guessed 2 in the
original example)
7/31/2008
39
Example - Revised
• Updated example:
– Observation in group 1 (KU): X1 = {83, 81, 71}
– Observation in group 2: X2 = {65, 72, 70}
– Pick significance point for t from a table: tα = 2.132
• t-distribution, m+n-2 = 4 degrees of freedom, α = 0.05
– Calculate t with our observations
(78.3  69) 9
27.9
t

 2.185
12.7673
5.2122 6
– t > tα, so we can reject H0!
7/31/2008
40
• Problems that might arise in other cases
– The λ-ratio might not reduce to a function of a
well-known test statistic, such as t
– There might not be a unique H0 distribution of λ
– Fortunately, the t statistic is a pivotal quantity
• Independent of the parameters not prescribed by H0
– e.g. μ, σ
– For many testing procedures this property does
not hold
7/31/2008
41
Unequal Variance Two-Sided t-test
• Identical to Equal Variance Two-Sided t-test
– Except: variances in group 1 and group 2 are no
longer assumed to be identical
•
•
•
•
•
7/31/2008
Group 1: NID(μ1, σ12)
Group 2: NID(μ2, σ22)
With σ12 and σ22 unknown and not assumed identical
Region ω = {μ1 = μ2, 0 < σ12, σ22 < +∞}
Ω makes no constraints on values μ1, μ2, σ12, and σ22
42
Unequal Variance Two-Sided t-test
– The likelihood function of (X11, X12, …, X1m, X21, X22,
…, X2n) then becomes
m

i 1
1
e
2  1

( x1i  1 ) 2
2 12
n

i 1
1
2  2

e
( x21i   2 ) 2
2 22
– Under H0 (μ1 = μ2 = μ), this becomes:
m

i 1
7/31/2008
1
e
2  1

( x1i   ) 2
2 12
n

i 1
1
2  2

e
( x21i   ) 2
2 22
43
Unequal Variance Two-Sided t-test
– Maximum likelihood estimates ̂ , ̂ 12 and ̂ 22 satisfy
the simultaneous equations:
 (x
1i
2
1
 ˆ )
ˆ
2i
2
2
 ˆ )
ˆ
2
1
 (x

ˆ 22
(x


ˆ
7/31/2008
(x


1i
0
 ˆ ) 2
m
2i
 ˆ ) 2
n
44
Unequal Variance Two-Sided t-test
–  cubic equation in ̂
– Neither the λ ratio, nor any monotonic function
has a known probability distribution when H0 is
true!
– This does not lead to any useful testing statistic
• The t-statistic may be used as reasonably close
• However H0 distribution is still unknown, as it depends
on the unknown ratio σ12/σ22
• In practice, a heuristic is often used (see Ch. 3.5)
7/31/2008
45
The -2 log λ Approximation
7/31/2008
46
The -2 log λ Approximation
• Used when the λ-ratio procedure does not
lead to a test statistic whose H0 distribution is
known
– Example: Unequal Variance Two-Sided t-test
• Various approximations can be used
– But only if certain regularity assumptions and
restrictions hold true
7/31/2008
47
The -2 log λ Approximation
• Best known approximation:
– If H0 is true, -2 log λ has an asymptotic chi-square
distribution,
• with degrees of freedom equal to the difference in
parameters unspecified by H0 and H1, respectively.
• λ is the likelihood ratio
• “asymptotic” = “as the sample size → ∞”
– Provides an asymptotically valid testing procedure
7/31/2008
48
The -2 log λ Approximation
– Restrictions:
• Parameters must be real numbers that can take on
values in some interval
• The maximum likelihood estimator is found at a turning
point of the function
– i.e. a “real” maximum, not at a boundary point
• H0 is nested in H1 (as in all previous slides)
– These restrictions are important in the proof
• I skip the proof…
7/31/2008
49
The -2 log λ Approximation
– Our original basketball example, revised again:
• Let’s drop our last assumption, that the variance in the
population at large is the same as in the group of KU
• All we have left now are our observations and the
hypothesis that μ1 > μ2
– Where μ1 is the average height of Basketball players
• Observation in group 1 (KU): X1 = {83, 81, 71}
• Observation in group 2: X2 = {65, 72, 70}
7/31/2008
50
Example – Revised Again
– Using the Unequal Variance One-Sided t-Test
– We get:
7/31/2008
51
The Analysis of Variance (ANOVA)
7/31/2008
52
The Analysis of Variance (ANOVA)
• Probably the most frequently used hypothesis
testing procedure in statistics
• This section
– Derives of the Sum of Squares
– Gives an outline of the ANOVA procedure
– Introduces one-way ANOVA as a generalization of
the two-sample t-test
– Two-way and multi-way ANOVA
– Further generalizations of ANOVA
7/31/2008
53
Sum of Squares
• New variables (from Ch. 3)
– The two-sample t-test tests for equality of the
means of two groups.
– We could express the observations as:
X ij   i  Eij
i  1,2
– Where the Eij are assumed to be NID(0,σ2)
– H0 is μ1 = μ2
7/31/2008
54
Sum of Squares
– This can also be written as:
X ij     i  Eij
i  1,2
• μ could be seen as overall mean
• αj as deviation from μ in group j
– This model is overparameterized
• Uses more parameters than necessary
• Necessitates the requirement m1  n 2  0
• (always assumed imposed)
7/31/2008
55
Sum of Squares
– We are deriving a test procedure similar to the
two-sample two-sided t-test
– Using |t| as test statistic
• Absolute value of the T statistic
– This is equivalent to using t2
• Because it’s a monotonic function of |t|
– The square of the t statistic (from Ch. 3)
( X 1  X 2 ) mn
T
S mn
7/31/2008
56
Sum of Squares
– …can, after algebraic manipulations, be written as F
B
F  ( m  n  2)
W
– where
X
X 
m
m
n
X2 
1j
1
j 1
j 1
X2j
X
n
mX 1  nX 2
mn
mn
B
( X 1  X 2 ) 2  m( X 1  X ) 2  n ( X 2  X ) 2
mn
m
n
W   ( X1 j  X 1 )   ( X 2 j  X 2 )2
2
j 1
7/31/2008
j 1
57
Sum of Squares
– B: between (among) group sum of squares
– W: within group sum of squares
– B + W: total sum of squares
• Can be shown to be:
m
(X
i 1
n
2
2

X
)

(
X

X
)

1i
2i
i 1
– Total number of degrees of freedom: m + n – 1
• Between groups: 1
• Within groups: m + n - 2
7/31/2008
58
Sum of Squares
– This gives us the F statistic F  B (m  n  2)
W
– Our goal is to test the significance of the
difference between the means of two groups
• B measures the difference
– The difference must be measured relative to the
variance within the groups
• W measures that
– The larger F is, the more significant the difference
7/31/2008
59
The ANOVA Procedure
• Subdivide observed total sum of squares into
several components
– In our case, B and W
• Pick appropriate significance point for a
chosen Type I error α from an F table
• Compare the observed components to test
our hypothesis
7/31/2008
60
F-Statistic
• Significance points depend on degrees of
freedom in B and W
– In our case, 1 and (m + n – 2)
7/31/2008
http://www.ento.vt.edu/~sharov/PopEcol/tables/f005.html
61
• The two-group case readily generalizes to any
number of groups.
• ANOVAs can be classified in various ways, e.g.
– fixed effects models
– mixed effects models
– random effects model
– Difference is discussed later
– For now we consider fixed effect models
• Parameter αi is fixed, but unknown, in group i
7/31/2008
X ij     i  Eij
62
• Terminology
– Although ANOVA contains the word ‘variance’
– What we actually test for is a equality in means
between the groups
• The different mean assumptions affect the variance,
though
• ANOVAs are special cases of regression
models from Ch. 8
7/31/2008
63
One-Way ANOVA
• One-Way fixed-effect ANOVA
• Setup and derivation
– Like two-sample t-test for g number of groups
– Observations (ni observations, i=1,2,…,g)
X i1 , X i 2 ,, X in
– Using overparameterized model for X
X ij     i  Eij
j  1,2,, ni
i  1,2,  , g
– Eij assumed NID(0,σ2), Σniαi = 0, αi fixed in group i
7/31/2008
64
One-Way ANOVA
– Null Hypothesis H0 is: α1 = α2 = … = αg = 0
– Total sum of squares is
g
ni
 ( X
i 1 j 1
ij
 X )2
– This is subdivided into B and W
g
g
ni
W   ( X ij  X i ) 2
B   ni ( X i  X ) 2
i 1 j 1
i 1
– with
ni
X ij
j 1
ni
Xi 
7/31/2008
g
ni
X  
i 1 j 1
X ij
N
g
N   ni
i 1
65
One-Way ANOVA
– Total degrees of freedom: N – 1
• Subdivided into dfB = g – 1 and dfW = N - g
– This gives us our test statistic F
F
B Ng
*
W g 1
– We can now look in the F-table for these degrees
of freedom to pick significance points for B and W
– And calculate B and W from the observed data
– And accept or reject H0
7/31/2008
66
Example
– Looking at it as a One-Way ANOVA analysis
• Observation in group 1 (KU): X1 = {83, 81, 71}
• Observation in group 2: X2 = {65, 72, 70}
– Total Sum of Squares:
(73.66  83) 2  (73.66  81) 2  (73.66  71) 2  (73.66  65) 2  (73.66  72) 2  (73.66  70) 2  239.3336
– B (between groups sum of squares)
g
B   ni ( X i  X ) 2  3(78.33  76.33) 2  3(69  76.33) 2  130.57
i 1
7/31/2008
67
Example
– W (within groups sum of squares)
g
ni
W   ( X ij  X i ) 2
i 1 j 1
 ((83  78.33) 2  (81  78.33) 2  (71  78.33) 2 )  ((65  69) 2  (72  69) 2  (70  69) 2 )
 108.667
– Degrees of freedom
• Total: N-1 = 5
• dfB = g – 1 = 2 - 1 = 1
• dfW = N – g = 6 – 2 = 4
7/31/2008
68
Example
– Table lookup for df 1 and 4 and α = 0.05:
– Critical value: F = 7.71
– Calculate F from our data:
F
B N  g 130.57 6  2
*

*
 4.806
W g  1 108.667 2  1
– So… 4.806 < 7.71
– With ANOVA we actually accept H0!
• Seems to be the large variance in group 1
7/31/2008
69
Same Example – with Excel
• Screenshots:
7/31/2008
70
Excel
• Offers most of these tests, built-in
7/31/2008
71
Two-Way ANOVA
• Two-Way Fixed Effects ANOVA
• Overview only (in the scope of this book)
• More complicated setup; example:
– Expression levels of one gene in lung cancer patients
– a different risk classes
• E.g.: ultrahigh, very high, intermediate, low
– b different age groups
– n individuals for each risk/age combination
7/31/2008
72
Two-Way ANOVA
– Expression levels (our observations): Xijk
• i is the risk class (i = 1, 2, …, a)
• j indicates the age group
• k corresponds to the individual in each group (k = 1, …, n)
– Each group is a possible risk/age combination
• The number of individuals in each group is the same, n
• This is a “balanced” design
• Theory for unbalanced designs is more complicated and
not covered in this book
7/31/2008
73
Two-Way ANOVA
– The Xijk can be arranged in a table:
Risk category
1
2
3
4
1
n
n
n
n
2
n
n
n
n
3
n
n
n
n
4
n
n
n
n
5
n
n
n
n
Age group
j
i
Number of individuals in this
risk/age group (aka “cell”)
7/31/2008
This is a two-way table
74
Two-Way ANOVA
– The model adopted for each Xijk is
X ijk     i   j   ij  Eijk
i  1,2,, a
•
•
•
•
•
j  1,2,, b
k  1,2,, n
Where Eijk are NID(μ, α2)
The mean of Xijk is μ + αi + βi + δij
αi is a fixed parameter, additive for risk class i
βi is a fixed parameter, additive for age group i
δij is a fixed risk/age interaction parameter
– Should be added is a possible group/group interaction exists
7/31/2008
75
Two-Way ANOVA
– These constraints are imposed
• Σiαi = Σiβi = 0
• Σiδij = 0 for all j
• Σjδij = 0 for all i
– The total sum of squares is then subdivided into
four groups:
7/31/2008
•
•
•
•
Risk class sum of squares
Age group sum of squares
Interaction sum of squares
Within cells (“residual” or “error”) sum of squares
76
Two-Way ANOVA
– Associated with each sum of squares
• Corresponding degrees of freedom
• Hence also a corresponding mean square
– Sum of squares divided by degrees of freedom
– The mean squares are then compared using F
ratios to test for significance of various effects
• First – test for a significant risk/age interaction
• F-ratio used is ratio of interaction mean square and
within-cells mean square
7/31/2008
77
Two-Way ANOVA
– Example of interaction
7/31/2008
Age
– No evidence of interaction
Age
• If such an interaction is used, it may not be reasonable
to test for significant risk or age differences
• Example, μ in two risk classes, two age groups:
Risk
1
2
1
4
12
2
7
15
1
2
1
4
15
2
11
6
78
Multi-Way ANOVA
• One-way and two-way fixed effects ANOVAs
can be extended to multi-way ANOVAs
• Gets complicated
• Example: three-way ANOVA model:
X ijkm     i   j   k   ij  ik   jk  ijk  Eijkm
7/31/2008
79
Further generalizations of ANOVA
• The 2m factorial design
– A particular form of the one-way ANOVA
• Interactions between main effects
– m “factors” taken at two “levels”
• E.g. (1) Gender, (2) Tissue (lung, kidney), and (3) status
(affected, not affected)
– 2m possible combinations of levels/groups
– Can test for main effects and interactions
– Need replicated experiments
• n replications for each of the 2m experiments
7/31/2008
80
Further generalizations of ANOVA
– Example, m = 3, denoted by A, B, C
• 8 groups, {abc, ab, ac, bc, a, b, c, 1}
• Write totals of n observations Tabc, Tab, …, T1
• The total between sum of squares can be subdivided
into seven individual sums of squares
–
–
–
–
7/31/2008
Three main effects (A, B, C)
Three pair wise interactions (AB, AC, BC)
One triple-wise interaction (ABC)
Example: Sum of squares for A, and for BC, respectively
(Tabc  Tab  Tac  Ta  Tbc  Tb T cT1 ) 2
8n
(Tabc  Tab  Tac  Ta  Tbc  Tb T cT1 ) 2
8n
81
Further generalizations of ANOVA
– If m ≥ 5 the number of groups becomes large
– Then the total number of observations, n2m is large
– It is possible to reduce the number of observations
by a process …
• Confounding
– Interaction ABC probably very small and not
interesting
– So, prefer a model without ABC, reduce data
– There are ANOVA designs for that
7/31/2008
82
Further generalizations of ANOVA
• Fractional Replication
– Related to confounding
– Sometimes two groups cannot be distinguished
from each other, then they are aliases
• E.g. A and BC
– This reduces the need to experiments and data
microarrays
7/31/2008
83
Random/Mixed Effect Models
• So far: fixed effect models
– E.g. Risk class, age group fixed in previous example
• Multiple experiments would use same categories
• But: what if we took experimental data on several
random days?
• The days in itself have no meaning, but a “between
days” sum of squares must be extracted
– What if the days turn out to be important?
– If we fail to test for it, the significance of our procedure is
diminished.
– Days are a random category, unlike risk and age!
7/31/2008
84
Random/Mixed Effect Models
• Mixed Effect Models
– If some categories are fixed and some are random
– Symbols used:
• Greek letters for fixed effects
• Uppercase Roman letters for random effects
• Example: two-way mixed effect model with
– Risk class a and days d and n values collected each day, the
appropriate model is written:
X ikl     i  Dl  Gil  Eikl
7/31/2008
85
Random/Mixed Effect Models
• Random effect model have no fixed categories
• The details on the ANOVA analysis depend on
which effects are random and which are fixed
• In a microarray context (more in Ch. 13)
– There tend to be several fixed and several random
effects, which complicates the analysis
– Many interactions simply assumed zero
7/31/2008
86
Multivariate Methods
ANOVA: the Repeated Measures Case
Bootstrap Methods: the Twosample t-test
All skipped …
7/31/2008
87
Sequential Analysis
7/31/2008
88
Sequential Analysis
• Sequential Probability Ratio
– Sample size not known in advance
– Depends on outcomes of successive observations
– Some of this theory is in BLAST
• Basic Local Alignment Search Tool
– The book focuses on discreet random variables
7/31/2008
89
Sequential Analysis
– Consider:
•
•
•
•
•
•
•
Random variable Y with distribution P(y;ξ)
Tests usually relate to the value of parameter ξ
H0: ξ is ξ0
H1: ξ is ξ1
We can choose a value for the Type I error α
And a value for the Type II error β
Sampling then continues while
P( y1 ; 1 ) P( y2 ; 1 )  P( yn ; 1 )
A
B
P( y1 ;  0 ) P( y2 ;  0 )  P( yn ;  0 )
7/31/2008
90
Sequential Analysis
– A and B are chosen to correspond to an α and β
– Sampling continues until the ratio is less than A
(accept H0) or greater than B (reject H0)
– Because these are discreet variables, boundary
overshoot usually occurs
• We don’t expect to exactly get values α and β
– Desired values for α and β approximately achieved
by using
A
7/31/2008

1
B
1 

91
Sequential Analysis
– It is also convenient to take logarithms, which
gives us:
P( yi ; 1 )

1 
log
  log
 log
1
P ( yi ;  0 )

i
– Using
S1, 0 ( y )  log
– We can write
7/31/2008
P ( y; 1 )
P ( y;  0 )

1 
log
  S1,0 ( yi )  log
1

i
92
Sequential Analysis
• Example: sequence matching
– H0: p0 = 0.25 (probability of a match is 0.25)
– H1: p1 = 0.35 (probability of a match is 0.35)
– Type I error α and Type II error β chosen 0.01
– Yi: 1 if there is a match at position i, otherwise 0
– Sampling continues while
– with
7/31/2008
1
log
  S1, 0 (Yi )  log 99
99 i
(0.35)Yi (0.65) (1Yi )
S1, 0 (Yi )  log
(0.25)Yi (0.75) (1Yi )
93
Sequential Analysis
– S can be seen as the support offered by Yi for H1
– The inequality can be re-written as
 9.581   (Yi  0.2984)  9.581
i
– This is actually a random walk with step sizes
0.7016 for a match and -0.2984 for a mismatch
7/31/2008
94
Sequential Analysis
• Power Function for a Sequential Test
– Suppose the true value of the parameter of
interest is ξ
– We wish to know the probability that H1 is
accepted, given ξ
– This probability is the power Ρ(ξ) of the test
( ) 
7/31/2008
 *
1
1   *
 *

1
1 (
(
) (
)
)
95
Sequential Analysis
– Where θ* is the unique non-zero solution to θ in

 P ( y; 1 ) 
  1
P ( y;  )

yR
 P ( y;  0 ) 
– R is the range of values of Y
– Equivalently, θ* is the unique non-zero solution to
θ in
 P( y; )e
S1, 0 ( y )
1
yR
– Where S is defined as before
7/31/2008
96
Sequential Analysis
– This is very similar to Ch. 7 – Random Walks
– The parameter θ* is the same as in Ch. 7
– And it will be the same in Ch 10 – BLAST
– < skipping the random walk part >
7/31/2008
97
Sequential Analysis
• Mean Sample Size
– The (random) number of observations until one or
the other hypothesis is accepted
– Find approximation by ignoring boundary
overshoot
– Essentially identical method used to find the mean
number of steps until the random walk stops
7/31/2008
98
Sequential Analysis
– Two expressions are calculated for ΣiS1,0(Yi)
• One involves the mean sample size
• By equating both expressions, solve for mean sample
size
  
1  
S
(
y
)

(
1


(

))
log


(

)
log




i 1,0 i
1








P(Yi ; 1 ) 
P(Yi ; 1 )


E ( S1,0 (Yi ))  E log
  P(Yi ;  ) log

P(Yi ;  0 )  yR
P(Yi ; 0 )

7/31/2008
99
Sequential Analysis
– So, the mean sample size is:
(1  ( )) log( 1 )  ( ) log( 1 )
P ( y ;1 )
P
(
y
;

)
log
 yR
P ( y ; 0 )
– Both numerator and denominator depend on Ρ(ξ),
and so also on θ*
– A generalization applies if Q(y) of Y has different
distribution than H0 and H1 – relevant to BLAST
(1  ( )) log( 1 )  ( ) log( 1 )
P ( y ;1 )
Q
(
y
)
log
 yR
P ( y ; 0 )
7/31/2008
100
Sequential Analysis
• Example
– Same sequence matching example as before
• H0: p0 = 0.25 (probability of a match is 0.25)
• H1: p1 = 0.35 (probability of a match is 0.35)
• Type I error α and Type II error β chosen 0.01
– Mean sample size equation is:
9.190( p)  4.595
13
p log 75  (1  p ) log 15
– Mean sample size is when H0 is true: 194
– Mean sample size is when H1 is true: 182
7/31/2008
101
Sequential Analysis
• Boundary Overshoot
– So far we assumed no boundary overshoot
– In practice, there will almost always be, though
• Exact Type I and Type II errors different from α and β
– Random walk theory can be used to assess how
significant the effects of boundary overshoot are
– It can be shown that the sum of Type I and Type II
errors is always less than α + β (also individually)
– BLAST deals with this in a novel way -> see Ch. 10
7/31/2008
102
```