Hypothesis Tests Stat 430 Fall 2011


Outline

• Hypothesis Testing:
  • large sample vs small sample
  • two sample comparisons
  • multiple testing
• Intro to Linear Models

Hypothesis Tests

• State null hypothesis
• State alternative hypothesis
• State test statistic
• Find distribution of test statistic under null hypothesis
• Compute p-value
• Draw conclusion: reject null hypothesis (or fail to reject)

Hypothesis Testing

• Type I error: P( reject H0 | H0 is true ) = alpha
• Type II error: P( accept H0 | H0 is false ) = beta

Hypothesis Testing

             accept H0     reject H0
H0 true                    Type I
H0 false     Type II

• trying to decrease type I error will usually increase type II and vice versa
• The probability of a type I error (alpha) is called the significance level
• Power of a test: P( reject H0 | H0 false ) = 1 - beta
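The trade-off between the two error types can be illustrated with a small sketch. It assumes a setting that is not part of the slides: a one-sided z-test of H0: µ = 0 against Ha: µ > 0 with known standard deviation. The function name, effect size, and sample size are hypothetical choices for illustration only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)

def power_one_sided_z(alpha, effect, n, sigma=1.0):
    """P(reject H0 | true mean = effect) for H0: mu = 0 vs. Ha: mu > 0.

    Assumption for illustration: sigma is known, so the statistic
    sqrt(n) * sample_mean / sigma is exactly normal.
    """
    critical = STD_NORMAL.inv_cdf(1 - alpha)  # reject H0 if Z > critical
    shift = effect * sqrt(n) / sigma          # mean of Z under the alternative
    return 1 - STD_NORMAL.cdf(critical - shift)

# Lowering alpha (type I error) raises beta (type II error), all else equal.
for alpha in (0.05, 0.01):
    power = power_one_sided_z(alpha, effect=0.5, n=20)
    print(f"alpha = {alpha}: power = {power:.3f}, beta = {1 - power:.3f}")
```

With an effect of zero the null hypothesis is true, and the rejection probability returned by the function equals alpha.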

Hypothesis Testing

• acceptance/rejection region: all observed values that will lead us to not reject/reject the null hypothesis

Hypothesis Testing

• Test statistic: a quantity that reflects how close the data are to the null hypothesis. This quantity needs to have a known distribution under the null hypothesis.

Hypothesis Testing

• P-value: the probability of observing the value of the test statistic that we got (or a more extreme value), assuming that the null hypothesis is true.
• Reject H0 if the P-value is “too small” (usually below 5%)
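As a minimal sketch of the P-value definition, assume the test statistic has a standard normal distribution under the null hypothesis. The function names and the `alternative` argument are hypothetical, not taken from the slides.

```python
from math import erfc, sqrt

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal random variable Z."""
    return 0.5 * erfc(z / sqrt(2))

def p_value(z, alternative="two-sided"):
    """P-value of an observed statistic z under a standard normal null."""
    if alternative == "greater":   # large values of z are extreme
        return normal_upper_tail(z)
    if alternative == "less":      # small values of z are extreme
        return normal_upper_tail(-z)
    return 2 * normal_upper_tail(abs(z))  # both tails count as extreme

observed = 1.96
p = p_value(observed)
print(f"z = {observed}: p-value = {p:.4f}, reject at 5%: {p < 0.05}")
```

What counts as "more extreme" depends on the alternative hypothesis, which is why the direction has to be chosen before the P-value is computed.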

Example: Gene Expression

• Expression values for gene 244912_at
  experimental condition: 8.18 8.16 7.95
  controls: 8.64 8.28 8.7 8.34 8.41 8.35 8.42 8.11 8.55


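• Our assumption: gene 244912_at is underexpressed due to the treatment in the experiment (we would like to show that)
• H0: µ_C - µ_S = 0 (same mean expression in controls and experimental condition)
  Ha: µ_C - µ_S > 0 (mean expression is lower under the experimental condition)
• Test statistic: compare mean expression values
• experimental condition (S): mean_S = 8.10, sd_S = 0.13, n_S = 3
  controls (C): mean_C = 8.42, sd_C = 0.18, n_C = 9
• $T = \dfrac{8.42 - 8.10}{\sqrt{0.13^2/3 + 0.18^2/9}} \approx 3.3$
• It will turn out that we reject the null hypothesis and accept the alternative hypothesis.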
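The summary statistics and the test statistic of this example can be recomputed from the listed expression values. This is a sketch only; the variable and function names are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Expression values for gene 244912_at, copied from the example.
experimental = [8.18, 8.16, 7.95]
controls = [8.64, 8.28, 8.7, 8.34, 8.41, 8.35, 8.42, 8.11, 8.55]

def two_sample_statistic(x, y):
    """Difference in sample means divided by its estimated standard error."""
    standard_error = sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(x) - mean(y)) / standard_error

for name, values in (("experimental", experimental), ("controls", controls)):
    print(f"{name}: mean = {mean(values):.2f}, sd = {stdev(values):.2f}")

t_stat = two_sample_statistic(controls, experimental)
print(f"T = {t_stat:.2f}")
```

With unrounded means and standard deviations the statistic comes out near 3.4; the value 3.3 on the slide results from plugging in the rounded summaries.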

Test statistics & distribution

H0               test statistic
p = p0           $(\hat{p} - p_0) / \sqrt{\hat{p}(1 - \hat{p})/n}$
µ = µ0           $(\bar{x} - \mu_0) / (s/\sqrt{n})$
µ1 - µ2 = d      $(\bar{x}_1 - \bar{x}_2 - d) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$
p1 - p2 = d      $(\hat{p}_1 - \hat{p}_2 - d) / \sqrt{\hat{p}_1(1 - \hat{p}_1)/n_1 + \hat{p}_2(1 - \hat{p}_2)/n_2}$

All these test statistics have a large sample standard normal distribution
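The four large-sample statistics can be written as short functions. This is a sketch with hypothetical function and argument names; the counts in the usage lines are invented for illustration.

```python
from math import sqrt

def one_proportion_z(successes, n, p0):
    """H0: p = p0, standard error estimated from the sample proportion."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)

def one_mean_z(x_bar, s, n, mu0):
    """H0: mu = mu0."""
    return (x_bar - mu0) / (s / sqrt(n))

def two_mean_z(x_bar1, s1, n1, x_bar2, s2, n2, d=0.0):
    """H0: mu1 - mu2 = d."""
    return (x_bar1 - x_bar2 - d) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def two_proportion_z(p_hat1, n1, p_hat2, n2, d=0.0):
    """H0: p1 - p2 = d."""
    se = sqrt(p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2)
    return (p_hat1 - p_hat2 - d) / se

# Invented data: 60 successes in 100 trials, tested against p0 = 0.5.
print(round(one_proportion_z(60, 100, 0.5), 3))
# Invented data: sample proportions 0.6 and 0.5, 100 observations each.
print(round(two_proportion_z(0.6, 100, 0.5, 100), 3))
```

Each function returns a value that is compared against the standard normal distribution, which is only a good approximation when the samples are large.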

Test statistics & small sample distribution

H0               test statistic
p = p0           $(\hat{p} - p_0) / \sqrt{\hat{p}(1 - \hat{p})/n}$
µ = µ0           $(\bar{x} - \mu_0) / (s/\sqrt{n})$
µ1 - µ2 = d      $(\bar{x}_1 - \bar{x}_2 - d) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$
p1 - p2 = d      $(\hat{p}_1 - \hat{p}_2 - d) / \sqrt{\hat{p}_1(1 - \hat{p}_1)/n_1 + \hat{p}_2(1 - \hat{p}_2)/n_2}$

distr. (as given on the slide): $t_{n-1}$, $t_{n-1}$, ***, ***
*** needs some more work

Two sample tests

• Situation I: samples 1 and 2 are paired, i.e. values in sample 1 correspond to a ‘before’ measurement, values in sample 2 correspond to an ‘after’ measurement on the same individual/experimental unit

Example

Two sample tests

• Situation I: paired samples. Compute the difference within each pair and use the differences as new data; that way we are in a single sample testing situation.
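A minimal sketch of the paired situation: the differences become a single sample, and the one-sample statistic is computed on them. The before/after values and the function name are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after, d0=0.0):
    """One-sample t statistic on the within-pair differences.

    Returns the statistic and its degrees of freedom (n - 1).
    """
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = (mean(diffs) - d0) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Invented measurements on four experimental units.
before = [10, 12, 9, 11]
after = [12, 14, 10, 13]
t_stat, df = paired_t_statistic(before, after)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

Pairing removes the unit-to-unit variation from the comparison, which is why the differences are analysed rather than the two samples separately.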

Two sample tests

• Situation II: independent samples, assume same variance
• pooled variance: $s^2 = \dfrac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}$
• test statistic has t distribution with $(n_1 + n_2 - 2)$ degrees of freedom
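The pooled variance can be sketched as follows, using the gene expression values from the earlier example. The test statistic in the code, $(\bar{x}_1 - \bar{x}_2 - d)/(s\sqrt{1/n_1 + 1/n_2})$, is the usual equal-variance form and is an assumption here, since the slide only states the pooled variance and the degrees of freedom. All names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(x, y, d=0.0):
    """Equal-variance two-sample t statistic.

    Returns (t, pooled variance, degrees of freedom).
    """
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y) - d) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, pooled_var, df

# Expression values for gene 244912_at from the example.
experimental = [8.18, 8.16, 7.95]
controls = [8.64, 8.28, 8.7, 8.34, 8.41, 8.35, 8.42, 8.11, 8.55]

t_stat, pooled_var, df = pooled_t_statistic(controls, experimental)
print(f"pooled variance = {pooled_var:.4f}, t = {t_stat:.2f}, df = {df}")
```

The pooled variance weights each sample variance by its degrees of freedom, so the larger control group dominates the estimate.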

Two sample tests

• Situation III: independent samples, different variances
• variance of difference: $s^2 = s_1^2/n_1 + s_2^2/n_2$
• test statistic has t distribution with k df
• Welch-Satterthwaite: $k = \dfrac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$
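A sketch of the Welch-Satterthwaite degrees of freedom, written in its standard form with each sample variance divided by its sample size: $k = (s_1^2/n_1 + s_2^2/n_2)^2 / \left[(s_1^2/n_1)^2/(n_1 - 1) + (s_2^2/n_2)^2/(n_2 - 1)\right]$. The function name is hypothetical; the inputs are the rounded summaries from the gene expression example.

```python
def welch_satterthwaite_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the unequal-variance t statistic."""
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Rounded summaries: sd 0.13 with n = 3 and sd 0.18 with n = 9.
k = welch_satterthwaite_df(0.13, 3, 0.18, 9)
print(f"k = {k:.2f}")
```

The result is generally not an integer and always lies between min(n1, n2) - 1 and n1 + n2 - 2; the smaller it is, the heavier the tails of the reference t distribution.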


Large number of tests

• In the gene expression example there are 22,810 genes
• 5,628 show a significant difference in expression between study and control (at a .05 level)
• not practicable!

Multiple testing “fixes”

• Bonferroni Adjustment
• False Discovery Rate
• Graphics

Bonferroni Adjustment

• Lower the significance level according to the number of tests: P-values have to be less than alpha/n instead of less than alpha
• Gene expression: 94 genes significant on 0.05/22,810 level -- very conservative!
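The adjustment amounts to one division. In this sketch the function names and the short list of P-values are invented for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test cutoff that keeps the overall significance level at alpha."""
    return alpha / n_tests

def bonferroni_significant(p_values, alpha=0.05):
    """Flag each P-value that falls below alpha / (number of tests)."""
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return [p < cutoff for p in p_values]

# Cutoff for the gene expression example with 22,810 tests.
print(f"{bonferroni_threshold(0.05, 22810):.3e}")

# Three invented P-values: the cutoff is 0.05 / 3.
print(bonferroni_significant([0.001, 0.02, 0.04]))
```

With 22,810 tests a P-value has to be smaller than roughly two in a million to count as significant, which explains why so few genes remain.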

Bonferroni Results

[Figure: y-axis ticks 6 to 12; x-axis "variable" with levels S11, S12, S13, S21, S22, S23, S31, S32, S33, C1, C2, C3. Annotation: very conservative!]


• Type I error: % of false positive results
• for n=22,810 we’d expect 1140.5 false positives at 5% significance level
• we got 5,628 significant results, therefore 1140.5/5628 = 20.3% are false positives


• Idea: control the false discovery rate
• Pick cutoff of significance level such that the false discovery rate is under a threshold
• gene expression: 2,279 results

alpha        FDR
0.05         20.3%
0.025        13.4%
0.01         7.7%
0.009        5.1%
0.0049956    5.0%
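The false discovery rate at a given cutoff can be estimated as (expected false positives) / (number of significant results), which is the arithmetic used for the gene expression numbers. The sketch below also scans sorted P-values for the largest cutoff whose estimate stays under a target; this scan coincides with the Benjamini-Hochberg step-up rule, a name the slides do not use. Function names and the short P-value list are invented for illustration.

```python
def estimated_fdr(alpha, n_tests, n_significant):
    """Expected false positives (alpha * n_tests) over observed positives."""
    return alpha * n_tests / n_significant

def fdr_cutoff(p_values, target):
    """Largest observed P-value whose estimated FDR is at most target.

    Returns None if no cutoff satisfies the target.
    """
    n = len(p_values)
    best = None
    for k, p in enumerate(sorted(p_values), start=1):
        # Using p as the cutoff makes the k smallest P-values significant.
        if estimated_fdr(p, n, k) <= target:
            best = p
    return best

# Gene expression example: 5,628 of 22,810 genes significant at 0.05.
print(f"{estimated_fdr(0.05, 22810, 5628):.1%}")

# Invented P-values for five tests, FDR target of 5%.
print(fdr_cutoff([0.001, 0.008, 0.039, 0.041, 0.6], target=0.05))
```

Unlike the Bonferroni cutoff, this cutoff depends on the observed P-values themselves, so it adapts to how many small P-values the data contain.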


[Figure: panels plotted against "variable"; x-axis levels S11, S12, S13, S21, S22, S23, S31, S32, S33, C1, C2, C3 in one set of panels and S1, S2, C1 to C6 in another (panel labels -1 and 1); y-axis ticks 6 to 13 and 4 to 12.]

On average, 5% of the results are false positives

Graphics

• might help to distinguish between “interesting” and not so interesting results from the other two methods
• problem: we need to be creative - graphics are not an out of the box solution
• Here: separate out those genes that have a very low overall variance, i.e. “flat-liners”

Graphics

[Figure: two panels plotted against "variable" with levels S11, S12, S13, S21, S22, S23, S31, S32, S33, C1, C2, C3; y-axis ticks 6 to 12.]

Linear Models

Linear Models

• Response (or dependent) variable Y
• Explanatory (or co-variate) variables X1, X2, ..., Xp
• Try to find a relationship f that helps to determine outcome Y based on values of X1, X2, ..., Xp:
  $\hat{Y} = f(X_1, X_2, \ldots, X_p)$

Simple Linear Regression

• Situation: we assume that f is a linear function, and p=1
• i.e. we want to find f() with f(x) = a + b*x that is “closest” to values of Y, i.e. we want to find values for a and b
• Model: Y = a + bX + error
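One common reading of “closest” is least squares: choose a and b to minimise the sum of squared errors. The slides do not say which criterion is meant, so that choice is an assumption of this sketch, as are the function name and the toy data.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the model Y = a + b*X + error."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-products over sum of squared deviations of x.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar  # the fitted line passes through (x_bar, y_bar)
    return a, b

# Toy data lying exactly on the line y = 1 + 2x.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(f"a = {a}, b = {b}")
```

Under the least-squares criterion both estimates have closed forms, so no iterative search is needed for a single explanatory variable.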

Olympic Gold Medallists

Interpretation

• We get values for parameters a and b as a = -18.85, b = 0.013
• a is the intercept, i.e. the value for Y if X=0. In this data the interpretation is a bit obscure: for the year 0 we would expect the winner to jump 18.85m backwards (quite a feat!)
• b is the average increase that we expect for Y when we increase X by 1 unit: for each year we expect the winner to jump 1.3 cm further; from one Olympics to the next we’d expect an increase of 5.2 cm


• How do we get a and b?
• How good is the model?
• What are confidence intervals for the parameters a and b, and for predicted values?
