ERROR PROBABILITIES FOR HYPOTHESIS TESTS, RELATED GRAPHS

Example 1: Suppose that X₁, X₂, …, X₂₅ is a sample from a normal population with mean μ and standard deviation σ = 10. We’d like to test H₀: μ = 40 versus H₁: μ = 41.

Sufficiency immediately reduces the problem to consideration of only X̄, which is normal with mean μ and standard deviation 10/√25 = 2. The Neyman-Pearson procedure rejects if X̄ ≥ c. The type I error probability is

P[ X̄ ≥ c | μ = 40 ] = P[ (X̄ − 40)/2 ≥ (c − 40)/2 | μ = 40 ] = Φ( (40 − c)/2 )

Here Φ is the cumulative distribution function for the standard normal.

If you wanted the type I error probability to be 0.05, you would solve Φ( (40 − c)/2 ) = 0.05. This gives c = 40 − 2 Φ⁻¹(0.05) = 40 − 2 (−1.645) = 43.29.
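As a quick numerical check on this arithmetic, here is a sketch (not part of the original handout) using Python’s standard-library NormalDist for Φ and Φ⁻¹; the variable names are my own:

```python
from statistics import NormalDist

Z = NormalDist()             # standard normal: Z.cdf is Phi, Z.inv_cdf is Phi-inverse
sd_xbar = 10 / 25 ** 0.5     # SD(Xbar) = sigma / sqrt(n) = 2

# Solve Phi((40 - c)/2) = 0.05 for the cutoff c:
c = 40 - sd_xbar * Z.inv_cdf(0.05)

# The type I error P[Xbar >= c | mu = 40] should come back as 0.05:
type_I = Z.cdf((40 - c) / sd_xbar)

print(round(c, 2))           # 43.29
print(round(type_I, 2))      # 0.05
```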

The type II error probability is

P[ X̄ < c | μ = 41 ] = P[ (X̄ − 41)/2 < (c − 41)/2 | μ = 41 ] = Φ( (c − 41)/2 )

As c gets larger, the type I error probability drops and the type II error probability grows. As c gets smaller, the type I error probability grows and the type II error probability drops. Here is a picture of how these move together:

[Figure: scatterplot of P(type II) vs P(type I), both axes 0.0 to 1.0; the point for c = 43.29 is labeled.]

This is a continuous curve, but it’s only drawn at a set of discretely-spaced values for c .
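That tradeoff can be traced numerically. This is a hypothetical sketch (not from the handout) evaluating both error probabilities at a few values of c:

```python
from statistics import NormalDist

Z = NormalDist()
sd = 2                                  # SD(Xbar) from Example 1
for c in [40, 41, 42, 43.29, 44, 45]:
    p_I = Z.cdf((40 - c) / sd)          # type I:  P[Xbar >= c | mu = 40]
    p_II = Z.cdf((c - 41) / sd)         # type II: P[Xbar <  c | mu = 41]
    print(f"c = {c:5.2f}   P(type I) = {p_I:.3f}   P(type II) = {p_II:.3f}")
```

As c sweeps upward, the printed rows show P(type I) falling while P(type II) rises, which is exactly the curve in the figure.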

  gs 2011

The problem is fairly difficult, as it is not possible to get both error probabilities to be small. This graph shows the null and alternative distributions for X̄:

[Figure: distribution plot of two normal densities for X̄, StDev = 2, means 40 and 41, over the range 35.0 to 47.5.]

Here’s an easier one.

Example 2: This is the same as the previous, but the alternative is H₁: μ = 42. The type I error probability setup has not changed, but now the type II error probability is Φ( (c − 42)/2 ). Here is the plot of the error probabilities:

[Figure: scatterplot of P(type II) vs P(type I), both axes 0.0 to 1.0.]

This has the same shape, but it is much more bowed toward the ideal point (0, 0).


The null and alternative distributions are now somewhat more separated:

[Figure: distribution plot of two normal densities for X̄, StDev = 2, means 40 and 42, over the range 35.0 to 50.0.]

Example 3: This is the same as the previous, now changing to H₁: μ = 44.

[Figure: scatterplot of P(type II) vs P(type I), both axes 0.0 to 1.0.]


The points are getting much closer to the ideal (0, 0). The distributions for X̄ are now normal (μ = 40, σ = 2) for H₀ and normal (μ = 44, σ = 2) for H₁. These separate easily:

[Figure: distribution plot of two normal densities for X̄, StDev = 2, means 40 and 44, over the range 35.0 to 50.0.]
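Examples 1 through 3 can be summarized in one computation: holding the type I error probability at 0.05 (so c = 43.29), the type II error probability shrinks as the alternative moves away from 40. A sketch:

```python
from statistics import NormalDist

Z = NormalDist()
c = 43.29                               # cutoff with type I error probability 0.05
for mu1 in [41, 42, 44]:                # the alternatives of Examples 1, 2, 3
    p_II = Z.cdf((c - mu1) / 2)         # P[Xbar < c | mu = mu1]
    print(f"H1: mu = {mu1}   P(type II) = {p_II:.3f}")
```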

Example 4: Suppose that X₁, X₂, …, X₂₅ is from a normal population with mean μ and standard deviation σ = 10. We’d like to test H₀: μ = 40 versus H₁: μ > 40. As with the previous examples, the test has the rejection rule R = { x̄ | x̄ ≥ c }. Let’s decide on a type I error probability of 0.05, and recall that this leads to c = 43.29. The type II error probability is not obvious, since H₁ is not simple; there are many ways in which H₀ can be wrong. As a function of μ, the probability of being in the rejection set R is

1 − Φ( (43.29 − μ)/2 )

Here is a plot of that function:

[Figure: plot of Power vs μ, for μ from 35.0 to 52.5, power from 0.0 to 1.0.]


At the null hypothesis value 40, the height of the curve is 0.05. The farther that μ is beyond 40, the larger the probability that H₀ will be rejected.

This picture gives the power curve, defined simply as Power(μ) = P[ Reject H₀ | μ ]. On the set { μ > 40 }, we can say that 1 − Power(μ) is the probability of type II error.

The picture is sometimes called the operating characteristic curve, though some reserve that term for 1 − Power(μ).
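The power curve of Example 4 is easy to evaluate directly; here is a sketch (function and variable names are my own):

```python
from statistics import NormalDist

Z = NormalDist()

def power(mu, c=43.29, sd_xbar=2):
    """Power(mu) = P[Xbar >= c | mu] = 1 - Phi((c - mu)/sd_xbar)."""
    return 1 - Z.cdf((c - mu) / sd_xbar)

print(round(power(40), 3))        # height at the null value, about 0.05
for mu in [41, 42, 44, 46]:
    print(mu, round(power(mu), 3))
```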

Example 5: Suppose that X₁, X₂, …, X₂₅ is from a normal population with mean μ and standard deviation σ = 10. We’d like to test H₀: μ = 40 versus H₁: μ ≠ 40. This test has a two-sided H₁, and the technique does not follow directly from the Neyman-Pearson ideas. Several other logical paths lead to the test, including the likelihood ratio test.

The rejection set is R = { x̄ | | x̄ − 40 | ≥ c }. The value of c can be found to make the type I error probability equal to any specified value. Here’s how to get it to 0.05:

We want P[ | X̄ − 40 | < c ] = 0.95. By symmetry of the normal distribution, this is the same as asking for P[ X̄ − 40 < c ] = 0.975, and

P[ X̄ − 40 < c ] = P[ (X̄ − 40)/2 < c/2 ] = P[ Z < c/2 ], which we want to equal 0.975.

The normal table provides P[ Z < 1.96 ] = 0.975, leading to c = 2 × 1.96 = 3.92. The rejection set is identified as R = { x̄ | x̄ ≤ 36.08 } ∪ { x̄ | x̄ ≥ 43.92 }.
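A quick check on the cutoff, sketched with NormalDist.inv_cdf playing the role of the normal table:

```python
from statistics import NormalDist

sd_xbar = 2
c = sd_xbar * NormalDist().inv_cdf(0.975)   # 2 * 1.96
print(round(c, 2))                          # 3.92
print(round(40 - c, 2), round(40 + c, 2))   # cutoffs 36.08 and 43.92
```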

This test can be described through its power curve Power(μ) = P[ Reject H₀ | μ ]:

Power(μ) = P[ X̄ ≤ 36.08 | μ ] + P[ X̄ ≥ 43.92 | μ ]

= P[ (X̄ − μ)/2 ≤ (36.08 − μ)/2 ] + P[ (X̄ − μ)/2 ≥ (43.92 − μ)/2 ]

Recall that the “2” here is SD( X̄ ).

= Φ( (36.08 − μ)/2 ) + 1 − Φ( (43.92 − μ)/2 )

= Φ( (36.08 − μ)/2 ) + Φ( (μ − 43.92)/2 )
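This power function can be coded directly; a sketch using the two cutoffs just derived:

```python
from statistics import NormalDist

Z = NormalDist()

def power(mu, lo=36.08, hi=43.92, sd_xbar=2):
    """Power(mu) = Phi((lo - mu)/sd) + 1 - Phi((hi - mu)/sd)."""
    return Z.cdf((lo - mu) / sd_xbar) + 1 - Z.cdf((hi - mu) / sd_xbar)

print(round(power(40), 3))        # minimum, about 0.05, at the null value
for mu in [36, 38, 42, 44]:
    print(mu, round(power(mu), 3))
```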


Here’s what the power curve looks like:

[Figure: scatterplot of POWER vs μ, for μ from 30 to 50, power from 0.0 to 1.0.]

The power curve hits its minimum at the null hypothesis value 40.

A test with the property

min over H₁ of P[ reject H₀ ]  ≥  max over H₀ of P[ reject H₀ ]

is called unbiased. This is rarely discussed. The property happens almost automatically for tests based on a statistic (like the normal) with a symmetric distribution. Non-symmetric cases get complicated.

Example 6: Suppose that X₁, X₂, …, X₆₄ is from a normal population with mean μ and standard deviation σ = 10. We’d like to test H₀: μ = 40 versus H₁: μ ≠ 40. This is exactly the previous example, but with a larger sample size. For this story, SD( X̄ ) = 10/√64 = 1.25. As in the previous example, R = { x̄ | | x̄ − 40 | ≥ c }. The calculation for c that makes the type I error probability equal to 0.05 is

P[ X̄ − 40 < c ] = P[ (X̄ − 40)/1.25 < c/1.25 ] = P[ Z < c/1.25 ], which we want to equal 0.975.

This leads to c = 1.25 × 1.96 = 2.45.

The rejection set is R = { x̄ | x̄ ≤ 37.55 } ∪ { x̄ | x̄ ≥ 42.45 } and the power function is

Power(μ) = Φ( (37.55 − μ)/1.25 ) + 1 − Φ( (42.45 − μ)/1.25 ).
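The sample-size effect of Examples 5 and 6 can be compared in one sketch; the function below re-derives the cutoff from n and the level (names are my own):

```python
from statistics import NormalDist

Z = NormalDist()

def power(mu, n, sigma=10, alpha=0.05, mu0=40):
    # Two-sided test of H0: mu = mu0; reject when |xbar - mu0| >= c.
    sd = sigma / n ** 0.5                  # SD(Xbar): 2 for n=25, 1.25 for n=64
    c = sd * Z.inv_cdf(1 - alpha / 2)      # 3.92 for n=25, 2.45 for n=64
    return Z.cdf((mu0 - c - mu) / sd) + 1 - Z.cdf((mu0 + c - mu) / sd)

for mu in [41, 42, 44]:
    print(mu, round(power(mu, 25), 3), round(power(mu, 64), 3))
```

At every alternative, the n = 64 column dominates the n = 25 column, which is the content of the combined plot below.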


Here is the graph of this power function:

[Figure: scatterplot of POWER vs μ, for μ from 30 to 50, power from 0.0 to 1.0.]

Here are the power curves for examples 5 and 6 together:

[Figure: power curves for sample size 25 and sample size 64 on one plot, μ from 30 to 50, power from 0.0 to 1.0.]

This shows the bang that you get from larger sample size!
