ERROR PROBABILITIES FOR HYPOTHESIS TESTS, RELATED GRAPHS

Example 1: Suppose that X₁, X₂, …, X₂₅ is a sample from a normal population with mean μ and standard deviation σ = 10. We’d like to test H₀: μ = 40 versus H₁: μ = 41.

Sufficiency immediately reduces the problem to consideration of only X̄, which is normal with mean μ and standard deviation 10/√25 = 2. The Neyman-Pearson procedure rejects if X̄ ≥ c. The type I error probability is

P[ X̄ ≥ c | μ = 40 ] = P[ (X̄ − 40)/2 ≥ (c − 40)/2 | μ = 40 ] = Φ( (40 − c)/2 )

Here Φ is the cumulative distribution function for the standard normal.

If you wanted the type I error probability to be 0.05, you would solve Φ( (40 − c)/2 ) = 0.05. This gives c = 40 − 2 Φ⁻¹(0.05) = 40 − 2 (−1.645) = 43.29.
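As a quick numerical check on this arithmetic, here is a sketch (not part of the original handout) using Python’s standard-library NormalDist for Φ and Φ⁻¹; the variable names are my own:

```python
from statistics import NormalDist

Z = NormalDist()             # standard normal: Z.cdf is Phi, Z.inv_cdf is Phi-inverse
sd_xbar = 10 / 25 ** 0.5     # SD(Xbar) = sigma / sqrt(n) = 2

# Solve Phi((40 - c)/2) = 0.05 for the cutoff c:
c = 40 - sd_xbar * Z.inv_cdf(0.05)

# The type I error P[Xbar >= c | mu = 40] should come back as 0.05:
type_I = Z.cdf((40 - c) / sd_xbar)

print(round(c, 2))           # 43.29
print(round(type_I, 2))      # 0.05
```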

The type II error probability is

P[ X̄ < c | μ = 41 ] = P[ (X̄ − 41)/2 < (c − 41)/2 | μ = 41 ] = Φ( (c − 41)/2 )

As c gets larger, the type I error probability drops and the type II error probability grows. As c gets smaller, the type I error probability grows and the type II error probability drops. Here is a picture of how these move together:

[Figure: scatterplot of P(type II) vs P(type I), both axes 0.0 to 1.0; the point for c = 43.29 is labeled.]

This is a continuous curve, but it’s only drawn at a set of discretely-spaced values for c .
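That tradeoff can be traced numerically. This is a hypothetical sketch (not from the handout) evaluating both error probabilities at a few values of c:

```python
from statistics import NormalDist

Z = NormalDist()
sd = 2                                  # SD(Xbar) from Example 1
for c in [40, 41, 42, 43.29, 44, 45]:
    p_I = Z.cdf((40 - c) / sd)          # type I:  P[Xbar >= c | mu = 40]
    p_II = Z.cdf((c - 41) / sd)         # type II: P[Xbar <  c | mu = 41]
    print(f"c = {c:5.2f}   P(type I) = {p_I:.3f}   P(type II) = {p_II:.3f}")
```

As c sweeps upward, the printed rows show P(type I) falling while P(type II) rises, which is exactly the curve in the figure.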

  gs 2011

The problem is fairly difficult, as it is not possible to get both error probabilities to be small. This graph shows the null and alternative distributions for X̄:

[Figure: distribution plot of two normal densities for X̄, StDev = 2, means 40 and 41, over the range 35.0 to 47.5.]

Here’s an easier one.

Example 2: This is the same as the previous, but the alternative is H₁: μ = 42. The type I error probability setup has not changed, but now the type II error probability is Φ( (c − 42)/2 ). Here is the plot of the error probabilities:

[Figure: scatterplot of P(type II) vs P(type I), both axes 0.0 to 1.0.]

This has the same shape, but it is much more bowed toward the ideal point (0, 0).


The null and alternative distributions are now somewhat more separated:

[Figure: distribution plot of two normal densities for X̄, StDev = 2, means 40 and 42, over the range 35.0 to 50.0.]

Example 3: This is the same as the previous, now changing to H₁: μ = 44.

[Figure: scatterplot of P(type II) vs P(type I), both axes 0.0 to 1.0.]


The points are getting much closer to the ideal (0, 0). The distributions for X̄ are now normal (μ = 40, σ = 2) for H₀ and normal (μ = 44, σ = 2) for H₁. These separate easily:

[Figure: distribution plot of two normal densities for X̄, StDev = 2, means 40 and 44, over the range 35.0 to 50.0.]
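Examples 1 through 3 can be summarized in one computation: holding the type I error probability at 0.05 (so c = 43.29), the type II error probability shrinks as the alternative moves away from 40. A sketch:

```python
from statistics import NormalDist

Z = NormalDist()
c = 43.29                               # cutoff with type I error probability 0.05
for mu1 in [41, 42, 44]:                # the alternatives of Examples 1, 2, 3
    p_II = Z.cdf((c - mu1) / 2)         # P[Xbar < c | mu = mu1]
    print(f"H1: mu = {mu1}   P(type II) = {p_II:.3f}")
```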

Example 4: Suppose that X₁, X₂, …, X₂₅ is from a normal population with mean μ and standard deviation σ = 10. We’d like to test H₀: μ = 40 versus H₁: μ > 40. As with the previous examples, the test has the rejection rule R = { x̄ | x̄ ≥ c }. Let’s decide on a type I error probability of 0.05, and recall that this leads to c = 43.29. The type II error probability is not obvious, since H₁ is not simple; there are many ways in which H₀ can be wrong. As a function of μ, the probability of being in the rejection set R is

1 − Φ( (43.29 − μ)/2 )

Here is a plot of that function:

[Figure: plot of Power vs μ, for μ from 35.0 to 52.5, power from 0.0 to 1.0.]


At the null hypothesis value 40, the height of the curve is 0.05. The farther that μ is beyond 40, the larger the probability that H₀ will be rejected.

This picture gives the power curve, defined simply as Power(μ) = P[ Reject H₀ | μ ]. On the set { μ > 40 }, we can say that 1 − Power(μ) is the probability of type II error.

The picture is sometimes called the operating characteristic curve, though some reserve that term for 1 − Power(μ).
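The power curve of Example 4 is easy to evaluate directly; here is a sketch (function and variable names are my own):

```python
from statistics import NormalDist

Z = NormalDist()

def power(mu, c=43.29, sd_xbar=2):
    """Power(mu) = P[Xbar >= c | mu] = 1 - Phi((c - mu)/sd_xbar)."""
    return 1 - Z.cdf((c - mu) / sd_xbar)

print(round(power(40), 3))        # height at the null value, about 0.05
for mu in [41, 42, 44, 46]:
    print(mu, round(power(mu), 3))
```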

Example 5: Suppose that X₁, X₂, …, X₂₅ is from a normal population with mean μ and standard deviation σ = 10. We’d like to test H₀: μ = 40 versus H₁: μ ≠ 40. This test has a two-sided H₁, and the technique does not follow directly from the Neyman-Pearson ideas. Several other logical paths lead to the test, including the likelihood ratio test.

The rejection set is R = { x̄ | | x̄ − 40 | ≥ c }. The value of c can be found to make the type I error probability equal to any specified value. Here’s how to get it to 0.05:

We want P[ | X̄ − 40 | < c ] = 0.95. By symmetry of the normal distribution, this is the same as asking for P[ X̄ − 40 < c ] = 0.975, and

P[ X̄ − 40 < c ] = P[ (X̄ − 40)/2 < c/2 ] = P[ Z < c/2 ], which we want to equal 0.975.

The normal table provides P[ Z < 1.96 ] = 0.975, leading to c = 2 × 1.96 = 3.92. The rejection set is identified as R = { x̄ | x̄ ≤ 36.08 } ∪ { x̄ | x̄ ≥ 43.92 }.
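A quick check on the cutoff, sketched with NormalDist.inv_cdf playing the role of the normal table:

```python
from statistics import NormalDist

sd_xbar = 2
c = sd_xbar * NormalDist().inv_cdf(0.975)   # 2 * 1.96
print(round(c, 2))                          # 3.92
print(round(40 - c, 2), round(40 + c, 2))   # cutoffs 36.08 and 43.92
```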

This test can be described through its power curve Power(μ) = P[ Reject H₀ | μ ]:

Power(μ) = P[ X̄ ≤ 36.08 | μ ] + P[ X̄ ≥ 43.92 | μ ]

= P[ (X̄ − μ)/2 ≤ (36.08 − μ)/2 ] + P[ (X̄ − μ)/2 ≥ (43.92 − μ)/2 ]

Recall that the “2” here is SD( X̄ ).

= Φ( (36.08 − μ)/2 ) + 1 − Φ( (43.92 − μ)/2 )

= Φ( (36.08 − μ)/2 ) + Φ( (μ − 43.92)/2 )
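This power function can be coded directly; a sketch using the two cutoffs just derived:

```python
from statistics import NormalDist

Z = NormalDist()

def power(mu, lo=36.08, hi=43.92, sd_xbar=2):
    """Power(mu) = Phi((lo - mu)/sd) + 1 - Phi((hi - mu)/sd)."""
    return Z.cdf((lo - mu) / sd_xbar) + 1 - Z.cdf((hi - mu) / sd_xbar)

print(round(power(40), 3))        # minimum, about 0.05, at the null value
for mu in [36, 38, 42, 44]:
    print(mu, round(power(mu), 3))
```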


Here’s what the power curve looks like:

[Figure: scatterplot of POWER vs μ, for μ from 30 to 50, power from 0.0 to 1.0.]

The power curve hits its minimum at the null hypothesis value 40.

A test with the property

min over H₁ of P[ reject H₀ ]  ≥  max over H₀ of P[ reject H₀ ]

is called unbiased. This is rarely discussed. The property happens almost automatically for tests based on a statistic (like the normal) with a symmetric distribution. Non-symmetric cases get complicated.

Example 6: Suppose that X₁, X₂, …, X₆₄ is from a normal population with mean μ and standard deviation σ = 10. We’d like to test H₀: μ = 40 versus H₁: μ ≠ 40. This is exactly the previous example, but with a larger sample size. For this story, SD( X̄ ) = 10/√64 = 1.25. As in the previous example, R = { x̄ | | x̄ − 40 | ≥ c }. The calculation for c that makes the type I error probability equal to 0.05 is

P[ X̄ − 40 < c ] = P[ (X̄ − 40)/1.25 < c/1.25 ] = P[ Z < c/1.25 ], which we want to equal 0.975.

This leads to c = 1.25 × 1.96 = 2.45.

The rejection set is R = { x̄ | x̄ ≤ 37.55 } ∪ { x̄ | x̄ ≥ 42.45 } and the power function is

Power(μ) = Φ( (37.55 − μ)/1.25 ) + 1 − Φ( (42.45 − μ)/1.25 ).
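The sample-size effect of Examples 5 and 6 can be compared in one sketch; the function below re-derives the cutoff from n and the level (names are my own):

```python
from statistics import NormalDist

Z = NormalDist()

def power(mu, n, sigma=10, alpha=0.05, mu0=40):
    # Two-sided test of H0: mu = mu0; reject when |xbar - mu0| >= c.
    sd = sigma / n ** 0.5                  # SD(Xbar): 2 for n=25, 1.25 for n=64
    c = sd * Z.inv_cdf(1 - alpha / 2)      # 3.92 for n=25, 2.45 for n=64
    return Z.cdf((mu0 - c - mu) / sd) + 1 - Z.cdf((mu0 + c - mu) / sd)

for mu in [41, 42, 44]:
    print(mu, round(power(mu, 25), 3), round(power(mu, 64), 3))
```

At every alternative, the n = 64 column dominates the n = 25 column, which is the content of the combined plot below.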


Here is the graph of this power function:

[Figure: scatterplot of POWER vs μ, for μ from 30 to 50, power from 0.0 to 1.0.]

Here are the power curves for examples 5 and 6 together:

[Figure: power curves for sample size 25 and sample size 64 on one plot, μ from 30 to 50, power from 0.0 to 1.0.]

This shows the bang that you get from larger sample size!
