TESTING OF HYPOTHESES

CHAPTER 6

TESTING HYPOTHESES

OVERVIEW

A hypothesis is a statement that something is true. The following statements are examples of hypotheses that can be tested by the procedures that we develop in this paper:

1. The Opposition Party nominee will win the election in May 2004.

2. Our brand of tires lasts an average of 50,000 kilometers.

3. The students in this statistics class have an average I.Q. that is greater than the average for all other statistics classes.

4. People who exercise daily have less cholesterol than those who do not.

5. The average weight of an American car is 1,800 kilograms.

TESTING A CLAIM ABOUT A MEAN

A hypothesis test, or test of significance, involves procedures that allow us to make inferences about whole populations by analyzing samples. In this decision-making process, we begin by hypothesizing (sometimes just guessing) about the population. After gathering sample data, we try to determine whether the data are consistent with the hypothesis or whether they differ from it by a statistically significant amount.

We need to formalize this decision-making process as in the following:

Decision                                                               | The null hypothesis is true | The null hypothesis is false
We decide to reject the null hypothesis                                | Type I error                | Correct decision
We fail to reject the null hypothesis (We accept the null hypothesis)  | Correct decision            | Type II error

Null hypothesis (denoted by $H_0$): The statement of a zero or null difference that is directly tested. This will correspond to the original claim if that claim includes the condition of no change or difference. Otherwise, the null hypothesis is the negation of the original claim. We test the null hypothesis directly in the sense that the final conclusion will be either the rejection of $H_0$ or failure to reject $H_0$.

Alternative hypothesis (denoted by $H_1$): The statement that must be true if the null hypothesis is false.

Type I error: The mistake of rejecting the null hypothesis when it is true.

Type II error: The mistake of failing to reject the null hypothesis when it is false.

$\alpha$ (alpha): Symbol used to represent the probability of a type I error (significance level).

$\beta$ (beta): Symbol used to represent the probability of a type II error.

Test statistic: A sample statistic or a value based on the sample data. It is used in making the decision to accept the null hypothesis or to reject it.

Critical region: The set of all values of the test statistic that would cause us to reject the null hypothesis.

Critical value(s): The value(s) that separate the critical region from the values of the test statistic that would not lead to rejection of the null hypothesis. The critical value(s) depend on the nature of the null hypothesis, the relevant sampling distribution, and the level of significance $\alpha$.

Follow these three steps to determine the null and alternative hypotheses:

1. Identify the claim that was made and translate it into symbolic form.

2. Write the symbolic form that must be true when the original claim is false.

3. Of the two symbolic statements, only one of them should contain the condition of equality. Call the statement containing the condition of equality the null hypothesis. The alternative hypothesis is the statement that does not contain the condition of equality.

We always test the null hypothesis and our initial conclusion will always be one of the following:

1. Fail to reject (accept) the null hypothesis, $H_0$.

2. Reject the null hypothesis, $H_0$.

Most of our hypothesis testing will be concerned with values of parameters – numerical characteristics of a population such as the mean, variance, or proportion.

The essential steps for testing hypotheses are as follows:

1. Based upon some claim, formulate the null hypothesis $H_0$ and the alternative hypothesis $H_1$.

2. Based on the seriousness of a type I error, select the significance level $\alpha$.

3. Determine which sample statistic is appropriate. Also determine the appropriate sampling distribution.

4. Using the sample data, compute the test statistic.

5. Using the computed test statistic and the corresponding critical value, either reject $H_0$ or assert that the sample data do not warrant a rejection of $H_0$. (At this point, it is usually wise to graph the appropriate sampling distribution with the test statistic, critical region, and critical value(s) identified.)

6. In simple non-technical terms, state what the results suggest.

Example 1

Suppose we hear the claim that a new and more expensive type of snow tire has a shorter skid distance. Because of the costs involved, the owner of a large fleet of cars will purchase these tires only if strongly convinced that the tires really do skid less. The only way to prove or disprove the claim is to test all such tires, but that is clearly impractical.

Instead, sample data must be used to form a conclusion. Let's assume that we have obtained sample test results from an independent testing laboratory. We learn that 36 of these tires were tested and, under standard conditions, the mean skid distance for that sample group is found to be $\bar{X} = 148$ feet. The testing laboratory also informs us that skid distances of the traditional type of snow tire are normally distributed with a mean of 152 feet and a standard deviation of 12 feet. Does the sample mean of 148 feet represent a statistically significant decrease from the population mean of 152, or is the difference more likely due to chance variations in the skid distances?

Solution

1. Based upon some claim, formulate the null hypothesis $H_0$ and the alternative hypothesis $H_1$.

Claim: The new type of tire has a shorter skid distance than the old type: $\mu < 152$.

The symbolic form that must be true when the original claim is false: $\mu \ge 152$.

Null hypothesis: $H_0: \mu \ge 152$

Alternate hypothesis: $H_1: \mu < 152$

2. Based on the seriousness of a type I error, select $\alpha = 0.05$.

3. Determine which sample statistic is appropriate. Also determine the appropriate sampling distribution.

The sample mean $\bar{X}$ is the relevant statistic. The distribution of sample means can be approximated by a normal distribution.

4. Using the sample data, compute the test statistic.

We find that the sample mean of 148 is equivalent to $Z = -2.00$ by computing

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 12 / \sqrt{36} = 2$

$Z = (\bar{X} - \mu_{\bar{x}}) / \sigma_{\bar{x}} = (148 - 152) / 2 = -2.00$

We determine that $Z = -1.645$ is the cutoff separating unusual results from chance fluctuations by observing that if the shaded region in the left tail represents 5%, or 0.05, of the total area, then the region between the rightmost limit of that tail and $Z = 0$ must encompass 45%, or 0.45, of the total. From the Table of Z scores, we see that 0.4500 is halfway between the values for Z = 1.64 and Z = 1.65, so we split the difference and make Z negative since it is below the mean. This gives $Z = -1.645$.

5. Using the computed test statistic and the corresponding critical value, either reject $H_0$ or assert that the sample data do not warrant a rejection of $H_0$. (At this point, it is usually wise to graph the appropriate sampling distribution with the test statistic, critical region, and critical value(s) identified.)

The sample mean of 148 is equivalent to $Z = -2.00$. In step 4, we determined that the critical region consists of all values less than -1.645. The value $Z = -2.00$ is less than -1.645. Thus, we reject $H_0$.

6. In simple non-technical terms, state what the results suggest.

There is sufficient evidence to support the claim that the new tire has a shorter skid distance than the ordinary tire.
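The arithmetic of this example can be checked with a short script. The following is only a minimal sketch, assuming Python 3.8 or later; the function name `left_tailed_z_test` is made up for illustration, and the critical value is taken from the standard library's `statistics.NormalDist` rather than from the printed Table of Z scores.

```python
from math import sqrt
from statistics import NormalDist


def left_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Return (z, critical_value, reject_h0) for H0: mu >= mu0 vs H1: mu < mu0."""
    standard_error = sigma / sqrt(n)              # standard deviation of the sample mean
    z = (sample_mean - mu0) / standard_error      # test statistic
    critical_value = NormalDist().inv_cdf(alpha)  # cutoff with area alpha in the left tail
    return z, critical_value, z < critical_value


# Snow tire data from Example 1: n = 36, sample mean 148, mu = 152, sigma = 12.
z, critical, reject = left_tailed_z_test(148, 152, 12, 36, 0.05)
print(f"z = {z:.2f}, critical value = {critical:.3f}, reject H0: {reject}")
# prints: z = -2.00, critical value = -1.645, reject H0: True
```

Computing the cutoff from the inverse normal distribution gives the same -1.645 that the text obtains by splitting the difference between two table entries.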

Example 2

The engineering department of a car manufacturer claims that the fuel consumption rate of the Gasmiser model is equal to 35 miles per gallon (mpg). The advertising department wants to test this claim to see if the announced figure should be higher or lower than 35 mpg. The quality-control group suggests that the standard deviation is 4 mpg and a sample of 50 Gasmisers yields a sample mean of 33.6 mpg. Test the claim of the engineering department.

Solution

Using the above steps in hypothesis testing, we proceed as follows:

1. The claim: $\mu = 35$ mpg; alternative claim: $\mu \ne 35$ mpg

Null hypothesis: $H_0: \mu = 35$

Alternate hypothesis: $H_1: \mu \ne 35$

2. Choose $\alpha = 0.05$.

3. The sample mean $\bar{X}$ is the relevant statistic. The distribution of sample means can be approximated by a normal distribution.

4. Using the sample data, compute the test statistic.

We find that the sample mean of 33.6 is equivalent to $Z = -2.47$ by computing

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 4 / \sqrt{50}$

$Z = (\bar{X} - \mu_{\bar{x}}) / \sigma_{\bar{x}} = (33.6 - 35) / (4 / \sqrt{50}) = -2.47$

The critical Z values are found by distributing $\alpha = 0.05$ equally between the two tails to get 0.025 in each tail. We then refer to the Table of Z scores to find the Z value corresponding to 0.5 - 0.025, or 0.4750. After finding Z = 1.96, we use the property of symmetry to conclude that the left critical value is -1.96.

5. The sample mean of 33.6 mpg, corresponding to $Z = -2.47$, falls within the critical region, so we reject $H_0$.

6. The mean fuel consumption rate is probably not 35 mpg.
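A two-tailed test differs from a one-tailed test only in how the critical values are found. The sketch below, again assuming Python 3.8 or later and using the hypothetical name `two_tailed_z_test`, splits the significance level between the two tails and rejects when the test statistic falls beyond either cutoff.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Return (z, critical_value, reject_h0) for H0: mu = mu0 vs H1: mu != mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Put alpha / 2 in each tail; by symmetry the cutoffs are -c and +c.
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    return z, critical_value, abs(z) > critical_value


# Gasmiser data from Example 2: n = 50, sample mean 33.6, mu = 35, sigma = 4.
z, critical, reject = two_tailed_z_test(33.6, 35, 4, 50, 0.05)
print(f"z = {z:.2f}, critical values = ±{critical:.2f}, reject H0: {reject}")
# prints: z = -2.47, critical values = ±1.96, reject H0: True
```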

Example 3

A brewery distributes beer in bottles labeled 32 ounces. The local Bureau of Standards randomly selects 50 of those bottles, measures their contents, and obtains a sample mean of 31.0 ounces. Assuming that the standard deviation is known to be 0.75 ounces, is it valid at the 0.01 significance level to conclude that the brewery is cheating the consumer?

Solution

1. The claim is that the mean is less than 32 ounces: $\mu < 32$ ounces; the alternate to the original claim is $\mu \ge 32$ ounces.

Null hypothesis: $H_0: \mu \ge 32$

Alternate hypothesis: $H_1: \mu < 32$

2. Choose $\alpha = 0.01$.

3. Use the sample mean as the test statistic.

4. The Z value corresponding to the sample mean of 31.0 ounces is -9.43, as computed in the following:

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 0.75 / \sqrt{50}$

$Z = (\bar{X} - \mu_{\bar{x}}) / \sigma_{\bar{x}} = (31.0 - 32) / (0.75 / \sqrt{50}) = -9.43$

From the Table of Z scores, the critical Z value is -2.33, corresponding to an area of 0.4900.

5. We see that the sample mean of 31.0 ounces ($Z = -9.43$) does fall within the critical region, so we reject $H_0$.

6. The brewery is probably cheating the consumer.

TESTING USING STUDENT'S t-TEST

In testing hypotheses the central limit theorem can be used if the population standard deviation is given, the samples are large, and each hypothesis tested relates to a population mean.

If the samples are small and the population standard deviation is not known, we use the Student's t-test. In populations that are essentially normal, assume that we randomly select small samples and we do not know the value of the population standard deviation. The distribution of these sample means is approximately a Student t distribution.

The test statistic is $t = (\bar{X} - \mu) / (s / \sqrt{n})$

where s is the sample standard deviation. In tests on a mean, the number of degrees of freedom is simply the sample size minus 1: $DF = n - 1$.

Keep in mind that the student t distribution applies when we test a claim about a population mean and the following conditions are all met:

1. The sample is small ($n \le 30$); and

2. the population standard deviation $\sigma$ is unknown; and

3. the parent population is essentially normal.

Example 1

A pilot training program usually takes an average of 57.2 hours, but new teaching methods were used on the last class of 25 students. Computations reveal that for this experimental class, the completion times had a mean of 54.8 hours and a standard deviation of 4.3 hours. At the $\alpha = 0.05$ significance level, test the claim that the new teaching techniques reduce the instruction time.

Solution

Let $\mu$ represent the mean completion time for the new teaching method. The claim that it reduces instruction time is equivalent to the claim that $\mu < 57.2$ hours. Apply the steps in testing hypotheses:

1. The claim is $\mu < 57.2$ hours; the alternate to this original claim is $\mu \ge 57.2$ hours.

Null hypothesis: $H_0: \mu \ge 57.2$

Alternate hypothesis: $H_1: \mu < 57.2$

2. The level of significance is given in the problem: $\alpha = 0.05$.

3. The test statistic is t.

4. We compute the test statistic: $t = (\bar{X} - \mu) / (s / \sqrt{n}) = (54.8 - 57.2) / (4.3 / \sqrt{25}) = -2.791$.

We find the critical t value from the Table of Student's t distribution, where we locate 25 - 1, or 24, degrees of freedom in the left column and $\alpha = 0.05$ (one tail) across the top. The critical t value of 1.711 is obtained; but since small values of $\bar{X}$ will cause the rejection of the null hypothesis, we recognize that t = -1.711 is the actual t value that is the boundary for the critical region.

5. Since the test statistic t = -2.791 is in the critical region, we reject $H_0$.

Any sample mean equivalent to a t score below -1.711 represents a significant difference. The mean of 54.8 hours is significantly below 57.2 hours.

6. The new teaching method does appear to reduce the training completion time.
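The t statistic for this example can be reproduced as follows. This is a sketch under stated assumptions: the helper name `t_statistic` is hypothetical, and because the Python standard library has no Student t distribution, the critical value is simply copied from the t table quoted in the solution rather than computed.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """Return (t, degrees_of_freedom) with t = (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / sqrt(n)), n - 1


# Pilot training data: n = 25, sample mean 54.8, s = 4.3, claimed mean 57.2.
t, df = t_statistic(54.8, 57.2, 4.3, 25)
critical_t = -1.711  # table value quoted in the text: 24 DF, alpha = 0.05, left tail
reject_h0 = t < critical_t
print(f"t = {t:.3f}, DF = {df}, reject H0: {reject_h0}")
# prints: t = -2.791, DF = 24, reject H0: True
```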

Example 2

A tobacco company claims that its best selling cigarettes contain at most 40 mg of nicotine. Test this claim at the 1% significance level by using the results of 15 randomly selected cigarettes for which $\bar{X} = 42.6$ mg and s = 3.7 mg.

Solution

1. The null and alternate hypotheses are as follows:

Null hypothesis: $H_0: \mu \le 40$ mg

Alternate hypothesis: $H_1: \mu > 40$ mg

2. Choose $\alpha = 0.01$.

3. The test statistic to be used is t.

4. Compute the t statistic: $t = (\bar{X} - \mu) / (s / \sqrt{n}) = (42.6 - 40) / (3.7 / \sqrt{15}) = 2.722$

In this right-tailed test, the critical value of 2.625 is found from the Table using DF = 15 - 1 = 14 and $\alpha = 0.01$.

5. Since the test statistic of t = 2.722 does fall in the critical region, we reject $H_0$.

6. There is sufficient evidence to warrant rejection of the tobacco company's claim.
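For readers who want to check the arithmetic in step 4 by hand, the intermediate values work out approximately as follows (rounded to four significant figures; the critical value 2.625 is the table value quoted in the solution):

```latex
\begin{aligned}
\frac{s}{\sqrt{n}} &= \frac{3.7}{\sqrt{15}} \approx \frac{3.7}{3.873} \approx 0.9553 \\
t &= \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{42.6 - 40}{0.9553} \approx 2.722 > 2.625
\end{aligned}
```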

TESTING USING TESTS OF PROPORTION

The data we collect are basically either quantitative or qualitative. Quantitative data are also called variable data and can be counted or measured. Qualitative data, or attribute data, can be classified or described, but they cannot be counted or measured. Actual incomes of workers in various occupations would be an example of variable data, since they consist of specific numbers or quantities. A list of occupations, however, would be an example of attribute data, since they can only be described qualitatively. We are able to apply standard statistical methods to attribute data by representing that data in the form of a proportion or percentage.

To test a hypothesis made about a population proportion or percentage, we will therefore follow the standard procedure described earlier, but the value of the test statistic will be found by computing z as follows:

$z = (x - np) / \sqrt{npq}$ (see Binomial Distribution)

where x = number of successes among the n trials; n = number of trials; p = population proportion (given in the null hypothesis); q = 1 - p

Example 1

A senator claims that 60% or more of his constituents favor a gun-control bill. An independent pollster contacts 500 constituents selected at random and finds 273 that favor the bill. What can we conclude about the senator's claim? Assume a 5% level of significance.

Solution

1. The claim is $p \ge 0.60$; the alternate to the original claim is $p < 0.60$.

Null hypothesis: $H_0: p \ge 0.60$

Alternate hypothesis: $H_1: p < 0.60$

2. Specified $\alpha = 0.05$.

3. The sample proportion should be used to test a claim about the population proportion. Since np and nq are both at least 5, these sample proportions have a distribution that can be approximated by the normal distribution.

4. Compute the test statistic:

$z = (x - np) / \sqrt{npq} = (273 - (500)(0.6)) / \sqrt{(500)(0.6)(0.4)} = -2.465$

From the Table of z scores, the critical value for the left tail ($\alpha = 0.05$) is -1.645.

5. Since the test statistic is in the critical region, we reject $H_0$.

6. On the basis of the sample data, we reject the Senator's claim that 60% or more of the constituents favor the gun-control bill.
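The proportion test in this example can be sketched in a few lines. The function name `proportion_z` is an assumption made for illustration; the check on np and nq mirrors the condition stated in step 3, and the critical value again comes from `statistics.NormalDist` (Python 3.8 or later) instead of the printed table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(x, n, p):
    """z = (x - n*p) / sqrt(n*p*q), the normal approximation to the binomial."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("normal approximation needs np >= 5 and nq >= 5")
    return (x - n * p) / sqrt(n * p * q)


# Senator's poll: 273 of 500 constituents favor the bill; claimed p = 0.60.
z = proportion_z(273, 500, 0.60)
critical = NormalDist().inv_cdf(0.05)  # left-tail critical value for alpha = 0.05
reject_h0 = z < critical
print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {reject_h0}")
# prints: z = -2.465, critical value = -1.645, reject H0: True
```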

Example 2

Let us consider the close 1960 US presidential election and for the sake of simplicity, we will exclude consideration of electoral votes and popular votes cast for minor candidates.

We will assume that the votes cast represent a random sample taken from the population of all eligible voters. The results are summarized as follows:

Popular votes cast for Kennedy: 34,227,000

Popular votes cast for Nixon: 34,108,000

Total number of eligible voters: 108,297,000

Since Kennedy won by about 119,000 votes and about 40 million eligible people did not vote, it is reasonable to wonder whether the sample of 68,335,000 votes cast reflected the true proportion of preferences. Let us test the claim that the true population proportion for Kennedy, denoted by p, exceeded 50% or 0.5 despite the small margin of victory.

This claim suggests the following null and alternative hypotheses:

Null hypothesis: $H_0: p \le 0.50$

Alternate hypothesis: $H_1: p > 0.50$

We compute the test statistic: $z = (x - np) / \sqrt{npq} = (34,227,000 - (68,335,000)(0.5)) / \sqrt{(68,335,000)(0.5)(0.5)} = 14.40$

The critical value of z = 2.33 determines a critical region which certainly contains the test statistic of 14.40. As a result, we reject the null hypothesis: we reject the claim that at most 50% of the voters actually favored Kennedy. That the apparently slim margin of victory is so significant might seem surprising when we consider the margin of 119,000 votes in comparison to the 40 million votes that were not cast. It's results like this that make the subject of statistics fascinating and useful.
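Because the numbers in this example are so large, it may help to see the intermediate values of the test statistic written out. The following is a worked check using only the figures given above, with the square root rounded to two decimals:

```latex
\begin{aligned}
np &= (68{,}335{,}000)(0.5) = 34{,}167{,}500 \\
x - np &= 34{,}227{,}000 - 34{,}167{,}500 = 59{,}500 \\
\sqrt{npq} &= \sqrt{(68{,}335{,}000)(0.5)(0.5)} = \sqrt{17{,}083{,}750} \approx 4{,}133.25 \\
z &= \frac{59{,}500}{4{,}133.25} \approx 14.40
\end{aligned}
```

The standard deviation of the vote count is only about 4,133 votes, which is why a margin that looks slim next to 68 million ballots is still many standard deviations from an even split.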

The tests of proportions are very useful in a variety of applications including surveys, polls, and quality control considerations involving the proportions of defective parts.

TESTING USING TESTS OF VARIANCES

In a normally distributed population with variance $\sigma^2$, we randomly select independent samples of size n and compute the sample variance $s^2$ for each sample. The quantity $(n - 1) s^2 / \sigma^2$ has a distribution called the chi-square distribution. We denote chi-square by $\chi^2$.

The test statistic used in tests of hypotheses about variances is $\chi^2$:

$\chi^2 = (n - 1) s^2 / \sigma^2$

where n = sample size; $s^2$ = sample variance; $\sigma^2$ = population variance.

The chi-square distribution resembles the student t distribution in that there is actually a different distribution for each sample size n. However, it does not have the same symmetric bell-shape. It has a longer right tail and does not include negative numbers.

Example 1

Find the critical values of $\chi^2$ that determine critical regions containing areas of 0.025 in each tail. Assume that the relevant sample size is 10 so that the degrees of freedom are 10 - 1, or 9.

Solution

See the following figure and refer to the Table of chi-square distribution:

The critical value to the right (19.023) is obtained in a straightforward manner by locating 9 in the degrees-of-freedom column at the left and 0.025 across the top. The left critical value of 2.700 once again corresponds to 9 in the degrees-of-freedom column, but we must locate 0.975 across the top since the values in the top row are always areas to the right of the critical value.

Example 2

Test the claim that scores on a standard I.Q. test have a variance equal to 225 if a sample of 41 randomly selected subjects achieves scores with a variance of 258. Use a significance level of 0.05.

Solution

1. The claim: $\sigma^2 = 225$; alternate to this original claim: $\sigma^2 \ne 225$

Null hypothesis: $H_0: \sigma^2 = 225$

Alternate hypothesis: $H_1: \sigma^2 \ne 225$

2. Choose a significance level of 0.05.

3. Use the test statistic $\chi^2$.

4. Compute: $\chi^2 = (n - 1) s^2 / \sigma^2 = (41 - 1)(258) / 225 = 45.867$

We find the right critical $\chi^2$ value of 59.342 from the Table of chi-square distribution by locating 40 degrees of freedom and an area of 0.025. We find the left critical value of 24.433 by locating 40 DF and an area of 0.975. The test statistic based on the sample data is not in the critical region.

5. We fail to reject the null hypothesis.

6. On the basis of the available evidence, we cannot refute the claim that the variance equals 225.
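The two-tailed decision in this example can be sketched as below. The helper name `chi_square_statistic` is hypothetical, and since the Python standard library does not provide a chi-square distribution, both critical values are copied from the table values quoted in step 4 rather than computed.

```python
def chi_square_statistic(n, sample_variance, claimed_variance):
    """chi^2 = (n - 1) * s^2 / sigma^2, with n - 1 degrees of freedom."""
    return (n - 1) * sample_variance / claimed_variance


# I.Q. data: n = 41 subjects, sample variance 258, claimed variance 225.
chi2 = chi_square_statistic(41, 258, 225)
# Table values quoted in the text for 40 DF with 0.025 in each tail.
left_critical, right_critical = 24.433, 59.342
reject_h0 = chi2 < left_critical or chi2 > right_critical
print(f"chi-square = {chi2:.3f}, reject H0: {reject_h0}")
# prints: chi-square = 45.867, reject H0: False
```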

Example 3

With individual lines at its various windows, a bank finds that the standard deviation for waiting times on Friday afternoons is 6.2 minutes. The bank experiments with a single main waiting line and finds that for a random sample of 25 customers, the waiting times have a standard deviation of 3.8 minutes. At the significance level of 0.05, test the claim that a single line causes lower variation among the waiting times.

Solution

1. The claim: $\sigma < 6.2$ minutes; alternate to this original claim: $\sigma \ge 6.2$ minutes

Null hypothesis: $H_0: \sigma \ge 6.2$

Alternate hypothesis: $H_1: \sigma < 6.2$

2. Choose a significance level of 0.05.

3. Use the test statistic $\chi^2$.

4. Compute: $\chi^2 = (n - 1) s^2 / \sigma^2 = (25 - 1)(3.8^2) / 6.2^2 = 9.016$

This test is left-tailed since the null hypothesis will be rejected only for small values of $\chi^2$. With a significance level of 0.05 and DF = 24, we go to the Table and align 24 DF with an area of 0.95 to obtain the critical value of 13.848. The test statistic value of 9.016 falls within the critical region.

5. We reject the null hypothesis.

6. On the basis of the available evidence, the single main line does appear to lower the variation among waiting times.
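Note that the data in this example are standard deviations, so they must be squared before they enter the chi-square statistic. Written out step by step, using only the figures given above:

```latex
\begin{aligned}
s^2 &= 3.8^2 = 14.44, \qquad \sigma^2 = 6.2^2 = 38.44 \\
\chi^2 &= \frac{(n - 1)\, s^2}{\sigma^2} = \frac{(24)(14.44)}{38.44} = \frac{346.56}{38.44} \approx 9.016 < 13.848
\end{aligned}
```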
