Chapter 8: Hypothesis Testing for One Population Mean,
Variance, and Proportion
Learning Objectives
Upon successful completion of Chapter 8, you will be able to:
• Understand terms.
• State the null and alternative hypotheses.
• Use the 5 steps to test hypotheses with both the critical value and p-value methods.
• Test specific hypothesized values for means, variances, and proportions.
• Describe the relationship between type I and type II errors.
• Given a value for the test statistic, find the p-value.
I. General Concept
Hypothesis testing is a decision-making process that chooses between 2 statements about a
population parameter: the null hypothesis and the alternate or research hypothesis.
A. Kinds of Hypotheses
• The null hypothesis (H0) states that the parameter equals a specific value (or in Chapter 9
two parameters are equal).
• The alternative hypothesis (H1) states the parameter is different from a specific value (or in
Chapter 9 the alternate hypothesis is a difference between two parameters).
B. Hypotheses Testing Methods
The traditional or critical value method, which is a location method.
The P-value (prob-value) method, which is a comparison of areas method.
C. Common Phrases used in Testing
Symbol   Phrases
  >      Is greater than, Is increased
  <      Is less than, Is decreased or reduced from
  ≥      Is greater than or equal to, Is at least
  ≤      Is less than or equal to, Is at most
  =      Is equal to, Has not changed
  ≠      Is not equal to, Has changed from
Dr. Janet Winter, jmw11@psu.edu
Stat 200
Page 1
D. Very Important Notation
H0: Always has the = sign; nothing is different from the usual.
H1: Called the research or the alternative hypothesis; it includes the direction the researcher
hopes to justify and will always be <, >, or ≠.
E. Stating the Null and Alternate Hypotheses
I. Example:
Calcium is the most abundant mineral in the body and also one of the most important.
It works with phosphorous to build and maintain bones and teeth. The recommended
daily allowance (RDA) of calcium is 800 milligrams. Assume we want to test whether
people with income below the poverty level receive an average less than the RDA of
800mg.
H0 : μ = 800
H1 : μ < 800
II. Example:
The R.R. Bowker Company of New York collects information on the retail prices of
books. In 2000, the standard deviation of the price of history books was $3.25. Suppose
this company wishes to determine whether this year’s standard deviation is higher than
the standard deviation of prices in 2000.
H0 : σ = 3.25
H1 : σ > 3.25
Question 1
A researcher is concerned that the mean weight of honey bees is more than 11g.
a) H0 : μ = 11
H1 : μ ≠ 11
b) H0 : μ = 11
H1 : μ > 11
c) H0 : μ = 11
H1 : μ < 11
d) I do not know.
Question 2
The variance in the amount of salt in granola is different from 3.42mg.
a) H0 : σ = 3.42 H1 : σ ≠ 3.42
b) H0 : σ ≠ 3.42 H1 : σ = 3.42
c) H0 : σ² = 3.42 H1 : σ² ≠ 3.42
d) H0 : σ² ≠ 3.42 H1 : σ² = 3.42
Question 3
3% of homes remain unsold after 6 months in Newark, NJ. It is assumed that homes in the
vicinity of the college have a faster sales rate. What null and alternate hypotheses should
be used to test this theory?
a) H0 : p = .03
H1 : p ≠ .03
b) H0 : p = .03
H1 : p > .03
c) H0 : p = .03
H1 : p < .03
d) H0 : p < .03
H1 : p = .03
Question 4
A company that produces snack food uses a machine to package the product in 454 gram
quantities. They wish to test the machine.
Question 5
The NCAA believes 57.6% of football injuries occur during practice. A head trainer thinks
this is too high.
Question 6
A manufacturer of machine parts must have a standard deviation in measurement not more
than 0.32 mm. Use a sample of 25 parts to test if the standard deviation is outside of the
requirement.
F. Statistical Tests
I. Process
• A statistical test compares the sample statistic with the null hypothesized value to
decide whether or not the null hypothesis should be rejected.
• The numerical value obtained from a statistical test is called the test value or test
statistic.
II. Errors
a) Possible Outcomes of a Hypothesis Test
                      H0 True             H0 False
Reject H0             Type I error        Correct decision
Do not reject H0      Correct decision    Type II error
b) Types of Errors
• A Type I error occurs if one rejects the null hypothesis when it is true.
o The maximum probability of committing a type I error: P(type I error) ≤ α
o The probability of rejecting a true null hypothesis.
o Also called the level of significance or α.
o Set by the researcher. If not stated, use 0.05.
• A Type II error occurs if one does not reject the null hypothesis when it is false.
o P(type II error) = β
o The probability of not rejecting a false null hypothesis.
o Decreases when α increases.
c) Level of significance α
• Maximum probability of type I error
• P(type I error) ≤ α
• The probability of rejecting a true null hypothesis
• Set by researcher
• If not noted, use 0.05
d) α (alpha) and β (beta) Probabilities
In most hypothesis testing situations, β cannot easily be computed.
However, decreasing either alpha or beta increases the other.
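The trade-off between α and β can be made concrete in one case where β is easy to compute: a right-tailed z test with a known σ and a specific true mean. The sketch below is only an illustration. The numbers (μ0 = 100, true mean 103, σ = 10, n = 25) are hypothetical and do not come from this guide, and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def beta_right_tailed_z(alpha, mu0, mu_true, sigma, n):
    """P(type II error) for a right-tailed z test when the true mean is mu_true."""
    std_normal = NormalDist()                  # standard normal: mean 0, sd 1
    z_crit = std_normal.inv_cdf(1 - alpha)     # start of the rejection region
    # How far the true mean sits from mu0, measured in standard errors.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # We fail to reject when z < z_crit; under the true mean, z is centered at shift.
    return std_normal.cdf(z_crit - shift)

# Hypothetical setting, chosen only to show the direction of the trade-off.
betas = {
    alpha: beta_right_tailed_z(alpha, mu0=100, mu_true=103, sigma=10, n=25)
    for alpha in (0.10, 0.05, 0.01)
}
for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}")
```

With the sample size held fixed, each smaller α in the loop gives a larger β, which is the relationship stated above.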
II. Hypothesis Testing of a Specified Value for the Population Mean: The
Traditional or Critical Value Method
(This is a “comparison of locations method”)
A. Terms
I. The rejection region is the range of test values that indicates a significant difference exists
between the sample statistic and the hypothesized parameter. The critical value marks
the start of the rejection region or critical region. If the test statistic is located in the
critical region, the null hypothesis should be rejected.
II. The non-critical or non-rejection region is the range of values of the test value that
indicates that the difference was probably due to chance. If the test value is located in the
non-critical region, the null hypothesis should not be rejected.
III. Critical Value (c.v.)
• Separates the critical region or rejection region from the non-critical region or non-rejection region.
• It is determined from the appropriate table.
• It is based on the significance level and the alternate hypothesis.
B. Finding Critical Values
I. Concepts for One-Tailed Tests
• The critical region or rejection region is only on ONE side of the center value for the
distribution or in one tail.
• It is Right-tailed or left-tailed, depending on the direction of the inequality of the
alternate hypothesis.
• The area of one tail is equal to the level of significance.
a) Left-Tailed z-tests or t-tests
H0 : μ = k
H1 : μ < k
• This is a left tailed test since the alternate hypothesis has the “<” symbol.
• Since the alternate hypothesis is less than k, the rejection region is to the left of
a negative z critical value.
[Figure: normal curve centered at 0 with the critical region to the left of -z and the
noncritical region to the right of -z.]
Note: Whenever the alternative hypothesis is “less than”, the rejection region is
to the left of the center of the normal distribution. H1 contains “<” which points
to the left like a left pointing arrow.
• To find a left tailed z critical value (c.v.), locate the level of significance or alpha
in the center portion of the Normal Probability Table. Move your hand along the
row to the left to find the units and tenths digits for z. Next, move your hand to
the top of the column to find the hundredths digit for z. Notice this value is
negative.
• To find a left tailed t critical value, use the t table. Find the one tailed α row at
the top of the table. Use the column with the specified value of α for the
problem. Move your hand down this column to the row for the df in the
problem. Affix a negative sign to this value because the c.v. is less than 0.
• Example:
For a left-tailed test, find critical z for α = 0.01.
Since the test is left-tailed, the alternate hypothesis contains a “less than” and the
rejection region is on the left. Locate 0.0100 inside of the normal probability
table. Move along the row to the left to -2.3. Again, starting at 0.0100, move to
the top of the column to .03. The critical value is –(2.3 + .03) = -2.33. The rule would be
to reject the null hypothesis if the test value is less than or equal to -2.33.
b) Right-Tailed z-tests or t-tests
H0 : μ = k
H1 : μ > k
• This is a right tailed test since the alternate hypothesis has the “>” symbol.
• Since the alternate hypothesis is more than k, the rejection region is to the right
of a positive c.v.
[Figure: normal curve centered at 0 with the noncritical region (area 1 − α) to the left of +z
and the critical region (area α) to the right of +z.]
Note: Whenever the alternative hypothesis is “greater than” or has the “>”
symbol, the rejection region is to the right of the critical value (c.v.) and to the
right of the center of the normal distribution. H1 contains > which points to the
right like a right pointing arrow.
• To find a one tailed, right tailed z critical value (c.v.), locate (1 – α) or 1 minus the
level of significance in the center portion of the Normal Probability Table. Move
your hand along the row to the left to find the units and tenths digits for z and
move your hand to the top of the column to find the hundredths digit for z.
Notice this c.v. is positive.
• To find a one tailed, right tailed t critical value, use the t-table. Use the one tailed
α row at the top of the page. Find the column for α and move your hand down this
column until you reach the row for n – 1 degrees of freedom. The entry at the
intersection of column α and row df is the c.v. DO NOT MAKE IT NEGATIVE.
• Example:
Use the z table to find the critical value for a right-tailed test for α = 0.05.
[Figure: normal curve centered at 0 with the critical region to the right of 1.65.]
Since the test is right-tailed, the alternate hypothesis contains the “more than”
symbol and the rejection region is on the right. Locate 1 – .05 = .9500 inside the
normal probability table. Since it is exactly halfway between the entries for 1.64 and 1.65,
the critical value would be 1.645 or, rounded to two decimal places, 1.65.
Thus reject for any test value greater than or equal to 1.65.
II. Concepts for Two-Tailed Tests
• The alternative hypothesis always has ≠.
H0 : μ = k
H1 : μ ≠ k
• Since the alternative hypothesis has ≠, the test statistic can be either on the right or
on the left of the center.
• The null hypothesis is rejected when the test value is in either of 2 critical or
rejection regions (one on the left and one on the right).
• It is necessary to find critical values (c.v.) for both sides.
• The critical values are opposites (only for the z and t tables, or c.v. = ± the tabled value).
• The sum of the areas in the two tails is equal to the level of significance or alpha.
That is, each tail has probability or area α/2.
Note: every two-tailed test has the ≠ sign, 2 critical values, and 2 rejection regions.
Each tail has probability or area α/2.
• Example:
Using the z-table, find the critical values for a two tailed test when α = 0.01.
For a two-tailed test, the alternative hypothesis will be “not equal to” and the
rejection region is on the right and left with each area 0.01/2 = 0.005. To find the
critical value on the left, locate 0.005 inside the normal probability table. The left
critical value is -2.575 or, rounded to two decimal places, -2.58. Using symmetry, the
critical value on the right is 2.58. The rule would be to reject the null hypothesis for
any test value greater than or equal to 2.58 AND also to reject the null hypothesis for
any test value less than or equal to -2.58. CV = ± 2.58
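The table lookups in the three examples above can be cross-checked with software. The following is a minimal sketch using Python's standard library, where `NormalDist().inv_cdf` plays the role of reading the Normal Probability Table backwards. The helper name `z_critical` is an assumption made for this illustration and is not part of the course materials.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Return the z critical value(s) as a tuple for a 'left', 'right', or 'two' tailed test."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)        # area alpha in the left tail
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)    # area alpha in the right tail
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)  # area alpha/2 in each tail
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")

left_cv = z_critical(0.01, "left")[0]    # left-tailed example, alpha = 0.01
right_cv = z_critical(0.05, "right")[0]  # right-tailed example, alpha = 0.05
two_cv = z_critical(0.01, "two")         # two-tailed example, alpha = 0.01
print(left_cv, right_cv, two_cv)
```

The software values carry more decimal places than the table, so they agree with the examples only after rounding; in particular the two-tailed value is closer to 2.576 than to the table's 2.575.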
Question 7
Using the z-table, find the critical value for a right tailed test with α = .025.
Question 8
Using the z-table, find the critical value for a left tailed test with α = .10.
Question 9
Using the z-table, find the critical values for a two tailed test with α = .05.
III. Combining Concepts
• Example:
In 2002, the mean age of an inmate on death row was 40.7 years, according to data
obtained from the U.S. Department of Justice. A sociologist wants to test the
statement using the 0.10 level of significance. State the null and alternate
hypotheses. State the critical value or values, and the rejection rule.
Since the sociologist is not implying a direction, the test is two tailed and the hypotheses
are:
H0 : μ = 40.7
H1 : μ ≠ 40.7
Since the alternative hypothesis is ≠, the test is two tailed with 0.10/2 = 0.05
probability in each tail. Use the Normal Probability table backwards to find the z
value with 0.05 in the left tail. The left critical value is z = -1.65. By symmetry the
right critical value is the opposite or +1.65. The rule is to reject the null hypothesis if
the test statistic is either greater than 1.65 or less than -1.65. CV = ± 1.65.
Question 10
An energy official thinks that the oil output per well has declined from the 1998 level of 11.1
barrels per day. Use the .001 level of significance. State the null and alternative
hypotheses. State the critical value or values, and the rejection rule.
C. Hypothesis Testing Process using the Critical Value or Traditional Method
(location method)
1. State the hypotheses.
2. Compute the test value or test statistic.
3. Find the critical value or beginning of the rejection region or regions from the
appropriate table.
4. Decide to reject or fail to reject the null hypothesis based on the location of the test value.
5. Record a conclusion in terms of the situation in the problem.
Note: This is very, very important – MEMORIZE THIS!!!
I. Test Statistics (to test the value of the Population Mean)
1. z Test for the Population Mean
Use z when σ, the population standard deviation, is known and either n ≥ 30 or the
population is normally distributed.
$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Calculator form: z = (X̄ − μ) ÷ (σ ÷ √n)
Where:
X̄ = sample mean
μ = hypothesized population mean
σ = population standard deviation
n = sample size
2. t Test for the Population Mean
Use the t Test when σ is not known and either n ≥ 30 or the population is normally
distributed.
$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}, \qquad df = n - 1$$
Calculator form: t = (X̄ − μ) ÷ (s ÷ √n)
Your calculator work will be the same for every problem testing the specified value
for the mean.
II. Problems:
• Example:
According to the USA Today, the average age of commercial jets in the U.S. is 14
years. An executive of a large airline selects a sample of 36 planes. The average age
of the jets in this sample is 11.8 years. The population standard deviation is 2.7
years. At the 0.01 level of significance, is the data sufficient to conclude that the
average age of the planes in this company is less than the national average?
1. State the null and alternate hypotheses
H0 : μ = 14 H1 : μ < 14
2. Find the test statistic
$$z = \frac{11.8 - 14}{2.7 / \sqrt{36}} = -4.89$$
Calculator form: (11.8 − 14) ÷ (2.7 ÷ √36) = −4.89
*Use z since the population standard deviation is stated and n = 36 ≥ 30.
NOTICE THAT THE TEST VALUE IS COMPUTED USING SAMPLE DATA.
3. Find the critical value. This is a one tailed test with a level of significance equal to
0.01 or the probability in the left tail is 0.01. Since this is a z test statistic, use the
normal probability table to find: c.v. = -2.33
NOTICE THAT THE REJECTION REGION IS DEFINED BY A TABLED CRITICAL VALUE.
4. Since -4.89 is less than -2.33 or the test value is in the rejection region, reject the
null hypothesis.
5. The data supports an average age of the planes in the executive’s airline to be
less than the national average of 14 years.
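The five steps of the jet example can be retraced in a few lines of code. This is a sketch of the critical value method under the example's stated numbers, using Python's standard library; the variable names are illustrative choices, and `inv_cdf` stands in for the table lookup of the critical value.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: mu = 14, H1: mu < 14 (left-tailed test)
mu0 = 14.0      # hypothesized mean age in years
x_bar = 11.8    # sample mean
sigma = 2.7     # population standard deviation (given in the example)
n = 36          # sample size
alpha = 0.01    # level of significance

# Step 2: test statistic computed from the sample data
z = (x_bar - mu0) / (sigma / sqrt(n))

# Step 3: left-tailed critical value (software version of the table lookup)
critical_value = NormalDist().inv_cdf(alpha)

# Step 4: reject H0 when the test value falls in the left rejection region
reject_h0 = z <= critical_value

print(f"z = {z:.2f}, c.v. = {critical_value:.2f}, reject H0: {reject_h0}")
```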
• Example:
Two researchers measured the pH of randomly selected lakes in the Southern Alps:
7.2 7.3 6.1 6.9 6.6 7.3 6.3 5.5 6.3 6.5 5.7 6.9 6.7 7.9 5.8
It is assumed that pH is a normally distributed random variable. Based on this sample
data, can we conclude that on average, lakes in the Southern Alps are non-acidic or
have a pH higher than 6.0?
Use your calculator to find the sample average and sample standard deviation. Since
the researchers used a random sample of lakes, use X̄ and s from the calculator.
X̄ = 6.60   s = .672   n = 15
1. Since there is a concern about a pH “higher than 6.0” use:
H0 : μ = 6.0
H1 : μ > 6.0
2. Use the t test statistic since the population standard deviation is not given but
the variable is normally distributed with n = 15 < 30.
$$t = \frac{6.60 - 6.0}{0.672 / \sqrt{15}} = 3.46$$
Calculator form: (6.60 − 6.00) ÷ (.672 ÷ √15) = 3.46
NOTICE THAT THE TEST VALUE IS COMPUTED USING SAMPLE DATA.
3. Use the t-table with df: 15 – 1 = 14 and one tailed α = 0.05 (the default, since no level
is stated) to find c.v. = 1.761. The rejection region is to the right of 1.761.
NOTICE THAT THE REJECTION REGION IS DEFINED BY A TABLED CRITICAL VALUE.
4. Since the test value 3.46 is larger than the c.v. = 1.761, it is in the rejection
region. Reject the null hypothesis.
5. Since the null hypothesis is rejected, it is removed and the hypothesis that
remains is the alternate hypothesis. The data supports an average pH greater
than 6.0 for lakes in the Southern Alps.
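The calculator work for the pH example (sample mean, sample standard deviation, and t test value) can be reproduced from the raw data. The sketch below uses only Python's standard library, which has no t table, so the critical value 1.761 is copied from the example rather than computed; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# pH readings from the example
ph = [7.2, 7.3, 6.1, 6.9, 6.6, 7.3, 6.3, 5.5, 6.3, 6.5, 5.7, 6.9, 6.7, 7.9, 5.8]

mu0 = 6.0           # H0: mu = 6.0, H1: mu > 6.0 (right-tailed test)
n = len(ph)
x_bar = mean(ph)    # sample mean
s = stdev(ph)       # sample standard deviation (divides by n - 1)

# t test statistic with n - 1 degrees of freedom
t = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

# Tabled critical value for df = 14, one-tailed alpha = 0.05 (taken from the example)
critical_value = 1.761
reject_h0 = t >= critical_value

print(f"x_bar = {x_bar:.2f}, s = {s:.3f}, t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```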
Question 11
Concerned that adult females under 51 years are not getting adequate iron intake, a
statistician selected a random sample of women under 51 years old and found the listed iron
intake in milligrams for a 24-hour period. Use a 5% level of significance and test his concern.
Assume RDA for iron is 18mg and assume it is a normally distributed random variable.
15.0 18.1 14.4 14.6 10.9 18.2 18.3 15.0 11.0 15.6 11.6 19.8 20.7 13.1
12.1 11.5
Question 12
To see if young men ages 8 through 17 years spend an average of $24.44 per shopping trip
to a local mall, the manager surveyed 33 young men and found the average amount spent
per visit was $22.97. The standard deviation of the sample was $3.70. Assume the amount
spent on a shopping trip is normally distributed. At α = 0.02, can it be concluded that the
average amount spent at a local mall is not equal to the national average of $24.44?
III. Hypothesis Testing: P – value, prob – value or comparison of areas method
A. Computation of P
• For a one-tailed test: the P-value is the area from the test statistic to more extreme
values in the direction of the alternative hypothesis.
• For a two-tailed test: the P-value is twice the area from the test statistic to the end of
the tail. If the test value is less than zero, the area is to the left. If the test value is more
than zero, the area is to the right.
B. Process for the P-value Method (comparison of areas method)
1. State the hypotheses.
2. Compute the test value.
3. Find the P-value or area in the tail or tails past the test statistic.
Notice the test statistic is used to find the P-value.
4. Decision Rule: reject the null hypothesis whenever p is less than or equal to the level of
significance.
• If P-value ≤ α, reject the null hypothesis
• If P-value > α, fail to reject the null hypothesis
NOTE: This is very, very important – MEMORIZE THIS!!!
5. Record a conclusion related to the problem.
C. TI-83 Directions for Normal and T Probability
(Use the 2nd function VARS to access DISTR)
1. Use 2nd function VARS to enter the DISTR Menu
2. Select normalcdf (for z) or tcdf (for t)
3. For z, enter the values for the left end point and the right endpoint (separated by
commas) for the area or probability to be determined
• normalcdf (left endpoint, right endpoint)
• For H1 : μ < μ0, p = normalcdf (-100, test value)
• For H1 : μ > μ0, p = normalcdf (test value, 100)
• For H1 : μ ≠ μ0, p = 2 · normalcdf (|test value|, 100)
4. For t, enter the values for the left and right endpoints and degrees of freedom for t.
• tcdf (left endpoint, right endpoint, df)
• For H1 : μ < μ0, p = tcdf (-100, test value, df)
• For H1 : μ > μ0, p = tcdf (test value, 100, df)
• For H1 : μ ≠ μ0, p = 2 · tcdf (|test value|, 100, df)
Refer to the Chapter 6 Guide for more information on the normalcdf.
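For readers working without a TI-83, the three normalcdf rules for z can be mirrored in software. This is a rough equivalent using Python's standard library, not a description of how the calculator works internally; the function name `z_p_value` and its `alternative` labels are assumptions made for this sketch. The standard library has no t distribution, so only the z case is shown.

```python
from statistics import NormalDist

def z_p_value(test_value, alternative):
    """P-value for a z test statistic.

    alternative: 'less' (H1 has <), 'greater' (H1 has >), or 'two-sided' (H1 has ≠).
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "less":
        # area to the left of the test value, like normalcdf(-100, test value)
        return std_normal.cdf(test_value)
    if alternative == "greater":
        # area to the right of the test value, like normalcdf(test value, 100)
        return 1 - std_normal.cdf(test_value)
    if alternative == "two-sided":
        # twice the tail area past |test value|, like 2 * normalcdf(|test value|, 100)
        return 2 * (1 - std_normal.cdf(abs(test_value)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

print(z_p_value(-1.96, "less"), z_p_value(1.96, "two-sided"))
```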
Example:
The average stopping distance of a school bus traveling 50 miles per hour is usually 264
feet (Snapshot, USA TODAY, March 12, 1992). A group of automotive engineers
determined the average stopping distance for 30 randomly selected buses traveling 50
miles per hour to be 262.3 feet. The standard deviation of the population was 3 feet.
Use this data to test if the average stopping distance is actually less than 264 feet.
Use the P-value method with α = 0.01.
1. H0 : μ = 264 H1 : μ < 264
2. $$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{262.3 - 264}{3 / \sqrt{30}} = -3.10$$
Calculator form: (262.3 − 264) ÷ (3 ÷ √30) = −3.10
*Use z since the population standard deviation is stated and n = 30 ≥ 30.*
3. Since this is a one tailed test, find the area less than -3.10 or P(z < -3.10) = .0010.
The P-value is .0010 = normalcdf (-100, -3.10)
4. Reject the null hypothesis since P = .0010 < α = .01.
5. The data supports an average stopping distance less than 264 feet.
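The stopping-distance example can be checked end to end with the P-value method. The sketch below follows the five steps using Python's standard library; `NormalDist().cdf` takes the place of normalcdf(-100, test value), and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: mu = 264, H1: mu < 264 (left-tailed test)
mu0 = 264.0     # hypothesized mean stopping distance in feet
x_bar = 262.3   # sample mean
sigma = 3.0     # population standard deviation (given)
n = 30          # sample size
alpha = 0.01    # level of significance

# Step 2: test statistic
z = (x_bar - mu0) / (sigma / sqrt(n))

# Step 3: P-value is the area to the left of the test statistic
p_value = NormalDist().cdf(z)

# Step 4: reject H0 whenever the P-value is less than or equal to alpha
reject_h0 = p_value <= alpha

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Because the code keeps the unrounded test statistic, the P-value can differ in the fourth decimal place from a table lookup at z = -3.10; the decision is the same.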
Example: Critical Value Method
The average salary of graduates entering the actuarial field is reported to be $50,000.
To test this figure, a statistics professor surveyed 20 graduates. Their average salary is
$53,228 with a standard deviation of $4000. Assume that these salaries are normally
distributed and use the critical value method with a 0.05 level of significance to test the
reported $50,000.
X̄ = $53,228   s = $4,000   n = 20
1. H0 : μ = $50,000   H1 : μ ≠ $50,000
2. $$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{53{,}228 - 50{,}000}{4000 / \sqrt{20}} = 3.61$$
3. c.v. = ± 2.093   d.f. = 19
[Figure: t curve centered at 0 with rejection regions to the left of −2.093 and to the right
of 2.093.]
4. The test statistic 3.61 is further out in the tail past the c.v. = 2.093. Reject the null
hypothesis.
5. The data is sufficient to reject or refute an average salary equal to $50,000.
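The sign and size of the salary test statistic are easy to confirm numerically. The short sketch below recomputes t from the summary statistics and applies the two-tailed rejection rule; Python's standard library has no t table, so the critical value 2.093 (df = 19, two-tailed α = 0.05) is copied from the example, and the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the example
mu0 = 50_000    # H0: mu = 50,000, H1: mu != 50,000 (two-tailed test)
x_bar = 53_228  # sample mean salary
s = 4_000       # sample standard deviation
n = 20          # sample size

# t test statistic with n - 1 = 19 degrees of freedom
t = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

# Tabled two-tailed critical value for df = 19 and alpha = 0.05 (from the example)
critical_value = 2.093

# Two-tailed rule: reject when the test value is beyond either critical value
reject_h0 = abs(t) >= critical_value

print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```

Since the sample mean is above the hypothesized mean, the test value is positive and lands in the right rejection region.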
Example: P – Value Method
The average salary of graduates entering the actuarial field is reported to be $50,000.
To test this figure, a statistics professor surveyed 20 graduates. Their average salary is
$53,228 with a standard deviation of $4000. Assume that these salaries are normally
distributed and use the P-value method with a 0.05 level of significance to test the reported
$50,000.
1. H0 : μ = $50,000   H1 : μ ≠ $50,000
2. $$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{53{,}228 - 50{,}000}{4000 / \sqrt{20}} = 3.61$$
3. For a two tailed test, be sure to double the tail area past the test statistic.
Find twice the area in the tail from the test value 3.61 to the end of the right tail or
P = 2P(t > 3.61) = 2tcdf(3.61, 100, 19) = .0019
4. Since P is less than the level of significance 0.05, reject the null hypothesis.
5. The data is sufficient to reject an average salary equal to $50,000.
Example:
In 2001, the mean household expenditure for energy was $1493, according to data
obtained from the U.S. Energy Information Administration. An economist wants to know
whether this amount has changed significantly from its 2001 level. Using a random
sample of 35 households, she finds the mean expenditure (in 2001 dollars) for energy
during the most recent year to be $1618, with standard deviation $321. Complete the
test for the economist at the 0.05 level of significance using the P-value method.
X̄ = 1618,   s = 321,   n = 35
1. H0: μ = 1493 H1: μ ≠ 1493
2. Use t since the standard deviation for the population is not given and n = 35 > 30.
$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{1618 - 1493}{321 / \sqrt{35}} = 2.30$$
Calculator form: (1618 − 1493) ÷ (321 ÷ √35) = 2.30
3. This is a two tailed test since the alternate hypothesis has ≠.
The degrees of freedom are n – 1 or 35 -1 = 34
Calculator: P = 2P(t >2.30) =2tcdf (2.30,100, 34) = .0277
Table: Using the row for degrees of freedom 34, 2.30 is between the values 2.032
(go to the top of its column) with two tailed probability 0.05 AND 2.441 (go to the
top of its column) with two tailed probability 0.02. This means P is bounded by 0.05
and 0.02 or 0.02 < P < 0.05
4. From the calculator, P = .0277 and P < .05. Reject the null hypothesis.
From the table, since the upper bound for P is 0.05, P < .05. Reject the null
hypothesis.
5. The data is sufficient to reject an average energy expenditure equal to $1493.
Question 13
Last year the average cost of a concert ticket was $54.80. This year, a random sample of 15
recent concerts had an average price of $62.30 with a variance of $90.25. At the 0.05 level
of significance, can it be concluded that the cost has increased?
Question 14
A study published in the American Journal of Psychiatry measured the effect of alcohol on
the developing hippocampus, or the portion of the brain responsible for long term memory.
To determine if the volume of the hippocampus is less than the normal 9.02 cm³ for
adolescents who abuse alcohol, the researchers used a sample of 12 adolescents with alcohol
abuse problems. The average volume of their hippocampus was 8.10 cm³ with a standard
deviation of 0.7 cm³. Use the 0.01 level of significance.
Question 15:
The U.S. Golf Association requires that golf balls have a diameter that is 1.68 inches. An
engineer for the USGA wishes to discover whether Maxfli XS golf balls have a mean
diameter different from 1.68 inches. A random sample of Maxfli XS golf balls was selected.
Assume the diameters are normally distributed and test with a 0.10 type one error rate.
Conduct the test using the P-value method.
1.683 1.684 1.677 1.684 1.681 1.673
1.685 1.685 1.678 1.682 1.686 1.674
Using a calculator find: X̄ = 1.6810, s = 0.0045, n = 12 and then conduct the test.
Question 16:
A Gallup Poll stated that women visit their physician an average of 5.8 times a year. The
number of physician visits for 2007 is listed below for 20 randomly selected women:
3 8 2 0 1 5 3 6 7 4
2 2 9 1 4 3 6 4 6 1
Assume that the number of physician visits is normally distributed. At the 0.05 level of
significance can we conclude that the Gallup Poll average is correct? Use the p-value
method.
μ = 5.8 Use a calculator to find: X̄ = 3.85 n = 20 s = 2.52 α = 0.05
IV. Hypothesis Tests for a Proportion (Always use z)
A. Formula
Use this method only when np ≥ 5 and n(1 − p) ≥ 5
$$z = \frac{\hat{p} - p}{\sqrt{pq / n}}$$
Calculator form: z = (p̂ − p) ÷ √(p · q ÷ n)
where
p̂ = X/n
p = hypothesized proportion
q = 1 − p
n = sample size
Note: the positions of the sample proportion p̂ and the hypothesized proportion p in the
formula.
B. Problem Characteristics:
• all involve a hypothesis about a percent, proportion, or fraction
• change all percents, proportions, or fractions to four place decimals before computing
the test statistic
• If necessary, form p̂ = X/n where X is the count of the number of successes and n is the
maximum number of tries
• p̂ = X/n is derived from data
Example:
In a survey conducted by the American Animal Hospital Association, 37% of the
respondents stated that they talk to their pets on the answering machine or telephone. A
veterinarian thought this percent was high. He randomly selected 150 pet owners. 54 of
them responded that they speak to their pet on the answering machine or telephone.
Use this data to run the test at the 0.10 level using the p-value method.
p = 0.37 (claim)
p̂ = 54/150 = 0.36 is derived from data
1. H0: p = 0.37 H1: p < 0.37
2. The test statistic is:
$$z = \frac{0.36 - 0.37}{\sqrt{(0.37)(0.63) / 150}} = -0.25$$
Calculator form: (.36 − .37) ÷ √(.37 · .63 ÷ 150) = −.25
3. Since this is a one tailed test, find the area less than -0.25 or
P = P(z < -0.25) = .4013 or
P = normalcdf(-100, -.25) = .4013
4. Fail to reject the null hypothesis since P = .4013 > α = .10.
5. The data does not support less than 37% of pet owners talking to their pets on the
telephone or answering machine.
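The proportion test in this example can be verified in a few lines. The following sketch uses Python's standard library and the example's numbers; it also checks the np ≥ 5 and n(1 − p) ≥ 5 conditions before using z. Variable names are illustrative. The code keeps the unrounded test statistic (about −0.254), so its P-value of roughly 0.40 differs slightly in the third decimal from a table lookup on a rounded z.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: p = 0.37, H1: p < 0.37 (left-tailed test)
p0 = 0.37        # hypothesized proportion (the claim)
successes = 54   # owners who talk to their pets by phone
n = 150          # sample size
alpha = 0.10     # level of significance

q0 = 1 - p0
# The z test for a proportion is only appropriate when both counts are at least 5.
conditions_ok = n * p0 >= 5 and n * q0 >= 5

# Step 2: sample proportion and test statistic
p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * q0 / n)

# Step 3: P-value is the area to the left of the test statistic
p_value = NormalDist().cdf(z)

# Step 4: reject H0 only when the P-value is at most alpha
reject_h0 = p_value <= alpha

print(f"p_hat = {p_hat:.2f}, z = {z:.3f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```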
Question 17:
It has been reported that 40% of the adult population over 60 use e-mail. From a random
sample of 180 adults, 65 used e-mail. At α = 0.01, is there sufficient evidence to conclude
that the proportion differs from 40%? Use the P-value method.
Question 18:
USA TODAY reported that 63% of Americans will take a vacation this summer. In a survey of
143 Americans 85 were planning to vacation this summer. Use this data to test the USA
TODAY report at the 0.05 level of significance with the critical value method.
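V. Hypothesis Tests for One Variance (always use the Chi-Square Test)
A. Formula
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$
With d.f. = n – 1 where
n = sample size, s² = sample variance, σ² = hypothesized variance
Notice the position of the sample variance and the hypothesized variance in this formula.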
B. Review of Chi-Square Distributions
The chi-square distributions are a family of probability distributions identified by degrees of
freedom (n-1) with:
1. Zero or positive values
2. Distribution skewed to the right
3. One mode slightly to the left of the degrees of freedom
Important: The mode is slightly left of the degrees of freedom
Chi-Square Distributions depend on degrees of freedom
Note: as df increases the curve moves to the right
If degrees of freedom are not listed in the table, use the closest smaller value.
C. Comments
• Also used to test a value of a standard deviation
(square the sample and hypothesized standard deviation to find the variances)
• Can be left tailed, right tailed, or two tailed
• Left tailed test statistics should be to the left of df
• Right tailed test statistics should be to the right of df
• Two tailed test statistics can be either on the left or the right of the df
• Mark the df slightly to the right of the mode. Now you know the approximate location
of the center.
• The value of the c.v. is always positive for χ²
Example:
A researcher believes that the standard deviation of the number of cars stolen each year
is less than 15. For a sample of 12 years, the standard deviation for the number of
stolen cars is 13.6. Use α = 0.05 and the critical value method for the test.
Since the “researcher believes that the standard deviation of the number of cars stolen
each year is less than 15”, the test is a left tailed test. The test statistic should be on the
left of the df and the critical value will be to the left of the degrees of freedom, which is
approximately at the mode. The p-value will be the area from the test statistic to the left
end of the tail.
1. H0 : σ = 15   H1 : σ < 15
2. $$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(12 - 1) \cdot 13.6^2}{15^2} = 9.04$$
Calculator form: 11 · 13.6² ÷ 15² = 9.04
*Since this is a left tailed test, use the 1 − α = 1 − .05 = .95 χ² column with df: n – 1
= 12 – 1 = 11 to find the c.v.
3. C.V. = 4.575   d.f. = 11
Note: This c.v. is always positive.
[Figure: chi-square curve with the critical value 4.575 marked to the left of df = 11.]
4. The test statistic 9.04 is not less than the critical value 4.575. Do not reject the null
hypothesis.
5. The data is not sufficient to reject a standard deviation equal to 15.
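The arithmetic in the stolen-cars example can be confirmed with a short calculation. This sketch recomputes the chi-square test statistic and applies the left-tailed rule; since Python's standard library has no chi-square table, the critical value 4.575 (df = 11, left-tail area 0.05) is copied from the example. The variable names are illustrative.

```python
# Step 1: H0: sigma = 15, H1: sigma < 15 (left-tailed test)
sigma0 = 15.0   # hypothesized standard deviation
s = 13.6        # sample standard deviation
n = 12          # number of years sampled

# Step 2: chi-square test statistic; note the variances (squared values) are used
df = n - 1
chi_sq = df * s**2 / sigma0**2

# Step 3: tabled left-tailed critical value for df = 11, alpha = 0.05 (from the example)
critical_value = 4.575

# Step 4: left-tailed rule, reject only if the statistic is at or below the c.v.
reject_h0 = chi_sq <= critical_value

print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject_h0}")
```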
D. TI-83 Calculator: P-values for the Chi-Square Statistics
• For left tailed tests, the test statistic is left of df and p is the area from zero to the test
statistic or the area on the left or
o p = χ²cdf(0, test statistic, df)
• For right tailed tests, the test statistic is right of df and p is the area from the test value
to the end of the right tail or
o p = χ²cdf(test statistic, 1000, df)
• For two tailed tests, first determine if the test value is to the right or left of the df
o If the test value is less than df, p = 2 · χ²cdf(0, test statistic, df)
o If the test value is greater than df, p = 2 · χ²cdf(test statistic, 1000, df)
Example:
A random sample of 20 different kinds of doughnuts had the calories listed below. At α
= 0.01, is there sufficient evidence to conclude that the standard deviation of calorie
content is greater than 20 calories? Use the critical value method.
290 270 320 260 260 310 220 200 300 250
310 250 310 270 270 210 250 260 230 300
Use your calculator to find the sample standard deviation. s = 35.11
1. H0 : σ = 20   H1 : σ > 20
2. $$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(20 - 1) \cdot 35.11^2}{20^2} = 58.55$$
3. The c.v. is 36.191 with d.f. = 19.
4. For a right tailed test, the c.v. is to the right of degrees of freedom. Since the test
statistic is further to the right than the c.v., reject the null hypothesis.
5. The data is sufficient to support a standard deviation greater than 20.
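The doughnut example starts from raw data, so the calculator step (finding s) and the test statistic can both be reproduced in code. The sketch below assumes the null value σ = 20 implied by the question and copies the tabled critical value 36.191 from step 3; it uses only Python's standard library, and the variable names are illustrative.

```python
from statistics import stdev

# Calorie counts for the 20 doughnuts in the example
calories = [290, 270, 320, 260, 260, 310, 220, 200, 300, 250,
            310, 250, 310, 270, 270, 210, 250, 260, 230, 300]

sigma0 = 20.0          # hypothesized standard deviation (H0: sigma = 20, H1: sigma > 20)
n = len(calories)
s = stdev(calories)    # sample standard deviation (divides by n - 1)

# Chi-square test statistic with n - 1 degrees of freedom
df = n - 1
chi_sq = df * s**2 / sigma0**2

# Tabled right-tailed critical value for df = 19, alpha = 0.01 (from the example)
critical_value = 36.191

# Right-tailed rule: reject when the statistic is at or beyond the c.v.
reject_h0 = chi_sq >= critical_value

print(f"s = {s:.2f}, chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject_h0}")
```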
Question 19:
The manager of a large company is concerned about the variability of the time that it takes
a telephone call to be transferred to the correct office in her company. A sample of 15 calls
is selected, and the transfer time is recorded. The standard deviation of the sample is 1.8
minutes. At α = 0.01, test if the population standard deviation is more than 1.2 minutes.
Use the P-value method.
Question 20:
The data below is a random sample of home run totals for National League Champions from
1938 to 2001. At the 0.05 level of significance, is there sufficient evidence to conclude that
the variance is smaller than 70? Use the prob-value method and the cv method.
34 44 47 43 43 40 23
39 36 41 50 47 42 45
VI. Summary
• A statistical hypothesis is a conjecture about a population mean, proportion, or variance.
• There are two hypotheses: the null hypothesis states that there is no difference (=) and the alternative hypothesis specifies a difference (<, >, or ≠).
• Researchers compute a test value from the sample data to decide whether the null hypothesis should or should not be rejected.
• If the null hypothesis is rejected, the difference between the population parameter and the sample statistic is said to be significant.
• The difference is determined to be significant when either:
   • the test value falls in the critical region, or
   • the p-value is less than or equal to α, the level of significance of the test.
• The significance level of a test is the probability of committing a type I error.
• The significance level is usually specified in the problem. The default value for the level of significance is 0.05.
• A type I error occurs when the null hypothesis is rejected when it is true.
• A type II error occurs when the null hypothesis is not rejected when it is false.
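The statement that the significance level is the probability of a type I error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population values (mean 100, standard deviation 15), the sample size, and the function name type_one_error_rate are all hypothetical. Every sample is drawn with the null hypothesis true, so each rejection is a type I error.

```python
import math
import random
from statistics import NormalDist


def type_one_error_rate(alpha=0.05, n=25, trials=10000, seed=1):
    """Fraction of two tailed z tests that reject a true null hypothesis."""
    rng = random.Random(seed)
    mu0, sigma = 100.0, 15.0                      # hypothetical population, H0 is true
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / math.sqrt(n))
        if abs(z) >= z_crit:                      # test value in the critical region
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    print(type_one_error_rate())
```

The returned fraction should land close to alpha, which is the sense in which α is the type I error rate.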
Answer: Question 1
A researcher is concerned that the mean weight of honey bees is more than 11g.
a) H0 : μ = 11
H1 : μ ≠ 11
b) H0 : μ = 11
H1 : μ > 11
c) H0 : μ = 11
H1 : μ < 11
d) I do not know.
The correct choice is b: "more than 11g" translates into H1 : μ > 11.
Answer: Question 2
The variance in the amount of salt in granola is different from 3.42mg.
a) H0 : σ = 3.42
H1 : σ ≠ 3.42
b) H0 : σ ≠ 3.42
H1 : σ = 3.42
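c) H0 : σ² = 3.42
H1 : σ² ≠ 3.42
d) H0 : σ² ≠ 3.42
H1 : σ² = 3.42
The correct choice is c: the statement is about the variance σ², and "different from" translates into ≠ in the alternate hypothesis.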
Answer: Question 3
3% of homes remain unsold after 6 months in Newark, NJ. It is assumed that homes in the vicinity
of the college have a faster sales rate. What null and alternate hypotheses should be used to test
this theory?
a) H0 : p = .03
H1 : p ≠ .03
b) H0 : p = .03
H1 : p > .03
c) H0 : p = .03
H1 : p < .03
d) H0 : p < .03
H1 : p = .03
The correct choice is c: a faster sales rate means a smaller proportion of homes remain unsold, so H1 : p < .03.
Answer: Question 4
A company that produces snack food uses a machine to package the product in 454 gram
quantities. They wish to test the machine.
H0 : μ = 454 H1 : μ ≠ 454
Answer: Question 5
The NCAA believes 57.6% of football injuries occur during practice. A head trainer thinks this is too
high.
H0 : p = .576 H1 : p < .576
Answer: Question 6
A manufacturer of machine parts must have a standard deviation in measurement not more than
0.32 mm. Use a sample of 25 parts to test if the standard deviation is outside of the requirement.
H0: σ = 0.32 H1: σ > 0.32
Answer: Question 7
Using the z-table, find the critical value for a right tailed test with α = .025.
Since this is a right tailed test, the alternate hypothesis has a “greater than” symbol and the rejection region is on the right. Locate 1 - .025 = .9750 inside the normal probability table. Moving to the left and then moving to the top of the table, the z value is 1.96. The c.v. is 1.96 and the rule would be to reject the null hypothesis if the test value is greater than 1.96.
Answer: Question 8
Using the z-table, find the critical value for a left tailed test with α = .10.
Since the test is left tailed, the alternate hypothesis contains a “less than” symbol and the rejection region is on the left. Locate 0.100 inside the normal probability table. Move along the row to the left to -1.2. Again, starting at 0.100, move to the top of the column to .08. The critical value is -(1.2 + .08) = -1.28. The rule would be to reject the null hypothesis if the test value is less than or equal to -1.28.
Answer: Question 9
Using the z-table, find the critical values for a two tailed test with α = .05.
For a two tailed test, the alternate hypothesis will be “not equal to” and the rejection region is on the right and left, with each area 0.05/2 = 0.025. To find the critical value on the left, locate 0.025 inside the normal probability table. The left critical value is -1.96. Using symmetry, the critical value on the right is +1.96. The rule would be to reject the null hypothesis for any test value greater than or equal to 1.96 and also for any test value less than or equal to -1.96. CV = ± 1.96
Answer: Question 10
An energy official thinks that the oil output per well has declined from the 1998 level of 11.1
barrels per day. Use the .001 level of significance. State the null and alternate hypotheses. State
the critical value or values, and the rejection rule.
Since the energy official “thinks that the oil output per well has declined”, the hypotheses are:
H0: μ = 11.1
H1: μ < 11.1
The test is left tailed since the alternate hypothesis has the “<” symbol. The rejection area in the
one tail is equal to the level of significance or .001.
Use the Normal Probability table backwards to find the z value with 0.001 in the left tail. There are
several z values with left tail probability equal to 0.001. Use the center z value or -3.09. The rule is
to reject the null hypothesis if the test statistic is less than -3.09.
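The table lookups in the answers to Questions 7 through 10 can be cross-checked in Python with the standard library's NormalDist. The helper name z_critical is hypothetical; it is a sketch of reading the normal table "backwards", not a replacement for the table method.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Critical z value(s) for tail = 'left', 'right', or 'two'."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        return z.inv_cdf(alpha)          # area alpha to the left
    if tail == "right":
        return z.inv_cdf(1 - alpha)      # area 1 - alpha to the left
    upper = z.inv_cdf(1 - alpha / 2)     # alpha / 2 in each tail
    return (-upper, upper)


if __name__ == "__main__":
    print(round(z_critical(0.025, "right"), 2))                  # Question 7
    print(round(z_critical(0.10, "left"), 2))                    # Question 8
    print(tuple(round(v, 2) for v in z_critical(0.05, "two")))   # Question 9
    print(round(z_critical(0.001, "left"), 2))                   # Question 10
```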
Answer: Question 11
Concerned that adult females under 51 years are not getting adequate iron intake, a statistician
selected a random sample of women under 51 years old and found the listed iron intake in
milligrams for a 24-hour period. Use a 5% level of significance and test his concern. Assume RDA
for iron is 18mg and iron intake is normally distributed.
15.0 13.1 18.1 12.1 14.4 11.5 14.6 10.9
18.2 18.3 15.0 11.0 15.6 11.6 19.8 20.7
Use a calculator to find: X̄ = 14.99, s = 3.22, n = 16
1. H0 : μ = 18.0
H1 : μ < 18.0 (claim)
“not getting adequate iron intake” translates into the alternate hypothesis μ < 18.0
2. Use t since the population standard deviation is not given and the distribution is normal.
t = (X̄ - μ) / (s / √n) = (14.99 - 18.00) / (3.22 / √16) = -3.74
3. Use the t-table with df = 16 - 1 = 15, because the population standard deviation σ is not given, but the variable is normally distributed. c.v. = -1.753. Since this is a left tailed test, the rejection region is to the left of -1.753.
4. Since the test value -3.74 is to the left of c.v. = -1.753, it is in the rejection region. Reject the null hypothesis.
5. The data is sufficient to support the claim that the average daily iron intake is less than 18 mg for women under 51 years.
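The t test value in step 2 can be reproduced with a few lines of Python. The function name t_statistic is hypothetical, and the sketch deliberately uses the rounded summary values from the text (14.99 and 3.22); working from the unrounded data can shift the last decimal place.

```python
import math


def t_statistic(x_bar, mu0, s, n):
    """One-sample t test value: (x_bar - mu0) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / math.sqrt(n))


# Iron intake example: X-bar = 14.99, hypothesized mean 18.0, s = 3.22, n = 16
t_iron = t_statistic(14.99, 18.0, 3.22, 16)

critical_value = -1.753            # left tail, alpha = 0.05, df = 15 (from the t-table)
reject_null = t_iron < critical_value

if __name__ == "__main__":
    print(round(t_iron, 2), reject_null)
```

The same function applies to the other one-mean answers that follow, since each substitutes its own X̄, μ, s, and n into the same formula.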
Answer: Question 12
To see if young men ages 8 through 17 years spend an average of $24.44 per shopping trip to a local mall, the manager surveyed 33 young men and found the average amount spent per visit was $22.97. The standard deviation of the sample was $3.70. Assume that the amount spent is normally distributed. At α = 0.02, can it be concluded that the average amount spent at a local mall is not equal to the national average of $24.44?
1. H0 : μ = $24.44    H1 : μ ≠ $24.44
2. t = (X̄ - μ) / (s / √n) = (22.97 - 24.44) / (3.70 / √33) = -2.28
3. c.v. = ± 2.449    df: 32.
(Figure: t curve centered at 0 with rejection regions to the left of -2.449 and to the right of 2.449.)
4. Do not reject the null hypothesis. The test statistic -2.28 is not less than -2.449, nor is it greater than +2.449.
5. The data is not sufficient to reject a mean spending per shopping trip of $24.44 for young men ages 8 through 17 years.
Answer: Question 13
Last year the average cost of a concert ticket was $54.80. This year, a random sample of 15 recent
concerts had an average price of $62.30 with a variance of $90.25. Assume the price of concert
tickets is normally distributed. At the 0.05 level of significance, can it be concluded that the cost
has increased? Use the critical value and the P-value method.
n = 15, X̄ = $62.30, s² = 90.25
Critical Value Method:
1. H0 : μ = $54.80 H1 : μ > $54.80 (claim)
2. t = (X̄ - μ) / (s / √n) = (62.30 - 54.80) / (√90.25 / √15) = 3.06
3. Use the t table to find the critical value with 0.05 in the one tail column and df 14
c.v. = 1.761
(Figure: t curve with the critical value 1.761 and the test statistic 3.06 marked in the right tail.)
4. Since the test statistic (3.06) is further out in the tail past the cv, reject the null hypothesis.
5. The data supports an increase in the average cost of concert tickets.
P-value Method:
1. H0 : μ = $54.80 H1 : μ > $54.80 (claim)
2. t = (X̄ - μ) / (s / √n) = (62.3 - 54.8) / (√90.25 / √15) = 3.06
3. Find the area in the tail to the right of 3.06 (the test statistic) with df = 15 - 1 = 14. The closest value on the chart is 2.977. Since the area in the tail to the right of 2.977 is .005, the area in the tail to the right of 3.06 would have to be less than .005, so p < .005. With the TI-83, use p = tcdf(3.06, 100, 14).
4. Since P < .005, it is less than the level of significance which is .05. Reject the null hypothesis
since p is less than alpha.
5. The data supports an average price greater than $54.80.
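The tail area found with tcdf can also be approximated in Python. This is a sketch under stated assumptions: the names t_pdf and t_right_tail are hypothetical, and the area is obtained by numerically integrating the t density with Simpson's rule, which is a swapped-in technique and not how the calculator or the table produces its values.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_right_tail(t, df, steps=2000):
    """P(T > t): one half minus the Simpson's-rule area between 0 and |t|."""
    upper = abs(t)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3             # area beyond |t| in one tail
    return tail if t >= 0 else 1.0 - tail


if __name__ == "__main__":
    p = t_right_tail(3.06, 14)             # right tailed test, df = 14
    print(p < 0.005, p < 0.05)
```

For a two tailed test the area is doubled, for example 2 * t_right_tail(0.77, 11).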
Answer: Question 14
A study published in the American Journal of Psychiatry measured the effect of alcohol on the developing hippocampus, the portion of the brain responsible for long term memory. To determine if the volume of the hippocampus is less than the normal 9.02 cm³ for adolescents who abuse alcohol, the researchers used a sample of 12 adolescents with alcohol abuse problems. The average volume of their hippocampus was 8.10 cm³ with a standard deviation of 0.7 cm³. Use the 0.01 level of significance.
1. H0 : μ = 9.02    H1 : μ < 9.02 (claim)
2. t = (X̄ - μ) / (s / √n) = (8.10 - 9.02) / (0.7 / √12) = -4.55
*Use t since the population standard deviation is not stated and n = 12 *
3. Use the row with df = 12 - 1 = 11 to find the area less than -4.55. -4.55 is not on the table. -3.106 is the number closest to it.
P = P(t < -4.55) < P(t < -3.106) = .005. So, the P-value is less than .005.
4. Reject the null hypothesis since P < .005 < α = .01.
5. The data supports a decrease in the size of the long term memory portion of the brain for adolescents who abuse alcohol.
Answer: Question 15
The U.S. Golf Association requires that golf balls have a diameter that is 1.68 inches. An engineer for the USGA wishes to discover whether Maxfli XS golf balls have a mean diameter different from 1.68 inches. A random sample of Maxfli XS golf balls was selected. Assume the diameters are normally distributed and test with a 0.10 type I error rate. Conduct the test using the P-value method.
1.683 1.684 1.677 1.684 1.681 1.673
1.685 1.685 1.678 1.682 1.686 1.674
Using a calculator find: X̄ = 1.6810, s = 0.0045, n = 12
1. H0: μ = 1.68 (claim)    H1: μ ≠ 1.68
2. t = (X̄ - μ) / (s / √n) = (1.6810 - 1.68) / (0.0045 / √12) = 0.77
*Use t since the population standard deviation is not stated, but the diameters are normally distributed with n = 12.
3. Since this is a two tailed test, find twice the area greater than 0.77, or P = 2P(t > 0.77) = 2tcdf(0.77, 1000, 11) = .4575
4. Fail to reject the null hypothesis since P = .4575 > α = .10.
5. The data is not sufficient to reject an average diameter equal to 1.68 inches.
Answer: Question 16
A Gallup Poll stated that women visit their physician an average of 5.8 times a year. The number of
physician visits for 2007 is listed below for 20 randomly selected women:
3 8 2 0 1 5 3 6 7 4
2 2 9 1 4 3 6 4 6 1
Assume that the number of physician visits is normally distributed. At the 0.05 level of significance
can we conclude that the Gallup Poll average is correct? Use the p-value method.
μ = 5.8, X̄ = 3.85, n = 20, s = 2.52, α = 0.05
1. H0 : μ = 5.8    H1 : μ ≠ 5.8
2. t = (X̄ - μ) / (s / √n) = (3.85 - 5.8) / (2.52 / √20) = -3.46
3. For a two tailed test, be sure to double the area past the test statistic.
P = 2P(t < -3.46)
With the TI-83: P = 2tcdf(-100, -3.46, 19) = .0026
With table F and df = 19, the test statistic (3.46 in absolute value) is larger than the largest value in that row. That value, 2.861, is in the two tailed .01 column, so P < .01.
4. For the calculator, since P = .0026 is less than α = .05, reject the null hypothesis. For table work, since P < .01, it is also less than α = .05. Reject the null hypothesis.
5. The data is sufficient to reject an average number of yearly physician visits equal to 5.8 for females.
Answer: Question 17
It has been reported that 40% of the adult population over 60 use e-mail. From a random sample
of 180 adults, 65 used e-mail. At α = 0.01, is there sufficient evidence to conclude that the
proportion differs from 40%? Use the P-value method.
1. H0 : p = 0.40    H1 : p ≠ 0.40
2. p̂ = 65/180 = 0.3611, p = 0.40, q = 0.60
z = (p̂ - p) / √(pq/n) = (0.3611 - 0.40) / √((0.40)(0.60)/180) = -1.07
*Note: c.v. = ±2.58
3. For a two tailed test, the p-value is twice the area or probability in the tail past the test statistic.
Table: P = 2P (z < -1.07) = 2(.1423) = .2846
TI-83: P = 2 normalcdf(-100, -1.07) = 0.2846
4. Since p = 0.2846 is not less than 0.01, do not reject the null hypothesis.
5. The data does not support a proportion using e-mail different from 40%.
Answer: Question 18
USA TODAY reported that 63% of Americans will take a vacation this summer. In a survey of 143
Americans 85 were planning to vacation this summer. Use this data to test the USA TODAY report
at the 0.05 level of significance with the critical value method.
p̂ = 85/143 = 0.5944, p = 0.63, q = 0.37
1. H0 : p = 0.63 (claim) H1 : p ≠ 0.63
2. z = (p̂ - p) / √(pq/n) = (0.5944 - 0.63) / √((0.63)(0.37)/143) = -0.88
3. C.V. = ± 1.96
4. -0.88 is not located in the rejection region which is further out in the tail from the c.v. Do not
reject the null hypothesis.
5. The data is not sufficient to reject the claim that 63% of Americans will vacation this summer.
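The z test values in the two proportion answers above (Questions 17 and 18) can be checked with a short Python sketch. The function names are hypothetical. The test value is rounded to two decimals before the P-value is computed, which mirrors reading the normal table and is an assumption made here to match the text.

```python
import math
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """z test value for one proportion: (p_hat - p0) / sqrt(p0 * q0 / n)."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def two_tailed_p(z):
    """Twice the area in the tail beyond z under the standard normal curve."""
    return 2 * NormalDist().cdf(-abs(z))


# Question 17: 65 of 180 adults use e-mail, H0: p = 0.40
z17 = round(one_proportion_z(65, 180, 0.40), 2)
p17 = two_tailed_p(z17)

# Question 18: 85 of 143 Americans plan a vacation, H0: p = 0.63
z18 = round(one_proportion_z(85, 143, 0.63), 2)

if __name__ == "__main__":
    print(z17, round(p17, 4), z18)
```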
Answer: Question 19
The manager of a large company is concerned about the variability of the time that it takes a
telephone call to be transferred to the correct office in her company. A sample of 15 calls is
selected, and the transfer time is recorded. The standard deviation of the sample is 1.8 minutes.
At α = 0.01, test if the population standard deviation is more than 1.2 minutes. Use the P-value
method.
α = 0.01, s = 1.8, n = 15, d.f. = 14
1. H0 : σ = 1.2    H1 : σ > 1.2
2. χ² = (n - 1)s² / σ² = (14)(1.8)² / (1.2)² = 31.5
3. This is a right tailed test, and the test statistic 31.5 is to the right of df = 14. Use the right tail.
4. P-value = 0.0047 < 0.01. Reject the null hypothesis.
5. The data supports a standard deviation of more than 1.2 minutes.
Answer: Question 20
The data below is a random sample of home run totals for National League Champions from 1938
to 2001. At the 0.05 level of significance, is there sufficient evidence to conclude that the variance
is smaller than 70? Use the prob-value method and the cv method.
34 44 47 43 43 40 23
39 36 41 50 47 42 45
Critical Value Answer:
1. H0 : σ² = 70    H1 : σ² < 70
2. χ² = (n - 1)s² / σ² = (13)(45.38) / 70 = 8.43
3. cv = 5.892
The rejection region is to the left of the cv since this is a left tailed test.
4. Fail to reject the null hypothesis since the test statistic 8.43 is not in the rejection region.
5. The data does not support a variance in the number of home runs smaller than 70.
P-value Answer:
1. H0 : σ² = 70    H1 : σ² < 70
2. χ² = (n - 1)s² / σ² = (13)(45.38) / 70 = 8.43
3. Since this is a left tailed test, the p-value is the area between 0 and the test statistic, or p = χ²cdf(0, 8.43, 13).
For table answers use the df line for 13:
p > .10 and is not less than α = 0.05
4. Since p is not less than the level of significance of 0.05, do not reject the null hypothesis.
5. The data does not support a variance in the number of home runs smaller than 70.