Chapters 22 and 23
OVERVIEW
a. Ch. 10, 11: Relationship between two measurement variables.
Ho: There is no linear relationship between the two variables (correlation is 0)
Ha: There is a linear relationship between the two variables (correlation is not equal to 0)
b. Ch. 12, 13: Relationship between two categorical variables.
Ho: There is no relationship between the two variables
Ha: There is a relationship between the two variables
c. Ch. 20 & 21: Confidence intervals for 1-proportion, 1-mean, 2-means.
d. Ch. 22 & 23: Hypothesis tests for 1-proportion, 1-mean, or 2-means, where we want to test the validity of some claimed or “hypothesized” value.
4 Steps to conducting a hypothesis test:
1. Determine the null hypothesis (Ho) and the alternative hypothesis (Ha).
2. Collect data and summarize the results using a single value called a “Test Statistic”.
3. Find the p-value, i.e. how unlikely a test statistic at least as extreme as the one observed would be if Ho were true.
4. Make a decision by comparing the p-value to some level of significance (0.05).
STEP 1:
Test for 1-proportion:
Ho: P = Po
P = symbol for population proportion, i.e. the parameter
Po = some hypothesized value (we will put in a value for Po based on the question)
Alternative: Three choices (select only one)
Ha: P ≠ Po or Ha: P < Po or Ha: P > Po
Test for 1-mean:
Ho: μ = μ0
μ = symbol for population mean (parameter)
μ0 = hypothesized value (again we will substitute in some number)
Alternative: Three choices (select only one)
Ha: μ ≠ μ0 or Ha: μ < μ0 or Ha: μ > μ0
Test for 2-means:
Ho: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0 or Ha: μ1 - μ2 > 0 or Ha: μ1 - μ2 < 0
The null hypothesis is simply that there is no difference between the two means.
STEP 2:
Test Statistic = (Sample statistic - hypothesized value) / SE
The sample statistic is one of:
P-hat (sample proportion)
X-bar (sample mean)
X-bar d (difference between the 2 sample means)
The hypothesized value is simply the Po or μ0 value (i.e. the value that is claimed to be true); for the 2-means test it is 0.
Test Statistic for 1-proportion:
$$Z = \frac{\hat{P} - P_o}{\sqrt{\dfrac{P_o (1 - P_o)}{N}}}$$
Test Statistic for 1-mean:
$$Z = \frac{\bar{X} - \mu_0}{SD / \sqrt{N}}$$
Test Statistic for 2-means:
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{SEM_1^2 + SEM_2^2}}$$
where SEM is SD/√N for group 1 and group 2, respectively.
For simplicity, we treat each of these test statistics for 1-proportion, 1-mean, and 2-means as following a standard normal or Z distribution (Table 8.1). Also, to remind us that this is a test statistic, we may want to use the notation Zstat.
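The three test statistics above can be sketched in Python as follows. This is only an illustration: the function and argument names are made up for this sketch, and each function simply follows the pattern (sample statistic - hypothesized value) / SE.

```python
import math

def z_one_proportion(p_hat, p0, n):
    """Z for 1-proportion: (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    se = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se

def z_one_mean(x_bar, mu0, sd, n):
    """Z for 1-mean: (x_bar - mu0) / (sd / sqrt(n))."""
    sem = sd / math.sqrt(n)
    return (x_bar - mu0) / sem

def z_two_means(x_bar1, sd1, n1, x_bar2, sd2, n2):
    """Z for 2-means: (x_bar1 - x_bar2) / sqrt(SEM1^2 + SEM2^2)."""
    sem1 = sd1 / math.sqrt(n1)  # standard error of the mean, group 1
    sem2 = sd2 / math.sqrt(n2)  # standard error of the mean, group 2
    return (x_bar1 - x_bar2) / math.sqrt(sem1 ** 2 + sem2 ** 2)
```

Each function returns the Zstat value that is carried into the p-value step.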
STEP 3:
Finding the p-value is based on the alternative hypothesis.
1. If Ha uses “<” or “less than”, the p-value is found as follows:
   1. From Table 8.1, look for the z-value closest to Zstat; the p-value is simply the proportion below that value.
   (NOTE: Picture a bell-shaped graph with the lesser side shaded.)
2. If Ha uses “>” or “greater than”, the p-value is found as follows:
   1. From Table 8.1, look for the z-value closest to Zstat.
   2. The p-value is simply one minus the proportion below that value.
   (NOTE: Picture a bell-shaped graph with the greater side shaded.)
3. If Ha is “not equal”, the p-value is found as follows:
   1. From Table 8.1, find the z-value closest to the absolute value (i.e. positive value) of Zstat.
   2. Take one minus the proportion below that value.
   3. Double that result to get the p-value.
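The three p-value rules can be sketched in Python. The sketch assumes the "proportion below" is taken from the standard normal curve and computes it with `math.erf` instead of looking up the nearest entry in Table 8.1, so results can differ slightly from table-based answers. The function names and the labels "less", "greater", and "not equal" are choices made for this sketch only.

```python
import math

def proportion_below(z):
    """Proportion of the standard normal curve below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def p_value(z_stat, alternative):
    """p-value for Ha of type 'less', 'greater', or 'not equal'."""
    if alternative == "less":
        # Proportion below the test statistic.
        return proportion_below(z_stat)
    if alternative == "greater":
        # One minus the proportion below the test statistic.
        return 1 - proportion_below(z_stat)
    if alternative == "not equal":
        # Double the area to the right of |Zstat|.
        return 2 * (1 - proportion_below(abs(z_stat)))
    raise ValueError("alternative must be 'less', 'greater', or 'not equal'")
```

For example, `p_value(3.00, "greater")` gives roughly the 0.0013 that comes from taking 1 minus 0.9987.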
STEP 4:
Compare the p-value to 0.05. If the p-value is less than 0.05, reject the null hypothesis and state in words that the evidence supports the alternative. If the p-value is greater than 0.05, fail to reject the null hypothesis and state in words that there is not enough evidence to support the alternative.
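The decision rule is a single comparison, sketched below. The function name `decide` is made up for this sketch, and treating a p-value of exactly 0.05 as "fail to reject" is an assumption, since the rule above only covers the less-than and greater-than cases.

```python
def decide(p_value, alpha=0.05):
    """Return the Step 4 decision for a given p-value."""
    if p_value < alpha:
        return "reject Ho: the evidence supports the alternative"
    # Assumption: a p-value equal to alpha is treated as not significant.
    return "fail to reject Ho: not enough evidence to support the alternative"
```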
EXAMPLES:
1-proportion: A politician asserts that only 40% of Pennsylvanians are in favor of changing the way that presidential primaries are held. An anchor at Fox News disputed this, saying it was certainly more than 40%. A poll of 850 people showed that 382, or 45%, are in favor of changing the way that presidential primaries are conducted. Who do you think is right: the politician or the Fox News anchor?
Ho: p = 0.40 Ha: p > 0.40
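$$Z = \frac{\text{sample proportion} - \text{hypothesized value}}{\sqrt{\dfrac{\text{hypothesized value} \times (1 - \text{hypothesized value})}{n}}} = \frac{0.45 - 0.40}{\sqrt{\dfrac{0.40 \times (1 - 0.40)}{850}}} = \frac{0.05}{\sqrt{\dfrac{0.24}{850}}} = 2.94$$
(The value 2.94 is obtained with the unrounded sample proportion 382/850 ≈ 0.4494; using the rounded 0.45 gives about 2.98.)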
Since Ha is “greater than”, we find the p-value by taking 1 minus the “proportion below the test statistic”, for which we use 3.00 from Table 8.1 since this is the closest entry to 2.94. This results in a p-value of 1 minus 0.9987, which equals 0.0013. Since this is less than 0.05, we reject Ho and conclude that more than 40% of Pennsylvanians are in favor of changing the way presidential primaries are conducted.
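As a check on the arithmetic, the 1-proportion example can be recomputed in Python from the raw poll counts. This is a sketch under one stated assumption: the proportion below Zstat is computed from the standard normal curve with `math.erf` instead of being read from the nearest entry in Table 8.1. Variable names are chosen for this sketch.

```python
import math

def proportion_below(z):
    """Proportion of the standard normal curve below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n = 850          # people polled
in_favor = 382   # people in favor of changing the primaries
p0 = 0.40        # hypothesized proportion from Ho

p_hat = in_favor / n                              # unrounded sample proportion
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)   # test statistic
p_val = 1 - proportion_below(z)                   # Ha is "greater than"

print(f"p_hat = {p_hat:.4f}, Z = {z:.2f}, p-value = {p_val:.4f}")
print("reject Ho" if p_val < 0.05 else "fail to reject Ho")
```

Because Zstat is slightly below 3.00, the p-value computed this way is a little larger than the table-based 0.0013, but it is still well under 0.05 and the decision is the same.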
1-mean: Stat department instructors believe the average number of hours Stat 100 students spend studying during any week is less than 4 hours. From a survey of 100 students, the mean number of hours students reported studying for Stat 100 was 3.7 hours with a standard deviation of 4 hours.
Ho: μ = 4 Ha: μ < 4
$$Z = \frac{\bar{X} - \mu_0}{SD / \sqrt{N}} = \frac{3.7 - 4}{4 / \sqrt{100}} = \frac{-0.3}{4 / 10} = \frac{-0.3}{0.4} = -0.75$$
Since Ha is “less than”, we find the p-value by using Table 8.1 and getting the “proportion below our test statistic”, i.e. the proportion below -0.75 (use -0.74 since it is the closest entry to -0.75), which is 0.23. Since this is greater than 0.05, we fail to reject Ho and conclude that there is not enough evidence to say that the mean number of hours students study for Stat 100 is less than 4 hours a week. Thus 4 hours is a plausible number.
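The 1-mean example can be recomputed the same way. The sketch below assumes the proportion below Zstat is taken from the standard normal curve via `math.erf`, not from a table lookup, so the p-value differs slightly from the table value of 0.23. Variable names are chosen for this sketch.

```python
import math

def proportion_below(z):
    """Proportion of the standard normal curve below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

x_bar = 3.7   # sample mean hours studied
mu0 = 4       # hypothesized mean from Ho
sd = 4        # sample standard deviation
n = 100       # students surveyed

z = (x_bar - mu0) / (sd / math.sqrt(n))   # test statistic
p_val = proportion_below(z)               # Ha is "less than"

print(f"Z = {z:.2f}, p-value = {p_val:.4f}")
print("reject Ho" if p_val < 0.05 else "fail to reject Ho")
```

The p-value is far above 0.05 either way, so the decision to fail to reject Ho does not depend on which table entry is used.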
2-means: Students were asked: How many dates do you think it is appropriate to go on with someone before engaging in sexual activity? Several sections of Stat 200 were surveyed to compare males and females on this issue, with the research question asking whether there is a difference between the genders in the average number of dates one should go on prior to engaging in sexual activity.
Ho: μF - μM = 0 Ha: μF - μM ≠ 0
The results were as follows:
          N     Mean   StDev   SE Mean
Females   120   16.2   19.8    1.8
Males     71    8.2    26.0    3.1
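$$Z = \frac{(\bar{X}_F - \bar{X}_M) - 0}{\sqrt{SEM_F^2 + SEM_M^2}} = \frac{16.2 - 8.2}{\sqrt{1.8^2 + 3.1^2}} = \frac{8}{3.58} = 2.24$$
(With the rounded SE Mean values from the table, 8/3.58 comes to 2.23; the value 2.24 matches what is obtained when each standard error is computed as StDev/√N without rounding.)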
Using Table 8.1 and with Ha being “not equal”, we find the p-value by finding the area to the right of the absolute value of the test statistic (this is 2.24). This is found by taking 1 minus the “proportion below” 2.24 (use 2.33 since it is the closest entry to 2.24), which is 1 minus 0.99, equaling 0.01, and then multiplying this value by 2, which results in a p-value of 0.02. Since this is less than 0.05, we reject Ho and conclude that there is a difference between the genders in the mean number of dates one should go on prior to engaging in sexual activity.
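The 2-means example can also be recomputed from the summary table. This sketch makes two assumptions: each standard error is computed from StDev and N instead of using the rounded SE Mean column, and the proportion below Zstat is taken from the standard normal curve via `math.erf` instead of from the nearest table entry. Variable names are chosen for this sketch.

```python
import math

def proportion_below(z):
    """Proportion of the standard normal curve below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Summary statistics from the results table.
n_f, mean_f, sd_f = 120, 16.2, 19.8   # females
n_m, mean_m, sd_m = 71, 8.2, 26.0     # males

sem_f = sd_f / math.sqrt(n_f)   # about 1.8
sem_m = sd_m / math.sqrt(n_m)   # about 3.1

z = (mean_f - mean_m - 0) / math.sqrt(sem_f ** 2 + sem_m ** 2)
p_val = 2 * (1 - proportion_below(abs(z)))   # Ha is "not equal"

print(f"Z = {z:.2f}, p-value = {p_val:.4f}")
print("reject Ho" if p_val < 0.05 else "fail to reject Ho")
```

The p-value computed this way is somewhat larger than the table-based 0.02, because 2.33 is further out in the tail than 2.24, but it remains below 0.05 and the conclusion is unchanged.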
Determine whether the following statements represent a null or alternative hypothesis:
1. There is no difference between the proportion of males who smoke and females who smoke in America.
2. The average time to graduate for undergraduate Math majors is more than the average time to graduate for undergraduate Stat majors.
3. The average price of a particular statistics textbook sold online is the same as the average price of the textbook sold at all bookstores in State College.
4. The graduation rate for female division I athletes is not equal to the 84% claimed by the NCAA.
Answers:
1. Null (i.e. no difference)
2. Alternative (more than)
3. Null (same meaning no difference)
4. Alternative (not equal)