hypothesis testing

LAB N – Chapter 9.1 and 9.2
INTRODUCTION: Two indispensable statistical decision-making tools are (i) hypothesis testing (Part One) and (ii) confidence intervals (Part Two). In this lab you will learn how to perform hypothesis tests (by calculating the standard deviation, sigma) and how to calculate confidence intervals using Excel.
Use the two tabs for each part:
“Ho – Problems” for Part One (HYPOTHESIS TESTING)
“CL – Problems” for Part Two (CONFIDENCE INTERVALS)
Each part has instructions on entering the formulas into Excel. Follow these in the two tabs above. Before going to the problems, double-check that your results agree with the example results written in red at the bottom of pages two and three. Then do the PROBLEMS on the last two pages.
Copy and paste your results into the tab “RESULTS”.
Then email the entire file to your teacher.
Always title the file and the subject line of your submission as 1N-XXX or 4N-XXX.
Part One - LESSON 9.2
HYPOTHESIS TESTING
This exercise tests a claim called the null hypothesis 𝐻𝑜, which states a value p for the population parameter. The parameter p and the size of the SRS, N, are used to find the standard deviation of a normal distribution. The study also has a level of significance 𝛼 that determines whether to reject 𝐻𝑜. The SRS then yields a statistic, “P-hat”. This is compared to the normal curve to see how probable (the P-value) a result at least this extreme would be if 𝐻𝑜 were true. If the P-value is less than the level of significance 𝛼, you reject 𝐻𝑜.
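In symbols, the quantities described above can be written as follows. This is only a summary sketch: X (the number in favor) and z (the standard score) are names taken from the Excel columns defined later in this lab.

```latex
\hat{p} = \frac{X}{N}, \qquad
\sigma = \sqrt{\frac{p\,(1-p)}{N}}, \qquad
z = \frac{\hat{p} - p}{\sigma}
% Decision rule: reject H_o when the P-value is less than \alpha.
```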
(1) Confidence Level: This is the probability that the value is correct, based on the assumption that the true distribution is normal.
(2) Margin of error: This is the range from Low to High within which the statistic falls at the given confidence level.
For example, a study could show that the true parameter is between 36% and 40% with a 95% confidence level.
Here are the cells you should define in Excel to show the P-value for the null hypothesis 𝐻𝑜 :
Note: enter your formulae into row 3 (row 2 is a 2nd row for headings)
      A   B       C     D     E    F       G     H           I
1         𝐻𝑜, p   N     𝛼     X    p-hat   S.D   Std Score   P-value
2
3         .60     100   .05   65
4
Two further columns among J, K and L carry the two-row headings “P-Value / one side” and “P-Value / two side”.
In row 1 you could enter the headings, some of which are given in the problem and some of which are calculated.
In row 3 you would enter the values given and the appropriate calculations.
Given: 𝐻𝑜, p = .60; Sample Size N = 100; Significance Level 5%; number in favor X = 65 (columns B, C, D, E)
- Note: for the first problem, the X value is left blank and the P-hat value is given.
Now enter these calculations into the next columns (spaces are not necessary, just added for clarity):
Column   Heading     Formula
F        P-hat       = E3 / C3
G        Std Dev     = SQRT (B3 * (1 - B3) / C3)
H        Std Score   = (F3 - B3) / G3
I        P-value     =VLOOKUP(M3,TableB,3)
(The P-value can be entered manually using Table B with the H3 value and interpolation. I have entered code that automatically uses a Vlookup function and then finds the most accurate value.)
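A note on how the Vlookup behaves: when the fourth argument is left out, as it is here, Excel uses an approximate match. It returns the row with the largest key that is less than or equal to the lookup value, so the first column of the table must be sorted in ascending order. The Python sketch below imitates that rule. The four table rows are a hypothetical excerpt of a standard normal table (z, area to the left of z), and the layout of the real TableB range in the workbook may differ.

```python
from bisect import bisect_right

# Hypothetical excerpt of a standard normal table: (z, area to the left of z).
# This is an assumption for illustration, not the workbook's TableB range.
TABLE = [
    (1.00, 0.8413),
    (1.01, 0.8438),
    (1.02, 0.8461),
    (1.03, 0.8485),
]


def approximate_lookup(value, table, column):
    """Imitate VLOOKUP(value, table, column) with approximate match."""
    keys = [row[0] for row in table]          # first column, sorted ascending
    index = bisect_right(keys, value) - 1     # last row whose key <= value
    if index < 0:
        # Excel would show #N/A when the value is below the first key.
        raise ValueError("value is below the first key in the table")
    return table[index][column - 1]           # VLOOKUP columns are 1-based


area_left = approximate_lookup(1.020621, TABLE, 2)
one_sided_p_value = 1 - area_left
print(area_left, round(one_sided_p_value, 3))
```

With a standard score of 1.020621 the lookup lands on the 1.02 row, so the table answer is only as fine as the table's step size.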
Below are the results for this example:
Example
𝑯𝒐, p   N     Sig level   X    p hat   sigma     Std Score   P-value
0.6     100   0.05        65   0.65    0.04899   1.020621    0.154
The entire row can now be copied repeatedly and different entries made in columns B through F (the green cells) for different problems.
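If you want to cross-check the example row outside Excel, the same steps can be sketched in Python. This is not part of the lab workbook: the function name is made up for illustration, and the normal curve comes from Python's standard library instead of Table B, so the last digits may differ slightly from a table lookup.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p0, n, x):
    """Return p-hat, standard deviation, standard score and both P-values."""
    p_hat = x / n                          # column F: = E3 / C3
    std_dev = sqrt(p0 * (1 - p0) / n)      # column G: = SQRT(B3 * (1 - B3) / C3)
    z = (p_hat - p0) / std_dev             # column H: = (F3 - B3) / G3
    # Area in one tail beyond |z| under the standard normal curve.
    one_sided = 1 - NormalDist().cdf(abs(z))
    two_sided = 2 * one_sided              # both tails together
    return p_hat, std_dev, z, one_sided, two_sided


alpha = 0.05
p_hat, std_dev, z, one_sided, two_sided = proportion_z_test(p0=0.60, n=100, x=65)
print(round(p_hat, 2), round(std_dev, 5), round(z, 6), round(one_sided, 3))
print("reject Ho" if one_sided < alpha else "do not reject Ho")
```

For the example values this should agree with the results above (0.65, 0.04899, 1.020621 and 0.154). Since 0.154 is not less than .05, 𝐻𝑜 is not rejected.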
LAB N – Part Two - LESSON 9.1
CONFIDENCE INTERVALS
When an SRS yields a given result, this statistic is referred to as “P-hat”. It is considered an estimate of the true parameter “P” of the population the sample is intended to represent. The estimate is characterized in two ways.
(3) Confidence Level: This is the probability that the value is correct, based on the assumption that the true distribution is normal.
(4) Margin of error: This is the range from Low to High within which the statistic falls at the given confidence level.
For example, a study could show that the true parameter is between 36% and 40% with a 95% confidence level.
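Written as a formula, the interval from Low to High takes the following form. This is a sketch: z* is assumed notation for the multiplier that goes with the chosen confidence level, and N and X are the sample size and the number in favor.

```latex
\hat{p} = \frac{X}{N}, \qquad
\text{Low},\ \text{High} = \hat{p} \mp z^{*}\sqrt{\frac{\hat{p}\,(1-\hat{p})}{N}}
```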
Here are the cells you could define in Excel to show the confidence interval for a given confidence level:
      A   B     C    D     E       F     G      H     I      J
1         N     X    CL    P hat   S.D   Mult   Low   High
2         100   30   .95
3
4
In row 1 you could enter the headings, some of which are given in the problem and some of which are calculated.
In row 2 you would enter the values given and the appropriate calculations.
Given: Sample Size N = 100; number in favor X = 30; Confidence Level 95% (columns B, C, D)
Now enter these calculations into the next columns (spaces are not necessary, just added for clarity):
Column   Heading      Formula
E        P-hat        = C2 / B2
F        Std Dev      = SQRT (E2 * (1 - E2) / B2)
G        multiplier   = Vlookup (D2 , Critical_values, 2)
H        Low          =E2 - G2 * F2
I        High         =E2 + G2 * F2
(The multiplier can be entered manually using Table 9.1 with the D2 value, or you can use a Vlookup function.)
The entire row can now be copied repeatedly and different entries made in columns B, C and D for different
problems.
Below are the results for this example:
      A   B     C    D      E       F            G            H            I
1         N     X    CL     P hat   Std Dev      Mult         low          high
2         100   30   0.95   0.3     0.04582576   1.95996398   0.21018317   0.38981683
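The example interval can also be cross-checked outside Excel. The Python sketch below follows the same column formulas. It is not part of the workbook: the function name is made up for illustration, and the multiplier is computed from the standard library's inverse normal function instead of being looked up in Critical_values.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(n, x, confidence_level):
    """Return p-hat, standard deviation, multiplier, low and high."""
    p_hat = x / n                                  # column E: = C2 / B2
    std_dev = sqrt(p_hat * (1 - p_hat) / n)        # column F: = SQRT(E2 * (1 - E2) / B2)
    # Two-sided critical value: (1 - CL) / 2 is left in each tail.
    multiplier = NormalDist().inv_cdf((1 + confidence_level) / 2)  # column G
    low = p_hat - multiplier * std_dev             # column H: = E2 - G2 * F2
    high = p_hat + multiplier * std_dev            # column I: = E2 + G2 * F2
    return p_hat, std_dev, multiplier, low, high


p_hat, std_dev, multiplier, low, high = proportion_confidence_interval(
    n=100, x=30, confidence_level=0.95
)
print(p_hat, round(std_dev, 8), round(multiplier, 8), round(low, 8), round(high, 8))
```

For N = 100, X = 30 and a 95% confidence level this should agree with the example row above to the digits shown there.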
PROBLEMS:
(I) HYPOTHESIS TESTING
A standard final examination in an elementary statistics course is designed to produce a mean score of 75%. Below is a list of four different classes that took the test, with their respective mean scores. Using the .05 level of significance, test the claim for each class that it is an average class (i.e. the null hypothesis is p = 0.75). If you reject 𝐻𝑜, then state whether the class is most probably above or below average. (This uses a single-tail test, but if you reject the null hypothesis, you can conclude whether the class is above or below 75%.)
Fill in all the missing values in the table
Class   number of      average grade            std dev   std score   p-value   Accept or    conclusion
        students (N)   (this is your “p hat”)                                   Reject 𝐻𝑜
A       36             85%                                                                   below
B       49             65%                                                                   below
C       100            85%                                                                   below
D       81             65%                                                                   below
Conclusions:
A)
B)
C)
D)
Extra Credit:
Using your knowledge of “two-tailed” distributions and the null hypothesis that 35% of the class likes the teacher, use the following SRSs to conclude whether this claim is valid.
Class   number of      Number that like   P Hat   std dev   std score   p-value   Accept or    conclusion
        students (N)   the Teacher (X)                                            Reject 𝐻𝑜
A       36             16                                                                      below
B       49             30                                                                      below
C       100            25                                                                      below
D       81             40                                                                      below
Conclusions:
A)
B)
C)
D)
(II) CONFIDENCE INTERVALS
Find the confidence intervals and fill in the missing values for the following SRS results.
Sample   Positive    Confidence   P hat   Std Dev   Multiplier   Confidence Interval
Size     Responses   Level                                       Low       High
50       24          .95
100      82          .99
5000     2500        .60
1000     600         .80
2000     200         .999
Extra Credit:
Using your knowledge of the “magic number”, find the sample size needed for an 80% confidence level with a margin of error of +/- 5%. Then enter the resulting interval if your positive responses were 65.
Sample   Positive    Confidence   P hat   Std Dev   Multiplier   Confidence Interval
Size     Responses   Level                                       Low       High
         65          .80