WARM - UP
EXAMPLE: A survey of randomly selected college students age 21
years and younger found that 411 of 1012 men and 535 of
1062 women enjoyed college statistics. Is there evidence that the
proportion of men who enjoy college statistics differs
from that of women? (α = 0.05)
pi = The true proportion of students who enjoy college statistics.
pm = Men and pw = Women
H0: pm = pw
Ha: pm ≠ pw
TWO Proportion z – Test
$$ z = \frac{\hat{p}_m - \hat{p}_w}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_m} + \frac{1}{n_w}\right)}} $$
where p̂ is the pooled sample proportion.
CONDITIONS
1. SRS – The data was collected randomly
2. Population of Men is ≥ 10 · (1012)
Population of Women is ≥ 10 · (1062)
3. 1012 · (0.41) ≥ 10 AND 1012 · (1 – 0.41) ≥ 10
1062 · (0.50) ≥ 10 AND 1062 · (1 – 0.50) ≥ 10
z = -4.4626
P-Value = 2 · P(Z < -4.4626) = 2 · normalcdf(-E99, -4.4626) ≈ 0
Since the P-Value is less than α = 0.05, the data IS
significant. There is STRONG evidence to REJECT H0.
The proportion of men who enjoy college statistics does differ
from that of women.
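The warm-up calculation can be checked with a short script. This is only a sketch using the Python standard library: the function name `two_proportion_z_test` is made up for illustration, and the two-sided P-value is computed with `math.erfc` in place of the calculator's normalcdf.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided two-proportion z-test using the pooled proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion: all successes over all observations.
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Two-sided P-value: 2 * P(Z > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Men: 411 of 1012, Women: 535 of 1062 (numbers from the example).
z, p_value = two_proportion_z_test(411, 1012, 535, 1062)
print(f"z = {z:.4f}, P-value = {p_value:.2e}")
```

The z value is negative because the sample proportion for men is smaller than the one for women; the P-value is far below α = 0.05.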
What is the Statistical Inference Test you would use if you
needed to determine if there was a Significant Difference
between Three or More Proportions?
Multiple Proportions
Chapter 26: Chi-Square(d) or χ² – Test
The P-Values/Probabilities for the χ² – Test come from a
family of Chi-Square Distributions, which only take Positive
Values and are all Skewed Right. A specific distribution is
specified by a parameter called the Degrees of Freedom (df).
(Degrees of Freedom = df = n – 1).
Calculating The χ² – Test Statistic:
$$ \chi^2 = \sum \frac{(\text{Observed Count} - \text{Expected Count})^2}{\text{Expected Count}} = \sum \frac{(O - E)^2}{E} $$
Calculating The χ² – P-Value:
P-Value = P(x > χ²) = χ²cdf(χ², E99, df)
Or find the appropriate line on the χ² Table.
Find the P-Value for a Chi-Square Statistic = 12.132 with df = 6.
P-Value = χ²cdf(12.132, E99, 6) = 0.0591
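For readers without a calculator at hand, the upper-tail area that χ²cdf(x, E99, df) returns can be approximated in plain Python. The helper `chi2_sf` below is a hypothetical name, not a library function; it assumes df is a positive integer and uses the closed-form series for the chi-square upper tail (a finite sum for even df, plus an `erfc` term for odd df).

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square > x) for a positive integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: e^(-y) * sum_{j=0}^{df/2 - 1} y^j / j!
        term = math.exp(-y)
        total = term
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return total
    # Odd df: erfc(sqrt(y)) + e^(-y) * sum_{j=1}^{(df-1)/2} y^(j - 1/2) / Gamma(j + 1/2)
    total = math.erfc(math.sqrt(y))
    term = math.exp(-y) * math.sqrt(y) / math.gamma(1.5)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= y / (j + 0.5)
    return total

# The slide's example: chi-square statistic 12.132 with df = 6.
p_value = chi2_sf(12.132, 6)
print(f"P-value = {p_value:.4f}")
```

With α = 0.05 this P-value falls just above the cutoff, so a statistic of 12.132 on 6 degrees of freedom would not be significant.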
Ch. 26 - Multiple Proportions
There are THREE types of Chi-Square Tests:
1. The Chi-Square Test for Goodness of Fit.
2. The Chi-Square Test for Independence.
3. The Chi-Square Test for Homogeneity.
The Chi-Square Test for
GOODNESS OF FIT
A test of whether the distribution of Counts in one
categorical variable matches the distribution
predicted (expected) by a model.
(Degrees of Freedom = df = n – 1)
Where n = number of levels (categories) of the variable.
CONDITIONS
1. SRS
2. All Expected Counts are 1 or greater.
3. No more than 20% of the Expected Counts are less than 5.
EXAMPLE: Does one month of the year stand out as having more
births than the others? If births were distributed uniformly
across the year, we would expect 1/12 of them to occur each
month. To test the claim, birth data were randomly collected
and compiled.
            JAN.  FEB.  MAR.  APR.  MAY   JUN.  JUL.  AUG.  SEP.  OCT.  NOV.  DEC.
OBS. DATA    75    87    91    88    76    98    87    74    81    70    74    83
EXP. DATA    82    82    82    82    82    82    82    82    82    82    82    82
χ² Goodness of Fit Test
H0: Births are uniformly distributed over the year.
Ha: Births are NOT uniformly distributed over the year.
or
H0: Proportion of births in Jan. = Feb. = Mar. = • • • = Dec.
Ha: The proportions of births are not all equal (not all pi’s are equal).
$$ \chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = 9.5366 $$
P-Value = χ²cdf(9.5366, E99, 11) = 0.5725
Since the P-Value is NOT less than α = 0.05, we Fail to REJECT H0.
There is no evidence that births are NOT uniformly distributed throughout the year.
CONDITIONS
1. SRS √
2. All Expected Counts are 1 or greater. √
3. No more than 20% of the Expected Counts are less than 5. √
EXP.
DATA
82 82 82 82 82 82 82 82 82 82 82 82
OR… CONDITIONS
1. SRS √
2. All Expected Counts greater than 5. √
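The arithmetic behind the births example can be reproduced in a few lines. This is a sketch with made-up variable names; it computes the expected counts under the uniform model, each month's contribution to the statistic, and the degrees of freedom.

```python
months = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
observed = [75, 87, 91, 88, 76, 98, 87, 74, 81, 70, 74, 83]

# Uniform model: each month is expected to hold 1/12 of all births.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Each month's contribution (O - E)^2 / E; the statistic is their sum.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(contributions)
df = len(observed) - 1

# The month that deviates most from the uniform model.
largest = months[contributions.index(max(contributions))]

print(f"expected per month = {expected[0]:.0f}")
print(f"chi-square = {chi_sq:.4f}, df = {df}")
print(f"largest contribution: {largest}")
```

Listing the individual contributions shows which categories drive the statistic, which is useful even when, as here, the overall test is not significant.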
EXAMPLE: The NY Civil Liberties Union feels that the NYC
Police Dept. is not hiring an ‘ethnic composition’ that represents
the city. NYC is 29.2% White, 28.3% Black, 31.5% Latino,
9.1% Asian, and 2% Other. If the NYC Police Dept. is
composed of the following, does the Union have a case?
            White    Black    Latino   Asian    Other
OBS. DATA   8560     7120     2762     1852     560
EXP. DATA   6089.4   5901.7   6569     1897.7   417.08
χ² Goodness of Fit Test
H0: The Police Dept. represents the Population of NYC.
Ha: The Police Dept. does NOT represent the
Population of NYC.
$$ \chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = 3510.3 $$
P-Value = χ²cdf(3510.3, E99, 4) ≈ 0
Since the P-Value is less than α = 0.05, the data IS significant. There
is STRONG evidence to REJECT H0. The hiring practice of the NYC
Police Dept. does NOT represent the ethnic composition of NYC.
CONDITIONS
1. SRS X
2. All Expected Counts are 1 or greater. √
3. No more than 20% of the Expected Counts are less than 5. √
EXP. DATA
6089.4 5901.7 6569 1897.7 417.08
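The expected counts in this example come from multiplying the department's total size by each group's share of the city population. The sketch below (variable names are illustrative) rebuilds them, checks the two expected-count conditions, and recomputes the statistic. The random-sample condition cannot be checked from the data alone.

```python
observed = {"White": 8560, "Black": 7120, "Latino": 2762,
            "Asian": 1852, "Other": 560}
# City shares as stated in the example.
city_share = {"White": 0.292, "Black": 0.283, "Latino": 0.315,
              "Asian": 0.091, "Other": 0.02}

n = sum(observed.values())
# Expected count = total officers * share of the city population.
expected = {group: n * city_share[group] for group in observed}

# Condition checks on the expected counts.
all_at_least_1 = all(e >= 1 for e in expected.values())
share_below_5 = sum(e < 5 for e in expected.values()) / len(expected)

chi_sq = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)
df = len(observed) - 1

for group, e in expected.items():
    print(f"{group:<7} observed = {observed[group]:>5}  expected = {e:8.1f}")
print(f"chi-square = {chi_sq:.1f}, df = {df}")
print(f"all expected >= 1: {all_at_least_1}, share below 5: {share_below_5:.0%}")
```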
Homework
Page 628:
#3, 10
Homework: Page 628:
#10
Homework
Page 628:
#3-5, 9
WARM – UP (Matching)
σ – Unknown
Name the Statistical Inference Test you would use if you need
to determine if there was a Significant Difference between…
1. _b_ The Quantitative Means of Two Independent Samples.
2. _d_ The Proportion of a Sample and a Stated Proportional Value.
3. _a_ The Quantitative Mean of a Sample and a Stated Mean.
4. _e_ The Proportions of Two Independent Samples.
5. _c_ The Quantitative Means of Two Dependent Samples.
a.) One Sample T-Test
b.) Two Sample T-Test
c.) One Sample Matched Pairs Test
d.) One Proportion z – Test
e.) Two Proportion z – Test
In a survey of 1785 students, 750 indicated that they cheat
on tests. Do the results provide good evidence that fewer
than half of students cheat? What Type of Error could result?
1. p = The true proportion of students who cheat on tests.
2. H0: p = 0.50
Ha: p < 0.50
3. One Proportion z – Test
$$ z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{0.420 - 0.50}{\sqrt{\frac{0.5(1 - 0.5)}{1785}}} $$
4. 1. SRS – Not Stated
   2. Population of students ≥ 10(1785)
   3. 1785(0.50) ≥ 10 AND 1785(1 – 0.50) ≥ 10
5. z = -6.7457
P-Value ≈ 0
6. Since the P-Value is less than α = 0.05, there is strong evidence
to REJECT H0. There is evidence that fewer than half of students
cheat on tests.
Since you Rejected H0, you could be making a TYPE I error IF, in
actuality, the true proportion of cheaters is exactly half.
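As a check on steps 3 to 5, the one-proportion z statistic and its lower-tail P-value can be computed directly. This is a sketch only: `one_proportion_z_test_lower` is a hypothetical helper name, and the normal CDF is obtained from `math.erfc` in place of normalcdf.

```python
import math

def one_proportion_z_test_lower(x, n, p0):
    """One-proportion z-test for the alternative p < p0."""
    p_hat = x / n
    # The standard error uses the hypothesized proportion p0.
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Lower-tail P-value: P(Z < z) = 0.5 * erfc(-z / sqrt(2)).
    p_value = 0.5 * math.erfc(-z / math.sqrt(2))
    return z, p_value

# 750 of 1785 students, tested against p0 = 0.50.
z, p_value = one_proportion_z_test_lower(750, 1785, 0.50)
print(f"z = {z:.4f}, P-value = {p_value:.2e}")
```

The P-value is not literally zero, only too small to show at calculator precision, which is why a Type I error remains possible in principle.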