110 Fundamentals of Hypothesis Testing 2

Principles of Statistics
Assoc. Prof. Dr. Abdul Hamid b. Hj. Mar Iman
Former Director,
Centre for Real Estate Studies
Faculty of Geoinformation Science and Engineering,
Universiti Teknologi Malaysia,
Skudai, Johor.
E-mail: hamid@fksg.utm.my
Hypothesis Testing
Content:
• Concepts of hypothesis testing
• Test of statistical significance
• Hypothesis testing one variable at a time
Hypothesis
• Unproven proposition
• Supposition that tentatively explains certain
facts or phenomena
• Assumption about nature of the world
• E.g. the mean price of three-bedroom
single-storey houses in Skudai is RM
155,000.
Hypothesis (contd.)
• An unproven proposition or supposition that
tentatively explains certain facts or
phenomena:
– Null hypothesis
– Alternative hypothesis
• Null hypothesis is that there is no systematic
relationship between independent variables
(IVs) and dependent variables (DVs).
• Research hypothesis is that any relationship
observed in the data is real.
Null Hypothesis
• Statement about the status quo
• No difference
• Statistically expressed as:
Ho: b=0
where b is any sample parameter used to
explain the population.
Alternative Hypothesis
• Statement that indicates the opposite of the
null hypothesis
• There is difference
• Statistically expressed as:
H1: b ≠ 0
H1: b < 0
H1: b > 0
Significance Level
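• Critical probability in choosing between Ho
and H1.
• Simply put, the cut-off point (COP) beyond
which a result is judged too unlikely to be due
to chance alone.
• Tells how likely a result at least this extreme
would be if only chance were at work (i.e. if
Ho were true)
• Most common confidence level, used to mean
“something is good enough to be believed”, is
.95, which corresponds to a significance level
of .05.
• It means that, if Ho is true, a result falls
between the cut-off points 95% of the time and
outside them only 5% of the time.
• What is the COP at 95% chance?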
Significance Level (contd.)
• Denoted as α
• Tells how much the probability mass is in the
tails of a given distribution
• Probability or significance level selected is
typically .05 or .01
• A result whose probability under Ho falls
below this level is too unlikely to warrant
support for the null hypothesis
• In other words, it gives grounds to support
the alternative hypothesis
• Main purpose of statistical testing: to reject
null hypothesis
Significance Level (contd.)
P[-1.96 ≤ Z ≤ 1.96] = 1 - α = 0.95
P[Z ≥ Zc] = P[Z ≤ -Zc] = α/2
Let us say we have the following relationship:
Yi = β + ei, i = 1,…,T and ei ~ N(0, σ²) ……………....(1)
The least squares estimator for β is:
b = (Y1 + Y2 + … + YT)/T = ΣYi/T ………………………..(2)
with the following properties:
1) E[b] = β ………………………………………….(3a)
2) Var(b) = E[(b - β)²] = σ²/T ………………………...(3b)
3) b ~ N(β, σ²/T) …………………………………….(3c)
The “standardized” normal random variable for b is:
Z = (b - β)/√(σ²/T) ~ N(0,1) ………………………………(4)
The critical value of Z, i.e. Zc, such that α = 0.05 of the
probability mass is in the tails of the distribution, is given as:
P[Z ≥ 1.96] = P[Z ≤ -1.96] = 0.025 ………………………(5a)
and
P[-1.96 ≤ Z ≤ 1.96] = 1 - 0.05 = 0.95 ………………………(5b)
Substituting the standardized normal variable Z (Eqn. 4) into Eqn. (5b),
we get:
P[-1.96 ≤ (b - β)/√(σ²/T) ≤ 1.96] = 0.95 ……………………………..…...(6)
Solving for β, we get:
P[b - 1.96σ/√T ≤ β ≤ b + 1.96σ/√T] = 0.95 ………………………… (7)
In general: P[b - Zcσ/√T ≤ β ≤ b + Zcσ/√T] = 1 - α ……………….. (8a)
Also: P[(b - β)/(σ/√T) ≤ -Zc] = P[(b - β)/(σ/√T) ≥ Zc] = α/2 (2-tail test) ...…(8b)
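The general interval in Eqn. (8a) can be sketched in Python as follows. This is only an illustration under the derivation's own assumption that σ is known; the function names `z_critical` and `mean_interval` are made up for this example, and the critical value comes from the standard library's `statistics.NormalDist` rather than a Z-table.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(alpha: float) -> float:
    """Two-tailed critical value Zc such that P[Z >= Zc] = alpha/2."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def mean_interval(b: float, sigma: float, T: int, alpha: float = 0.05):
    """Interval b -/+ Zc*sigma/sqrt(T) from Eqn. (8a); sigma is assumed known."""
    half_width = z_critical(alpha) * sigma / sqrt(T)
    return b - half_width, b + half_width


print(round(z_critical(0.05), 2))  # 1.96
```

Passing `alpha=0.01` instead gives the wider interval that goes with the .01 significance level.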
Example
You suspect that the mean rental of 225 purpose-built office units in Johor is RM 3.00/sq.ft. If the
std. dev. is RM 1.50/sq.ft., what is the 95%
confidence interval of the mean?
The null hypothesis is that the mean is equal to
3.0:
Ho: μ = 3.0
The alternative hypothesis is that the mean
is not equal to 3.0:
H1: μ ≠ 3.0
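A Sampling Distribution
[Figure: sampling distribution of the sample mean centred at μ = 3.0, with α/2 = .025 in each tail; the lower and upper critical values, XL and XU, are to be found.]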
Critical values of μ
Critical value - upper limit
= μ + Z·S_X̄ or μ + Z·S/√n
= 3.0 + 1.96(1.5/√225)
= 3.0 + 1.96(0.1)
= 3.0 + .196
= 3.196
Critical value - lower limit
= μ - Z·S_X̄ or μ - Z·S/√n
= 3.0 - 1.96(1.5/√225)
= 3.0 - 1.96(0.1)
= 3.0 - .196
= 2.804
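The arithmetic for the two critical values can be checked with a few lines of Python. This is a minimal sketch using the figures from the office rental example; the helper `in_rejection_region` is a hypothetical name added here only to show how the limits are used.

```python
from math import sqrt

mu0 = 3.0   # hypothesised mean rental, RM/sq.ft.
s = 1.5     # standard deviation, RM/sq.ft.
n = 225     # number of office units
z = 1.96    # two-tailed critical value at the .05 level

std_error = s / sqrt(n)          # 1.5 / 15 = 0.1
lower = mu0 - z * std_error
upper = mu0 + z * std_error
print(f"{lower:.3f} to {upper:.3f}")  # 2.804 to 3.196


def in_rejection_region(x_bar: float) -> bool:
    """True if a sample mean falls outside the two critical values."""
    return x_bar < lower or x_bar > upper


print(in_rejection_region(3.78))  # True
```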
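Region of Rejection
[Figure: sampling distribution centred at μ = 3.0, with the rejection regions below the LOWER LIMIT and above the UPPER LIMIT.]
Hypothesis Test μ = 3.0
[Figure: the same distribution with the lower limit 2.804, μ = 3.0 and the upper limit 3.196 marked; the value 3.78 lies beyond the upper limit.]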
Type I and Type II Errors
in Hypothesis Testing

State of Null Hypothesis      Decision
in the Population             Accept Ho             Reject Ho
Ho is true                    Correct (no error)    Type I error
Ho is false                   Type II error         Correct (no error)
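A small simulation can make the Type I error cell concrete: when Ho is true, a two-tailed test at the .05 level should still reject in roughly 5% of samples. The sketch below is illustrative only. It reuses the office rental figures (μ = 3.0, σ = 1.5, n = 225) as assumed population values, and the function name, trial count and seed are arbitrary choices made for this example.

```python
import random
from math import sqrt


def type_i_error_rate(mu=3.0, sigma=1.5, n=225, z_c=1.96,
                      trials=2000, seed=0):
    """Share of samples drawn under a TRUE Ho that the Z test still rejects."""
    rng = random.Random(seed)
    std_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Draw one sample of size n from the population described by Ho.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(sample_mean - mu) / std_error > z_c:
            rejections += 1  # Ho is true but rejected: a Type I error
    return rejections / trials


rate = type_i_error_rate()
print(rate)  # expected to be close to 0.05
```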
Example
You estimate the average price, μ, of single- and double-storey houses in Malaysia’s major
industrialised towns to be RM 1,600/sq.m.
Based on a sample of 101 houses, you found
that the mean price, X̄, is RM 1,579.44/sq.m. with a std.
dev. of RM 350.13/sq.m.
(a) Would you reject your initial estimate at the 0.05
significance level?
(b) What is the confidence interval of the mean price at the 5% s.l.?
Answer (a)
Ho: μ = 1,600
H1: μ ≠ 1,600
Test statistic: Z = (1,579.44 - 1,600)/(350.13/√101) = -20.56/34.84
≈ -0.59
For a two-tailed test at α = 0.05: P[Z ≥ Zc] = P[Z ≤ -Zc] = α/2 = 0.025
From Z-table, Zc = 1.96
Since |Z| = 0.59 < Zc = 1.96, do not reject Ho.
∴ The sample is consistent with a mean price of RM 1,600/sq.m.
Answer (b)
1,579.44 - 1.96(34.84) = RM 1,511.15 (lower limit)
1,579.44 + 1.96(34.84) = RM 1,647.73 (upper limit)
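The house price example can be worked through in Python as a check. This sketch assumes a two-tailed test at α = 0.05, for which the critical value is about 1.96, and obtains that value from `statistics.NormalDist` instead of a Z-table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 1600.0     # hypothesised mean price, RM/sq.m.
x_bar = 1579.44  # sample mean price, RM/sq.m.
s = 350.13       # sample standard deviation, RM/sq.m.
n = 101          # sample size
alpha = 0.05     # significance level (two-tailed test assumed)

std_error = s / sqrt(n)
z_obs = (x_bar - mu0) / std_error
z_c = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
reject = abs(z_obs) > z_c

# Interval around the sample mean at the same significance level.
lower, upper = x_bar - z_c * std_error, x_bar + z_c * std_error

print(round(std_error, 2), round(z_obs, 2), reject)  # 34.84 -0.59 False
```

Because 1,600 lies between `lower` and `upper`, the interval and the test lead to the same decision.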
PARAMETRIC STATISTICS
NONPARAMETRIC STATISTICS
t-Distribution
• Symmetrical, bell-shaped distribution
• Mean of zero; standard deviation close to (but greater than) one, approaching one as the degrees of freedom increase
• Shape influenced by degrees of freedom
Degrees of Freedom
• Abbreviated d.f.
• Number of observations minus the number of constraints
Confidence Interval Estimate
Using the t-distribution
μ = X̄ ± t_c.l. S_X̄
or
Upper limit = X̄ + t_c.l. (S/√n)
Lower limit = X̄ - t_c.l. (S/√n)
Confidence Interval Estimate Using
the t-distribution
μ = population mean
X̄ = sample mean
t_c.l. = critical value of t at a specified confidence level
S_X̄ = standard error of the mean
S = sample standard deviation
n = sample size
Confidence Interval Estimate Using
the t-distribution
μ = X̄ ± t_c.l. S_X̄
X̄ = 3.7
S = 2.66
n = 17
Upper limit = 3.7 + 2.12(2.66/√17)
= 5.07
Lower limit = 3.7 - 2.12(2.66/√17)
= 2.33
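The same interval can be reproduced in Python. In this sketch the critical t-value (2.12, for 16 degrees of freedom) is passed in as read from a t-table, because the Python standard library has no t-distribution; the function name `t_interval` is an assumption made for this example.

```python
from math import sqrt


def t_interval(x_bar: float, s: float, n: int, t_crit: float):
    """Confidence limits X-bar -/+ t * S/sqrt(n); t_crit comes from a t-table."""
    std_error = s / sqrt(n)
    return x_bar - t_crit * std_error, x_bar + t_crit * std_error


lower, upper = t_interval(x_bar=3.7, s=2.66, n=17, t_crit=2.12)
print(f"{lower:.2f} to {upper:.2f}")  # 2.33 to 5.07
```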
Hypothesis Test Using the
t-Distribution
Univariate Hypothesis Test
Utilizing the t-Distribution
Suppose that a production manager believes
the average number of defective assemblies
each day to be 20. The factory records the
number of defective assemblies for each of the
25 days it was opened in a given month. The
mean X̄ was calculated to be 22, and the
standard deviation, S, to be 5.
H0: μ = 20
H1: μ ≠ 20
S_X̄ = S/√n
= 5/√25
= 1
Univariate Hypothesis Test
Utilizing the t-Distribution
The researcher desired 95 percent
confidence, so the significance level is
.05. The researcher must then find the upper
and lower limits of the confidence interval to
determine the region of rejection. Thus, the
value of t is needed. For 24 degrees of
freedom (n - 1 = 25 - 1), the t-value is 2.064.
Lower limit:
μ - t_c.l. S_X̄ = 20 - 2.064(5/√25)
= 20 - 2.064(1)
= 17.936
Upper limit:
μ + t_c.l. S_X̄ = 20 + 2.064(5/√25)
= 20 + 2.064(1)
= 22.064
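Univariate Hypothesis Test
t-Test
t_obs = (X̄ - μ)/S_X̄
= (22 - 20)/1
= 2/1
= 2
Since t_obs = 2 is smaller than the critical t-value of 2.064, Ho is not rejected at the .05 level.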
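The defective assemblies example can be checked with the short Python sketch below. It takes the critical t-value of 2.064 (24 degrees of freedom, two-tailed, .05 level) from the text as a table lookup, and the variable names are illustrative only.

```python
from math import sqrt

mu0 = 20        # hypothesised mean number of defective assemblies per day
x_bar = 22      # sample mean
s = 5           # sample standard deviation
n = 25          # number of days observed
t_crit = 2.064  # t-table value for n - 1 = 24 d.f., two-tailed, .05 level

std_error = s / sqrt(n)              # 5 / 5 = 1.0
t_obs = (x_bar - mu0) / std_error    # (22 - 20) / 1 = 2.0
lower = mu0 - t_crit * std_error     # 20 - 2.064
upper = mu0 + t_crit * std_error     # 20 + 2.064
reject = abs(t_obs) > t_crit

print(t_obs, round(lower, 3), round(upper, 3), reject)  # 2.0 17.936 22.064 False
```

Comparing `t_obs` with `t_crit` and checking whether the sample mean of 22 lies between `lower` and `upper` are two views of the same decision.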
Testing a Hypothesis about a
Distribution
• Chi-Square test
• Test for significance in the analysis of
frequency distributions
• Compare observed frequencies with
expected frequencies
• “Goodness of Fit”
Chi-Square Test
χ² = Σ (Oi - Ei)²/Ei
Chi-Square Test
χ² = chi-square statistic
Oi = observed frequency in the ith cell
Ei = expected frequency in the ith cell
Chi-Square Test
Estimation for Expected Number
for Each Cell
Eij = (Ri × Cj)/n
Chi-Square Test
Estimation for Expected Number
for Each Cell
Ri = total observed frequency in the ith row
Cj = total observed frequency in the jth column
n = sample size
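The expected number for each cell can be computed from the row and column totals as in the sketch below. The 2 × 2 table of observed counts is hypothetical, made up purely to show the calculation, and `expected_frequencies` is an illustrative name.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total * column total) / sample size."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed frequencies (rows and columns are arbitrary categories).
observed = [[30, 10],
            [20, 40]]
print(expected_frequencies(observed))  # [[20.0, 20.0], [30.0, 30.0]]
```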
Univariate Hypothesis Test
Chi-square Example
χ² = (O1 - E1)²/E1 + (O2 - E2)²/E2
Univariate Hypothesis Test
Chi-square Example
χ² = (60 - 50)²/50 + (40 - 50)²/50
= 4
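The chi-square example can be reproduced in Python as below. The slide stops at the statistic, so the comparison with a critical value is an added assumption: with two cells there is 1 degree of freedom, and a chi-square variable with 1 d.f. is a squared standard normal, so the .05 critical value is about 1.96² ≈ 3.84.

```python
from statistics import NormalDist


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


stat = chi_square([60, 40], [50, 50])
print(stat)  # 4.0

# Assumption: 1 d.f., .05 level; critical value = (two-tailed Zc)^2, about 3.84.
crit = NormalDist().inv_cdf(0.975) ** 2
print(stat > crit)  # True
```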
Hypothesis Test of a Proportion
π is the population proportion
p is the sample proportion
π is estimated with p
Hypothesis Test of a Proportion
H0: π = .5
H1: π ≠ .5
Sp = √((0.6)(0.4)/100)
= √(.24/100)
= √.0024
= .04899
Zobs = (p - π)/Sp
= (.6 - .5)/.04899
= .1/.04899
= 2.04
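The proportion test above can be sketched in Python as follows. As in the slide, the standard error is computed from the sample proportion; the function name `proportion_z` and its argument names are assumptions made for this example.

```python
from math import sqrt


def proportion_z(p: float, pi0: float, n: int):
    """Return (Z_obs, S_p), with S_p = sqrt(p*q/n) from the sample proportion."""
    s_p = sqrt(p * (1 - p) / n)
    return (p - pi0) / s_p, s_p


z_obs, s_p = proportion_z(p=0.6, pi0=0.5, n=100)
print(round(s_p, 5), round(z_obs, 2))  # 0.04899 2.04
print(abs(z_obs) > 1.96)               # True
```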
Hypothesis Test of a Proportion:
Another Example
n = 1,200
p = .20
Sp = √(pq/n)
Sp = √((.2)(.8)/1200)
Sp = √(.16/1200)
Sp = √.000133
Sp = .0115
Hypothesis Test of a Proportion:
Another Example
Z = (p - π)/Sp
Z = (.20 - .15)/.0115
Z = .05/.0115
Z = 4.348
The Z value exceeds 1.96, so the null hypothesis should be rejected at the .05 level.
Indeed, it is significant beyond the .001 level.
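The claim about significance beyond the .001 level can be checked by computing a two-tailed p-value. This is a sketch only: the hypothesised proportion of .15 is read off the numerator of the Z formula, and the p-value comes from `statistics.NormalDist`. Without rounding Sp to .0115 first, Z comes out at about 4.33 rather than 4.348, which does not change the conclusion.

```python
from math import sqrt
from statistics import NormalDist

n = 1200
p = 0.20     # sample proportion
pi0 = 0.15   # hypothesised proportion, taken from the Z formula's numerator

s_p = sqrt(p * (1 - p) / n)
z = (p - pi0) / s_p
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p-value

print(round(s_p, 4), round(z, 2))  # 0.0115 4.33
print(p_value < 0.001)             # True
```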