CHEN CANGJIE

Question 1:
Below are the seven main features of the data:

            Mean       Median     Standard Deviation
SMARK       64.9488    65         12.9589
ABILITY     25.2929    24.7567    9.06607
ALEVELSA    2.65842    3          1.14856
ATTL        82.3828    90         22.2224
ATTC        80.7756    90         22.1796
ATTR        44.6634    50         39.7147
HRSS        3.61782    3          3.5883
The distribution of the dependent variable SMARK is shown in the graph below:
[Graph: SMARK, vertical axis from 0 to 100, horizontal axis from 0 to 300]
Question 2:
a)
Estimate: SMARK= 0.14613ATTL+ 52.91+µ
Dependent Variable is SMARK,
303 observations (1-303) used for estimation.
Estimation Method: Ordinary Least Squares
             Estimate    Std. Err.   t Ratio   p-Value
Intercept    52.91       2.77619     19.058    0
ATTL         0.14613     0.03254     4.491     0
Residual Sum of Squares = 47530.6
R-Squared = 0.0628
R-Bar-Squared = 0.0597
Residual SD = 12.5662
The intercept is 52.91, which is the predicted percentage mark in the
Statistics and Econometrics exam for a student whose proportion of lectures
attended (percent) is zero.
The coefficient of ATTL is 0.14613, which means that if the proportion of
lectures attended increases by 1 percentage point, then the predicted
percentage mark in the Statistics and Econometrics exam increases by 0.14613
percentage points.
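As a rough illustration of this interpretation, the fitted line can be evaluated at a few attendance levels. This is only a sketch: the function name predicted_smark is hypothetical, and the two constants are the estimates reported in the output above.

```python
# Fitted line from part a): SMARK = 52.91 + 0.14613 * ATTL
INTERCEPT = 52.91      # estimated intercept from the output above
SLOPE_ATTL = 0.14613   # estimated coefficient of ATTL from the output above


def predicted_smark(attl_percent: float) -> float:
    """Predicted exam mark (percent) for a given lecture attendance (percent)."""
    return INTERCEPT + SLOPE_ATTL * attl_percent


# Evaluate the fitted line at zero, half and full attendance.
for attl in (0, 50, 100):
    print(attl, round(predicted_smark(attl), 3))
```

Each extra percentage point of attendance changes the prediction by the slope, 0.14613, whatever the starting level.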
b)
Estimate: SMARK=β0+β1ABILITY+β2ATTL+β3HRSS+µ
Dependent Variable is SMARK ,
303 observations (1-303) used for estimation.
Estimation Method: Ordinary Least Squares
             Estimate    Std. Err.   t Ratio   p-Value
Intercept    40.0947     3.27431     12.245    0
ABILITY      0.48439     0.07595     6.378     0
ATTL         0.13467     0.03095     4.351     0
HRSS         0.41684     0.19413     2.147     0.033
Residual Sum of Squares = 41683.1
R-Squared = 0.1781
R-Bar-Squared = 0.1699
Residual SD = 11.8071
Test the significance of ATTL in this multivariate regression model:
H0: β2=0
H1: β2≠0
tact = 0.13467/0.03095 = 4.35 ~ t(303-3-1)
5% critical value: t=1.96
Since ∣tact∣ > t, we reject H0 at 5% level.
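The t test above can be reproduced as a quick arithmetic check. This is a minimal sketch, assuming the 5% critical value of 1.96 used above; the helper name t_ratio is hypothetical.

```python
# t test for the coefficient of ATTL in the regression of part b).
CRITICAL_T_5PCT = 1.96  # critical value used in the text above


def t_ratio(estimate: float, std_err: float) -> float:
    """Test statistic for H0: coefficient = 0."""
    return estimate / std_err


t_attl = t_ratio(0.13467, 0.03095)           # estimate and standard error of ATTL
reject_h0 = abs(t_attl) > CRITICAL_T_5PCT    # two-sided test at the 5% level

print(round(t_attl, 3), reject_h0)
```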
Test the overall significance of this regression:
H0: β1=β2=β3=0
H1: H0 is not true
Fact = 21.597 ~ F(3, 303-3-1)
5% critical value: F=2.63
Since Fact > F, we reject H0 at 5% level.
c)
Estimate: SMARK=β0+β1ABILITY+β2ATTL+β3HRSS+β4ABILITY^2+µ
Data Transformation: ABILITY^2 created.
Dependent Variable is SMARK,
303 observations (1-303) used for estimation.
Estimation Method: Ordinary Least Squares
             Estimate    Std. Err.   t Ratio   p-Value
Intercept    41.4542     5.13688     8.07      0
ABILITY      0.36837     0.34589     1.065     0.288
ATTL         0.13428     0.03101     4.33      0
HRSS         0.41691     0.19442     2.144     0.033
ABILITY^2    0.00223     0.00648     0.344     0.731
Residual Sum of Squares = 41666.6
R-Squared = 0.1784
R-Bar-Squared = 0.1674
Residual SD = 11.8246
The coefficient of ABILITY^2 is 0.00223, so a one-unit increase in ABILITY^2
leads to a 0.00223 percentage point increase in the mark in the Statistics and
Econometrics exam, which is a small change.
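Because ABILITY enters this model both linearly and squared, the effect of ABILITY on SMARK depends on the level of ABILITY: the slope is β1 + 2·β4·ABILITY. The sketch below evaluates that slope with the estimates reported above. Evaluating it at the sample mean of ABILITY from Question 1 (25.2929) is an illustrative choice, and the function name is hypothetical.

```python
# Marginal effect of ABILITY when the model contains ABILITY and ABILITY^2:
#   d SMARK / d ABILITY = beta1 + 2 * beta4 * ABILITY
BETA_ABILITY = 0.36837      # estimated coefficient of ABILITY
BETA_ABILITY_SQ = 0.00223   # estimated coefficient of ABILITY^2
MEAN_ABILITY = 25.2929      # sample mean of ABILITY (Question 1)


def marginal_effect_of_ability(ability: float) -> float:
    """Slope of predicted SMARK with respect to ABILITY at a given ABILITY level."""
    return BETA_ABILITY + 2 * BETA_ABILITY_SQ * ability


effect_at_mean = marginal_effect_of_ability(MEAN_ABILITY)
print(round(effect_at_mean, 4))
```

At the sample mean the slope is close to the coefficient of ABILITY in the model without the squared term (0.48439).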
Test the significance of ABILITY:
H0: β1=0
H1: β1≠0
tact = 0.36837/0.34589 = 1.065 ~ t(303-4-1)
5% critical value: t=1.96
Since ∣tact∣ < t, we do not reject H0 at 5% level.
Test the significance of ABILITY^2:
H0: β4=0
H1: β4≠0
tact = 0.00223/0.00648 = 0.344 ~ t(303-4-1)
5% critical value: t=1.96
Since ∣tact∣ < t, we do not reject H0 at 5% level.
Question 3:
Estimate: SMARK=β0+β1ATTL+β2Fitted Values^2+µ ~ See Appendix 3.1
H0: β2=0
H1: H0 is not true
F(1,300)=2.21
P-VALUE=0.027
Since P-VALUE < 0.05, we reject H0 at 5% level.
Estimate: SMARK=β0+β1ABILITY+β2ATTL+β3HRSS+Fitted Values7^2+µ
~ See Appendix 3.2
H0: the coefficient of Fitted Values7^2 is zero
H1: H0 is not true
F(1,299)=2.21
P-VALUE=0.351
Since P-VALUE > 0.05, we do not reject H0 at 5% level.
Estimate:
SMARK=β0+β1ABILITY+β2ATTL+β3HRSS+β4ABILITY^2+Fitted Values9^2+µ
~ See Appendix 3.3
H0: the coefficient of Fitted Values9^2 is zero
H1: H0 is not true
F(1,298)=2.21
P-VALUE=0.377
Since P-VALUE > 0.05, we do not reject H0 at 5% level.
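The decision rule for these tests compares each p-value with the 5% significance level, not with the F statistic. A minimal sketch of that rule follows, using the three p-values reported above. The dictionary keys simply reuse the appendix numbers, and the function name decide is hypothetical.

```python
# Decision rule for the three tests above: reject H0 when p-value < alpha.
ALPHA = 0.05

# p-values reported for the regressions in Appendix 3.1, 3.2 and 3.3
reset_p_values = {"3.1": 0.027, "3.2": 0.351, "3.3": 0.377}


def decide(p_value: float, alpha: float = ALPHA) -> str:
    """Return the test decision at significance level alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


decisions = {appendix: decide(p) for appendix, p in reset_p_values.items()}

for appendix, decision in decisions.items():
    print(f"Appendix {appendix}: {decision}")
```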
Question 4:
Estimate:
SMARK=β0+β1ABILITY+β2ALEVELSA+β3ATTR+β4COURSE_A+β5COURSE_B+µ
Dependent Variable is SMARK,
303 observations (1-303) used for estimation.
Estimation Method: Ordinary Least Squares
             Estimate    Std. Err.   t Ratio   p-Value
Intercept    37.1141     3.55432     10.442    0
ABILITY      0.66285     0.08645     7.667     0
ALEVELSA     2.3798      0.57726     4.123     0
ATTR         0.08863     0.01996     4.441     0
COURSE_A     2.06324     1.85657     1.111     0.267
COURSE_B     -6.67819    2.61021     -2.558    0.011
Residual Sum of Squares = 38470.7
R-Squared = 0.2414
R-Bar-Squared = 0.2287
Residual SD = 11.3812
β1, β2, β3 and β4 are positive, so ABILITY, ALEVELSA, ATTR and COURSE_A have
a positive effect on SMARK. A one-unit increase in ABILITY, ALEVELSA or ATTR
increases SMARK by 0.66285, 2.3798 and 0.08863 percentage points
respectively. COURSE_A is a dummy variable, so students on course A score
2.06324 percentage points more than the base group, other things being equal.
However, β5 is negative: students on course B score 6.67819 percentage points
less than the base group.
Question 5:
First compute the overall regression:
Estimate:
SMARK=β0+β1ABILITY+β2ALEVELSA+β3ATTC+β4COURSE_A+β5COURSE_B
+ABILITY#COURSE_A+ALEVELSA#COURSE_A+ATTC#COURSE_A
+ABILITY#COURSE_B+ALEVELSA#COURSE_B+ATTC#COURSE_B+µ
~ See Appendix 5.1
Then compute the regressions for COURSE_A (See Appendix 5.2), COURSE_B (See
Appendix 5.3) and COURSE_C (See Appendix 5.4) separately.
1st Chow test: Fact = 5.3803
5% critical value: F(4,295)=2.40
Since Fact > F, we reject null hypothesis at 5% level.
Question 6:
First add residuals of SMARK=β0+β1ABILITY+β2ALEVELSA+β3ATTC to the
data set.
~ See Appendix 6.1
Then estimate: see Appendix 6.2
R-Squared = 0.068
thus,
LM=303×0.068 = 20.604
Since LM exceeds the 5% critical value of the chi-square distribution, 16.919,
we reject null hypothesis at 5% level.
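The LM statistic is the number of observations multiplied by the R-Squared of the auxiliary regression. The sketch below reproduces that arithmetic with the figures given above; the function name is hypothetical, and the critical value 16.919 is taken from the text.

```python
# LM statistic: n * R-Squared of the auxiliary regression (Appendix 6.2).
N_OBS = 303
R_SQUARED_AUX = 0.068
CHI2_CRITICAL_5PCT = 16.919   # 5% chi-square critical value quoted above


def lm_statistic(n_obs: int, r_squared: float) -> float:
    """Lagrange multiplier statistic computed as n times R-Squared."""
    return n_obs * r_squared


lm = lm_statistic(N_OBS, R_SQUARED_AUX)
reject_h0 = lm > CHI2_CRITICAL_5PCT

print(round(lm, 3), reject_h0)
```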
Question 7:
Estimate:
SMARK=β0+β1ABILITY+β2ALEVELSA+β3ATTC+β4COURSE_A+β5COURSE_B+µ
Dependent Variable is SMARK,
303 observations (1-303) used for estimation.
Estimation Method: Ordinary Least Squares
             Estimate    Std. Err.   t Ratio   p-Value
Intercept    39.0301     3.54826     11        0
ABILITY      0.44194     0.0733      6.029     0
ALEVELSA     2.19081     0.58467     3.747     0
ATTC         0.11029     0.02996     3.681     0
COURSE_A     1.14572     1.85574     0.617     0.537
COURSE_B     -7.6333     2.62064     -2.913    0.004
Residual Sum of Squares = 39233.4
R-Squared = 0.2264
R-Bar-Squared = 0.2134
Residual SD = 11.4934
Fact= [(39233.4-38628.7)/6]/[ 38628.7/(303-11)]=0.7618
F(6, 292)= 2.13
Since Fact < F, we do not reject null hypothesis at 5% level.
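The F statistic for the restrictions compares the residual sum of squares of the restricted and unrestricted models. A sketch of that calculation follows, using the figures that appear in the formula above (restricted RSS 39233.4, unrestricted RSS 38628.7, 6 restrictions, 303-11 degrees of freedom). The function name is hypothetical.

```python
# F test of q restrictions from restricted and unrestricted RSS:
#   F = [(RSS_r - RSS_u) / q] / [RSS_u / df_u]
RSS_RESTRICTED = 39233.4
RSS_UNRESTRICTED = 38628.7
N_RESTRICTIONS = 6
DF_UNRESTRICTED = 303 - 11      # degrees of freedom as written in the text
F_CRITICAL_5PCT = 2.13          # critical value F(6, 292) quoted in the text


def f_statistic(rss_r: float, rss_u: float, q: int, df_u: int) -> float:
    """F statistic for q linear restrictions."""
    return ((rss_r - rss_u) / q) / (rss_u / df_u)


f_act = f_statistic(RSS_RESTRICTED, RSS_UNRESTRICTED,
                    N_RESTRICTIONS, DF_UNRESTRICTED)
reject_h0 = f_act > F_CRITICAL_5PCT

print(round(f_act, 4), reject_h0)
```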
Question 8:
Test whether there is heteroskedasticity in the model from Question 5
~ See Appendix 8.1
H0: there is no heteroskedasticity in the model
H1: H0 is not true
Both test statistics exceed their 5% critical values:
C(1)=3.841<18.8203
C(13)=22.362<25.1903
We reject null hypothesis at 5% level.
So there is heteroskedasticity, and the earlier test, which assumes
homoskedastic errors, is invalid.
Question 9:
Estimate:
SMARK=β0+β1ABILITY+β2ALEVELSA+β3ATTC+β4COURSE_A+β5COURSE_B+µ
Dependent Variable is SMARK,
303 observations (1-303) used for estimation.
Estimation Method: Ordinary Least Squares
             Estimate    Std. Err.   t Ratio   p-Value
Intercept    39.0301     3.84663     10.147    0
ABILITY      0.44194     0.07042     6.276     0
ALEVELSA     2.19081     0.62902     3.483     0.001
ATTC         0.11029     0.03512     3.14      0.002
COURSE_A     1.14572     1.59245     0.719     0.472
COURSE_B     -7.6333     2.89485     -2.637    0.009
Residual Sum of Squares = 39233.4
R-Squared = 0.2264
R-Bar-Squared = 0.2134
Residual SD = 11.4934
Wald Test of Zero Restrictions on COURSE_A, COURSE_B:
ChiSq(2) = 10.7372 {0.005}
The value obtained for the robust Wald statistic is 10.7372, with the
corresponding p-value 0.005. As this p-value is smaller than 0.05, we reject
the null hypothesis at 5% level.
Question 10:
Estimate:
SMARK=β0+β1ABILITY+β2AGE+β3ALEVELS+β4ALEVELSA+β5ATTL+β6ATTC
+β7ATTR+β8EXPALC+β9HRSS+β10QUALOTH+β11MGRAD+µ
Dependent Variable is SMARK,
303 observations (1-303) used for estimation.
Estimation Method: Ordinary Least Squares
             Estimate    Std. Err.   t Ratio   p-Value
Intercept    24.3437     16.531      1.473     0.142
ABILITY      0.51776     0.09332     5.548     0
AGE          0.19987     0.81436     0.245     0.806
ALEVELS      0.80189     1.05189     0.762     0.446
ALEVELSA     3.29145     0.76855     4.283     0
ATTL         0.10502     0.0418      2.512     0.013
ATTC         0.00226     0.04174     0.054     0.957
ATTR         0.03521     0.02321     1.517     0.13
EXPALC       -0.03353    0.02881     -1.164    0.245
HRSS         0.24034     0.19348     1.242     0.215
QUALOTH      11.4391     4.67134     2.449     0.015
MGRAD        0.75761     0.36422     2.08      0.038
Residual Sum of Squares = 37216
R-Squared = 0.2662
R-Bar-Squared = 0.2384
Residual SD = 11.3088
Eleven main features were picked from the data. EXPALC has a negative effect
on SMARK: a one pound increase in spending on alcohol per week decreases
SMARK by 0.03353 percentage points.
QUALOTH has the largest coefficient, 11.4391, so a one-unit change in
QUALOTH causes a large change in SMARK.
Test the overall significance:
H0: β1=β2=β3=…=β11=0
H1: H0 is not true
Fact = (0.2662/11)/[(1-0.2662)/(303-11-1)] = 9.597
The 5% critical value of F(11, 291) is between 1.78 and 1.86.
Since Fact > F, we reject H0 at 5% level,
so the regression is statistically significant at 5% level.
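The overall F statistic can be computed directly from R-Squared as (R²/k)/[(1-R²)/(n-k-1)], where R-Squared enters as reported and is not squared again. A sketch follows, using the values reported above (R-Squared = 0.2662, k = 11 regressors, n = 303). The function name is hypothetical, and 1.86 is taken as the upper end of the critical-value range quoted above.

```python
# Overall significance F statistic from R-Squared:
#   F = (R2 / k) / ((1 - R2) / (n - k - 1))
R_SQUARED = 0.2662
N_REGRESSORS = 11
N_OBS = 303
F_CRITICAL_UPPER = 1.86   # upper end of the 5% critical-value range in the text


def overall_f(r_squared: float, k: int, n: int) -> float:
    """F statistic for H0: all k slope coefficients are zero."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


f_act = overall_f(R_SQUARED, N_REGRESSORS, N_OBS)
reject_h0 = f_act > F_CRITICAL_UPPER

print(round(f_act, 3), reject_h0)
```

Comparing against the upper end of the range is the conservative choice: if the statistic exceeds 1.86 it exceeds any critical value in that range.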