NAME:______________________
I.D. # : ______________________
ECONOMICS 2900
Economics and Business Statistics
SPRING SEMESTER, 2004
MIDTERM EXAMINATION
Tuesday, Feb. 24th
Weight 35%
NOTE: You have 75 minutes to complete the exam. Please answer all questions in this
exam booklet. Calculators must not be capable of storing alphabetic characters
(whole words or sentences). GOOD LUCK!
Question # 1
At a recent Willie Nelson concert, a survey was conducted that asked a random sample
of 20 people their age and how many concerts they have attended since the first of the
year. The following data were collected:
Age    Number of Concerts
62     6
57     5
40     4
49     3
67     5
54     5
43     2
65     6
54     3
41     1
44     3
48     2
55     4
60     5
59     4
63     5
69     4
40     2
38     1
52     3
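The least-squares quantities for these 20 observations can be recomputed directly from the data. What follows is a minimal sketch in plain Python, not part of the original exam; the function name `fit_simple_regression` and the variable names are illustrative assumptions.

```python
# Data from the survey: age and number of concerts for the 20 respondents.
ages = [62, 57, 40, 49, 67, 54, 43, 65, 54, 41,
        44, 48, 55, 60, 59, 63, 69, 40, 38, 52]
concerts = [6, 5, 4, 3, 5, 5, 2, 6, 3, 1,
            3, 2, 4, 5, 4, 5, 4, 2, 1, 3]


def fit_simple_regression(x, y):
    """Return (intercept, slope, r_squared, std_error) for y = b0 + b1*x.

    Hypothetical helper for illustration; uses the textbook sums of squares.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    sse = s_yy - slope * s_xy           # residual (error) sum of squares
    r_squared = 1 - sse / s_yy          # share of variation explained
    std_error = (sse / (n - 2)) ** 0.5  # standard error of the estimate
    return intercept, slope, r_squared, std_error


b0, b1, r2, se = fit_simple_regression(ages, concerts)
print(f"intercept={b0:.5f} slope={b1:.5f} R^2={r2:.5f} s={se:.5f}")
```

The printed values can be compared with the Coefficients, R Square and Standard Error entries of the spreadsheet output for this question.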
An Excel output follows:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.80203
R Square             0.64326
Adjusted R Square    0.62344
Standard Error       0.93965
Observations         20

ANOVA
              df    SS          MS          F           Significance F
Regression    1     28.65711    28.65711    32.45653    2.1082E-05
Residual      18    15.89289    0.88294
Total         19    44.55

              Coefficients    Standard Error    t Stat      P-value    Lower 95%    Upper 95%
Intercept     -3.01152        1.18802           -2.53491    0.02074    -5.50746     -0.5156
Age           0.12569         0.02206           5.69706     0.00002    0.07934      0.1720

DESCRIPTIVE STATISTICS
                      Age        Concerts
Mean                  53         3.65
Standard Error        2.1849     0.3424
Standard Deviation    9.7711     1.5313
Sample Variance       95.4737    2.3447
Count                 20         20

SPEARMAN RANK CORRELATION COEFFICIENT = 0.8306

A. What is the regression equation? What does it mean?
B. What is the R squared? What does it tell you?
C. What is the Standard Error? What does the value of this statistic mean?
D. Does Age appear to be important when predicting the number of concerts attended?
E. Is the linear model appropriate? How can you tell?
F. Predict with 95% confidence the number of concerts attended by a 45-year-old
individual. (Just show the formula; do not calculate.)
G. Predict with 95% confidence the average number of concerts attended by all
45-year-old individuals. (Just show the formula; do not calculate.)

[Figure: Histogram of residuals (horizontal axis: Residuals; vertical axis: Frequency)]
[Figure: Residuals versus Predicted (horizontal axis: Predicted; vertical axis: Residuals)]

H. What is heteroskedasticity? Does it appear to be a problem in this model?
I. What does the Histogram tell you? Why is this important?
Question # 2
An economist wanted to develop a multiple regression model to enable him to predict
the annual family expenditure on clothes. After some consideration, he developed the
multiple regression model
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon \]
where
y = annual family clothes expenditure (in $1,000)
x1 = annual household income (in $1,000)
x2 = number of family members
x3 = number of children under 10 years of age
The computer output is shown below.
THE REGRESSION EQUATION IS
y = 1.74 + 0.091 x1 + 0.93 x2 + 0.26 x3

Predictor    Coef     StDev    T
Constant     1.74     0.630    2.762
x1           0.091    0.025    3.640
x2           0.93     0.290    3.207
x3           0.26     0.180    1.444

S = 2.06    R-Sq = 59.6%

ANALYSIS OF VARIANCE
Source of Variation    df    SS     MS       F
Regression             3     288    96       22.647
Error                  46    195    4.239
Total                  49    483

A. Is this model useful? (Use 5% significance level)
B. Test at the 1% significance level to determine whether the number of family
members and annual family clothes expenditure are linearly related.
F. What is multicollinearity? Does it appear to be a problem in this model? How can
you tell?
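The entries of the Question 2 output are tied together by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, R-Sq is the regression share of the total sum of squares, and each T is a coefficient divided by its standard deviation. The sketch below, which is not part of the original exam and uses the hypothetical helper name `anova_summary`, checks those relationships with the printed figures. Critical values for the tests still have to be read from F and t tables.

```python
def anova_summary(ss_regression, ss_error, df_regression, df_error):
    """Return (MS regression, MS error, F, R-squared) from ANOVA sums of squares.

    Hypothetical helper for illustration only.
    """
    ms_regression = ss_regression / df_regression
    ms_error = ss_error / df_error
    f_stat = ms_regression / ms_error
    r_squared = ss_regression / (ss_regression + ss_error)
    return ms_regression, ms_error, f_stat, r_squared


# Figures copied from the Question 2 output.
ms_reg, ms_err, f_stat, r_sq = anova_summary(288, 195, 3, 46)

# t statistic for x2 (number of family members): coefficient / standard deviation.
t_x2 = 0.93 / 0.290

print(f"MS regression = {ms_reg:.3f}, MS error = {ms_err:.3f}")
# The unrounded ratio is about 22.646; the printed 22.647 matches 96 / 4.239,
# i.e. a division by the already rounded mean square error.
print(f"F = {f_stat:.3f}, R-Sq = {r_sq:.1%}, t(x2) = {t_x2:.3f}")
```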
Question # 3
An avid football fan was in the process of examining the factors that determine the
success or failure of football teams. He noticed that teams with many rookies and teams
with many veterans seem to do quite poorly. To further analyze his beliefs he took a
random sample of 20 teams and proposed a second-order model with one independent
variable. The selected model is
\[ y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon \]
where
y = team's winning percentage
x = average years of professional experience
The computer output is shown below.
THE REGRESSION EQUATION IS
y = 32.6 + 5.96 x - 0.48 x^2

Predictor    Coef     StDev    T
Constant     32.6     19.3     1.689
x            5.96     2.41     2.473
x^2          -0.48    0.22     -2.182

S = 16.1    R-Sq = 43.9%

ANALYSIS OF VARIANCE
Source of Variation    df    SS      MS         F
Regression             2     3452    1726       6.663
Error                  17    4404    259.059
Total                  19    7856

A. Suggest a reason why this fan would choose to include a variable like x^2.
What is the meaning of this variable, and how do you know whether it should
be retained as part of the model?
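To see what the squared term does to the fitted curve, one can evaluate the estimated equation at a few experience levels and locate its turning point, which for a quadratic b0 + b1 x + b2 x^2 lies at x = -b1 / (2 b2). The sketch below is an illustration only and not part of the original exam; the names `predicted_win_pct` and `turning_point` are assumptions, and the coefficients are the point estimates from the output.

```python
# Point estimates from the Question 3 output.
B0, B1, B2 = 32.6, 5.96, -0.48


def predicted_win_pct(x):
    """Fitted winning percentage at x average years of experience."""
    return B0 + B1 * x + B2 * x ** 2


def turning_point(b1, b2):
    """x-value where a quadratic b0 + b1*x + b2*x^2 peaks (b2 < 0) or bottoms out."""
    return -b1 / (2 * b2)


peak_x = turning_point(B1, B2)
for years in (2, peak_x, 12):
    print(f"x = {years:5.2f}  fitted winning % = {predicted_win_pct(years):.2f}")
```

Because the estimated coefficient on x^2 is negative, the fitted curve rises and then falls, which is the shape the fan's belief about rookies and veterans would imply. Whether the term should stay in the model is a separate question for its t test.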
Question # 4
A professor of accounting wanted to develop a multiple regression model to predict the
students’ grades in her fourth-year accounting course. She decides that the two most
important factors are the student’s grade point average in the first three years and the
student’s major. She proposes the model
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon \]
where
y = Fourth-year accounting course mark (out of 100)
x1 = G.P.A. in first three years (range 0 to 12)
x2 = 1 if student’s major is accounting
= 0 if not
x3 = 1 if student’s major is finance
= 0 if not
The computer output is shown below.
THE REGRESSION EQUATION IS
y = 9.14 + 6.73 x1 + 10.42 x2 + 5.16 x3

Predictor    Coef     StDev    T
Constant     9.14     7.10     1.287
x1           6.73     1.91     3.524
x2           10.42    4.16     2.505
x3           5.16     3.93     1.313

S = 15.0    R-Sq = 44.2%

ANALYSIS OF VARIANCE
Source of Variation    df    SS       MS          F
Regression             3     17098    5699.333    25.386
Error                  96    21553    224.510
Total                  99    38651
Rank the students, according to their major, in order of who tends to score the highest
in the accounting course.
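With indicator (dummy) variables, each coefficient shifts the fitted mark relative to the omitted category, here students whose major is neither accounting nor finance, while holding G.P.A. fixed. The sketch below illustrates that reading with the point estimates from the output. It is not part of the original exam; the function name `predicted_mark`, the major labels, and the G.P.A. of 8 are assumptions chosen for illustration.

```python
# Point estimates from the Question 4 output.
COEF = {"constant": 9.14, "gpa": 6.73, "accounting": 10.42, "finance": 5.16}


def predicted_mark(gpa, major):
    """Fitted fourth-year accounting mark for a given G.P.A. and major.

    major is 'accounting', 'finance', or anything else (the base category).
    """
    x2 = 1 if major == "accounting" else 0
    x3 = 1 if major == "finance" else 0
    return (COEF["constant"] + COEF["gpa"] * gpa
            + COEF["accounting"] * x2 + COEF["finance"] * x3)


# Compare the three groups at the same (arbitrary, illustrative) G.P.A. of 8.
for major in ("accounting", "finance", "other"):
    print(f"{major:10s} fitted mark = {predicted_mark(8, major):.2f}")
```

These are point estimates only. Whether each difference from the base category is statistically significant depends on the t statistics for x2 and x3.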
Question # 5
A. What is Autocorrelation?
B. The only information you are given about a regression is as follows:
d = 1.75, n = 20, k = 2, and α = 0.05.
Test to see if Autocorrelation is a problem.
C. How else can you tell if Autocorrelation exists?
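The Durbin-Watson statistic d is the sum of squared successive differences of the residuals divided by the sum of squared residuals, and it is compared with lower and upper bounds (dL, dU) read from a Durbin-Watson table. The sketch below is not part of the original exam; the function names are hypothetical, and the bounds 1.10 and 1.54 are assumed values for n = 20, k = 2 at the 5% level that should be verified against the table supplied with the course.

```python
def durbin_watson(residuals):
    """Durbin-Watson d = sum (e_t - e_{t-1})^2 / sum e_t^2."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def dw_decision(d, d_lower, d_upper):
    """Classify d using table bounds dL and dU (supplied by the caller)."""
    if d < d_lower:
        return "positive autocorrelation"
    if d > 4 - d_lower:
        return "negative autocorrelation"
    if d_upper <= d <= 4 - d_upper:
        return "no evidence of autocorrelation"
    return "inconclusive"


# d is given in Question 5; the bounds are ASSUMED table values, verify before use.
print(dw_decision(1.75, d_lower=1.10, d_upper=1.54))

# Small illustration of the statistic itself on made-up alternating residuals.
print(durbin_watson([1, -1, 1, -1]))
```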