4/27/01 252x0144 ECO252 QBA2 Name

FINAL EXAM
May 3, 2001
Name
Hour of Class Registered (Circle)
I. (16+ points) Do all the following.
1. Hand in your fourth regression problem. (2 points)
Remember: Y = company profit in millions of dollars, X1 = CEO's yearly income in thousands of
dollars (X1 = 1000 means a million-dollar annual income), X2 = percentage of stock owned by the CEO
(X2 = 3 means the CEO owns 3.0% of the stock).
Use a significance level of 10% in this problem.
2. Answer the following questions.
a. For the regression of Y against X1 and X2 only, what does the ANOVA tell us? Which of the
coefficients are significant? What tells you this? (3)
b. Do an F test to show if the addition of X1 and X2 improves the regression over your results with X3
alone. (4)
c. Based on your regression of Y against X1, X2, and X3,
(i) What evidence is there that CEO income and stock percentage interact? (1)
(ii) What change in profit does this equation predict for each additional one thousand dollars of CEO
income when the CEO owns 6% of the company's stock? (3)
(iii) What profit does the equation predict for a firm where the CEO earns $1.3 million and owns
38% of the stock? What might this lead you to suspect about this equation? (2)
(iv) Based only on the adjusted R-squared and the significance of the coefficients, is there an
equation that seems to work better than the equation with three independent variables? Why? (3)
II. Do at least 4 of the following 7 problems (at least 15 points each), or do sections adding to at least 60
points. Anything extra you do helps, and grades wrap around. Show your work! State H0 and H1 where
applicable. Use a significance level of 5% unless noted otherwise. Do not answer questions without citing
appropriate statistical tests.
1. (Black, p532) A researcher wishes to predict the price of a meal in New Orleans (y) on the basis of
location (x1, a dummy variable: 1 if the restaurant is in the French Quarter, 0 otherwise) and the
probability of being seated on arrival (x2). The data are below. (Use α = .10.)
Row   price   FQ   prob
      y       x1   x2
 1     8.52   0    0.62
 2    21.45   1    0.43
 3    16.18   1    0.58
 4     6.21   0    0.74
 5    12.19   1    0.19
 6    25.62   1    0.49
 7    13.90   0    0.80
 8    18.66   1    0.75
 9     5.25   0    0.37
10     8.02   0    0.63
The following are given to help you.
Σy = 136.00, Σy² = 2271.32, Σx1 = 5, Σx1² = 5, Σx2 = 5.6, Σx2² = 3.4658,
Σx1y = ?, Σx2y = 75.4657, Σx1x2 = ? and n = 10.
You do not need all of these.
a. Compute a simple regression of price against x1. (7)
b. On the basis of this regression, what price do you expect to pay for a meal in the French Quarter? Outside
the French Quarter? (2)
c. Compute R². (4)
d. Compute s_e. (3)
e. Compute s_b0 (the standard deviation of the intercept) and do a confidence interval for β0. (3)
f. Do a confidence interval for the price of a meal in the French Quarter. (3)
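The arithmetic behind a simple regression on a dummy variable can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names are hypothetical, the data are typed in from the table above, and the critical values needed for the confidence intervals still have to come from the t table.

```python
import math

# Data typed in from the Problem 1 table: price (y) and French Quarter dummy (x1).
y = [8.52, 21.45, 16.18, 6.21, 12.19, 25.62, 13.90, 18.66, 5.25, 8.02]
x1 = [0, 1, 1, 0, 1, 1, 0, 1, 0, 0]
n = len(y)

# Raw sums; sum_xy is the sum the exam leaves as "?".
sum_x = sum(x1)
sum_y = sum(y)
sum_xy = sum(a * b for a, b in zip(x1, y))
sum_xx = sum(a * a for a in x1)
sum_yy = sum(b * b for b in y)

# Sums of squares and cross products about the means.
x_bar, y_bar = sum_x / n, sum_y / n
s_xy = sum_xy - n * x_bar * y_bar
s_xx = sum_xx - n * x_bar ** 2
s_yy = sum_yy - n * y_bar ** 2

b1 = s_xy / s_xx                      # slope
b0 = y_bar - b1 * x_bar               # intercept
r_squared = b1 * s_xy / s_yy          # explained share of variation
s_e = math.sqrt((s_yy - b1 * s_xy) / (n - 2))        # standard error of estimate
s_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / s_xx)    # std deviation of the intercept

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"R-squared = {r_squared:.4f}, s_e = {s_e:.4f}, s_b0 = {s_b0:.4f}")
```

With a 0/1 regressor the intercept is the mean price outside the French Quarter and the intercept plus the slope is the mean price inside it, which gives a quick check on the hand calculation.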
2. The data from the previous problem are repeated below. (Use α = .10.)
Row   price   FQ   prob
      y       x1   x2
 1     8.52   0    0.62
 2    21.45   1    0.43
 3    16.18   1    0.58
 4     6.21   0    0.74
 5    12.19   1    0.19
 6    25.62   1    0.49
 7    13.90   0    0.80
 8    18.66   1    0.75
 9     5.25   0    0.37
10     8.02   0    0.63
The following are given to help you.
Σy = 136.00, Σy² = 2271.32, Σx1 = 5, Σx1² = 5, Σx2 = 5.6, Σx2² = 3.4658,
Σx1y = ?, Σx2y = 75.4657, Σx1x2 = ? and n = 10.
a. Do a multiple regression of price against x1 and x2. (12)
b. Compute R² and R² adjusted for degrees of freedom for both this and the previous problem. Compare
the values of adjusted R² between this and the previous problem. Use an F test to compare R² here with
the R² from the previous problem. (4)
c. Compute the regression sum of squares and use it in an F test to test the usefulness of this regression. (5)
d. Use your regression to predict the price of a meal in the French Quarter sold when the probability of
being seated on arrival is 45%. (2)
e. Use the directions in the outline to make this estimate into a confidence interval and a prediction interval.
(4)
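One way to organize the two-regressor calculation is to work with sums about the means and solve the two normal equations by Cramer's rule. The Python sketch below assumes that approach; the names are hypothetical, and it is a check on the hand calculation rather than the method prescribed in the outline.

```python
# Data typed in from the Problem 2 table.
y = [8.52, 21.45, 16.18, 6.21, 12.19, 25.62, 13.90, 18.66, 5.25, 8.02]
x1 = [0, 1, 1, 0, 1, 1, 0, 1, 0, 0]
x2 = [0.62, 0.43, 0.58, 0.74, 0.19, 0.49, 0.80, 0.75, 0.37, 0.63]
n, k = len(y), 2


def mean(values):
    return sum(values) / len(values)


def spread(u, v):
    """Sum of cross products of u and v about their means."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


s11, s22, s12 = spread(x1, x1), spread(x2, x2), spread(x1, x2)
s1y, s2y, syy = spread(x1, y), spread(x2, y), spread(y, y)

# Normal equations: s11*b1 + s12*b2 = s1y and s12*b1 + s22*b2 = s2y.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)

ssr = b1 * s1y + b2 * s2y                     # regression sum of squares
r2 = ssr / syy
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
f_stat = (ssr / k) / ((syy - ssr) / (n - k - 1))

# Part d: French Quarter meal with a 45% chance of being seated on arrival.
y_hat = b0 + b1 * 1 + b2 * 0.45

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
print(f"R2 = {r2:.4f}, adjusted R2 = {r2_adj:.4f}, F = {f_stat:.3f}")
print(f"predicted price = {y_hat:.2f}")
```

The F statistic is compared with the table value for 2 and 7 degrees of freedom at the 10% level.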
3. An airline wants to select a computer package for its reservation system. Over 20 weeks it tries the four
commercially available reservation system packages and records as x1, x2, x3, and x4 the number of
passengers bumped by each system. It will choose the package with the smallest average number of bumps,
assuming that there is a significant difference between the median or average number of bumps. The data
below are in the columns labeled x (the original numbers) and, in the r columns, their ranks on a 1 to 20
scale. Below this I have given you the sum of each column, the number of items in each column, the mean of
each column and the sum of the squared numbers (ssq) in each column. The columns are independent samples.
Use a 5% significance level.
         P1             P2             P3               P4
Row      x1     r1      x2     r2      x3       r3      x4     r4
1        12     17       2      2      22       21.5     7      5.5
2        14     19       4      4       9        9.0     6     12.5
3         9      9       7      7       5        5.5    15     17.0
4        11     15       3      3      10       12.5    12     12.5
5        18     20       1      1      12       17.0
6                                      10       12.5
sum      64.0           17.0           68.000           40
count     5              5              6                4
mean     12.8            3.4           11.333           10
ssq     866.0           79.0          934.000          454
a. Assume that the underlying distribution is Normal and test for a significant difference between the means.
(7)
b. Assume that the underlying distribution is not normal and test for a significant difference between the
medians. (5).
c. Find the mean and standard deviation for column P3 and test column P3 for a Normal distribution. (5)
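The one-way ANOVA in part a needs only the sum, count and ssq rows. The Python sketch below rebuilds them from the x columns; the names are hypothetical, and the resulting F still has to be compared with the table value for 3 and 16 degrees of freedom.

```python
# The x columns from the table above (ranks are not needed for the ANOVA).
columns = {
    "x1": [12, 14, 9, 11, 18],
    "x2": [2, 4, 7, 3, 1],
    "x3": [22, 9, 5, 10, 12, 10],
    "x4": [7, 6, 15, 12],
}

n_total = sum(len(col) for col in columns.values())
grand_sum = sum(sum(col) for col in columns.values())
total_ssq = sum(v * v for col in columns.values() for v in col)
correction = grand_sum ** 2 / n_total          # n times the squared grand mean

sst = total_ssq - correction                                       # total
ssb = sum(sum(col) ** 2 / len(col) for col in columns.values()) - correction  # between
ssw = sst - ssb                                                    # within

df_between = len(columns) - 1
df_within = n_total - len(columns)
f_stat = (ssb / df_between) / (ssw / df_within)

print(f"SSB = {ssb:.4f}, SSW = {ssw:.4f}, SST = {sst:.4f}")
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```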
4. The data from the previous page is repeated.
Use a 5% significance level.
         P1             P2             P3               P4
Row      x1     r1      x2     r2      x3       r3      x4     r4
1        12     17       2      2      22       21.5     7      5.5
2        14     19       4      4       9        9.0     6     12.5
3         9      9       7      7       5        5.5    15     17.0
4        11     15       3      3      10       12.5    12     12.5
5        18     20       1      1      12       17.0
6                                      10       12.5
sum      64.0           17.0           68.000           40
count     5              5              6                4
mean     12.8            3.4           11.333           10
ssq     866.0           79.0          934.000          454
a. Assume that the underlying distribution is Normal and test columns 1 and 3 for differences in means.
Assume identical variances. Use (i) a test ratio, (ii) a critical value and (iii) a confidence interval. (6)
b. Assume that the underlying distribution is not Normal and test for a significant difference between the
medians of columns 1 and 3. (4)
c. Assume again that the distributions are Normal and test that the variances are the same. (3)
d. Test column P3 to see if its standard deviation is 7. (3)
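For parts a, c and d the test statistics follow directly from the sample variances. The sketch below is a minimal Python illustration with hypothetical names; it computes the pooled-variance t ratio, the variance ratio and the chi-squared statistic, and leaves the table lookups to the reader.

```python
import math

# Columns 1 and 3 from the table above.
col1 = [12, 14, 9, 11, 18]
col3 = [22, 9, 5, 10, 12, 10]


def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


n1, n3 = len(col1), len(col3)
mean1, mean3 = sum(col1) / n1, sum(col3) / n3
var1, var3 = sample_variance(col1), sample_variance(col3)

# Part a: pooled variance and t ratio with n1 + n3 - 2 degrees of freedom.
pooled_var = ((n1 - 1) * var1 + (n3 - 1) * var3) / (n1 + n3 - 2)
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n3))
t_ratio = (mean1 - mean3) / std_err

# Part c: ratio of the larger to the smaller sample variance.
f_ratio = max(var1, var3) / min(var1, var3)

# Part d: chi-squared statistic for H0: sigma = 7, with n3 - 1 degrees of freedom.
chi_sq = (n3 - 1) * var3 / 7 ** 2

print(f"s1^2 = {var1:.4f}, s3^2 = {var3:.4f}, pooled = {pooled_var:.4f}")
print(f"t = {t_ratio:.4f}, F = {f_ratio:.4f}, chi-squared = {chi_sq:.4f}")
```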
5. a. A machine fills a sample of 100 one-pound boxes of a product and they are later tested to see how
many are over or under the desired one-pound size. The manufacturer wishes to test whether exactly half of
the population of boxes is over the one-pound mark and that the occurrence of boxes that are 'over' and
'under' is random. In the sample there are 60 boxes that are 'over' and 40 that are under and there are 45 runs
of 'overs' or 'unders'.
(i) Test that the proportion of 'overs' is 50%. (2)
(ii) Test that the sequence of 'overs' and 'unders' is random. (5)
b. A series of 24 observations is used to calculate a simple regression with four variables. We calculate a
Durbin-Watson statistic of 0.471. Is autocorrelation present? Is it positive or negative? (3)
c. We are testing to see if the mean of a normally distributed population with a known variance of 20 is 5.
We take a sample of 100 and find that the mean is 10.5. Given these results, what is the p-value of our result
if
(i) the Null hypothesis is H 0 :   5 , (ii) the Null hypothesis is H 0 :   5 , (iii) The Null hypothesis is
H 0 :   5 (6)
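For part a, both tests reduce to a z ratio under the usual large-sample Normal approximations. The Python sketch below assumes those approximations and uses hypothetical names; it is an illustration of the arithmetic, not a substitute for stating the hypotheses.

```python
import math

n_over, n_under, runs = 60, 40, 45
n = n_over + n_under

# (i) Proportion test of H0: p = 0.5.
p_hat = n_over / n
z_prop = (p_hat - 0.5) / math.sqrt(0.5 * 0.5 / n)

# (ii) Runs test: mean and variance of the number of runs under randomness.
two_n1n2 = 2 * n_over * n_under
mean_runs = two_n1n2 / n + 1
var_runs = two_n1n2 * (two_n1n2 - n) / (n ** 2 * (n - 1))
z_runs = (runs - mean_runs) / math.sqrt(var_runs)


def two_sided_p(z):
    """Two-sided p-value from the standard Normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


print(f"proportion test: z = {z_prop:.3f}, p-value = {two_sided_p(z_prop):.4f}")
print(f"runs test: expected runs = {mean_runs:.1f}, z = {z_runs:.3f}, "
      f"p-value = {two_sided_p(z_runs):.4f}")
```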
6. An electronics chain reports the following data on number of households, sales volume and number of
customers for 10 stores.
Row   hshlds   sales   cust
      x1       x2      x3
 1    149       462    298
 2     80        98     47
 3    123       269    198
 4    108       150     98
 5    152       733    248
 6    221      9430    503
 7    167      1516    348
 8    192      2993    448
 9    220      5716    398
10     89       188    149
Σx1 = 1598.0, Σx1² = 273738, Σx2 = 1676.0, Σx2² = 302788, Σx1x2 = 287019
Σx1 = 1501.0, Σx1² = 248413, Σx2 = 21555, Σx2² = 133744400, Σx1x2 = 4423491
a) Compute the correlation between households and sales and test it for significance. (5)
b) Test the same correlation to see if it is .86 (5)
c) Compute the rank correlation between households and sales and test it for significance. (5)
d) Compute Kendall's W for households, sales and customers and test it for significance (6)
e) Extra credit: If I do a regression with sales as the dependent variable and households and customers as
independent variables, what sort of results would I be likely to get? Why? (3)
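Parts a and c can be checked with a short Python sketch. It assumes the households and sales columns as typed in from the table, uses hypothetical names, and ranks by simple sort position, which is valid here only because neither column contains ties.

```python
import math

# Households (x1) and sales (x2) typed in from the table above.
hshlds = [149, 80, 123, 108, 152, 221, 167, 192, 220, 89]
sales = [462, 98, 269, 150, 733, 9430, 1516, 2993, 5716, 188]
n = len(hshlds)


def pearson(u, v):
    u_bar, v_bar = sum(u) / len(u), sum(v) / len(v)
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


def ranks(values):
    """Ranks 1..n by sort position; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


# Part a: correlation and its t ratio with n - 2 degrees of freedom.
r = pearson(hshlds, sales)
t_ratio = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Part c: Spearman rank correlation from the squared rank differences.
d_squared = sum((a - b) ** 2 for a, b in zip(ranks(hshlds), ranks(sales)))
r_s = 1 - 6 * d_squared / (n * (n ** 2 - 1))

print(f"r = {r:.4f}, t = {t_ratio:.3f}")
print(f"sum of d^2 = {d_squared}, r_s = {r_s:.4f}")
```

The rank correlation is compared with the rank correlation table for n = 10 rather than with the t table.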
7. A producer of filters is getting complaints about the quality of the filters it is producing. It thus examines
1000 filters from each of its three shifts and discovers 36 defects for shift 1, 40 defects for shift 2 and
55 defects for shift 3.
a) Test the hypothesis that the proportion of defective filters is the same for all three shifts at the 95% level.
(7)
b) Test the hypothesis that the defect rate is higher for the third shift than the first. (3)
c) Find a p-value for your result in b) (2)
d) Do a confidence interval for the difference between the proportions defective for shifts 1 and 2. (4)
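The statistics for parts a through c can be sketched in Python as follows. The names are hypothetical, the chi-squared statistic still has to be compared with the table value for 2 degrees of freedom, and the p-value uses the standard Normal distribution through `math.erfc`.

```python
import math

defects = [36, 40, 55]          # shifts 1, 2 and 3
sample_size = 1000              # filters examined per shift

# Part a: chi-squared test of equal proportions across the three shifts.
pooled = sum(defects) / (sample_size * len(defects))
expected_bad = sample_size * pooled
expected_good = sample_size * (1 - pooled)
chi_sq = sum(
    (d - expected_bad) ** 2 / expected_bad
    + ((sample_size - d) - expected_good) ** 2 / expected_good
    for d in defects
)

# Part b: one-sided z test of H0: p3 <= p1 against H1: p3 > p1.
p1, p3 = defects[0] / sample_size, defects[2] / sample_size
p_pair = (defects[0] + defects[2]) / (2 * sample_size)
std_err = math.sqrt(p_pair * (1 - p_pair) * (2 / sample_size))
z = (p3 - p1) / std_err

# Part c: upper-tail p-value for the z ratio.
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"chi-squared = {chi_sq:.4f} with {len(defects) - 1} degrees of freedom")
print(f"z = {z:.4f}, one-sided p-value = {p_value:.4f}")
```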