Uploaded by Huy Tran Quang

Review

A. Multiple-choice questions
Question 1: Discrete Components, Inc. manufactures a line of electrical resistors. Presently, the carbon
composition line is producing 100 ohm resistors. The population variance of these resistors "must not
exceed 4" to conform to industry standards. Periodically, the quality control inspectors check for
conformity by randomly selecting 10 resistors from the line and calculating the sample variance. The last
sample had a variance of 4.36. Assume that the population is normally distributed. Using 𝛼 = 0.05, the
null hypothesis is _________________.
a) σ² = 100
b) σ = 10
c) s² = 4
d) σ² = 4
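For reference, the chi-square statistic for a test about a single variance is (n - 1)s²/σ₀², with n - 1 degrees of freedom. A minimal Python sketch using the figures in Question 1 (the variable names are illustrative and not part of the original sheet):

```python
# Chi-square statistic for a test about one population variance.
n = 10             # number of resistors sampled
s2 = 4.36          # sample variance
sigma0_sq = 4.0    # variance limit set by the industry standard

df = n - 1
chi2_stat = df * s2 / sigma0_sq
print(f"chi-square = {chi2_stat:.2f} with {df} degrees of freedom")
```

The statistic is then compared with the chi-square critical value for 9 degrees of freedom at the chosen 𝛼.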
Question 2: David Desreumaux, VP of Human Resources of American First Banks (AFB), is reviewing
the employee training programs of AFB banks. Based on a recent census of personnel, David knows that
the variance of teller training time in the Southeast region is 8, and he wonders if the variance in the
Southwest region is the same number. His staff randomly selected personnel files for 15 tellers in the
Southwest Region, and determined that their mean training time was 25 hours and that the standard
deviation was 4 hours. Assume that teller training time is normally distributed. Using 𝛼 = 0.10, the
critical values of chi-square are ________.
a) 7.96 and 26.30
b) 6.57 and 23.68
c) -1.96 and 1.96
d) -1.645 and 1.645
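Two-tailed chi-square critical values can be read from a table or computed numerically. The sketch below uses only the Python standard library. The helper names chi2_cdf and chi2_ppf are illustrative (not from the original sheet): the first evaluates the chi-square CDF with a series expansion, the second inverts it by bisection.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_ppf(p, df):
    """Value x with chi2_cdf(x, df) == p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

alpha = 0.10
df = 15 - 1                              # 15 tellers sampled
lower = chi2_ppf(alpha / 2, df)          # cuts off alpha/2 in the lower tail
upper = chi2_ppf(1 - alpha / 2, df)      # cuts off alpha/2 in the upper tail
print(f"{lower:.2f} and {upper:.2f}")
```

Bisection is slower than a library routine but keeps the example free of third-party dependencies.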
Question 3: Suppose the fat content of a hotdog follows normal distribution. Ten random measurements
give a mean of 21.77 and standard deviation of 3.69. The 90% confidence interval for the population
variance of fat content of a hotdog is ________
a) 5.2 to 21.3
b) 7.3 to 36.9
c) 19.63 to 23.91
d) 19.85 to 23.69
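A confidence interval for a normal population's variance runs from (n - 1)s²/χ²(upper) to (n - 1)s²/χ²(lower). The Python sketch below applies this to the figures in Question 3. The two chi-square values are assumed to come from a standard table for 9 degrees of freedom and are not computed by the code.

```python
# 90% confidence interval for a population variance (normal population assumed).
n = 10
s = 3.69                 # sample standard deviation
df = n - 1

# Assumed table values for 9 degrees of freedom:
chi2_upper = 16.919      # leaves 0.05 in the upper tail
chi2_lower = 3.325       # leaves 0.95 in the upper tail

lower = df * s**2 / chi2_upper
upper = df * s**2 / chi2_lower
print(f"{lower:.2f} to {upper:.2f}")
```

Note that the larger chi-square value produces the lower limit, because it appears in the denominator.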
Question 4: According to the following graphic, x and y have ____________
[Figure: scatter plot of y (vertical axis, ticks from 160 to 240) against x (horizontal axis, ticks from 10000 to 50000)]
a) strong negative correlation
b) virtually no correlation
c) strong positive correlation
d) moderate negative correlation
Question 5: One of the assumptions made in simple regression is that ______________.
a) the error terms are exponentially distributed
b) the error terms have unequal variances
c) the model is linear
d) the error terms are dependent
Question 6: A manager wishes to predict the annual cost (y) of an automobile based on the number of
miles (x) driven. The following model was developed: y = 1,550 + 0.36x. If a car is driven 15,000 miles,
the predicted cost is ____________.
a) 2090
b) 3850
c) 7400
d) 6950
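A prediction from a fitted simple regression line is obtained by substituting the x value into the equation. A short Python sketch of the model in Question 6 (the function name predicted_cost is illustrative):

```python
def predicted_cost(miles):
    """Annual cost predicted by the fitted line y = 1,550 + 0.36x."""
    return 1550 + 0.36 * miles

print(predicted_cost(15000))
```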
Question 7: A researcher believes that a variable is Poisson distributed across six categories. To test this,
the following random sample of observations is collected:
Category    0    1    2    3    4    5
Observed   47   56   39   22   18   10
Using 𝛼 = 0.10, the observed chi-square value for this goodness-of-fit test is ____.
a) 2.28
b) 14.56
c) 17.43
d) 1.68
Question 8: Use the following set of observed frequencies to test the independence of the two variables.
Variable one has values of 'A' and 'B'; variable two has values of 'C', 'D', and 'E'.
      C    D    E
A    12   10    8
B    20   24   26
Using 𝛼 = 0.05, the critical chi-square value is _______.
a) 9.488
b) 1.386
c) 8.991
d) 5.991
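For a test of independence, the degrees of freedom are (rows - 1)(columns - 1). The Python sketch below computes them for the 2 x 3 table in Question 8. It then uses a shortcut that holds only for 2 degrees of freedom, where the upper-tail chi-square probability is exp(-x/2), so the critical value is -2 ln(𝛼). For other degrees of freedom a table or a statistics library would be needed.

```python
import math

rows, cols = 2, 3                  # A/B by C/D/E
df = (rows - 1) * (cols - 1)

alpha = 0.05
# Shortcut valid only when df == 2: P(X > x) = exp(-x / 2).
critical = -2 * math.log(alpha)
print(df, round(critical, 3))
```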
Question 9: Sam Hill, Director of Media Research, is analyzing subscribers to the Life West of the Saline
magazine. He wonders whether subscriptions are influenced by the head of household’s employment
classification. His staff prepared the following contingency table from a random sample of 300
households.
                 Head of Household Classification
Subscribes    Clerical    Managerial    Professional
Yes                 10            90              60
No                  60            60              20
Using 𝛼 = .05, the appropriate decision is ______________.
a) reject the null hypothesis and conclude the two variables are independent
b) do not reject the null hypothesis and conclude the two variables are independent
c) reject the null hypothesis and conclude the two variables are not independent
d) do not reject the null hypothesis and conclude the two variables are not independent
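The test of independence compares each observed count with the count expected under independence, (row total x column total) / n. The Python sketch below works through the table in Question 9. The p-value line relies on the 2-degrees-of-freedom identity P(X > x) = exp(-x/2) and would not apply to tables of other sizes.

```python
import math

observed = [
    [10, 90, 60],    # subscribes: clerical, managerial, professional
    [60, 60, 20],    # does not subscribe
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2_stat = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2_stat += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = math.exp(-chi2_stat / 2)     # valid only because df == 2
reject = p_value < 0.05
print(f"chi-square = {chi2_stat:.2f}, df = {df}, reject H0: {reject}")
```

Rejecting the null hypothesis of independence means concluding that the two variables are related.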
Question 10: A multiple regression analysis produced the following tables.
Predictor    Coefficients    Standard Error    t Statistic    p-value
Intercept    752.0833        336.3158          2.236241       0.042132
x1           11.87375        5.32047           2.231711       0.042493
x2           1.908183        0.662742          2.879226       0.01213

Source        df    SS          MS          F           p-value
Regression    2     203693.3    101846.7    6.745406    0.010884
Residual      12    181184.1    15098.67
Total         14    384877.4
These results indicate that ____________.
a) none of the predictor variables are significant at the 5% level
b) each predictor variable is significant at the 5% level
c) x1 is the only predictor variable significant at the 5% level
d) x2 is the only predictor variable significant at the 5% level
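The entries in the Question 10 output are linked: each t statistic is the coefficient divided by its standard error, and F is the regression mean square divided by the residual mean square. The Python sketch below recomputes those ratios from the tabulated values and compares each reported p-value with 𝛼 = 0.05. The dictionary layout is illustrative.

```python
alpha = 0.05

# (coefficient, standard error, reported p-value) from the output table
predictors = {
    "x1": (11.87375, 5.32047, 0.042493),
    "x2": (1.908183, 0.662742, 0.01213),
}

t_stats = {}
significant = {}
for name, (coef, se, p_value) in predictors.items():
    t_stats[name] = coef / se               # t = coefficient / standard error
    significant[name] = p_value < alpha     # significant when p-value < alpha

ms_regression, ms_residual = 101846.7, 15098.67
f_stat = ms_regression / ms_residual        # F = MSR / MSE

print(t_stats, significant, round(f_stat, 4))
```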
Question 11: Among the three following figures, which residual plots show that the fitted models
should be revised?
[Figures: three plots of residuals (vertical axis) against fitted values (horizontal axis), titled Residual plot 1, Residual plot 2, and Residual plot 3]
a) The first and the second plots
b) The first plot only
c) The first and the third plots
d) The third plot only
Question 12: Large correlations between two or more independent variables in a multiple regression
model could result in the problem of ________.
a) multicollinearity
b) autocorrelation
c) zero mean
d) non-normality
B. Practice questions
Question 1:
Data: Medgpa.csv
a. Write the logistic regression equation relating x (GPA) to y (Acceptance).
b. Use Stata to compute the estimated logit.
c. What is the interpretation of P(Acceptance = 1) when GPA = 3.67?
d. What is the estimate of the odds ratio? What is its interpretation?
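As a reminder of how the pieces in parts (a) to (d) fit together, the estimated logit is b0 + b1·GPA, the estimated probability is 1 / (1 + exp(-logit)), and the estimated odds ratio for a one-unit increase in GPA is exp(b1). The Python sketch below uses placeholder coefficients. They are hypothetical values chosen for illustration, not estimates from Medgpa.csv, and should be replaced with the values Stata reports.

```python
import math

# Hypothetical placeholder coefficients, NOT estimated from Medgpa.csv.
b0 = -10.0
b1 = 3.0

def logit(gpa):
    """Estimated logit g(x) = b0 + b1 * x."""
    return b0 + b1 * gpa

def probability(gpa):
    """Estimated P(Acceptance = 1) for a given GPA."""
    return 1.0 / (1.0 + math.exp(-logit(gpa)))

def odds(gpa):
    """Estimated odds of acceptance, p / (1 - p)."""
    p = probability(gpa)
    return p / (1.0 - p)

odds_ratio = math.exp(b1)    # multiplicative change in odds per one-unit GPA increase
print(probability(3.67), odds_ratio)
```

With real estimates, the odds ratio says how many times larger the odds of acceptance become when GPA rises by one point.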
Question 2: Consumer Reports provided extensive testing and ratings for 24 treadmills. An overall
score, based primarily on ease of use, ergonomics, exercise range, and quality, was developed for
each treadmill tested. In general, a higher overall score indicates better performance. The following
data show the price, the quality rating, and overall score for the 24 treadmills (Consumer Reports,
February 2006).
To incorporate the effect of quality, a categorical variable with three levels, we used two
dummy variables: Quality-E and Quality-VG. Each variable was coded 0 or 1 as follows.
Data: Treadmills.xlsx
a. Develop an estimated regression equation that could be used to estimate the overall score given
the price and the quality rating.
b. For the estimated regression equation developed in part (a), test for overall significance using
𝛼 = 0.10.
c. For the estimated regression equation developed in part (a), use the t test to determine the
significance of each independent variable. Use 𝛼 = 0.10.
d. Check the four regression assumptions using the standardized residuals (stdres).
e. Estimate the overall score for a treadmill with a price of $2000 and a good quality rating. How
much would the estimate change if the quality rating were very good? Explain.
f. Find a 95% confidence interval and a 95% prediction interval for a treadmill with a price of
$2000 and a good quality rating.
Question 3: A study investigated the relationship between audit delay (Delay), the length of time
from a company’s fiscal year-end to the date of the auditor’s report, and variables that describe the
client and the auditor. Some of the independent variables that were included in this study follow.
Industry: A dummy variable coded 1 if the firm was an industrial company or 0 if the firm was a
bank, savings and loan, or insurance company.
Public: A dummy variable coded 1 if the company was traded on an organized exchange or over
the counter; otherwise coded 0.
Quality: A measure of overall quality of internal controls, as judged by the auditor, on a five-point
scale ranging from “virtually none” (1) to “excellent” (5).
Finished: A measure ranging from 1 to 4, as judged by the auditor, where 1 indicates “all work
performed subsequent to year-end” and 4 indicates “most work performed prior to year-end.”
Data: Audit.csv
a. Develop the estimated regression equation using all of the independent variables.
b. Did the estimated regression equation developed in part (a) provide a good fit? Explain.
c. On the basis of your observations about the relationship between Delay and Finished, develop
an alternative estimated regression equation to the one developed in (a) to explain as much of the
variability in Delay as possible.
d. Consider a model in which only Industry is used to predict Delay. At a .01 level of significance,
test for any positive autocorrelation in the data.
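Assuming the intended procedure in part (d) is the Durbin-Watson test, the statistic is d = Σ(e_t - e_(t-1))² / Σ e_t², computed from the residuals in time order. Values near 2 suggest no first-order autocorrelation, and values well below 2 suggest positive autocorrelation. The decision itself requires the tabulated lower and upper bounds for the sample size and number of predictors. The Python sketch below computes d for made-up residuals, not the residuals from Audit.csv.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Made-up residuals that drift slowly, for illustration only.
example_residuals = [2.0, 1.5, 1.0, -0.5, -1.5, -2.0]
d = durbin_watson(example_residuals)
print(round(d, 4))
```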