Uploaded by rosiestanforever

Past Practice Exam Answers

Multiple Choice Questions
1. b
2. e
3. d
4. a
5. c
6. c
Question 7
(7.a) (i) Stratified sampling. This is because the business owners have been stratified
by business types. There are 5 strata; that is stratum 1 = retailers, stratum 2 =
agriculture, stratum 3 = manufacturing, stratum 4 = financial services and stratum 5 =
advertising. For each stratum we select a sample size of 6 owners.
(ii) Simple random sampling. This is because it uses a random number generator to
select the business owners from the state Directory of Businesses.
(7.b) Running multiple t tests of each population mean against all the others separately
will lead to a build-up of type I errors. For each test there is a probability α of rejecting
the null hypothesis of equality when in fact the null hypothesis is true (our chosen α
value is the probability of a type I error). Thus the probability of wrongly rejecting the null that all the
population means are equal is much higher than α when doing these multiple tests.
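To see how quickly the type I errors build up, here is a rough sketch in Python. It assumes the pairwise tests are independent, which is a simplification, and the choice of 5 groups with α = 0.05 is only an illustrative assumption, not a figure from the exam question.

```python
from math import comb


def familywise_error_rate(num_groups: int, alpha: float) -> float:
    """Chance of at least one false rejection across all pairwise tests.

    Assumes (for illustration only) that the tests are independent.
    """
    num_tests = comb(num_groups, 2)  # number of distinct pairs of means
    return 1 - (1 - alpha) ** num_tests


# Hypothetical setting: 5 population means, each pair tested at alpha = 0.05.
num_tests = comb(5, 2)
rate = familywise_error_rate(5, 0.05)
print(f"{num_tests} pairwise tests, chance of at least one type I error = {rate:.3f}")
```

Under these assumptions the overall chance of a false rejection is roughly 40%, far above the 5% we chose for each individual test.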
(7.c) The F-statistic is
\[ F = \frac{SSR/k}{SSE/(n-k-1)} = \frac{9600/5}{2400/(40-5-1)} = \frac{1920}{70.59} = 27.2 \]
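As a quick arithmetic check, the F-statistic in (7.c) can be reproduced in Python. The function name `f_statistic` is just an illustrative choice; the inputs are the figures quoted in the answer.

```python
def f_statistic(ssr: float, sse: float, n: int, k: int) -> float:
    """F = (SSR / k) / (SSE / (n - k - 1))."""
    mean_square_regression = ssr / k
    mean_square_error = sse / (n - k - 1)
    return mean_square_regression / mean_square_error


f_value = f_statistic(ssr=9600, sse=2400, n=40, k=5)
print(f"F = {f_value:.1f}")
```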
(7.d) True. When every observation is on the regression line, \( y_i = \hat{y}_i \) and thus the sum
of squares for error \( SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = 0 \).
This means that the standard error of estimate \( s_\varepsilon = \sqrt{\frac{SSE}{n-k-1}} = 0 \)
and thus \( R^2 = 1 - \frac{SSE}{SS_y} = 1 \).
(7.e) (i) Since \( n_A \) and \( n_B \) are both less than 10, the Wilcoxon rank sum test statistic is \( T = T_A \). It is a
one-tailed test and \( \alpha = 0.025 \). Thus the decision rule is: reject \( H_0 \) if \( T = T_A \le T_L = 12 \).
(ii) Since \( n_A \) and \( n_B \) are both more than 10, the test statistic is the standardised value \( z \) (normal approximation). It is a two-tailed test and \( \alpha = 0.05 \). Thus the decision rule is: reject \( H_0 \) if \( |z| > z_{\alpha/2} = z_{0.025} = 1.96 \).
(7.f) A matched pairs experiment removes variation that we are not interested in. For example, in Week
4 we asked whether the new design tyres last longer, on average, than existing
tyres. Independent samples can lead to more variability in our outcome because different
drivers (for the new and existing tyres) may drive in different ways. That is, some drivers
drive in a way that extends the life of the tyre, while others drive faster and brake
harder, resulting in shorter tyre lives. But we are not interested in the variation among
drivers; that is, we are not interested in knowing whether drivers actually differ in the
way they drive. We can eliminate that source of variation by designing a matched pairs
experiment in which the same drivers and cars are used for both the new and existing tyre samples.
This makes it easier to determine whether tyre design represents a real source of variation,
and thus whether one tyre design is superior to the other.
(7.g) (i) \( \hat{p} = 0.55 \) and sampling error \( e = 0.03 \). So the 95% confidence interval is
\[ \hat{p} \pm e = 0.55 \pm 0.03 \]
The lower confidence limit is 0.52 (0.55 minus 0.03) and the upper confidence limit is
0.58 (0.55 plus 0.03).
(ii) The confidence interval is entirely above 50% because it is (0.52 , 0.58) , so it is
reasonable to say that more than 50% of the population thinks their weight is about
right.
Question 8
(8.a) The test is the pooled-variance t-test. It is appropriate for
quantitative data (i.e. total spending in dollars on hair care products), independent
samples (i.e. 210 adults living in Melbourne and 250 adults living in Sydney) and
population variances that are unknown but assumed equal.
(8.b) You could construct a histogram of the data in each sample and observe the
pattern. If the histogram is roughly bell-shaped and symmetrical, then
it is reasonable to treat the data as approximately normally distributed.
(8.c)
Let \( \mu_1 \) be the population mean for the spending on hair care products in Melbourne.
Let \( \mu_2 \) be the population mean for the spending on hair care products in Sydney.
\( H_0: \mu_1 - \mu_2 = 0 \)
\( H_A: \mu_1 - \mu_2 < 0 \)
(8.d)
Pooled-variance t statistic
(1) Construct the sample means and sample variances for Sydney (\( \bar{x}_2 \), \( s_2^2 \)) and
Melbourne (\( \bar{x}_1 \), \( s_1^2 \)).
(2) Construct the pooled-variance t statistic
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \]
where
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]
Question 9
(9.a) The estimated logit model is
\[ \widehat{\Pr}(smoke = 1 \mid age, educ, cigpric, restaurn) = F(1.656 - 0.016\,AGE - 0.111\,EDUC - 0.004\,CIGPRIC - 0.466\,RESTAURN) \]
(9.b) (i) The negative coefficient on AGE implies that the probability that an
individual smokes is lower for older people, holding EDUC, CIGPRIC and
RESTAURN fixed.
(ii) The negative coefficient on EDUC implies that more educated people (more years
of schooling) are less likely to smoke, holding AGE, CIGPRIC and RESTAURN
fixed.
(9.c) The negative coefficient on CIGPRIC implies that the probability that an
individual smokes is lower for people in states where the price of cigarettes including
taxes is higher, holding AGE, EDUC and RESTAURN fixed.
Step 1
\( H_0: \beta_3 = 0 \)
\( H_A: \beta_3 < 0 \)
Step 2
Test statistic is
\[ t = \frac{\hat{\beta}_3}{\text{Std. Error}} \]
The t-statistic has approximately a standard normal distribution when \( n \) is large.
Step 3
\( \alpha = 0.05 \)
Step 4
Decision rule: Reject \( H_0 \) if \( t < -z_{\alpha} = -z_{0.05} = -1.645 \)
Step 5
From the EViews output
\[ t = \frac{\hat{\beta}_3}{\text{Std. Error}} = \frac{-0.0035}{0.0155} = -0.23 \]
Step 6
Since t = -0.23 > -1.645, we fail to reject the null hypothesis.
The data does not support the hypothesis that increasing the cost (higher price
including taxes) of smoking reduces the probability of people smoking. This analysis
suggests that increasing cigarette taxes to discourage smoking may not be an effective
government policy.
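The six steps for (9.c) can be checked with a short Python sketch that uses only the standard library. The coefficient and standard error are the EViews figures quoted in Step 5; the variable names are illustrative.

```python
from statistics import NormalDist

beta_hat = -0.0035   # estimated coefficient on CIGPRIC (from the answer above)
std_error = 0.0155   # its standard error (from the answer above)
alpha = 0.05

z_stat = beta_hat / std_error
critical = NormalDist().inv_cdf(alpha)   # left-tail critical value, about -1.645
reject = z_stat < critical               # left-tailed decision rule

print(f"test statistic = {z_stat:.2f}, critical value = {critical:.3f}, reject H0: {reject}")
```

Because the statistic lies well to the right of the critical value, the sketch agrees with the conclusion above that we fail to reject the null hypothesis.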
(9.d) The probability that Ms. A smokes is
\[ \widehat{\Pr}(smoke = 1 \mid age, educ, cigpric, restaurn) = F(1.656 - 0.016(40) - 0.111(13) - 0.004(60) - 0.466) \]
\[ = F(-1.133) = \frac{1}{1 + e^{-(-1.133)}} = 0.24361 \text{ or } 24.36\% \]
(9.e) The probability that Ms. B smokes is
\[ \widehat{\Pr}(smoke = 1 \mid age, educ, cigpric, restaurn) = F(1.656 - 0.016(40) - 0.111(13) - 0.004(60)) \]
\[ = F(-0.667) = \frac{1}{1 + e^{-(-0.667)}} = 0.33917 \text{ or } 33.92\% \]
(9.f) Yes, because Ms. A has a lower probability of smoking than Ms. B. Note that
they are identical in every way (i.e. same educ, cigpric and age) except the state in
which they live.
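The two predicted probabilities in (9.d) and (9.e) can be reproduced with a small Python sketch. The coefficients are the rounded estimates from (9.a); the function name and the dictionary layout are illustrative assumptions, and RESTAURN is taken to be 1 for Ms. A and 0 for Ms. B, as implied by the calculations above.

```python
from math import exp

# Rounded logit estimates from (9.a).
COEF = {"const": 1.656, "age": -0.016, "educ": -0.111,
        "cigpric": -0.004, "restaurn": -0.466}


def smoking_probability(age, educ, cigpric, restaurn):
    """Logistic function F(z) = 1 / (1 + exp(-z)) applied to the fitted index."""
    index = (COEF["const"] + COEF["age"] * age + COEF["educ"] * educ
             + COEF["cigpric"] * cigpric + COEF["restaurn"] * restaurn)
    return 1 / (1 + exp(-index))


p_a = smoking_probability(age=40, educ=13, cigpric=60, restaurn=1)  # Ms. A
p_b = smoking_probability(age=40, educ=13, cigpric=60, restaurn=0)  # Ms. B
print(f"Ms. A: {p_a:.4f}, Ms. B: {p_b:.4f}, difference: {p_b - p_a:.4f}")
```

The only input that differs between the two calls is `restaurn`, so the gap between the two probabilities is attributable to that variable alone.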
Question 10
(10.a) In 2005Q2, \( t = 122 \) (because \( (2004 - 1975 + 1) \times 4 + 2 = 122 \)) and so
\( \hat{y}_{122} = 4713.07 + 37.1(122) - 582.23 = 8657.0 \), which is $8,657 million.
(10.b) Not enough. From the AR(1) model, we calculate the forecast for the period
2005Q2 as follows:
\[ \widehat{\Delta y}_{2005Q2} = 48.04 - 0.26\,\Delta y_{2005Q1} \]
So to answer this question I need to construct \( \Delta y_t = y_t - y_{t-1} \) for the final period
2005Q1. That is, I need to calculate \( \Delta y_{2005Q1} = y_{2005Q1} - y_{2004Q4} \).
I know \( y_{2005Q1} = 9138 \) from the table. But the table does not provide a value for
\( y_{2004Q4} \). Thus, I cannot calculate \( \Delta y_{2005Q1} \) and subsequently I cannot forecast the
value of \( y_t \) for the period 2005Q2. In short, the table needs to provide a value for
\( y_{2004Q4} \).
Question 11
(11.a) The estimated coefficient on LAND_SIZE equals 0.112. This tells us that a one
square metre increase in the size of the house block is associated with, on average, a
$112 increase in house selling price, holding number of bedrooms and street number
fixed.
(11.b) The coefficient on SN4 equals -50.486. This implies that the average sale price
of houses with the number "4" in their street number is $50,486 less than that of
houses without a "4" in their street number, holding land size and number of
bedrooms fixed. This negative sign is what we would expect, as the number "4" is
supposed to bring bad luck, which reduces demand for such houses and hence lowers
their sale price.
(11.c) The regression model is
\[ Price_i = \beta_0 + \beta_1\, land\_size_i + \beta_2\, bedrooms_i + \beta_3\, SN4 + \beta_4\, SN8 + \varepsilon_i \]
We are testing for "good luck" according to the sales agent's theory, so this is a right one-tailed test. Let \( \beta_4 \) denote the coefficient on SN8. We expect \( \beta_4 > 0 \) because houses with an "8" in their street
number should have a higher sale price than houses without an "8" in their street
number.
Step 1
\( H_0: \beta_4 = 0 \)
\( H_A: \beta_4 > 0 \)
Step 2
The test statistic is the t-statistic
\[ t = \frac{\hat{\beta}_4}{\text{Std. Error}} \]
Step 3. \( \alpha = 0.05 \)
Because it is a one-tail test (and not a two-tail test), we cannot use the p-value
approach. See Week 8 lecture slides. So we will use the critical value approach.
Step 4. Decision Rule. Reject \( H_0 \) if \( t > t_{\alpha, n-k-1} = t_{0.05, 28-4-1} = 1.714 \)
Step 5. The value of the t-statistic from the EViews output is
\[ t = \frac{\hat{\beta}_4}{\text{Std. Error}} = \frac{38.97}{25.895} = 1.51 \]
Step 6. Fail to reject the null hypothesis since t = 1.51 < 1.714, so there is insufficient
evidence in support of the agent's theory here.
(11.d) The hypothesis is that BEDROOMS has an effect on PRICE, so we will
conduct a two-tail test, since it says "an effect" without saying positive effect or
negative effect.
Step 1
\( H_0: \beta_2 = 0 \)
\( H_A: \beta_2 \neq 0 \)
Step 2
The test statistic is the t-statistic
\[ t = \frac{\hat{\beta}_2}{\text{Std. Error}} \]
The critical value can be obtained from the Student t-table.
Step 3. \( \alpha = 0.05 \)
Because we are testing the null hypothesis that \( \beta_2 \) equals zero and it is a two-tail test,
we can use either the critical value approach or the p-value approach. See Week 8
lecture slides.
However, the EViews output does not give us the p-value, so we will use the
critical value approach.
Step 4. Decision Rule. Reject \( H_0 \) if \( |t| > t_{\alpha/2, n-k-1} = t_{0.05/2, 28-4-1} = 2.069 \)
Step 5. We can calculate the value of the t-statistic as
\[ t = \frac{\hat{\beta}_2}{\text{Std. Error}} = \frac{86.39}{6.65} = 12.99 \]
Step 6. Reject the null hypothesis since t = 12.99 > 2.069, so there is sufficient evidence
that BEDROOMS has an effect on PRICE.
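The critical value approach used in (11.c) and (11.d) can be sketched in Python as below. The coefficients, standard errors and t-table critical values are those quoted in the answers; the helper name `t_test_decision` is an illustrative assumption. The critical values are typed in from the t-table rather than computed, because the Python standard library has no Student t quantile function.

```python
def t_test_decision(coef: float, std_error: float, critical: float, tail: str):
    """Return (t statistic, reject H0?) for H0: beta = 0.

    tail = "right" tests H_A: beta > 0; tail = "two" tests H_A: beta != 0.
    """
    t_stat = coef / std_error
    if tail == "right":
        reject = t_stat > critical
    elif tail == "two":
        reject = abs(t_stat) > critical
    else:
        raise ValueError("tail must be 'right' or 'two'")
    return t_stat, reject


# (11.c): SN8, right-tailed, critical value t(0.05, 23) = 1.714 from the t-table.
t_sn8, reject_sn8 = t_test_decision(38.97, 25.895, 1.714, "right")
# (11.d): BEDROOMS, two-tailed, critical value t(0.025, 23) = 2.069 from the t-table.
t_bed, reject_bed = t_test_decision(86.39, 6.65, 2.069, "two")

print(f"SN8: t = {t_sn8:.3f}, reject H0: {reject_sn8}")
print(f"BEDROOMS: t = {t_bed:.3f}, reject H0: {reject_bed}")
```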
(11.e)
Step 1
\( H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0 \)
\( H_A: \) At least one \( \beta_i \) (\( i = 1, 2, 3, 4 \)) is not equal to zero
Step 2
The test statistic is the F-statistic
\[ F = \frac{SSR/k}{SSE/(n-k-1)} \]
The critical value can be obtained from the F-table.
Step 3. \( \alpha = 0.05 \)
Step 4. Decision Rule. Reject \( H_0 \) if \( F > F_{\alpha, k, n-k-1} = F_{0.05, 4, 28-4-1} = 2.8 \)
Step 5. The F-statistic is not given in the output, but we can calculate it using
information from the \( R^2 \). Recall Tutorial 8 (Week 9) question B2. To compute
the F-statistic, we need to know the values of SSR and SSE.
From the EViews output, SSE = 38847.96 (which is the Sum squared resid).
To calculate SSR, we note from the formula sheet that
\[ R^2 = 1 - \frac{SSE}{SS_y} \]
From the EViews output \( R^2 = 0.904 \), so
\[ 0.904 = 1 - \frac{38847.96}{SS_y} \]
After re-arranging, we have
\[ SS_y = \frac{38847.96}{1 - 0.904} = 404666.25 \]
We can now calculate SSR from
\[ SS_y = SSR + SSE \]
\[ 404666.25 = SSR + 38847.96 \]
After re-arranging, we have \( SSR = 365818.29 \).
From the EViews output, n = 28 and k = 4. Now the F-statistic can be calculated as
\[ F = \frac{SSR/k}{SSE/(n-k-1)} = \frac{365818.29/4}{38847.96/23} = 54.15 \]
Step 6. Reject the null hypothesis since F = 54.15 > 2.8, so there is sufficient evidence to
infer that the model is useful.
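The chain of re-arrangements in Step 5 of (11.e) can be checked with a few lines of Python. The inputs are the EViews figures quoted above; the function name is an illustrative choice.

```python
def overall_f_from_r2(sse: float, r_squared: float, n: int, k: int):
    """Recover SS_y and SSR from SSE and R^2, then compute the overall F-statistic."""
    ss_total = sse / (1 - r_squared)   # from R^2 = 1 - SSE / SS_y
    ssr = ss_total - sse               # from SS_y = SSR + SSE
    f_stat = (ssr / k) / (sse / (n - k - 1))
    return ss_total, ssr, f_stat


ss_total, ssr, f_stat = overall_f_from_r2(sse=38847.96, r_squared=0.904, n=28, k=4)
print(f"SS_y = {ss_total:.2f}, SSR = {ssr:.2f}, F = {f_stat:.2f}")
```

The same F value follows directly from \( F = \frac{R^2/k}{(1-R^2)/(n-k-1)} \), which is a useful cross-check when only \( R^2 \) is reported.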
(11.f) The standard error of regression is denoted \( s_\varepsilon \), which is the S.E. of regression
in the EViews output. See Tutorial 7 (Week 8) or Week 7 lecture slides. From the
formula sheet,
\[ s_\varepsilon = \sqrt{\frac{SSE}{n-k-1}} = \sqrt{\frac{38847.96}{28-4-1}} = 41.098 \]
The standard error of regression is $41,098 (as stated in Week 8 lecture slides, \( s_\varepsilon \) is
measured in the units of the dependent variable, and in this case the dependent variable is
measured in thousands of dollars). This means that the standard deviation of the
residuals \( \hat{e}_i \) is $41,098.
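For completeness, the standard error of regression in (11.f) can be verified with a short Python sketch, using the SSE, n and k quoted above. The conversion to dollars relies on the statement in the answer that the dependent variable is measured in thousands of dollars.

```python
from math import sqrt

sse = 38847.96   # Sum squared resid from the EViews output
n, k = 28, 4

s_epsilon = sqrt(sse / (n - k - 1))   # in thousands of dollars
s_epsilon_dollars = s_epsilon * 1000

print(f"standard error of regression = {s_epsilon:.3f} (about ${s_epsilon_dollars:,.0f})")
```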
(11.g)
Step 1
\( H_0: \beta_1 = 0 \)
\( H_A: \beta_1 > 0 \)
Step 2
The test statistic is the t-statistic
\[ t = \frac{\hat{\beta}_1}{\text{Std. Error}} \]
The critical value can be obtained from the Student t-table.
Step 3. \( \alpha = 0.05 \)
We will use the critical value approach.
Step 4. (same as question 11.c above) Decision Rule. Reject \( H_0 \) if \( t > t_{\alpha, n-k-1} = t_{0.05, 28-4-1} = 1.714 \)
Step 5. We cannot calculate the t-statistic because
\[ t = \frac{\hat{\beta}_1}{\text{Std. Error}} = \frac{0.11}{\text{Not Given}} = \text{undefined} \]
"Not enough". To answer this question, we need to know the Std. Error in order to
calculate the t-statistic. Using the t-statistic we can determine whether we reject \( H_0 \) or not.