Statistics & Econometrics Exam Questions and Solutions

Multiple Choice Questions

1. b  2. e  3. d  4. a  5. c  6. c

Question 7

(7.a)
(i) Stratified sampling. The business owners have been stratified by business type. There are 5 strata: stratum 1 = retailers, stratum 2 = agriculture, stratum 3 = manufacturing, stratum 4 = financial services and stratum 5 = advertising. A sample of 6 owners is selected from each stratum.
(ii) Simple random sampling. A random number generator is used to select the business owners from the state Directory of Businesses.

(7.b) Running separate t-tests of each population mean against all the others leads to a build-up of type I errors. Each individual test carries a probability α of rejecting the null hypothesis of equality when it is in fact true (our chosen α is the probability of a type I error). Across these multiple tests, the probability of wrongly rejecting the null hypothesis that all the population means are equal is therefore much higher than α.

(7.c) The F-statistic is
\[ F = \frac{SSR/k}{SSE/(n-k-1)} = \frac{9600/5}{2400/(40-5-1)} = \frac{1920}{70.59} = 27.2 \]

(7.d) True. When every observation lies on the regression line, \( y_i = \hat{y}_i \), and so the sum of squares for error is \( SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = 0 \). This means that the standard error of estimate is \( s_\varepsilon = \sqrt{SSE/(n-k-1)} = 0 \) and thus \( R^2 = 1 - SSE/SS_y = 1 \).

(7.e)
(i) Since \( n_A \) and \( n_B \) are both less than 10, the Wilcoxon rank sum test statistic is \( T = T_A \). It is a one-tailed test with \( \alpha = 0.025 \). Thus the decision rule is: reject \( H_0 \) if \( T = T_A \le T_L = 12 \).
(ii) Since \( n_A \) and \( n_B \) are both more than 10, the Wilcoxon rank sum test uses the standardised statistic z. It is a two-tailed test with \( \alpha = 0.05 \). Thus the decision rule is: reject \( H_0 \) if \( |z| > z_{\alpha/2} = z_{0.025} = 1.96 \).

(7.f) We can remove the variation that we are not interested in. For example, in Week 4 we asked whether the new design tyres last longer, on average, than the existing tyres.
Independent samples can lead to more variability in the outcome, because the different drivers (for the new and the existing tyres) may drive in different ways. That is, some drivers drive in a way that extends the life of the tyre, while others drive faster and brake harder, resulting in shorter tyre lives. However, we are not interested in the variation among drivers; that is, we are not interested in knowing whether drivers actually differ in the way they drive. We can eliminate that source of variation by designing a matched pairs experiment, in which the same drivers and cars are used for both the new and the existing tyre samples. This makes it easier to determine whether tyre design represents a real source of variation, and thus whether one tyre design is superior to the other.

(7.g)
(i) \( \hat{p} = 0.55 \) and the sampling error is \( e = 0.03 \). So the 95% confidence interval is
\[ \hat{p} \pm e = 0.55 \pm 0.03 \]
The lower confidence limit is 0.52 (0.55 minus 0.03) and the upper confidence limit is 0.58 (0.55 plus 0.03).
(ii) The confidence interval (0.52, 0.58) lies entirely above 50%, so it is reasonable to say that more than 50% of the population thinks their weight is about right.

Question 8

(8.a) The test uses the pooled-variance t-statistic. This test is appropriate for quantitative data (i.e. total spending in dollars on hair care products), independent samples (i.e. 210 adults living in Melbourne and 250 adults living in Sydney) and population variances that are unknown but assumed to be equal.

(8.b) You could construct a histogram of the data in each sample and observe the pattern. If the histogram reveals a roughly bell-shaped and symmetrical distribution, then we can be reasonably confident that the data are approximately normally distributed.

(8.c) Let \( \mu_1 \) be the population mean for the spending on hair care products in Melbourne. Let \( \mu_2 \) be the population mean for the spending on hair care products in Sydney.
\( H_0: \mu_1 - \mu_2 = 0 \)
\( H_A: \mu_1 - \mu_2 < 0 \)

(8.d) Pooled-variance t-statistic
(1) Construct the sample means and sample variances for Sydney, \( \bar{x}_2, s_2^2 \), and Melbourne, \( \bar{x}_1, s_1^2 \).
(2) Construct the pooled-variance t-statistic
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \]
where
\[ s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \]

Question 9

(9.a) The estimated logit model is
\[ \widehat{\Pr}(smokes = 1 \mid age, educ, cigpric, restaurn) = F(1.656 - 0.016\,AGE - 0.111\,EDUC - 0.004\,CIGPRIC - 0.466\,RESTAURN) \]

(9.b)
(i) The negative coefficient on AGE implies that the probability that an individual smokes is lower for older people, holding EDUC, CIGPRIC and RESTAURN fixed.
(ii) The negative coefficient on EDUC implies that more educated people (more years of schooling) are less likely to smoke, holding AGE, CIGPRIC and RESTAURN fixed.

(9.c) The negative coefficient on CIGPRIC implies that the probability that an individual smokes is lower for people in states where the price of cigarettes (including taxes) is higher, holding AGE, EDUC and RESTAURN fixed.
Step 1. \( H_0: \beta_3 = 0 \), \( H_A: \beta_3 < 0 \)
Step 2. The test statistic is
\[ t = \frac{\hat{\beta}_3}{\text{Std. Error}} \]
The t-statistic has a standard normal distribution when n is large.
Step 3. \( \alpha = 0.05 \)
Step 4. Decision rule: reject \( H_0 \) if \( t < -z_{\alpha} = -z_{0.05} = -1.645 \).
Step 5. From the EViews output,
\[ t = \frac{\hat{\beta}_3}{\text{Std. Error}} = \frac{-0.0035}{0.0155} = -0.23 \]
Step 6. Since t = -0.23 > -1.645, we fail to reject the null hypothesis. The data do not support the hypothesis that increasing the cost of smoking (a higher price including taxes) reduces the probability of people smoking. This analysis suggests that increasing cigarette taxes to discourage smoking may not be an effective government policy.

(9.d) The probability that Ms. A smokes is
\[ \widehat{\Pr}(smokes = 1 \mid age, educ, cigpric, restaurn) = F(1.656 - 0.016(40) - 0.111(13) - 0.004(60) - 0.466) = F(-1.133) = \frac{1}{1 + e^{-(-1.133)}} = 0.24361 \]
or 24.36%.

(9.e) The probability that Ms. B smokes is
\[ \widehat{\Pr}(smokes = 1 \mid age, educ, cigpric, restaurn) = F(1.656 - 0.016(40) - 0.111(13) - 0.004(60)) = F(-0.667) = \frac{1}{1 + e^{-(-0.667)}} = 0.33917 \]
or 33.92%.

(9.f) Yes, because Ms. A has a lower probability of smoking than Ms. B. Note that they are identical in every way (i.e. same educ, cigpric and age) except for the state in which they live.

Question 10

(10.a) In 2005Q2, t = 122 (because (2004 - 1975 + 1) × 4 + 2 = 122) and so
\[ \hat{y}_{122} = 4713.07 + 37.1(122) - 582.23 = 8657.0 \]
which is $8,657 million.

(10.b) Not enough. From the AR(1) model, we calculate the forecast for the period 2005Q2 as follows:
\[ \widehat{\Delta y}_{2005Q2} = 48.04 - 0.26\,\Delta y_{2005Q1} \]
So to answer this question we need to construct \( \Delta y_t = y_t - y_{t-1} \) for the final period 2005Q1. That is, we need to calculate \( \Delta y_{2005Q1} = y_{2005Q1} - y_{2004Q4} \). We know \( y_{2005Q1} = 9138 \) from the table, but the table does not provide a value for \( y_{2004Q4} \). Thus we cannot calculate \( \Delta y_{2005Q1} \), and consequently we cannot forecast the value of \( y_t \) for the period 2005Q2. In short, the table needs to provide a value for \( y_{2004Q4} \).

Question 11

(11.a) The estimated coefficient on LAND_SIZE equals 0.112. This tells us that a one square metre increase in the size of the house block is associated with, on average, a $112 increase in the house selling price, holding the number of bedrooms and the street-number variables fixed.

(11.b) The coefficient on SN4 equals -50.486. This implies that the average sale price of houses with the number "4" in their street number is $50,486 less than that of houses without a "4" in their street number, holding land size and number of bedrooms fixed.
This negative sign is what we would expect, as the number "4" is supposed to bring bad luck, reducing demand for such houses and hence lowering the sale price.

(11.c) The regression model is
\[ Price_i = \beta_0 + \beta_1\,land\_size_i + \beta_2\,bedrooms_i + \beta_3\,SN4_i + \beta_4\,SN8_i + \varepsilon_i \]
We are testing for "good luck" according to the sales agent's theory, so this is a right-tailed (one-tail) test. Let \( \beta_4 \) denote the coefficient on SN8. We expect \( \beta_4 > 0 \) because, under this theory, a house with an "8" in its street number sells for more than a house without an "8".
Step 1. \( H_0: \beta_4 = 0 \), \( H_A: \beta_4 > 0 \)
Step 2. The test statistic is the t-statistic
\[ t = \frac{\hat{\beta}_4}{\text{Std. Error}} \]
Step 3. \( \alpha = 0.05 \)
Because this is a one-tail test rather than a two-tail test, the two-tail p-value approach from the Week 8 lecture slides does not apply directly, so we use the critical value approach.
Step 4. Decision rule: reject \( H_0 \) if \( t > t_{\alpha, n-k-1} = t_{0.05, 28-4-1} = 1.714 \).
Step 5. The value of the t-statistic from the EViews output is
\[ t = \frac{\hat{\beta}_4}{\text{Std. Error}} = \frac{38.97}{25.895} = 1.51 \]
Step 6. Fail to reject the null hypothesis since t = 1.51 < 1.714, so there is insufficient evidence in support of the agent's theory here.

(11.d) The hypothesis is that BEDROOMS has an effect on PRICE, so we conduct a two-tail test, since the question says "an effect" without specifying a positive or a negative effect.
Step 1. \( H_0: \beta_2 = 0 \), \( H_A: \beta_2 \ne 0 \)
Step 2. The test statistic is the t-statistic
\[ t = \frac{\hat{\beta}_2}{\text{Std. Error}} \]
The critical value can be obtained from the Student t-table.
Step 3. \( \alpha = 0.05 \)
Because we are testing the null hypothesis that \( \beta_2 \) equals zero and it is a two-tail test, we can use either the critical value approach or the p-value approach (see the Week 8 lecture slides). However, the EViews output does not give us the p-value, so we use the critical value approach.
Step 4. Decision rule: reject \( H_0 \) if \( |t| > t_{\alpha/2, n-k-1} = t_{0.05/2, 28-4-1} = 2.069 \).
Step 5. We can calculate the value of the t-statistic as
\[ t = \frac{\hat{\beta}_2}{\text{Std. Error}} = \frac{86.39}{6.65} = 12.99 \]
Step 6. Reject the null hypothesis since t = 12.99 > 2.069, so there is sufficient evidence that BEDROOMS has an effect on PRICE.

(11.e)
Step 1. \( H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0 \), \( H_A \): at least one \( \beta_i \) (i = 1, 2, 3, 4) is not equal to zero.
Step 2. The test statistic is the F-statistic
\[ F = \frac{SSR/k}{SSE/(n-k-1)} \]
The critical value can be obtained from the F-table.
Step 3. \( \alpha = 0.05 \)
Step 4. Decision rule: reject \( H_0 \) if \( F > F_{\alpha, k, n-k-1} = F_{0.05, 4, 28-4-1} = 2.8 \).
Step 5. The F-statistic is not given, but we can calculate it using information from \( R^2 \) (recall Tutorial 8 (Week 9), question B2). To compute the F-statistic, we need to know the values of SSR and SSE. From the EViews output, SSE = 38847.96 (the "Sum squared resid"). To calculate SSR, we note from the formula sheet that
\[ R^2 = 1 - \frac{SSE}{SS_y} \]
From the EViews output, \( R^2 = 0.904 \), so
\[ 0.904 = 1 - \frac{38847.96}{SS_y} \]
After rearranging, we have
\[ SS_y = \frac{38847.96}{1 - 0.904} = 404666.25 \]
We can now calculate SSR from
\[ SS_y = SSR + SSE \]
\[ 404666.25 = SSR + 38847.96 \]
After rearranging, we have SSR = 365818.29. From the EViews output, n = 28 and k = 4. Now the F-statistic can be calculated as
\[ F = \frac{SSR/k}{SSE/(n-k-1)} = \frac{365818.29/4}{38847.96/23} = 54.15 \]
Step 6. Reject the null hypothesis since F = 54.15 > 2.8, so there is sufficient evidence to infer that the model is useful.

(11.f) The standard error of regression is denoted \( s_\varepsilon \), which is the "S.E. of regression" in the EViews output (see Tutorial 7 (Week 8) or the Week 7 lecture slides). From the formula sheet,
\[ s_\varepsilon = \sqrt{\frac{SSE}{n-k-1}} = \sqrt{\frac{38847.96}{28-4-1}} = 41.098 \]
The standard error of regression is $41,098 (as stated in the Week 8 lecture slides, \( s_\varepsilon \) is measured in the units of the dependent variable, and in this case the dependent variable is measured in thousands of dollars). This means that the standard deviation of the residuals \( \hat{e}_i \) is $41,098.
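As a numerical check on (11.e) and (11.f), the arithmetic can be sketched in Python. This is only an illustration under stated assumptions: the function name `f_and_se_from_r2` is hypothetical, the inputs are the rounded values quoted from the EViews output, and the critical value 2.8 is the table value used in the solution rather than something the code computes.

```python
import math


def f_and_se_from_r2(sse, r2, n, k):
    """Recover SS_y, SSR, the overall F-statistic and s_eps from SSE and R^2.

    Hypothetical helper for illustration; not part of the exam material.
    """
    df_error = n - k - 1
    ss_y = sse / (1.0 - r2)                 # from R^2 = 1 - SSE / SS_y
    ssr = ss_y - sse                        # from SS_y = SSR + SSE
    f_stat = (ssr / k) / (sse / df_error)   # F = (SSR/k) / (SSE/(n-k-1))
    s_eps = math.sqrt(sse / df_error)       # standard error of regression
    return ss_y, ssr, f_stat, s_eps


# Values quoted in (11.e) and (11.f)
ss_y, ssr, f_stat, s_eps = f_and_se_from_r2(sse=38847.96, r2=0.904, n=28, k=4)

F_CRITICAL = 2.8  # F(0.05; 4, 23) as read from the F-table in the solution

print("SS_y  =", round(ss_y, 2))
print("SSR   =", round(ssr, 2))
print("F     =", round(f_stat, 2))
print("s_eps =", round(s_eps, 3))
print("Reject H0" if f_stat > F_CRITICAL else "Fail to reject H0")
```

The printed values should agree with the hand calculation above (SS_y = 404666.25, SSR = 365818.29, F = 54.15, s_ε = 41.098), and the decision is to reject the null hypothesis.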
(11.g)
Step 1. \( H_0: \beta_1 = 0 \), \( H_A: \beta_1 > 0 \)
Step 2. The test statistic is the t-statistic
\[ t = \frac{\hat{\beta}_1}{\text{Std. Error}} \]
The critical value can be obtained from the Student t-table.
Step 3. \( \alpha = 0.05 \)
We will use the critical value approach.
Step 4. Decision rule (same as in question 11.c above): reject \( H_0 \) if \( t > t_{\alpha, n-k-1} = t_{0.05, 28-4-1} = 1.714 \).
Step 5. We cannot calculate the t-statistic because
\[ t = \frac{\hat{\beta}_1}{\text{Std. Error}} = \frac{0.11}{\text{Not Given}} = \text{unknown} \]
"Not enough". To answer this question, we need to know the Std. Error in order to calculate the t-statistic. Using the t-statistic we could then determine whether or not to reject \( H_0 \).
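The coefficient tests in (11.c), (11.d) and (11.g) all follow the same critical-value recipe, which can be sketched as a small Python helper. The function name `coefficient_t_test` and its arguments are hypothetical, and the critical values are simply the table values quoted in the solutions; passing `None` for the standard error is an assumed convention that mirrors the "Not enough" situation in (11.g).

```python
def coefficient_t_test(coef, std_error, critical_value, tail):
    """Critical-value t-test of H0: beta = 0 (hypothetical helper).

    tail is "right" (HA: beta > 0), "left" (HA: beta < 0) or "two" (HA: beta != 0).
    critical_value is the positive table value for the chosen alpha and tail.
    Returns (t, reject); both are None when the standard error is unavailable.
    """
    if std_error is None:
        return None, None  # as in (11.g): the test cannot be carried out
    t = coef / std_error
    if tail == "right":
        reject = t > critical_value
    elif tail == "left":
        reject = t < -critical_value
    elif tail == "two":
        reject = abs(t) > critical_value
    else:
        raise ValueError("tail must be 'right', 'left' or 'two'")
    return t, reject


# (11.c) SN8, right-tail test, t(0.05; 23) = 1.714
t_sn8, reject_sn8 = coefficient_t_test(38.97, 25.895, 1.714, "right")
# (11.d) BEDROOMS, two-tail test, t(0.025; 23) = 2.069
t_bed, reject_bed = coefficient_t_test(86.39, 6.65, 2.069, "two")
# (11.g) LAND_SIZE, right-tail test, standard error not given
t_land, reject_land = coefficient_t_test(0.11, None, 1.714, "right")

for label, t, reject in [("SN8", t_sn8, reject_sn8),
                         ("BEDROOMS", t_bed, reject_bed),
                         ("LAND_SIZE", t_land, reject_land)]:
    if t is None:
        print(label, ": not enough information (Std. Error missing)")
    else:
        print(label, ": t =", round(t, 3), "->",
              "reject H0" if reject else "fail to reject H0")
```

With the rounded inputs shown in (11.c), the computed t is about 1.50, marginally below the 1.51 quoted from the EViews output; the decision (fail to reject) is unchanged.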

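The logit probabilities in (9.d) and (9.e) can also be reproduced numerically. The sketch below is a minimal illustration using the rounded coefficients from (9.a); the names `COEF`, `logistic` and `smoking_probability` are hypothetical, and RESTAURN is assumed to be coded 1 for Ms. A and 0 for Ms. B, as the two calculations imply.

```python
import math

# Rounded coefficients from the estimated logit model in (9.a)
COEF = {
    "const": 1.656,
    "age": -0.016,
    "educ": -0.111,
    "cigpric": -0.004,
    "restaurn": -0.466,
}


def logistic(z):
    """Logistic CDF F(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def smoking_probability(age, educ, cigpric, restaurn):
    """Return the index z and the predicted probability F(z)."""
    z = (COEF["const"]
         + COEF["age"] * age
         + COEF["educ"] * educ
         + COEF["cigpric"] * cigpric
         + COEF["restaurn"] * restaurn)
    return z, logistic(z)


# Ms. A and Ms. B differ only in RESTAURN (assumed 1 for A, 0 for B)
z_a, p_a = smoking_probability(age=40, educ=13, cigpric=60, restaurn=1)
z_b, p_b = smoking_probability(age=40, educ=13, cigpric=60, restaurn=0)

print("Ms. A: z =", round(z_a, 3), " P(smokes) =", round(p_a, 5))
print("Ms. B: z =", round(z_b, 3), " P(smokes) =", round(p_b, 5))
```

The results should match the hand calculations in (9.d) and (9.e): about 0.24361 for Ms. A and 0.33917 for Ms. B, so Ms. A has the lower predicted probability of smoking, as stated in (9.f).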