
MGCR 271 Business Statistics Final Exam

MGCR 271
Winter 2011
Business Statistics
MGCR 271: Business Statistics
Final Exam Solutions
April 2011
Examiner:
Ramnath Vaidyanathan
Assoc Examiner:
Student Name
McGill ID
INSTRUCTIONS
1. Please write your NAME and STUDENT NUMBER on the exam paper.
2. This examination is PRINTED ON BOTH SIDES of the paper.
3. This examination consists of a total of 10 QUESTIONS.
4. This examination consists of a total of 14 pages, including the cover page.
5. SPACE IS PROVIDED on the examination to answer all questions.
6. MARKS allotted to each question appear next to question numbers.
7. This is a CLOSED BOOK examination and counts for 50% of your final grade.
8. TABLES and SELECTED FORMULAE are handed out separately.
9. You are permitted one double-sided CRIB SHEET, printed or hand-written.
10. Show LOGIC since part marks will be given for method and partial results.
11. If you feel that a question is ambiguous, briefly explain your interpretation and state your assumptions.
12. You are permitted TRANSLATION dictionaries ONLY.
13. STANDARD CALCULATOR permitted ONLY.
14. This examination and the tables provided MUST BE RETURNED.
15. GOOD LUCK!
Page 1 of 14
1. (10 points) The federal government is putting pressure on hospitals to shorten the average length of
stay of patients. A random sample of 61 hospitals in one state had a mean length of stay in 1995 of 4.3
days, with a standard deviation of 2.1 days.
(a) (1 point) Use this information to construct a 80% confidence interval to estimate the mean length
of stay of all hospitals in this particular state.
The 80% confidence interval is $\bar{x} \pm z^* \times \frac{s}{\sqrt{n}}$. From the table, we get $z^* = 1.282$. Substituting for $z^*$, we get $4.3 \pm 1.282 \times \frac{2.1}{\sqrt{61}} = 4.3 \pm 0.345$, so the confidence interval is (3.96, 4.64).
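The interval in part (a) can be checked numerically. The sketch below is only an illustration: the helper name `z_interval` is made up for this example, and $z^* = 1.282$ is hard-coded from the table rather than computed.

```python
import math

def z_interval(mean, sd, n, z_star):
    """Return (lower, upper) for mean ± z* · sd / sqrt(n)."""
    moe = z_star * sd / math.sqrt(n)  # margin of error
    return mean - moe, mean + moe

# Part (a): sample mean 4.3, sample sd 2.1, n = 61, 80% table value 1.282.
lower, upper = z_interval(4.3, 2.1, 61, 1.282)
print(f"80% CI: ({lower:.3f}, {upper:.3f})")
```

Calling `z_interval(4.3, 2.1, 135, 1.282)` with the larger sample from part (c) gives a narrower interval around the same mean.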
(b) (2 points) If you picked another random sample of 61 hospitals and constructed a confidence
interval using the margin of error in (a) would you expect that there is an 80% probability that
the mean length of stay for all the hospitals is within this interval calculated using the new
sample mean? If not, then what would this probability be?
No. The Confidence Interval calculated is for the mean length of stay for all hospitals in the state
and not for a sample. Hence, I would not expect 80% of the hospitals to have a length of stay
within the interval calculated.
(c) (2 points) If the sample had included 135 hospitals (and all else turned out to be the same), how
would the confidence interval have been affected? Explain and write down the new confidence
interval?
The sample size has increased from n = 61 to n = 135. Hence the width of the confidence interval would decrease. The new margin of error is $\text{moe} = z^* \times \frac{s}{\sqrt{n}} = 1.282 \times \frac{2.1}{\sqrt{135}} = 0.232$. Hence, the new confidence interval is $4.3 \pm 0.232$, which gives us (4.07, 4.53).
(d) (2 points) If you wanted the margin of error for the confidence interval calculated in (c) to be the
same as that calculated in (a), what confidence level would you choose for the interval in (c).
The margin of error calculated in (a) is 0.345. The margin of error in (c) can be expressed as $z^* \times \frac{2.1}{\sqrt{135}}$. For the two to be equal, we need $z^* \times \frac{2.1}{\sqrt{135}} = 0.345$. Hence, we have $z^* = \frac{0.345 \times \sqrt{135}}{2.1} = 1.91$. From the table, we get that the corresponding confidence level is given by C = 94.39%.
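As a rough check of part (d), the required $z^*$ and the matching two-sided confidence level can be computed with the standard library. This is a sketch under the assumption that the normal table lookup is replaced by `statistics.NormalDist`; the variable names are illustrative, and the result may differ from the table-based answer in the last digit.

```python
import math
from statistics import NormalDist

s = 2.1             # sample standard deviation
moe_target = 0.345  # margin of error from part (a)
n_new = 135         # sample size from part (c)

# Solve z* · s / sqrt(n) = moe for z*.
z_needed = moe_target * math.sqrt(n_new) / s

# A two-sided interval with critical value z* covers 2·Φ(z*) − 1.
confidence = 2 * NormalDist().cdf(z_needed) - 1
print(f"z* = {z_needed:.2f}, confidence level = {confidence:.2%}")
```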
(e) (3 points) Suppose that µ is the mean length of stay across all hospitals. You randomly sample a
group of 25 hospitals and find that the sample mean is x̄ = 3.6. Using a two-tailed test, you find
that you can reject the null hypothesis H0 : µ = 3.9 at the 5% significance level. What can you
conclude about the standard deviation of the sample.
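The null hypothesis $H_0: \mu = 3.9$ is rejected at the 5% level if $|z| = \left|\frac{\bar{x} - \mu_0}{s/\sqrt{n}}\right| > z_{0.025}$. In other words,
$$\left|\frac{3.6 - 3.9}{s/\sqrt{25}}\right| > 1.96 \;\Rightarrow\; \frac{1}{s} > \frac{1.96}{0.3 \times \sqrt{25}} \;\Rightarrow\; s < \frac{0.3 \times \sqrt{25}}{1.96} = 0.765$$
Hence the standard deviation of the sample must be less than 0.765.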
2. (10 points) Is Caffeine Dependence Real? Our subjects are 11 people diagnosed as being dependent on
caffeine. Each subject was barred from coffee, colas, and other substances containing caffeine. Instead
they took capsules containing their normal caffeine intake. During a different time period, they took
placebo capsules. The order in which the subjects took caffeine and the placebo was randomized. The
subjects were asked to press a button 200 times as quickly as possible both while deprived of caffeine
(placebo) and while not deprived of caffeine (caffeine capsule containing normal caffeine intake).
Their heart rate was recorded in beats per minute. Examine the differences in beats per minute with
and without caffeine. Does caffeine lead to a rise in a person’s heart rate?
Heart Rate (in beats per minute)
Subject    Caffeine    Placebo    Diff
1          281         201          80
2          284         262          22
3          300         283          17
4          421         290         131
5          240         259         -19
6          294         291           3
7          377         354          23
8          345         346          -1
9          303         283          20
10         340         391         -51
11         408         411          -3
Average    326.6       306.5       20.2
Stdev      56.8        62.3        48.7
(a) (1 point) Clearly state the hypothesis described in the problem using mathematical notation only.
• H0 : µd ≤ 0
• Ha : µd > 0 (caffeine leads to a rise in a person’s heart rate)
(b) (2 points) How would you test the hypothesis stated in (a)? Explain your choice of test.
• (1 point) We would use a Matched Pairs t-test
• (1 point) We choose this test since the two samples are matched (they contain the same set of people)
(c) (3 points) Compute the values of the standard error and the test score.
• (1 point) Standard Error: $se = \frac{s_d}{\sqrt{n}} = \frac{48.7}{\sqrt{11}} = 14.684$
• (2 points) Test Score: $t = \frac{\bar{x}_d - d_0}{se} = \frac{20.2 - 0}{14.684} = 1.376$
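The standard error and test score can also be recomputed from the raw heart-rate data. The sketch below assumes that subjects 10 and 11 have (Caffeine, Placebo) readings of (340, 391) and (408, 411), which is the only pairing consistent with the listed differences of -51 and -3. Because it works from unrounded values, the result differs slightly from the hand calculation above.

```python
import math
import statistics

# Heart rates for the 11 subjects, in the order of the table.
caffeine = [281, 284, 300, 421, 240, 294, 377, 345, 303, 340, 408]
placebo = [201, 262, 283, 290, 259, 291, 354, 346, 283, 391, 411]

# Matched pairs: analyse the per-subject differences.
diffs = [c - p for c, p in zip(caffeine, placebo)]
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)         # sample standard deviation (n - 1)
se = sd_d / math.sqrt(len(diffs))      # standard error of the mean difference
t_stat = (mean_d - 0) / se             # test score against d0 = 0

print(f"mean = {mean_d:.1f}, sd = {sd_d:.1f}, se = {se:.3f}, t = {t_stat:.3f}")
```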
(d) (2 points) Compute the p-value of the test statistic and state your conclusions.
• (1 point) p = 0.099
• (1 point) Comparing the p-value with the chosen value of α, we can make the appropriate conclusions (reject H0 if p < α and fail to reject H0 if p > α). For example, at α = 5% we fail to reject H0.
(e) (2 points) Compute a 90% confidence interval for the increase in heart rate caused by consumption of caffeine.
• (1 point) 90% confidence interval is given by $[L, U] = \bar{x}_d \pm t^*_{0.05} \times se$, where $t^*_{0.05} = 1.812$ is the critical value of the t distribution with 10 degrees of freedom.
• (1 point) Plugging the values, we get $[L, U] = 20.2 \pm 1.812 \times 14.684 = [-6.41, 46.81]$
3. (10 points) The output below is from a multiple regression analysis of the annual sales (SALES, measured in dollars) of 25 specialty-gift stores, predicted from the store’s location (coded by a dummy
variable: MALL=1 if the store is in a shopping mall; MALL=0 if the store is not in a shopping mall),
and a measure of the store’s number of annual customers (CUSTOMERS).
Variable       Estimate    Std. Error    t Stat    p-Value
(Intercept)    -36589      82957         -0.44
CUSTOMERS      10.33                     2.30
MALL           209475      77012
(a) (2 points) Describe the meaning of the coefficient for MALL in the regression output.
The coefficient for MALL, 209475, implies that a store located in a mall has, on average, additional annual sales of $209,475 as compared to a store not located in a mall having the same number of customers.
(b) (3 points) Test the hypothesis that being located in a mall positively impacts the annual sales of
a store. Clearly state the hypothesis (mathematically), test score, p-value and conclusions of the
test.
• $H_0: \beta_{mall} \le 0$, $H_a: \beta_{mall} > 0$
• $t = \frac{209475}{77012} = 2.72$
• $0.005 < p < 0.01$
• Conclusion: There is sufficient evidence that being located in a mall positively impacts annual sales ($\alpha = 5\%$)
(c) (3 points) Determine a 95% confidence interval for the additional sales associated with a customer, for stores located at malls.
• Additional sales associated with a customer at any store is given by $\beta_{customer}$
• 95% Confidence Interval for $\beta_{customer}$ = $b_{customer} \pm t^* \times se(b_{customer}) = 10.33 \pm 2.064 \times 4.49$, where $se(b_{customer}) = \frac{10.33}{2.30} = 4.49$
(d) (2 points) Suppose that Store A is located in a Mall while Store B is not. If both stores have the
same average sales, what can you say about the difference in number of customers visiting the
two stores.
For Stores A and B to have the same average sales, the difference in number of customers d should be such that $10.33 \times d = 209475$, which implies $d = 2.028 \times 10^4$. That is, Store B needs about 20,278 more customers than Store A.
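The missing entries of the regression output and the break-even customer difference follow from simple ratios. A minimal sketch, using only the numbers printed in the output (variable names are illustrative):

```python
# Values read from the (edited) regression output.
b_customers, t_customers = 10.33, 2.30
b_mall, se_mall = 209475, 77012

# t = estimate / std. error, so either missing entry can be recovered.
se_customers = b_customers / t_customers  # missing Std. Error for CUSTOMERS
t_mall = b_mall / se_mall                 # missing t Stat for MALL

# Part (d): extra customers a non-mall store needs to offset the MALL effect.
extra_customers = b_mall / b_customers

print(f"se(CUSTOMERS) = {se_customers:.2f}, t(MALL) = {t_mall:.2f}")
print(f"extra customers = {extra_customers:.0f}")
```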
4. (10 points) Consider a random variable X that takes on one of two values 0 and 1 such that $\Pr(X = 0) = p$ and $\Pr(X = 1) = 1 - p$. Let us suppose that we wish to test the null hypothesis $H_0: p = \frac{3}{4}$ against the alternate hypothesis $H_a: p = \frac{1}{4}$. We take four independent observations $X_1$, $X_2$, $X_3$ and $X_4$ of this random variable and set the following decision rule: “Reject $H_0$ if $X_1 + X_2 + X_3 + X_4 \ge 2$” and “Do Not Reject $H_0$ if $X_1 + X_2 + X_3 + X_4 < 2$”.
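(a) (3 points) Determine the probability of a Type I Error
• Type I Error = Pr(Rej $H_0$ | $H_0$ is TRUE) = $\Pr\left(X_1 + X_2 + X_3 + X_4 \ge 2 \mid \Pr(X_i = 1) = \frac{1}{4}\right)$, since under $H_0$ each $X_i$ equals 1 with probability $1 - p = \frac{1}{4}$
• $\Pr\left(X_1 + X_2 + X_3 + X_4 \ge 2 \mid \Pr(X_i = 1) = \frac{1}{4}\right) = 1 - \Pr(X_1 + X_2 + X_3 + X_4 = 0) - \Pr(X_1 + X_2 + X_3 + X_4 = 1)$
• This can be simplified as $1 - \left(\frac{3}{4}\right)^4 - \binom{4}{1}\left(\frac{1}{4}\right)\left(\frac{3}{4}\right)^3 = \frac{67}{256} = 0.262$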
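(b) (2 points) Determine the Type II Error
• Type II Error = Pr(Dont Rej $H_0$ | $H_a$ is TRUE) = $\Pr\left(X_1 + X_2 + X_3 + X_4 < 2 \mid \Pr(X_i = 1) = \frac{3}{4}\right)$, since under $H_a$ each $X_i$ equals 1 with probability $1 - p = \frac{3}{4}$
• $\Pr\left(X_1 + X_2 + X_3 + X_4 < 2 \mid \Pr(X_i = 1) = \frac{3}{4}\right) = \Pr(X_1 + X_2 + X_3 + X_4 = 0) + \Pr(X_1 + X_2 + X_3 + X_4 = 1)$
• This can be simplified as $\left(\frac{1}{4}\right)^4 + \binom{4}{1}\left(\frac{3}{4}\right)\left(\frac{1}{4}\right)^3 = \frac{13}{256} = 0.051$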
(c) (5 points) Suppose instead of only four independent observations, we take 100 independent observations $X_1, X_2, \ldots, X_{100}$ of this random variable, and use the decision rule “Reject $H_0$ if $X_1 + X_2 + \ldots + X_{100} \ge 30$”, then calculate the power of the test. (Hint. Use the fact that $X_1 + X_2 + \ldots + X_n$ follows a normal distribution with mean $\mu = np$ and variance $\sigma^2 = np(1 - p)$, where p is the probability of any of the $X_i$'s being equal to 1)
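• (1 point) Type II Error = Pr(Dont Rej $H_0$ | $H_0$ is FALSE) = $\Pr\left(X_1 + X_2 + \ldots + X_{100} < 30 \mid \Pr(X_i = 1) = \frac{3}{4}\right)$
• (3 points) Normal approximation to the binomial: $X_1 + X_2 + \ldots + X_n \sim N\left(nq, \sqrt{nq(1 - q)}\right)$, where $q = \Pr(X_i = 1)$
• (1 point) Type II Error = $\Pr\left(z < \frac{30 - 100 \times 0.75}{\sqrt{100 \times 0.75 \times (1 - 0.75)}}\right)$
• $z = -10.39$, so the Type II Error is approximately 0 and the power of the test is approximately 1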
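The error probabilities in this question can be checked by direct enumeration. The sketch below assumes the setup in the problem statement, so that each $X_i$ equals 1 with probability $1 - p$: that is 1/4 under $H_0$ and 3/4 under $H_a$. The helper names are illustrative, and the power for n = 100 uses the normal approximation suggested in the hint.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(k, n, q):
    """Pr(exactly k ones in n trials) when each trial is 1 with probability q."""
    return comb(n, k) * q**k * (1 - q) ** (n - k)

def prob_reject(n, cutoff, q):
    """Pr(X1 + ... + Xn >= cutoff), the probability that the rule rejects H0."""
    return sum(binom_pmf(k, n, q) for k in range(cutoff, n + 1))

q_null, q_alt = 0.25, 0.75  # Pr(Xi = 1) under H0 and under Ha

type_1 = prob_reject(4, 2, q_null)     # reject although H0 is true
type_2 = 1 - prob_reject(4, 2, q_alt)  # fail to reject although Ha is true

def normal_power(n, cutoff, q):
    """Approximate Pr(sum >= cutoff) using N(nq, sqrt(nq(1 - q)))."""
    dist = NormalDist(n * q, sqrt(n * q * (1 - q)))
    return 1 - dist.cdf(cutoff)

power_100 = normal_power(100, 30, q_alt)
print(f"Type I = {type_1:.3f}, Type II = {type_2:.3f}, power (n=100) = {power_100:.3f}")
```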
5. (10 Points) Bon Air Elementary School has 300 students. The principal of the school thinks that the
average IQ of students at Bon Air is at least 110. To prove her point, she administers an IQ test to
20 randomly selected students. Among the sampled students, the average IQ is 108 with a standard
deviation of 10. Based on these results, should the principal accept or reject her original hypothesis?
Assume a significance level of 0.01.
(a) (2 points) State the null and alternative hypotheses only using letters and mathematical symbols.
• H0 : µ ≤ 110
• Ha : µ > 110
(b) (2 points) Identify the population, sample, parameter and statistic for the hypothesis being
tested.
• Population: Students from Bon Air
• Sample: Sample of Students from Bon Air
• Parameter: Mean IQ of the Population
• Statistic: Mean IQ of the Sample
(c) (3 points) Test the hypothesis using appropriate methodology and state your conclusions in the
context of this problem.
• Use a 1-sample t-test
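A minimal sketch of the 1-sample t-test for this problem follows. It assumes the upper-tailed alternative stated in (a), and it hard-codes the critical value 2.539 for 19 degrees of freedom at the 0.01 level as a table lookup; the variable names are illustrative.

```python
import math

x_bar, mu_0 = 108, 110  # sample mean and hypothesized mean
s, n = 10, 20           # sample standard deviation and sample size

se = s / math.sqrt(n)         # standard error of the mean
t_stat = (x_bar - mu_0) / se  # test score, df = n - 1 = 19

t_crit = 2.539                # table value t(0.01, 19), assumed from the t table
reject = t_stat > t_crit      # upper-tailed test: Ha is mu > 110

print(f"t = {t_stat:.3f}, reject H0: {reject}")
```

The test score is negative, so the sample gives no evidence that the mean IQ exceeds 110. A lower-tailed test at the same level would compare the score with -2.539 and would not reject either.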
(d) (3 points) Draw the diagram of the distribution of the test statistic and clearly indicate the distribution, values of the critical test statistic(s), the test score and shade the area corresponding to
the p-value.
• Draw sampling distribution and indicate appropriate values
6. (10 Points) The concept of using a seatbelt in a moving vehicle was first thought of in 1849. However,
it did not become popular until the three-point seatbelt was patented by the Swedish inventor Nils
Bohlin and introduced by Volvo in 1959. This model, used in the modern day, consists of three attachment points, the shoulder and both hips. Its job is to protect an occupant from injury in the event
of a car accident. A researcher at McGill is interested in determining if wearing seat belts actually
helps save lives in car accidents. He collected data from government records of seatbelt usage and
corresponding fatality rates in car accidents, which is summarized in the table below.
Characteristic        No. Belted        No. Unbelted
                      (Total = 1321)    (Total = 2057)
Age Group (Years)
  16-20               241               479
  21-30               281               692
  31-50               304               550
  >50                 495               336
Male                  709               1366
Female                612               691
Occupant Died         657               1287
(a) (2 points) State the null and alternate hypotheses that you would test.
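• $H_0: p_{NB} - p_B \le 0$ (or $p_{NB} - p_B = 0$)
• $H_a: p_{NB} - p_B > 0$
Here $p_{NB}$ and $p_B$ denote the proportions of unbelted and belted occupants who died.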
(b) (4 points) Test the hypothesis stated in (a) using the appropriate methodology and state your
conclusions in the context of this problem, along with the p-value.
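• (2 points) $se = \sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_{NB}} + \frac{1}{n_B}\right)} = 0.017$, where $\hat{p} = \frac{657 + 1287}{1321 + 2057} = 0.575$ is the pooled proportion of occupants who died
• (1 point) $z = \frac{(\hat{p}_{NB} - \hat{p}_B) - 0}{se} = 7.36$
• (1 point) $p \approx 0$, so we reject $H_0$: unbelted occupants were significantly more likely to die.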
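The pooled standard error and the z score can be reproduced from the counts in the table. This is a sketch with illustrative variable names; it assumes the "Occupant Died" row gives 657 deaths among the 1321 belted and 1287 deaths among the 2057 unbelted occupants.

```python
import math

died_belted, n_belted = 657, 1321
died_unbelted, n_unbelted = 1287, 2057

p_belted = died_belted / n_belted
p_unbelted = died_unbelted / n_unbelted

# Pooled proportion under H0 (no difference in fatality rates).
p_pool = (died_belted + died_unbelted) / (n_belted + n_unbelted)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_belted + 1 / n_unbelted))

z = (p_unbelted - p_belted) / se
print(f"pooled p = {p_pool:.3f}, se = {se:.4f}, z = {z:.2f}")
```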
(c) (4 points) Are men more likely to wear seat belts as compared to women? Use an appropriate
confidence interval to arrive at the best possible conclusion.
• (1 point) Let $p_m$ denote the proportion of men wearing seat belts and $p_w$ denote the proportion of women. We have $\hat{p}_m = 0.34$ and $\hat{p}_w = 0.47$.
• (1 point) The standard error for $\hat{p}_m - \hat{p}_w$ can be computed as $se = \sqrt{\frac{\hat{p}_m(1 - \hat{p}_m)}{n_m} + \frac{\hat{p}_w(1 - \hat{p}_w)}{n_w}} = 0.017$
• (1 point) Constructing a 95% confidence interval for the difference $p_m - p_w$, we have
– $L = -0.13 - 1.96 \times 0.017 = -0.16$
– $U = -0.13 + 1.96 \times 0.017 = -0.09$
• (1 point) From the 95% confidence interval for the difference in proportions we can conclude that men are less likely to wear seat belts than women, since the interval lies entirely below zero.
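The interval can be recomputed from the counts of belted and unbelted men and women. A minimal sketch follows; the function name `two_prop_ci` is made up for this example, and it uses the unpooled standard error shown above with $z^* = 1.96$.

```python
import math

def two_prop_ci(x1, n1, x2, n2, z_star=1.96):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Men: 709 belted out of 709 + 1366; women: 612 belted out of 612 + 691.
lower, upper = two_prop_ci(709, 709 + 1366, 612, 612 + 691)
print(f"95% CI for p_m - p_w: ({lower:.3f}, {upper:.3f})")
```

Both endpoints are negative, which is what supports the conclusion that men are less likely to wear seat belts.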
7. (10 points) Justin Cruanes is looking to buy a used Toyota Camry. He checks the Internet and finds a
huge list of Camrys for sale in his area. He selects a random sample of 20 cars, ranging in age from
2 years old to 15 years old. For each car, he enters the age (in years) and the offered sales price (in
thousands) into Excel. He runs a regression predicting price from age, and gets the following (edited)
output:
Variable       Estimate    Std. Error    t Stat    p-Value
(Intercept)    12.10       0.60
Age            -0.80       0.07
(a) (1 point) How long does it take for the price of a car to drop to $6,000?
• (1 point) The equation for average price of a car (in thousands) is given by Price = 12.1 − 0.80 × Age
• (1 point) For price to drop to $6,000, we require Age = (12.1 − 6) / 0.80 = 7.625 years.
(b) (2 points) Car #5 in the sample was 15 years old and cost $5,000. Determine the predicted price
and the residual for this car
• Predicted Price = (12.1 − 0.80 × 15) × 1000 = 100 dollars
• Residual = Actual Price − Predicted Price = 5000 − 100 = 4900 dollars
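The arithmetic in (a) and (b) can be written as a small helper. This is only a sketch: the names `predicted_price`, `age_at_6000` and `residual_car5` are illustrative, and the coefficients are taken from the regression output.

```python
INTERCEPT = 12.1  # thousands of dollars
SLOPE = -0.80     # thousands of dollars per year of age

def predicted_price(age):
    """Predicted price in dollars for a car of the given age in years."""
    return (INTERCEPT + SLOPE * age) * 1000

# Part (a): age at which the predicted price reaches $6,000.
age_at_6000 = (6.0 - INTERCEPT) / SLOPE

# Part (b): residual = actual price - predicted price for car #5.
residual_car5 = 5000 - predicted_price(15)

print(f"age = {age_at_6000:.3f} years, residual = {residual_car5:.0f} dollars")
```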
(c) (2 points) Construct a 90% confidence interval for the yearly drop in price.
• Yearly drop in price is given by −β_Age
• 90% CI for β_Age is given by b_Age ± t* × se(b_Age)
• t* = 1.734 (18 degrees of freedom)
• CI for the yearly drop equals 0.80 ± 1.734 × 0.07 = (0.679, 0.921), in thousands of dollars
(d) (3 points) What percentage of variation in the price of a car is explained by its age?
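• If we conducted a test of correlation, we would get a t-statistic equal to $t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$
• This t-statistic would be the same as that of the slope of the regression. So $t = \frac{-0.80}{0.07} = -11.43$
• Equating the two, we can solve for r and hence $r^2$, which will equal the coefficient of determination $R^2$. Here $r^2 = \frac{t^2}{t^2 + n - 2} = 0.879$, so about 88% of the variation in price is explained by age.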
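Solving the correlation test statistic for $r^2$ gives $r^2 = t^2 / (t^2 + n - 2)$, which can be evaluated directly. A short sketch under that rearrangement, with illustrative variable names:

```python
import math

n = 20                   # number of cars in the sample
b_age, se_age = -0.80, 0.07

t_stat = b_age / se_age  # slope t-statistic, same as the correlation test

# From t = r * sqrt(n - 2) / sqrt(1 - r^2):  r^2 = t^2 / (t^2 + n - 2).
r_squared = t_stat**2 / (t_stat**2 + (n - 2))
r = math.copysign(math.sqrt(r_squared), b_age)  # r takes the sign of the slope

print(f"t = {t_stat:.2f}, r = {r:.3f}, R^2 = {r_squared:.3f}")
```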
(e) (2 points) What conditions do the errors need to satisfy for the simple regression model to be
valid?
• Equal Variance
• Independence
• Normality
8. (10 points) A professor asked his research students to anonymously rate how well they liked statistics.
He is interested in testing if males tend to give higher evaluations. The results of the evaluations are
summarized below.
Group      Count    Mean    Variance
Males      12       5.25    6.57
Females    31       4.37    7.55
(a) (2 points) Clearly state the null and alternate hypothesis to test.
• H0 : µM − µF ≤ 0
• Ha : µM − µF > 0
(b) (4 points) Compute the values of the standard error and the test score.
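• (2 points) $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{6.57}{12} + \frac{7.55}{31}} = 0.89$
• (2 points) $t = \frac{5.25 - 4.37}{0.89} = 0.99$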
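The standard error and test score follow directly from the summary table. A minimal sketch with illustrative variable names, using the unpooled two-sample formula shown above:

```python
import math

# Summary statistics from the table (note: variances, not standard deviations).
n_m, mean_m, var_m = 12, 5.25, 6.57
n_f, mean_f, var_f = 31, 4.37, 7.55

se = math.sqrt(var_m / n_m + var_f / n_f)  # unpooled standard error
t_stat = (mean_m - mean_f - 0) / se        # test score for H0: difference <= 0

print(f"se = {se:.2f}, t = {t_stat:.2f}")
```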
(c) (2 points) Compute the p-value for the test statistic and state your conclusions (assume α = 5%).
• (1 point) p = 0.172
• (1 point) As the p-value is greater than α = 5%, we CANNOT reject the null hypothesis. Hence there is not enough evidence to conclude that males tend to give higher evaluations.
(d) (2 points) Construct a 90% confidence interval for the difference in evaluation scores across the
two groups.
• (1 point) 90% CI is given by $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
• (1 point) Computing gives us $0.88 \pm 1.8 \times 0.89 = (-0.72, 2.48)$
9. (10 points) Multiple Choice Questions
(a) (2 points) A test to screen for a serious but curable disease is similar to hypothesis testing, with
a null hypothesis of no disease, and an alternative hypothesis of disease. If the null hypothesis
is rejected treatment will be given. Otherwise, it will not. Assuming the treatment does not have
serious side effects, in this scenario it is better to increase the probability of making
i. A Type 1 error, providing treatment when it is not needed.
ii. A Type 1 error, not providing treatment when it is needed.
iii. A Type 2 error, providing treatment when it is not needed.
iv. A Type 2 error, not providing treatment when it is needed.
(b) (2 points) A test of H0 : µ = 0 versus Ha : µ > 0 is conducted on the same population independently by two different researchers. They both use the same sample size and the same value of α = 0.05. Which of the following will be the same for both researchers?
i. The p-value of the test.
ii. The power of the test if the true µ = 6.
iii. The value of the test statistic.
iv. The decision about whether or not to reject the null hypothesis.
(c) (2 points) Consider a random sample of 100 females and 100 males. Suppose 15 of the females
are left-handed and 12 of the males are left-handed. What is the estimated difference between
proportions of females and males who are left-handed (females - males)? Select the choice with
the correct notation and numerical value.
i. $p_1 - p_2 = 3$
ii. $p_1 - p_2 = 0.03$
iii. $\hat{p}_1 - \hat{p}_2 = 3$
iv. $\hat{p}_1 - \hat{p}_2 = 0.03$
(d) (2 points) A hypothesis test is done in which the alternative hypothesis is that more than 10% of
a population is left-handed. The p-value for the test is calculated to be 0.25. Which statement is
correct?
i. We can conclude that more than 10% of the population is left-handed.
ii. We can conclude that more than 25% of the population is left-handed.
iii. We can conclude that exactly 25% of the population is left-handed.
iv. We cannot conclude that more than 10% of the population is left-handed.
(e) (2 points) Two researchers wanted to determine if aspirin reduced the chance of a heart attack.
Researcher 1 studied the medical records of 500 patients. For each patient, he recorded whether
the person took aspirin every day and if the person had ever had a heart attack. Researcher 2
also studied 500 people. He randomly assigned half (250) of the patients to take aspirin every
day and the other half to take a placebo every day. Suppose that both researchers found that
there is a statistically significant difference in the heart attack rates for the aspirin users and the
non-aspirin users and that aspirin users had a lower rate of heart attacks. Can both researchers
conclude that aspirin caused the reduction?
i. Yes, because aspirin users had a lower heart attack rate in both studies.
ii. Yes, because aspirin is known to reduce heart attacks.
iii. No, only researcher 1 can conclude this.
iv. No, only researcher 2 can conclude this.
10. (10 points) A sample of family medicine physicians who have been working at various clinics for two years after being board-certified has shown a difference in the average salaries that are being paid to males versus females. The response is the salary after two years of employment (SALARY, in thousands of dollars). A possible confounding variable appears to be the number of hours (HRS) of advanced training that each doctor has received in their first two years of being certified: on average, the male doctors tend to earn more advanced training credits than the female physicians, and a scatterplot of SALARY vs HRS shows a linear association with correlation .701. The area of interest is whether the difference in the amount of advanced training would explain the difference in the average salaries between male and female physicians. A multiple regression is performed, using a dummy variable GENDER which identifies whether the doctor is male or female. We code GENDER as 1 for males, and 0 for females. The other explanatory variable is HRS, and the interaction is HRS x GENDER. The conditions for a multiple regression have been met. The results are shown below:

Regression Analysis
R²             0.641
Adjusted R²    0.623
R              0.801
Std. Error     10.583
n              62
k              3
Dep. Var.      SALARY
ANOVA table
Source        SS             df    MS            F        p-value
Regression    11,603.3226    3     3,867.7742    34.53    6.19E-13
Residual      6,496.1613     58    112.0028
Total         18,099.4839    61

Regression output (with 95% confidence interval)
variables     coefficients    std. error    t (df=58)    p-value     95% lower    95% upper
Intercept     137.0550        3.7603        36.448       1.13E-41    129.5279     144.5821
HRS           0.7281          0.1473        4.941        6.94E-06    0.4331       1.0230
GENDER        11.6942         5.2104        2.244        .0286       1.2645       22.1239
HRSxGENDER    0.0692          0.1910        0.362        .7185       -0.3131      0.4515
(a) (2 points) What is the estimating equation for the salary of a female doctor after two years?
Estimating equation for salary of female doctor after two years is given by ŷ = 137.05 + 0.728 × HRS
Copyright © 2011 Pearson Education Inc. Publishing as Addison-Wesley.
(b) (2 points) What is the average salary estimate for a male doctor with 20 hours of advance training?
Average salary estimate for a male doctor with 20 hours of training is given by ŷ = 137.05 + 0.728 × 20 + 11.69 + 0.0692 × 20 = 164.684
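The two estimating equations are special cases of one fitted model with an interaction term. The sketch below uses the rounded coefficients from the answers above; the function name `predicted_salary` is illustrative.

```python
def predicted_salary(hrs, gender):
    """Predicted salary in thousands of dollars.

    gender is the dummy variable: 1 for males, 0 for females.
    """
    return 137.05 + 0.728 * hrs + 11.69 * gender + 0.0692 * hrs * gender

# Part (b): male doctor with 20 hours of advanced training.
male_20 = predicted_salary(20, 1)

# For comparison, a female doctor with the same training (GENDER = 0).
female_20 = predicted_salary(20, 0)

print(f"male: {male_20:.3f}, female: {female_20:.3f}")
```

Setting `gender=0` drops both the GENDER and interaction terms, which is why the female equation in (a) has only an intercept and an HRS slope.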
(c) (2 points) Do male doctors earn more on an average as compared to female doctors? Explain.
Yes. The coefficient for GENDER is positive and is significant at the 5% level (p = 0.0286). Hence male
doctors do earn more on an average as compared to female doctors.
(d) (2 points) Do male doctors earn more per hour of advance training as compared to female doctors? Explain.
No. The coefficient for HRSxGENDER is not significant at the 5% level. Hence there is no evidence at the 5% level that male doctors earn more per hour of advance training as compared to
female doctors
(e) (2 points) What percentage of variation in salaries remains unexplained by the regression?
The percentage of variation in salaries unexplained by the regression is given by $1 - R^2 = \frac{SSE}{SST} = 0.36$, that is, about 36%.