Faculty of Economy International Business and Development

November 24, 2011, 3rd TEST (Type AB)
Economic Statistics
Duration – 50 minutes
Examination Aids: Calculator
                 EXERCISE 1    EXERCISE 2    Total
Point Value          8             2           10
Points Earned
Use no more than two decimal places in the calculations.
Always remember to comment on the results obtained.
EXERCISE 1 – Type AB (eight points)
A consumer products company wants to measure the effectiveness of different types of
advertising media in promoting its products. Specifically, the company is interested in
the effectiveness of radio advertising (RADIO) and newspaper advertising (NEWS). A
sample of 22 cities with approximately equal populations is selected for study during a test
period of one month. Each city is allocated a specific expenditure level for both radio
advertising and newspaper advertising. The SALES of the product (in thousands of
dollars) and the levels of media expenditure (in thousands of dollars) during the test
month are recorded. With Gretl we obtained the following results.
Correlation coefficients, using the observations 1 - 22

  SALES     RADIO      NEWS
 1.0000    0.6966    0.5021    SALES
           1.0000   -0.0921    RADIO
                     1.0000    NEWS
Model 1: OLS, using observations 1-22
Dependent variable: SALES

          Coefficient    Std. Error    t-ratio    VIF
const      156.43        126.758
RADIO       13.0807        1.75937                1.009
NEWS        16.7953        2.96338                1.009

Mean dependent var    1225.136    S.D. dependent var    345.5701
Sum squared resid     479760      S.E. of regression    158.9041
R-squared                         F                     40.15823
                                  P-value(F)            1.50e-07

ANOVA Analysis of Variance:

                    Sum of squares    df    Mean square
Regression              2028030        2
Residual (error)         479760       19
Total                   2507790       21
(A) [1 point] State the multiple regression equation in conventional form and
interpret the meaning of the slopes, b2 and b3, in this problem.
(B) [1 point] First, predict the mean Sales for an expenditure of $65,000 on radio
advertising and $35,000 on newspaper advertising, and then evaluate the
residual.
(C) [2 points] Which type of advertising is more effective? Explain.
(D) [1 point] Determine whether there is a significant relationship between Sales
and the two independent variables (radio advertising and newspaper
advertising) at the 0.05 level of significance. Interpret the meaning of the p-value.
(E) [2 points] At the 0.05 level of significance, determine whether each independent
variable makes a significant contribution to the regression model. On the basis
of these results, indicate the independent variables to include in this model.
(F) [1 point] Show how to obtain R squared (R2) from the sums of squares in the
ANOVA table. Interpret it.
SOLUTION
a) State the multiple regression equation in conventional form and interpret the
meaning of the slopes, b2 and b3, in this problem (1 point)
Sales_hat = 156 + 13.1*Radio + 16.8*Newspaper
           (127)  (1.76)       (2.96)
R-squared = 0.809 (standard errors in parentheses)
In this model, the regression coefficients are interpreted as follows:
1) Holding spending on newspaper advertising constant, each increase of 1.0
thousand dollars in radio advertising is estimated to increase Sales by 13.1
thousand dollars (i.e., $13,100).
2) Holding spending on radio advertising constant, each increase of 1.0
thousand dollars in newspaper advertising is estimated to increase Sales by
16.8 thousand dollars (i.e., $16,800).
3) The sample Y intercept (b1 = 156) estimates the value of Sales when no
money is spent on radio advertising or newspaper advertising. Because zero
spending lies outside the range of RADIO and NEWS values used in this market
study, the value of b1 has little or no practical interpretation.
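b) First, predict the mean Sales for an expenditure of $65,000 on radio advertising
and $35,000 on newspaper advertising, and then evaluate the residual.
Sales_hat = 156 + 13.1*65 + 16.8*35 = 156 + 851.5 + 588 = 1595.5 (thousand $)
The residual is the observed Sales minus this predicted value, e = Sales - Sales_hat,
so it is evaluated using the observed Sales of a city with these expenditure levels.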
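The prediction in part b) can be checked with a short Python sketch. It assumes the rounded coefficients of the fitted equation (156, 13.1, 16.8) and expenditures expressed in thousands of dollars; the function names `predict_sales` and `residual` are illustrative and not part of the Gretl output.

```python
# Rounded coefficients of the fitted equation (all values in thousands of dollars).
B_CONST, B_RADIO, B_NEWS = 156.0, 13.1, 16.8


def predict_sales(radio, news):
    """Predicted mean Sales for given radio and newspaper expenditures."""
    return B_CONST + B_RADIO * radio + B_NEWS * news


def residual(observed_sales, radio, news):
    """Residual = observed Sales minus predicted Sales."""
    return observed_sales - predict_sales(radio, news)


# $65,000 on radio and $35,000 on newspapers, i.e. 65 and 35 in thousands.
predicted = predict_sales(65, 35)
print(round(predicted, 2))  # 1595.5
```

The `residual` helper needs an observed Sales value from the data table, which is why no call to it is shown here.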
c) Which type of advertising is more effective? Explain
Holding the other independent variable constant, newspaper advertising seems to
be more effective because its slope is greater. However, the slopes are directly
comparable only if the two independent variables have similar variability; when the
variability differs, standardized regression coefficients provide a more meaningful
comparison. Here the standard deviations of RADIO and NEWS are not given, so we
compute the standardized partial coefficients from the correlation coefficients:
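$$
\mathrm{beta}_2 = \frac{r_{y2} - r_{y3}\, r_{23}}{1 - r_{23}^2}
= \frac{0.6966 - 0.5021 \times (-0.0921)}{1 - (-0.0921)^2} = 0.7492
$$

$$
\mathrm{beta}_3 = \frac{r_{y3} - r_{y2}\, r_{23}}{1 - r_{23}^2}
= \frac{0.5021 - 0.6966 \times (-0.0921)}{1 - (-0.0921)^2} = 0.5711
$$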
Because beta2 (0.7492) is larger than beta3 (0.5711), radio advertising is the more
effective type of advertising.
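As a check on part c), the following sketch recomputes the standardized coefficients from the three correlation coefficients in the Gretl correlation matrix. The function name `standardized_betas` is an illustrative choice, and the formula applies only to a model with exactly two explanatory variables.

```python
def standardized_betas(r_y2, r_y3, r_23):
    """Standardized coefficients for a two-regressor model.

    r_y2, r_y3: correlations of Y with X2 and with X3.
    r_23: correlation between X2 and X3.
    """
    denom = 1.0 - r_23 ** 2
    beta2 = (r_y2 - r_y3 * r_23) / denom
    beta3 = (r_y3 - r_y2 * r_23) / denom
    return beta2, beta3


# Correlations from the Gretl output: SALES-RADIO, SALES-NEWS, RADIO-NEWS.
beta2, beta3 = standardized_betas(0.6966, 0.5021, -0.0921)
print(round(beta2, 4), round(beta3, 4))  # 0.7492 0.5711
```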
d) Determine whether there is a significant relationship between Sales and the two
independent variables (radio advertising and newspaper advertising) at the 0.05
level of significance. Interpret the meaning of the p-value.
Our next task is to test the overall significance of this model, based on the F-ratio,
using the standard five-step hypothesis testing procedure.
Hypotheses: H0: β2 = β3 = 0 (all slope coefficients are zero)
H1: at least one slope coefficient is different from 0
Critical value: with k = 3 coefficients and n = 22 observations, an F-value based on
(k-1) = 2 numerator df and (n - k) = 19 denominator df gives F(2, 19) at 0.05 = 3.52
Calculated value: from the Gretl output, the F-ratio is 40.16
Compare: F-calc > F-crit, and thus we reject H0.
Conclusion: This model has explanatory power with respect to Y. In other words, the
set of X variables in this model helps us explain or predict the Y variable. This model
is SIGNIFICANT.
The p-value associated with F-calc is 1.50e-07, which is much less than α = 0.05. It is
the probability of obtaining an F-ratio of 40.16 or larger if H0 were true. Equivalently,
F-calc falls in the rejection region of the null hypothesis.
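The F-test in part d) can be reproduced from the ANOVA sums of squares with the sketch below. The critical value 3.52 is the tabulated figure quoted in the solution, not something the code derives. The p-value uses a closed form that holds only when the numerator has 2 degrees of freedom, as it does here; the function names are illustrative.

```python
def f_statistic(ss_regression, ss_residual, df_regression, df_residual):
    """F = (regression mean square) / (residual mean square)."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return ms_regression / ms_residual


def f_pvalue_two_numerator_df(f_value, df_residual):
    """Upper-tail probability of F(2, df_residual).

    Closed form valid ONLY for 2 numerator degrees of freedom.
    """
    return (1.0 + 2.0 * f_value / df_residual) ** (-df_residual / 2.0)


F_CRITICAL = 3.52  # tabulated F(2, 19) at alpha = 0.05, as quoted in the solution

f_calc = f_statistic(2028030, 479760, 2, 19)
p_value = f_pvalue_two_numerator_df(f_calc, 19)
reject_h0 = f_calc > F_CRITICAL

print(round(f_calc, 2), reject_h0)
print(f"{p_value:.2e}")
```

For models with a different number of regressors, a statistics library or an F-table would be needed for the p-value.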
e) [2 points] At the 0.05 level of significance, determine whether each independent
variable makes a significant contribution to the regression model. On the basis
of these results, indicate the independent variables to include in this model.
Our next step is to test the significance of the individual coefficients in the equation. We
conduct a t-test for each b associated with an X variable. Mechanically, the test
statistic is the value of bi divided by its standard error SEbi, compared with a
t-critical value with (n - k) df (the Error df from the ANOVA table). Alternatively, we
can use the p-values to decide whether or not to reject H0. The H0 tested here is
βi = 0, which means that the variable is not related to Y. We consider each variable
separately and thus must conduct as many t-tests as there are X variables.
Hypotheses: H0: βi = 0 (this variable is unrelated to the dependent variable)
against H1: βi ≠ 0, at α = 0.05.
With the actual values of the b's and the SEb's, we obtain the t-values (one for each
X variable):
tRADIO = 13.0807/1.75937 = 7.435
tNewspaper = 16.7953/2.96338 = 5.6676
We compare them with the t-critical value (it is the same for each t-test within a single
model) to decide whether or not to reject the H0 associated with each X.
tcritical = 2.093 with 19 df (from the t-table)
At the 0.05 significance level, reject H0 if t ≥ 2.093 or t ≤ -2.093. Do not reject H0 if
-2.093 < t < 2.093.
Both t statistics (7.435 and 5.6676) exceed the critical value, so X2 and X3 are
significant independent variables.
Conclusion: Variables X2 (RADIO) and X3 (Newspaper) are both significant and
contribute to the model's explanatory power, so both should be included in the model.
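The individual t-tests in part e) reduce to a division and a comparison, sketched below. The coefficients and standard errors come from the Gretl output, and 2.093 is the tabulated two-sided critical value for 19 df quoted in the solution. The dictionary layout and names are illustrative.

```python
T_CRITICAL = 2.093  # tabulated two-sided t value, 19 df, alpha = 0.05

# (coefficient, standard error) pairs from the Gretl output.
estimates = {
    "RADIO": (13.0807, 1.75937),
    "NEWS": (16.7953, 2.96338),
}


def t_ratio(coefficient, std_error):
    """t statistic for H0: beta_i = 0."""
    return coefficient / std_error


t_ratios = {name: t_ratio(b, se) for name, (b, se) in estimates.items()}

# Two-sided test: reject H0 when |t| is at least the critical value.
significant = {name: abs(t) >= T_CRITICAL for name, t in t_ratios.items()}

for name, t in t_ratios.items():
    print(name, round(t, 4), "significant" if significant[name] else "not significant")
```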
f) [1 point] Show how to obtain R squared (R2) from the sums of squares in the
ANOVA table. Interpret it.
R2 = ESS/TSS or 1 - RSS/TSS, where ESS is the regression (explained) sum of
squares, RSS is the residual sum of squares and TSS is the total sum of squares.
R2 = 2028030/2507790 = 0.81
81% of the variation in Sales is explained by variation in the amount of Radio
Advertising and Newspaper Advertising.
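A minimal sketch of the R2 calculation in part f), using the regression and residual sums of squares from the ANOVA table. It computes both equivalent forms so that the identity ESS/TSS = 1 - RSS/TSS can be confirmed; the function name is illustrative.

```python
def r_squared(ess, rss):
    """Return R2 computed two equivalent ways from the ANOVA sums of squares."""
    tss = ess + rss  # total sum of squares = explained + residual
    return ess / tss, 1.0 - rss / tss


r2_from_ess, r2_from_rss = r_squared(ess=2028030, rss=479760)
print(round(r2_from_ess, 2), round(r2_from_rss, 2))  # 0.81 0.81
```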
EXERCISE 2 – Type AB (two points)
A) [1 point] Standardized multiple regression coefficient: definition, interpretation and use
The sizes of regression coefficients in multiple regression models depend on the
units of measurement for the variables. To compare the relative effects of two
explanatory variables, it is appropriate to compare their coefficients only if the
variables have the same units. Otherwise, standardized versions of the regression
coefficients provide more meaningful comparisons.
The standardized regression coefficient for an explanatory variable represents the
change in Y, in Y standard deviations, for a one standard deviation increase in that
variable, controlling for the other explanatory variables in the model. We denote
them by beta2, beta3, ..., betak.
If |beta3| > |beta2|, for example, then a standard deviation increase in X3 has a
greater partial effect on Y than does a standard deviation increase in X2.
We standardize the partial regression coefficients by adjusting for the differing
standard deviations of Y and each Xj. Let sY denote the sample standard deviation of
Y, and let sX2, sX3, ..., sXk denote the sample standard deviations of the explanatory
variables.
The estimates of the standardized regression coefficients are
$$
\mathrm{beta}_2 = b_2 \frac{s_{X_2}}{s_Y}, \quad
\mathrm{beta}_3 = b_3 \frac{s_{X_3}}{s_Y}, \quad \ldots, \quad
\mathrm{beta}_k = b_k \frac{s_{X_k}}{s_Y}
$$
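To connect this formula with Exercise 1, the sketch below works backwards. The test does not report the standard deviations of RADIO and NEWS, so they are back-solved here from the standardized coefficients (0.7492, 0.5711), the slopes and the reported S.D. of SALES. The resulting values are implied figures for illustration only, not data from the study.

```python
S_Y = 345.5701  # S.D. of the dependent variable, from the Gretl output


def standardize(b, s_x, s_y):
    """Standardized coefficient: beta_j = b_j * s_Xj / s_Y."""
    return b * s_x / s_y


def implied_sd(beta, b, s_y):
    """Back-solve s_Xj = beta_j * s_Y / b_j (illustration, not reported data)."""
    return beta * s_y / b


s_x2 = implied_sd(0.7492, 13.0807, S_Y)  # implied S.D. of RADIO
s_x3 = implied_sd(0.5711, 16.7953, S_Y)  # implied S.D. of NEWS

# Applying the formula to the implied S.D.s recovers the standardized coefficients.
beta2 = standardize(13.0807, s_x2, S_Y)
beta3 = standardize(16.7953, s_x3, S_Y)

print(round(s_x2, 2), round(s_x3, 2))
```

Under these assumptions the implied spread of RADIO is larger than that of NEWS, which is consistent with radio ranking first on the standardized scale even though its raw slope is smaller.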
B) [1 point] Define the following terms: R2 and Adjusted R2
R2 is the proportion of the variance in Y explained by the set of X independent
variables. It can be expressed as a percentage, from 0 to 100% (or from 0 to 1 in
decimal form).
Adjusted R2 is "adjusted" for the number of X variables (k - 1, where k counts the
estimated coefficients including the constant) and the sample size (n). Both R2 and
adjusted R2 are easily calculated. R2 is ESS/TSS, and both terms can be taken
directly from the ANOVA table. The adjusted R2 formula is
$$
R^2_{Adj} = 1 - \frac{RSS/(n-k)}{TSS/(n-1)}
$$
Again, both quantities can be calculated from the ANOVA table and are usually
provided as part of the computer output.
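As an illustration, the adjusted R2 formula can be applied to the ANOVA table of Exercise 1 (RSS = 479760, TSS = 2507790, n = 22, k = 3). This value is not reported in the test itself; the sketch simply applies the formula above, and the function name is illustrative.

```python
def adjusted_r_squared(rss, tss, n, k):
    """Adjusted R2 = 1 - [RSS / (n - k)] / [TSS / (n - 1)].

    n: number of observations.
    k: number of estimated coefficients, including the constant.
    """
    return 1.0 - (rss / (n - k)) / (tss / (n - 1))


rss, tss = 479760, 2507790
r2 = 1.0 - rss / tss
adj_r2 = adjusted_r_squared(rss, tss, n=22, k=3)

print(round(r2, 2), round(adj_r2, 2))  # 0.81 0.79
```

Adjusted R2 comes out slightly below R2 here because the formula penalizes the two estimated slopes relative to the sample size.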