Department of Business Administration
FALL 2010-11
Demand Estimation
by Assoc. Prof. Sami Fethi

Ch 4 : Demand Estimation
© 2004, Managerial Economics, Dominick Salvatore © 2010/11, Sami Fethi, EMU, All Right Reserved.

Demand Estimation

To use these important demand relationships in decision analysis, we need to estimate empirically the structural form and parameters of the demand function. This is demand estimation.

$Q_{dx} = f(P, I, P_c, P_s, T)$, with expected signs $(-, +, -, +, +)$

Here $P$ is the price of the commodity, $I$ is the consumers' income, $P_c$ and $P_s$ are the prices of related commodities (complements and substitutes), and $T$ is the tastes of the consumer.

The demand for a commodity arises from the consumers' willingness and ability to purchase the commodity. Consumer demand theory postulates that the quantity demanded of a commodity is a function of, or depends on, the price of the commodity, the consumers' income, the price of related commodities, and the tastes of the consumer.

Demand Estimation

In general, we will seek answers to the following questions:
- How much will the revenue of the firm change after increasing the price of the commodity?
- How much will the quantity demanded of the commodity increase if consumers' income increases?
- What if the firm doubles its advertising expenditure?
- What if the competitors lower their prices?

Firms should know the answers to these questions if they want to achieve the objective of maximizing their value.

The Identification Problem

The demand curve for a commodity is generally estimated from market data on the quantity purchased of the commodity at various prices over time (i.e., time-series data) or by various consuming units at one point in time (i.e., cross-sectional data).

Simply joining price-quantity observations on a graph does not generate the demand curve for a commodity. The reason is that each price-quantity observation is given by the intersection of a different and unobserved demand and supply curve of the commodity.
In other words, the difficulty of deriving the demand curve for a commodity from observed price-quantity points that result from the intersection of different and unobserved demand and supply curves for the commodity is referred to as the identification problem.

The Identification Problem

In the following demand curve, the observed price-quantity data points E1, E2, E3, and E4 result respectively from the intersection of unobserved demand and supply curves D1 and S1, D2 and S2, D3 and S3, and D4 and S4. Therefore, the dashed line connecting observed points E1, E2, E3, and E4 is not the demand curve for the commodity.

To derive a demand curve for the commodity, say D2, we allow the supply curve to shift or to be different and correct, through regression analysis, for the forces that cause demand curve D2 to shift or to be different, as can be seen at points E2 and E'2.

Demand Estimation: Marketing Research Approaches

- Consumer Surveys
- Observational Research
- Consumer Clinics
- Market Experiments

These approaches are usually covered extensively in marketing courses; the most important of them are consumer surveys and market experiments.

Demand Estimation: Marketing Research Approaches

Consumer surveys: These surveys require the questioning of a firm's customers in an attempt to estimate the relationship between the demand for its products and a variety of variables perceived to be important for the marketing and profit planning functions.
These surveys can be conducted by simply stopping and questioning people at a shopping centre, or by having trained interviewers administer sophisticated questionnaires to a carefully constructed representative sample of consumers.

Demand Estimation: Marketing Research Approaches

Major advantages of consumer surveys:
- they may provide the only information available;
- they can be made as simple as possible;
- the researcher can ask exactly the questions they want.

Major disadvantages of consumer surveys:
- consumers may be unable or unwilling to provide reliable answers;
- careful and extensive surveys can be very expensive.

Demand Estimation: Marketing Research Approaches

Market experiments: attempts by the firm to estimate the demand for the commodity by changing price and other determinants of the demand for the commodity in the actual market place.

Major advantages of market experiments:
- consumers are in a real market situation;
- they do not know that they are being observed;
- the experiments can be conducted on a large scale to ensure the validity of results.

Major disadvantages of market experiments:
- in order to keep costs down, the experiment may be too limited, so the outcome can be questionable;
- competitors could try to sabotage the experiment by changing prices and other determinants of demand under their control;
- competitors can monitor the experiment to gain very useful information that the firm would prefer not to disclose.
Purpose of Regression Analysis

Regression analysis is used primarily to model causality and provide prediction:
- Predict the values of a dependent (response) variable based on values of at least one independent (explanatory) variable.
- Explain the effect of the independent variables on the dependent variable.

The relationship between X and Y can be shown on a scatter diagram.

Scatter Diagram

A scatter diagram is a two-dimensional graph of plotted points in which the vertical axis represents values of the dependent variable and the horizontal axis represents values of the independent or explanatory variable.

The pattern of the plotted points shows graphically how the variables are related. Scatter diagrams are mostly used to examine whether a suspected cause-and-effect relationship is supported by the data.

The following example shows the relationship between a firm's advertising expenditure (X) and its sales revenue (Y).

Scatter Diagram-Example

Year    X     Y
1       10    44
2        9    40
3       11    42
4       12    46
5       11    48
6       12    52
7       13    54
8       13    58
9       14    56
10      15    60

[Figure: scatter diagram of Y against X]

Scatter Diagram

The scatter diagram shows a positive relationship between the two variables, and the relationship is approximately linear.

This gives us a rough estimate of the linear relationship between the variables in the form of an equation such as

$$Y = a + bX$$
Regression Analysis

In the equation, a is the vertical intercept of the estimated linear relationship and gives the value of Y when X = 0, while b is the slope of the line and gives an estimate of the increase in Y resulting from each unit increase in X.

The difficulty with the scatter diagram is that different researchers drawing a line by eye would probably obtain different results, even if they use the same data points. The solution is to use regression analysis.

Regression Analysis

Regression analysis is a statistical technique for obtaining the line that best fits the data points, so that all researchers reach the same results.

Regression line (line of best fit): the line that minimizes the sum of the squared vertical deviations ($e_t$) of each point from the regression line. This method is called Ordinary Least Squares (OLS).

Regression Analysis

In the table of advertising expenditure (X) and sales revenue (Y) above, $Y_1$ refers to the actual or observed sales revenue of $44 mn associated with the advertising expenditure of $10 mn in the first year for which data were collected.

In the following graph, $\hat{Y}_1$ is the corresponding sales revenue of the firm estimated from the regression line for the advertising expenditure of $10 mn in the first year.

The symbol $e_1$ is the corresponding vertical deviation, or error, of the actual sales revenue from the sales revenue estimated from the regression line in the first year. This can be expressed as $e_1 = Y_1 - \hat{Y}_1$.
Regression Analysis

Since there are 10 observation points, we have 10 vertical deviations or errors (i.e., $e_1$ to $e_{10}$). The regression line obtained is the line that best fits the data points in the sense that the sum of the squared (vertical) deviations from the line is a minimum. This means that each of the 10 e values is first squared and then summed.

Simple Regression Analysis

Now we are in a position to:
- calculate the value of a (the vertical intercept) and the value of b (the slope coefficient) of the regression line;
- conduct tests of significance of the parameter estimates;
- construct a confidence interval for the true parameter;
- test for the overall explanatory power of the regression.

Simple Linear Regression Model

The regression line is a straight line that describes the dependence of the average value of one variable on the other:

$$Y_i = \alpha + \beta X_i + \varepsilon_i$$

where $Y_i$ is the dependent (response) variable, $X_i$ is the independent (explanatory) variable, $\alpha$ is the Y intercept, $\beta$ is the slope coefficient, $\varepsilon_i$ is the random error, and $\alpha + \beta X_i$ is the regression line.
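The least-squares criterion described above can be checked numerically. The following is a minimal sketch, assuming the advertising (X) and sales (Y) figures from the example table; the function name `sum_squared_errors` and the two alternative lines are illustrative assumptions, while 7.60 and 106/30 are the intercept and slope that the estimation example in this chapter obtains for these data.

```python
# Advertising expenditure (X) and sales revenue (Y), both in $ mn, from the example table.
X = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
Y = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]


def sum_squared_errors(a, b, xs=X, ys=Y):
    """Sum of squared vertical deviations of the points from the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Least-squares line from the chapter's estimation example.
ols_sse = sum_squared_errors(7.60, 106 / 30)

# Two hypothetical lines "drawn by eye" (illustrative assumptions only).
eyeballed = {"flatter line": (14.0, 3.0), "steeper line": (2.0, 4.0)}
eyeballed_sse = {name: sum_squared_errors(a, b) for name, (a, b) in eyeballed.items()}

print(f"OLS line: SSE = {ols_sse:.4f}")
for name, sse in eyeballed_sse.items():
    print(f"{name}: SSE = {sse:.4f}")
```

Any other straight line gives a larger sum of squared errors than the least-squares line, which is why all researchers using OLS arrive at the same line.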
Ordinary Least Squares (OLS)

Model:

$$Y_t = a + bX_t + e_t$$
$$\hat{Y}_t = \hat{a} + \hat{b}X_t$$
$$e_t = Y_t - \hat{Y}_t$$

Ordinary Least Squares (OLS)

Objective: determine the slope and intercept that minimize the sum of the squared errors.

$$\sum_{t=1}^{n} e_t^2 = \sum_{t=1}^{n} (Y_t - \hat{Y}_t)^2 = \sum_{t=1}^{n} (Y_t - \hat{a} - \hat{b}X_t)^2$$

Ordinary Least Squares (OLS)

Estimation procedure:

$$\hat{b} = \frac{\sum_{t=1}^{n} (X_t - \bar{X})(Y_t - \bar{Y})}{\sum_{t=1}^{n} (X_t - \bar{X})^2}$$
$$\hat{a} = \bar{Y} - \hat{b}\bar{X}$$

Ordinary Least Squares (OLS)

Estimation example:

Time   X_t   Y_t   X_t - Xbar   Y_t - Ybar   (X_t - Xbar)(Y_t - Ybar)   (X_t - Xbar)^2
1      10    44    -2           -6           12                         4
2       9    40    -3           -10          30                         9
3      11    42    -1           -8            8                         1
4      12    46     0           -4            0                         0
5      11    48    -1           -2            2                         1
6      12    52     0            2            0                         0
7      13    54     1            4            4                         1
8      13    58     1            8            8                         1
9      14    56     2            6           12                         4
10     15    60     3           10           30                         9
Sum   120   500                             106                        30

$$n = 10, \quad \sum_{t=1}^{n} X_t = 120, \quad \bar{X} = \frac{120}{10} = 12, \quad \sum_{t=1}^{n} Y_t = 500, \quad \bar{Y} = \frac{500}{10} = 50$$
$$\sum_{t=1}^{n} (X_t - \bar{X})^2 = 30, \quad \sum_{t=1}^{n} (X_t - \bar{X})(Y_t - \bar{Y}) = 106$$
$$\hat{b} = \frac{106}{30} = 3.533, \quad \hat{a} = 50 - (3.533)(12) = 7.60$$

The Equation of Regression Line

The equation of the regression line can be constructed as follows:

$$\hat{Y}_t = 7.60 + 3.53X_t$$

When X = 0 (zero advertising expenditure), the expected sales revenue of the firm is $7.60 mn. In the first year, when X = $10 mn, $\hat{Y}_1$ = $42.90 mn.
Strictly speaking, the regression line should be used only to estimate the sales revenues resulting from advertising expenditures that are within the range of the observed data (here, X between $9 mn and $15 mn).

Crucial Assumptions

- The error term is normally distributed.
- The error term has zero expected value or mean.
- The error term has constant variance in each time period and for all values of X.
- The error term's value in one time period is unrelated to its value in any other period.

Tests of Significance: Standard Error

To test the hypothesis that b is statistically significant (i.e., that advertising positively affects sales), we first need to calculate the standard error (deviation) of $\hat{b}$. The standard error can be calculated with the following expression.

Tests of Significance

Standard error of the slope estimate:

$$s_{\hat{b}} = \sqrt{\frac{\sum (Y_t - \hat{Y}_t)^2}{(n-k)\sum (X_t - \bar{X})^2}} = \sqrt{\frac{\sum e_t^2}{(n-k)\sum (X_t - \bar{X})^2}}$$

Tests of Significance

Example calculation, with $\hat{Y}_t = 7.60 + 3.53X_t$ (for example, $\hat{Y}_1 = 7.60 + 3.53(10) = 42.90$):

Time   X_t   Y_t   Yhat_t   e_t = Y_t - Yhat_t   e_t^2 = (Y_t - Yhat_t)^2   (X_t - Xbar)^2
1      10    44    42.90     1.10                  1.2100                    4
2       9    40    39.37     0.63                  0.3969                    9
3      11    42    46.43    -4.43                 19.6249                    1
4      12    46    49.96    -3.96                 15.6816                    0
5      11    48    46.43     1.57                  2.4649                    1
6      12    52    49.96     2.04                  4.1616                    0
7      13    54    53.49     0.51                  0.2601                    1
8      13    58    53.49     4.51                 20.3401                    1
9      14    56    57.02    -1.02                  1.0404                    4
10     15    60    60.55    -0.55                  0.3025                    9
Sum                                               65.4830                   30

$$\sum_{t=1}^{n} e_t^2 = \sum_{t=1}^{n} (Y_t - \hat{Y}_t)^2 = 65.4830, \quad \sum_{t=1}^{n} (X_t - \bar{X})^2 = 30$$
$$s_{\hat{b}} = \sqrt{\frac{65.4830}{(10-2)(30)}} = 0.52$$
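The estimation procedure and the standard error of the slope can be reproduced in a few lines. This is a minimal sketch, assuming the ten (X, Y) observations from the example; the function name `ols_simple` is an illustrative choice, not something defined in the slides. Because the sketch keeps the unrounded slope 106/30, its sum of squared errors (about 65.47) differs slightly from the 65.4830 in the table, which uses the rounded line 7.60 + 3.53X.

```python
from math import sqrt

# Advertising expenditure (X) and sales revenue (Y), both in $ mn.
X = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
Y = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]


def ols_simple(xs, ys):
    """Return (a_hat, b_hat, sxx) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))  # 106 in the example
    sxx = sum((x - x_bar) ** 2 for x in xs)                       # 30 in the example
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat, sxx


a_hat, b_hat, sxx = ols_simple(X, Y)

# Residuals e_t = Y_t - Yhat_t and their sum of squares.
residuals = [y - (a_hat + b_hat * x) for x, y in zip(X, Y)]
sse = sum(e ** 2 for e in residuals)

# Standard error of the slope, with k = 2 estimated parameters (a and b).
k = 2
s_b = sqrt(sse / ((len(X) - k) * sxx))

print(f"a_hat = {a_hat:.2f}, b_hat = {b_hat:.3f}, s_b = {s_b:.2f}")
```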
Tests of Significance

Calculation of the t statistic:

$$t = \frac{\hat{b}}{s_{\hat{b}}} = \frac{3.53}{0.52} = 6.79$$

Degrees of freedom = (n - k) = (10 - 2) = 8
Critical value (tabulated) at the 5% level = 2.306

Since the calculated t of 6.79 exceeds the critical value of 2.306, the slope coefficient is statistically significant at the 5% level.

Confidence interval

We can also construct a confidence interval for the true parameter from the estimated coefficient, having accepted the alternative hypothesis that there is a relationship between X and Y.

Using the tabular value of t = 2.306 for the 5% level and 8 df in our example, the true value of b lies between 2.33 and 4.73 with 95% confidence:

$$b = \hat{b} \pm 2.306\,(s_{\hat{b}}) = 3.53 \pm 2.306\,(0.52)$$

Tests of Significance

Decomposition of sum of squares:

Total Variation = Explained Variation + Unexplained Variation

$$\sum (Y_t - \bar{Y})^2 = \sum (\hat{Y}_t - \bar{Y})^2 + \sum (Y_t - \hat{Y}_t)^2$$

Coefficient of Determination

The coefficient of determination is defined as the proportion of the total variation or dispersion in the dependent variable that is explained by the variation in the explanatory variables in the regression.

In our example, the coefficient of determination measures how much of the variation in the firm's sales is explained by the variation in its advertising expenditures.
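The t test and the confidence interval for the slope described above can be sketched as follows. The inputs are the rounded values used in the slides (slope 3.53, standard error 0.52) and the tabulated critical value 2.306 for 8 degrees of freedom; the function names are illustrative assumptions.

```python
def t_statistic(b_hat, s_b):
    """t ratio for the null hypothesis that the true slope is zero."""
    return b_hat / s_b


def confidence_interval(b_hat, s_b, t_crit):
    """Two-sided interval b_hat +/- t_crit * s_b."""
    half_width = t_crit * s_b
    return b_hat - half_width, b_hat + half_width


B_HAT = 3.53              # estimated slope (rounded, as in the slides)
S_B = 0.52                # standard error of the slope (rounded)
T_CRIT_5PCT_8DF = 2.306   # tabulated t value, 5% level, n - k = 8 df

t = t_statistic(B_HAT, S_B)
is_significant = abs(t) > T_CRIT_5PCT_8DF
low, high = confidence_interval(B_HAT, S_B, T_CRIT_5PCT_8DF)

print(f"t = {t:.2f}, significant at 5%: {is_significant}")
print(f"95% confidence interval for b: ({low:.2f}, {high:.2f})")
```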
Tests of Significance

Coefficient of determination:

$$R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{\sum (\hat{Y}_t - \bar{Y})^2}{\sum (Y_t - \bar{Y})^2}$$
$$R^2 = \frac{373.84}{440.00} = 0.85$$

Coefficient of Correlation

Coefficient of correlation (r): the square root of the coefficient of determination. This is simply a measure of the degree of association or co-variation that exists between variables X and Y.

In our example, this means that variables X and Y vary together 92% of the time.

The sign of r is always the same as the sign of the coefficient $\hat{b}$.

Tests of Significance

Coefficient of correlation:

$$r = \sqrt{R^2} \text{ with the sign of } \hat{b}, \qquad -1 \le r \le 1$$
$$r = \sqrt{0.85} = 0.92$$

Multiple Regression Analysis

Model:

$$Y = a + b_1X_1 + b_2X_2 + \dots + b_{k'}X_{k'}$$

Multiple Regression Analysis

The relationship between 1 dependent and 2 or more independent variables is a linear function:

$$Y_i = \beta_0 + \beta_1X_{1i} + \beta_2X_{2i} + \dots + \beta_kX_{ki} + \varepsilon_i$$

where $Y_i$ is the dependent (response) variable, $X_{1i}, \dots, X_{ki}$ are the independent (explanatory) variables, $\beta_0$ is the Y-intercept, $\beta_1, \dots, \beta_k$ are the slopes, and $\varepsilon_i$ is the random error.

Multiple Regression Analysis

$$Y_i = \beta_0 + \beta_1X_{1i} + \beta_2X_{2i} + \varepsilon_i$$

[Figure: response plane $\mu_{Y|X} = \beta_0 + \beta_1X_{1i} + \beta_2X_{2i}$ over the axes X1 and X2, with an observed Y above the point (X1i, X2i) and its error $\varepsilon_i$]

Multiple Regression Analysis

Too complicated by hand! Ouch!
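Before moving on to the multiple regression example, the coefficient of determination and the coefficient of correlation for the advertising example can be recomputed from the data. This is a minimal sketch, assuming the ten (X, Y) observations used earlier; variable names are illustrative. With the unrounded slope the explained variation is about 374.53, whereas the rounded line 7.60 + 3.53X gives the 373.84 shown in the slides; both round to R² = 0.85.

```python
from math import sqrt

X = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
Y = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least-squares slope and intercept.
b_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum((x - x_bar) ** 2 for x in X)
a_hat = y_bar - b_hat * x_bar
fitted = [a_hat + b_hat * x for x in X]

# Decomposition: total = explained + unexplained variation.
total = sum((y - y_bar) ** 2 for y in Y)
explained = sum((f - y_bar) ** 2 for f in fitted)
unexplained = sum((y - f) ** 2 for y, f in zip(Y, fitted))

r_squared = explained / total
# r takes the sign of the estimated slope.
r = sqrt(r_squared) if b_hat >= 0 else -sqrt(r_squared)

print(f"total = {total:.2f}, explained = {explained:.2f}, unexplained = {unexplained:.2f}")
print(f"R^2 = {r_squared:.2f}, r = {r:.2f}")
```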
Multiple Regression Model: Example

Develop a model for estimating the heating oil used for a single-family home in the month of January, based on average temperature and amount of insulation in inches.

Oil (Gal)   Temp   Insulation
275.30      40      3
363.80      27      3
164.30      40     10
 40.80      73      6
 94.30      64      6
230.90      34      6
366.70       9      6
300.60       8     10
237.80      23     10
121.40      63      3
 31.40      65     10
203.50      41      6
441.10      21      3
323.00      38      3
 52.50      58     10

Multiple Regression Model: Example

$$\hat{Y}_i = b_0 + b_1X_{1i} + b_2X_{2i} + \dots + b_kX_{ki}$$

Excel output:

                Coefficients
Intercept       562.1510092
X Variable 1    -5.436580588
X Variable 2    -20.01232067

$$\hat{Y}_i = 562.151 - 5.437X_{1i} - 20.012X_{2i}$$

For each degree increase in temperature, the estimated average amount of heating oil used decreases by 5.437 gallons, holding insulation constant.

For each one-inch increase in insulation, the estimated average use of heating oil decreases by 20.012 gallons, holding temperature constant.

Multiple Regression Analysis

Adjusted coefficient of determination:

$$\bar{R}^2 = 1 - (1 - R^2)\frac{(n-1)}{(n-k)}$$

$$r^2_{Y.12} = \frac{SSR}{SST}$$

Regression Statistics
Multiple R           0.982654757
R Square             0.965610371
Adjusted R Square    0.959878766
Standard Error       26.01378323
Observations         15
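The heating oil regression need not be done by hand or in Excel. The following is a sketch in plain Python, assuming the 15 observations from the table; it solves the normal equations (X'X)b = X'y by Gaussian elimination. The helper names `solve_linear_system` and `fit_ols` are illustrative assumptions, and in practice a statistics package would normally be used instead.

```python
OIL = [275.30, 363.80, 164.30, 40.80, 94.30, 230.90, 366.70, 300.60,
       237.80, 121.40, 31.40, 203.50, 441.10, 323.00, 52.50]
TEMP = [40, 27, 40, 73, 64, 34, 9, 8, 23, 63, 65, 41, 21, 38, 58]
INSUL = [3, 3, 10, 6, 6, 6, 6, 10, 10, 3, 10, 6, 3, 3, 10]


def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(columns, y):
    """Return [b0, b1, ...] for a regression of y on a constant and the given columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


coef = fit_ols([TEMP, INSUL], OIL)
fitted = [coef[0] + coef[1] * t + coef[2] * s for t, s in zip(TEMP, INSUL)]

n, k = len(OIL), len(coef)          # k counts the intercept as a parameter
y_bar = sum(OIL) / n
sst = sum((y - y_bar) ** 2 for y in OIL)
sse = sum((y - f) ** 2 for y, f in zip(OIL, fitted))
r_squared = 1 - sse / sst
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k)

print("coefficients:", [round(c, 3) for c in coef])
print(f"R^2 = {r_squared:.4f}, adjusted R^2 = {adj_r_squared:.4f}")
```

The results can be compared with the Excel output above (intercept 562.151, slopes -5.437 and -20.012, R Square 0.9656, Adjusted R Square 0.9599).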
Interpretation of Coefficient of Multiple Determination

$$r^2_{Y.12} = \frac{SSR}{SST} = .9656$$

96.56% of the total variation in heating oil can be explained by temperature and amount of insulation.

$$r^2_{adj} = .9599$$

95.99% of the total fluctuation in heating oil can be explained by temperature and amount of insulation after adjusting for the number of explanatory variables and sample size.

Testing for Overall Significance

- Shows whether Y depends linearly on all of the X variables together as a group.
- Uses the F test statistic.
- Hypotheses:
  $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ (no linear relationship)
  $H_1$: at least one $\beta_i \ne 0$ (at least one independent variable affects Y)
- The null hypothesis is a very strong statement.
- The null hypothesis is almost always rejected.

Multiple Regression Analysis

Analysis of variance and F statistic:

$$F = \frac{\text{Explained Variation}/(k-1)}{\text{Unexplained Variation}/(n-k)}$$
$$F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}$$

Test for Overall Significance

Excel output: example

ANOVA
              df    SS          MS          F          Significance F
Regression     2    228014.6    114007.3    168.4712   1.65411E-09
Residual      12    8120.603    676.7169
Total         14    236135.2

Here k = 3 is the number of estimated parameters, the regression df is k - 1 = 2 (the number of explanatory variables), the total df is n - 1 = 14, and Significance F is the p-value.

Test for Overall Significance: Example Solution

$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
$H_1$: at least one $\beta_j \ne 0$
$\alpha = .05$
df = 2 and 12
Critical value: 3.89

Test statistic: F = 168.47
Decision: Reject $H_0$ at $\alpha = 0.05$.
Conclusion: There is evidence that at least one independent variable affects Y.
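The two forms of the F statistic given above can be evaluated directly from the ANOVA table. This is a small sketch using the sums of squares and R Square reported in the Excel output; the function names are illustrative assumptions, and the critical value 3.89 is the tabulated figure quoted in the slides.

```python
def f_from_variation(explained, unexplained, n, k):
    """F = [explained / (k - 1)] / [unexplained / (n - k)]."""
    return (explained / (k - 1)) / (unexplained / (n - k))


def f_from_r_squared(r_squared, n, k):
    """Equivalent form: F = [R^2 / (k - 1)] / [(1 - R^2) / (n - k)]."""
    return (r_squared / (k - 1)) / ((1 - r_squared) / (n - k))


N_OBS = 15           # observations
K_PARAMS = 3         # estimated parameters: intercept, temperature, insulation
SSR = 228014.6       # explained (regression) sum of squares
SSE = 8120.603       # unexplained (residual) sum of squares
R_SQUARED = 0.965610371
F_CRIT = 3.89        # tabulated F, 5% level, df = 2 and 12

f_anova = f_from_variation(SSR, SSE, N_OBS, K_PARAMS)
f_r2 = f_from_r_squared(R_SQUARED, N_OBS, K_PARAMS)
reject_h0 = f_anova > F_CRIT

print(f"F from ANOVA = {f_anova:.2f}, F from R^2 = {f_r2:.2f}, reject H0: {reject_h0}")
```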
t Test Statistic

Excel output: example

              Coefficients     Standard Error   t Stat       P-value
Intercept     562.1510092      21.09310433      26.65094     4.77868E-12
Temp          -5.436580588     0.336216167      -16.1699     1.64178E-09
Insulation    -20.01232067     2.342505227      -8.543127    1.90731E-06

$$t = \frac{b_i}{S_{b_i}}$$

The t Stat in the Temp row is the t test statistic for X1 (temperature); the t Stat in the Insulation row is the t test statistic for X2 (insulation).

t Test: Example Solution

Does temperature have a significant effect on monthly consumption of heating oil? Test at $\alpha = 0.05$.

$H_0: \beta_1 = 0$
$H_1: \beta_1 \ne 0$
df = 12
Critical values: -2.1788 and 2.1788 (.025 in each tail)

Test statistic: t = -16.1699
Decision: Reject $H_0$ at $\alpha = 0.05$.
Conclusion: There is evidence of a significant effect of temperature on oil consumption, holding constant the effect of insulation.

Problems in Regression Analysis

- Multicollinearity: two or more explanatory variables are highly correlated.
- Heteroskedasticity: the variance of the error term is not independent of the Y variable.
- Autocorrelation: consecutive error terms are correlated.
- Functional form: the model is misspecified by the omission of a variable.
- Normality: whether the residuals are normally distributed or not.

Practical Consequences of Multicollinearity

- Large variances or standard errors
- Wider confidence intervals
- Insignificant t-ratios
- A high R² value but few significant t-ratios
- OLS estimators and their standard errors tend to be unstable
- Wrong signs for regression coefficients
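The t statistics in the Excel output for the heating oil example are simply each coefficient divided by its standard error. A minimal sketch, assuming the coefficient and standard error values reported above and the tabulated critical value 2.1788 for 12 degrees of freedom; the dictionary layout is an illustrative choice.

```python
T_CRIT_5PCT_12DF = 2.1788  # tabulated two-tailed critical value, 5% level, 12 df

# (coefficient, standard error) pairs from the Excel output.
estimates = {
    "Temp": (-5.436580588, 0.336216167),
    "Insulation": (-20.01232067, 2.342505227),
}

t_stats = {name: coef / se for name, (coef, se) in estimates.items()}
rejects_h0 = {name: abs(t) > T_CRIT_5PCT_12DF for name, t in t_stats.items()}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.4f}, reject H0 at 5%: {rejects_h0[name]}")
```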
Multicollinearity

How can multicollinearity be overcome?
- Increasing the number of observations
- Acquiring additional data
- Using a new sample
- Using experience from a previous study
- Transforming the variables
- Dropping a variable from the model. This is the simplest solution, but the worst one from the standpoint of the economic model, because it can cause a model specification error.

Heteroskedasticity

Heteroskedasticity: the variance of the error term is not independent of the Y variable, i.e., the variance is unequal or non-constant. This means that when both response and explanatory variables increase, the variance of the response variable does not remain the same at all levels of the explanatory variables (common in cross-sectional data).

Homoscedasticity: when both response and explanatory variables increase, the variance of the response variable around its mean value remains the same at all levels of the explanatory variables (equal variance).

Residual Analysis for Homoscedasticity

[Figure: plots of Y against X and of SR against X, one pair showing heteroscedasticity and one pair showing homoscedasticity]

Autocorrelation or serial correlation

Autocorrelation: correlation between observations ordered in time, as in time-series data (i.e., residuals are correlated, for example when consecutive errors have the same sign).

Detecting autocorrelation: this can be done in many ways. The most commonly used is the Durbin-Watson (DW) statistic.

Durbin-Watson Statistic

Test for autocorrelation:

$$d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$$

If d = 2, autocorrelation is absent.
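The Durbin-Watson formula above translates directly into code. This is a minimal sketch; the function name `durbin_watson` is an illustrative choice, and applying it to the ten rounded residuals from the advertising example's standard-error table is an assumption made only to have concrete numbers to work with.

```python
def durbin_watson(residuals):
    """d = sum over t=2..n of (e_t - e_{t-1})^2, divided by sum over t=1..n of e_t^2."""
    numerator = sum((cur - prev) ** 2 for prev, cur in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Rounded residuals e_t from the advertising example (time order 1..10).
residuals = [1.10, 0.63, -4.43, -3.96, 1.57, 2.04, 0.51, 4.51, -1.02, -0.55]

d = durbin_watson(residuals)
print(f"Durbin-Watson d = {d:.4f}")
```

Values of d near 2 point to no autocorrelation; values toward 0 suggest positive and values toward 4 suggest negative autocorrelation, with the formal decision depending on the tabulated bounds for the sample size and number of regressors.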
Residual Analysis for Independence

The Durbin-Watson statistic:
- Used when data are collected over time to detect autocorrelation (residuals in one time period are related to residuals in another period).
- Measures violation of the independence assumption.

$$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$$

D should be close to 2. If not, examine the model for autocorrelation.

Residual Analysis for Independence

Graphical approach: the residuals are plotted against time to detect any autocorrelation.

[Figure: residuals (e) plotted against time; a cyclical pattern indicates residuals that are not independent, no particular pattern indicates independent residuals]

Using the Durbin-Watson Statistic

$H_0$: No autocorrelation (error terms are independent)
$H_1$: There is autocorrelation (error terms are not independent)

Value of d            Decision
0 to dL               Reject H0 (positive autocorrelation)
dL to dU              Inconclusive
dU to 4-dU            Accept H0 (no autocorrelation)
4-dU to 4-dL          Inconclusive
4-dL to 4             Reject H0 (negative autocorrelation)

Steps in Demand Estimation

- Model specification: identify variables
- Collect data
- Specify functional form
- Estimate function
- Test the results

Functional Form Specifications

Linear function:

$$Q_X = a_0 + a_1P_X + a_2I + a_3N + a_4P_Y + \dots + e$$

Power function:

$$Q_X = a(P_X^{b_1})(P_Y^{b_2})$$

Estimation format:

$$\ln Q_X = \ln a + b_1 \ln P_X + b_2 \ln P_Y$$

Dummy-Variable Models

When the explanatory variables are qualitative in nature, they are known as dummy variables.
These can also be described as indicator variables, binary variables, categorical variables, or dichotomous variables, such as variable D in the following equation:

$$Q_x = c_0 + c_1P_x + c_2I + c_3D + \dots + e$$

Dummy-Variable Models

- Categorical explanatory variable with 2 or more levels: yes or no, on or off, male or female
- Use dummy variables (coded as 0 or 1)
- Only intercepts are different
- Assumes equal slopes across categories
- Regression model has the same form
- Can the dependent variable be a dummy?

Dummy-Variable Models

Given: $\hat{Y}_i = b_0 + b_1X_{1i} + b_2X_{2i}$

Y = assessed value of house
X1 = square footage of house
X2 = desirability of neighbourhood (0 if undesirable, 1 if desirable)

Desirable (X2 = 1):

$$\hat{Y}_i = b_0 + b_1X_{1i} + b_2(1) = (b_0 + b_2) + b_1X_{1i}$$

Undesirable (X2 = 0):

$$\hat{Y}_i = b_0 + b_1X_{1i} + b_2(0) = b_0 + b_1X_{1i}$$

Both equations have the same slope $b_1$; only the intercepts differ.

[Figure: Simple and Multiple Regression Compared: Example]
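The house-value dummy-variable model above can be illustrated with a short sketch. The coefficient values below are hypothetical numbers chosen only for illustration (the slides give no estimates for this model), and the function name `predicted_value` is likewise an assumption.

```python
# Hypothetical coefficients, for illustration only (not estimates from the slides).
B0 = 50.0   # intercept for an undesirable neighbourhood
B1 = 0.1    # slope on square footage, shared by both categories
B2 = 20.0   # intercept shift for a desirable neighbourhood


def predicted_value(square_feet, desirable):
    """Yhat = b0 + b1*X1 + b2*X2, where X2 is a 0/1 dummy for desirability."""
    x2 = 1 if desirable else 0
    return B0 + B1 * square_feet + B2 * x2


sizes = [1000, 1500, 2000]
# Gap between the two categories at each size: it should always equal b2.
gaps = [predicted_value(s, True) - predicted_value(s, False) for s in sizes]
# Slope within each category over a 1000 sq ft change: it should always equal b1.
slope_desirable = (predicted_value(2000, True) - predicted_value(1000, True)) / 1000
slope_undesirable = (predicted_value(2000, False) - predicted_value(1000, False)) / 1000

print("gaps:", gaps)
print("slopes:", slope_desirable, slope_undesirable)
```

The constant gap and identical slopes show that the dummy shifts only the intercept, which is the "same slopes" property stated on the slide.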
Regression Analysis in Practice

Suppose we have an employment (labor demand) function as follows:

N = Constant + K + W + AD + P + WT

N: employees in employment
K: capital accumulation
W: value of real wages
AD: aggregate deficit
P: effect of world manufacturing exports on employment
WT: the deviation of world trade from trend

Output by Microfit v4.0w

Ordinary Least Squares Estimation
*******************************************************************************
Dependent variable is LOGN
39 observations used for estimation from 1956 to 1994
*******************************************************************************
Regressor     Coefficient    Standard Error    T-Ratio[Prob]
CON           4.9921         .98407            5.0729[.000]
LOGK          .040394        .012998           3.1078[.004]
LOGW          .024737        .010982           2.2526[.032]
AD            -.9174E-7      .1587E-6          -.57798[.567]
LOGP          .026977        .0099796          2.7032[.011]
LOGWT         -.053944       .024279           -2.2219[.034]
*******************************************************************************
R-Squared                    .82476      F-statistic F( 6, 33)        20.8432[.000]
R-Bar-Squared                .78519      S.E. of Regression           .012467
Residual Sum of Squares      .0048181    Mean of Dependent Variable   10.0098
S.D. of Dependent Variable   .026899     Maximum of Log-likelihood    120.1407
DW-statistic                 1.8538
*******************************************************************************
Diagnostic Tests

******************************************************************************
Test Statistics           LM Version                    F Version
******************************************************************************
A: Serial Correlation     CHI-SQ(1) = .051656[.820]     F(1, 30) = .039788[.843]
B: Functional Form        CHI-SQ(1) = .056872[.812]     F(1, 30) = .043812[.836]
C: Normality              CHI-SQ(2) = 1.2819[.527]      Not applicable
D: Heteroscedasticity     CHI-SQ(1) = 1.0065[.316]      F(1, 37) = .98022[.329]
******************************************************************************
A: Lagrange multiplier test of residual serial correlation
B: Ramsey's RESET test using the square of the fitted values
C: Based on a test of skewness and kurtosis of residuals
D: Based on the regression of squared residuals on squared fitted values

Dependent Variable: LOGN

Explanatory variable    Coefficient (t-ratio)
CON                     4.9921 (5.07)
LOGK                    0.0404 (3.10)
LOGW                    0.0247 (2.25)
AD                      -0.9174E-7 (-0.577)
LOGP                    0.0269 (2.70)
LOGWT                   -0.0539 (-2.22)

R2        0.87
bar R2    0.83
DW        2.16
SER       0.021
X2SC      .05165[.820]
X2FF      .05687[.812]
X2NORM    1.2819[.527]
X2HET     1.0065[.316]

Interpretation

t-test (individual significance)

Let us first look at the significance of each variable. Here n = 39 and k = 6, hence d.f. = 39 - 6 = 33, and $\alpha = 0.05$ (a 95% confidence level). With $\alpha = 0.05$ and d.f. = 33, $t_{tab} = 2.045$.

Our hypotheses are:
$H_0: \beta = 0$ (not significant)
$H_1: \beta \ne 0$ (significant)

Using the t-distribution, we decide whether the calculated t-value of each variable is significant by comparing it with the tabulated t-value: a variable is significant when the absolute value of its t-ratio exceeds the tabulated value.
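The individual significance check described above can be applied to every regressor in the Microfit output at once. This is a small sketch, assuming the coefficients and standard errors reported in the output and the tabulated value 2.045 quoted in the slides; the dictionary and variable names are illustrative.

```python
T_TAB = 2.045  # tabulated t value quoted in the slides for alpha = 0.05

# (coefficient, standard error) pairs from the OLS output.
regressors = {
    "CON": (4.9921, 0.98407),
    "LOGK": (0.040394, 0.012998),
    "LOGW": (0.024737, 0.010982),
    "AD": (-0.9174e-7, 0.1587e-6),
    "LOGP": (0.026977, 0.0099796),
    "LOGWT": (-0.053944, 0.024279),
}

t_ratios = {name: coef / se for name, (coef, se) in regressors.items()}
significant = [name for name, t in t_ratios.items() if abs(t) > T_TAB]
insignificant = [name for name, t in t_ratios.items() if abs(t) <= T_TAB]

for name, t in t_ratios.items():
    print(f"{name:6s} t = {t:8.4f}")
print("significant:", significant)
print("not significant:", insignificant)
```

On these figures only AD fails to exceed the tabulated value, which agrees with its reported probability of .567.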
F-test (overall significance)

Our result is F(6, 33) = 20.8432, with k - 1 = 5 and n - k = 33, and $\alpha = 0.05$ (a 95% confidence level). With $\alpha = 0.05$ and F(6, 33), $F_{tab} = 2.34$.

Our hypotheses are:
$H_0: R^2 = 0$ (not significant)
$H_1: R^2 \ne 0$ (significant)

Since the calculated F of 20.8432 exceeds the tabulated value, we reject $H_0$: the regression as a whole is significant.

Diagnostic Tests

Serial correlation:
$H_0: \rho = 0$ (no autocorrelation)
$H_1: \rho \ne 0$ (autocorrelation)

Since CHI-SQ(1) = 0.051656 < $\chi^2$ = 3.841, we do not reject $H_0$: the estimated regression does not have first-order serial correlation or autocorrelation.

Functional form:
$H_0$: no functional misspecification
$H_1$: functional misspecification

The estimated LM version of CHI-SQ is 0.056872, and with $\alpha = 0.05$ the tabular value is $\chi^2$ = 3.841. Because CHI-SQ(1) = 0.056872 < $\chi^2$ = 3.841, we do not reject the null hypothesis: there is no evidence of functional misspecification.

Normality:
$H_0$: residuals are normally distributed
$H_1$: residuals are not normally distributed

Our estimated LM-version result for normality is CHI-SQ(2) = 1.2819, and the tabular value with 2 restrictions and $\alpha = 0.05$ is $\chi^2$ = 5.991. Since CHI-SQ(2) = 1.2819 < $\chi^2$ = 5.991, the null hypothesis of normality of the residuals is accepted.

Heteroscedasticity:
$H_0: \sigma_t^2 = \sigma^2$ (homoscedasticity)
$H_1: \sigma_t^2 \ne \sigma^2$ (heteroscedasticity)

The LM version of our result for heteroscedasticity is CHI-SQ(1) = 1.0065, and the tabular critical value with 1 restriction and $\alpha = 0.05$ is $\chi^2$ = 3.841.
Since CHI-SQ(1) = 1.0065 < $\chi^2$ = 3.841, we do not reject the null hypothesis that the variance of the error term is constant for all values of the independent variables.

The End

Thanks