Sample Project
Dr. Jantzen
ECO 310 Spring 2010
Term Project

An Econometric Analysis of US Consumption

INTRODUCTION

The US economy is consumer driven. Personal Consumption Expenditures (PCE) account for 70% of the United States' Gross Domestic Product (GDP). In other words, more than ⅔ of all goods and services produced each year in the US are purchased by consumers. Therefore, the United States economy relies very heavily on its consumers. After the economic meltdown that began in 2007, most economic measures focused on the American consumer. Today, more than ever, personal consumption expenditures are vital to attaining a sustainable economic recovery in America.

From an economic standpoint, personal consumption expenditures depend on a series of factors. Using multiple regression analysis, this paper examines the effects of three explanatory variables, Disposable Personal Income (DSPI), Inflation Rate (CPI) and Bank Prime Loan Rate (MPRIME), on Personal Consumption Expenditures (PCE) in the United States from January 1992 to January 2003. PCE and DSPI are measured in billions of dollars, while CPI and MPRIME are measured as percentages.

LITERATURE REVIEW

In 2007, Michael Curran conducted a similar econometric study to examine the relationship between consumption expenditures and two explainers, income and interest rates. Curran's study sought to investigate Keynes' consumption function theory, focusing primarily on the effects of real income per capita and nominal interest rates (adjusted for inflation) on real consumption expenditure per capita in the United States (Curran, 2007). In his analysis, Curran used data from the first quarter of 1949 through the third quarter of 2006.

First, Curran plotted personal consumption expenditure (PCE) against personal disposable income (PDI) to show the close relationship between the two variables and the upward trend: more disposable income is associated with higher consumption expenditure.
Yet, when Curran plotted PCE against the bank prime loan rate (PRIME), it became very hard to identify a clear correlation between the two variables (Curran, 2007). Lastly, when he plotted PDI against PRIME, the graph showed a scattered pattern. Thus, he concluded that there was no multicollinearity between his explanatory variables (Curran, 2007).

[Figures from Curran (2007): PCE against PDI; PCE against PRIME; changes in PDI against changes in PRIME]

Curran's econometric study used time-series data. After running his regression models and examining his estimated coefficients (β), Curran observed that his three variables PCE, PDI and PRIME were non-stationary (Curran, 2007). In other words, the series did not have a constant mean or variance over time. According to Curran's findings, the averages for PCE and PDI "respectively rise over time, and the variation in PRIME also changes over time" (Curran, 2007).

After running a Durbin-Watson test, Curran obtained a sample D-W value of .245. He noted that, as a result, the estimates obtained by OLS (Ordinary Least Squares) were unreliable (Curran, 2007). A Durbin-Watson value this far below 2 points to strong positive serial correlation, so he concluded that the errors were serially correlated and that the OLS results should not be used. In order to account for the misleading results, Curran re-estimated his model using a new equation:

ΔPCE_t = β0 + β1 ΔPDI_t + β2 ΔPRIME_t + u_t

where:
ΔPCE = quarterly change in real personal consumption expenditure per person employed.
ΔPDI = quarterly change in real personal disposable income per person employed.
ΔPRIME = quarterly change in bank prime loan rate.
u = residual (Curran, 2007).

After running descriptive statistics on his re-estimated model, Curran obtained the following summary, followed by the new regression results:

Variable          PCE (US$)    PDI (US$)    PRIME (%)
Maximum           31,895.80    33,483.40    20.3233
Minimum           12,654.10    13,619.20     2
Mean              22,166.20    24,551.40     7.1076
Std. Deviation     4,904.80     5,101.00     3.4355
Avg. Growth        0.004079     0.003881     0.015892

Regression Results:

Regressor    Coefficient    Standard Error    T-Ratio [Prob]
CONSTANT      52.9116       11.7046            4.5206 [.000]
ΔPDI           0.38058       0.043485          8.7521 [.000]
ΔPRIME       -34.5743       10.8581           -3.1842 [.002]

Relevant Statistics:

Statistic        Value
R-Squared        0.30257
R-Bar-Squared    0.29643
F-Statistic      F(2,227): 49.2406 [.000]
DW-statistic     2.3422

The R-squared value of .30257 indicated that only 30% of the variation in PCE was tied to personal disposable income and interest rates. Curran concluded that there was "sufficient evidence of closeness to fit" (Curran, 2007). However, his model fits the data weakly, as 70% of the variation in his dependent variable is not "explained" by his explainers. Furthermore, Curran conducted t-tests on his population coefficients and found that the regression coefficients β0, β1 and β2 were all significantly different from zero, indicating that all of his explainers have an effect on PCE, the dependent variable.

According to Curran, the new Durbin-Watson test displayed evidence of negative autocorrelation (Curran, 2007).
He further used the Breusch-Godfrey test and concluded that serial correlation was present in his econometric model (Curran, 2007). Towards the end of his study, Curran used a 95% confidence interval for each of the population coefficients in order to determine the range within which the true population coefficient would lie. In his conclusion, Curran stated that Keynes would have been proud of his linear regression results, provided that he agreed with his interpretation (Curran, 2007).

DATA AND METHODOLOGY

In order to examine the effects of disposable personal income, inflation rate and bank prime loan rate on personal consumption expenditures in the US, a database containing monthly data from 1992 to 2003 was used. Since this is a time series analysis, the following model was estimated:

Y_t = β0 + β1 DSPI_t + β2 CPI_t + β3 MPRIME_t + E_t

where:
Dependent variable = Y: Personal Consumption Expenditure (PCE).
Explainers = DSPI (Disposable Personal Income), CPI (Inflation Rate) and MPRIME (Bank Prime Loan Rate).
Residual (error term) = E_t.
β0 = constant: the expected value of Y if all explainers are equal to zero.
β1, β2, β3 = population regression coefficients: each shows how many units the dependent variable will change if the corresponding explainer changes by one unit, holding the other explainers constant.

The model suggests that there is a relationship between the dependent variable and the explainers. It is believed that disposable income directly impacts consumer spending. As evidenced by the recent recession, consumer spending declines significantly during times of economic uncertainty. It is also anticipated that consumption expenditure is sensitive to inflation, because higher inflation reduces the consumer's buying power. With regard to interest rates, the prime rate is also expected to impact consumer spending, as the rate is directly tied to lending and access to credit.
This analysis will also check for statistical problems such as multicollinearity and autocorrelation, and make corrections where necessary in order to improve the statistical results.

EMPIRICAL RESULTS

Obtaining descriptive statistics is among the first steps in a regression analysis, because they summarize the data in an easy and understandable way.

Descriptive Statistics:

All observations in current sample
Variable    Mean          Std.Dev.      Minimum       Maximum       Cases
PCE         5667.44060    1016.68993    4108.50000    7477.50000    133
DSPI        6127.94361    1006.34991    4633.30000    8024.30000    133
CPI         2.56917293    .650894542    1.10000000    3.80000000    133
MPRIME      7.44037594    1.44765244    4.25000000    9.50000000    133

The above table shows the averages, standard deviations, minimums and maximums for a 133-month period in the United States. According to the table, average consumption expenditure is 5667.4 billion dollars, average disposable personal income is 6127.9 billion dollars, average CPI inflation is 2.57% (annualized), and the average bank prime rate is 7.44%. The standard deviations show how much the observations typically differ from the average. The minimum and maximum indicate the smallest and largest value (in billions of dollars or percentages) for each variable.

Since none of the pairwise correlation coefficients among the explainers DSPI, CPI and MPRIME is strong (absolute value ≥ .9), there is no problem of multicollinearity in this analysis.
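The screening rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `flag_collinear_pairs` is hypothetical, the .9 cut-off is the rule of thumb used in this paper, and the three correlations are the explainer-to-explainer entries of the correlation matrix reported in this section.

```python
# Pairwise correlations among the explainers, copied from the correlation
# matrix reported in this section.
EXPLAINER_CORR = {
    ("DSPI", "CPI"): -0.31543,
    ("DSPI", "MPRIME"): -0.08853,
    ("CPI", "MPRIME"): 0.28436,
}

# Rule of thumb used in the text: |r| >= 0.9 signals multicollinearity.
THRESHOLD = 0.9


def flag_collinear_pairs(corr, threshold=THRESHOLD):
    """Return the explainer pairs whose absolute correlation meets the threshold."""
    return [pair for pair, r in corr.items() if abs(r) >= threshold]


flagged = flag_collinear_pairs(EXPLAINER_CORR)
print("Pairs flagged for multicollinearity:", flagged)
```

An empty list means no pair of explainers reaches the cut-off. A pairwise screen of this kind cannot detect collinearity that involves three or more explainers at once, so it is a first check rather than a complete diagnosis.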
See the correlation results below:

Correlation Matrix for Listed Variables:

          PCE        DSPI       CPI        MPRIME
PCE       1.00000    .99812    -.30542    -.06624
DSPI       .99812   1.00000    -.31543    -.08853
CPI       -.30542   -.31543    1.00000     .28436
MPRIME    -.06624   -.08853     .28436    1.00000

Ordinary Least Squares Regression Results:

+-----------------------------------------------------------------------+
| Ordinary least squares regression    Weighting variable = none        |
| Dep. var. = PCE      Mean= 5667.440602 , S.D.= 1016.689929            |
| Model size: Observations = 133, Parameters = 4, Deg.Fr.= 129          |
| Residuals: Sum of squares= 444215.9562 , Std.Dev.= 58.68164           |
| Fit: R-squared= .996744, Adjusted R-squared = .99667                  |
| Model test: F[ 3, 129] =13164.64, Prob value = .00000                 |
| Diagnostic: Log-L = -728.2810, Restricted(b=0) Log-L = -1109.1498     |
| LogAmemiyaPrCrt.= 8.174, Akaike Info. Crt.= 11.012                    |
| Autocorrel: Durbin-Watson Statistic = .79803, Rho = .60099            |
+-----------------------------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |t-ratio |P[|T|>t] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Constant   -660.3852885    49.462718       -13.351   .0000
 DSPI        1.011677878    .53484069E-02   189.155   .0000   6127.9436
 CPI         6.938780130    8.5913921          .808   .4208   2.5691729
 PRIME       14.85059038    3.6801083         4.035   .0001   7.4403759

Note: E+nn or E-nn means multiply by 10 to the + or -nn power.

After plotting the residuals and observing the shape of the graph, it became evident that the errors could be serially correlated. Serially correlated errors follow a pattern over time. For this analysis, it means that if last month's error was negative, this month's error will most likely be negative too (this period depends on what happened last period).
From the graph, we clearly see the pattern: all negative, then all positive, then all negative again. Assuming that the errors are serially correlated, an adjusted model was used:

Y_t = β0 + β1 DSPI_t + β2 CPI_t + β3 MPRIME_t + E_t

where E_t = ρ E_(t-1) + V_t, that is, a fraction ρ of last month's error term plus a random factor V_t.

With serial correlation, t-statistics are usually too big, because the estimated standard errors are too small.

Serial Correlation Durbin-Watson Test:

In order to test whether the OLS regression results suffer from serial correlation, the Durbin-Watson test was used:

Hypotheses:
Null H0: ρ = 0
Alternative HA: ρ > 0
Sample D-W statistic: .798
Critical D-W statistics: dL = 1.61, dU = 1.74 at the 5% significance level
Decision: Since the sample statistic is less than dL, the null is rejected and we can be 95% confident that there is a positive correlation between last month's and this month's error terms. Therefore, we have serial correlation and cannot rely on the OLS regression results.

When the error terms are serially correlated, there are consequences associated with using OLS estimates. First, the estimated standard errors of the coefficients (β) will be biased downward (too small), causing the sample t's to be too big. Hence, we are inclined to reject the null hypothesis more often than we should. Second, the formulas used for finding the OLS coefficients, while unbiased, are inefficient. In other words, there is a lot of dispersion around the true β.

To correct for the serial correlation, the Prais-Winsten correction was employed. The Prais-Winsten correction removes the serial correlation from the error term and generates coefficient estimates that are unbiased and have corrected standard errors; new standard errors and sample t's are calculated after the correction is run.
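The Durbin-Watson statistic and the decision rule applied above can be sketched as follows. This is a minimal illustration in plain Python, not the routine used by the software that produced the output in this paper; the function names are hypothetical, and the last lines rely on the standard approximation D-W ≈ 2(1 - ρ) as a consistency check against the reported Rho of .60099.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def dw_decision(dw, d_lower, d_upper):
    """One-sided test of H0: rho = 0 against HA: rho > 0."""
    if dw < d_lower:
        return "reject H0: positive serial correlation"
    if dw > d_upper:
        return "do not reject H0"
    return "inconclusive"


# Values reported in the text: sample D-W = .798, dL = 1.61, dU = 1.74.
decision = dw_decision(0.798, 1.61, 1.74)
print(decision)

# Consistency check using the approximation D-W = 2 * (1 - rho).
approx_dw = 2 * (1 - 0.60099)
print(f"2 * (1 - rho) = {approx_dw:.5f}")
```

The approximation gives a value very close to the reported Durbin-Watson statistic of .79803, which suggests that the two figures in the OLS output are consistent with each other.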
Prais-Winsten Corrected Regression Results:

+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Constant   -570.4907167    94.985620        -6.006   .0000
 DSPI        1.002101497    .10593107E-01    94.599   .0000   6127.9436
 CPI         5.844973323    13.445927          .435   .6638   2.5691729
 MPRIME      10.84200799    7.1723606         1.512   .1306   7.4403759
 RHO         .6225894017    .68112120E-01     9.141   .0000

Note: E+nn or E-nn means multiply by 10 to the + or -nn power.

Finally, after comparing the Prais-Winsten corrected results to the OLS results, it was concluded that the OLS standard errors and t-ratios are unreliable for this type of analysis. The estimated coefficients also changed between OLS and the corrected results (for example, the coefficient on MPRIME fell from 14.85 to 10.84).

Estimated Coefficients and Confidence Intervals:

Estimated coefficients show how many units the dependent variable will change if the explainer changes by one unit. Based on the re-estimated regression with the correction for serial correlation, one could be tempted to interpret the coefficients as follows:

If DSPI changes by 1 billion dollars, PCE will change by about 1 billion dollars.
If CPI changes by 1 percentage point (annualized), PCE will change by 5.84 billion dollars.
If MPRIME changes by 1 percentage point, PCE will change by 10.84 billion dollars.

However, such an interpretation is valid only if the explainers have a real effect upon the dependent variable. Therefore, t-tests were conducted on the population regression coefficients to check whether each of the explainers had an effect on PCE. The results were surprising.

Coefficient    H0        HA        Sample T    Critical T (5% significance level)
β1 (DSPI)      β1 = 0    β1 ≠ 0    94.6        1.98
β2 (CPI)       β2 = 0    β2 ≠ 0    0.44        1.98
β3 (MPRIME)    β3 = 0    β3 ≠ 0    1.5         1.98

Decision:
β1: Reject the null. We are 95% sure that the coefficient on DSPI is ≠ 0. DSPI has an effect on PCE.
β2: Can't reject the null. There's not enough evidence that CPI affects PCE.
β3: Can't reject the null. There's not enough evidence that MPRIME affects PCE.

As evidenced by the above t-test results, only Disposable Personal Income was found to have an effect on PCE. There was not enough evidence to state that the coefficients on CPI and MPRIME were different from zero, and therefore we cannot state with any confidence that CPI and MPRIME have an effect on Personal Consumption Expenditure.

Moreover, the Confidence Interval (CI) results supported the t-test findings. The formula used for the CI in this study was B ± (critical t x standard error of B). The 90% confidence interval for the population B1 went from 0.98 to 1.02. Therefore, we are 90% sure that the true population coefficient B1 is between .98 and 1.02: more disposable income means more consumption. By contrast, the 90% confidence interval for the population B2 was between -16.49 and 28.08. We are 90% confident that the true population coefficient lies in this range, but because the range contains zero, there is no evidence that CPI has an effect on PCE (the effect could be positive, negative or zero). Similarly, the 90% confidence interval for the population B3 ran from -1.15 to 22.75, which also contains zero, so there is no evidence that MPRIME has an effect on consumer spending.

Goodness of Fit:

The quality of the model can be assessed through the R-squared. The R-squared measures the proportion of the variation in the dependent variable that is explained by variation in the explainers. The formula for computing the R-squared is:

R-squared = 1 - (Unexplained Variation / Total Variation) = .9966
Total Variation = (Standard Deviation of Y)^2 x (N - 1), where N = sample size (136445613.5)
Unexplained Variation = (Standard Deviation of the residuals e_t)^2 x (N - number of coefficients) (462853.29)

The numbers in parentheses after each formula are the results of the computations.

There is no major difference between the OLS R-squared and the R-squared for the corrected results.
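The confidence-interval formula and the R-squared arithmetic described in this section can be checked with a short Python sketch. Two inputs are assumptions rather than values stated in the paper: the 90% critical t of 1.66, and the use of the standard deviation of PCE rounded to 1016.7. The helper name `confidence_interval` is hypothetical. Because of the assumed critical value and rounding, the computed interval limits may differ slightly in the second decimal from those reported above.

```python
def confidence_interval(b, std_err, critical_t):
    """Return (low, high) for b +/- critical_t * std_err."""
    half_width = critical_t * std_err
    return b - half_width, b + half_width


# Assumption: 1.66 as the two-sided 90% critical t; the paper does not state
# the value it used.
CRITICAL_T_90 = 1.66

# (coefficient, standard error) from the Prais-Winsten corrected results.
ESTIMATES = {
    "DSPI": (1.002101497, 0.010593107),
    "CPI": (5.844973323, 13.445927),
    "MPRIME": (10.84200799, 7.1723606),
}

intervals = {
    name: confidence_interval(b, se, CRITICAL_T_90)
    for name, (b, se) in ESTIMATES.items()
}
for name, (low, high) in intervals.items():
    verdict = "contains zero" if low <= 0.0 <= high else "excludes zero"
    print(f"{name}: [{low:.2f}, {high:.2f}] {verdict}")

# R-squared from the two variation totals described in the text.
N = 133
SD_Y = 1016.7  # assumption: standard deviation of PCE rounded to one decimal
total_variation = SD_Y ** 2 * (N - 1)
unexplained_variation = 462853.29  # value reported in the text
r_squared = 1 - unexplained_variation / total_variation
print(f"Total variation: {total_variation:.1f}")
print(f"R-squared: {r_squared:.4f}")
```

An interval that contains zero leads to the same conclusion as failing to reject the null in the corresponding two-sided t-test at the matching significance level, which is why the two sets of findings agree.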
The above R-squared value of .997 indicates that the model fits the data closely, as more than 99% of the variation in PCE is tied to differences in DSPI, CPI, and MPRIME. However, since there's no evidence that CPI and MPRIME affect PCE, most of the variation in PCE may be attributed to changes in disposable personal income alone.

Standardized Coefficients:

Standardized coefficients show how many standard deviations the dependent variable will change if the explainer changes by one standard deviation. In short, standardized coefficients are used to pick which explainer is the most important; the larger the coefficient in absolute value, the stronger the relationship between the dependent variable and the explainer.

Formula: B* = β x (standard deviation of explainer / standard deviation of dependent variable)

After making the calculations, the following results were obtained for the estimated coefficients:

B1 = .99: Consumption expenditure will change by .99 standard deviations if disposable personal income changes by one standard deviation.
B2 = .0037: Consumption expenditure will change by .0037 standard deviations if CPI inflation changes by one standard deviation.
B3 = .015: Consumption expenditure will change by .015 standard deviations if the bank prime rate changes by one standard deviation.

Thus, disposable personal income (DSPI) has the greatest effect on personal consumption expenditures (PCE). Based on the standardized coefficients, CPI and MPRIME seem to have a relatively small effect on the dependent variable.

Specification Bias:

Leaving out an important explanatory variable can lead to biased results. In this analysis, the true model is Y_t = β0 + β1 DSPI_t + β2 CPI_t + β3 MPRIME_t + E_t. But suppose that we decided to omit the DSPI variable. The equation then becomes:

Y_t = β0 + β2 CPI_t + β3 MPRIME_t + E_t

The sign of the bias on any explainer's coefficient = sign of the omitted explainer's coefficient x sign of the partial correlation between the omitted and examined explainers.
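The sign rule just stated can be sketched in Python as follows. This is an illustration under one labeled assumption: the simple pairwise correlations from the correlation matrix in this paper stand in for the partial correlations that the rule calls for. The function names are hypothetical.

```python
def sign(x):
    """Return +1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def omitted_variable_bias_sign(omitted_coef, corr_with_omitted):
    """Direction of the bias on an included explainer's coefficient."""
    labels = {1: "positive", -1: "negative", 0: "none"}
    return labels[sign(omitted_coef) * sign(corr_with_omitted)]


# Coefficient on the omitted explainer (DSPI) from the corrected regression.
DSPI_COEF = 1.002101497

# Assumption: simple correlations with DSPI from the correlation matrix are
# used in place of partial correlations.
CORR_WITH_DSPI = {"CPI": -0.31543, "MPRIME": -0.08853}

bias = {
    name: omitted_variable_bias_sign(DSPI_COEF, r)
    for name, r in CORR_WITH_DSPI.items()
}
for name, direction in bias.items():
    print(f"Bias on the {name} coefficient when DSPI is omitted: {direction}")
```

The rule gives only the direction of the bias, not its size; the size also depends on how strongly the omitted explainer is related to the included ones.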
Therefore:
B2: (+) x (-) = negative. Any coefficient we get for B2 will be too small.
B3: (+) x (-) = negative. Any coefficient we get for B3 will be too small.

Specification errors emphasize the unreliability of OLS estimates. They take two forms: first, omitting one or more relevant explainers, which biases the estimated coefficients; and second, including one or more irrelevant explainers, which does not bias the coefficients but makes the estimates less precise. We should therefore be less worried about including variables that do not belong. However, it is imperative that we do not leave out important explanatory variables like disposable personal income.

CONCLUSION

After a thorough econometric analysis of the effects of DSPI, CPI and MPRIME on Personal Consumption Expenditures, a few important implications were drawn. First and foremost, one cannot simply rely on Ordinary Least Squares regression results, because they can be deceptive and biased. Furthermore, time series data often produce serially correlated error terms, a problem also known as autocorrelation. Accordingly, it is important to identify the pattern and make the necessary corrections. Based on the R-squared value obtained through the corrected regression results, one can argue that this is a high quality model. Nevertheless, the t-test results on the population coefficients and the confidence interval findings indicate a lack of evidence that two of the three explanatory variables (CPI and MPRIME) affect PCE. It is possible that collecting more data could improve the results and the model itself. Overall, it became clear that disposable income has the biggest effect on US consumption expenditure, with a correlation of almost one (.998).

Reference:

Curran, M. (2007). "Keynes Re-interpreted - An Econometric Investigation of Keynes' Consumption Function Theory in Postwar America." Student Economic Review, Volume 21: 59-72.