Notes: Using a Dummy Variable

Because models describe generalities, a model will not usually predict an unusual change in a variable, resulting in a large “error,” or difference between the model and the actual level of the variable. Because the “t-stat” associated with an estimated coefficient is equal to the value of the coefficient divided by its standard error, any increase in the error makes it more likely that the t-stat will be close to zero, which implies it is more likely that the coefficient is judged “insignificantly different from zero.” By reducing the error associated with unusual changes in a variable, the introduction of a “dummy variable” can increase t-stats (in absolute value) and thereby increase the likelihood that a coefficient is judged “significantly different from zero.” This set of notes demonstrates the use of a dummy variable in the process of determining whether or not the U.S. employment growth rate is decreasing.

Exponential Growth and the Log Difference

If a variable $y$ is growing at a constant rate $r$, starting at an initial level $y_0$, then we can write

(1) $y_t = y_0 e^{rt}$

This implies that for some earlier point in time $t-1$, we can write

(2) $y_{t-1} = y_0 e^{r(t-1)}$

Taking the natural log of (1), we have $\ln(y_t) = \ln(y_0) + rt$. Taking the natural log of (2), we have $\ln(y_{t-1}) = \ln(y_0) + r(t-1)$. Subtracting the latter equation from the former, we can construct the “log difference” of the variable, $\ln(y_t) - \ln(y_{t-1})$, and doing so, we find

(3) $\ln(y_t) - \ln(y_{t-1}) = [\ln(y_0) + rt] - [\ln(y_0) + r(t-1)]$,

which reduces to

(4) $\ln(y_t) - \ln(y_{t-1}) = r$.

From conditions (1)-(4), we learn the log difference is equal to the (continuously compounded) growth rate of a variable over the time period. It is common to use the log difference as the measure of the growth rate of a variable, rather than the percentage change $(y_t - y_{t-1})/y_{t-1}$.

The Growth Rate of Employment in the U.S.
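As a quick numerical check of results (1)-(4) before turning to the data, the sketch below builds a series that grows at a constant continuously compounded rate and compares its log difference with the percentage change. The rate of 0.02, the starting level of 100, and the function names are illustrative assumptions, not values taken from these notes.

```python
import math

def log_difference(y_prev, y_curr):
    """Continuously compounded growth rate: ln(y_t) - ln(y_{t-1})."""
    return math.log(y_curr) - math.log(y_prev)

def percent_change(y_prev, y_curr):
    """Simple percentage change: (y_t - y_{t-1}) / y_{t-1}."""
    return (y_curr - y_prev) / y_prev

# Hypothetical series following equation (1): y_t = y_0 * e^(r t).
r = 0.02      # assumed constant growth rate
y0 = 100.0    # assumed initial level
levels = [y0 * math.exp(r * t) for t in range(6)]

# Growth measured both ways for each pair of consecutive periods.
log_diffs = [log_difference(a, b) for a, b in zip(levels, levels[1:])]
pct_changes = [percent_change(a, b) for a, b in zip(levels, levels[1:])]

for d, p in zip(log_diffs, pct_changes):
    print(f"log difference = {d:.6f}, percentage change = {p:.6f}")
```

For a small rate the two measures are close, but only the log difference returns r itself; the percentage change returns e^r - 1, which is slightly larger.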
Let $L_t$ denote the level of employment during period $t$, so that $\ln(L_t) - \ln(L_{t-1})$ is the growth rate of employment. The following figure presents the plot of this employment growth rate for the U.S. over the 1948-2006 period.

[Figure: U.S. Employment Growth Rate. Vertical axis: percent change from previous year, -4.0% to 8.0%. Horizontal axis: year, 1945 to 2010.]

One question of interest is whether the employment growth rate is changing on average over time. Clearly, examining the figure above, the growth rate exhibits some volatility, meaning it changes dramatically on occasion. However, is the employment growth rate trending up or down? To answer this last question, we fit the following first-order polynomial model to the growth rate data:

(5) $g_{L_t} = a_0 + a_1 t$,

where $g_{L_t}$ is the employment growth rate during period $t$ as measured by the log difference. Regressing this growth rate on the time variable $t$, we obtain

(6) $g_{L_t} = \underset{(0.005)^{***}}{0.022} - \underset{(0.0002)}{0.00014}\,t, \qquad R^2 = .0157$

where the standard error is presented in parentheses under each estimated coefficient.

The estimated model (6) indicates that the growth rate of employment is decreasing slightly over time, but this decreasing trend is “not significant.” In particular, the standard error of 0.0002 is large relative to the coefficient estimate of -0.00014 on the $t$ variable, implying a relatively small t-stat (the coefficient divided by its standard error), which the regression package reports as -0.94. (The rounded values shown in (6) give about -0.00014/0.0002 = -0.7; the reported figure presumably reflects unrounded estimates. Since the t-stat can be obtained by dividing in this way, it is common practice to present the standard error or the t-stat but not both.) Roughly, t-stats in the range [-2, 2] indicate non-significance, while t-stats outside this range indicate significance. Thus, the “test statistic” of -0.94 indicates non-significance. The p-value reported for the -0.00014 estimate is .35, confirming the non-significance conclusion.
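The arithmetic behind the t-stat and its p-value can be sketched as follows. The block uses the figures quoted in the text (coefficient -0.00014, standard error 0.0002, reported t-stat -0.94). The function names are hypothetical, and the p-value relies on a standard normal approximation to the t distribution, which is an assumption made here for simplicity rather than the method of any particular regression package.

```python
from statistics import NormalDist

def t_stat(coefficient, standard_error):
    """t-stat = estimated coefficient divided by its standard error."""
    return coefficient / standard_error

def two_sided_p_value(t):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))

def is_significant(t, threshold=2.0):
    """Rule of thumb from the text: |t| above about 2 indicates significance."""
    return abs(t) > threshold

# t-stat implied by the rounded values displayed in (6).
t_rounded = t_stat(-0.00014, 0.0002)

# t-stat as reported in the text, and its approximate p-value.
t_reported = -0.94
p_reported = two_sided_p_value(t_reported)

print(f"t from rounded values: {t_rounded:.2f}")
print(f"approximate p-value for t = {t_reported}: {p_reported:.2f}")
print(f"significant by the rule of thumb: {is_significant(t_reported)}")
```

Under this approximation the p-value for a t-stat of -0.94 comes out at about .35, in line with the value quoted above.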
We have only 65 percent confidence that the coefficient on $t$ is different from zero, and we want at least 90 percent confidence. The p-value is not shown in (6), but the absence of asterisks on the standard error indicates that we have not attained the 90 percent confidence level.

Introducing a Dummy Variable

Notice in the figure above that the employment growth rate is negative in only a few instances. Examining the data, we find a negative employment growth rate for the years 1949, 1954, 1958, 1970, 1971, 1975, 1982, 1991, 2001, 2002, and 2003. If we think of these years as being “unusual” recessionary years, then it is reasonable to adjust our thinking about the trend so that we do not give these unusual periods undue influence. One way to make this adjustment is by introducing a “dummy variable.” Define a new variable $D$ that is equal to 1 during a year when the employment growth rate is negative and equal to 0 for all other years. This dummy variable is also referred to as an “indicator” variable because the value of 1 indicates something, in this case negative employment growth. When we include the dummy variable in the regression, our polynomial model for the employment growth rate becomes

(7) $g_{L_t} = a_0 + a_1 t + a_s D$.

Estimating this model, we obtain

(8) $g_{L_t} = \underset{(0.003)^{***}}{0.027} - \underset{(0.0001)}{0.00012}\,t - \underset{(0.0046)^{***}}{0.037}\,D, \qquad R^2 = .5483$

Comparing (6) and (8), notice that the introduction of the dummy variable dramatically increased the $R^2$. This indicates that much of the error in the model (6) is due to the unusual years where the employment growth rate dipped negative. The coefficient on the time variable $t$ still hints at a downward trend in the employment growth rate, but the fact that there are no asterisks on the standard error indicates that this downward trend is not significant. Thus, while the introduction of the dummy variable can change the significance of another coefficient, it did not do so in this case.
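A minimal sketch of how D can be constructed and a model of the form (7) estimated is given below. The actual employment series is not reproduced in these notes, so the growth rates here are synthetic: they are generated without noise from the rounded coefficients in (8), purely to check that the procedure recovers them. The time index (t = 0 for 1948), the helper name ols, and the pure-Python solver are all assumptions made for illustration.

```python
# Years with negative employment growth, as listed in the text.
NEGATIVE_YEARS = {1949, 1954, 1958, 1970, 1971, 1975,
                  1982, 1991, 2001, 2002, 2003}

def ols(columns, y):
    """Least-squares coefficients of y on the given regressor columns.

    An intercept is added automatically. Solves the normal equations
    (X'X) b = X'y by Gauss-Jordan elimination with partial pivoting.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    for p in range(k):
        pivot = max(range(p, k), key=lambda row: abs(A[row][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                factor = A[r][p] / A[p][p]
                A[r] = [a - factor * b for a, b in zip(A[r], A[p])]
    return [A[r][k] / A[r][r] for r in range(k)]

years = list(range(1948, 2007))
t = [year - 1948 for year in years]   # assumed time index, not from the notes

# Synthetic growth rates (NOT the real data): rounded coefficients of (8), no noise.
growth = [0.027 - 0.00012 * ti - (0.037 if yr in NEGATIVE_YEARS else 0.0)
          for ti, yr in zip(t, years)]

# Dummy variable: 1 when the growth rate is negative, 0 otherwise.
D = [1.0 if g < 0 else 0.0 for g in growth]

a0, a1, a_s = ols([t, D], growth)
print(f"a0 = {a0:.5f}, a1 = {a1:.5f}, a_s = {a_s:.5f}")
```

Because the synthetic series contains no noise, the fit returns the generating coefficients; with real data the residuals would be nonzero and standard errors would also be needed to judge significance.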
Note, however, that the addition of the dummy variable did increase the value of the intercept term from 0.022 in model (6) to 0.027 in model (8). This is important. Because we do not find significant evidence of a time trend, we should drop the variable $t$ from the model. This implies the model (5) reduces to $g_{L_t} = a_0$, which indicates the growth rate is constant. The estimate of this constant is simply the average of the growth rates. For the data we have, we find

(9) $g_{L_t} = 0.0176$

Dropping the variable $t$ from the model (7) and estimating the resulting model by regressing the employment growth rate on the dummy variable, we find

(10) $g_{L_t} = \underset{(0.0019)^{***}}{0.0240} - \underset{(0.0046)^{***}}{0.037}\,D, \qquad R^2 = .5372$

The figure below shows what can also be seen in the estimated models (9) and (10). If we do not adjust for the unusual down periods, our estimate for the average growth rate of employment is 1.76 percent per year. Alternatively, when we adjust for the unusual down periods, our estimate for the average employment growth rate is 2.40 percent per year. This is a rather large quantitative difference. In our tests for trends in the models (6) and (8), we cannot find evidence of a significant downward trend in the employment growth rate.
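The contrast between (9) and (10) reflects a general property: with only an intercept, the estimate is the overall average, while regressing on a dummy alone makes the intercept the average over the D = 0 years and the dummy coefficient the gap between the two groups' averages. The sketch below checks this on six made-up growth rates (hypothetical values, not the employment data); the function name is likewise an assumption.

```python
from statistics import mean

def regress_on_dummy(y, d):
    """Simple OLS of y on a 0/1 dummy d; returns (intercept, dummy coefficient)."""
    y_bar, d_bar = mean(y), mean(d)
    s_dd = sum((di - d_bar) ** 2 for di in d)
    s_dy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    slope = s_dy / s_dd
    return y_bar - slope * d_bar, slope

# Hypothetical growth rates with two "down" years.
growth = [0.03, 0.02, -0.01, 0.025, -0.02, 0.03]
dummy = [1.0 if g < 0 else 0.0 for g in growth]

# Model with intercept only, as in (9): the overall average.
overall_average = mean(growth)

# Model with intercept and dummy, as in (10).
intercept, dummy_coef = regress_on_dummy(growth, dummy)

# Group averages for comparison.
average_normal_years = mean(g for g, d in zip(growth, dummy) if d == 0)
average_down_years = mean(g for g, d in zip(growth, dummy) if d == 1)

print(f"overall average         = {overall_average:.5f}")
print(f"intercept with dummy    = {intercept:.5f}")
print(f"average of D = 0 years  = {average_normal_years:.5f}")
print(f"intercept + dummy coef. = {intercept + dummy_coef:.5f}")
print(f"average of D = 1 years  = {average_down_years:.5f}")
```

In this toy example the overall average is 0.0125 while the dummy-adjusted intercept is 0.02625, the same kind of gap as between 1.76 and 2.40 percent in the text.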
