FUNCTIONAL FORMS OF REGRESSION
MODELS
• A functional form refers to the algebraic form of the relationship
between a dependent variable and the regressor(s).
• The simplest functional form is the linear functional form, where the
relationship between the dependent variable and an independent
variable is graphically represented by a straight line.
EXAMPLES
• ARDL Models
• Linear models
• Differenced variables
• The log-linear model
• Semilog models
• Reciprocal models
• The logarithmic reciprocal model
© 2011 Pearson Addison-Wesley. All rights
reserved.
Choosing a Functional Form
• After the independent variables are chosen, the next step is to
choose the functional form of the relationship between the
dependent variable and each of the independent variables.
• Let theory be your guide! Not the data!
Functional Form
• A further assumption we make about the
econometric model is that it has the correct
functional form.
• This requires the most appropriate variables in the
model and that they are in the most suitable
format, i.e. logarithms etc.
• One of the most important considerations with
financial data is that we need to model the
dynamics appropriately, with the most appropriate
lag structure.
Functional Form
• It is important that we include all the relevant variables in the
model. If we exclude an important explanatory variable, the
regression suffers from ‘omitted variable bias’: the coefficient
estimates are unreliable and the t and F statistics cannot be
relied on.
• Equally, there can be a problem if we include variables that
are not relevant, as this can reduce the efficiency of the
regression. However, this is not as serious as omitted
variable bias.
• The Ramsey RESET test can be used to determine whether the
functional form of a model is acceptable.
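The idea behind the RESET test can be sketched in a few lines: fit the model, add a power of the fitted values as an extra regressor, and use an F statistic to check whether that extra term improves the fit. The sketch below is illustrative only. It uses only the squared fitted values, plain Python rather than an econometrics package, and made-up data with a deterministic "noise" term; the helper names (`ols`, `rss`, `reset_f`) are hypothetical.

```python
import math

def ols(X, y):
    """OLS coefficients from the normal equations (X'X)b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting, then eliminate this column from all other rows.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def rss(X, y, beta):
    """Residual sum of squares for coefficients beta."""
    return sum((yi - sum(b * xi for b, xi in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def reset_f(x, y):
    """F statistic for adding squared fitted values to y = b0 + b1*x."""
    n = len(y)
    X_r = [[1.0, xi] for xi in x]                       # restricted model
    b_r = ols(X_r, y)
    fitted = [b_r[0] + b_r[1] * xi for xi in x]
    X_u = [row + [f ** 2] for row, f in zip(X_r, fitted)]  # augmented model
    b_u = ols(X_u, y)
    rss_r, rss_u = rss(X_r, y, b_r), rss(X_u, y, b_u)
    q, k = 1, 3  # one added regressor, three coefficients in the augmented model
    return ((rss_r - rss_u) / q) / (rss_u / (n - k))

# Made-up data: the same regressor with a linear and a quadratic true relationship.
x = [i / 10 for i in range(30)]
noise = [0.1 * math.sin(i) for i in range(30)]
y_linear = [1 + 2 * xi + e for xi, e in zip(x, noise)]
y_quadratic = [1 + 2 * xi ** 2 + e for xi, e in zip(x, noise)]

f_linear = reset_f(x, y_linear)
f_quadratic = reset_f(x, y_quadratic)
print("F (true form linear):   ", round(f_linear, 3))
print("F (true form quadratic):", round(f_quadratic, 3))
```

A large F statistic relative to the critical value suggests the linear functional form is misspecified. In applied work the test would normally be run with an econometrics package rather than hand-rolled code.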
Lagged Variables
• A possible source of any problem with the functional form is the lack
of a lagged structure in the model.
• One way of overcoming autocorrelation is to add a lagged dependent
variable to the model.
• However although lagged variables can produce a better functional
form, we need theoretical reasons for including them.
Inclusion of Lagged Variables
• Inertia of the dependent variable, whereby a
change in an explanatory variable does not
immediately affect the dependent variable.
• The overreaction to ‘news’, particularly common in
asset markets and often referred to as
‘overshooting’, where the asset ‘overshoots’ its
long-run equilibrium position, before moving back
towards equilibrium.
• To allow the model to produce dynamic forecasts.
Types of Lag
• Autoregressive refers to lags in the dependent variable
• Distributed lag refers to lags of the explanatory variables
• Moving average refers to lags in the error term (covered later).
ARDL Models
• An Autoregressive Distributed lag model or ARDL
model refers to a model with lags of both the
dependent and explanatory variables. An ARDL(1,1)
model would have 1 lag on both variables:
$y_t = \beta_0 + \beta_1 x_t + \beta_2 x_{t-1} + \beta_3 y_{t-1} + u_t$
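To see how an ARDL(1,1) model spreads the effect of a change in x over time, the following sketch iterates the equation with the error term set to zero. The coefficient values and the function name `simulate_ardl11` are hypothetical, chosen only so that the numbers are easy to follow.

```python
def simulate_ardl11(b0, b1, b2, b3, x, y_init):
    """Iterate y_t = b0 + b1*x_t + b2*x_{t-1} + b3*y_{t-1} with u_t = 0."""
    y = [y_init]
    for t in range(1, len(x)):
        y.append(b0 + b1 * x[t] + b2 * x[t - 1] + b3 * y[t - 1])
    return y

# x is constant at 1 and then steps permanently to 2 at t = 5.
x = [1.0] * 5 + [2.0] * 20
# y starts at the steady state implied by x = 1 for these coefficients.
y = simulate_ardl11(b0=1.0, b1=0.5, b2=0.25, b3=0.5, x=x, y_init=3.5)

for t in range(3, 10):
    print(t, x[t], y[t])
```

In this illustration y stays at 3.5 until the step, jumps by b1 in the period of the change, and then moves gradually towards its new steady state of 5 as the lagged terms feed through.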
Differenced Variables
• A differenced or ‘change’ variable is used to
model the change in a variable from one time
period to the next. This type of variable is often
used with lagged variables to model the short
run.
$\Delta y_t = y_t - y_{t-1}$
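As a small illustration, the first difference of a series can be computed by subtracting each observation's predecessor. The series values and the function name `first_difference` below are made up for the example.

```python
def first_difference(series):
    """Return the changes y_t - y_{t-1}; the result is one observation shorter."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

y = [100.0, 102.0, 101.0, 105.0]
dy = first_difference(y)
print(dy)
```

Note that differencing loses the first observation, which matters when differenced and lagged variables are combined in one regression.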
The long-run static equilibrium
• In econometrics the long and short run are
modelled differently (later we will use
cointegration to model this).
• The long-run equilibrium is defined as when the
variables have attained some steady-state values
and are no longer changing.
• In the long run we can ignore the lags, as:
$y_t = y_{t-1} = y_{t-2} = y^*$
Long-Run
• To obtain the long-run steady-state solution from any given model we
need to:
- Remove all time subscripts, including lags
- Set the error term equal to its expected value of 0.
- Remove the differenced terms
- Arrange the equation so that all x and y terms are on the same
side.
Long-run
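• For example, given the following model, we can
use the previous rules to form a long-run steady-state solution:
$\Delta y_t = \beta_0 + \beta_1 \Delta x_t - \beta_2 y_{t-1} + \beta_3 x_{t-1} + u_t$
$0 = \beta_0 - \beta_2 y^* + \beta_3 x^*$
$\beta_2 y^* = \beta_0 + \beta_3 x^*$
$y^* = \frac{\beta_0}{\beta_2} + \frac{\beta_3}{\beta_2} x^*$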
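The steady-state rules can be checked numerically. The sketch below assumes the model is written as Δy_t = β0 + β1Δx_t − β2 y_{t−1} + β3 x_{t−1} + u_t, computes the long-run solution y* = β0/β2 + (β3/β2)x*, and then iterates the equation with u_t = 0 and x held fixed to confirm that y settles at that value. All coefficient values and function names are hypothetical.

```python
def long_run_solution(b0, b2, b3):
    """Return (intercept, slope) of y* = b0/b2 + (b3/b2) * x*."""
    if b2 == 0:
        raise ValueError("b2 must be non-zero for a long-run solution to exist")
    return b0 / b2, b3 / b2

def simulate(b0, b1, b2, b3, x_path, y_start):
    """Iterate dy_t = b0 + b1*dx_t - b2*y_{t-1} + b3*x_{t-1}, with u_t = 0."""
    y = [y_start]
    for t in range(1, len(x_path)):
        dx = x_path[t] - x_path[t - 1]
        dy = b0 + b1 * dx - b2 * y[-1] + b3 * x_path[t - 1]
        y.append(y[-1] + dy)
    return y

b0, b1, b2, b3 = 1.0, 0.5, 0.4, 0.8   # illustrative values only
x_star = 10.0

intercept, slope = long_run_solution(b0, b2, b3)
y_star = intercept + slope * x_star

# Hold x at x_star and let y adjust from an arbitrary starting point.
path = simulate(b0, b1, b2, b3, [x_star] * 60, y_start=0.0)
print("analytical y*:", y_star)
print("simulated y after 59 periods:", path[-1])
```

The short-run coefficient β1 plays no role in the long-run solution, because the differenced term is zero once x stops changing.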
Potential Problems with Lagged
Variables
• The main problem is deciding how many lags to
include in a model.
• The use of lagged dependent variables can produce
some econometric problems.
• With a number of lags, there can be problems of
multicollinearity between the lags
• There can be difficulties with interpreting the
coefficients on the lags and offering a theoretical
reason for their inclusion
Alternative Functional Forms
• An equation is linear in the variables if plotting the function in terms of X and Y
generates a straight line
• For example, Equation 7.1:
Y = β0 + β1X + ε (7.1)
is linear in the variables but Equation 7.2:
Y = β0 + β1X² + ε (7.2)
is not linear in the variables
• Similarly, an equation is linear in the coefficients only if the coefficients appear in
their simplest form—they:
• are not raised to any powers (other than one)
• are not multiplied or divided by other coefficients
• do not themselves include some sort of function (like logs or exponents)
Alternative Functional Forms
(cont.)
• For example, Equations 7.1 and 7.2 are linear in the coefficients,
while Equation 7.3:
$Y = \beta_0 + X^{\beta_1}$ (7.3)
is not linear in the coefficients
• In fact, of all possible equations for a single explanatory variable,
only functions of the general form:
$f(Y) = \beta_0 + \beta_1 f(X)$ (7.4)
are linear in the coefficients β0 and β1
Linear Form
• This is based on the assumption that the slope of the
relationship between the independent variable and the
dependent variable is constant:
$\Delta Y / \Delta X = \beta_1$
• For the linear case, the elasticity of Y with respect to X (the
percentage change in the dependent variable caused by a 1 percent
increase in the independent variable, holding the other
variables in the equation constant) is:
$\text{Elasticity}_{Y,X} = \frac{\Delta Y / Y}{\Delta X / X} = \frac{\Delta Y}{\Delta X} \cdot \frac{X}{Y} = \beta_1 \frac{X}{Y}$
Double-Log Form
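• Assume the following:
$Y_i = \beta_0 X_{1i}^{\beta_1} X_{2i}^{\beta_2} e^{\varepsilon_i}$
• Taking natural logs yields:
$\ln Y_i = \ln \beta_0 + \beta_1 \ln X_{1i} + \beta_2 \ln X_{2i} + \varepsilon_i$
• Or:
$\ln Y_i = \alpha + \beta_1 \ln X_{1i} + \beta_2 \ln X_{2i} + \varepsilon_i$
• Where $\alpha = \ln \beta_0$
• This is linear in the parameters and linear in the logarithms of the explanatory
variables, hence the names log-log, double-log or log-linear models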
• Here, the natural log of Y is the dependent variable and the
natural log of X is the independent
variable:
$\ln Y = \beta_0 + \beta_1 \ln X_1 + \beta_2 \ln X_2 + \varepsilon$
• In a double-log equation, an individual regression coefficient
can be interpreted as an elasticity because:
$\beta_k = \frac{\Delta(\ln Y)}{\Delta(\ln X_k)} = \frac{\Delta Y / Y}{\Delta X_k / X_k} = \text{Elasticity}_{Y,X_k}$
• Note that the elasticities of the model are constant and the
slopes are not
• This is in contrast to the linear model, in which the slopes
are constant but the elasticities are not
• Interpretation:
Interpretation of double-log functions
• In this functional form β1 and β2 are the elasticity coefficients.
• A one percent change in x will cause a β% change in y.
• E.g., if the estimated coefficient is -2, a 1% increase in x will
generate approximately a 2% decrease in y.
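The contrast between the two forms can be checked numerically. In the sketch below, the elasticity along a straight line Y = b0 + b1·X is b1·X/Y and so changes with X, while in a double-log relationship Y = b0·X^b1 a 1% rise in X produces the same percentage change in Y at every level of X. The coefficient values and function names are hypothetical.

```python
def linear_elasticity(b0, b1, x):
    """Elasticity b1 * X / Y at a point on the line Y = b0 + b1 * X."""
    y = b0 + b1 * x
    return b1 * x / y

def double_log_pct_change(b0, b1, x, pct=1.0):
    """Exact % change in Y when X rises by pct% under Y = b0 * X**b1."""
    y_before = b0 * x ** b1
    y_after = b0 * (x * (1 + pct / 100)) ** b1
    return (y_after / y_before - 1) * 100

# Linear model: constant slope (2.0) but the elasticity varies with X.
lin_low = linear_elasticity(10.0, 2.0, 1.0)
lin_high = linear_elasticity(10.0, 2.0, 10.0)

# Double-log model with coefficient -2: same response at any level of X.
dl_low = double_log_pct_change(5.0, -2.0, 10.0)
dl_high = double_log_pct_change(5.0, -2.0, 50.0)

print("linear elasticity at X=1 and X=10:", lin_low, lin_high)
print("double-log % change in Y at X=10 and X=50:", dl_low, dl_high)
```

The double-log response is close to, but not exactly, -2%, which is why the coefficient is read as an approximate percentage effect for small changes.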
C-D (Cobb-Douglas) production function
$Y = A L^{\alpha} K^{\beta}$
• where:
• Y = total production (the monetary value of all
goods produced in a year)
• L = labour input (the total number of person-hours
worked in a year)
• K = capital input (the monetary worth of all
machinery, equipment, and buildings)
• A = total factor productivity
• α and β are the output elasticities of labour and capital,
respectively. These values are constants determined by available
technology.
• Output elasticity measures the responsiveness of output to a
change in levels of either labour or capital used in production,
ceteris paribus. For example if α = 0.15, a 1% increase in labour
would lead to approximately a 0.15% increase in output.
• Further, if:
• α + β = 1, the production function has constant returns to scale:
doubling capital K and labour L will also double output Y.
• α + β < 1, returns to scale are decreasing.
• α + β > 1, returns to scale are increasing.
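These properties are easy to verify numerically. The sketch below evaluates Y = A·L^α·K^β before and after doubling both inputs, and also checks the output-elasticity example for labour. The input levels and parameter values are illustrative assumptions, as are the function names.

```python
def cobb_douglas(A, L, K, alpha, beta):
    """Output Y = A * L**alpha * K**beta."""
    return A * L ** alpha * K ** beta

def output_ratio_when_doubling(alpha, beta, A=1.0, L=100.0, K=50.0):
    """Ratio of output after doubling both inputs to output before."""
    return (cobb_douglas(A, 2 * L, 2 * K, alpha, beta)
            / cobb_douglas(A, L, K, alpha, beta))

constant = output_ratio_when_doubling(0.3, 0.7)    # alpha + beta = 1
decreasing = output_ratio_when_doubling(0.3, 0.5)  # alpha + beta < 1
increasing = output_ratio_when_doubling(0.6, 0.7)  # alpha + beta > 1

# Output elasticity of labour: a 1% rise in L with alpha = 0.15, K unchanged.
labour_effect = (cobb_douglas(1.0, 101.0, 50.0, 0.15, 0.85)
                 / cobb_douglas(1.0, 100.0, 50.0, 0.15, 0.85) - 1) * 100

print("output ratio, constant returns:  ", constant)
print("output ratio, decreasing returns:", decreasing)
print("output ratio, increasing returns:", increasing)
print("% change in output from 1% more labour:", labour_effect)
```

The ratio after doubling both inputs equals 2 raised to the power α + β, so it is exactly 2 only under constant returns to scale.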
Semilog Form
• The semilog functional form is a variant of the double-log
equation in which some but not all of the variables
(dependent and independent) are expressed in terms of their
natural logs.
• It can be on the right-hand side, as in:
lin-log model: Yi = β0 + β1lnX1i + β2X2i + εi (7.7)
• Or it can be on the left-hand side, as in:
log-lin model: lnY = β0 + β1X1 + β2X2 + ε (7.9)
Measuring growth rate (log-lin model)
• May be interested in estimating the growth rate of population, GNP,
Money supply, etc.
• Recall the compound interest formula:
$Y_t = Y_0 (1 + r)^t$
• where r = compound rate of growth of Y,
$Y_t$ is the value at time t and $Y_0$ is the initial value
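• Taking natural logs:
$\ln Y_t = \ln Y_0 + t \ln(1 + r)$ (1)
• Letting $\beta_1 = \ln Y_0$ and $\beta_2 = \ln(1 + r)$, and adding an
error term, we can rewrite (1) as:
$\ln Y_t = \beta_1 + \beta_2 t + u_t$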
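A minimal sketch of this derivation in practice: generate a series that grows at a known compound rate, regress ln Y on t by OLS, and recover the growth rate and initial value from the coefficients. The data (5% growth from an initial value of 100) and the helper name `simple_ols` are assumptions made for illustration.

```python
import math

def simple_ols(x, y):
    """Intercept and slope of a one-regressor OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up series: Y_t = 100 * (1.05)**t, with no error term.
t = list(range(10))
Y = [100.0 * 1.05 ** ti for ti in t]

beta1, beta2 = simple_ols(t, [math.log(v) for v in Y])

r_hat = math.exp(beta2) - 1   # since beta2 = ln(1 + r)
Y0_hat = math.exp(beta1)      # since beta1 = ln(Y0)

print("slope beta2:", beta2)
print("implied compound growth rate r:", r_hat)
print("implied initial value Y0:", Y0_hat)
```

Because the series has no error term, the regression recovers r and Y0 up to rounding; with real data the estimates would carry sampling error.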
Interpretation
• The slope coefficient ($\beta_2$) measures the constant
proportional or relative change in Y for a given absolute
change in the value of the regressor (in this case t).
• In this functional form, $\beta_2$ is interpreted as follows: a one-unit
change in x will cause a ($\beta_2 \times 100$)% change in y.
• This is the growth rate or semi-elasticity.
• E.g., if the estimated coefficient is 0.05, a one-unit
increase in x will generate approximately a 5% increase in y.
Consider the following regression results for expenditure on
services (EXT) over the quarterly period 2003-I to 2006-III:
$\ln \widehat{EXT}_t = 8.3226 + 0.00705\,t$
se = (0.0016) (0.00018)
t = (5201.6) (39.1667)
$r^2 = 0.9919$
• Expenditure on services grew at a quarterly rate of
0.705%, i.e. (0.00705) × 100.
• Service expenditure at the start of 2003 was about $4115.96
billion, i.e. the antilog of the intercept (8.3226).
Instantaneous vs. compound rate of growth
• $\beta_2$ gives the instantaneous (at a point in time) rate of
growth and not the compound rate of growth (i.e.
growth over a period of time).
• We can get the compound growth rate as:
• [antilog($\beta_2$) - 1] × 100
• or [exp($\beta_2$) - 1] × 100
• i.e. [exp(0.00705) - 1] × 100 = 0.708%
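The arithmetic behind these figures can be reproduced directly from the reported coefficients (intercept 8.3226, slope 0.00705). The variable names below are illustrative; small differences from the slide's figures can arise because the reported coefficients are rounded.

```python
import math

INTERCEPT = 8.3226   # reported intercept of the log-lin regression
SLOPE = 0.00705      # reported coefficient on the time trend t

# Instantaneous growth rate: slope times 100.
instantaneous_pct = SLOPE * 100

# Compound growth rate: [exp(slope) - 1] times 100.
compound_pct = (math.exp(SLOPE) - 1) * 100

# Implied level at t = 0: antilog of the intercept.
initial_level = math.exp(INTERCEPT)

print("instantaneous growth rate (%):", round(instantaneous_pct, 3))
print("compound growth rate (%):     ", round(compound_pct, 3))
print("antilog of intercept:         ", round(initial_level, 2))
```

For a slope this small the two growth rates are almost identical; the gap widens as the slope coefficient gets larger.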
Lin-log models
[Yi = β0 + β1lnX1i + β2X2i + εi]
• Divide the slope coefficient by 100 to interpret it: a 1% increase
in X1 changes Y by approximately β1/100 units.
• Application: Engel expenditure model
• Engel postulated that “the total expenditure that is
devoted to food tends to increase in arithmetic
progression as total expenditure increases in
geometric progression”.
Consider the results for food expenditure in India:
$\widehat{FoodExp}_i = -1283.912 + 257.2700 \ln TotalExp_i$
• See
• A 1% increase in total expenditure leads, on average, to an increase
of about 2.57 rupees in food expenditure, i.e. the slope divided
by 100 (257.27/100).
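The "divide by 100" rule is an approximation to the exact change implied by a lin-log model, which is the slope times ln(1 + p/100) for a p% increase in the regressor. The sketch below compares the two using the reported slope of 257.27; the function names are hypothetical.

```python
import math

SLOPE = 257.27   # reported coefficient on ln(TotalExp)

def approx_change(pct):
    """Rule-of-thumb change in Y for a pct% rise in X: slope * pct / 100."""
    return SLOPE * pct / 100

def exact_change(pct):
    """Exact change in Y implied by the lin-log form: slope * ln(1 + pct/100)."""
    return SLOPE * math.log(1 + pct / 100)

approx_1pct = approx_change(1.0)
exact_1pct = exact_change(1.0)

print("approximate change for a 1% rise (rupees):", round(approx_1pct, 4))
print("exact change for a 1% rise (rupees):      ", round(exact_1pct, 4))
print("approximate vs exact for a 20% rise:      ",
      round(approx_change(20.0), 2), round(exact_change(20.0), 2))
```

The rule of thumb works well for small percentage changes and becomes less accurate for large ones.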
Polynomial Form
• Polynomial functional forms express Y as a function of the independent
variables, some of which are raised to powers other than 1
• For example, in a second-degree polynomial (also called a quadratic)
equation, at least one independent variable is squared:
Yi = β0 + β1X1i + β2(X1i)² + β3X2i + εi (7.10)
• The slope of Y with respect to X1 in Equation 7.10 is:
$\Delta Y / \Delta X_1 = \beta_1 + 2 \beta_2 X_1$ (7.11)
• Note that the slope depends on the level of X1
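To illustrate how the slope of a quadratic varies with the level of X1, the sketch below evaluates β1 + 2β2X1 at several points and checks it against a numerical derivative. The coefficient values are made up, and the other regressor (X2) is left out because it does not affect the slope with respect to X1.

```python
def quadratic(b0, b1, b2, x1):
    """Y = b0 + b1*X1 + b2*X1**2, holding the other regressors fixed."""
    return b0 + b1 * x1 + b2 * x1 ** 2

def quadratic_slope(b1, b2, x1):
    """Analytical slope of Y with respect to X1: b1 + 2*b2*X1."""
    return b1 + 2 * b2 * x1

def numerical_slope(b0, b1, b2, x1, h=1e-4):
    """Central-difference approximation to the same slope."""
    return (quadratic(b0, b1, b2, x1 + h) - quadratic(b0, b1, b2, x1 - h)) / (2 * h)

b0, b1, b2 = 1.0, 4.0, -0.5   # illustrative values only

for x1 in (0.0, 2.0, 4.0, 6.0):
    print(x1, quadratic_slope(b1, b2, x1), round(numerical_slope(b0, b1, b2, x1), 6))
```

With a positive β1 and a negative β2, as here, the slope starts positive, falls as X1 rises, and eventually turns negative.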
Figure 7.4
Polynomial Functions
Inverse (reciprocal) Form
• The inverse functional form expresses Y as a function of the reciprocal
(or inverse) of one or more of the independent variables (in this case,
X1):
Yi = β0 + β1(1/X1i) + β2X2i + εi (7.13)
• So X1 cannot equal zero
• This functional form is relevant when the impact of a particular
independent variable is expected to approach zero as that
independent variable approaches infinity
• The slope with respect to X1 is:
$\Delta Y / \Delta X_1 = -\beta_1 / X_1^2$ (7.14)
• The slopes for X1 fall into two categories, depending on the sign of β1
Properties of reciprocal forms
• As the regressor increases indefinitely the regressand approaches its
limiting or asymptotic value (the intercept).
Example: relationship between child mortality
(CM) and per capita GNP (PGNP)
$\widehat{CM}_i = 81.79436 + 27{,}237.17 \left( \frac{1}{PGNP_i} \right)$
• As PGNP increases indefinitely, CM approaches its
asymptotic value of about 82 deaths per thousand.
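The asymptotic behaviour can be seen by evaluating the fitted equation at increasing levels of PGNP. The sketch below uses the reported coefficients (81.79436 and 27,237.17); the PGNP values and function names are chosen only for illustration.

```python
B0 = 81.79436    # reported intercept (the asymptote)
B1 = 27237.17    # reported coefficient on 1/PGNP

def predicted_cm(pgnp):
    """Fitted child mortality: B0 + B1 * (1 / PGNP)."""
    if pgnp == 0:
        raise ValueError("PGNP cannot be zero in a reciprocal model")
    return B0 + B1 / pgnp

def slope(pgnp):
    """Slope of fitted CM with respect to PGNP: -B1 / PGNP**2."""
    return -B1 / pgnp ** 2

for pgnp in (100.0, 1000.0, 10000.0, 100000.0):
    print(pgnp, round(predicted_cm(pgnp), 3), slope(pgnp))
```

The fitted value falls towards the intercept and the slope shrinks towards zero as PGNP grows, which is the defining property of the reciprocal form with a positive β1.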
Table 7.1 Summary of Alternative
Functional Forms