Uploaded by إسماعيل أحمد أحمد

FMT Lecture 1 Econometrics

Lecture 1:
• Reading: Brooks 2019, Chapters 1, 2, 3.
Objectives
• Understand simple linear regression
• Understand model evaluation criteria
• Limitations of simple linear regression models
• Multiple regression analysis
• Checking for model adequacy using residual analysis
Techniques in Data Modelling
• Perform an ‘exploratory data analysis’, i.e. plot data over time to check that the series conforms to your expectations and to allow data entry errors to be spotted
• Obtain summary statistics and simple graphical displays such as histograms for each variable to get a feel for the data you are modelling
Relationships between variables: Correlation
• Does a relationship exist between two variables?
• How strong is the relationship?
• Is the correlation positive or negative?
• Useful to know prior to model construction
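The three questions above can all be answered from a single number, the sample correlation coefficient. The following is a minimal sketch using only the Python standard library; the data are made up for illustration and are not the lecture's data set.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (NOT from the lecture).
gnp = [1.0, 2.0, 3.0, 4.0, 5.0]
equity_price = [14.0, 24.5, 32.0, 43.0, 51.5]

r = pearson_r(gnp, equity_price)
# The sign of r gives the direction, its magnitude (0 to 1) the strength.
print(f"r = {r:.3f}")
```

A value of r close to +1 or -1 indicates a strong linear relationship; a value near 0 indicates little or no linear relationship.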
Cause & Effect Relationships
• Patterns will exist between different variables, and coincidence alone can produce a very high correlation
• Correlation does not imply causality
• Scatter diagrams and correlation coefficients are useful, but for business forecasting purposes a more precise relationship is required
Simple Linear Regression Model
• Fitting a straight line through the data we have at our disposal
y = constant + βx + error
where
β = slope of the line
constant = intercept with vertical axis
error = random fluctuations
• In our example:
equity price = 4.91 + (9.33 × GNP)
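How the constant and slope of such a line are obtained can be sketched with the usual least-squares formulas. This is an illustration only: the data below are made up (they are not the lecture's equity price and GNP series), and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares estimates (constant, slope) for y = constant + slope * x + error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    constant = mean_y - slope * mean_x  # the line passes through the point of means
    return constant, slope

# Made-up illustrative data (NOT from the lecture).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [14.0, 24.5, 32.0, 43.0, 51.5]

constant, slope = fit_line(x, y)
print(f"y = {constant:.2f} + {slope:.2f} x")
```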
Prediction From Regression Line
• Substitute any value of x into the fitted equation to obtain a prediction of y
• How good the prediction is depends upon
i) the value of the correlation coefficient
ii) the value of x used
• Confidence limits are associated with model predictions
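As a small sketch of prediction from the fitted line, the function below plugs a GNP value into the lecture's example equation. The GNP value and the sample range used here are hypothetical, since the slides do not report the data or its units; the range check simply illustrates why the value of x used matters.

```python
CONSTANT = 4.91  # intercept from the lecture's fitted line
SLOPE = 9.33     # slope on GNP from the lecture's fitted line

def predict_equity_price(gnp):
    """Point prediction from equity price = 4.91 + 9.33 * GNP."""
    return CONSTANT + SLOPE * gnp

def is_extrapolation(gnp, sample_min, sample_max):
    """True when the x value lies outside the range the line was fitted on."""
    return gnp < sample_min or gnp > sample_max

# Hypothetical inputs: a GNP value of 2.0 and an assumed sample range of 1.0 to 10.0.
prediction = predict_equity_price(2.0)
print(f"predicted equity price: {prediction:.2f}")
print("outside sample range:", is_extrapolation(12.0, 1.0, 10.0))
```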
Prediction From Regression Line cont....
• S.E. is the estimated standard deviation of the errors. It is a good indicator of forecasting accuracy when comparing two models: the smaller the S.E., the tighter the fit.
• In our model S.E. = 4.823
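A minimal sketch of how S.E. can be computed for a simple regression, assuming the usual definition: the square root of the sum of squared residuals divided by (n - 2). The data are made up and will not reproduce the lecture's figure of 4.823.

```python
import math

def regression_standard_error(xs, ys):
    """Standard error of a simple regression: sqrt(SSE / (n - 2))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    constant = mean_y - slope * mean_x
    residuals = [y - (constant + slope * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)
    # Two parameters (constant and slope) were estimated, hence n - 2.
    return math.sqrt(sse / (n - 2))

# Made-up illustrative data (NOT from the lecture).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [14.0, 24.5, 32.0, 43.0, 51.5]

se = regression_standard_error(x, y)
print(f"S.E. = {se:.3f}")
```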
Model Evaluation Using EViews
Useful indicators of model adequacy are:
• t-ratio
• p value
• F statistic
• R² (adjusted R²)
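Model Evaluation cont ...
t ratio & p value
• Fit of parameter
t ratio = coefficient ÷ standard error of the coefficient
• Compare with t tables on (T-2) degrees of freedom for the simple regression, where T is the number of observations, or use the rule of thumb: the parameter is significant if the t ratio is greater than +2 or less than -2
• p value: the smallest significance level at which the null hypothesis that the parameter is zero can be rejected.
If it is greater than 0.05, the parameter is not significant at the 5% level.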
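The t ratio and the rule of thumb can be sketched in a few lines. The slope of 9.33 is the lecture's; the standard error of 0.5 is a hypothetical value chosen for illustration, since the slides do not report one.

```python
def t_ratio(coefficient, standard_error):
    """t ratio = estimated coefficient divided by its standard error."""
    return coefficient / standard_error

def passes_rule_of_thumb(t):
    """Rule of thumb: significant if the t ratio is above +2 or below -2."""
    return abs(t) > 2

slope = 9.33           # slope from the lecture's example
slope_std_error = 0.5  # HYPOTHETICAL: not reported in the lecture

t = t_ratio(slope, slope_std_error)
print(f"t ratio = {t:.2f}, significant by rule of thumb: {passes_rule_of_thumb(t)}")
```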
Model Evaluation cont ... Fit of Model - F statistic
Analysis of variance

source of variance   DF    SS     MS                  F
regression           p-1   SSR    SSR ÷ (p-1) = MSR   MSR ÷ MSE
error                n-p   SSE    SSE ÷ (n-p) = MSE
total                n-1   SSTO
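The analysis of variance table can be filled in mechanically once the sums of squares are known. The sketch below uses the SSR and SSTO values quoted for the lecture's model; the sample size n = 24 is an assumption (the slides do not state it), and p = 2 counts the constant and the slope.

```python
def anova_table(ssr, ssto, n, p):
    """Analysis of variance entries for a regression with p parameters and n observations."""
    sse = ssto - ssr      # error sum of squares
    msr = ssr / (p - 1)   # mean square, regression
    mse = sse / (n - p)   # mean square, error
    return {"SSR": ssr, "SSE": sse, "SSTO": ssto,
            "MSR": msr, "MSE": mse, "F": msr / mse}

# SSR and SSTO are the lecture's figures; n = 24 is an ASSUMED sample size.
table = anova_table(ssr=8651.0, ssto=9162.6, n=24, p=2)
for name, value in table.items():
    print(f"{name:>4} = {value:.2f}")
```

A large F value relative to the tabulated F distribution on (p-1, n-p) degrees of freedom indicates that the model as a whole explains a significant share of the variation.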
Model evaluation... Fit of model - R² statistic
• The R² statistic tells you what percentage of the variation in the response variable has been explained by your model.
• Derived from SSR ÷ SSTO
• For our model: 8651.0 ÷ 9162.6 = 0.944, i.e. 94.4% of variation explained
• Adjusted R² penalises the inclusion of irrelevant parameters by correcting for the degrees of freedom they use up
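The R² arithmetic above, together with one common form of the adjustment, can be sketched as follows. The adjusted formula 1 - (1 - R²)(n - 1)/(n - p) is the standard textbook one; n = 24 is an assumed sample size that the slides do not report, and p = 2 counts the constant and the slope.

```python
def r_squared(ssr, ssto):
    """Share of total variation explained by the model."""
    return ssr / ssto

def adjusted_r_squared(r2, n, p):
    """R-squared corrected for degrees of freedom (p parameters incl. the constant)."""
    return 1 - (1 - r2) * (n - 1) / (n - p)

r2 = r_squared(8651.0, 9162.6)          # the lecture's SSR and SSTO
adj = adjusted_r_squared(r2, n=24, p=2)  # n = 24 is an ASSUMPTION
print(f"R2 = {r2:.1%}, adjusted R2 = {adj:.1%}")
```

Because the adjustment factor (n - 1)/(n - p) exceeds 1 whenever p > 1, the adjusted figure is always below the unadjusted one, and the gap widens as parameters are added.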
Limitation of the Simple Linear Regression Model
• Isolating key explanatory variables is a time-consuming process requiring a thorough understanding of the underlying processes affecting your model
• Very few relationships are ‘linear’ - a more complex mathematical function may be required
Limitation: Simple Linear Regression Model cont...
• The model predictions may not be very accurate, especially if you are forecasting over time horizons well outside your sample period.
• One explanatory variable may not be sufficient. Hence multiple regression analysis.
Multiple Regression Analysis Objectives
• Multiple regression analysis is a multivariate statistical technique used to examine the relationship between a single dependent variable and a set of independent variables
• It is widely used in business for two broad classes of research problems: prediction and explanation
The Multiple Regression Model
• Instead of having just one predictor variable on the right hand side of the equation we have many, i.e.
y = C + β1x1 + β2x2 + ... + βnxn + error
where the error, or residual term, is the component unexplained by your model.
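A minimal sketch of how the constant and the coefficients on several predictors can be estimated by least squares, here by solving the normal equations with plain Gaussian elimination. The helper names and the data are hypothetical; the y values are generated exactly as 1 + 2·x1 + 3·x2 with no error term, so the fit should recover those coefficients up to rounding.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_multiple_regression(columns, y):
    """Return [C, b1, ..., bn] minimising the sum of squared errors."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Made-up data: y = 1 + 2*x1 + 3*x2 exactly (no error term).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [9.0, 8.0, 19.0, 18.0, 29.0, 28.0]

coefficients = fit_multiple_regression([x1, x2], y)
print([round(c, 6) for c in coefficients])
```

In practice a package such as EViews performs this estimation; the sketch only shows what is being solved.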
Prediction with Multiple Regression
• The objective is to maximise the overall predictive power of the independent (x) variables as a means of forecasting the dependent variable, e.g. equity prices
• Often, predictive power is maximised at the expense of interpretation of results
• A second objective is the determination of the relative importance of each independent variable in the prediction of the response variable
Assumptions in Multiple Regression Analysis
Five basic assumptions are made when calculating a multiple regression relationship
Assumptions: Multiple Regression ...cntd
1. That we are dealing with a linear function of the independent variables plus an error term:
y = C + β1X1 + β2X2 + ... + βnXn + error
2. That the error term has a mean of zero
3. That there is a constant variance in the error terms
4. That the error terms are independent
5. That there is no significant linear relationship between the independent variables
Linearity of the Relationship
• The concept of correlation is based on a linear relationship, thus making it a critical issue in regression analysis
• It is easily examined in residual plots
• The problem may be remedied by transforming the data, e.g. taking logarithms or square roots, or a polynomial may be fitted to accommodate the curvilinear effects, e.g.
equity price = constant + β1 × GNP²
Error Term
The error term is assumed to have a zero mean. For the fitted model this holds by construction when a constant is included, since the line of best fit will pass directly through the centre of the data. The positive and negative errors cancel out, so they sum to zero.
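The zero-mean property can be checked numerically: for a least-squares line with a constant, the residuals sum to zero up to rounding. A minimal sketch on made-up data (not the lecture's):

```python
def residuals_from_fit(xs, ys):
    """Fit y = constant + slope * x by least squares and return the residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    constant = mean_y - slope * mean_x
    return [y - (constant + slope * x) for x, y in zip(xs, ys)]

# Made-up illustrative data (NOT from the lecture).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [14.0, 24.5, 32.0, 43.0, 51.5]

residuals = residuals_from_fit(x, y)
mean_residual = sum(residuals) / len(residuals)
print(f"mean residual = {mean_residual:.2e}")  # zero apart from floating-point rounding
```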
Constant Variance of the Error Terms
• A non-constant variance of the error terms may indicate that the relationship is changing over time. If the model spans a long time period, conditions may not be stable.
• This will lead to problems of prediction. Again, taking logarithms or square roots of the data may stabilize the variance.
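One informal way to look for a non-constant variance is to compare the spread of the residuals in the first and second halves of the sample. This is only a rough sketch with a hypothetical helper and made-up residuals; it is not a formal test, and a residual plot over time remains the first thing to inspect.

```python
import statistics

def variance_ratio(residuals):
    """Variance of the later half of the residuals divided by that of the earlier half."""
    half = len(residuals) // 2
    first, second = residuals[:half], residuals[-half:]
    return statistics.pvariance(second) / statistics.pvariance(first)

# Made-up residuals (NOT from the lecture): the spread grows in the later observations.
residuals = [0.5, -0.4, 0.3, -0.6, 0.2, 2.0, -2.5, 1.8, -1.6, 2.2]

ratio = variance_ratio(residuals)
print(f"variance ratio (late / early) = {ratio:.1f}")
```

A ratio far from 1 suggests the variance is not stable across the sample.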
Independence of the Error Terms
• The pattern of residuals should appear random and similar to the null plot (a plot showing no discernible pattern).
• A pattern occurs if the basic model conditions change but the changes are not incorporated in the model. For example, predicting profits on swimsuits with monthly data including two winter seasons and one summer season would lead to negative residuals for the winter months with positive residuals for the summer months.
Independence of the Error Terms (cont...)
• Could be rectified by taking first differences in the data or including a variable to represent the seasonal component.
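Both remedies are simple data transformations. The sketch below shows first differencing and a 0/1 seasonal variable; the choice of June to August as the summer months is an assumption for illustration, and the helper names are hypothetical.

```python
def first_differences(series):
    """Change from one period to the next: y[t] - y[t-1]."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

def summer_dummy(months, summer_months=(6, 7, 8)):
    """1 for observations in a summer month, 0 otherwise (months numbered 1 to 12)."""
    return [1 if m in summer_months else 0 for m in months]

# Made-up monthly profit figures and their calendar months (NOT from the lecture).
profits = [10.0, 12.0, 15.0, 14.0]
months = [1, 6, 7, 12]

print(first_differences(profits))  # one observation is lost by differencing
print(summer_dummy(months))        # add this column as an extra independent variable
```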
Relationship Between Independent Variables
• As the complexity of the model increases, so does the degree of inter-relatedness of the variables on the right hand side of the equation
• It is not an ideal world and when dealing with business problems variables do tend to move in line with one another
• Need to check the correlation matrix to identify the problem
Relationship Between Independent Variables cntd
• High correlations between the variables lead to unstable coefficient estimates for the independent variables
• Remedy is to omit one of the correlated variables or add more data.
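Checking the correlation matrix amounts to computing the correlation for every pair of independent variables and flagging the high ones. In this sketch the variable names, the data and the 0.8 cut-off are all hypothetical choices for illustration.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def highly_correlated_pairs(variables, threshold=0.8):
    """List (name_a, name_b, r) for every pair whose |r| exceeds the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Made-up independent variables (NOT from the lecture).
variables = {
    "gnp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "industrial_output": [2.1, 3.9, 6.2, 8.0, 9.8],
    "interest_rate": [5.0, 1.0, 4.0, 2.0, 3.0],
}

flagged = highly_correlated_pairs(variables)
print(flagged)  # pairs that move closely in line with one another
```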
Selection of Variables
• You will normally have a number of possible independent variables from which to choose
• To assist you to obtain the ‘best’ regression model objectively, a sequential search process may be employed in the form of a stepwise regression
• Stepwise regression is the most popular technique in variable selection. All variables are examined to determine their contribution to the predictive power of the model.
Building a Multiple Regression Model
Checklist
• Select dependent and independent variables
• Plot each variable over time and perform ‘exploratory data analysis’
• Check for high correlations between y and x’s
• Check for high correlations within the x’s
Building a Multiple Regression Model (cont...)
Checklist
• Construct model using EViews
• Stepwise regression recommended for situations where a large number of independent variables are involved
• Delete insignificant variables from the model
Building a Multiple Regression Model (cont...)
Checklist
• Obtain final model specification
• Check the adequacy of model by plotting residuals over time - are they random or ‘spherical’?
• If not adequate, carry out the suggestions for model improvement outlined earlier