
8 Residual Analysis - Tagged

Residual Analysis
Dr. William Lau
Tel: 3943 8572
williamlau@cuhk.edu.hk
The Financial Modelers' Manifesto
by Emanuel Derman and Paul Wilmott, 2009

Dr. Emanuel Derman was a Managing Director at Goldman Sachs, where he headed the
firm’s Quantitative Risk Strategies group.

Dr. Paul Wilmott was the founder of the Mathematical Finance program at Oxford
University.
“We do need models and mathematics – you cannot think about finance and
economics without them – but one must never forget that models are not the world...
It doesn't fit without cutting off some essential parts. And in cutting off parts for the
sake of beauty and precision, models inevitably mask the true risk rather than
exposing it. The most important question about any financial model is how
wrong it is likely to be, and how useful it is despite its assumptions... There
is no right model, because the world changes in response to the ones we use…
Markets change and newer models become necessary. Simple clear models with
explicit assumptions about small numbers of variables are therefore the
best way to leverage your intuition without deluding yourself.”
https://www.uio.no/studier/emner/sv/oekonomi/ECON4135/h09/undervisningsmateriale/FinancialModelersManifesto.pdf
Assumptions about the Random Error, ɛ
• E(ɛ) = 0
• Var(ɛ) is constant
• The ɛ’s are normally distributed
• The ɛ’s are independent (so all pairs of ɛ’s are
uncorrelated)
Lesson Outline
1. Regression Residuals
2. Detecting Lack of Fit
3. Detecting Unequal Variances
4. Checking Normality Assumption
5. Detecting Outliers and Identifying Influential Observations
6. Detecting Residual Correlation: The Durbin-Watson Test
Actual random error $\varepsilon$ and
regression residual $\hat{\varepsilon}$
Data for 20 Athletes
SAS printout for first-order model
SAS printout for quadratic (second-order) model
Detecting Lack of Fit
Detecting Lack of Fit with Residuals
• Plot the residuals on the vertical axis against each
of the independent variables on the horizontal axis.
• Plot the residuals on the vertical axis against the
predicted values on the horizontal axis.
• In each plot, look for trends, dramatic changes in
variability, and/or more than 5% of the residuals lying
more than 2s from 0, where s is the estimated standard
deviation of ɛ. Any of these patterns indicates
a problem with model fit.
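The checks above can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available; the data are made up (a curved trend plus a small deterministic wiggle) and are not the athlete or demand data from the slides, and `fit_ols` is a hypothetical helper name.

```python
import numpy as np

def fit_ols(X, y):
    """Least squares fit; X must already contain a column of ones."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # s = sqrt(SSE / (n - number of estimated betas))
    s = np.sqrt(resid @ resid / (len(y) - X.shape[1]))
    return beta, resid, s

# Made-up data with curvature (illustrative assumption only).
x = np.linspace(0.0, 10.0, 21)
wiggle = 0.5 * np.where(np.arange(21) % 2 == 0, 1.0, -1.0)
y = 2.0 + 0.5 * x + 0.3 * x**2 + wiggle

X1 = np.column_stack([np.ones_like(x), x])          # first-order model
X2 = np.column_stack([np.ones_like(x), x, x**2])    # second-order model
_, resid1, s1 = fit_ols(X1, y)
_, resid2, s2 = fit_ols(X2, y)

for name, resid, s in [("first-order", resid1, s1), ("second-order", resid2, s2)]:
    share = np.mean(np.abs(resid) > 2 * s)           # share of residuals beyond 2s
    print(f"{name}: s = {s:.2f}, share outside 2s = {share:.0%}")

# A trend in the residuals signals lack of fit: for the first-order model
# they are positive at both ends of the x range and negative in the middle.
print(np.sign(resid1).astype(int))
```

Plotting `resid1` and `resid2` against `x` (or against the predicted values) gives the residual plots described on the slide; the printed sign pattern is only a text stand-in for that picture.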
First-order model
SAS plot of residuals for the first-order model
MINITAB plot of cholesterol data with least squares line
Quadratic (second-order) model
SAS plot of residuals for the quadratic model
First-order model
Second-order model
SPSS regression printout for the demand model
SPSS plot of residuals against price for demand model
Partial Regression Residuals
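The set of partial regression residuals for the jth independent
variable $x_j$ is calculated as follows:
$$\hat{\varepsilon}^* = y - (\hat{\beta}_0 + \hat{\beta}_1 x_1 + \dots + \hat{\beta}_{j-1} x_{j-1} + \hat{\beta}_{j+1} x_{j+1} + \dots + \hat{\beta}_k x_k)$$
$$= \hat{\varepsilon} + \hat{\beta}_j x_j$$
where $\hat{\varepsilon} = y - \hat{y}$ is the usual regression residual.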
Partial residuals measure the influence of xj on the
dependent variable y after the effects of the other
independent variables have been removed or
accounted for.
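As a rough sketch of the calculation, the partial residuals can be obtained by fitting the full model and adding the fitted term for the jth variable back to the usual residuals. The function name `partial_residuals` and the data below are illustrative assumptions, not taken from the slides; NumPy is assumed.

```python
import numpy as np

def partial_residuals(X, y, j):
    """Partial residuals for column j (0-based) of the n-by-k predictor matrix X.

    X holds the predictors only; the intercept column is added here.
    """
    A = np.column_stack([np.ones(len(y)), X])
    beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta_hat                    # usual residual: y - y_hat
    return resid + beta_hat[j + 1] * X[:, j]    # add back the fitted x_j term

# Made-up data: y depends on 1/x1, so a straight-line term in x1 is misspecified.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
y = 10.0 + 2.0 / x1 + 0.5 * x2
X = np.column_stack([x1, x2])

pr = partial_residuals(X, y, j=0)
print(np.round(pr, 3))
```

Plotting `pr` against `x1` would give the partial residual plot; curvature in that plot suggests trying a transformation of the variable.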
SPSS partial residual plot for price
Graphs of some mathematical functions relating E(y) to p
SPSS regression printout for demand model with transformed price
Residual Plot of the Transformed Model
Detecting Unequal Variances
A plot of residuals for Poisson data
Poisson Probability Distribution
Two Properties of a Poisson Distribution:
1. The probability of an occurrence is the same
for any two intervals of equal length.
2. The occurrence or nonoccurrence in any
interval is independent of the occurrence or
nonoccurrence in any other interval.
A Poisson distributed random variable is often
useful in estimating the number of occurrences
over a specified interval of time or space. It
is a discrete random variable that may assume
an infinite sequence of values (x = 0, 1, 2, . . . ).
A plot of residuals for binomial data (proportions or percentages)
Binomial Probability Distribution
Four Properties of a Binomial Experiment
1. The experiment consists of a sequence of n
identical trials.
2. Two outcomes, success and failure, are possible
on each trial.
3. The probability of a success, denoted by p, does
not change from trial to trial.
4. The trials are independent.
A residual plot of data subject to multiplicative errors
Stabilizing transformations for heteroscedastic responses
Salary and work experience data for 50 social workers
MINITAB regression printout for second-order model of salary
MINITAB residual plot for second-order model of salary
A plot of residuals for data subject to multiplicative errors
Stabilizing transformations for heteroscedastic responses
MINITAB regression printout for second-order model of natural log of salary
MINITAB residual plot for second-order model of natural log of salary
MINITAB regression printout for first-order model of natural log of salary
First-order model of natural log of salary
Statistical Test for Heteroscedasticity
• Divide the sample data in half and fit the
regression model to each half.
• Conduct a two-tailed F-test to compare the
estimated variances of the random error
terms of the two models.
• $H_0: \sigma_1^2 = \sigma_2^2$
  $H_1: \sigma_1^2 \neq \sigma_2^2$
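A minimal sketch of the split-sample comparison follows. It assumes the halves are formed by ordering the observations on x, uses a straight-line model for each half, and works on made-up data whose error spread grows with x; `mse_of_fit` and `split_sample_f` are hypothetical helper names. The larger MSE is placed in the numerator, so for a two-tailed test at level α the ratio is compared with the upper-tail F critical value at α/2, which has to be looked up separately.

```python
import numpy as np

def mse_of_fit(x, y):
    """Fit y = b0 + b1*x by least squares; return (MSE, error degrees of freedom)."""
    A = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    df = len(y) - A.shape[1]
    return resid @ resid / df, df

def split_sample_f(x, y):
    """F ratio (larger MSE over smaller MSE) for the two halves ordered by x."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    half = len(x) // 2
    mse1, df1 = mse_of_fit(x[:half], y[:half])
    mse2, df2 = mse_of_fit(x[half:], y[half:])
    if mse1 >= mse2:
        return mse1 / mse2, df1, df2
    return mse2 / mse1, df2, df1

# Made-up data (illustrative only): the error magnitude is proportional to x.
x = np.arange(1.0, 41.0)
sign = np.where(np.arange(1, 41) % 2 == 0, 1.0, -1.0)
y = 3.0 + 2.0 * x + 0.1 * x * sign

F, df_num, df_den = split_sample_f(x, y)
print(f"F = {F:.2f} with ({df_num}, {df_den}) degrees of freedom")
```

Splitting at the middle of the ordered x values is one choice among several; the salary example in the slides splits at a substantively chosen value (20 years of experience) instead.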
SAS regression printout for second-order model of salary: Subsample 1
(years of experience < 20)
SAS regression printout for second-order model of salary: Subsample 2
(years of experience ≥ 20)
Checking the Normality Assumption
MINITAB histogram of residuals from log model of salary
MINITAB stem-and-leaf plot of residuals from log model of salary
MINITAB normal probability plot of residuals from log model of salary
Detecting Outliers and Identifying Influential Observations
Standardized Residuals
• A standardized residual is the z-score of a residual.
• Observations with standardized residuals that exceed 3 in
absolute value are considered outliers.
• Possible reasons for outliers:
  • Experimental procedures may have malfunctioned.
  • Experimenters may have misrecorded the measurements.
  • Data may have been input incorrectly into the computer.
  • If none of the above applies, the outlier may be an accurate observation.
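The outlier rule for standardized residuals can be sketched as follows, taking the z-score to be the residual divided by s, the square root of MSE (an assumption about the exact definition used in the course). The data are made up: an exact straight line with one value deliberately shifted, standing in for a misrecorded measurement.

```python
import numpy as np

def standardized_residuals(X, y):
    """Residuals divided by s = sqrt(MSE); X must include the column of ones."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    s = np.sqrt(resid @ resid / (len(y) - X.shape[1]))
    return resid / s

# Made-up data (illustrative only): one observation is shifted by +50.
x = np.arange(20.0)
y = 1.0 + 2.0 * x
y[10] += 50.0
X = np.column_stack([np.ones_like(x), x])

z = standardized_residuals(X, y)
outliers = np.flatnonzero(np.abs(z) > 3)   # indices with |z| > 3
print(outliers)                            # [10]
```

Note that a single gross error inflates s itself, which limits how large its own z-score can get; this is one reason the deck goes on to leverage, Cook's distance and deleted residuals.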
Data for Fast-food Sales
MINITAB regression printout for model of fast-food sales
MINITAB plot of residuals versus traffic flow
MINITAB plot of residuals versus city
MINITAB regression printout for model of fast-food sales with the corrected data point
MINITAB plot of residuals versus traffic flow for model with corrected data point
MINITAB plot of residuals versus city for model with corrected data point
Numerical Techniques for Identifying Outlying Influential Observations
• Leverage [OPTIONAL]
• Cook’s Distance [OPTIONAL]
• The Jackknife
Leverage [OPTIONAL]
$$\hat{y}_i = h_1 y_1 + h_2 y_2 + \dots + h_i y_i + \dots + h_n y_n$$
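In the expression for the ith predicted value as a weighted sum of the observed y values, the weight on y_i itself is the leverage of the ith observation. A minimal sketch, assuming NumPy and made-up data: the weights are the rows of the hat matrix H = X(X'X)^{-1}X', and the leverages are its diagonal. The cut-off 2(k+1)/n used below is a common rule of thumb and an assumption here, since the slide does not state one.

```python
import numpy as np

def leverages(X):
    """Return (h, H): the diagonal of the hat matrix and the matrix itself.

    X must include the column of ones.
    """
    H = X @ np.linalg.solve(X.T @ X, X.T)   # H = X (X'X)^{-1} X'
    return np.diag(H), H

# Made-up data (illustrative only): the last x value is far from the rest.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 25.0])
X = np.column_stack([np.ones_like(x), x])

h, H = leverages(X)
n, p = X.shape                              # p = k + 1 estimated betas
high = np.flatnonzero(h > 2 * p / n)        # rule-of-thumb cut-off (assumption)
print(np.round(h, 3), high)
```

Row i of `H` holds the weights that turn the observed y values into the ith predicted value, so `H @ y` reproduces the fitted values.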
Cook’s Distance [OPTIONAL]
$$D_i = \frac{(y_i - \hat{y}_i)^2}{(k+1)\,\mathrm{MSE}} \left[ \frac{h_i}{(1 - h_i)^2} \right]$$
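The formula for D_i can be evaluated directly from the residuals, MSE and the leverages h_i. The sketch below assumes NumPy and uses made-up data in which the last point has high leverage; `cooks_distance` is a hypothetical helper name.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for every observation; X must include the column of ones."""
    n, p = X.shape                          # p = k + 1
    H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix
    h = np.diag(H)                          # leverages h_i
    resid = y - H @ y                       # y_i - y_hat_i
    mse = resid @ resid / (n - p)
    return resid**2 / (p * mse) * h / (1.0 - h) ** 2

# Made-up data (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 25.0])
X = np.column_stack([np.ones_like(x), x])

D = cooks_distance(X, y)
print(np.round(D, 3))
```

The last observation combines a leverage close to 1 with a response well below the line suggested by the other five points, so its distance dominates the others.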
The Jackknife
• A deleted residual, denoted $d_i$, is the difference
between the observed response $y_i$ and the
predicted value $\hat{y}_{(i)}$ obtained when the data for the
ith observation are deleted from the analysis.
• $d_i = y_i - \hat{y}_{(i)}$
• An observation with an unusually large (in
absolute value) deleted residual is considered to
have a large influence on the fitted model.
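Deleted residuals can be computed literally as defined: drop one observation, refit, and predict the dropped response. The sketch below does this with a plain loop, assuming NumPy and made-up data; `deleted_residuals` is a hypothetical helper name.

```python
import numpy as np

def deleted_residuals(X, y):
    """d_i = y_i minus the prediction from the model fitted without observation i.

    X must include the column of ones.
    """
    n = len(y)
    d = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                          # leave observation i out
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        d[i] = y[i] - X[i] @ beta_i
    return d

# Made-up data (illustrative only): the last point sits far from the others.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 25.0])
X = np.column_stack([np.ones_like(x), x])

d = deleted_residuals(X, y)
print(np.round(d, 2))
```

Refitting n times is fine for small data sets. The same values can also be obtained without refitting by dividing each ordinary residual by one minus its leverage.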
Detecting Residual Correlation: The Durbin-Watson Test
$d_{L,\alpha} \le d \le d_{U,\alpha}$
$d_{L,\alpha} \le (4 - d) \le d_{U,\alpha}$
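A minimal sketch of the Durbin-Watson statistic follows, using its standard definition: the sum of squared differences between successive residuals divided by the sum of squared residuals. The decision helper covers the one-tailed test for positive residual correlation and returns "inconclusive" when d falls between the lower and upper bounds. The bounds must be read from a Durbin-Watson table for the relevant n, k and α; the numbers passed in the example call are placeholders, and the residuals are made up.

```python
import numpy as np

def durbin_watson(resid):
    """d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2, residuals in time order."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

def dw_decision(d, d_lower, d_upper):
    """One-tailed test of H0: no residual correlation vs. positive correlation."""
    if d < d_lower:
        return "reject H0"
    if d > d_upper:
        return "do not reject H0"
    return "inconclusive"

# Made-up residuals (illustrative only) with long runs of the same sign.
e = [2.0, 1.0, 1.0, -1.0, -2.0, -1.0]
d = durbin_watson(e)
print(round(d, 3))                                  # 0.583

# Placeholder bounds, NOT taken from a table.
print(dw_decision(d, d_lower=1.0, d_upper=1.5))     # reject H0
```

For negative residual correlation the same rule can be applied to 4 - d in place of d.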
A Firm’s Annual Sales Revenue
SAS regression printout for model of annual sales
SAS plot of residuals for model of annual sales
Reproduction of part of Table E.8 on page 676 in the textbook
$d_{L,\alpha} \le d \le d_{U,\alpha}$
$d_{L,\alpha} \le (4 - d) \le d_{U,\alpha}$
Check Your Understanding
Which of the following methods is frequently used to
check the normality assumption about the error term?
a) Plotting partial regression residuals against xi.
b) Calculating VIF for each independent variable.
c) Constructing a histogram of residuals.
d) Plotting y against x.
e) Conducting the Durbin-Watson test.