Estimating The Covariance Matrix From Unsynchronized High Frequency Financial Data

Bin Zhou
Sloan School of Management
Massachusetts Institute of Technology
Cambridge, MA 02142

#3807
April, 1995
Abstract
This paper proposes an estimator of the covariance matrix of currencies using unsynchronized and noisy high frequency observations. The estimator allows us to estimate the covariance matrix over a shorter time interval with more accuracy. The estimator is not f-consistent when there are so-called observation noises. Increasing observation frequency infinitely does not always increase the accuracy of the estimation. The optimal observation frequency depends on the ratio of the total volatility to the noise level. Daily covariance matrices of three exchange rates are calculated to demonstrate the methodology. The empirical results show that the correlations of the three currencies are strong but vary over time.

Key Words: f-consistency; observation noise; volatility.
1 Introduction

The estimation of the covariance matrix of financial prices is necessary in portfolio optimization and risk management. Besides sample covariance, many other estimators have been proposed (Stein 1975, Dey and Srinivasan 1985). However, estimating the covariance matrix from daily data can have serious problems. Jobson and Korkie (1980) indicated that, in some cases, it is better to use the identity matrix instead of the sample covariance matrix in the portfolio selection. The problem is that the number of observations may not be enough to estimate all entries of a big covariance matrix. To get around the problem, one may want to collect more data over a longer time interval. However, the changing condition of markets may prevent us from doing so. Another approach is to impose constraints on the covariance matrix to reduce the number of free parameters (Frost and Savarino, 1986). The constraints may be subjective and not reflect the reality of the market. This paper explores another possibility of using high frequency data. Because of fast-growing computer power, data is now available in ultra-high frequency, such as tick-by-tick. Exchange rates, for example, can easily have over one million observations in one year.
There are several difficulties in using high frequency data. The first one is the issue of unsynchronized time. The price or quote of each currency or stock comes at different times. The second difficulty is so-called observation noise (Zhou 1995a). The high frequency time series can be viewed as observations from a continuous process with observation noises:

$$S(t_i) = P(t_i) + \epsilon_{t_i}, \quad t_i \in [a, b], \tag{1}$$

where $P(t)$ is a diffusion process

$$dP(t) = \mu(t)\,dt + \sigma(t)\,dW_t. \tag{2}$$

$P(t)$ is referred to as a signal process. The noise $\epsilon_t$ only occurs at the time of observation. The noise behaves irregularly and no distributional assumptions have been made about the noises. The representation captures many characteristics, such as strong negative autocorrelations in high frequency observations.
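The negative autocorrelation implied by model (1) can be illustrated with a small simulation. The following is a minimal sketch, not the paper's procedure: it assumes a driftless Brownian signal on equally spaced times and i.i.d. Gaussian noise (the paper itself makes no distributional assumption on the noise), and the names `simulate_noisy_series`, `first_order_autocorr`, `sigma2` and `eta2` are hypothetical. Under these assumptions a return is $Z_i + \epsilon_i - \epsilon_{i-1}$, so its first-order autocorrelation is $-\eta^2 / (\sigma^2/n + 2\eta^2)$.

```python
import numpy as np

def simulate_noisy_series(n, sigma2, eta2, rng):
    """Observations S(t_i) = P(t_i) + eps_i on n equal steps of [0, 1]."""
    # Signal: driftless Brownian motion with total variance sigma2 over [0, 1].
    increments = rng.normal(0.0, np.sqrt(sigma2 / n), size=n)
    signal = np.concatenate(([0.0], np.cumsum(increments)))
    # Noise: i.i.d. Gaussian with variance eta2 (an illustrative assumption).
    noise = rng.normal(0.0, np.sqrt(eta2), size=n + 1)
    return signal + noise

def first_order_autocorr(x):
    """Sample first-order autocorrelation of a 1-D array."""
    x = x - x.mean()
    return float(np.dot(x[1:], x[:-1]) / np.dot(x, x))

rng = np.random.default_rng(0)
n, sigma2, eta2 = 100_000, 1.0, 1e-4
returns = np.diff(simulate_noisy_series(n, sigma2, eta2, rng))
rho_hat = first_order_autocorr(returns)
# Value implied by the model when returns are Z_i + eps_i - eps_{i-1}.
rho_theory = -eta2 / (sigma2 / n + 2.0 * eta2)
print(rho_hat, rho_theory)
```

As the noise variance grows relative to the per-tick signal variance, the theoretical value approaches $-1/2$.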
The presence of observation noises can cause great difficulties in constructing f-consistent (Zhou 1995b) estimators. For an estimator without f-consistency, increasing observation frequency infinitely can do more harm than good. In the next section, I will concentrate on constructing the estimators for the covariance matrix of unsynchronized financial time series. Without loss of generality, I will only discuss estimating the covariance of two financial time series. The variance estimators can be found in Zhou (1995a, 1995b). In Section 3, I will give the estimates of the daily covariance and daily correlation matrices of three exchange rates.
2 Estimating The Covariance Using Unsynchronized Data

Suppose that two time series $\{S_X(t_i)\}$ and $\{S_Y(s_j)\}$ are observations from two processes of (1) with observation noise. To further simplify the problem, I assume that the diffusion process $P(t)$ is a Brownian motion with a time deformation:

$$S_X(t_i) = \mu_X(t_i) + W_X(\tau_X(t_i)) + \epsilon_X(t_i), \quad a \le t_i \le b,$$
$$S_Y(s_j) = \mu_Y(s_j) + W_Y(\tau_Y(s_j)) + \epsilon_Y(s_j), \quad a \le s_j \le b.$$

The noises $\epsilon_X(t_i)$ and $\epsilon_Y(s_j)$ are independent of each other and of the Brownian motions $W_X(\tau)$ and $W_Y(\tau)$. The parameter of interest is the covariance of the two Brownian motions $W_X(t)$ and $W_Y(t)$ over the time interval $[a, b]$:

$$c(a, b) = \mathrm{Cov}(W_X(b) - W_X(a),\; W_Y(b) - W_Y(a)). \tag{3}$$

The two processes $W_X(t)$ and $W_Y(t)$ are assumed to have no leading effect, i.e., for $\tau_1 < \tau_2 < \tau_3 < \tau_4$,

$$\mathrm{Cov}(W_X(\tau_1) - W_X(\tau_2),\; W_Y(\tau_3) - W_Y(\tau_4)) = 0,$$
and
$$\mathrm{Cov}(W_Y(\tau_1) - W_Y(\tau_2),\; W_X(\tau_3) - W_X(\tau_4)) = 0. \tag{4}$$

Under assumption (4), $c(s, t) + c(t, u) = c(s, u)$, $s < t < u$.
Without making any assumption about the regularity of the time sequences $\{t_i\}$ and $\{s_j\}$ and the regularity of the covariance of each pair of $S_X(t_i) - S_X(t_{i-1})$ and $S_Y(s_j) - S_Y(s_{j-1})$, it is very difficult to construct a maximum likelihood estimator of the covariance over the time interval $[a, b]$ using observations within the interval. Instead, I propose the following unbiased estimator:

$$\hat{c}(a, b) = \sum_{a < s_j \le b} \left(S_X(t_{i^+(j)}) - S_X(t_{i^-(j-1)})\right)\left(S_Y(s_j) - S_Y(s_{j-1})\right), \tag{5}$$

where $i^+(j) = \min\{i : t_i \ge s_j\}$ and $i^-(j) = \max\{i : t_i \le s_j\}$. Notice that the subscripts $i$ and $j$ are interchangeable. That is, estimator (5) can also be written as

$$\hat{c}(a, b) = \sum_{a < t_i \le b} \left(S_X(t_i) - S_X(t_{i-1})\right)\left(S_Y(s_{j^+(i)}) - S_Y(s_{j^-(i-1)})\right), \tag{6}$$

with $j^+(i)$ and $j^-(i)$ defined analogously. When we have synchronized times, the estimator is simply the sample covariance. To save computation time, series $\{S_Y(s_j)\}$ should have fewer data points.
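Estimator (5) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the author's code: the function name `unsync_covariance` and its argument names are hypothetical, both time arrays are assumed sorted, and the X times are assumed to cover the range of the Y times so that $i^+(j)$ and $i^-(j)$ always exist.

```python
import numpy as np

def unsync_covariance(t, sx, s, sy):
    """Sketch of estimator (5).

    t, sx: observation times and values of series X (t sorted ascending).
    s, sy: observation times and values of series Y (s sorted ascending).
    """
    t, sx, s, sy = (np.asarray(v, dtype=float) for v in (t, sx, s, sy))
    # i_plus[j] = min{i : t_i >= s_j}
    i_plus = np.searchsorted(t, s, side="left")
    # i_minus[j] = max{i : t_i <= s_j}
    i_minus = np.searchsorted(t, s, side="right") - 1
    if i_plus[-1] >= len(t) or i_minus[0] < 0:
        raise ValueError("the X times must cover [s[0], s[-1]]")
    # X change over the smallest X interval containing [s_{j-1}, s_j].
    dx = sx[i_plus[1:]] - sx[i_minus[:-1]]
    dy = np.diff(sy)
    return float(np.dot(dx, dy))

# Synchronized times: the estimator reduces to the sum of products of changes.
t_sync = [0.0, 1.0, 2.0, 3.0]
x_sync = [0.0, 1.0, 3.0, 2.0]
y_sync = [0.0, 2.0, 1.0, 4.0]
print(unsync_covariance(t_sync, x_sync, t_sync, y_sync))

# Unsynchronized times: X is observed at different instants than Y.
t_x = [0.0, 0.5, 1.5, 2.5, 3.0]
x = [0.0, 1.0, 2.0, 4.0, 3.0]
s_y = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 3.0, 2.0]
print(unsync_covariance(t_x, x, s_y, y))
```

Using `searchsorted` keeps the index lookup at O(n log m), which matters for tick data with hundreds of thousands of observations.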
Theorem 1 The estimator (5) is unbiased, i.e.,

$$E\,\hat{c}(a, b) = c(a, b).$$

The proof is straightforward and can be found in the Appendix. There are no distributional assumptions being made about the noises. They can be exotic or autocorrelated. The time deformation functions $\tau_X(t)$ and $\tau_Y(s)$ do not need to be known. Therefore, the volatilities in each tick do not have to be constant or given. Without the presence of noises, the estimator is f-consistent. However, when there are nonnegligible noises, the variance of the estimator diverges.
Theorem 2 Let $Z_X(t_{i^+(j)}) = W_X(t_{i^+(j)}) - W_X(t_{i^-(j-1)})$ and $Z_Y(s_j) = W_Y(s_j) - W_Y(s_{j-1})$. If all $\epsilon_X(t_i)$ and $\epsilon_Y(s_j)$ are zeros and $\max_j \mathrm{var}\, Z_X(t_{i^+(j)}) \to 0$, then

$$\mathrm{var}(\hat{c}(a, b)) = \sum_{a < s_j \le b} \mathrm{var}\left(Z_X(t_{i^+(j)})\, Z_Y(s_j)\right) \to 0. \tag{7}$$

Otherwise,

$$\mathrm{var}(\hat{c}(a, b)) \ge \sum_j \mathrm{var}(\epsilon_X(t_{i^+(j)}))\, \mathrm{var}(\epsilon_Y(s_j)). \tag{8}$$

The proof is given in the Appendix. If the majority of noises is nonzero, the right-hand side of equation (8) approaches infinity as the observation frequency increases. On the other hand, if the frequency is too low, the right-hand side of equation (7) stays high. There is an optimal observation frequency to minimize the variance of the estimator (5).
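To make the divergence concrete, consider a simplified case, assumed here only for illustration, in which the noise variances are constant, $\mathrm{var}(\epsilon_X) = \eta_X^2$ and $\mathrm{var}(\epsilon_Y) = \eta_Y^2$, and the sum in (8) runs over $n$ observations of the second series. The lower bound then grows linearly in $n$:

```latex
\operatorname{var}(\hat{c}(a,b))
  \;\ge\; \sum_{j=1}^{n}
      \operatorname{var}(\epsilon_X(t_{i^+(j)}))\,
      \operatorname{var}(\epsilon_Y(s_j))
  \;=\; n\,\eta_X^2\,\eta_Y^2
  \;\longrightarrow\; \infty
  \quad \text{as } n \to \infty .
```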
To find such an optimal observation frequency without assuming any regularity of the time sequences $\{t_i\}$ and $\{s_j\}$ is complicated. To get some ideas about this optimal observation frequency, I discuss only the following simple example:

1. Time series $S_X(t_i)$ has size $m + 1$ and $S_Y(s_j)$ has size $n + 1$, with $t_0 = s_0 = a$, $t_m = s_n = b$ and $m = kn$. Except at the end points, the observation times satisfy $s_{j-1} < t_{(j-1)k} < t_{(j-1)k+1} < \cdots < t_{jk-1} < s_j$, $j = 1, \ldots, n$. It is easy to see that $i^+(j) = jk$ and $i^-(j-1) = (j-1)k - 1$.

2. Let $Z_X(t_i) = W_X(t_i) - W_X(t_{i-1})$ and $Z_Y(s_j) = W_Y(s_j) - W_Y(s_{j-1})$. The variances of signal changes are constant, $\mathrm{var}(Z_X(t_i)) = \sigma^2_{W_X}/m$ and $\mathrm{var}(Z_Y(s_j)) = \sigma^2_{W_Y}/n$, where $\sigma^2_{W_X} = \tau_X(b) - \tau_X(a)$ and $\sigma^2_{W_Y} = \tau_Y(b) - \tau_Y(a)$. In addition, $\mathrm{var}(Z_X(t_i) Z_Y(s_j)) = \alpha\, \mathrm{var}(Z_X(t_i))\, \mathrm{var}(Z_Y(s_j))$, where $\alpha$ is a constant between 1 and 2.

3. The noises have no autocorrelations, and $\mathrm{var}(\epsilon_X(t_i)) = \eta_X^2$ and $\mathrm{var}(\epsilon_Y(s_j)) = \eta_Y^2$.

Under these conditions, the variance of the estimator (5) is:

$$\mathrm{var}(\hat{c}(a, b)) = \alpha\left(1 + \frac{1}{k}\right)\frac{\sigma^2_{W_X}\sigma^2_{W_Y}}{n} + 2\sigma^2_{W_Y}\eta_X^2 + \frac{2\sigma^2_{W_X}\eta_Y^2}{m} + 2\sigma^2_{W_X}\eta_Y^2 + 4n\,\eta_X^2\eta_Y^2, \tag{9}$$

where $\alpha$ is a constant between 1 and 2. The proof of the equation is given in the Appendix.
From (9), the optimal observation frequency $n$ is near $\sqrt{(\alpha(1 + 1/k)\, r_X r_Y + 2 r_X/k)/4}$, where $r_X = \sigma^2_{W_X}/\eta_X^2$ and $r_Y = \sigma^2_{W_Y}/\eta_Y^2$ are the signal-to-noise ratios. When the size of one series is much bigger than the size of the other one, the optimal observation frequency is near $\sqrt{\alpha\, r_X r_Y/4}$. The optimal observation frequency is proportional to the signal-to-noise ratio.

When there is a high level of noise, high frequency data, such as tick-by-tick, may have too many data points to use the estimator (5). Of course we can throw out some data points. However, to utilize all the data points, I propose the following estimator. Again assume that $\{S_X(t_i)\}$ has $m$ data points and $\{S_Y(s_j)\}$ has $n$ data points, $n < m$, and that $n$ is $k$ times larger than the optimal observation frequency. I can first use $S_Y(s_j)$, $j = 0, k, 2k, \ldots$ to estimate the covariance, then use $S_Y(s_j)$, $j = 1, k+1, 2k+1, \ldots$ to estimate the covariance, and so on. Finally, I average these $k$ estimates. In summary,

$$\hat{c}_k(a, b) = \frac{1}{k} \sum_{p=1}^{k} \sum_{j = p(k)n} \left(S_X(t_{i^+(j)}) - S_X(t_{i^-(j-k)})\right)\left(S_Y(s_j) - S_Y(s_{j-k})\right), \tag{10}$$

where $j = p(k)n$ means that $j$ runs from $p$ to $n$ in steps of $k$.

This estimator is still not f-consistent. However, the upper bound of the variance of the estimator (10) becomes finite when the observation frequency approaches infinity.
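The subsample-and-average idea behind estimator (10) can be sketched as follows. This is a hedged illustration rather than the author's implementation: `unsync_covariance` and `averaged_covariance` are hypothetical names, the offsets are indexed from 0 instead of 1, and each subsample simply starts at its first available observation, which is one possible way to handle the terms near the left end point that formula (10) leaves implicit.

```python
import numpy as np

def unsync_covariance(t, sx, s, sy):
    """Sketch of estimator (5) for sorted time arrays t (series X) and s (series Y)."""
    t, sx, s, sy = (np.asarray(v, dtype=float) for v in (t, sx, s, sy))
    i_plus = np.searchsorted(t, s, side="left")        # min{i : t_i >= s_j}
    i_minus = np.searchsorted(t, s, side="right") - 1  # max{i : t_i <= s_j}
    if i_plus[-1] >= len(t) or i_minus[0] < 0:
        raise ValueError("the X times must cover [s[0], s[-1]]")
    return float(np.dot(sx[i_plus[1:]] - sx[i_minus[:-1]], np.diff(sy)))

def averaged_covariance(t, sx, s, sy, k):
    """Average of k estimates, each using every k-th observation of series Y."""
    s, sy = np.asarray(s, dtype=float), np.asarray(sy, dtype=float)
    if not 1 <= k < len(s):
        raise ValueError("k must satisfy 1 <= k < len(s)")
    # Offset p selects the subsample s_p, s_{p+k}, s_{p+2k}, ...
    estimates = [unsync_covariance(t, sx, s[p::k], sy[p::k]) for p in range(k)]
    return float(np.mean(estimates))

t_x = [0.0, 0.5, 1.5, 2.5, 3.0]
x = [0.0, 1.0, 2.0, 4.0, 3.0]
s_y = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 3.0, 2.0]
print(averaged_covariance(t_x, x, s_y, y, k=1))  # same as estimator (5)
print(averaged_covariance(t_x, x, s_y, y, k=2))
```

With `k=1` the function reduces to the plain estimator (5); larger `k` lowers the effective observation frequency of each estimate while still using every data point.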
3 Estimating Covariance and Correlation Matrices of Three Exchange Rates

In this section, I will estimate the daily covariance and correlation matrices of exchange rates from tick-by-tick data. The high frequency data is provided by Olsen & Associates. It is referred to as the HFDF93 data set. It contains spot rate quotes of the Deutsche mark and US dollar (DEM/USD), the Japanese yen and US dollar (JPY/USD) and the Japanese yen and Deutsche mark (JPY/DEM) from October 1, 1992 to September 30, 1993. The data is recorded from the Reuters screen. DEM/USD is the most actively traded exchange rate in the market, followed by JPY/USD. Cross-rates, like JPY/DEM, are much less active. In this paper only the bid prices are used because the bid price is quoted in its entirety. The ask price is quoted by the last two or three digits only. Interpreting the ask price by computer is often troublesome. Since the data is a non-binding quote, the level of observation noise is very high. The basic statistics of the returns of the three exchange rates are listed in Table 1.
Table 1: Summary Statistics of Tick-by-tick Returns

Series    n        Min.    Max.    Mean      sd       Skew.  Kurt.  ρ
DEM/USD   1472032  -.0066  .00544  9.92e-8   2.67e-4  .017   6.87   -.451
JPY/USD   570689   -.0105  0.0104  -2.18e-7  3.72e-4  -.029  32.17  -.425
JPY/DEM   158958   -.0098  0.0100  -1.70e-6  3.50e-4  -.385  23.26  -.107

ρ: the first order autocorrelation.
To examine if there are any lagged correlations among the three exchange rates, cross-correlations of daily returns are calculated. Figure 1 shows that the three exchange rates have significant correlations, but there are no lagged correlations.
To calculate the correlation matrix, the following volatility estimator is used (Zhou 1995a):

$$\hat{\sigma}^2 = \frac{1}{k} \sum_{p=1}^{k} \sum_{i = p(k)n} \left(S(t_i) - S(t_{i-k})\right)\left(S(t_{i+k}) - S(t_{i-2k})\right), \tag{11}$$

and the level of noise is estimated by the sample autocovariance

$$\hat{\eta}^2 = -\frac{1}{n} \sum_{p=1}^{k} \sum_{i = p(k)n} \left(S(t_i) - S(t_{i-k})\right)\left(S(t_{i-k}) - S(t_{i-2k})\right). \tag{12}$$

The estimates of the annual volatilities, the variances of the signals, and the noise level $\eta^2$ are listed in Table 2.
From Table 2, I can get a rough estimate of the optimal observation frequency for each pair of exchange rates. For DEM/USD and JPY/USD, the rough estimate is about 260,000. Compared to the size of the smaller series, JPY/USD, it is less than one half. I choose $k = 3$ in formula (10) to estimate the covariance. Using a similar argument, $k = 1$ is chosen in the two other pairs of exchange rates. The daily covariances of the three pairs of exchange rates are given in Figure 2.
Since the market volatility changes over time, the change of the covariance is affected by changing volatility; therefore a correlation matrix may be more desirable in some cases. Using volatility estimator (11), daily volatilities are calculated for each exchange rate. Daily correlations of the three exchange rates are plotted in Figure 3.

There are several interesting observations from Figure 3. First, the correlation between DEM/USD and JPY/USD is almost always positive. This indicates that the US dollar is a leading currency. When it moves, it moves in the same direction against both the Deutsche mark and the Japanese yen. The Deutsche mark has the same feature. During the first four and a half months of the time interval, both the US dollar and the Deutsche mark dominated the market. However, after February of 1993, the Japanese yen became a dominating currency. The role of each currency changes over a long time interval.
Figure 2: Daily Estimates of the Covariance of Three Exchange Rates.

[Figure 3 panel: Correlation between JPY/USD & JPY/DEM]
However, the correlations are relatively stable over short periods such as months.

Since the three exchange rates are dependent, the actual covariance matrix is singular. Buying one unit of DEM/USD and one unit of JPY/DEM and selling one unit of JPY/USD results in a neutral position. The empirical results confirmed that the minimum eigenvalues of the estimated monthly covariance matrices are very close to zero for all twelve months.
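The singularity can be checked numerically on synthetic data. The sketch below is illustrative only: it assumes the cross rate satisfies the triangular relation exactly in log returns (noisy tick quotes satisfy it only approximately), and the return series, their scales and the variable names are made up for the example rather than taken from the HFDF93 data.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic daily log returns (hypothetical values, not the paper's data).
r_dem_usd = rng.normal(0.0, 0.007, size=250)
r_jpy_usd = 0.5 * r_dem_usd + rng.normal(0.0, 0.006, size=250)
# Triangular relation in logs: JPY/DEM = (JPY/USD) / (DEM/USD).
r_jpy_dem = r_jpy_usd - r_dem_usd

cov = np.cov(np.vstack([r_dem_usd, r_jpy_usd, r_jpy_dem]))
eigvals = np.linalg.eigvalsh(cov)  # eigenvalues in ascending order

# Neutral position: long DEM/USD, short JPY/USD, long JPY/DEM.
w = np.array([1.0, -1.0, 1.0])
neutral_variance = float(w @ cov @ w)
print(eigvals, neutral_variance)
```

The smallest eigenvalue and the variance of the neutral position are both zero up to floating-point rounding, which is the property the monthly estimates are checked against.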
4 Discussion

High frequency data provides us with enough data not only to estimate a large covariance matrix, but also to estimate the covariance matrix over a short time interval. This is extremely beneficial in a fast changing market. It enables people to see the market change quickly and adjust their portfolio in time. Of course, using high frequency data is not without its difficulties. The nonsynchronized time causes great difficulty in getting a maximum likelihood estimator, and the observation noise causes great difficulty in achieving f-consistency. The covariance estimator proposed here is simple with few assumptions. However, if one is willing to impose more conditions, the estimator can be improved.

In the currency market, it is reasonable to assume no leading correlation (4) among exchange rates. This may not be true in the stock market when small stocks are considered. Lo and MacKinlay (1990) showed that there is asymmetry in the stock market, i.e., big stocks may lead small stocks. The proposed estimator does not work in such a case. However, small stocks are often thinly traded; they are not the focus of this research.
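Appendix:

i) Proof of Theorem 1, the unbiasedness of estimator (5):

First, I write the change of prices as follows:

$$S_X(t_{i^+(j)}) - S_X(t_{i^-(j-1)}) = Z_X(t_{i^+(j)}) + \epsilon_X(t_{i^+(j)}) - \epsilon_X(t_{i^-(j-1)}),$$
and
$$S_Y(s_j) - S_Y(s_{j-1}) = Z_Y(s_j) + \epsilon_Y(s_j) - \epsilon_Y(s_{j-1}).$$

Then

$$\begin{aligned}
E\,\hat{c}(a, b) = E \sum_{a < s_j \le b} \big[ & Z_X(t_{i^+(j)}) Z_Y(s_j) + Z_X(t_{i^+(j)}) \epsilon_Y(s_j) - Z_X(t_{i^+(j)}) \epsilon_Y(s_{j-1}) \\
& + \epsilon_X(t_{i^+(j)}) Z_Y(s_j) + \epsilon_X(t_{i^+(j)}) \epsilon_Y(s_j) - \epsilon_X(t_{i^+(j)}) \epsilon_Y(s_{j-1}) \\
& - \epsilon_X(t_{i^-(j-1)}) Z_Y(s_j) - \epsilon_X(t_{i^-(j-1)}) \epsilon_Y(s_j) + \epsilon_X(t_{i^-(j-1)}) \epsilon_Y(s_{j-1}) \big] \\
= \sum_{a < s_j \le b} & E\left[Z_X(t_{i^+(j)}) Z_Y(s_j)\right] = c(a, b).
\end{aligned}$$

ii) Proof of Theorem 2:

When all the noises are zero, writing $\sigma_X^2(t_{i^+(j)}) = \mathrm{var}\, Z_X(t_{i^+(j)})$ and $\sigma_Y^2(s_j) = \mathrm{var}\, Z_Y(s_j)$,

$$\mathrm{var}(\hat{c}(a, b)) = \sum_{a < s_j \le b} \mathrm{var}\left(Z_X(t_{i^+(j)}) Z_Y(s_j)\right) \le \sum_{a < s_j \le b} \sqrt{3\sigma_X^4(t_{i^+(j)}) \cdot 3\sigma_Y^4(s_j)} \le 3 \max_j \sigma_X^2(t_{i^+(j)}) \sum_j \sigma_Y^2(s_j) \to 0.$$

On the other hand, when there are noises,

$$\mathrm{var}(\hat{c}(a, b)) \ge \sum_{a < s_j \le b} \mathrm{var}\left(\epsilon_X(t_{i^+(j)})\, \epsilon_Y(s_j)\right).$$

iii) Proof of equation (9):

$$\begin{aligned}
\mathrm{var}(\hat{c}(a, b)) = \mathrm{var}\Big( \sum_{j=1}^{n} \big[ & (Z_X(t_{(j-1)k}) + \cdots + Z_X(t_{jk})) Z_Y(s_j) \\
& + (Z_X(t_{(j-1)k}) + \cdots + Z_X(t_{jk})) \epsilon_Y(s_j) - (Z_X(t_{(j-1)k}) + \cdots + Z_X(t_{jk})) \epsilon_Y(s_{j-1}) \\
& + \epsilon_X(t_{jk}) Z_Y(s_j) + \epsilon_X(t_{jk}) \epsilon_Y(s_j) - \epsilon_X(t_{jk}) \epsilon_Y(s_{j-1}) \\
& - \epsilon_X(t_{(j-1)k-1}) Z_Y(s_j) - \epsilon_X(t_{(j-1)k-1}) \epsilon_Y(s_j) + \epsilon_X(t_{(j-1)k-1}) \epsilon_Y(s_{j-1}) \big] \Big) \\
= \alpha\left(1 + \frac{1}{k}\right) & \frac{\sigma^2_{W_X}\sigma^2_{W_Y}}{n} + 2\sigma^2_{W_Y}\eta_X^2 + \frac{2\sigma^2_{W_X}\eta_Y^2}{m} + 2\sigma^2_{W_X}\eta_Y^2 + 4n\,\eta_X^2\eta_Y^2.
\end{aligned}$$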
References

[1] Dey, D. K. and C. Srinivasan (1985), "Estimation of a Covariance Matrix Under Stein's Loss." Annals of Statistics, 13, 1581-1591.

[2] Frost, P.A. and J.E. Savarino (1986), "An Empirical Bayes Approach to Portfolio Selection." J. of Financial and Quantitative Analysis, 21, 293-305.

[3] Goodhart, C.A.E. and L. Figliuoli (1991), "Every minute counts in financial markets." J. of Inter. Money and Finance, 10, 23-52.

[4] Jobson, J. D. and B. Korkie (1980), "Estimation for Markowitz efficient portfolios." JASA, 75, 544-554.

[5] Ledoit, Olivier (1994), "Portfolio Selection: Improved Covariance Matrix Estimation." Ph.D. Dissertation, Sloan School of Management, MIT.

[6] Lo, Andrew and A.C. MacKinlay (1990), "When Are Contrarian Profits Due to Stock Market Overreaction?" The Review of Financial Studies, 3, 175-205.

[7] Markowitz (1952), "Portfolio Selection." J. of Finance, 7, 77-91.

[8] Stein, C. (1975), "Estimation of a Covariance Matrix." Rietz Lecture, 39th Annual Meeting IMS, Atlanta, GA.

[9] Zhou, Bin (1995a), "High Frequency Data and Volatility In Foreign Exchange Rates." JBES, in press.

[10] Zhou, Bin (1995b), "Estimating The Variance Parameter From Noisy High Frequency Financial Data." MIT Sloan School Working Paper.

[11] Zhou, Bin (1994), "Forecasting Foreign Exchange Rates Subject to Devolatilization." in Artificial Intelligence in the Capital Markets, Ed. by Freedman, Klein & Lederman, Probus, Chicago.