Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach On Lead-Lag Estimation Mathieu Rosenbaum CMAP-École Polytechnique Joint works with Marc Hoffmann, Christian Y. Robert and Nakahiro Yoshida 12 January 2011 Mathieu Rosenbaum On Lead-Lag Estimation 1 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Outline 1 Introduction 2 Lead-Lag estimation from non synchronous data 3 Simulations 4 Real Data 5 A random matrices based approach Mathieu Rosenbaum On Lead-Lag Estimation 2 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Outline 1 Introduction 2 Lead-Lag estimation from non synchronous data 3 Simulations 4 Real Data 5 A random matrices based approach Mathieu Rosenbaum On Lead-Lag Estimation 3 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Motivation Observation from practitioners in finance Some assets are leading some other assets. This means that a “lagger” asset may partially reproduce the behavior of a “leader” asset. This common behavior is unlikely to be instantaneous. It is subject to some time delay called “lead-lag”. Mathieu Rosenbaum On Lead-Lag Estimation 4 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach A toy model for Lead-Lag Bachelier model For t ∈ [0, 1], and (B (1) , B (2) ) such that hB (1) , B (2) it = ρt, set (1) e (2) e0 + σ2 Bt , Xt := x0 + σ1 Bt , Y t := y et−θ , t ∈ [θ, 1]. Our lead-lag model is given by Define Yt := Y the bidimensional process (Xt , Yt ). We have ( Xt Yt (1) = x0 + σ1 Bt . (1) = y0 + ρ σ2 Bt−θ + σ2 (1 − ρ2 )1/2 Wt−θ Mathieu Rosenbaum On Lead-Lag Estimation 5 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Intuitive estimator in the Bachelier model (1) Estimation idea (1) Assume the data arrive at regular and synchronous time stamps in the Bachelier model, i.e. we have data (X0 , Y0 ), (X∆n , Y∆n ), (X2∆n , Y2∆n ), . . . , (X1 , Y1 ), and suppose θ = k0 ∆n , k0 ∈ Z. Let Cn (k) := X Xi∆n − X(i−1)∆n Y(i+k)∆n − Y(i+k−1)∆n . i Mathieu Rosenbaum On Lead-Lag Estimation 6 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Intuitive estimator in the Bachelier model (2) Estimation idea (2) Heuristically, we have 1/2 n Cn (k) ≈ ∆−1 n E (X· − X·−∆n )(Y·+k∆n − Y·+(k−1)∆n ) + ∆n ξ . Moreover, ∆n−1 E (X· −X·−∆n )(Y·+k∆n −Y·+(k−1)∆n ) = 0 if k = 6 k0 ρ σ1 σ2 if k = k0 . Thus we can (asymptotically) detect the value k0 that defines θ in the very special case θ = k0 ∆n by maximizing in k the contrast sequence Cn (k). k Mathieu Rosenbaum On Lead-Lag Estimation 7 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Outline 1 Introduction 2 Lead-Lag estimation from non synchronous data 3 Simulations 4 Real Data 5 A random matrices based approach Mathieu Rosenbaum On Lead-Lag Estimation 8 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Covariation estimation Previous-Tick estimation Estimating covariation is an intricate problem as soon as non synchronous data are considered. Assume now we observe X at times (T X ,i ), i = 1, . . . and Y at times (T Y ,i ), i = 1, . . ., with T X ,i ≤ T , T Y ,i ≤ T . We build X t = XT X ,i for t ∈ [T X ,i , T X ,i+1 ), Y t = YT Y ,i for t ∈ [T Y ,i , T Y ,i+1 ). For given h, the previous tick covariation estimator is m X Vh = X ih − X (i−1)h Y ih − Y (i−1)h . i=1 Mathieu Rosenbaum On Lead-Lag Estimation 9 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Drawback of this estimator Epps effect Systematic bias for this estimator. Example : Assume that X and Y are two Brownian motions with correlation ρ and that the observation times are arrival times of two independent Poisson processes, then one can show that E[Vh ] → 0, as h → 0. Mathieu Rosenbaum On Lead-Lag Estimation 10 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach A convergent estimator under asynchronicity Hayashi-Yoshida estimator Let IiX = (T X ,i , T X ,i+1 ] and IiY = (T Y ,i , T Y ,i+1 ] The Hayashi-Yoshida estimator is X Un = ∆X (IiX )∆Y (IjY )1{I X ∩I Y 6=∅} . i j i,j This estimator does not need any selection of h and is convergent. Mathieu Rosenbaum On Lead-Lag Estimation 11 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach The lead-lag model Let θ > 0 (for simplicity, extensions are quite straightforward) and set Fθ = (Ftθ )t≥0 , with Ftθ = Ft−θ . Assumptions We have X = X c + A, Y = Y c + B. (Xtc )t≥0 is a continuous F-local martingale, and (Ytc )t≥0 is a continuous Fθ -local martingale. ∃vn → 0, vn−1 max sup{|IiX |}, sup{|IiY |} → 0. The T X ,i are Fvn -stopping times and the T Y ,i are Fθ+vn -stopping times. Mathieu Rosenbaum On Lead-Lag Estimation 12 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Estimation strategy Estimator We set Un (θ) = X ∆X (IiX )∆Y (IjY )1{I X ∩(I Y )−θ 6=∅} , i j i,j with (IjY )−θ = (T Y ,j − θ, T Y ,j+1 − θ]. Eventually, θbn is defined as a solution of n U (θbn ) = max U n (θ), n θ∈G where G n is a sufficiently fine grid. Mathieu Rosenbaum On Lead-Lag Estimation 13 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Result Theorem As n → ∞, vn−1 θbn − θ → 0, e c iT 6= 0 . in probability, on the event hX c , Y Mathieu Rosenbaum On Lead-Lag Estimation 14 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Outline 1 Introduction 2 Lead-Lag estimation from non synchronous data 3 Simulations 4 Real Data 5 A random matrices based approach Mathieu Rosenbaum On Lead-Lag Estimation 15 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Synchronous case Setup We consider 300 simulations of the Bachelier model with synchronous equispaced data with period ∆n . t ∈ [0, 1], θ = 0.1, x0 = ye0 = 0, σ1 = σ2 = 1. The mesh size of the grid hn satisfies hn = ∆n . We consider the following variations : - Mesh size : hn ∈ {10−3 (FG ), 3. 10−3 (MG ), 6. 10−3 (CG )}. - Correlation value : ρ ∈ {0.25, 0.5, 0.75}. Mathieu Rosenbaum On Lead-Lag Estimation 16 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Results in the synchronous case θbn FG, ρ = 0.75 MG, ρ = 0.75 CG, ρ = 0.75 FG, ρ = 0.50 MG, ρ = 0.50 CG, ρ = 0.50 FG, ρ = 0.25 MG, ρ = 0.25 CG, ρ = 0.25 0.096 0 0 1 0 0 13 0 0 10 0.099 0 300 0 0 299 0 0 152 0 0.1 300 0 0 300 0 0 300 0 0 0.102 0 0 299 0 1 280 0 11 66 Other 0 0 0 0 0 7 0 137 124 Table 1 : Estimation of θ = 0.1 on 300 simulated samples for ρ ∈ {0.25, 0.5, 0.75}. Mathieu Rosenbaum On Lead-Lag Estimation 17 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach One sample path, FG, ρ = 0.75 0.4 0.2 0.0 Contrast 0.6 ● ● ● ● ● ●● ● ●● ● ● ● ● ●● ●●● ● ●● ●● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ●● ● ●●● ●●● ●● ● ● ●● ● ●●●●● ●● ● ● ●● ●● ● ●● ●● ● ●●● ● ● ● ● ●●●● ● ● ●● ● ● ●● ● ● ●● ●●● ●● ● ● ● ● ● ●●● ●● ●●●●● ● ● ● ●●●● ●●●●● ● ●●●●●● ● ●●● ●●● ● ●● ● ● ●● ● ● ●● ●● ●●●● ● ●●●●● ●● ●● ●● ● ● ● ●● ● ● ● ●● ●● ● ●● ● ● ● ●●● ●● ● ● ●● ●● ●●● ● ●● ●● ●● ●● ●● ● ●● ● ●● ●● ● ●● ● ● ● ●● ●● ● ● ● ● ● ● ●● ● ●● ●● ● ● ● ● ● ●● ● ●● ● ●● ● ● ●● ● ● ● ● ●● ● ●●● ●●● ● ●● ●● ● ●● ● ●● ● ●● ●● ● ●● ● ●● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ●● ● ● ●●●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●●● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ●● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ●● ●●● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●●● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ●● ●● ● ●● ● ● ● ● ●● ● ● ●●●●● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●●● ● ●● ● ●● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●●● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ●● ●● ● ● ●● ● ● ● ● ● ●● ● ● ●●● ● ● ● ● ●●● ● ●● ● ●●● ● ●● ● ●● ●● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ●●● ●● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ●● ●● ●●● ● ●● ● ● ●●●● ● ● ● ●● ●● ●● ● ● ● ● 0.0 0.2 0.4 0.6 0.8 1.0 lag Mathieu Rosenbaum On Lead-Lag Estimation 18 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach One sample path, FG, ρ = 0.25 ● ●● ● ● ● ● ● ● 0.05 ● ●● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ●●● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ●●● ● ● ●● ● ●● ●● ●●● ●● ●● ● ● ● ● ● ●● ● ● ●● ● ● ● ●● ●● ●●●●● ●●●● ● ● ● ● ● ●● ● ●● ● ● ● ● ●● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ●● ● ● ●● ●● ● ● ● ●● ●● ●●● ● ●● ● ● ● ● ●●● ● ●●●●● ● ● ● ● ● ● ●● ● ● ●● ●●● ●●● ●● ● ●●● ● ●● ● ● ● ●●● ●●●● ●●● ● ●● ● ● ●●● ● ● ● ● ●● ●● ● ● ● ● ●● ● ●●● ● ●● ● ● ● ●● ●● ● ● ● ● ● ● ●● ●● ●● ● ● ●●● ● ● ●● ●● ● ● ●● ● ● ●●● ● ●● ●● ● ● ●●● ● ● ●● ● ●● ● ●● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ●● ● ●●●● ● ● ● ●● ●● ● ● ● ●● ●●● ●● ●● ●● ● ● ●● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●●● ●● ●● ● ● ●●●●● ● ● ●● ● ● ● ●● ● ● ● ●● ●● ●● ●● ●● ●● ● ● ●● ● ●●●●● ● ● ● ●● ●● ●● ●● ●●●●● ● ●●● ●● ● ● ● ●● ● ● ●● ●● ● ●● ● ● ● ● ● ● ●● ● ●●●● ●● ● ● ● ● ●●●● ●● ● ●●●● ● ●● ●● ● ● ● ● ● ● ● ●●● ●●● ●●● ●● ● ●●●● ●●● ●●● ● ● ●●● ● ●● ●● ● ● ● ● ●● ● ●●●●● ●●● ●● ● ●●● ● ● ●● ● ●● ●● ●● ●● ●● ● ● ● ●●●●● ● ●●●● ● ● ● ●●● ● ●● ●● ● ●●●● ● ● ● ● ● ● ●●● ● ● ●●● ●● ● ● ●● ● ● ●●●● ● ● ●● ●●● ● ●● ●● ●●●● ● ●●●● ● ●●●● ● ●● ● ●●●● ● ●●● ● ●● ● ●● ●●● ●● ● ●●● ● ●●●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●●● ●●●●●● ●● ●● ● ● ●● ●●● ● ● ● ●●●● ● ●●● ● ● ●●●●●● ●● ● ● ●●● ● ● ●● ● ●● ●● ● ●● ●● ●●● ●● ●●●●●● ● ● ● ● ● ● ●● ● ●●● ●● ● ● ●● ● ●● ● ●●●●●●●●● ●● ●● ●●● ● ●● ●● ●● ●● ●●●● ● ● ● ● ● ●● ●●● ●●● ●● ●● ● ●● ●●● ●● ● ● ● ● ● ● ● ●●●●● ● ●●● ● ● ●● ● 0.00 Contrast 0.10 0.15 0.20 ● 0.0 0.2 0.4 0.6 0.8 1.0 lag Mathieu Rosenbaum On Lead-Lag Estimation 19 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach One sample path, MG, ρ = 0.75 0.2 0.3 ● ● ● ● 0.1 ● 0.0 Contrast 0.4 0.5 ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ●● ● ●● ● ● ● ●●● ●● ●●● ●● ●● ●● ● ●●● ● ● ● ●● ● ● ● ●● ● ● ●●● ●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●●●● ●●● ● ●● ● ●● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ●● ●● ●●● ●●●●●●● ●●●●●●● ●● ● ● ● ●● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ●● ● ● ● ●●● ● ● ●●● ●● ● ● ● ●● ●● ● ● ●●● ●● ● ● ●● ●● ● ●●●● ●●●● ●● ● ● ● ● ●●●● ●●●●● ●●●● ● ●● ● ● ●● ●●●● ● ● ● ● ●● ●● ● ●●●●● ●● ●● ● ●●● ● ●● ● ● ●● ●●●● ● ●● ● ●●● ● ●● ● ●●● ● ● ● ● ● ●● ● ● ●● ● 0.0 0.2 0.4 0.6 0.8 1.0 lag Mathieu Rosenbaum On Lead-Lag Estimation 20 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach One sample path, MG, ρ = 0.25 ● ● 0.15 ● ● ● ● ● ● 0.10 ● ● ●● ● ● ● ● ● ● ● ● ● ● ● 0.05 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.00 Contrast ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ●● ● ● ● ● ●● ● ●● ● ● ● ● ●● ● ● ● ●● ● ●● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ●● ● ● ●● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ●● ●●● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ●● ● ●● ● ● ● ● ●● ● ●● ● ● ● ● ●● ● ● ● ●●●● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ●● ● ●●● ●● ●● ●● ● ● ●● ●● ● ●● ● ● ●● ● ● ● ● ● ● ● ●● ● ●● ● ●● ● ●● ● ● ● ● ● ● ● ● ● 0.0 0.2 0.4 0.6 ● ●● ● ● ● ● ● ● 0.8 1.0 lag Mathieu Rosenbaum On Lead-Lag Estimation 21 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach One sample path, CG, ρ = 0.75 0.4 ● 0.2 ● ● ● ● 0.1 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●● ● ●● ●● ●● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ●●●● ● ●● ● ●● ●● ● ●● ● ● ● ●● ● ● ● ● ● ●● ● ●● ● ● ● ●●● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ●● ●●● ●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.0 Contrast 0.3 ● 0.0 0.2 0.4 0.6 0.8 1.0 lag Mathieu Rosenbaum On Lead-Lag Estimation 22 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach One sample path, CG, ρ = 0.25 ● ● 0.20 0.25 ● ● ● ● 0.15 ● ● ● ● ● ● 0.10 ● ● ● ● ● ● ● 0.05 ● ● ●● ● 0.0 ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.4 ● ●● ● ●● ● ● 0.6 ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ●● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●●●● ● ● ● ● ● ● ● ●● ● ● ● ● ●● 0.2 ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.00 Contrast ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● 0.8 1.0 lag Mathieu Rosenbaum On Lead-Lag Estimation 23 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Non synchronous case Setup We randomly pick 300 sampling times for X over [0, 1], uniformly over a grid of mesh size 10−3 . We randomly pick 300 sampling times for Y likewise, and independently of the sampling for X . Fine grid case, with θ = 0.1 and ρ = 0.75. Mathieu Rosenbaum On Lead-Lag Estimation 24 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Results for the non synchronous case θb FG, ρ = 0.75 0.099 16 0.1 106 0.101 107 0.102 46 0.103 19 0.104 4 0.105 2 Table 2 : Estimation of θ = 0.1 on 300 simulated samples for ρ = 0.75 and non-synchronous data. Mathieu Rosenbaum On Lead-Lag Estimation 25 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Outline 1 Introduction 2 Lead-Lag estimation from non synchronous data 3 Simulations 4 Real Data 5 A random matrices based approach Mathieu Rosenbaum On Lead-Lag Estimation 26 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach The data Dataset We study here the lead-lag relationship between the two following assets : The future contract on the DAX index, with maturity December 2010, The Euro-Bund future contract, with maturity December 2010. Mathieu Rosenbaum On Lead-Lag Estimation 27 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Dealing with microstructure noise Methodology We want to use high frequency data. First approach : use of the Uncertainty Zones Model. Here we just use signature plots in trading times. This enables to take advantage of non synchronous data. We keep one trade out of twenty. We then compute the function U n over these trades. Mathieu Rosenbaum On Lead-Lag Estimation 28 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach 7000 6000 Realized volatility 4000 5000 0.6 0.5 0.4 3000 0.3 0.2 Realized volatility 0.7 8000 0.8 Signature plots, October 13, for Bund (left) and FDAX (right). 0 10 20 30 40 50 Period (in number of trades) Mathieu Rosenbaum 0 10 20 30 40 50 Period (in number of trades) On Lead-Lag Estimation 29 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach −2 −4 −6 −8 Contrast function 0 2 Function U n , October 13, between -10 and 10 minutes, mesh=30 seconds −600 −400 −200 0 200 400 600 Time shift value (seconds) Mathieu Rosenbaum On Lead-Lag Estimation 30 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach −8.0 −8.5 Contrast function −7.5 −7.0 Function U n , October 13, between -5 and 5 seconds, mesh=0.1 second −4 −2 0 2 4 Time shift value (seconds) Mathieu Rosenbaum On Lead-Lag Estimation 31 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Bund and DAX, lead-lag estimation Jour 1 Oct. 5 Oct. 6 Oct. 7 Oct. 8 Oct. 11 Oct. 12 Oct. 13 Oct. 14 Oct. 15 Oct. Vol.(Bund) 2847 2213 2244 1897 2545 1050 2265 2018 2057 2571 Vol.(FDAX) 4215 3302 2678 3121 2852 1497 3018 3037 2625 3269 Mathieu Rosenbaum LL. -0.2 -1.1 -0.1 -0.5 -0.6 -1.4 -0.8 -0.8 0.0 -0.7 18 19 20 21 22 25 26 27 28 29 J. Oct. Oct. Oct. Oct. Oct. Oct. Oct. Oct. Oct. Oct. On Lead-Lag Estimation Vol.B. 1727 2527 2328 2263 1894 1501 2049 2606 1980 2262 Vol.F. 2326 3162 2554 3128 1784 2065 2462 2864 2632 2346 LL -2.1 -1.6 -0.5 -0.1 -1.2 -0.4 -0.1 -0.6 -1.3 -1.6 32 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Outline 1 Introduction 2 Lead-Lag estimation from non synchronous data 3 Simulations 4 Real Data 5 A random matrices based approach Mathieu Rosenbaum On Lead-Lag Estimation 33 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Model (1) Dynamics We consider two processes (Xt )t∈[0,1] (leader) and (Yt )t∈[0,1] (lagger) such that Z t Xt − X0 = Ks+θ dWs+θ , 0 Z t∧θ Z t∨θ Z t Yt − Y0 = ρ Ks dW̃s + ρ Ks dWs + Ls dWs0 . 0 θ 0 The interval [θ, 1] is the set of time where the lead-lag relation is in force. For s ∈ [θ, 1], dYs = ρdXs−θ + Ls dWs0 . Mathieu Rosenbaum On Lead-Lag Estimation 34 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Model (2) Observations We consider m + 1 equidistant data for each process : Xi/m , Yi/m ), for i = 0, . . . , m. m = pbp a c where p is a positive integer and a > 0. Later, p will be the order of magnitude of the number of “days” the processes will be observed and m + 1 the number of data per day. This parameter will drive the asymptotic. Mathieu Rosenbaum On Lead-Lag Estimation 35 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Increments We consider increments of the processes on grids with mesh 1/p. Notation For i = 1, . . . , p, and l = 0, . . . , bp a c ∆(l,p) Xi = Xi/p+l/m − X(i−1)/p+l/m . ∆(0,p) Xi and ∆(l,p) Yi are centered Gaussian with variance X vi,0 Z i/p 2 Ks+θ ds, = Y vi,l (i−1)/p Z i/p+l/m = (ρ2 Ks2 + L2s )ds. (i−1)/p+l/m Random vector of interest : Z (l,p) = p 1/2 ∆(0,p) X1 , . . . , ∆(0,p) Xp , ∆(l,p) Y1 , . . . , ∆(l,p) Yp−1 Mathieu Rosenbaum On Lead-Lag Estimation > . 36 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Theoretical covariance Let bθcp = bpθc /p. Z (l,p) is a Gaussian vector of size 2p − 1 with 5-diagonal covariance matrix Σ(l,p) . For l = 0, . . . , bm(θ − bθcp )c : 1 ≤ i ≤ p, 1 ≤ j ≤ p, i = j p + 1 ≤ i ≤ 2p − 1, p + 1 ≤ j ≤ 2p − 1, i = j (Σ(l,p) )i,j (Σ(l,p) )i,j (Σ(l,p) )i,j (Σ(l,p) )i,j (Σ(l,p) )i,j (Σ(l,p) )i,j 1 ≤ i ≤ p, p + 1 ≤ j ≤ 2p − 1, j − p = i + p bθcp 1 ≤ i ≤ p, p + 1 ≤ j ≤ 2p, j − p = i + p bθcp + 1 p + 1 ≤ i ≤ 2p − 1, 1 ≤ j ≤ p, i − p = j + p bθcp p + 1 ≤ i ≤ 2p − 1, 1 ≤ j ≤ p, i − p = j + p bθcp + 1 X = pvi,0 Y = pvj−p,l XY = pvi,l,1 XY = pvi,l,2 XY = pvj,l,1 XY = pvj,l,2 with XY vi,l,1 Z i/p−(θ−bθcp )+l/m =ρ 2 Ks+θ ds and (i−1)/p XY vi,l,2 Z i/p =ρ 2 Ks+θ ds. i/p−(θ−bθcp )+l/m The parameter θ appears in the location of the diagonals. Analogous result for l = bm(θ − bθcp )c + 1, . . . , bp a c. Mathieu Rosenbaum On Lead-Lag Estimation 37 Introduction Lead-Lag estimation from non synchronous data Simulations Real Data A random matrices based approach Result Theorem Using random matrices theory results, we can build another estimator of the lead-lag parameter and provide its asymptotic theory. Mathieu Rosenbaum On Lead-Lag Estimation 38