Tracking Eye Motion from Retinal Scan Data with a Map Seeking Circuit Al Parker Department of Mathematical Sciences Montana State University November 13, 2004 Joint work with • Austin Roorda and Scott Stevenson, College of Optometry, University of Houston • Curt Vogel, Department of Mathematical Sciences, Montana State University • David Arathorn, Center for Computational Biology, Montana State University • Tomas Gedeon, Derek Sonderegger and Shaun Harker, MSU Math Dept 1 Outline • Show and Tell (Results) – Estimating Eye Motion from AOSLO data – Image Registration • The Algorithm which finds the motion: Map-Seeking Circuit (MSC) 2 Results: Estimated Motion from AOSLO Data 120 Vertical Motion 100 80 60 pixels 40 20 2 layers 16p 0 2 layers 32p 4 layers 16p −20 0 50 100 150 time in frames 200 250 300 Horizontal Motion 50 2 layers 16p 2 layers 32p 40 4 layers 16p 30 20 10 pixels 0 −10 −20 −30 −40 0 50 100 150 time in frames 200 250 300 3 Image Preprocessing 1. Smooth the data with a Gaussian kernel to reduce the effect of noise and and amplitude variation. Raw 450 450 400 400 350 350 300 300 250 250 200 200 150 150 100 100 50 50 50 100 150 200 250 300 350 400 450 500 50 100 Smoothed 150 200 250 300 350 400 450 500 4 2. Break the smoothed image into “channels”. High 450 450 400 400 350 350 300 300 250 250 200 200 150 150 100 100 50 50 50 100 150 200 250 300 350 400 450 500 50 100 Medium Low 50 100 150 200 250 300 350 400 450 150 200 250 300 350 400 450 500 50 100 150 200 250 300 350 400 450 500 5 3. Determine the eye motion across “patches”. This yields HIGH RESOLUTION of the motion: We can calculate 16 - 32 estimates of the motion per frame, which is about 480 - 960 estimates per second. 450 450 400 400 350 350 300 300 250 250 200 200 150 150 100 100 50 50 50 100 150 200 250 300 350 400 450 500 50 100 150 200 250 300 350 400 450 500 6 Image Registration • Dewarp each frame of the AOSLO video • Add each dewarped frame to create a mosaic or montage Dewarped image 50 100 150 200 250 300 350 400 450 50 100 150 200 250 300 350 400 450 500 7 How to add frames: Initialize montage with (intensity=0,weight=0) 5 4.5 (0,0) 4 (0,0) (0,0) 3.5 (0,0) (0,0) (0,0) 3 2.5 (0,0) (0,0) (0,0) 2 1.5 1 1 1.5 2 2.5 3 3.5 4 4.5 5 8 How to add frames: Map the first pixel, update the montage 5 4.5 (i1,1) 4 (i1,1) (0,0) i1 3.5 (i1,1) (i1,1) (0,0) 3 2.5 (0,0) (0,0) (0,0) 2 1.5 1 1 1.5 2 2.5 3 3.5 4 4.5 5 9 How to add frames: Map the next pixel, update the montage ... 5 4.5 (i1,1) ((i1+i2)/2,2) (i2,1) 4 i1 3.5 (i1,1) ((i1+i2)/2,2) i2 (i2,1) 3 2.5 (0,0) (0,0) (0,0) 2 1.5 1 1 1.5 2 2.5 3 3.5 4 4.5 5 10 MAP-SEEKING CIRCUIT ALGORITHM (MSC) Model images E, E 0 as vectors in Hilbert spaces H, H0, respectively. Given transformation T : H → H0, define the correspondence between E and E 0 associated with the transformation T to be the inner product hT (E), E 0iH0 . Goal: Find T which maximizes correspondence from linear transformations of form (L) T = Ti L (2) ◦ · · · ◦ Ti 2 (1) ◦ Ti 1 , where for each “layer” ` between 1 and L, we have i` ∈ {1, 2, . . . , n`}. For example, we can let Ti1 be some horizontal translation of the image E and let Ti2 be some vertical translation of the image E. 11 ADVANTAGE of MSC over Cross-Correlation: MSC can include other transformations such as rotations, dilations, shear, compression, .... SYNOPSIS: A Map-Seeking Circuit finds a solution to the discrete optimization problem ∗ , . . . , i∗ ) = arg max (i1 L 1≤i`≤n` (L) (2) (1) T ◦ ··· ◦ T ◦T (E), E 0 i i i 2 1 L H0 MSC KEY IDEA (which makes it fast) Imbed the discrete problem in continuous constrained optimization problem. Maximize multilinear form M (x(1), . . . , x(L)) = hTx(L) ◦ · · · ◦ Tx(1) (E), E 0iH0 , where Tx(`) = n X̀ (`) (`) xi T i i=1 SIMPLIFYING PROPERTY: Components of gradient of M can be computed quickly and relatively cheaply via the inner product ∂M (`) ∂xi (`) = hTi ` ◦ Tx(`−1) ◦ · · · ◦ Tx(1) (E), Tx0 (`+1) ◦ · · · ◦ Tx0 (L) (E 0)iH0 12 where T 0 is the adjoint or conjugate transpose of T . MSC can viewed as an iterative algorithm which uses this gradient info to maximize the correspondence. 13 COMPUTATIONAL COST • Computation complexity of each MSC-type iteration is on order of the sum n1 + n2 + . . . + n L , where n` is the number of transformations in layer `, and L is the number of layers. • The complexity of an exhaustive search is the product n1 n2 · · · n L . Convergence tends to be fast and solutions tend to be robust if the data has a “sparse encoding”. Image data pre-processing can provide such a sparse encoding. 14 CONCLUSIONS Using MSC, • We can accurately track eye motion from AOSLO videos, even through saccades. • We can identify very general transformations between AOSLO video frames, including shear and compression. • Register AOSLO video frames to create a de-noised retinal image 15