
>> Zhengyou Zhang: Okay. So let's get started. It's my great pleasure to introduce Dixon Cleveland. He's President and CTO of LC Technologies. I met him last year at a European conference on eye movement, and again this year, last month, at the [inaudible] Conference. I have tried his system. You will see the demo later on. I believe this is the best system available in the area of eye gaze. That's why I'm very excited to have him here to demo the system. And today the talk will be very special. There's no screen. This is the first time I host a talk without any computer or slides. Dixon, please. By the way, he has more than 25 years' experience in this area [inaudible]. It's a beautiful system. You will see.
>>: Why do you switch on the [inaudible] sunglasses?
Dixon Cleveland: You’re an [inaudible].
>>: I don't see where you're looking.
Dixon Cleveland: I'm your user. I need your help. What's wrong with this picture?
>>: I don't see where you're looking.
Dixon Cleveland: Can’t see where I'm looking. You need to know where I'm looking. Why do
you need to know where I'm looking? Why do you need to know where I'm looking?
>>: [inaudible].
Dixon Cleveland: You can see through the glasses.
>>: [inaudible].
>>: More than just eyes.
Dixon Cleveland: Oh, I see. You do it with points on my face.
>>: [inaudible] expression, the eye brows.
Dixon Cleveland: That's exactly right. So the deal is that eyes are really important. We as humans make eye contact. That's just one of the most important means of communication we have. I may be talking to you right now, but you're reading my face as much as you're listening to my words. Computers need to do the same thing. They put up these marvelous displays; you see all these beautiful graphics and these big screens with all kinds of pixels and color depth and all that sort of stuff. But does that computer have a clue what you're looking at when you look at that stuff? It doesn't have a clue. Just does not have a clue. Should it? Yes. It's dealing with a person. We do it. Our eyes are built for it. We've got whites in our eyes right next to these dark irises. So when I look one way or another you can tell immediately my eyes are moving around. It's a communication mechanism. It's the fundamental basis of all this.
So, can computers actually see our eyes? The answer is yes. Eye trackers are out there. They're around. LC Technologies happens to build a pretty good one. This can be done. This whole business of putting an eye tracker on a device: one day it will be a sugar cube sitting right at the very bottom of your device, whatever device it is. Great big screen, fine. Sugar cube sitting right there. All the way at the other end of the spectrum, you're going around with your handheld with a sugar cube right in there looking back at you, figuring out what you're doing with your eyes, communicating with you as if it were a human being that can see.
So I want to talk today about a lot of the physiological basis of what's happening inside our head, how our brains work such that we can make these decisions. What's going on behind our eyes? Let's un-cloud some of this issue of what's going on inside our head. So the whole story basically starts with light. The universe started with light. The universe is an internet of light, going all over. It's going between the planets, it's going between the galaxies, it's going between people; it's actually happening down at a microscopic level. Everything starts with light. So then human life comes along, and we humans, or any life form, have to figure out what's in our environment. Well, we pick up photons. So you have to have some mechanism to pick up the photons; that's the way you figure out what's going on in the world. And so we build these things, and in the human they became our eyes. And it's a pretty fantastic device. It's an amazingly fantastic device. The engineering that went into designing our eyes is just outstanding.
Well, why do we go to all this trouble? Well, we've got to survive, we've got to find food, we've got to do things, see what's going on out there. We've got two objectives in our vision. One is the big one: you've got to be able to see everything. If a bear is over there and that bear is going to come after you, you'd better see it in your peripheral vision. At the same time, if we're going to look at something in detail, so that we can use it in the way we really want to use it, we've got to be able to see with very, very high resolution. And so nature did a funny thing. It decided that instead of just building our camera with uniform pixel density every place, it would put this really, really high concentration of cones at the very central part of your vision; and your peripheral vision has 70 times less resolution than your central vision. This is a beautiful solution to the problem, because you get a very, very high resolution image when you look at something, and you still get to see the entire world. So peripheral vision versus central vision. But this creates all kinds of problems. And one of those problems is that if you want to go look at something you have to point your eyes. So if I want to look at you I've got to point my eyes over there. If all of a sudden I want to look at you instead, I've got to point my eyes over in this direction to look at you. But there is a wonderful piece of serendipity here, and that is that you now can tell what's interesting to me. I'm interested in you, so I look at you. I'm interested in you, so I look at you. That's cool. We are communicating now.
So the reciprocity of optics is a fairly important concept here, and that is: if I can see you, you can see me. Cool. Let's build on that. And that's why eye trackers can be built. We are the best eye trackers there are. You don't need to go buy an eye tracker. You've got two of them right there, and you've got this visual cortex in your brain that just does eye tracking like mad. It's perfect. I wish we could build a system that was just that good. But we're getting close to that. We can start to do that today. It's still pretty expensive, and it's still pretty clunky, but it's doable. The existence proof is out there. You can build those eye trackers.
So what else then happens in the brain? Let's talk a little bit more, staying with the eye for just
a second, let's talk about a couple other parameters. We said we’ve got 70 times the density of
cones in the central vision of the eye than we do out in our peripheral vision. That's true. Well,
what is the scope of that? Basically, if you take that very high central part, the foveola, the
central part of the macular region, and you hold your thumb out at roughly arm’s length, that
foveola covers about the size of your thumbnail. So when you point your eyes, I want to look at
your eyes, I stick it out there, that thumb covers about this much of your face. You're sitting,
what, 10 feet from me? So your eye has to point at least that accurately to get the information
that it wants. So the eyes are pretty damn accurate when they point.
Well, nature has a problem now. Let me back up a second. I forgot an important piece of this puzzle, and that is, if you were to have the same resolution of pixels all over your retina, it turns out that your optic nerve would be about this big around, the visual cortex to process all of that would be about half a cubic meter, and that's a little bit difficult to carry around. It just won't work. So to get high resolution in your central vision, nature really had to concentrate it down, and it went to all kinds of extremes to get high resolution in the center. Out in your peripheral vision there's a cell body for each rod and cone out there. When you start to get in towards the central part of the vision, it can't get the density that it needs by putting a cell body at every location. What it has to do at that point is put the cell bodies right around the outside of the macular region, and nothing but the wires and the receptors themselves are at that very central part, where there's 70 times the cone density that there is in the rest of your eye.
So nature went to all this trouble to do this. And in the process of doing that, it came up with the problem that, well, you've got to point the eyes. You have to move them. So then it came up with the ocular muscle systems. Those ocular muscles are the best muscles you've got in your body. You don't think about them; all of this stuff is kind of unconscious, and we'll get into the unconsciousness trip a little bit later, but what's happening is that those muscles have to be really, really precise and they have to be really, really fast. When I look and focus on you, back to the photon problem, I'm only getting a certain number of photons. So the photons come in, bounce off of you, and some of those photons happen to make it through my pupil back onto my retina so I can see your eyes, and when that happens there aren't many photons left. There are pretty few at that point. So even though we've got zillions of photons floating around in the universe, the number that get into my eyes is pretty small. So nature's got to make use of those photons the best way it can.
One of the ways that it makes use of those photons is this: when I take a picture I have to hold my eyes still for a certain period of time, and that period of time is about 250 milliseconds. So I go over and I fixate, hold my eyes still for about 250 milliseconds, wait for all those photons to come in, and develop enough of an image that can then go back into my occipital lobe and get that image processed and make sense out of it.
So nature's got a problem at this point, and that is it's got to hold that eye still within a couple of pixels for 250 milliseconds. That's a pretty astounding engineering problem. But it does it. Those muscles do it. They are like no other muscles in your body: they never get tired, they can hold your eye extremely stable, and then all of a sudden, bam, they can move your eyes at 300 to 600 degrees per second over a long distance, stop them on a dime and hold bloody still. That's a fantastic engineering problem, and nature solved that problem. It built the ocular muscle system to do that.
So obviously if nature went to all this trouble to design this complicated control system, this
complicated ocular muscle system, it’s important. It really is important to us. I don't mean to
harp back to this idea, but computers, they aren't paying attention to that process that's going
on. One of the reasons that we haven't thought to pay attention to that topic is that it's all
unconscious. By and large, everything we do with our eyes is unconscious.
>>: Quick question, how far back in evolution do you think this whole system developed? Like
how similar are we to like frogs?
Dixon Cleveland: Good question. That's a marvelous question. I don't know the answer to that question. I really don't know. I can speculate, but my speculation wouldn't be any better than yours, so I won't do that kind of speculation. But I don't know.
So you can see now, if this thumbnail is about 1.2 degrees across, that's the extent of the foveola in your eyeball itself, and across the foveola there are approximately 100 pixels, 100 cones. So that means that your eye has to be able to hold still to within about a 100th of a degree for a period of 250 milliseconds; otherwise, if the ocular muscle system weren't that good, why have all that density of cones? It just wouldn't be there. So the balance that nature finally chose is this one with high resolution, but the ocular muscle system has to hold the eye that still.
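A quick back-of-the-envelope check of those numbers (the values below are the ones quoted in the talk, rounded; the thumbnail geometry is an illustrative assumption):

```python
import math

# Figures quoted in the talk (approximate): the foveola spans ~1.2 degrees
# and contains on the order of 100 cones across its width.
foveola_deg = 1.2
cones_across = 100

# Angular size of one cone: roughly a hundredth of a degree.
deg_per_cone = foveola_deg / cones_across              # ~0.012 deg

# "Thumbnail at arm's length": assume a ~2 cm thumbnail at ~65 cm.
thumbnail_deg = math.degrees(math.atan2(0.02, 0.65))   # ~1.8 deg, same order as the foveola

fixation_s = 0.250   # the ~250 ms fixation time mentioned in the talk

print(f"one cone subtends ~{deg_per_cone:.3f} deg")
print(f"thumbnail at arm's length subtends ~{thumbnail_deg:.1f} deg")
print(f"so the eye must hold within ~{deg_per_cone:.3f} deg for {fixation_s*1000:.0f} ms")
```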
The reason I'm going into a lot of this detail about some of these numbers in the eye is that before we can actually design a good eye tracker we need to know what the eye is capable of. We need to build an instrument that's good enough to measure what the eye really does; but if all we are interested in is where people are looking, then we don't need to build it any better than that.
So there's this concept of a Heisenberg Uncertainty Principle of eye tracking. There's this Planck's constant thing in the real Uncertainty Principle which determines exactly how precisely you can measure something that's in motion. The analogous question here is: how precisely do our eyes really have to work? So that's kind of the concept that I'm getting at here. Just how precisely does the eye have to work? And one of the numbers that we just sort of derived here in this conversation was that the eye has to be held still to approximately within a 100th of a degree for 250 milliseconds. That's the engineering design requirement for the ocular muscle system. It's pretty fantastic, and the eye can do that. So if we are going to build an instrument we need to be able to measure that kind of stuff. Well, I shouldn't say we need to be able to, but that's the ultimate target. That's where eye tracking wants to go. Ultimately, that's the objective.
Well, we've been talking about this idea of the eye just holding still and taking a good picture, and that's true. And even if I look at your eyes and my head's doing this, I can still get a fairly good view, because my eyes are rotating during that 250 milliseconds in order to get that nice stable image where I can still see what's going on. So as we are driving down the road and we are bouncing around, we still see everything fine. By the way, you can tell when you're not seeing things fine: when things start to get too blurry, you get dizzy. There's some vestibular feedback that we get on our own that says our eyes aren't working very well. So if you're playing ball and running around like this and you're not dizzy, your eyes are basically getting the information that they need, your eyes are holding steady and your ocular muscles are holding your eyes still with respect to what it is that you're looking at, that moving ball as you go catch it.
>>: [inaudible], but you couldn't read that way.
Dixon Cleveland: You don't think so?
>>: I don't think so.
Dixon Cleveland: Good question. Somebody ought to run that experiment.
>>: I mean you're bobbing your head. Try reading a page of text at arm's length while your head is bobbing; we don't do that.
Dixon Cleveland: Indeed.
>>: You do in a sense. I mean, kids and adults are actually reading in a car. You’re bouncing
around, you're moving around. Some people get car sick.
>>: Walking with a cell phone.
Dixon Cleveland: Yeah.
>>: Walking on a treadmill.
Dixon Cleveland: So there is a degradation. But there's really not quite as much degradation as you might think. The ocular muscles are accommodating a lot of this stuff. They really do; they're marvelous at accommodating it. Well, once you've looked at something for 250 milliseconds and gotten a good, clear image, there's generally no reason, from a photographic standpoint, to continue looking at it. If that environment is constant, if you're looking at a word, once you've read that word and that information goes back into your occipital lobe, and your visual cortex processes it and says, oh, that's the word something, and that word goes up and your frontal lobe processes it, you don't need to look at that word anymore. You need to go look someplace else.
>>: Question. Does the eye evolve from the baby stage to adulthood? Because obviously when a kid is young, sometimes I look at the baby and I don't know what's in his mind or what he's really looking at. And you need to give them some stimulation for that.
Dixon Cleveland: You do. There's a very, very complicated thing, and when you're actually
born you have about 20 times this many connections between the rods and cones in your eyes
than your brain says you need. And there's a big problem of figuring out which cones and cells,
how they fit together geographically because they aren't laid down perfectly; so nature goes
through a process of apoptosis which is program cell death, which is sounds really awful and
weird and blah, but what's actually happening is it’s figuring out which of those connections are
the right ones to make the best use of the rods and cones that exist, and that is a process that
happens during the first several weeks of life. It begins during that period of time, and it really
actually happens up through about five years. And if people, if in the process people don't
untangle that and figure out exactly which connections are the right ones to make your eyes
work for you, you have serious reading problems.
>>: So we can design our eye tracker in that way. We can try various combinations.
Dixon Cleveland: Well, that's beyond the scope of eye tracking at this point. We're going to assume for the moment that people don't have amblyopia. Amblyopia is the disease you get when you've got one eye that your muscles can't control properly, for example, and they start doing funny things; and an eye can go completely blind because that process of apoptosis never does figure out which cells are right, and it just keeps wiping out cells and the connectivity until all the connectivity is gone. The rods and cones continue to work, but your occipital lobe sees nothing. It just doesn't get any data, in the worst case. Anyway, that's getting a little far afield. A wonderful question, an excellent topic, but a little far afield.
So the next thing these ocular muscles have to do is, once you've looked at one thing, you've taken the picture, you've gotten enough photons, and there's no more information to be had there, it's much more valuable now to start looking someplace else, so your eye then saccades to the next location. Bing. Instead of looking at you, I'm going to look at you, and I move my eyes all over the place. During saccades the eye has to move at really high speeds; and as it's moving, from looking at you over to looking at you, you don't perceive it, but the video signal, if you want to think of it in those terms, that goes from your eyeballs back to your occipital lobe, your visual cortex, stops. It's a phenomenon called saccadic suppression.
And saccadic suppression was actually discovered and kind of validated in a cool way, and that is, people actually flashed a light at you while your eye was saccading from one fixation to the next. People didn't notice the saccades or the flashes if they happened during that period of time. So that's where this concept of saccadic suppression came from. Do you actually notice it when your eye saccades from one fixation to the next? No. So that's happening way down at a much lower level, but it brings up this concept that your visual cortex is processing these images, while your perception of the environment happens in a completely different part of your brain. And that completely different part of the brain views the world not in an eye-centric frame of reference; it views it in a world-centric frame.
What's gravity? This room is out there, it's relatively stable, I walk around on a relatively, locally flat earth; so I can just set a frame of reference out there, some inertial coordinate frame. We work out of an inertial coordinate frame, and what you perceive as your environment is in that frame. But your eyes are going around collecting a little bit of data here, then they saccade over to that place with another fixation, they get a little bit of data there, and they're putting this image together. And remember, we're doing all this because somehow eye trackers need to be able to accommodate all this action that's going on.
So there's one other really important part of what's going on in your brain that we need to discuss, and that is all seated in a part of the brain called the superior colliculus. Fantastic chunk of brain. The question is, you can only look at one place at a time. You've got this thumbnail going around; you can put the thumbnail there, you can put the thumbnail there, so how do you choose where to put the thumbnail next? How do you choose where we are going to look next? That fundamental cognitive process of ours is essential to how we live. And you could think of it this way: when we need visual information, our brain somehow is optimizing the process of where to point the eyes. It's a winner-take-all decision: where's the one place I want to put my eyes next to get the most important information to me right now? How does the brain make that decision?
Well, some pretty interesting work was done by Doug Munoz up at Queen's University; and basically what he found was, and this work is now fairly old, it's rooted 10, 15 years ago with a lot of theory before that, but in the superior colliculus there is the equivalent of a map. If you were to take the folds of the brain in the SC and lay them out, you'd find a map. And at the center of the map is your foveola. So this is an eye-centric map. And that map starts off being blank. There's nothing in this map at all. It's empty. And some part of your brain comes up and says, a visual part of the brain if you're reading, it says, well, my fixation is right here right now, and I'm projecting that the next fixation for me to get the next most useful piece of information is over in that chunk of text over there, so I want that next fixation to go over there. So it sends a signal down, it goes into this map in the superior colliculus, and it starts to build a spike saying, I want information at coordinate X, Y with respect to where I'm looking right now.
And if there were no other inputs, eventually that spike would reach a level and hit a threshold and bam, that would trigger your next saccade, and so your saccade would move 13 degrees to the right, two degrees down, depending upon how you have the orientation of your book in this case, and bam, that's where your eye would go next. So if you get philosophical about this and think about what is going on, the superior colliculus is getting inputs from all over the brain. It gets inputs from your vestibular system. So you sit down and you feel something, and you think, well, maybe I need to look at that; it will send a signal into this map in the SC and start building up a spike at that location. If you hear a scream off in the distance and it's your little kid, you'll say, well, that needs some attention, so that part of your brain will send a signal down into your SC and it will start building a spike, and this is pretty important to you, so it will build that spike real fast with respect to some of the other ones you've got.
If you're walking along and you feel your balance is going a little crazy, you're going to trip over something, that part of your brain will say, I need visual information here. That's where I want to look next. It will send a signal to the superior colliculus. So the superior colliculus, this map, has got these spikes building up all over the place. One of those spikes eventually goes through the threshold that we were talking about. Bang. That's where your eye goes. Once it goes there, the map is cleared and starts again. All these pieces of your brain that would like visual attention will start putting their votes into the superior colliculus, and the superior colliculus, it doesn't, quote, make the decision, but it adjudicates that decision. That's where the adjudication of the decision about where your eye goes next is made. And that process is happening how often?
>>: Every 250 milliseconds.
Dixon Cleveland: Every 250 milliseconds. Exactly. And it goes on and on and on, 24 hours a day we are doing that; in REM sleep we are doing that. Do your ocular muscles ever get tired? Never. Do your eyes get tired? Sure. You perceive your eyes being tired, but what is it that actually perceives being tired? It's your eyelids. It's not your ocular muscles. Those ocular muscles say, I'm ready to go, man, I'm holding still, bam, saccade over to there; they're happy. They're just absolutely happy out there all day long. They don't need sleep. They're a lot like the muscles in birds. Once birds start to fly, those muscles, and physiologically there's a lot of similarity between those muscles, just keep going; the birds just fly and fly and fly.
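A toy sketch of the accumulate-to-threshold, winner-take-all process described above (generic accumulator code with made-up bids, rates, and coordinates, not a model of the superior colliculus from the talk):

```python
import random

# Hypothetical inputs bidding for the next fixation, each with a made-up
# accumulation rate (importance).  Coordinates are degrees relative to the
# current gaze point, mirroring the eye-centric SC map described in the talk.
bids = {
    "next word of text":   {"target": (13.0, -2.0), "rate": 4.0},
    "child screaming":     {"target": (-40.0, 5.0), "rate": 9.0},
    "something underfoot": {"target": (0.0, -30.0), "rate": 2.5},
}

THRESHOLD = 100.0
activity = {name: 0.0 for name in bids}

# Accumulate noisy activity until one spike crosses threshold: winner take all.
while True:
    for name, bid in bids.items():
        activity[name] += bid["rate"] * random.uniform(0.5, 1.5)
    winner = max(activity, key=activity.get)
    if activity[winner] >= THRESHOLD:
        break

dx, dy = bids[winner]["target"]
print(f"saccade triggered by '{winner}': move {dx:+.0f}, {dy:+.0f} degrees, then clear the map")
```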
One of the important things about all of this stuff, so you can start to see how this is going on cognitively: all parts of your brain want visual attention, we've got this mechanism for choosing where your eyes are going to go next, and as I look at you I can see where you choose to point your eyes, and that's very important information to me. And that then ties us back to this place where we want to have our computers do the same thing. It is a fundamentally, essentially human process, and we want to duplicate that process as best we can in computers to make them as interactive, as humanly interactive, as we can make them. So if anybody's got any questions at this point, I'm sort of finished with the idea of what's going on in your brain about all of this stuff. Yes?
>>: So how do you explain micro saccades?
Dixon Cleveland: I don't. There are a lot of theories about what micro saccades do. There is the theory of edge detection: you need to move your eye around a little bit, and the theory there is that you want to move it at least one cone, which is a 100th of a degree. But that theory kind of bothers me, because if you do it side to side, how do you get the vertical stuff at the same time? And micro saccades are known not to be cyclical [phonetic]; they go in different directions and they happen at random times. It's a marvelous phenomenon.
And there's another theory that says it's corrective. If you haven't got it centered up exactly where you want to look, then you should go make a correction and center whatever it is that you're really looking at in the central part of your vision. By the way, the central part of the vision actually has a tailed distribution. When I was talking about the foveola being 1.2 degrees across, that's only the very central part, and the distribution drops down, sort of almost like a bell curve, on either side. But first off, those motions are quite small. And unless you want to get into physiological studies, saying the eyeball is moving in a way you really wouldn't expect it to move, to try to figure out whether somebody's got some physiological issue with their eyes, in general that's a different realm of eye tracking. A very important one, and nice applications come out of that area, but that's not our case here. Yeah?
>>: It seems that our system, our vision system has very high reaction to movements, like our
eyes will not be able to avoid looking at moving objects.
Dixon Cleveland: Yes.
>>: Can you make a comment on it?
Dixon Cleveland: Well, that happens mostly with the rods rather than the cones. The rods are basically considered the black and white part of your vision. They can see very well in the dark and stuff like that. But one of the things they are is very sensitive to motion. Motion is important to survival. Generally, when things move in your environment, they are more important to how you interact with that environment than just static things. So that's one of their central roles, and that's kind of a different topic here. But you do get attracted by that, and a lot of the things that happen in the rod system, things you detect at the rod level, get fed back around to the superior colliculus; the superior colliculus builds a spike, and it then moves your eyes to look at that place where there is motion. That's a very well-known phenomenon.
>>: Question.
Dixon Cleveland: Just a second.
>>: Let's say you have more light; do you have shorter saccades?
Dixon Cleveland: Interestingly not. No. I don't know why that is. It's a marvelous idea, and
why nature didn't optimize it that way I'm not really sure. But basically what happens is that
your pupils stop down to make sure that you’ve got a relatively constant amount of light and
then the ocular system continues to operate at its own pace. Yes.
>>: What about the whites of your eyes? [inaudible] difficult to understand what you're looking
at. Is there a physical advantage [inaudible] evolution or is that a social advantage that selects
the whites of the eyes [inaudible]?
Dixon Cleveland: I don't really know. I've never really read a lot of good papers on that topic. So I could speculate one way or another. But as far as I'm concerned there are good concepts in both of those points of view, so I wouldn't say one or the other. The answer is probably yes, and yes. Yeah.
>>: Are there certain patterns of saccadic movement for novel or confusing inputs?
Dixon Cleveland: Yes. That was one of the early areas of research done by a lot of eye tracking researchers: trying to figure out the patterns for novel versus familiar inputs, and that has evolved recently into trying to differentiate whether somebody's a novice or an expert. And so obviously pilots' eye patterns are a perfect example of that. When somebody first learns to fly, they're looking all over, and who knows what they're looking at. And after a while, when they become an expert, they know what to look at. Their eye patterns do change considerably, and so one of the cool applications of eye tracking is actually being able to differentiate when somebody's learned something well enough to be moved over into the expert category.
Dixon Cleveland: There's another interesting phenomenon underlying that too, and that is that when you first learn something you learn it in your frontal lobe. You're conscious about it, you're aware of it. You can't be aware of everything. So as you learn, it transfers to different parts of the brain, and the cerebellum is one of the places that picks up a lot of that stuff. So when you walk, you just walk with your cerebellum; it's in control, and it's unconscious. So the learned knowledge actually transfers from one place to another.
And if you have ever taught somebody how to drive and you're teaching them about stop signs, for example, at first they won't even see the stop signs, then they'll start paying too much attention to the stop signs, and then after that there's actually a dip and they quit paying attention again. And you can actually tell, as the teacher watching this kid, that the frontal lobe is onto the next task; it hasn't quite made it back to the cerebellum part of the brain, that part hasn't learned yet, and they seem to have forgotten something. They're not forgetting; they're becoming an expert. So celebrate that change, don't penalize them and say, oh, you forgot to look at that. Yeah, they might've gotten in an accident as a consequence of it, but that wasn't necessarily slowing down their learning process. So there are those paradoxical behaviors that happen as one learns.
>>: So are there two levels of consciousness? When you're looking at something you may not
be paying attention to it and you might be thinking of something else, but your eyes are still
stuck to that particular spot. Does that make sense?
Dixon Cleveland: It does. That brings up the whole topic that, the way I've been talking so far, it actually sounds as though your eyes are just going all over the place sopping up information as fast as they can, and that is often the model. But there are often times when you don't want visual information. And in that case you'll actually see people get to the point where they'll close their eyes, they'll just go ugh; they're in their own head. Their cognitive process is elsewhere; they don't want to clog up the process with visual information. That's superfluous to them at the time. And that's one of the real problems of eye tracking. I can look at your eye, and I can tell when you look here or there that, yes, that's the more important thing for you to look at at the time, but I can't really tell whether visual information is what's really important in the central part of your brain at the time. So there's a lot of work to be done with cognitive psychology to be able to untangle those kinds of things and really fine-tune the use of eye tracking information. We as humans seem to be fairly tolerant of that. How we do that, the algorithms that we employ in our heads to do it, we need to go study that stuff. You had a question?
>>: Uh, yeah. So you mentioned the fatigue and it being caused mostly by eyelids. Is that
fatigue caused by also adjusting [inaudible] people up and down?
Dixon Cleveland: That's a good question. I don't know the answer to that question, but my
speculation is not very much.
>>: I think related to that I notice like most of these eye trackers use IR light.
Dixon Cleveland: Yeah.
>>: I guess with visible light shining in your face you don’t actually go and, basically your ocular
system knows how to go and stop [inaudible] so they reduce the light coming in, but for IR
[inaudible] apply?
Dixon Cleveland: No. We're not very [inaudible].
>>: And there's often fatigue [inaudible].
Dixon Cleveland: Right. And that's an important topic in eye tracking: we don't want to put too much IR light on the eyes. Two reasons. One is that it dries out the surface of your eye, and that in itself is not good for eye tracking, because the corneal reflection image is not as clear and can even go away, and you get congealed tears and stuff like that, so the ability of the eye tracker to track is not as good as it might be. And the second thing is that typically a point source of light ends up as a point source on the retina. So if there's a lot of light concentrated in one LED, it has the potential to warm up the rods and cones that it lands on, so we have to be a little careful about that. I actually sat on a committee in [inaudible] where we talked about that kind of thing and went back to David Sliney's work way back in the 1970s, when they were first beginning to figure out how much light damage was done, and eye trackers are fairly safe. There's really not a safety issue to speak of. We've paid attention to it at LC Technologies because we work a lot with people with disabilities, and the last thing you want to do is create even any kind of comfort problem with their eyes.
>>: How would you rank the [inaudible] power emitted by your systems compared to, say, the infrared you get walking around outside [inaudible]?
Dixon Cleveland: That's a wonderful way to frame that question. Typically, pretty small. I
forget exactly what the maximum permissible exposure number is, but when you're walking
around outside it's typically 5 to 10 times that in worst case. But also in that worst case when
you're outside you tend to squint fairly seriously.
>>: Worst case for you is brighter?
Dixon Cleveland: Yes.
>>: Okay.
Dixon Cleveland: Yeah. So your eye also has a natural protection method of squinting. But, as you might guess, eye trackers would love to see a naked eyeball out there in space. If there were just an eyeball out there floating, you could aim a camera at that thing and figure out where it was pointing fairly easily; but unfortunately we have these things called eyelids and eyebrows and glasses and stuff like that that kind of cloud the image of the eye. And squinting is a serious problem in eye tracking. It's just a plain serious problem. And it's one that we really do need to address more if eye trackers are going to become absolutely ubiquitous.
>>: On that point, I noticed that your materials online say that your systems work on 90 percent of the population. What's going on with the other 10 percent? [inaudible]?
Dixon Cleveland: We could talk for a whole hour on that topic. And I really dislike putting those numbers out, because there are so many ways that you have to slice that up. But in general, we use the bright pupil method; LC Technologies uses the bright pupil system, and that really provides a lot of advantage for a lot of people, because we get better contrast between the pupil and the surrounding iris and can calculate the pupil center significantly better using that approach, and that gives better accuracy, ultimately. But the bright pupil effect comes from light that goes in through the corneal surface, reflects off the retina, and reemerges from the eye; if somebody has low retinal reflectivity, and there are about one percent of people who have very, very low reflectivity, we have a hard time detecting a bright pupil in that case. In those cases using a dark pupil eye tracker actually is an advantage for those people.
Some people just have very droopy eyelids. As your eyelids come down you can still see just fine, but if the eye looks like this and the eyelid comes down and whacks off the top of the pupil, where's the pupil center? And if that lower eyelid comes up and the corneal reflection is down there... you have to be able to see both the pupil and the corneal reflection in order to predict a person's gaze accurately, and for some percentage of people the droopy eyelids just block too much of the pupil to be able to predict their gaze. Some people just plain have goopy eyeballs. They don't blink enough. And so there's this surface reflection that you get off the cornea that isn't as nice as it might be, and that can get in the way. So there are any number-
>>: [inaudible] dry in that case?
Dixon Cleveland: Yes. The lacrimal fluid in the eye is very complicated chemistry, and you have to blink a lot; and some people, in particular people who have motor-neuron type diseases, their ocular muscles continue to work but their blink muscles don't. So they can be difficult to track. So there are several different dimensions along which these cases can arise, and they're all fairly small, but they accumulate, and you have to be aware of them all if you're going to design for a general population. You have to accommodate as many of these cases as you can.
>>: Question.
Dixon Cleveland: Yes.
>>: How do you solve the glasses problem?
Dixon Cleveland: Well, regular glasses?
>>: Glasses. Yeah.
Dixon Cleveland: Generally eye trackers can work fairly well through glasses.
>>: But they have all these reflections on the same [inaudible]?
Dixon Cleveland: Yes, they do. If your glasses are tilted wrong, the camera is trying to see your eye through the glasses and superimposed on that image is a big reflection of the LED. Happily, most glasses are fit such that that reflection occurs outside of the image of the eye, but not always. A lot of people have glasses that, if you look at them from the side, are sort of tilted this funny way, and those people can have trouble. Another thing with regular glasses is people who have hard-line bifocals; there's a split. So when the camera is looking, it sees the top half of your eye through one lens and the bottom half through the other lens, and it throws up its hands and basically can't handle that. Graduated bifocals don't have that problem explicitly; the camera can still find a corneal reflection and find a pupil, but the calibration of the eye is different if you see it through one power versus another power. So a cool solution to that, eventually, is that you figure out which part of the lens you're looking at the person's eye through, you put some information into the computer about the glasses they're wearing, and you can figure all that out. We haven't gotten that far in the eye tracking industry; those problems are yet to be solved. So that is a good point.
>>: You mentioned the ocular muscles and the superior colliculus function 24 hours a day, even in sleep. Would you expand on that, the sleeping part?
Dixon Cleveland: No. My point was just that it keeps going. It doesn't have to sleep as much, and mostly where you see that activity is in REM sleep, when your eyes do go. So I was making the point that that's not a part of your system that gets tired or anything like that; it's happy to keep going. Yes.
>>: So when you calibrate one of these systems, one thing that I've noticed is that you can perform a calibration that works really great and then come back a day later and the calibration doesn't work on you anymore. I don't know about your system, maybe you've figured out the magic to make that calibration work over a longer time period, but what's going on there? What would cause that calibration to break down?
Dixon Cleveland: Again, there are any number of reasons. And the main one is: what is involved in the calibration procedure? There are two things that you're calibrating a lot of the time, and there are two general philosophical designs toward it. Do you mind if I put this topic off for just a bit? I'm going to get to that topic. I will answer that question. I don't mean to avoid it, but what I'd like to do at this point is talk a little bit about how eye trackers are designed and what some of the requirements of eye trackers are, and that will provide a good basis for answering your question.
So what are the general performance characteristics that you want out of an eye tracker? My personal opinion is that the most important thing, assuming that you've got the ability to track a large number of people and the system will actually find an eye in the first place, once you've solved that basic problem, is accuracy. There are some application cases where accuracy isn't particularly important. When somebody's driving, do they glance up outside the windshield and look down the road every once in a while? You don't have to know whether they're looking at this angle or that angle very precisely. But in a lot of applications with computers, and particularly with small handheld devices where you've got a lot of options, you want to know whether the user is looking at this coordinate or that coordinate. So accuracy ends up ultimately being pretty important if you want to talk about gaze-based interaction with a computer screen.
One of the things that I attempted to do in the earlier part of this discussion was to indicate that the eyes are capable of pointing very precisely. They can probably point, we don't know this for sure, but they can probably point with an accuracy, and a repeatability, of about one tenth of a degree. We know it's at least half a degree, because the foveola itself is about one degree across, and there would be no purpose in our eyes pointing with a half-degree error, because the image that we want to look at would then land at the outer edge of that foveola. It's just not the way we are designed. So we can point to at least half a degree, and a lot of people say, well, that's it, an eye tracker doesn't need to do any better.
Well, there are some nice old eye trackers, the Purkinje eye trackers that were developed by Cornsweet and Crane in the late 60s and early 70s, and those guys actually showed that the repeatability is probably closer to about a tenth of a degree. So, back to the Heisenberg Uncertainty Principle: that really ought to be the target of our eye tracking, to get those kinds of accuracies, because if the eyes can do that well, why not measure that well? And in fact, when you design your screens, the size of your icons is based upon how repeatably and how accurately your eyes can resolve those things. So we want to target for that. So accuracy is important.
The second thing that's really important is that we, as human beings, move around. I'm not standing here giving this lecture standing like this. I'm moving all over the place. And even if we don't want to move, all this research these days says you'd better stand up and not sit at your desk for too long, blah, blah, blah; we've got to keep moving. We're just designed that way. So we need to build eye trackers that can accommodate that.
So those two performance metrics are really central. We want to get accuracy, but we want to allow simultaneous freedom of head motion. So what you'll see on this demo system that we'll show you after the discussion here is how LC Technologies has demonstrated that that problem can be solved. What it actually does is take cameras and put them on a gimbal, so the cameras can move around. Was that our original idea? No. We borrowed it from our own eyes. Our own eyes are gimbaled. So if that's how nature chose to solve that problem, why don't we just solve it the same way? Just move them around.
And so we have the equivalent of peripheral vision and central vision. For the peripheral vision there's a couple of wide-field cameras. In this case we don't move them, we could, but they're actually different and they have a wide field of view, and when somebody comes in and sits down in front of the screen, it sees that there's a face over there and it swings those cameras around; and then these cameras are the central vision. They're telephoto lenses that can zero in on your eye, and they have a fairly small field of view. If you didn't have the gimbal, you'd have to hold your eye in this really tiny space. But what we've found is that if you can image the eye at roughly ten pixels per millimeter at the eye, you can get pretty good gaze tracking.
So that is one of the important things we want to do: get ten pixels per millimeter at the eye. So if you're a head-mounted system up real close, that's easy; a millimeter spans a lot of pixels. You don't have to get very much resolution in the camera. The further you get back, those pixels narrow down, and it's a tougher and tougher problem. And typically, if you want to sit 60 centimeters, or roughly two feet, away from the monitor, you need a fairly telephoto lens to get that sensitivity. You've also got to get enough photons into the camera to get that high resolution image at that point. And remember, the target, though there may be other problems with the noise, is that we are trying to get down to resolutions and accuracies of between one tenth and a half of a degree.
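For a rough feel of what ten pixels per millimeter at the eye implies for the camera (the pixel pitch and thin-lens scaling below are illustrative assumptions, not LC Technologies' actual optics):

```python
# Rough thin-lens scaling: an object at distance d images onto the sensor at a
# magnification of roughly f / d (for d much greater than f).
def required_focal_length_mm(px_per_mm_at_eye, distance_mm, pixel_pitch_mm):
    magnification = px_per_mm_at_eye * pixel_pitch_mm  # sensor mm per eye mm
    return magnification * distance_mm

# Illustrative numbers: 60 cm working distance, 4.8-micron pixels,
# and the ~10 px/mm figure mentioned in the talk.
distance_mm = 600.0
pixel_pitch_mm = 0.0048
px_per_mm = 10.0

f = required_focal_length_mm(px_per_mm, distance_mm, pixel_pitch_mm)
print(f"approx. focal length needed: {f:.0f} mm")   # ~29 mm with these assumptions

# At head-mounted range (say 5 cm) the same pixel pitch needs only a few mm of
# focal length, which is why the close-up case is the easy one.
print(f"at 50 mm range: {required_focal_length_mm(px_per_mm, 50.0, pixel_pitch_mm):.1f} mm")
```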
So then, as you move even further back, you need more and more telephoto lenses. Well, one way to do it is just to continually put more and more pixels into the camera sensor. But if you put too many pixels in the camera sensor, you're getting one photon per hour on one pixel. There's just not enough light out there. You would have to throw so much light on the subject to get the resolution out of the pixels that you want.
So then they say, well, why don't you just take all your LEDs, phase them like an array, point them at the eye, and not illuminate the entire field? Well, nature didn't design that approach into us, so there's not an existence proof that that's a really good way to go. So the approach that we use on our system is to just point those cameras, using the wide peripheral vision. There's a completely different vision system to find the face; you could use that also to do the reading of the facial expressions and the facial gestures, but the key thing that we, for our eye tracking, need out of those cameras is to find out where the eyes are so we can point our eye tracking cameras at them.
And as you'll see back there, this is a great big, clunky, ugly device; and it's expensive. It costs tens of thousands of dollars for that bloody thing. But it's a proof-of-principle model. It can be done. It's a commercially available piece of equipment that we have out there now; it's too expensive and too big, yes, but it demonstrates that the problem can be solved. And so it really comes down now to the place where we need to just put a bunch of good engineers on this job, solve the problems with the optics, use smaller motors, little MEMS devices; if we make a smaller camera we need smaller motors; and can we get this whole thing down into the sugar cube that I was talking about earlier on? Well, maybe not next year. But is that doable? I don't see why not. In fact, I'm confident it can be done. And that's exactly why I'm talking to you guys, because the environment at Microsoft is, let's build this thing. If we've got some reasonable confidence that you can build the thing and have it do what it's going to do and allow computers to communicate with us the way people do with their vision systems, let's do it.
So basically that's the chat, and we can go back and you guys can play with the demo at some
time. But I do want to get back to your question now. So, would you rephrase your question?
>>: Why do you calibrate the system? How long does the calibration last? Why does it change?
Dixon Cleveland: The eye is a tough little tennis ball. Its parameters don't change. So there is no good reason that a calibration you get today shouldn't work an hour from now, a day from now, a week from now, a month from now, even years from now. If your eyeball changed, if the radius of curvature of your cornea, if the flattening of the cornea towards the edges, if the location of your foveola within your retina, if any of those parameters changed, your occipital lobe would throw up its hands and say, what in the world? So that doesn't change. But what happens in most eye trackers is that they calibrate a combination of parameters. When projecting a gaze, remember the concept of gaze prediction. Let me set this up better. You've got a screen, you've got an eye tracker down here, you've got an eye up here, you're looking at a gaze point out here, and that's your direction of gaze. Behind this thing is this whole eyeball, and there is an LED in the center of the lens and it's throwing up light, excuse me, and there's a corneal reflection here someplace and a pupil center, so we have the pupil center and the corneal reflection. If your gaze angle at the eyeball is fixed, all this geometry ought to be fixed. But when you do a calibration we have to know, in theory, the general idea is: what's the X, Y, Z location of the eyeball in space? What's its orientation in space? And when you project that line out, where does it hit the object that you're looking at?
So it's a complicated, almost robotic problem of calculating the gaze point. Most eye trackers do a calibration where they throw all of this geometry, the optics geometry, the spatial geometry of the environment, all into one big model, and they say that the x-coordinate of the gaze is some constant plus some gain times the glint-pupil vector: in any one dimension, x_gaze = C + K * GPV, and then there are all kinds of other polynomial expansion terms on top of that. This is what's called a lumped parameter model, and you can kind of get the feeling that embedded in these coefficients C, K, and all the other higher-order terms in there is all the geometry of the eyeball and all of the geometry of this space, and if any of that changes then you've got to recalibrate.
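As an illustration of the kind of lumped-parameter polynomial calibration he is describing (a generic least-squares sketch, not the formula of any particular commercial tracker; the polynomial terms and all data values are assumptions):

```python
import numpy as np

def design_matrix(gpv):
    """Second-order polynomial terms of the glint-pupil vector (vx, vy)."""
    vx, vy = gpv[:, 0], gpv[:, 1]
    return np.column_stack([np.ones_like(vx), vx, vy, vx * vy, vx**2, vy**2])

def calibrate(gpv, screen_xy):
    """Fit lumped coefficients mapping glint-pupil vectors to screen points."""
    A = design_matrix(gpv)
    coeffs, *_ = np.linalg.lstsq(A, screen_xy, rcond=None)
    return coeffs            # eye geometry and room geometry are all "lumped" in here

def predict(coeffs, gpv):
    return design_matrix(gpv) @ coeffs

# Hypothetical calibration data: glint-pupil vectors (pixels) recorded while the
# user fixates known screen targets (millimeters).  Values are made up.
gpv_cal = np.array([[-12.0, -8.0], [0.0, -8.5], [12.0, -8.0],
                    [-12.0,  2.0], [0.0,  1.5], [12.0,  2.0]])
targets = np.array([[  50.0,  50.0], [200.0,  50.0], [350.0,  50.0],
                    [  50.0, 250.0], [200.0, 250.0], [350.0, 250.0]])

C = calibrate(gpv_cal, targets)
print(predict(C, np.array([[5.0, -3.0]])))   # estimated gaze point for a new sample
```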
So we've actually separated those. When we do a calibration on a human, we've already calibrated the geometry. And you'll see back on that thing that we've got a monitor that's actually fixed to the eye tracker, so the eye tracker down here knows its position relative to the environment. And all we do is calculate seven parameters of your eye, for each of your two eyes. So all we're doing is getting those parameters, and as I said before, because the eyeball is a tough, stable little device, a tough little tennis ball as I like to call it, you don't need to calibrate again.
>>: So those seven parameters are geometric parameters describing the geometry of the eyeball?
Dixon Cleveland: Yeah, that's right. Anatomical, geometric descriptions of your eyeball and your eyeball alone. I've got one other point here. And here's where it really gets down to one of the important problems of eye tracking. As good as your eyeball is regarding its physical and anatomical stability, there's one thing in the eyeball that is a potential problem for eye tracking, and that is the muscles that control your pupil diameter. You have two sets of muscles controlling your pupil. There's one set, the radial muscles, that run radially, and then there's the sphincter muscle. The sphincter muscle sits right around the outside perimeter of your pupil, and the radial muscles are attached in the other direction, so it's the counterbalance of these two sets of muscles that controls the pupil diameter. It turns out that the pupil center does not have to stay exactly on the optic axis in order to maintain a good image in your eye. If it were to drift off a little bit to the right or to the left, up or down, the photons that do get through the pupil would still converge at the same point on the retina. Get the idea there?
So it turns out that as the pupil opens and closes, it does not open and close about a precise, constant concentric point. As the radial muscles contract and your pupil dilates, it might dilate more to one side than the other. At that point the pupil center has actually drifted. Correspondingly, when your pupil closes back down, it might not go back to that same point; it could end up someplace else. So in the pupil-center-corneal-reflection method, the concept is that the center of the pupil represents a known and fixed location in the eyeball, and that's not quite true.
>>: And that also could happen right after the calibration.
Dixon Cleveland: Absolutely.
>>: But that does not fully explain why, the next time you come back, the calibration does not work. Is it because of the [inaudible] cones or the lighting?
Dixon Cleveland: There's another one. Since you asked the question in detail, I'll answer it in detail. An eyeball, the lens of the eye, the cornea sticking out, I'll exaggerate it, the optic axis of the eye, the first nodal point of the eye, and then, exaggerating again, here is the visual axis alongside the optic axis. At the back of the visual axis, right there, is where the foveola is. So when we point our eyes, we point such that the visual axis lands on the thing we want to look at and its image lands right in the middle of the foveola. So one of the issues with eye tracking is: what's this angle between the optic axis and the visual axis? In the optics field that angle is called kappa, and it has a vertical component and a horizontal component.
So if I calibrate and measure kappa, and then somehow I rotate my head 90 degrees, the eye tracker might make the assumption, which is a false assumption, that the orientation of the eye is just the same; it goes back to some equation like this and says, what's the gaze point? Well, the glint-pupil vector, with these two components, has actually shifted some, because the gaze vector, instead of projecting straight out of the eye, is off at an angle; and you may have corrected for that angle beautifully as long as you assume that there's no roll rotation, which is also called wheel rotation or torsion of the eye. Then you take the worst case, where the eye rolls 90 degrees. And now, instead of going out and saying, well, here's where the intercept of the optic axis is, so I'll translate over two centimeters to get to the actual gaze point, your head has rotated, and the visual axis now intercepts a different point on the screen. So if you don't measure the roll angle, you've lost that information.
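To make that concrete, here is a toy calculation (the kappa angles, distance, and geometry are made-up illustrative values, not measured ones) of how ignoring roll rotates the kappa correction and misplaces the gaze point:

```python
import numpy as np

def kappa_offset_on_screen(kappa_h_deg, kappa_v_deg, roll_deg, distance_mm):
    """Offset of the visual-axis intercept from the optic-axis intercept,
    after rotating the kappa correction by the eye's roll (torsion) angle."""
    offset = distance_mm * np.tan(np.radians([kappa_h_deg, kappa_v_deg]))
    r = np.radians(roll_deg)
    rot = np.array([[np.cos(r), -np.sin(r)],
                    [np.sin(r),  np.cos(r)]])
    return rot @ offset

d = 600.0                      # eye-to-screen distance, mm (illustrative)
no_roll   = kappa_offset_on_screen(2.0, 1.0, 0.0,  d)   # correction assumed at calibration
rolled_90 = kappa_offset_on_screen(2.0, 1.0, 90.0, d)   # where the correction really belongs

error_mm = np.linalg.norm(rolled_90 - no_roll)
print(f"assumed correction {no_roll.round(1)} mm, actual {rolled_90.round(1)} mm, "
      f"error ~{error_mm:.0f} mm on the screen")
```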
One of the biggest problems in eye tracking is range. If your eyeball is at this range when you calibrate and then you move your eyeball back, look at the new geometry. The camera now sees your eye from a different point of view, and the gaze angle is actually less than it was when the eye was closer. So that means the glint-pupil vector actually got smaller. You're still looking at exactly the same gaze point, but the projection, the measurement that the eye tracker is able to make of that glint-pupil vector, got smaller. It got smaller for two reasons. It's a double-barreled effect. One is that the angle actually got smaller; the included angle between the camera axis and your visual axis got smaller. But at the same time the eye is further away, and anything that's further away shrinks in the image. So the tracker then projects a gaze point that's too low. And conversely, if your eye were to move forward, it would project a gaze point that's too high. If you don't accommodate all that geometry correctly, and you come back a couple of days later and happen to be sitting further back or too close, you can get these kinds of errors.
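A toy calculation of that double-barreled range effect (the geometry is heavily simplified and every number is an illustrative assumption, not the actual LC Technologies model):

```python
import math

def glint_pupil_pixels(eye_to_camera_mm, gaze_target_mm_above_camera,
                       focal_mm=30.0, pixel_mm=0.0048, k_mm_per_rad=4.0):
    """Very simplified glint-pupil displacement seen by a camera below the screen.
    k_mm_per_rad approximates how far the pupil center shifts (mm on the eye)
    per radian between the camera axis and the gaze direction."""
    angle = math.atan2(gaze_target_mm_above_camera, eye_to_camera_mm)  # shrinks with range
    shift_on_eye_mm = k_mm_per_rad * angle
    magnification = focal_mm / eye_to_camera_mm                        # also shrinks with range
    return shift_on_eye_mm * magnification / pixel_mm

# Same gaze target (200 mm above the camera), two different eye ranges.
near = glint_pupil_pixels(600.0, 200.0)
far  = glint_pupil_pixels(700.0, 200.0)
print(f"near: {near:.1f} px, far: {far:.1f} px")  # smaller angle AND smaller image scale
```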
So the way LC Technologies has solved that problem is with a completely different kind of device for measuring range. Most of the eye trackers out there have a camera in the center and two LEDs, offset to the sides, that illuminate the eye. The camera sees the eye, and in the image of the eye there are two corneal reflections, and the distance between those two corneal reflections is related to range, so you can do a first-order correction of this geometry. But that's kind of fallacious, because it's assuming that the cornea is a sphere. It isn't. There's a lot of flattening out towards the edges, and that's just so the eye can focus better. So if you calibrate looking at some point in the center of the screen and then you look at a different place, even though your range didn't change at all, the distance between those two corneal reflections did change, because the surface is not a sphere. It varies.
So the approach that we've taken in our eye tracking is to solve all these problems explicitly. Rather than having a lumped parameter model that calculates the gaze point from the glint-pupil vector, we go through many steps. We do calculate the glint-pupil vector very precisely, but then we actually do some ray tracing. So as we find that the eye is oriented over in this direction, it accommodates the fact that there's a flattening of the cornea, and that geometry is explicitly calculated when we see a gaze off at an angle. We have a mechanism in here called the Asymmetric Aperture Method, which is able to measure the distance to the corneal reflection on the eye. That's kind of a whole different topic; we could go off on that one for half an hour too, but let's not for the moment. But we can measure the range to the eye without those two LEDs that are offset from each other. Our LEDs are at the center of the lens, and we actually look at the shape of the corneal reflection, and you will see that in there. But that allows us to measure the range.
So we minimize a lot of those effects by doing explicit modeling of all of the ray tracing, taking into account the geometry of the environment and the geometry of the eyeball; as we move back and forth we've got this Asymmetric Aperture for measuring the range to the eye more precisely, finding its X, Y, Z location in space, and therefore more accurately predicting its gaze point.
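As a rough sketch of what such an explicit geometric pipeline looks like, in contrast to the lumped model earlier (this is a generic, textbook-style simplification with assumed eye parameters and coordinates; it is not LC Technologies' actual algorithm, and it does not model their Asymmetric Aperture ranging or corneal flattening):

```python
import numpy as np

def gaze_point_on_screen(cornea_center, optic_dir, kappa_h_deg, kappa_v_deg,
                         screen_origin, screen_x, screen_y):
    """Apply the per-user kappa offset to the measured optic axis, then
    intersect the resulting visual axis with the screen plane."""
    def rot(axis, deg):
        # Rodrigues rotation matrix about a unit axis.
        a = np.radians(deg)
        k = axis / np.linalg.norm(axis)
        K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
        return np.eye(3) + np.sin(a) * K + (1 - np.cos(a)) * (K @ K)

    # Rotate the optic axis by kappa (horizontal about screen-y, vertical about screen-x).
    visual_dir = rot(screen_y, kappa_h_deg) @ rot(screen_x, kappa_v_deg) @ optic_dir

    # Ray-plane intersection with the (pre-calibrated, fixed) screen geometry.
    normal = np.cross(screen_x, screen_y)
    t = np.dot(screen_origin - cornea_center, normal) / np.dot(visual_dir, normal)
    return cornea_center + t * visual_dir

# Hypothetical measured quantities (from image processing plus ranging), in mm.
cornea_center = np.array([0.0, 150.0, 600.0])     # eye position in screen coordinates
optic_dir     = np.array([0.05, -0.20, -1.0])     # measured optic-axis direction
screen_origin = np.array([-200.0, 0.0, 0.0])      # screen plane through z = 0
screen_x, screen_y = np.array([1.0, 0, 0]), np.array([0, 1.0, 0])

print(gaze_point_on_screen(cornea_center, optic_dir, 2.0, 1.0,
                           screen_origin, screen_x, screen_y))
```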
>>: I’d like to leave some time for the people to see the fantastic system.
Dixon Cleveland: Sounds good.
>>: Otherwise we'll continue for another hour. I want to stop you here. Let's thank Dixon.