>> Ken Hinckley: I'm really glad to welcome back Patrick Baudisch. We worked together for I don't know how many years. Probably five years. It seemed a lot longer, but -- we had a lot of fun working together on all kinds of cool research projects. So now he's a professor at the Hasso Plattner Institute, and he's working with some students there and doing all kinds of cool stuff. So he's going to show us what he's been up to. So welcome back.
>> Patrick Baudisch: Well, thank you so much. Can I turn this on here? Thank you. Thank you. I was just saying to Brian, basically what I've been doing in the past year is thinking of this day when I would be back at Microsoft, and working hard so I would not embarrass myself. So let's hope this works out. Okay. So let me start with something old.
So last time I was here, we talked about this back-of-device interaction, and the whole idea was that I said, well, you can't really get precision on the front of a touch screen because the finger occludes the target, so we interact with the back side. And because we did that, we could do all these tiny little devices like watches and clip-ons. This is Amy. Hey, Amy. Do you recognize yourself?
>>: Not [inaudible].
>> Patrick Baudisch: Okay. And you know, I showed you this chart and I said, look at this, with back-of-device interaction we get like a two percent error rate even for these tiny devices, and with front-side touch it can never be accurate because the finger is occluding the target. We talked about occlusion problems, and I said there's a fat finger problem, and the fat finger problem is the fact that the fingertip is soft. So that was the story from my last talk. Or was it?
Okay. So this talk has two halves, and the first one is about high precision touch, and it's basically about admitting that what I said last time was not true. [laughter]. So we've been wondering about this fat finger problem. Why is touch so inaccurate? Is it really that people can't be very accurate, or that the softness of the fingertip prevents it -- what's the problem? And so the question we've been asking ourselves is: is it really that there's something wrong with users' fingers, or could it be that we're just using the wrong equipment, that there's something wrong on the technology side and we're not recognizing touch the way we should? Maybe it's just a misunderstanding between human and machine at this point.
So this first half of the talk is in two parts. The first one is kind of the scientific part, where we try to understand touch again from the bottom up -- what is wrong with touch, why can it not be precise -- and in the second part I'll tell you how we actually made a device that exploits that understanding.
Okay. So the whole thing started like this: let's assume there is no such thing as a fat finger problem; let's just assume that humans are arbitrarily accurate. To make more precise what I mean by that: I mean people are able to point precisely at a sand grain or something extremely tiny, and, to make it even more precise, if they acquire a target twice under the exact same conditions, they will hit the same spot. Okay, so let's just assume that for a second. So if you have a touch situation, you've got a tiny target and you ask the user to tap on this target, and if they do it under the same conditions, they will hit it exactly the same way.
Okay. But if we make this assumption, and yet we do observe a big targeting area, it must come from somewhere. People in the literature today roughly agree that the target size for a touch button should be roughly 15 millimeters, and under some circumstances it can be smaller than that. But why is it 15 millimeters -- where does this problem come from? And so maybe it's this: if users acquire a target twice under the exact same conditions they will [inaudible], but maybe we're not doing a good enough job of understanding what the exact same conditions are.
And so current touch systems basically look at the contact area. Maybe there's more than contact area that establishes this context of exact same conditions.
Okay. So here's the thing to think about: touch is not two dimensional. Traditional touch pads or touch screens gather an X and Y value, and that works because we've got a screen connected to it that's two dimensional, so this is the straightforward thing: we want 2D, so why don't we just sense 2D? What we say is, well, it's not that simple. Touch input is actually much higher dimensional than 2D. Yes, eventually we want 2D, but in the process we first have to deal with a much higher dimensional space.
Okay. So here are the factors that we initially hypothesized could have an impact on that. The first one is actually comparably well understood: it's yaw, the rotation between the finger and the pad. I heard -- I haven't personally verified this, but maybe you guys know -- that the iPhone actually applies an offset. So if I touch an iPhone or an iPod touch like this, it actually adds a little offset to the finger position. And you know that because if you rotate your iPhone around and touch it from the other side -- again, I heard this, I have never experimentally validated it -- touch does not work as accurately anymore, because the offset is now applied backwards, so it kind of doubles the error, if you will.
Has anyone verified that? Okay. I would be curious to hear what Windows Mobile does there, actually.
>>: [inaudible].
>> Patrick Baudisch: Huh?
>>: [inaudible].
>> Patrick Baudisch: Okay. So the next thing is pitch. Cliff -- and I think Daniel's on that paper as well -- did a paper where they noticed that on a tabletop computer, if you reach over a long distance, your finger hits the surface at a different angle. I would say there's a bit of a confound with parallax and head position, but overall it seems safe to assume that there's something about pitch that impacts the touch location.
Something that people haven't looked at yet, but that we did look at, is roll. Certainly with mobile devices people don't always hit the screen at the same angle; the finger can be rolled differently. And finger shape: I mean, what makes us think that two people with very different fingers would hit the surface the same way? But even more importantly, this one, I think, is really the question.
Think of it this way. If I had a grain of sand and I asked you to point at it, what makes us think that all humans would do it the same way? I mean, that's not really specified. The finger is huge and the target is tiny, so somehow we need to make up a mental model of what it means to use this huge thing to point at something tiny. And I can already tell you, it turns out yes, humans are very different. We each do it in a very specific way, we can reproduce the way we do it, but we all do it differently. And that actually makes a big difference, right?
It means that if a touch screen is designed to work with me, it's probably not going to work with you -- or it will work, but only within that space of 15 millimeters that we talked about earlier. We're still looking for a point here; this is me speculating. We solved the whole thing with brute force, but we're still looking for a simple, closed-form representation of how people mentally represent touch.
Okay. There might be more factors, such as head position. So we ran a user study to validate this. It's a very simple study. We used a FingerWorks touchpad, and all people did was touch the center of this crosshair; in between, they touched an OK button to go back and forth, to make sure they weren't just tapping the same spot over and over. We had a bunch of different conditions. We rotated the pad for half the trials -- this gives us a chance to separate out the actual target location; you'll see this in a second -- and that basically simulates yaw.
We instructed people to hold their finger in different postures, from 90 degrees, which produces this, to tilted way over to the other side. So pretty extreme angles overall, from 90 degrees down to 15 degrees of pitch.
And then there's the user variable. We didn't do anything special about it; like any user study, you have a certain number of participants, and we just looked at the amount of spread and deviation we get between these people. But keep in mind these 12 participants were all students, so whatever spread we observed between them, the spread in the actual world is probably going to be larger than that. So whatever effect we find, this is a very conservative estimate of the spread between users we're going to find.
Okay. That's what the apparatus looked like. This is one of the students; he sees instructions there. And we used a foot switch to trigger the whole thing, to make sure we weren't getting any weird take-off artifacts.
Okay. So about the units I'll be presenting the results in, in a second: keep in mind that the error in touch is all subject to calibration. There is no exact way of saying this is the target and this is the matching location; it's all a matter of calibration. Even though on a touch screen the input and output are on the same device, they're only aligned by means of calibration. So what we did was we looked at the spread, and we basically made the assumption that we're calibrating post hoc. So if a user's touches in a certain condition are all a little bit off, we can fix that later with calibration, but the spread that we're getting we cannot fix, right, because spread is basically just noise. So if we find a spread of 7.5 millimeters, that means roughly that we would make a button that's 15 millimeters across.
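To make that post-hoc calibration idea concrete, here is a minimal Python sketch -- my own illustration, not the study's analysis code -- assuming the touches come grouped by condition and taking a 95% radius as the spread criterion:

```python
# Minimal sketch of post-hoc calibration: subtract each condition's mean
# offset (the part a calibration can remove), then measure what remains.
# The data layout and the 95% criterion are assumptions for illustration.
import numpy as np

def residual_spread(touches_by_condition):
    """touches_by_condition maps a condition tuple (user, yaw, pitch, roll)
    to an (n, 2) array of touch offsets from the target, in millimeters."""
    residuals = []
    for offsets in touches_by_condition.values():
        offsets = np.asarray(offsets, dtype=float)
        residuals.append(offsets - offsets.mean(axis=0))  # calibrate out bias
    residuals = np.vstack(residuals)
    # Radius containing ~95% of the calibrated touches; a button would need
    # to be roughly twice this across (7.5 mm spread -> 15 mm button).
    return np.percentile(np.linalg.norm(residuals, axis=1), 95)
```

The more factors you fold into the condition key, the more of the observed deviation moves into the correctable per-condition means, and the smaller the residual spread gets -- which is exactly the effect the study measures.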
And we hypothesized that all these factors were meaningful. Twelve participants, six [inaudible], and here are the results. I'll show you some summary data in a second, but overall what we found is what we wanted to see. If the factors we looked at were not meaningful, we would expect to see one big cloud, where all the touch locations would be spread out over the same area regardless of pitch and roll. And what we found was that indeed all these different conditions each got their own little galaxy. So what you find is two big blobs for the different yaws, and these are different pitch angles and so on. I'll show you more about this in a second.
So this first thing is really important: we actually found that there was structure in the data, and that structure is what gives us high precision touch. Okay.
So here's the summary data. We took the same data and parsed it with different knowledge. If we parse it with the knowledge that current touch screens have, the 2D model, we get what you'd expect, which is that 15 millimeter target size, corresponding to a 7.5 millimeter spread. That means if all we do is extract X and Y, we get this huge spread.
At the opposite extreme, if we throw in all the knowledge that we have, which is roll, pitch, yaw, and user ID, we can actually narrow the error down to a third of that, which is a substantial improvement. So you could also think of it as being able to fit eight times more screen elements into the same area -- assuming that people have perfect vision, which is obviously not the case.
Or, for me personally, since I care a lot about very small mobile devices, we could make very small mobile devices out of this. I'll show this again in a second, but basically the first big step is that if you know about yaw, the rotation of the pad, that buys you like 20% accuracy out of the gate; the next step is that if you know about the other angles, roll and pitch, that buys you another improvement; and then knowing about the user is actually quite a substantial improvement again at the end.
So the more you know about touch, the more of that observed deviation you can eliminate, and the more precise your touch gets. Okay. So I'll walk you through it just to give you a sense of what the data looks like.
So this is data from one participant. First of all, these are the recognized touch locations, and this is where the actual target is located. As you see, this participant always touches a little bit too low and a little bit further to the right. And the other side is just mirrored. So once you know that, you can actually merge that data if you have the yaw rotation. If you didn't have that, then you would basically have to deal with this huge cloud. But interestingly, users never touch the actual target; since they're always too low, oddly enough there's a hole in the middle where they never touch.
>>: [inaudible].
>> Patrick Baudisch: This is the rotation of the pad, okay? Since people eventually touch a little lower than they intended, from any angle you will get this effect.
Okay. So if we eliminate that part, what's left over you can now partition based on roll, and you find that every value of roll has its own little galaxy. Again, this is one user; I'll show you in a second how it looks across users. And you get the same effect for pitch.
So if you didn't know about pitch, you would have to deal with this type of spread. If you do know about pitch, then you can deal with these substantially smaller little galaxies instead.
Okay. So now let's look at the user data a little bit. Do you see any differences here? Every column is one user -- I think these two kind of stand out, right? I mean, these two people have nothing in common, basically. This person can hugely benefit from some sort of post-hoc calibration where we apply offsets, and this person hits everything the same way independent of finger angle, basically. It's very interesting.
I'm kind of speculating here: this person might be someone who has used a mobile device for a long time and has kind of internalized the model -- maybe he's been conditioned by his touch device.
>>: I'm going to ask the question. Why didn't you have some distractor targets?
>> Patrick Baudisch: I [inaudible] that question enough from you, Andy.
[laughter].
>>: As you know, I have this notion that the way you select an object depends on the local context. Right? So if I crowd a bunch of things together, I would expect users to take more care in selecting an object.
>> Patrick Baudisch: So what people did in the -- okay. So I think there's a distinction here, an important distinction. They're not using the mouse, right? So first of all, if you have CD gain corrections and stuff like that, then I think distractors along the way make a big difference. In this case, though, how would you have done it? I'm not quite sure. People basically had to get their finger onto the crosshair every single time, right? Where would you put a distractor? How would you design that?
>>: Well, I think you need to give some penalty or cost for being sloppy. It could be that participants three and four are just participants --
>> Patrick Baudisch: No. I'm glad you -- okay. Now I understand what you mean. And the simple answer is no. Let me tell you why. [laughter]. We totally had this discussion -- we had exactly this discussion. The phenomenon you're looking for is a participant that looks like this and another participant who looks like that. If you see this, however, it is never the result of negligence. Do you understand why? I mean, this person is basically completely reliable. These two people are equally reliable. For a certain pitch condition, this person got his 30 touches into the exact same tiny circle. Do you see that?
So for a given finger angle -- we said, now do this with like a 30 degree roll or something, right? -- this person got his 30 touches into this circle, and that person got his 30 touches into that circle. It's a systematic effect. It's not negligence, right? If a person were negligent, the touches would scatter all over. But if you look at the data, basically all these participants are equally good. Every single circle that you see is roughly the same size.
We put a prize out for the most accurate person.
>>: [inaudible] like imagine that your crosshair had -- like, [inaudible] if you miss it by five pixels you lose the game, basically. So if there are these other targets nearby the one you pick, people are going to approach it differently; they're going to be like, I need to be careful --
>> Patrick Baudisch: Everyone is careful, right?
>>: Right.
>> Patrick Baudisch: So I mean --
>>: I'm not saying that -- but there could be an effect where, if there are other targets nearby, it might affect how you approach it. That's what Andy is saying.
>> Patrick Baudisch: In Germany [inaudible] there were electric shocks, right?
[laughter].
>>: [inaudible]. I think we actually have a pretty good model of contact size even outside the digital world. It is a question of whether all these ranges of pitch and yaw are actually relevant in the context of --
>> Patrick Baudisch: That I give you.
>>: [inaudible] with a smaller pad.
>> Patrick Baudisch: Fair enough. What you have to say, though, is that even the 90 degree angle, for example, actually does happen in certain situations, if you think about it --
>>: That would be like the most careful pitch, right? If I'm trying to get something really small, I think [inaudible] pretty good model that I have a smaller [inaudible].
>> Patrick Baudisch: Based on what you see here, that's true, right? Because at least for this participant that's the smallest circle, right? Absolutely. So, yes, you could argue that we sampled too large a space, and that if you used the subset of it that matches real touch screen use, the control condition would actually produce better results. The results of the improved touch would be the same, but the control condition would then perform a little better. Absolutely, yeah. Fair enough. Yeah.
>>: Another thing is that [inaudible] introduce more confounds, like [inaudible], because as soon as you make it react [inaudible].
>>: You're testing a different phenomenon.
>> Patrick Baudisch: People get no reaction whatsoever. All they do is hit the same thing. We wouldn't quite have known what to tell people if they made a mistake, right? There was no target size, for that matter. I'm sure we could have given people targets of different sizes, but really it was a crosshair, and that was well understood. The hypothesis we tested was that people under the same conditions would be able to get their finger into the same position, and the answer is yes, they can. And the spread that they actually get is very small. The spread that people have in that range is like that.
The main reason why we lose so much precision is that some people have this model here: if the finger is at a different angle, they just have a different way of doing it. And then there is this huge difference between people. So that means if you wanted to make some sort of correction, you cannot just make a touchpad that puts one correction in place; you need to know who you're dealing with.
>>: [inaudible] bigger circles? Like some of the large --
>> Patrick Baudisch: Oh -- so this has a scale here. This is a centimeter.
>>: [inaudible].
>> Patrick Baudisch: Yes, oh, absolutely. Basically what comes out of this is that you could have a target size of five millimeters and people would hit it with 98 percent accuracy if you apply all the corrections. I'll talk about this in a second, because we put in some engineering after that to actually build this.
Okay. So users are different. So if you wanted to get some of this effect, you would somehow have to find a way to implement these things. Here's a quick summary: a regular touch screen has to deal with this type of spread. If you have a device that knows about the pad orientation, such as the iPhone, you have to deal with less spread, and the more factors the device knows about, the smaller the circles you have to deal with get. But you also need to know about the user.
Okay. So we thought, well, shouldn't we be able to build something like that? How would we make one? Here is the first answer: we all have a Vicon or an OptiTrack at home, so that's easy.
>>: One more question. [inaudible].
>> Patrick Baudisch: They were all right-handed. We actually selected them all to be right-handed because we didn't want to deal with any mirroring effects. Do you think that left-handed people are structurally different from those people?
>>: Well, no, I was just wondering what type of mirroring you would need.
>> Patrick Baudisch: I mean, for us, what we're basically saying is that there's a systematic bias. Any person who comes in with even more variability is good for us, right? Because the more systematic error there is, the more error we can take out with a better calibration. Right? Yes.
>>: When you had the [inaudible] doing this, how much were they concentrating on actually [inaudible] the target? Because I come from entertainment software. People tend to be --
>> Patrick Baudisch: Yeah.
>>: Sloppier when they're more relaxed and it's not necessarily [inaudible].
>> Patrick Baudisch: So what we did was -- I mean, first of all, when you look at this, I would say they were always concentrated. For this person, the very flat pitch actually produced a larger interval, but overall there's no outlier here. If somebody were acting sloppy you would see large circles in this, and you don't. I think we put out 40 bucks for the person with the highest precision overall. And then it turns out there are different ways to compute that. So I hope everyone is -- [laughter]. Christian unfortunately first talked to someone and said, I think you're the most accurate one, and then he changed his metric, so that was uncool. [laughter].
>>: [inaudible] impress Patrick, too, so --
>> Patrick Baudisch: No, no. You have no idea. German students are not different from American students in that sense. They're very full of themselves, actually, believe me. Okay. So, how to make this thing. The first thing we can do is use an OptiTrack. This is my other student Shawn, by the way. We basically glued a little marker plate onto the last segment of his finger, and we can use that to track it. The OptiTrack is a wonderful device: it gives you six degrees of freedom at point five millimeter accuracy. You have these retro-reflective markers, and you add 6 to 16 cameras. It's probably not exactly practical for everyday use, but it's a wonderful way of getting ground truth. Within point five millimeters we know exactly what we're getting.
And so that also means we used it in the user study as a baseline condition for what could possibly be done. And then we thought we should try to make this a bit more mobile, maybe. So here is a little quiz. This is a fingerprint. What part of the fingerprint is this? The top. How is the finger held? Pointing down. Good.
Next question is a little harder. How is that finger oriented?
>>: Rolled.
>> Patrick Baudisch: It's rolled? Left or right?
>>: Left.
>> Patrick Baudisch: It's left, right? Because you see the core point being shifted to the right. Wonderful. Okay. So it turns out that if we look at a fingerprint, we get everything that a traditional touchpad gets, but by also looking at the ridges inside that contact area, we can extract exactly the variables we care about, which are roll, pitch, and yaw -- and the beauty is also finger ID, because that's exactly what fingerprint scanners do.
So basically -- well, this is not exactly how we do it, but you could think of it as: we try to find the core point, which is the center of the ridge pattern, and we look at its specific location inside the contact area, and that tells us how the finger is oriented. But you can get subsets of this out of other devices. For example, on Surface we know that we can get yaw, and that's definitely something that should be used.
Well, we did a little research, and it turns out people already make devices that do these things. So if you've left the country recently, you already have experience with this one. This is a Guardian; it's used by American immigration. I'm not showing the most expensive feature -- it's on this side, which is the license to be used by immigration, and that makes this thing outrageously expensive. Fortunately the German distributor was kind enough to let us have one for a couple of months.
It's based on FTIR, so it's basically similar to the interactive tables, but the FTIR is kind of backwards. The finger is actually flooded with light, and then the camera looks at an angle across the surface and gets a wonderful picture. I can actually show you what that looks like -- this is kind of the realtime data you get out of this thing.
By the way, what is happening here -- is the finger being dragged or rolled?
>>: Rolled.
>> Patrick Baudisch: It's kind of nice, right? You can see all of this -- and now it's being dragged. There are different flavors of these scanners. Some have a silicone pad on top, which gives you a slightly better picture. If you actually go through immigration -- I don't know, most of you are citizens and probably didn't have to do this -- they give you the full treatment, but they have to wipe the thing between every two fingerprints. So this is one with a glass surface, and it's kind of cool because it stays good for like half an hour or so before you have to clean it, which is obviously desirable for an interface like this.
So, right, we just talked about this. This, for example, is roll, because the core point is in different locations, and that would be drag. And that's one of the things that makes us different, just in case you were thinking of the MicroRolls paper from CHI last year. They called it roll, but they're not really tracking roll; they're just looking at little movements, and they have no idea whether you're actually dragging or rolling.
Okay. So this is kind of cute, because the algorithm we envision is extremely simple, and there are 50 ways of doing this more cleverly. But what we basically do is have the user touch the target from different angles, and we just create a database where we record -- sorry, we basically just record the fingerprint and the target offset associated with it. And then during use, all we do is look up the fingerprint in the database, find the closest matches, and aggregate their prerecorded offsets.
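As a rough illustration of that record-and-look-up loop, here is a hedged Python sketch; it is not the authors' code, and ORB features stand in for the SURF features mentioned a moment later (SURF lives in OpenCV's non-free contrib module):

```python
# Hedged sketch of the fingerprint offset database: store (features, offset)
# pairs during training, then correct live touches by the aggregate offset
# of the best-matching stored prints. Brute force, like the talk admits.
import cv2
import numpy as np

orb = cv2.ORB_create()
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
database = []  # (descriptors, (dx, dy)) pairs, one per training touch

def record_touch(fingerprint_img, target_offset):
    """Training: store the print's features with the observed offset."""
    _, desc = orb.detectAndCompute(fingerprint_img, None)
    if desc is not None:
        database.append((desc, target_offset))

def correct_touch(fingerprint_img, raw_xy, k=5):
    """Use: find the k stored prints with the most feature matches and
    subtract the average of their prerecorded offsets."""
    _, desc = orb.detectAndCompute(fingerprint_img, None)
    if desc is None or not database:
        return raw_xy
    scored = [(len(matcher.match(desc, d)), off) for d, off in database]
    scored.sort(key=lambda s: s[0], reverse=True)
    dx, dy = np.mean([off for _, off in scored[:k]], axis=0)
    return raw_xy[0] - dx, raw_xy[1] - dy
```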
So three days before the CHI deadline, Christian comes into my office and says, I think this is good, we can run this -- it just turns out it takes 10 hours per participant to actually run this algorithm, and we had like 12 participants. But fortunately he basically created a bunch of virtual machines, installed them on a bunch of computers, and just crunched this on six machines overnight. So we're not quite realtime yet.
But we're also being extremely brain dead about it. All we do is a feature extraction using a SURF algorithm -- we basically extract a handful of features for every fingerprint -- and then it goes brute force through the database and picks the fingerprints with the highest number of matches. That means we're not even bothering to look at outlines or rotations or any of that; we're just looking for feature matches, and it turns out that if you have a good number of samples, that gets you there. But you could instantly think of 50 ways to make this better, and clearly that's something we need to do.
It's also nice because -- just to hammer this home -- it handles the rotation of the finger as well as the user matching. It's all the same mechanism. Okay.
So we did a second user study to validate whether this is any good. We had three experimental interface conditions. The first one, as I said, was the optical tracker, as kind of the ground truth of what could possibly be done if you had perfect tracking. Then we had the fingerprint condition. And the third condition we called simulated capacitive, which means we used the data from the fingerprint scanner but threw the fingerprint away and just kept the outline and got its centroid, which is basically what capacitive systems do.
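A minimal sketch of what that simulated capacitive baseline might look like, assuming the scanner image arrives as a grayscale array (the threshold value is made up):

```python
# Hedged sketch of the "simulated capacitive" condition: throw the ridge
# detail away, keep the contact outline, and report the blob centroid --
# the single (x, y) a capacitive controller would deliver.
import cv2

def simulated_capacitive_xy(fingerprint_img, thresh=64):
    _, blob = cv2.threshold(fingerprint_img, thresh, 255, cv2.THRESH_BINARY)
    m = cv2.moments(blob, binaryImage=True)
    if m["m00"] == 0:
        return None  # no contact detected
    return m["m10"] / m["m00"], m["m01"] / m["m00"]
```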
Okay. We had, again, rotation, roll, and pitch. And the hypothesis was that the optical condition, which is basically the cleanest possible implementation of our idea, should beat the simulated capacitive by a factor of three, because that's the factor we observed in the first study; and we hypothesized that the fingerprint condition would also beat capacitive, but we didn't know by how much. It was kind of clear it wouldn't quite reach the gold standard, but we didn't know how close to the gold standard we would get.
Twelve participants again, 650 trials per participant, and here's what we got. By the way, there's a hiccup with the scale here -- it's just plain wrong, so ignore the scale. And now, of course, everyone is looking at it.
Okay. Simulated capacitive. The difference here is basically the expected factor of three, and it turns out the fingerprint approach worked pretty well, actually. We're getting a substantial improvement -- actually, if I do the math, I think it's more than twice as accurate as the control condition. There's still some potential for improvement here, which is basically that the tracking based on the feature extraction is not quite as accurate as the OptiTrack. But I think it's clear at this point that, if we map this back to the old scale, we're basically down to being able to acquire targets of like six, seven millimeters across, compared to the 15 millimeters earlier.
Okay. So, conclusions for this part. Based on the stuff we're doing right now, we can basically get more accurate touch. Think of your mobile device with an onscreen keyboard: you could envision not having to use the dictionary quite as often, because you're actually hitting the targets right away. And then targeting aids -- you know, I've definitely contributed my part here. We use a lot of targeting aids, where instead of plain touch -- I mean, there's first a ballistic movement to touch, but then we use a targeting aid at the end, something like Shift or offset cursors or something like that, that reveals the target. They work really well, but it's not touch anymore to a certain extent, right? They take time, and I think they take away a little bit of the walk-up experience that touch systems have.
So I think with what we're proposing here, you could envision having to use targeting aids less often, and that would be beneficial. And then, personally, I'm excited about this because I care a lot about very small mobile devices, and you could envision taking a mobile device -- assuming that you have the [inaudible] of a 12 year old -- shrinking it to half size, and you would still be as accurate as before.
And I think the lesson learned, from a more theoretical point of view, is this: whatever we thought of as the fat finger problem -- there's probably some sort of fat finger problem, some inaccuracy that comes just out of human skin somehow. But that's really, really not where the problem comes from. The main problem comes from the fact that we have traditionally oversimplified the notion of touch. Touch is not a 2D thing. We want 2D out, which is why we've always done 2D, but in order to get there, it makes sense to think of this as a much higher dimensional problem.
For next steps, there's a whole bunch of things we want to do. Clearly we need to speed up our algorithm a little bit; I think we roughly know what to do there. What kind of makes me excited -- many of you have probably seen this -- is that there's some in-cell technology for touch screens that is basically an RGB camera in the surface, if you will. So you could envision, if they actually get the optics right, that you could get fingerprints out of your LCD, and we hope that in some more distant future we could actually get this into a mobile device. You would buy a mobile device that looks just like the one you have right now; it would just be much more accurate. And I think that would be a nice feature to have.
So, I've got more stuff; this was the main thing. I've got like another 15 minutes of material on something more tangible in tabletops. But I just want to show a picture of Christian. He's the student I'm doing this with. He's the guy who was able to do not only the science behind it, but also to [inaudible] this out to six machines three days before the CHI deadline.
Gerry Chu, who interned with me earlier, also contributed a lot of intellectual input to this project.
Okay. So I'll be happy to take questions, and then I've got more goodies. Pass these around and take one out and play with it; this is for the second half of the talk. I'll talk about it in a second. Okay. Dan.
>>: So projecting forward, assuming you actually made the UI, the kind of device that took advantage of this, do you have a way of measuring fatigue? In other words, yes, maybe I can accurately hit these targets, but maybe it's more exhausting to do that finer-grained technique.
>> Patrick Baudisch: Yes. Absolutely. I think you're absolutely right. It's the same with the visualization aspects, right? Just improving the accuracy of your touch mechanism doesn't make everything twice as good, clearly. It's the same as if you cure cancer: it's not that people live twice as long; there's always the next thing right after that that limits your system, clearly.
Ed?
>>: So a lot of the distributions of the touches you had were really strongly skewed distributions.
>> Patrick Baudisch: Yes. Those are the very flat angles.
>>: Right. So I'm just wondering -- even when they're coming down, they tend to be a little bit skewed that way. But yet when you do your calibration, you're decomposing it into an even circle.
>> Patrick Baudisch: So clearly we're getting the worst case by overlaying these -- the worst possible angle gives you the limit of the accuracy you get out.
>>: Right. So I'm just wondering, you know, what happens if you -- can you start to close that down a little bit? Are there regularities?
>> Patrick Baudisch: I mean, those areas that have that comparably large spread -- which is still small compared to what we deal with today -- are the situations where the finger is nearly parallel to the surface. And honestly, that I would say is a fat finger problem: it's the fact that you really have no idea what part of the skin is going to come down first.
I will be able to talk to that next year, I guess. The reason is that what we're trying to do is come up with a closed-form representation of what this all means. What do we think when we touch? If I put a pile of sand here and one of the grains is red and you point to it, how did you make that decision? The fact that you can reproduce that same motion repeatedly means that you must have some conceptual model, which is then apparently different from other people's. What is that model? Is there one parameter that we can boil this down to, so that everyone has like an index? Then everything else would be easier.
>>: [inaudible] I'm just wondering if you've thought at all about applications for people who have tremor or other issues with targeting, and whether you could use this to provide an assistive technology.
>> Patrick Baudisch: Yes -- this is going to make it worse, seriously.
>>: I mean, I was just thinking if you could apply similar kinds of ideas.
>> Patrick Baudisch: Yeah, yeah, yeah.
>>: To.
>> Patrick Baudisch: No, I mean, as for the clever things you can do for people with motor disabilities, I think Jacob Wobbrock has some really cool ideas there; I'll be happy to point you to that afterwards. I think this makes it worse, because this doesn't help people with motor disabilities, but some designer will start making things smaller.
>>: [inaudible] things smaller.
>> Patrick Baudisch: And that's really bad for people with motor disabilities, right?
>>: I was just thinking if you could use a similar sort of idea for adjusting for tremor, or for --
>> Patrick Baudisch: I think for tremor there are good solutions. Jacob Wobbrock has done this thing where he looks at perpendicular motion on the mouse pad, for example, and uses that to reduce -- I mean, reducing CD gain seems to be the classic way of helping people with motor disabilities, and that's a very sensible way of applying it.
We're doing a project right now that I've actually been thinking about for three years -- people probably can't hear it anymore -- which is how to automatically adjust mouse gain. We're basically logging mouse use -- we're just observing people use the mouse -- and we're trying to distinguish the people for whom the current speed is comfortable from the people for whom it is not. It turns out that if people calibrated their systems they could do a lot, but many people don't calibrate their systems. They don't know how, or, you know, whatever it is.
And so if we can make that possible, that's going to make a difference. Okay. I should probably move on to the second half so we don't run over too much. But I'll be around afterwards, and I'll be happy to take more questions.
Okay. So here's a project I had a lot of fun with, and we ended up calling it Lumino. The little blocks I just passed around are some of the blocks we made -- we made a whole box of them. We spent a lot of time on this, and it was kind of my excuse for buying a buzz saw and a belt sander. As I just said to Ken earlier, the women's restroom has this extra room in the back that is usually for cleaning supplies, and that's like my new shop now. The rest of the Hasso Plattner Institute is all software people, software and systems people, and there's a bit of an uphill battle to convince people of the special needs -- you know, why we need an expense like a buzz saw.
Okay. So this is what Lumino does. This is a Microsoft Surface, and it shows on the table what we have built on top of it. So this is there because of this. And in this case, for example, it can recognize that this is an overhanging structure, and it might warn the person making the construction -- think of this as rapid prototyping for architects, or whatever you want it to be.
And it can do that because it recognizes 3D structure on the table. The secret is that these blocks are made out of two things: there is a fiducial marker, which is basically the stuff we all use for tangible objects on Surface, but there's something else in there, which is a fused glass fiber bundle, and that actually allows the thing to look through, and some secret sauce allows us to see what happens on top. I'll explain this in a second.
So, depending on what direction you come from, you can look at this project in different ways. If you come from tangible computing, then a blocks world is really an old thing for you; you're not excited about this. But maybe you appreciate the fact that this is a solution based on photons -- this actually uses the table without plugs and stuff like that.
If you're coming from a tabletop background, you might appreciate the fact that even though we ended up calling it Surface, it's actually not flat, and that there's actually a potential to build three dimensional things, and I'll show you a couple of interesting widgets we built based on that. Okay. Because we don't have very much time, I'm skipping the next 30 slides. There you go. That was quick, right? So the main idea: it's self contained because it's a tabletop, it's back projected. If you calibrate it once, it's going to stay calibrated. But most of all, the individual blocks have no batteries, because it's all optical. And as trivial as that sounds, it's actually important. If you want a reasonably sized tangible application, it's really no fun to have 50 things with batteries in them.
Okay. So here's how it works. Well, you know, I don't have to explain this picture. Okay, let me explain anyway. There's a diffuser, there's a camera, there's an illuminant, and there's a finger. [inaudible] the finger is higher, the reflected picture gets diffused, and the higher above the surface it is, the more the picture gets diffused -- and that's exactly what you threshold to find out if a touch [inaudible] properly.
>>: [inaudible].
>> Patrick Baudisch: Good. Okay. That was the fast version. But it applies to everything. So this is a marker that's in direct contact. Ignore the [inaudible] here; that comes from -- that's a different factor. This is what a marker looks like that ships with Surface. You make these, Andy, right?
>>: No.
>> Patrick Baudisch: Who came up with this marker concept?
>>: I believe Nigel King and --
>> Patrick Baudisch: Okay. Fantastic. So this is what a marker looks like that's directly in contact with the surface, that's what a marker looks like that's five millimeters above, and that's a marker that's 10 millimeters above.
So the diffuser is pretty radical: anything that's not directly in touch with the surface you typically can't parse anymore.
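A tiny sketch of that thresholding idea, under the assumption that a sharpness measure such as the variance of the Laplacian separates crisp (touching) content from diffuser-blurred (hovering) content; the threshold value is invented, and the actual Surface pipeline is not public here:

```python
# Hedged sketch: decide contact vs. hover by how sharp the reflected image
# is, since the diffuser blurs anything that sits above the surface.
import cv2

def is_in_contact(gray_img, sharpness_thresh=100.0):
    # Variance of the Laplacian: high for in-contact content, low for
    # content raised above the diffuser. The threshold needs calibration.
    return cv2.Laplacian(gray_img, cv2.CV_64F).var() > sharpness_thresh
```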
So here's the trick that we're doing -- there are two tricks, actually. The first one is that we're using these glass fiber bundles to prevent the table from diffusing. Here's how it works. This is one of these [inaudible] -- these are the things that are floating around right now. If you're not familiar with fiber optics: the picture that appears here shows up here because every single pixel, if you will, gets trapped in a glass fiber. Do I have pictures of that? Yeah. So these are the individual fibers in there. If you happen to have one of these stylish '70s hippyesque lamps at home -- I do; well, not anymore. I mean, we killed a lot of stuff on eBay for this project.
So that's basically what happens: a single pixel travels inside one of these fibers, which leads it up to the top. The professional ones, the ones that magnify, have about 15,000 fibers in there, and the ones that look like they're hand made -- they are hand made -- have about a thousand to 1,500 fibers in there.
Internally, each fiber is basically two layers, and by means of total internal reflection the glass fiber prevents the light from spreading and escaping, and so that holds the image together. Here's a little bit of what that looks like. Or does it? Okay. So that's the effect: if you use a fiber bundle to see through something -- some of them magnify, some of them don't -- they pretty much give you a focused image. So here's how it works: we're using a fiber bundle on the surface, and the light now actually goes through the block, hits the marker, comes back down, and produces a focused image on the surface.
And obviously this one is recognized instantly, and the higher blocks are recognized because the block in between basically hands off that signal to the higher marker. We call that deferred diffusion.
Okay. There you go -- and here's the other trick: they're slanted. Normally, if you stack things on top of each other, they would occlude each other. But because they're slanted, this marker shows up here, the second one comes out here, and the virtual image of the marker on top actually lands next to the previous one. And if you do this again, you see what this slanted business does: it basically takes a vertical arrangement of markers and lays them out horizontally. So that's the trick here. This is what one of them looks like -- that's one of the ones that are floating around. What you see here is actually the marker that is below this block, at an offset. You can also see that the hand made ones lose quite a bit of light, so there's a certain limit to how high we can stack; the professionally made ones are much less subject to that.
Okay. So the second idea is to use the glass fiber bundles to transform that vertical arrangement. Let me show you some of the blocks we made. It's been a lot of fun, and we made a lot of different designs. We made a straight one, a slanted one, and this one, which is really odd because it actually magnifies. It takes a while to wrap your head around -- at least it did for me. This is two straight bundles, but because they're cut at an odd angle, it actually produces an image that's magnified in one direction but not the other, and I'll show this again in a second.
Okay. So the straight bundles work fine. One of the problems is that the image is passed down straight, so if you stack two blocks that have the marker in the same area, they're going to occlude each other. You can't do that, so you basically have to create a layout first where every block has its own reserved area -- it's a shame that Chris Harrison is not here, because he's been working on stackable markers, flat markers, not volumetric, but that's pretty much the approach they're using.
The problem with this is that you can only have a very small set of blocks, because if you want to make sure that you never get two of the same kind, that puts a certain limit on it. The slanted markers deal with that: now all the blocks have the marker in the same area, and you can have a potentially huge set of markers. They also tell the system how they're stacked, which is really nice. So if you stack three of them together, you can just look at the markers and know exactly in what order they are placed on top of each other.
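To illustrate how the slant turns stacking order into geometry, here is a hedged sketch: each level's virtual marker image lands one fixed shift further along, so sorting by that shift recovers the stack. The shift constant, the data layout, and the assumption that the slant runs along x are all illustrative, not Lumino's actual numbers:

```python
# Hedged sketch: recover bottom-to-top stacking order from the horizontal
# row of marker images that the slanted fiber bundles produce.
PER_LEVEL_SHIFT_MM = 8.0  # illustrative; follows from block height and slant

def decode_stack(markers, base_xy):
    """markers: list of (marker_id, (x, y)) positions seen by the camera.
    Level 0 sits at the base position; each higher block's virtual image
    appears one PER_LEVEL_SHIFT_MM further along the slant direction."""
    def level(xy):
        return round((xy[0] - base_xy[0]) / PER_LEVEL_SHIFT_MM)
    return [mid for mid, xy in sorted(markers, key=lambda m: level(m[1]))]
```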
You would have to make them like this, which is actually quite feasible: you take a fused glass fiber bundle, you heat it up in the middle, and you apply a shearing force. Professionally, that's possible; in the lab, I would have no idea how to do it. The way we did it is we just took a straight fiber bundle and cut it at an angle. That didn't work quite as well, because you're not getting that same angle everywhere, so you lose more light when the light enters the fiber.
Okay. And then the last one, which is actually the most interesting for making stuff that requires flexibility, is the magnification approach. What happens here is that it basically takes the whole space on top of the fiber bundle and maps it to a reduced area -- it demagnifies it -- and then adds its own marker. What's nice about this approach is that it completely virtualizes whatever is on top, so markers can never interfere with each other.
And that gives you a lot of flexibility. So in this odd construction I made here, this area basically gets mapped onto this, just by the fact that it's cut at an angle, and then this part passes it up -- you can build the wackiest constructions. You lose a lot of light this way if you make it out of cheap plastic fibers.
The downside of the magnification approach is that we lose resolution. I'm going to do this quickly here, but basically, because you demagnify, whatever sits on top is imaged onto a smaller patch of the surface, and that's the price. I mean, you can build things of quite some size with the other approaches. These here are actually floating around right now: what you see is one that has a one-third black marker and, on top of it, one that has a two-thirds black marker. They're actually the same size; it's just the fiber optic magnification. Okay. Let me skip over this -- we made more round blocks, whatever. It's just insane; it's clear by now that we spent a lot of time just building lots of different versions.
So, resolution. Basically, every time the light goes through one of the fiber bundles it gets resampled; it's a sampling process, and every single fiber is a pixel, if you will. The good news is that the professionally manufactured bundles have like 10 to 40 times higher resolution than what Microsoft Surface gives you, so it's a complete nonissue at this point.
As for the self-made ones: I think Surface has like 1.1 millimeter pixel sizes, for the built-in camera and for the displayed pixels, so we can actually even deal with these hand made ones with plastic fibers, and there's some benefit in being able to prototype.
With professionally made glass fibers, light loss is not much of an issue, but with the plastic fibers we use it actually is quite an issue, and that limits how high we can stack. You can see that the contrast is already getting low here.
Okay. Let me show you some of the things we built. These are Frederik and Horz [phonetic]; they're working on this with me. We had this little checkers application as a first demo. Whoops. Okay. So let's see. Now the white man is reaching the baseline, and the table requests that it be made into a king, and the table recognizes that something has been stacked on top of it, and now the table knows how this piece is supposed to move. Okay, so that's all basic tangibles; we didn't add anything to that conceptually.
And just to make this clear: the table knows what's being stacked on top. So if you try to stack a white piece onto a black piece, the table will refuse that.
Another nice aspect is that we get a certain amount of hover out of this thing, and we can use that: if the user's not certain what to do, you can hold your hand over a piece, and it's going to recognize the hand on top. And you can use that to make suggestions.
So here are the markers again. Let me just show you what the table sees. The bottom one is a black piece, and the top one is a white piece. You also see that the marker gets demagnified, so we're losing size with every single layer. That's the encoding scheme we use: basically, one-third black is white and two-thirds is black, and then we have hover and the other values for different see-through versions.
So if the table observes this, it sees a black part on the outside, another black part inside, and then the rest is background. And we can make that robust against clipping by adding different types of redundant encoding schemes.
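As a toy version of that encoding, one could classify a piece by the fraction of dark pixels sampled around the marker ring; the thresholds and layout here are illustrative guesses, not the actual Lumino scheme:

```python
# Hedged sketch of the black-fraction code: roughly one-third black reads
# as a white piece, roughly two-thirds black as a black piece.
import numpy as np

def decode_piece(ring_pixels):
    """ring_pixels: 1-D array of grayscale samples around the marker ring."""
    dark = np.mean(np.asarray(ring_pixels) < 128)  # fraction of dark samples
    if dark < 0.45:
        return "white piece"       # ~1/3 black
    if dark < 0.80:
        return "black piece"       # ~2/3 black
    return "unknown or occluded"
```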
Okay. The second one I like in particular. There was a paper at CHI called SLAP widgets that showed how to do interesting tangible widgets on the table. We can do the same thing: we can have this dial that controls the color tint, and then you can stack another one on top that adjusts saturation, and now that stack actually encodes the combined transform, so you can take it as a stamp and apply it to a bunch of different things -- different from having a whole set of these dials sitting next to each other.
So you could envision using these for a whole range of things. You could have a time machine widget where the dials would be day, month, and year, and if you felt you wanted an hour as well, you'd just stack another one on top, and there's your hour dial. And you could build similar offset widgets or line layers and whatnot.
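The stacked-dial idea amounts to composing one adjustment per block, bottom to top; here is a hedged sketch of that composition (the dial functions and names are made up for illustration, not Lumino's implementation):

```python
# Hedged sketch: a stack of dial blocks composes into one reusable
# transform, applied in the order the table reads the stacked markers.
from dataclasses import dataclass
from typing import Callable, List, Tuple

Color = Tuple[float, float, float]  # (hue, saturation, value), each 0..1

@dataclass
class DialBlock:
    apply: Callable[[Color, float], Color]
    angle: float  # dial setting, 0..1

def hue_dial(c: Color, a: float) -> Color:
    h, s, v = c
    return ((h + a) % 1.0, s, v)

def saturation_dial(c: Color, a: float) -> Color:
    h, s, v = c
    return (h, min(1.0, s * 2 * a), v)

def apply_stack(stack: List[DialBlock], color: Color) -> Color:
    for block in stack:  # bottom block first, as detected on the table
        color = block.apply(color, block.angle)
    return color

# Example: a tint dial with a saturation dial stacked on top.
stack = [DialBlock(hue_dial, 0.1), DialBlock(saturation_dial, 0.8)]
print(apply_stack(stack, (0.6, 0.5, 0.9)))
```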
Okay. The third one is the one I showed you at the beginning. These Luminos recognize when they're being stacked, obviously. For the larger ones we actually have different markers on the different parts of the block. So if we combine them like this, the table only sees the markers of the blocks whose undersides are visible. The table cannot see anything here, but it recognizes this part and this part, and so it concludes that there must be a bridge here.
Obviously the mechanism is limited to line of sight, so if there's no line of sight, we don't see anything. And in this other example, on the title slide, we can use it the same way to make recommendations and suggestions on how to improve a construction.
Okay. So basically all the rest -- I'll skip a bunch of slides here -- is about marker encoding and how to squeeze a lot of information into the space that's normally reserved for a single marker. But it comes down to this: either you want flexibility or you want a large number of blocks. You can't have both at the same time, because there's just no space for it. I mean, you could do it if you had a higher resolution camera in there.
Okay. So, as the last thing here: if you feel like making them yourself, here's our paper cutter construction guide. We cut a quarter mile of this stuff, so at some point you feel like you want to build a machine for it. If you shake them, they align themselves; it's quite useful.
They're hand made -- I hope you appreciate that. This was before I bought the buzz saw and the belt sander; this all got much better after that.
So -- okay. This has been a lot of fun. Again, conclusions: I think we should think of the tabletop as something we could use to make more interesting, more diverse tangible applications. Traditionally, most of these tangible things have been done with electrons: there were cables between the connectors, you get short circuits and whatnot, you need to deal with electrical issues. I think that doing the whole thing optically has some benefits. And again, for people coming from a tabletop perspective, maybe this is a good platform for doing things that are actually not flat anymore.
Oh, here is the quiz. Can anyone guess what this one does? This is a taper, and we cut a cylinder out of it. Okay, those people who understand the caption are excluded from answering. If you take a taper and you cut everything away but a cylinder, the thing that comes out of it has an interesting property.
>>: [inaudible].
>> Patrick Baudisch: No. Well, yes, [inaudible].
>>: Sense touch on the side.
>> Patrick Baudisch: Yes, it senses touch on the side, yes. But you knew that [inaudible], that's right. So -- do you get the idea? If you shave this off, some of the fibers actually terminate at the side of the peg, so you get a peg that is touch sensitive all around.
And then here's [inaudible] -- this is my version of [inaudible] -- oh, there. Here. It's small and expensive, but, you know, it's round. I'll come to you when I need more input on non-Euclidean geometry.
Okay. So these are things we're thinking about. Anyway, these are the people who make it all possible. This is kind of my little lab space right now. This is Frederik, one of the students on the project, whom you saw a little earlier. You saw Christian and Shawn. And, let's see -- if you haven't met him, this is Hasso Plattner. He's actually the reason why we're all there. He is one of the founders of SAP, and he basically single-handedly wrote a check for this whole place. It's one of the reasons why we have a chance to focus on research a bit more than other places.
And this is Terry Winograd; I know most of you know him. We had a lot of visitors over the last months, which is great, because I need to teach, and every time one of you visits, that means I teach a little less. The general reward is that I take you to my favorite German restaurant if you come; it's really spectacular German food, I'll tell you that. And I've got an open position for a post-doc or an intern or something like that, so if you know someone, feel free to pass this on. That's all I had. Thank you.
[applause].
>> Patrick Baudisch: It's much harder to ask questions about this one, right?
>>: What is the resolution of one of those blocks?
>> Patrick Baudisch: Okay. So we use two types of fibers, .75 and one millimeter, which roughly matches the resolution of the normal table. The much harder question is where you buy your fiber. It turns out a lot of the stuff that's used in these glass fiber lamps doesn't conduct IR very well. So you look at the screen -- if you have a mobile device, just hold one of these against your mobile phone; they look wonderful, even the ones that are crappy and hand made. But if you do the same thing with an infrared camera, you're not going to see very much. And that's actually a problem we ran into: this looks great -- why doesn't it work?
So you want to be kind of selective about this. And actually, glass works really well for these things -- just glass. The only thing is, you cannot manipulate it yourself, you know.
So I think for what we did, the resolution of the .75 and the one millimeter fibers was just fine. The smallest features on the markers were like four millimeters or something, so you have plenty of supersampling to deal with that. There are about a thousand fibers that go into one of these things, at three centimeters each, so what is it, 30 meters? It's a lot of cutting. But I think now that we know how to make them, it takes about an hour to make one. I think that's a fair estimate. We made a lot.
>>: So is there a company that will actually make a display that --
>> Patrick Baudisch: The fiber board?
>>: [inaudible].
>> Patrick Baudisch: That's right. Yeah. So I talked -- yeah, [inaudible].
>>: A much coarser.
>> Patrick Baudisch: Yeah, absolutely. So if you go to, like, American Optical or Schott [phonetic] -- and I was on the phone with a bunch of these companies -- I think my favorite experience was actually talking to a little company in Berlin. We got some raw fiber bundles from them; you can buy these components, sticks that they use to assemble the larger bundles. And so I talked to them and they said, yeah, how do you do it? And I said, we use epoxy to glue them together, and he said, make sure to get [inaudible], which is like a German epoxy brand. I thought it was hilarious that they're using exactly the same process that we kind of made up for this.
So I talked to the people at Schott, who make fiber optics at a professional level, and the reason why this glass runs extremely expensive right now -- a faceplate the size of a silver dollar can go for something like $900 a piece -- is that there are basically two types of customers who buy them. One is the American military: these become part of night vision goggles, where they're used to adapt a [inaudible] focused image to a flat image for a CCD sensor, so they basically have one of these bundles shaved into a spherical curve, and those go for 900.
And the other is endoscopic applications, where they basically use them to get a picture of the inside of the human body. They're both very high margin markets, and they need the precision and the resolution.
At the same time, they told me they were thinking about putting this stuff on mobile devices now, to get a bezel-free cell phone, which would look kind of sexy, I guess. So that's something they're thinking about, and then obviously the prices are going to come down substantially, right? You would have lower resolution, and the precision wouldn't be the same.
The ones we made ourselves, by the way, I think are like five bucks apiece or so.
>>: How much is the spherical one?
>> Patrick Baudisch: The what?
>>: The spherical, the hemispherical one that you were talking about?
>> Patrick Baudisch: How much?
>>: How much.
>> Patrick Baudisch: We're talking about something that's tiny right now, right? If you made that big -- oh, gosh. I hadn't thought about how to make this any bigger than [inaudible].
>>: Is that what you're getting at --
>> Patrick Baudisch: So I mean, that's the next step, right? We're going to make these out of plastic fibers first. Basically, like a smart skin, we're looking at different shapes that we can touch-enable that way, and how to get tangible properties out of these. How to make it like the size of the one that [inaudible] and Andy have done -- you know, the thing would weigh a ton.
>>: I was just curious.
>> Patrick Baudisch: Yeah. I mean, overall -- with glass, what happens, it turns out from talking to these people, is that you don't cut glass; they shave it down. They basically have two spinning disks and they grind it down. So they need a contraption to hold the thing in there for hours, and there's cooling and stuff like that. It's a really, really complicated process. The plastic stuff is very different, because you can just work with it the way you're used to.
By the way, the ones you're touching right now -- these are not glued in; the fibers are actually just stuck in there. You just squeeze the last one in. And when you sand them later, during the sanding the tops kind of spread out a little bit, and the heat basically fuses them together a little bit, so they're actually very nice to work with in the end. But with a glass fiber bundle you'd have no chance to do that, I guess. Yeah. Well, if there are no more questions, thank you so much. Good to be back. And looking forward -- I hope you have a little time for me later.
>> Ken Hinckley: Thanks.
[applause]