>> Michael Cohen: Hi, it’s my pleasure to welcome Bruce Gooch here today. Bruce and I both got our Ph.D.’s from the same institution, the University of Utah; I got mine a little bit earlier. Since then he’s been professoring around the country at Northwestern, north of Chicago, and for the last number of years just north of here in Victoria. In addition to professoring he’s been taking a lot of the work that’s come out of his research and turning it into products through various startups, which I hope he’ll talk to us about today as well.
>> Bruce Gooch: Thank you Michael. So like Michael said, I’m Bruce Gooch; just call me Bruce. If you have any questions, stop me and ask them. Comic Tycoon is a product, an app that my company, Insatiable Genius, just dropped. If at any time I get boring, all the technology that I’m going to talk about is available for free in the Windows Store. So you can get these apps for free, try them for two days, see if you like them. It’s Comic FX on the Windows Store.
Like I said, I’m Bruce Gooch. I started out doing non-photorealistic rendering. Cameras anymore are ubiquitous; most of us probably have one on us. The crazy thing is that about eighty percent of the images that humans create are non-photorealistic. We create art; it’s what we do. Right now my focus is creating tools that allow people to create digital content. The best content, the pieces that I like the most, are the ones that can cross over: I can do stuff in the real world and it has some digital component, or I do something in the digital world that gives me some real-world artifact.
Okay, so as Michael said, in addition to being a professor, for the past three years I’ve been running a startup incubator in my lab. So if you look at my vita I’m a little light on publications for the last couple of years. On the plus side, we’ve started nineteen companies. I have four multimillion-dollar, stable companies that have come out of the lab: Insatiable Genius, Toon-FX, Gaslamp Games, and DJ Arts Games. I’ve been working with the Cogmed guys; we did a startup with them and we’re building medical applications for kids who have panic attacks.
Anyway, there’s lots of stuff. Relevant Games has been building games with embedded cognitive therapies for kids with ADHD. Like I said, it’s worked fairly well. Part of the reason is Victoria, and it’s part of the reason that I left Northwestern to go to Canada: we own everything we build. The University just signed off on it. I got a blank check for inventor-owned IP, which just didn’t exist in the U.S. at any of the universities that I went to, and so I was able to do this.
The other thing is B.C.: for some reason something like sixty percent of the people in British Columbia run their own business. Vancouver has the largest concentration of self-employed people on the planet. Who knew?
>>: Why nineteen instead of one or two?
>> Bruce Gooch: Oh, they’re not me. Sorry I’m just a facilitator.
>>: So…
>> Bruce Gooch: I did Insatiable Genius and Toon-FX.
>>: Right, okay.
>> Bruce Gooch: Everything else is my students.
>>: So each student will just go out and try to commercialize what they do?
>> Bruce Gooch: Yeah, and they’ve kind of watched me; part of this is that it’s an exploratory process. They’ve watched me make a whole lot of mistakes. They’ve watched my few successes and they’re like, Gooch is making money, I can do this. Then I just tell them what to do. I give them space in the lab. Like Jet Pack Laser Cat: it was three students that took my Freshman Game Design class. They came to me; they had this really cool final project. I said, hey, you should do this on an iPad, and then I just gave them space in the lab over the summer. I gave them computers and I let them work in the lab, and they showed up every day. I showed them how to incorporate in Canada, how to get a bank account, and how to start stuff on the iPhone, and they did it.
>>: And is the fact that the University gives you the IP or allows you to keep the IP, is that a conscious
decision on their part or are they just the exception to the rule and they just happen to be that way?
>>: Or are they just clueless?
[laughter]
>>: Yeah, like are they going to wake up some day and say wait we should be like everybody else and
hold onto the IP? Or are they trying to foster…
>>: You just need one or two huge successes in the…
>> Bruce Gooch: Yeah, I think that if they get an HP or a Google, that switch will flip.
>>: Yeah.
>>: Yeah.
>> Bruce Gooch: But so far they get a lot of mileage out of it.
>>: Yeah.
>> Bruce Gooch: Because there’s an incubator in Canada called Enterprise, and my numbers are better than Enterprise’s. So like I said, they’re getting mileage in the alumni magazines and all kinds of stuff. Then there’s one of the deals that I cut with my students. I show them how to do operating agreements, and I’m like, do this and you’ll never get in trouble: if you take two percent of your net profit and say, I’m going to start a scholarship program for computer science students at UVic with that two percent of my net, nobody’s ever coming after you.
[laughter]
So they’ve all done that. The catch for them is well they’ve got to make money.
>>: Sounds like a Mafia agreement?
>> Bruce Gooch: Right, ha, ha, being a faculty member, right. But two percent is two percent; you’ve got to make money, and so they’re all over it. Everybody’s like, oh, that’s fair, I’ll do that, that sounds great. Okay, so I released a bunch of apps, and this is through Insatiable Genius. ToonPAINT is one, and that research I started forever ago: take a picture, make something that looks like a cartoon, just black and white, then you can paint it in with your finger, paint in colors. That one has about three million users now. It’s on iPhone, Android, Windows 8, and then Samsung. PencilFX, we dropped this one last fall, not quite a year ago. This is a new engine that we developed over the summer: take a picture and it will give you something that looks exactly like a pencil drawing. This one has about fifty thousand users.
Okay, so that’s what I’ve done. I’m going to pull back the curtains here and show you how I do it. I have a three-part process for working: I look at art, and psychology, and computers, and I get ideas from all three of them. We did a color-to-gray algorithm and it came from looking at a book by Margaret Mitchell, the psychologist. A lot of the work that I do has been non-photorealistic; I’m making art.
It’s really tough to sell art to a computer science crowd, right. Why should I care that you’re making art? So what I started doing is looking at psychological evaluation: I’m going to make an image that’s going to make it easier for you to perform a task than a photograph will, right. We all know that from, say, a technical manual. Certain images that we make are better at, say, facial recognition. If I can measure a performance increase, then that’s a validation of this type of work, right.
Then sometimes it’s computers. I find a new tool; that tool makes something that looks like art, or that tool enables me to tweak some psychological process, right. Some cameras make really good edge images. Some let me see into the infrared, and there are things I can do with infrared. Some give me depth. Well, if I’ve got depth, what can I do?
So usually there’s some inspiration, but I’m trying to combine these three things. Another one is known versus unknown, right. Usually I’m pretty happy if I don’t know what’s going on, because if I don’t know what’s going on then maybe no one knows what’s going on. Then you have to check yourself, because I’ll run into things where I don’t know what’s going on, but do I have to go back to the book, or somebody else’s book? So there’s this interplay, right. Is this thing known or is it unknown? Do I need to go read the papers? Is it in the papers? Is it in the book?
Right, and then this idea of top down versus bottom up. A lot of what I do is very bottom up, right: I just play with things. I look at them; I put together image filters until it looks like art. But then you’ve got to back off and say, what’s going on here? Why does this work? Why does this produce this result? Can I look at the bigger picture? Is there a bigger picture? Does this mean anything to computer science at all? So it’s once again this thing where we do the close-up and we do the high view. Like I said, usually I’m close, and I try to work on things I don’t understand versus things that I do.
So today I want to talk about comics and this comic process. This is a crazy thing. Ever since humans have had writing and art we’ve made comics, right. It’s one of the first things people do when they invent new languages: let’s start making comics, I’m going to do a picture and a thing. So you have the Bayeux Tapestry, the Ronin, the Egyptian hieroglyphics. They’re all so amazingly effective, right. The 9/11 comic actually outsold the 9/11 report; more people read the comic book than read the report.
As far as imparting technical information they’re fantastic. Process information: do this, do this, do this. The U.S. military still uses a comic that was developed in World War II. The soldiers call it How to Love Your Rifle, but it shows you the cleaning and the operation of a military weapon, and it’s a comic book that you get. They also give you one that shows you how to file your taxes. It turns out it’s eighty percent more effective: eighty percent more soldiers file taxes if you give them the comic book than if you give them a sheet that shows them how to do it, right. Images and words: put them together and you get more than one or the other.
So why don’t we all make comics? The big one is the crisis of realism, right: I want to paint this, and boy oh boy, I paint that, right. The other one is that schools don’t really value drawing in North America. By fifth or sixth grade I’ll get half an hour of art as an enrichment activity once a week, and usually they just let you go: okay guys, here’s a piece of paper, draw something. So I draw a stick figure, the guy next to me draws something amazing, and I quit, crumple, crumple, crumple. Well, we’re done with that, right.
That’s what I can do, right. What I knew was I could take a picture and give you something that looked like a comic. What I really found at the beginning was, I’ll just put bubbles into ToonPAINT. It didn’t work; it makes this. So I went back to the drawing board and I came up with a five-part process for how people go about creating comics. Once again, unknown: I don’t know how to do it. I go to the known: I got every book I could find and I read through them until I came up with a process.
I have a lot of Facebook friends that are comic artists now, and I just asked them questions. How do you do this? How does the process work? Well, there’s a penciller and there’s an inker, and there’s this guy, and there’s that guy. The first big thing is composition. We all take photos; there are tools on my phone that show me how to take a photo, where to line stuff up. The problem with the comic is that I’ve got to leave enough room in my composition for the bubble, right. Then I’ve got to have some kind of eye control, because if you look at most comics there’s something about the way my eyes work and the way that it makes my visual process work, right. My macula will track in a certain way. So this guy’s eyes kind of point, or his nose line points, toward that bubble and draws your attention to it.
When we start putting together panels it becomes even more complex. This for me was pretty cool. I’ve done a lot of work in composition. I built tools that helped you simulate the works of Jackson Pollock. One of the crazy things about Pollock paintings is that everything hits the same fractal dimension; he hits a very small range in fractal dimension with his paintings. You can show that he went in and cropped to hit his fractal dimensionality. There’s a physicist named Richard Taylor at the University of Oregon who has studied this forever.
So I just built a coaching tool that helped you: okay, if you put some more fine stuff here it will get closer and closer. You want this fractal dimensionality to be about one point seven. It turns out to be freakishly difficult to get things near one point seven. But then when we did forced-choice experiments later, things that were closer to one point seven were what the audience liked more. If I give you two things, one with a one point six and one with a one point seven, you’ll like the one point seven on average. People on average liked the one point seven more than they liked the one point six or the one point eight, so it seems to work.
This is one that I did, once again looking at a mobile device. Let’s say you’re out whale watching, you see the whale, your wife takes a picture of you, and you want to send it to your grandma on a mobile phone. We have a couple of options, right. I can crop, or I can scale, which in my case is wonderful because I just lost like forty pounds, or we can chop the image apart and put it back together in a way that gets the important things all into an image that fits into a small spec, right. I’ll talk a little bit more about this later, but that worked pretty well.
What we’re doing is combining computer graphics with some machine vision and a little bit of artificial intelligence to re-cut these images to fit a given size specification. Once that worked we did the same thing with vector animations. This worked really, really well, because with an SVG I just have a tree; that’s this image. I can annotate things or give them a weight, and then as things get smaller or larger we change them based on these weightings. This worked really, really well for informational images, right. Because in Canada, well, I need to see what these numbers are, right. If I lose information from my image uniformly, I’m going to lose important information. If we can annotate importance then we can resize in a better fashion.
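The real system worked on annotated SVG element trees; as a stand-in, here is a minimal sketch of the same weighting idea in one dimension: when the image has to shrink, low-importance columns absorb most of the squeeze and high-importance columns keep their width. All names and the per-column formulation are mine.

```python
import numpy as np

def retarget_width(image, importance, new_width):
    """Shrink an image horizontally, preserving important columns.

    image      : H x W x C array
    importance : length-W array, larger = this column matters more
    new_width  : target width, smaller than W
    """
    importance = np.asarray(importance, dtype=float) + 1e-6
    # Each source column gets output width proportional to its importance.
    out_widths = importance / importance.sum() * new_width
    right_edges = np.cumsum(out_widths)
    # For every output column, find the source column whose span covers it.
    src = np.searchsorted(right_edges, np.arange(new_width) + 0.5)
    return image[:, np.clip(src, 0, image.shape[1] - 1)]
```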
For the Wally Wood stuff, for the comics, I tried everything. What I fell back on was this thing that Wally Wood made called Panels That Always Work. Wally is the guy who wrote the original comic with Stan Lee, when they were in the military in World War II, that showed soldiers how to file their income tax forms. Wally had made this poster and he put it on the wall behind his desk that said, hey, don’t reinvent the wheel with every panel you draw, just use these most of the time.
So basically what I built was just that: a Wally Wood composition tool. You choose a panel, and I can show you any of them or let you cycle through them. Choose a panel and I just put it on your camera, right. So you aim your camera, move the camera, put yourself in there, snap the shot. Like I said, this is one where there wasn’t a whole lot of technical difficulty, you know what I mean? I didn’t reinvent the wheel; I went out and found something that worked and I used it, instead of doing something with a whole lot of technical difficulty.
The next thing was a new ink line algorithm. One of the problems on a mobile device is that there’s just a limit to the computational complexity that I’ve got, right. When we started doing this on an iPhone 3, you know, I get the computational power of a stopwatch. On the modern stuff I get a GPU, but I’ll suck your battery dead in no time. What it really needed was something that works fairly well and is consistent: if it takes ten seconds, it takes ten seconds. Then when we went to the Windows 8 devices I get a whole lot more pixels, a better computational model, and more memory. So it actually worked really well.
This is based on some work I actually started for my dissertation about twelve years ago. We published this work at SIGGRAPH in two thousand six with two of my graduate students. Basically all we’re looking at is difference of Gaussians. With Cameron here, what I’m trying to do is draw these edge lines, just wherever you see the black lines. So that would be the basics of the comic. All this is is difference of Gaussians: I’m going to run multiple Gaussian filters and difference them to make an image stack, right. So multiple blurs of the image, do a difference between these, and it finds these edge lines.
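For reference, a minimal sketch of a difference-of-Gaussians ink-line pass, assuming a grayscale image in [0, 1]; the sigma ratio and threshold are placeholder values, and the hard step here is exactly the harsh version he complains about next.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_edges(gray, sigma=1.0, k=1.6, threshold=0.0):
    """Hard-thresholded difference-of-Gaussians ink lines.

    gray : 2-D float array in [0, 1].
    Blur at two nearby scales and difference them; where the response
    drops below the threshold we are sitting on an edge, so drop ink.
    """
    dog = gaussian_filter(gray, sigma) - gaussian_filter(gray, sigma * k)
    return (dog > threshold).astype(float)   # 1 = paper, 0 = ink
```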
The problem is, if you look at Cameron, and it looks a little better on my screen, those lines are really harsh, right. It doesn’t look like a comic; it looks very pixelated. Okay, so we smooth them. This was kind of the other contribution there: we used anisotropic diffusion to smooth everything. What we found is that we could run multiple iterations of the anisotropic diffusion and then the Gaussian differencing. Then, one of the things when you look at an edge: this is what the raw output of that Gaussian stack looks like, and we’re going to do a thresholding step to decide where an edge is. Usually what we get is some sort of impulse, right, a step function: this is an edge, this isn’t an edge. We just started looking at them with a tanh function. This gives us this kind of softened edge.
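That softening can be sketched as a tanh ramp on the DoG response in place of the hard step, in the style of the published abstraction work; epsilon and phi are placeholder parameters, and the anisotropic diffusion pre-pass is omitted here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def soft_dog_edges(gray, sigma=1.0, k=1.6, epsilon=0.0, phi=10.0):
    """DoG edges with a tanh roll-off instead of a hard step.

    Pixels well above the threshold stay white; below it the response
    ramps down smoothly, which is what takes the jaggies out of the lines.
    """
    dog = gaussian_filter(gray, sigma) - gaussian_filter(gray, sigma * k)
    edges = np.where(dog >= epsilon,
                     1.0,                                   # paper
                     1.0 + np.tanh(phi * (dog - epsilon)))  # soft ink ramp
    return np.clip(edges, 0.0, 1.0)
```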
That’s what we were doing about seven years ago. Seven years ago, on a good, fast computer, I could get this to run at about twenty frames a second, right. It produced six-forty by four-eighty, and it produced a kind of jaggy video, but it worked. Like I said, it worked. The stuff that I do today is crazy. So I’m going to go out and I’m going to make better edge lines; I’m going to make better video. My goal here was to make something that looks like a comic: have a still image that looked amazing, looked like a comic book image, but have something that would run fast enough that I could do video. Today you can do video on a mobile device, but there isn’t a whole lot of video sharing.
There are kind of two companies doing it right now that are trying to do the Instagram thing with video. You’re limited to six seconds, six-forty by four-eighty in most cases. It just hasn’t caught on; people aren’t doing it. But I’m saying I’m going to position my company for the future; we’re going to try this. So the first thing I did was start building tools, because I knew that I wanted to be able to do this low-level thing.
I wanted to be able to program in MATLAB, right, look at these filters and these iterative filters, and things are going to be going crazy. My diagram before was fairly straightforward: I’m going to do an edge image, I’m going to do a color image, I’m going to do a compositing step, and get a final thing out. If you ask I’ll send you my other one; it’s a giant spaghetti of things. We’re basically mimicking a lot of visual processes that happen in the human visual system.
So I’ve got feed-forward and feedback, and I’m using the edge lines to inform the color process, and the diffusion process, and all kinds of things that are going on. It’s kind of the secret sauce behind everything. But what I knew I wanted to do was to be able to prototype in MATLAB, because I figure if I’m not working in MATLAB I’m not doing anything that’s cutting edge; I’m just refining somebody else’s solution. But I want to see this thing work on a GPU, on a mobile device.
So Brad Larsen is a chemistry student and he built a system that would allow you to work in MATLAB using image processing tools, and it would automatically kick out GPU code. So we started leveraging his stuff. We actually worked with him quite a bit; we’ve debugged a lot of his stuff. We’ve ported it to Android and that’s out there; you can download it on GitHub right now. Then we ported it to Windows 8, and I haven’t released that code because I know that it only works on a single Nokia phone. The Nokia guys were happy with it. It was selling apps, and we’re like, okay, that’s as far as it goes; we’re not letting other developers touch it at this point.
So that’s the first step: get a set of tools built so I get to play with everything. This ended up paying off really big time later on, because the memory requirements for what we’re doing are vast. So what I do, when you’re in the app changing things, like when you change buttons to use a large, medium, or small filter, is we’re actually generating a shader on the fly, and it ends up being about seventeen thousand lines of shader code that we generate. After that we just use your shader, right. The higher-order parameters, the sliders you get to play with, those feed into your shader; we’re actually building a shader on the fly for you. So this worked out really well for us.
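I don't have their generator, but the pattern of baking an app choice into a one-off fragment shader while leaving the sliders as uniforms can be sketched like this; the GLSL body is a trivial stand-in, nowhere near the seventeen thousand lines he describes.

```python
def build_blur_shader(kernel_radius):
    """Emit a tiny GLSL fragment shader with the kernel radius baked in.

    Big choices (small/medium/large filter) become compile-time constants;
    the slider the user keeps playing with stays a run-time uniform.
    """
    taps = "\n".join(
        f"    color += texture2D(tex, uv + texel * vec2({dx}.0, 0.0));"
        for dx in range(-kernel_radius, kernel_radius + 1)
    )
    return f"""
precision mediump float;
uniform sampler2D tex;
uniform vec2 texel;       // 1.0 / texture size
uniform float strength;   // slider-controlled at run time
varying vec2 uv;
void main() {{
    vec4 color = vec4(0.0);
{taps}
    color /= {2 * kernel_radius + 1}.0;
    gl_FragColor = mix(texture2D(tex, uv), color, strength);
}}
"""

# print(build_blur_shader(3))  # the "small" choice -> a 7-tap blur shader
```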
Okay, so these should play; this should be a video, and bad things seem like they’re happening. Come on, baby, play.
>>: [inaudible]
>> Bruce Gooch: Oh no. Okay, doesn’t matter. This is Thor versus the Ice Giant in my back yard, with me and the neighborhood kids. Basically, if this video were playing you’d see that it’s pretty noisy; this is my first pass on this kind of stuff. I’m getting noise in two ways: I’m getting noise from these edge lines, and then I’m getting quite a bit of color noise back here.
So I went through and tuned everything that I could. Yeah, I’m just using diffusion and the soft quantization. Once again, if this were playing, this is after the Valkyrie has stabbed me and I’m lying here on the ground. I’m getting a lot of noise over here in the background where the trees are; the trees are moving but they’re in shadow. So I looked at parameter variations, but it turned out that if you do a bit of analysis, most of my noise is coming where edges are. Right, I’m getting the bulk of the noise at edges.
I looked at everything. I looked at all kinds of parameters and, once again, I can mail you these slides; you’ll just see that it’s crazy noisy. I wasn’t really able to get rid of a lot of the noise from these things. What it turns out is that the problems are low light and texture. You can see on these, a couple of images, I believe it’s Scarlett Johansson: low-light areas were the worst for us, and then these high-texture areas, right. Here’s another example; this is a finer grain. You can see that anywhere I’ve got something like her hair right here is the worst, because I’ve got low light and repetitive texture.
>>: These are all videos?
>> Bruce Gooch: These are images.
>>: Yeah.
>> Bruce Gooch: Videos stopped here.
>>: Okay.
>> Bruce Gooch: So what I did, and the whole thing behind what I’ve been doing since two thousand six, is very low level. If I get rid of all the noise in a single image, right, if I decrease the noise as far as we can in a single image, that seems good enough when we move it to video. I was going to talk about this before. Yes, if I can get multiple images then simple Bayesian methods work. But then I’ve got to do the registration problem which, we talked about this earlier, there just isn’t good registration stuff outside of Microsoft that was freely available. So then I’m solving an entirely different problem. On an iPhone, on the original 4, I could trick it into taking two images at once; it stored the second image in swap space, and then you could just do simple things, a Bayesian combination or just an average over the two images, because they ended up being really well registered. The camera would just flip twice and it got rid of most of the noise. But I haven’t been able to do that again.
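Assuming the frames are already registered, which he notes is the hard part, the multi-image trick is essentially this: averaging N aligned frames cuts independent zero-mean noise by roughly the square root of N. A minimal sketch, with names of my own choosing:

```python
import numpy as np

def average_burst(frames):
    """Average a burst of registered frames to suppress sensor noise.

    frames : sequence of H x W (or H x W x C) arrays of the same scene.
    Independent noise shrinks by about sqrt(len(frames)); the whole trick
    falls apart if the frames are not registered first.
    """
    stack = np.stack([np.asarray(f, dtype=np.float64) for f in frames])
    return stack.mean(axis=0)

# Two frames (the iPhone 4 swap-space trick) already cut the noise standard
# deviation by ~30%; three frames, the "magic" number, cut it by ~42%.
```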
The other problem is, even with this camera in a vice: I clamped it into a vice on my desk, and I was playing with low light, so I was taking a picture of the inside of my file cabinet with my camera in a vice. Just pushing the button, even with the darn thing in a vice, moved me about six pixels on average. So that was just another level of complexity. So for mobiles I ended up dropping down to a single image. But yeah, if I could get three, three would be magic and all the noise goes away. So this is another low light and texture example.
Let’s see where I am. Wait, did I, okay, I must have gotten rid of those slides; I had a whole bunch that talked about noise. Oh, it moved them to the color section. Okay, so everything I did was local adaptation and asymmetric filter kernels, and what I started doing was orienting them based on flow images. I can calculate optical flow really quickly and orient based on the optical flow, so that my filter kernels run in the direction of the optical flow. That worked really, really well.
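A minimal sketch of that orientation idea: estimate dense flow between consecutive frames with OpenCV's Farneback routine, then average each pixel with samples stepped along its own flow direction, so the smoothing follows motion instead of smearing across it. The step count and length are placeholders, and his actual kernels are asymmetric and tied into the rest of the pipeline.

```python
import cv2
import numpy as np
from scipy.ndimage import map_coordinates

def flow_oriented_smooth(prev_gray, cur_gray, steps=3, step_len=1.0):
    """Smooth the current frame along its optical-flow direction.

    prev_gray, cur_gray : 8-bit single-channel frames.
    """
    flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2) + 1e-6
    direction = flow / mag[..., None]                 # unit flow vectors

    h, w = cur_gray.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    cur = cur_gray.astype(np.float64)
    acc = np.zeros_like(cur)
    for s in range(-steps, steps + 1):
        # Sample the image at a point stepped along the local flow vector.
        sample_y = ys + s * step_len * direction[..., 1]
        sample_x = xs + s * step_len * direction[..., 0]
        acc += map_coordinates(cur, [sample_y, sample_x],
                               order=1, mode='nearest')
    return acc / (2 * steps + 1)
```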
That got me to here. So once again it’s pretty good, right, but I’m still in this place where we’re in the dark and the shadow; it’s not really great. So I did something totally, I don’t know, biologically inspired. In the filters I started looking at the neighborhood count, as if I were going to use a morphological operator. So I look at the sixteen-neighborhood or the eight-neighborhood of a pixel and I just invert the filter response. Does that make sense?
What was happening right here, and this only works because I’m looking at a black and white image, right: I look at the pixels in the sixteen- or eight-neighborhood, right, whatever my filter size is. If I’ve got more black than white I use the same filter response; if I’ve got more white than black, I flip it. The only reason I was able to do this is because there was an if; I get one if now in the shader generator, one if in my shader. So I can make this one if call and then I just change that impulse response.
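As best I can reconstruct it from the description, the idea looks roughly like this: count ink versus paper in each pixel's small neighborhood of the binary line image and flip the filter response wherever paper dominates, which is the single branch the generated shader allows. The window size and threshold below are my own choices.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def majority_flip(response, line_image, size=3):
    """Invert the filter response where the local neighborhood is mostly white.

    response   : per-pixel filter output in [0, 1]
    line_image : binary image, 0 = ink, 1 = paper
    """
    white_fraction = uniform_filter(line_image.astype(np.float64), size=size)
    # The one "if": same response in mostly-black areas, flipped elsewhere.
    return np.where(white_fraction > 0.5, 1.0 - response, response)
```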
That ended up actually working really well. Now, it generates an artifact, but the artifact is pretty difficult to see. If you look at her hair right here, I’ve generated those weird artifacts where it’s fence-posting; I’m exactly right on those where it’s coming across and where it’s oriented. But it’s okay, because it just looks like a single fake hair, right. I mean, this very uniform textural visual feel has given me this little artifact that goes across it.
So then take this and do an anisotropic diffusion, the same thing that we did back here in black and white, except I did it across the color channels. I anisotropically diffuse them, push it together, and that artifact goes away. So that’s what PencilFX is. Then we can generate paper textures on the fly and blend her onto it. Once again, almost all of the noise that came out of the algorithm is gone now, right; people just read it as the paper texture noise.
But I really like it; if we look at her fingers and then her lips it all looks pretty good. These are just some parameter variation changes. So I’ve got a couple of sliders here where we can change the size of the filters, then how much diffusion, and then the weight, how much weight we give to the diffusion image versus the edge image. We get a variety of styles. Like I said, it worked fairly well, but it was…
>>: [inaudible] basically a slider, some [indiscernible] sliders?
>> Bruce Gooch: Yeah, I can preset, and then there’s a small, medium, large; that’s the filter kernel size. We have names for them and I can’t remember what they are. You get a small, medium, large, extra large, and then sort of a blend one, and a…
>>: I mean essentially you let them play with parameters. You’re not saying here’s ten combinations
that we think you might like, pick one of the ten.
>> Bruce Gooch: Right.
>>: That’s the other…
>> Bruce Gooch: And that’s kind of the…
>>: Instagram…
>> Bruce Gooch: The Nokia guys push us for that.
>>: Yeah.
>> Bruce Gooch: And I didn’t like it, because what we found is that if I do everything for you, you think the machine did it. If I give you some input, even if the machine did ninety percent, the user believes that they did it. This is the art that I created. I did a graffiti app; I thought it was the coolest thing I’d ever done. But we did almost everything for you; all you did was type, and then you got these graffiti images. It was the coolest thing ever. In the backgrounds, I spent forever on this, I was able to embed QR barcodes and they looked like the background. So to the human eye it looked like a graffiti tag, but if you took a picture of it you could read the QR barcode. So it’d be a tag, but the tag would take you to your website. You could use it for pricing, all kinds of stuff. People hated it, hate, hate, hated it, because the machine does everything; that’s just typing, who cares.
So, yeah, this outputs video, and it’ll put out really great video, but this is what it looks like. This is a hand drawing a hand; it’s that whole Escher thing.
>>: You really should probably go out of the PowerPoint to get it, if the source file is there…
>> Bruce Gooch: Let’s try.
>>: Of course you have to figure out which compression technique it is.
>> Bruce Gooch: That’s the problem with giving a graphics talk, right. It’s either this one, ah, this is the next one that I was going to show. So this is the magnification stuff, the stuff that you were talking about before. The guys at MIT have been doing the image magnification stuff so they can see your pulse. I’m doing something similar, but I’m doing it along the temporal dimension to give you a Cassidy Curtis style image abstraction; this is this one.
On the iPhone this will run; I can do this on an iPad 3 in real time at about thirty frames a second, so I can pan it around and you guys could see it. This works really quite well. I could show you this one. See, these things just pop and scintillate everywhere, but it’s mostly along the edges. There’s some noise in the color space, but it’s far higher when we get to that edgy kind of thing that we’re doing, which makes a lot of sense, right; it’s where the big changes are happening.
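He describes this only loosely, so take this as a generic sketch rather than his effect: the MIT work amplifies a temporal frequency band per pixel, and a per-pixel temporal band-pass can be written as the difference of two temporal box filters over the video volume. Window sizes are placeholders.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

def temporal_bandpass(video, short=2, long=8):
    """Per-pixel temporal band-pass over a video volume.

    video : T x H x W float array (frames stacked along axis 0).
    Smoothing each pixel's time series at two window lengths and
    differencing them isolates mid-frequency temporal change.
    """
    fast = uniform_filter1d(video, size=short, axis=0)
    slow = uniform_filter1d(video, size=long, axis=0)
    return fast - slow
```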
>>: So in the effects that you’ve shown us so far, it seems like there’s sort of the same low-level algorithm going on for every pixel, every area of the image, and therefore you get the same amount of detail everywhere. While some artists would abstract away the detail in places that are unimportant and just show you the detail of the faces or the things that they want you to look at. Have you looked at that sort of higher-level abstraction and focusing the detail in places you want people to pay attention?
>> Bruce Gooch: Yes and no. It’s like I have two hats, right. As an academic, yeah, I did that, like with the cell phone stuff. When I take that hat off and put on the developer hat, no. There are a lot of apps out there that do that kind of thing: you can take a picture, I’ll show you black and white and let you color on color, or let you scrub on a blur, you know, paint a blur over everything except the part you care about.
So as a developer I have to make money to keep my little shoestring startup going, so I’ve kind of stuck with stuff that made money. Where there was a ton of competition and other people were doing it, I didn’t, right. So it’s, you know, six of one, half a dozen of the other.
>>: Yeah, I do. I’m not trying to debate this. I’m just thinking, like, one of the ways to eliminate the noise being distracting in the trees is by sort of just blurring that all away, because it’s trees and we don’t care. But we care about details on the people.
>> Bruce Gooch: Yeah, right. The real problem with that at the end of the day, and I’ll talk about it more later, is that I need to be able to do foreground/background. I could do a simple one where I’d have you paint on the foreground, paint what’s important. But that assumes that I have robust segmentation.
>>: [inaudible].
>> Bruce Gooch: And I didn’t have really robust segmentation. So that’s what I’m working on right now; it’s what I’m doing today. My graduate student gave a talk on it last week. So that’s exactly what I’m working on: I want fast, robust segmentation, and I want a very specific property out of that segmentation.
So let’s see, temporal abstraction, you guys saw that. Color algorithms, so here’s where I’m getting, I’m just going to show you the slide. So there’s the slide. Let’s see if I can find me; nope, that’s the source. If you look at this it may be a little bit difficult to see, because I’ve got this: I’m getting a ton of noise right there, and then if you look at my face, I hadn’t shaved that day, so when I move my face I’m getting a lot of color noise, and then on the back of the chair where I’m sitting.
So this is one of the things I spent forever on, like six months. Like I said, the inversion stuff worked really well on the line images, and this is one of those things where it’s not so much a technical thing: I went back to the drawing board and said, okay, where does noise come from? It turns out that noise comes in sort of two types, random noise and pattern noise. When I started doing experiments there wasn’t a lot I could do. Most cell phones aren’t calibrated, and I’d have to get you to take your cell phone through this massive calibration step, like go into a really dark room and take two photos; then I could get rid of a lot of the dark current, the thermal noise.
But getting somebody to do that to their phone, most people probably wouldn’t do it. It was heinous, and then once again I ended up with the segmentation thing. Here’s me wearing a hoplite helmet, showing you the kind of noise I could get. It also comes in sort of three flavors: signal dependent, temperature dependent, and time dependent. Like I said, I worked on this for an entire summer, just looking at the kinds of things we could do, the kinds of algorithms I could get. In this photo, once again if you could just turn down the light, I think the gentleman will do it, you can see a lot more of the noise; it’s just because it’s really light in here. But I’ve classified these based on the two different types. This is one of me in a highly reflective hoplite helmet, and you can see all the different kinds of noise that I’m generating that I want to get rid of.
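The calibration step he decided users would never tolerate is essentially dark-frame subtraction: shoot a couple of frames with the lens covered, average them into an estimate of the fixed-pattern noise, and subtract that from later photos. A minimal sketch, with names of my own:

```python
import numpy as np

def build_dark_frame(dark_shots):
    """Average frames shot in total darkness to estimate fixed-pattern noise
    (hot pixels, dark current)."""
    return np.mean([np.asarray(f, dtype=np.float64) for f in dark_shots], axis=0)

def subtract_dark_frame(photo, dark_frame):
    """Remove the fixed-pattern component; random shot noise still remains."""
    return np.clip(np.asarray(photo, dtype=np.float64) - dark_frame, 0.0, None)
```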
We can turn the lights back up, please. What I found out when I started talking to users is: it doesn’t matter. I’m like, what do you mean? Eh, it doesn’t matter. I’m like, huh? I’d worked forever on motion and focus, right; how do I get rid of motion and focus problems? Doesn’t matter; if the picture doesn’t work I’ll take another one, it’s my cell phone. I’m like, you’re kidding. As a researcher I’m like, no, this is a big problem, this is huge. Nah, we don’t care.
So that’s my daughter Zoe; her dinner was too hot and she’s fanning it, right. Well, Zoe, could you just hold the fan still instead of waving it? This is my cat Percy and he’s doing bad things; he’s in the dark and he’s moving as I took this picture of Percy being naughty. It just doesn’t matter, I’ll take another picture. Once again, that was the thing. I went back to the books. I cooked up everything I could. I tried to do everything on the cell phone that I possibly could. At the end of the day the answer to my research problem was: don’t care. That’s what Percy was doing. Amazingly, students don’t see the irony in this situation at all.
This, however, is where I was able to do quite a bit. On a mobile device two of the real problems that you have are low contrast, ultra-low contrast, and the stuff that I was doing was muddy. Like Jessica, this is one of my engineers, Jessica: if I can go into Photoshop and just pop her eyes out as white, she looks ten times better, right. But in the diffusion process there isn’t enough difference between the color near her eye and her eye color, and it just gets kind of muddy, right.
But a comic artist would pop that out; you make those pop. They also do this weird thing where you make that fold at the bottom of your eye about ten times larger than it actually is. If you look at artwork, artists draw those things so big that you could balance a marble on them, because it’s something that humans pay attention to. Once again the psychology informs the art and looks at the computer science.
Here once again is how I work. Monet was actually quite nearsighted toward the end of his life. You can see a lot of his artwork in the early phases is very, very tight. Then as Monet ages everything becomes very, very loose, right; by the garden at Giverny it’s completely just shadowy. What people believe now is that Monet painted what he saw.
There’s some evidence about Mondrian, who painted the apple tree in front of his house; that’s his apple tree evolving over his life. It turned out he had a very large tumor in, or near, his visual center. So there’s some evidence that the abstraction that he was looking at, and painting, was based on this.
Well, was Van Gogh color blind? There’s a Japanese researcher, and I forgot to put his name on my slide, which is horrid of me. Here they don’t look really different. What this researcher had found, and he’d done a whole bunch of work with protanope color blindness, is this: he built a fairly simple filter, and there are apps now that will do it, for what would this look like if I were color blind? It turns out if you look at Van Goghs with protanope color blindness they look a whole lot better.
So we took the Japanese researcher’s work on color blindness, and the color blindness problem is an interpolation problem, right: I know what A is, I know what B is, I draw an interpolation line. The simple thing that I did was say, what happens if we extrapolate? Because what we’re in fact doing is reducing color contrast; what happens if we just use the same data, the color blindness data, to increase color contrast? Once again, are these showing up really well, can you see a difference?
>>: Yeah.
>> Bruce Gooch: Because on my monitor it’s fairly amazing. Okay, so this is in the app; I think in the app we called it the juiciness filter, and this was inspired by Bill Buxton’s intern who gave a talk at NPAR one time. He said, I want photos that look like this. I don’t know what word to use, but I’m going to call it juicy; I want juicier looking photos.
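A minimal sketch of the extrapolation idea: simulate protanopia with a linear matrix (the one below is a commonly circulated approximation, not necessarily the one from the research he cites), then push each pixel away from its simulated version instead of toward it, which boosts exactly the color contrast the simulation collapses.

```python
import numpy as np

# An approximate linear-RGB protanope simulation matrix; illustrative only.
PROTANOPE = np.array([[0.567, 0.433, 0.000],
                      [0.558, 0.442, 0.000],
                      [0.000, 0.242, 0.758]])

def juiciness(rgb, amount=0.5):
    """Extrapolate away from a color-blind simulation of an image.

    rgb    : H x W x 3 float array in [0, 1]
    amount : 0 leaves the image alone; larger values increase the
             red-green contrast that the protanope simulation removes.
    """
    simulated = rgb @ PROTANOPE.T
    # Interpolating toward `simulated` reduces color contrast;
    # extrapolating along the same line increases it.
    return np.clip(rgb + amount * (rgb - simulated), 0.0, 1.0)
```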
So given all that, Katy Perry where are you?
>>: [inaudible]
>> Bruce Gooch: I’m not seeing it, though. Anyway, given all that, I can do pretty good video in real time on an iPad 3. I just got a snippet of a Katy Perry video, but you can see it; it looks amazing. The colors pulse slightly, but other than that it’s fantastic.
Okay, and that leads me to where I am today. This is what I’m looking at now. One of the problems that I have that a real comic doesn’t is this thing that I call the coloring book property. Comic books were made on four-color presses, and wherever two colors touched each other the artist would draw a black line over that, right. It’s just an artistic convention; the reason was that when the colors touched, everything got muddy. So you draw this black line over the top and it’s called a trap. So if we look at a coloring book, and I chose this one specifically because the artist who did it didn’t do a perfect job, there are a couple of places here that aren’t trapped.
My algorithms give us that, right: I don’t have a complete segmentation of the image, I don’t have this trapping property. If I could do this, suddenly the world’s my oyster for a lot of reasons. I can go have a conversation with Charles Loop, and we can take video; if I can process the video and get this property, suddenly I can go directly to implicits, right. So I can have an implicit image, but I can also have an implicit video, right.
This is amazing because megabytes of video become kilobytes, right. Watermarking works: almost every watermarking scheme you can come up with, I can destroy with a couple of affine image transforms. Grow it by an odd number of pixels, shrink it by an odd number of pixels, and I can almost always erase your watermark. But if you could throw this process at the big image and at the small image, if it were close, bam, watermark. This would work for search, right: I run this process on your image, save off this little implicit function as metadata, and say, hey, go find me images like this; then you just look at that image metadata, right. So that’s why I’m doing this. The problem that I’m faced with is that I need fast, amazing, robust segmentation. That’s what I’ve been looking at.
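For what it's worth, the coloring book property he wants can at least be tested mechanically once you have a segmentation and an ink-line image: every seam between two differently colored regions should lie under (slightly dilated) ink. This check is my own sketch, not his algorithm.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def has_coloring_book_property(labels, ink_mask, slack=1):
    """True if every seam between two color regions is covered by ink.

    labels   : H x W integer segmentation (one label per flat color region)
    ink_mask : H x W boolean, True on black line pixels
    slack    : how many dilation steps of ink coverage to allow
    """
    # Boundary pixels: label changes between horizontal or vertical neighbors.
    boundary = np.zeros(labels.shape, dtype=bool)
    boundary[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    boundary[:-1, :] |= labels[:-1, :] != labels[1:, :]
    covered = binary_dilation(ink_mask, iterations=slack)
    # Any boundary pixel with no ink nearby is an untrapped seam.
    return not np.any(boundary & ~covered)
```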
This is what I did before. The real problem with segmentation is that I’ve got to enter a whole bunch of parameters per image that tell it how; how exactly do you do that? So in the past what I’ve done, and what most people do, is over-segment: make the regions really tight and then use some sort of watershed to merge these regions back together into coherent regions. With this one I also used face recognition and I used the Itti-Koch, Laurent Itti’s, saliency. So I would add the saliency information or the face recognition information to these before I ran the watershed, right.
It takes a lot of time and it’s not amazingly robust, and this is where the universe intervened and dropped a graduate student in my lap. His name is Ashland Richardson. Ash is thirty-five; he’s been working on segmentation his entire life. He was at Quantum London, and then he worked for the Canadian Space Agency, and then he came and started working for me.
This is nine-dimensional radar data, okay. What most people do, the gold standard, is K-means: you do a K-means simplification. So we’re looking at satellite, well, aerial data, an overflight with radar. To simplify this nine-dimensional data we use dimensionality reduction, Isomap and LLE, and we use a hybrid of those two, to go from nine dimensions to five, which the radar community says are the most salient.
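The specific Isomap/LLE hybrid isn't something I can reproduce, so here is only the shape of the pipeline he describes, with plain Isomap standing in for the hybrid and an arbitrary cluster count; in practice you would fit the embedding on a subsample, since these manifold methods scale badly with pixel count.

```python
import numpy as np
from sklearn.manifold import Isomap
from sklearn.cluster import KMeans

def cluster_radar_pixels(pixels, n_dims=5, n_clusters=8):
    """Reduce 9-channel radar pixels to 5 dimensions, then K-means them.

    pixels : N x 9 array, one row per pixel.
    """
    embedded = Isomap(n_components=n_dims).fit_transform(pixels)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embedded)
```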
So here’s K-means clustering on those five dimensions, and here’s our clustering, right. We’re far, far faster. There’s one thing Ash showed me: these are trees, and here are normal trees. These trees in this region are different than all the other trees, because there was a forest fire here twenty-five years ago. Our data shows it and the K-means doesn’t, but the ground truth does. Ash told me that this was really important.
The one thing that we’re doing, and Ash calls these dendrograms: what he’s really doing is looking at, it’s multi-dimensional, but all we’re doing is treeing, right. Same thing that you do, but when does your tree stop and how do you look at your tree? What everybody does is say, okay, I’m going to use some sort of tolerance or distance parameter. We did not do that; we did algebra. So as you’re filling up this tree-type database we look at secant lines, based on what’s the difference between adding this pixel versus not adding it. We look for changes in the derivative, basically, right. So we’re looking at secant lines: is this thing tending toward infinity or not? We’re finding what amounts to the first derivative of the change from adding a pixel versus not adding a pixel, but we’re using an algebraic solution; we can’t use a calculus-based solution.
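The actual stopping rule is restricted, so this is only a generic reconstruction of the idea as stated: track the cost of each successive addition, take the secant slope of that cost curve over a short trailing window as a stand-in for the derivative, and stop when it jumps well above its typical value. The window and the jump factor are entirely my own.

```python
import numpy as np

def stop_index(merge_costs, window=5, jump=3.0):
    """Pick where to stop growing a region from its running merge costs.

    merge_costs : 1-D array, cost of adding the i-th pixel/cluster.
    """
    costs = np.asarray(merge_costs, dtype=float)
    for i in range(window, len(costs)):
        # Secant slope over the trailing window ~ discrete first derivative.
        secant = (costs[i] - costs[i - window]) / window
        typical = np.median(np.abs(np.diff(costs[:i]))) + 1e-9
        if secant > jump * typical:
            return i          # the cost curve is taking off; stop here
    return len(costs)
```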
This, I think, Michael said there’s a really strong theory group here. Once again I’m off the map of anywhere I’ve ever been before. I’ve never seen anyone run an algorithm like this, where I’m trying to run calculus based on how a data structure fills up, a calculus-based solution. But I know that I can find things that look exactly like minima, inflection points, and maxima. It works amazingly well. It works so well that I can’t show you how this algorithm works.
Ash just got a letter from the Canadian military saying that they’ve given this the defense stamp. I can send you the slides, because we’ve already given them. I can’t tell you about how the dendrograms work; I can tell you that our solution to stopping works, but that’s our stopping criterion. But yeah, it works really, really well, and that’s what I’m using. My hope is that I can get a really robust segmentation out of it. I’ve been using line integral convolution, but I’ve just switched to a third-order Runge-Kutta to do the smoothing. When the smoothing runs out, if I’m at a junction between two color areas, I’ll just extend the line. That’s my hope: I’ll do some flow-based algorithm that goes in between those regions and gives me this coloring book property. But it’s one where I’ve got to back off now and see if I can prove that I have the coloring book property for every image that I run.
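The line-extension step he mentions amounts to integrating a point along a flow field; here is a minimal sketch of one third-order Runge-Kutta step over a 2-D field, with the field lookup and the step size as placeholders.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def rk3_step(pos, flow, h=1.0):
    """Advance a point one third-order Runge-Kutta step along a vector field.

    pos  : (x, y) current position in pixels
    flow : H x W x 2 array of (dx, dy) directions
    h    : step length in pixels
    """
    def sample(p):
        x, y = p
        # Bilinear lookup of the field at a sub-pixel position (row, col order).
        dx = map_coordinates(flow[..., 0], [[y], [x]], order=1, mode='nearest')[0]
        dy = map_coordinates(flow[..., 1], [[y], [x]], order=1, mode='nearest')[0]
        v = np.array([dx, dy])
        n = np.linalg.norm(v)
        return v / n if n > 1e-9 else v

    p = np.asarray(pos, dtype=float)
    k1 = sample(p)
    k2 = sample(p + 0.5 * h * k1)
    k3 = sample(p - h * k1 + 2.0 * h * k2)
    return p + (h / 6.0) * (k1 + 4.0 * k2 + k3)
```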
Okay, given that, a lot of what I’m looking at uses really simple rules for simplification, right, so I’m suddenly in a really target-rich environment. These are rules that cartographers have for simplifying maps. There’s a ton of these, and these are other things that I’m looking at to help me with this process: how do I get myself this coloring book property kind of thing? So exaggeration: I’m going to find out what’s important and exaggerate it. If it’s important, if I’m getting something out of my importance function, I want to draw a line there; if not, I don’t. There are things that I want to eliminate that aren’t important, right. How do I decide what those things are?
Right now, if it’s something like this that doesn’t enclose a region, we’re calling it a texture kind of thing. Your eyes enclose a region but the line of your smile doesn’t, so I’ve got to leave that one, right. But maybe my eyelashes, where two things join, where a surrounding region joins a non-surrounding region, maybe those can go away. Typification: I just randomly get rid of some of the stuff in my image. It turns out that under certain conditions, sure. If I throw a bunch of grapes at you it’s the same as throwing a tennis ball at you, right; our visual system doesn’t care, it’s a green thing in motion. We really don’t care. So if it’s in motion, sure, get rid of it. If it’s still and it’s important, then probably not; but if it’s still and unimportant, like you said, throw it away, right. Then outline simplification.
Once again, I think that once I can get an implicit representation, then simplifying an implicit function becomes really, really easy, right. So I’m really not worried about that one; I think that’s kind of automatic. The typification I think is going to be a rule set. Okay, and I just threw these in; I didn’t know if I’d go over time or under time. This is just another thing that I built.
In all of our apps right now you can create products. I’ve let that thing go; you can get it for free, and it’s in use in about a hundred apps right now. So if you do something with graphics and create a product, I can beat the price of CafePress or Zazzle at higher quality. The user gets forty percent. Anyway, that’s what I have. Thanks.
[applause]
Any questions?
>> Michael Cohen: Great.
>> Bruce Gooch: Okay, toothbrushes and cell phones. I’ve heard this attributed to many people, but there are six billion people on the planet, five billion of them have cell phones, and four billion of them own toothbrushes. I’m probably going to keep working on cell phones.
[laughter]
Alright, thanks for coming folks.