>> Andy Wilson: So today we have a guest speaker from the University of
Kentucky who happens to be my older brother, Hank Dietz. He got his Ph.D.
from Polytechnic University, which many of you may know is now part of NYU.
He probably is best known for his work on compilers, the PCCTS toolkit which
some of you may have used, and did a lot of early work on parallelizing Linux.
But today he's going to talk to us about his hobby in photography. He's always
been into photography. In fact, we just sold our parents' house which had a
darkroom from our childhood, and, of course, nobody has any idea what to do with
a darkroom in this day and age.
Anyhow, so with that, I give it to Hank.
>> Hank Dietz: Thank you.
So I guess my brother has already said, this is kind of a hobby for me. So let me
just say if I say some things that sound kind of wrong, they probably are, but, you
know, that's okay. I've basically been playing around with this for a while. I'm not
an image processing guy in any way, shape, or form. Not that I have anything
against them, but I'm happy that they're image processing guys and I'm not.
I am a computer engineering systems guy, and the kind of thing that I'm usually
looking at is not just compilers or hardware architecture or operating systems, I'm
looking at how you integrate the different pieces to make the system work better.
And if you think about it, even though I've been doing that mostly with parallel
super computers, that works just as well for embedded processors, and, after all,
what is something like this but a little embedded processor with a really cool IO
device on it. Right?
So essentially my background in photography goes way back before I was a
computer guy. So essentially back in the 1970s I was photo editor of things like
Broadway Magazine at Columbia University. I published photos as a
professional photographer, did a lot of work that way.
Then I kind of got down to doing serious engineering and I kind of forgot about all
that stuff, and eventually back around '84-'85 I got sucked into doing some digital
halftoning work for DuPont. So I got involved in doing X-ray image halftoning,
and I guess some of that stuff actually got picked up by IBM and William
Pedibaker's [phonetic] group there, so that kind of was a little foray back to image
processing-ish photography related things, but very short foray. Didn't really stick
with it.
Then back in 1996 I had this wonderful little scenario where I'm the guy who built
the world's first Linux PC cluster supercomputer. Now, back in '94 when I did
that, we needed some way of showing that we could actually keep the machines
tightly synced. So as early as '94 we started building video walls.
Well, back in '96 we had a 30-megapixel video wall. Where do you get
30-megapixel images to put on a video wall in 1996?
So I had all these Landsat images and I had stuff that -- from NASA, images of
things like Io and what have you that were stitched together. But I want to be
able to take my own images, and I want to be able to do things, say, maybe
200-megapixel images and play around with that, and you just couldn't do that with
any cheap commodity technology.
So ever since then I've been trying to make digital cameras be able to do these
kinds of things, and basically I've been doing things that I think of as
computational photography since 1999, but I'm not sure that it's exactly what the
term computational photography is really associated with now.
So just to give you a quick background, these are all weird camera things that
I've done. So you can see, yes, that actually is a USB plug coming out of that old
4-by-5. So I hacked the sensors on that.
This was back in -- I think it was 1999. I did a couple of autonomous robots that
had tethered Nikon 950s with the 185 degree fish-eye lenses on them, and I was
doing realtime stitching and doing spherical pan and zoom on the video wall tied
to a cluster wirelessly transmitting things.
I have fleets of these little guys all over the place. That's actually a peephole
being used as a fish-eye. And you go, well, what a horrible way to try and abuse
a lens, right? Well, it turns out I was actually working on a project called
Firescape where we're trying to make a way that people could basically go in to
fight fires with some knowledge of what their surroundings were. So we had a
helmet-mounted system that we were working on, and we needed something that
could actually go into a fire and give them a wide angle point of view and was
basically disposable.
Well, turns out those stupid little door peepholes cost 4 bucks and they're fire
rated. So they're disposable. They're good. They're not good optically, but, hey,
what can we say?
That's actually another one of the Nikon 950s from years ago that's still been
clicking away in one of my supercomputer machine rooms. That particular
camera has actually taken over 3 million images as a tethered system over a
period of more than 10 years.
This is actually a little thing that we did with CHDK. We're basically using the
Canon Hack Development Kit. I had an undergraduate senior project team do
some things with spherical -- excuse me, with 360-degree images and in the
camera directly creating a 360-degree video view. Very nice that you can do that
in those.
This is another one of these mutant things. This is actually kind of from the
Firescape stuff.
So you can see I've been doing a lot of camera hacking, and in terms of the stuff
that I've done, certainly, as I say, I'm pretty well known for doing a lot of video
wall stuff very early on. This thing here was the 30-megapixel display that I was
telling you about. And, yes, that is 32 Pentium IIs running it.
You'll see I've had lots and lots of video walls. And we did our own video wall
libraries, things like that. I have the dubious honor of having been the first guy to
actually have an MPEG player that would run on a Linux video wall.
This is another thing that I was doing with a couple of my colleagues where this
is actually a handprint, and the interesting thing is that we're actually doing
structured light 3D capture. So I'm the camera guy in that little crowd.
And this is something that I did quite a while back. This is from 2001 where
basically I was able to extract the near infrared image directly from a
conventional camera without any extra filtering, just [inaudible] and software. So
I was doing a lot of little multi-spectral tricks.
And in case you're wondering what this is, this is actually the infrared remote
control, and you can see there's the infrared light from that separated out.
So enough kind of hand-wavy background. You can see I've been playing
around with this stuff a lot, but I'm not a very formal guy about this.
So here's my definition of computational photography which may or may not
agree with the kinds of things that most people would call computational
photography.
I think of cameras as computing systems. And obviously for me that makes it a
very reasonable thing for me to be involved in. And so what I'm trying to do is I'm
trying to use computation to enhance the camera abilities and/or to process the
captured data.
And I don't think of it as an image necessarily. I think of it as data. The kinds of
things that I've been doing, new camera sensor and processing models. Some
of that involves some things that I've been kind of playing around with at the
nanotechnology center over at UK.
Intelligent computer control of capture. Obviously I've been doing a lot of
computer-controlled capture over the years. Detection and manipulation of image
properties. And that's really what I'm talking about this time.
Basically I've done a lot of things with CHDK. How many of you are aware of
CHDK? Has this been used a lot or -- so CHDK basically -- some people in the
former Soviet Union with way too much time on their hands decided they would
hack the Canon cameras and figure out if they could actually reprogram them.
Well, basically what it boils down to is the thing that makes a Canon camera, a
little PowerShot, into a camera is that after it runs autoexec.bat it runs camera.exe.
So it's actually a very accessible interface. And it turns out that what they did
was they managed to come up with a way of having an alternative operating
system load with the camera thinking that it's actually just starting up the
firmware boot loader.
And so it doesn't actually override anything in the camera. You can put whatever
code you want in there. It's a full C programming environment which you can use
with GCC to generate the code for it. So you can run arbitrary code inside of
these cameras with full access to all the camera features.
And that's pretty cool because here we're talking about hundred dollar little
commodity cameras, and they're actually quite reasonable. They're a little slow
processing-wise, but it's still nice to have everything contained.
Okay. Well, I've had a lot of, as I said, graduate student projects involving
CHDK and doing interesting things with cameras that way. And back in spring of
2009, as an undergraduate EE senior project, I had these three students working
on the idea of modifying a Canon PowerShot to capture depth maps directly in
the camera: when you press the shutter, it quickly fires off a sequence at
different focus distances, analyzes where each pixel is sharpest, and basically
interpolates to get a full depth map out.
Pretty cool stuff, and they did a really nice job of doing it. They essentially used
CHDK with some custom C code to measure the blur and combine the images,
and they did a pretty nice job. They really did a very serious job. In fact, they
actually won an award that year for best senior project.
The theory behind this is, of course, contrast detect auto focus. If you take a look
at how cameras normally are focusing, assuming it's not an SLR, what they're
normally doing is they're trying to detect when the image is in focus by basically
looking at local contrast, and when the local contrast is highest, presumably the
image is in focus.
There are lots of different algorithms in the literature for that, and my students
were pretty diligent about that, came up with about 30 or 40 different techniques
that they actually tried. And it turned out Sobel happened to work best. No big
surprise on that. No big thrill either.
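A minimal sketch of that depth-from-focus idea, in Python with NumPy and SciPy (an illustration of the general technique, not the students' in-camera C code): compute a Sobel-based sharpness map for each frame of the focus sweep, then take, per pixel, the index of the sharpest frame as a crude depth label.

    import numpy as np
    from scipy import ndimage

    def sharpness_map(gray):
        # Local contrast as Sobel gradient magnitude.
        mag = np.hypot(ndimage.sobel(gray, axis=1), ndimage.sobel(gray, axis=0))
        # Smooth so isolated noisy pixels don't win.
        return ndimage.uniform_filter(mag, size=9)

    def depth_from_focus(frames):
        # frames: grayscale images, one per focus step, ordered near to far.
        # Returns, per pixel, the index of the focus step that was sharpest.
        stack = np.stack([sharpness_map(f) for f in frames])
        return np.argmax(stack, axis=0)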
But the interesting thing is there were some quirks about this which I'll talk about
in a moment. They actually did the processing on the raw sensor data, and they
eliminated the red and blue channels because they were noisier and really didn't
improve the quality.
It was limited to what they had as far as the resources in the camera. And the
big one on that was the camera didn't have enough main memory to keep all the
images around, so there were some neat little processing tricks that they had to
do. Kind of cool. Not really important to this talk.
Here's what's important to this talk. I realize this is a little hard to look at, kind of
a dark slide in here, but this is basically a bunch of cards kind of lined up on a
table, and you can see this is actually the depth map image that was generated
in camera using their particular project.
And you can see it's pretty good. Close stuff, further and further away, far away
stuff. Cool. Right?
So I was very impressed with this. Everybody else was very impressed with this.
And then we started looking at it a little bit more closely.
And here's the problem. If you take a look at this, the depths of the edges of the
cards are perfect. They're absolutely great. But around the edges we've got
these weird little echoes of the edges that say that it's in sharpest focus way far
away from where it really should be.
And basically it's not just way far away, it's way far away in either direction. So
when we were trying to do things like ah-hah, let's interpolate to fill this in so that
we know what the depths are in between, this screwed up everything.
And so that really bothered me, because I looked at that and I went, geez, you
know, why is this so far off? And so I've spent two years basically answering that
question.
And basically, as I say, it's wrong by a lot, it's not wrong by a little bit, which is
kind of an interesting property already. It's wrong in both directions, and as I say,
it echoes the edges. It's something where, as you take a look at these things here,
you see it's not actually just random spots. It's a very simple small distance away from
each of the edges.
So what went wrong? Well, basically it comes down to understanding what
happens to a point that's out of focus. So we take a point light source,
basically a bright spot in the sky or whatever. Looks kind of like this when it's in
focus. It's just a nice, tight little dot. Right?
If we take a look at what happens to that point, basically what happens to the
point spread function, what happens to this shape as we bring this image out of
focus -- and, by the way, I'll explicitly say right now I'm not sure that anyone
would be approving the fact that I'm talking about out-of-focus point spread
functions because usually people like to talk about them as being in focus, but,
hey, nothing's ever really in focus anyway, so I think I'm still justified.
So here's what people think happens to a point spread function when it's out of
focus. Basically they think you get a Gaussian blur. Right?
And in fact this is what people really, really, really desperately want to have
happen. And this is what the vast majority of the image processing code out
there assumes you're getting in order to do things like contrast detect auto focus.
Well, here's what actually is there. What actually happens is that essentially
when you're out of focus, you don't get a blurry image at all. What you actually
get is you get a larger spot for that point spread function. And that larger spot
happens to have a nice sharp edge, and that nice sharp edge was the highest
sharpness thing that it saw when it wasn't at the real edges of the image, and
there's our false echoing of the distances.
When it was too far out of focus close, too far out of focus far away, it was
hallucinating these sharp edges in areas that really had no edges. Right?
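That effect is easy to reproduce with a toy experiment (my own sketch, not from the talk): blur a single point with a uniform disk, as a real lens does out of focus, and with the Gaussian everyone assumes, then ask where the Sobel response is strongest. The disk puts its strongest, sharpest response on its rim, a full blur radius away from the true point, which is exactly the false echo the depth map picked up.

    import numpy as np
    from scipy import ndimage

    def disk_kernel(radius):
        y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
        k = (x * x + y * y <= radius * radius).astype(float)
        return k / k.sum()

    # A single point light source in the middle of the frame.
    img = np.zeros((101, 101))
    img[50, 50] = 1.0

    disk = ndimage.convolve(img, disk_kernel(20))   # realistic out-of-focus PSF
    gauss = ndimage.gaussian_filter(img, sigma=10)  # the assumed Gaussian blur

    def sobel_mag(a):
        return np.hypot(ndimage.sobel(a, axis=1), ndimage.sobel(a, axis=0))

    for name, a in [("disk", disk), ("gaussian", gauss)]:
        s = sobel_mag(a)
        r, c = np.unravel_index(np.argmax(s), s.shape)
        # The disk's peak response sits on its rim (~20 px out) and is far
        # stronger than the Gaussian's gentle, centered gradient.
        print(name, "peak at radius", np.hypot(r - 50.0, c - 50.0), "value", s.max())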
So why does this happen? Well, first of all, this is just basic physics, but let's talk
about how this really gets to be an interesting thing.
Well, when we talk about the point spread function, it's really the response of an
imaging system to a point source which is basically the impulse response.
Normally people like to talk about the resolution of a system as the modulation
transfer function. This is the spatial domain representation of the modulation
transfer function, so you can kind of use this to get back to what the resolution is
and so forth. I'm not really concerned with that, but that's the standard way of
looking at these things.
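For reference, the relationship being glossed over there is just a Fourier transform: the MTF is the normalized magnitude of the transform of the PSF. A two-function sketch, assuming psf is a small 2-D array:

    import numpy as np

    def mtf_from_psf(psf):
        # The optical transfer function is the Fourier transform of the
        # PSF (normalized so the DC term is 1); the MTF is its magnitude.
        otf = np.fft.fft2(psf / psf.sum())
        return np.abs(np.fft.fftshift(otf))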
An image is really the sum of the point spread functions of all points of light in the
scene. Okay. I'm lying. There's all this nasty wave stuff that really happens,
right?
But if we play our cards right and we're careful, we can avoid the nasty wave
stuff, just like I'm a computer engineer, I know that these circuits are really
analog, but I do my best to avoid that. I'm much happier if I can ignore that.
The point spread function grows in proportion to how far out of focus the point is,
for most lenses. And let's take a look at what the point spread functions really look like.
Well, I showed you before this point spread function. This is actually from a
Pentax Takumar 135 millimeter f/2.5 lens, basically a nice, old, telephoto lens.
And it's a very good lens. And in fact this is almost a perfect point spread
function for an out-of-focus point spread function. This is what you would expect.
This is what physics predicts, a nice, evenly shaded disk.
In case you're wondering what these little spots are there, this is a 50-year-old
lens. It has dust in it. So when you have dust in the lens, you actually see those
dust defects basically as interference patterns in this. And in fact, as I'll talk
about in a little bit, this is one of the cool things that you can do. You can actually
identify lenses because you can actually get a very clear picture of what the
defects are in the lens.
The optical formula affects the point spread function. So if you have two lenses
that have different lens formulas even if they're the same focal length, et cetera,
you actually see different point spread functions for them. The out-of-focus point
spread functions are different.
Retro focal lenses, basically if you have an SLR, a single lens reflex, there's this
nasty little mirror that has to be able to swing out of the way. It would be bad if that
slapped into the back element of your lens. Right?
So essentially you've got about a 45 to 50 millimeter gap between the focal plane
and the flange mount on most SLR cameras, so you need to have the focal
length of the lens be larger than that. Otherwise you're probably going to have
some elements extending back into the area where they will be slapped by the
mirror.
All right. The way that that's gotten around dates from back in the 1950s, when
basically it was discovered that, ah-hah, you can build a wide-angle lens
by taking essentially a conventional wide-angle lens and an inverted telephoto to
project the image, and that way you can get this larger back distance.
The down side of that is that retrofocus lenses, retrofocal lenses like that
basically have twice as many pieces of glass in them, and they kind of have,
shall we say, the compounding of two sets of defects. So what you, not
surprisingly, often see is you see a bright edge -- bad stuff, because it looks a lot
less like Gaussian blur than what we'd like -- and you also see a little bit of a
bright center. It's not real obvious in this one, but this is actually slightly brighter
than around the edges.
Recognize this? Classic mirror lens, right? People either love these or hate
them because you get doughnuts all over the place. Well, basically, yeah, if you
take a look at the out-of-focus image of a point light source, what you get on this
500 millimeter f/6.3 mirror lens is the classic doughnut shape. And the reason is
because literally you have light coming in around the edges and then it bounces
off the rear mirror and it bounces off another mirror that is blocking out the center
of the light and, not surprisingly, there's your doughnut. Okay. Obviously easy to
identify whether it came from a mirror lens or not.
It turns out that you'll find most modern lenses may not have so much dust and
so forth in them, but they actually almost always have aspheric elements. And
aspheric elements are very cheap to make now, but they're also kind of not so
good. Basically they're made in ways that leave them slightly imperfect, and
those slight imperfections actually leave a mark on the point spread function. So
we can actually identify the slight aspheric abnormalities when we look at the
out-of-focus point spread function.
So, for example, this is the Sony 18 to 70 millimeter kit lens at 18 millimeters,
f/3.5, wide open. And you can see here a couple of things. Basically you can
see this actually has a fairly nasty-looking distorted character to it. The little
ripples that you see and so forth, classic of the minor imperfections you get in
aspherical elements.
And you'll notice something else is wrong with this. Anybody notice what else is
wrong? It's slightly decentered. This lens wasn't quite assembled right, which is
also very common in modern lenses. Less common when lenses were made out
of solid metal like this thing that I have here, more common now that they're very
light plastic because the auto focus has to be able to move these things cheaply
and easily and you don't want big hunks of metal with tight tolerances to do that,
you want loose plastic that's light mass, et cetera.
So here comes the fun part. Right? Even compact cameras actually have an
out-of-focus point spread function that is significant and you can actually
measure the properties. Here's, for example, an Olympus C5050Z. Nothing
particularly spectacular. Kind of an old camera.
Interesting thing, though, you'll notice it happens to have this dark spot in the
middle. Basically what you're seeing is the point spread function is still there, it
still has all these interesting properties, but you're starting to see some of the
wave-like interference. So basically a little bit more complicated structure there,
but still a structure that we can actually recognize and understand.
All right. Bokeh, the wonderful little generic ethereal term for the out-of-focus
properties of an image. Everyone wants beautiful Bokeh. It's a
Japanese-derived word, and you'll hear people talking about this as though
there's some Oriental mystique about how you have good Bokeh.
Basically forget that. It's really just a matter of what the point spread functions
look like. It's really just a matter of what the out-of-focus point spread function's
structure is.
Good Bokeh basically comes from Gaussian-blur-like point spread functions. If you have an
out-of-focus point spread function that has that nice blur, you get beautiful Bokeh.
If you have something that looks more like what I was showing you for the mirror
lens where you have a very sharp edge and it's got a bright line to it, you get the
worst of all which is what's called Nisen Bokeh or broken Bokeh. Basically
double line artifacts.
So let's take a look at a Bokeh legend. The old SMC Takumar 50 millimeter
f/1.4. This is, again, another one of those 50-year-old lenses but a very nice lens in
terms of the out-of-focus smoothness. And if you take a look at why, you'll notice
the edge is not very bright and the center is relatively bright. This is pretty close to
being a Gaussian blur. So this is one of those reasons why people have been
seeking out this particular lens because it happens to generate this nice pattern.
By the way, you'll notice it's not really symmetric. Anyone know why?
The reason it's not symmetric is because -- it's not a decentering problem or
anything like that. It's because this was actually mounted on a camera. And you
go what? Well, the camera has a mirror box, and it has a relatively shiny sensor
in it, and what you're seeing here is this is actually getting a reflection from the
sensor that's causing it to blur out a little bit that way. So you can actually tell the
orientation of the sensor by this too.
Yeah?
>>: So this has no back coating at all?
>> Hank Dietz: When you're talking about 50-year-old lenses, yeah, they didn't
expect you to have a shiny sensor behind that. And between you and me, it's not
even the back coatings that are the big issue. The big issue is a lot of old lenses
will have very flat surfaces in them, and the very flat surfaces, even if they're
coated, are just a bad piece of news for that. Right?
By the way -- and I feel very much confirmed by this -- that's also why I have never
liked using just plain glass filters in front of lenses.
So I'm always going commando on all my equipment, right? Because I don't
want the extra reflections because a flat piece of glass is going to do all sorts of
evil things.
So the long and the short of it is, yeah, that is definitely generating a pretty
pattern. It looks more like a Gaussian blur, but it still is very easily recognizable.
It still is a very specific pattern.
Remember I was telling you before that the dust and so forth is very clear? Well,
this is a lens that I got on eBay, as most of these, and -- ever bought a lens on
eBay? Yeah? I've bought like 100 lenses on eBay, average maybe 40 years old
each, and, yeah, I've -- I haven't seen too many with fungus, thank God, but
basically here's the fun part.
This looks awful, right? This looks like this is -- this looks like somebody just,
like, ran it through, like, dead meat that was laying on the street or something.
Just terrible fungus infection. The interesting thing about this is that fungus is not
visible with the naked eye unless you actually take a bright light and go looking
for it.
This is a very, very, very, very minor fungal infection in the lens. The reason it's
so obvious is because basically you're looking at the interference pattern for it,
not the actual fungus. So this makes it a really great thing for detecting specific
characters of dust, defects, et cetera, in the lens because they're amplified.
They're very easy to spot.
In fact, I had a bunch of people after -- I actually returned this lens along with a
copy of the point spread function and I got a very embarrassed email back from
the person I bought it from, oh, gosh, I never realized it was that bad. I'm sorry.
Gee, everyone should be required to ship the point spread function with their lens
when they're doing it on eBay. So, at any rate, the long and the short of it is
clearly we can recognize this.
Now, here comes a bit of a surprise. Remember I was saying that optical
formula actually makes this a little bit different? Well, it turns out not only can
you recognize the differences consistently for different optical formulas but you
can also recognize differences between particular copies of the same lens even if
there isn't any dirt in these. These are pretty much brand new Sony 18 to 70s.
There's not really any dust in here. But I think it's fairly easy to distinguish these
two, mostly because one of them has a decentering problem. Actually, they're
both slightly decentered, but I think it's very clear which one is a little bit more
decentered. Right?
So another thing that people talk about, and especially, as I said before, I was
doing a lot of work with some colleagues, especially Larry Hassbrook [phonetic],
who is very well known for structured light 3D capture, things like that. One of the
big things that people always talk about as the advantage for structured light is,
well, you have these ambiguities whenever you're trying to detect
depth by looking at blur and so forth. And it's ambiguous before and after the
focus point.
Well, actually, it's not. It turns out for most lenses, it's not ambiguous at all. The
point spread function, the out-of-focus point spread function, is essentially kind of
turned inside out when you go to the other side of the focus range.
So, for example, if you had blue fringe on the image points that were past the
focus point you're going to have basically yellow-ish fringe when you're on the
other side of the focus point. This allows you to actually disambiguate.
Things in front block the point spread function of things behind. This sounds so
obvious, but I haven't seen anybody else actually making use of this. Maybe I
just haven't found it, but I haven't seen it.
If you have a big, fat circle that's coming from some particular out-of-focus point
and you put some object in front of part of that, what you'll see is the sharp
outline of that object will cut into the point spread function so you can actually tell
that, gee, this thing's in front, right?
And I'll actually show you a little example of that, an image, in just a few
moments.
Now, when things are almost in focus, okay, all bets are off. That's the normal
point spread function stuff that people talk about, and there are all sorts of
issues. Different colors focus at different depths, so that kind of messes it up a
little bit. There's this inversion of the pattern that I was talking about before and
after the focus points. When you're near focus, some things -- some portions of
the pattern invert before others so you get all sorts of weird structural changes.
And there are also these wave properties when you start looking at things that
are very small points. So basically things don't really just add when you look at
the focal plane.
All right. So here's the classic thing that people talk about as the big difference
between before and after the focus point. If I have a lens with spherical
aberration, and most lenses have some amount of spherical aberration, basically
if I over correct it versus undercorrecting it, whether I'm too near or too far from
the focus point, you can see that what I get is I get a very different pattern.
Basically I get these sharp, bright edges if I undercorrect and I'm near, and I get
a nice smooth thing if I overcorrect and I'm near. And in far I've got the opposite
thing.
So essentially what you have is you can actually tell the difference between the
near and far just by looking at, again, how the structure has actually changed.
So spherical aberration is one of the ways that kind of inverses as you go before
or after the focus point. And I will say I got this image from Wikipedia, which we
all know is the ultimate source of knowledge, right? But it's kind of right anyway
except for all the wave interference.
Yes?
>>: If you want to use this to do before and after [inaudible] is this something
that has to be calibrated per camera or is there [inaudible]?
>> Hank Dietz: So the quick answer is what I'm actually advocating is having it
calibrated not just per camera but actually having it calibrated at the point of
recognizing exactly what it is for that lens at that aperture setting at
approximately that focus distance at each point in the entire image, which I know
sounds horrifically complicated, but as I'll tell you later, I use a GA for that.
So it's not a calibration procedure. It's more -- basically the way that I'm
analyzing these things is by trying to create them rather than trying to match
them. All right?
Another thing you have: axial chromatic aberrations. Remember I was talking
before about blue shift becoming yellow? Well, that nice little bird was not sitting
on a tree where I painted some of the branches blue and some yellow. If you take
a look at the blue and the yellow, that's basically the before and after focus point
issue. It's essentially chromatic aberrations. Very common on most fast lenses.
So, again, I think it's fairly easy to unambiguously say whether you were before
or after the focus point because it's just a matter of, well, which color is it, right?
Okay. Cat's eye/swirl vignetting. A lot of people, again, in the world of talking
about Bokeh, this is one of those magical features. Oh, I take this picture and
there's this swirly pattern behind it. It's really impressive. What that really comes
from is basically artificial vignetting. Essentially what's happening is near the
center of the frame I've got a nice circular spread function. Out here, that ain't a
circle anymore, is it? And the reason is it's basically a circle that's had an edge
clipped off because the lens is not infinitely thin. It actually
has a thickness. So essentially you're internally vignetting some of the rays.
You're basically clipping the structure.
So that's the primary thing I have on this slide, but let me just point out --
remember I said before I'd show you another slide? Notice this one? That one's
kind of interesting. Because what you see is you see this nice big blurred spread
function, and look at how it's clipped there. It actually is clipped by the fact that
the edge of this chair rung happens to be in the path, so it's giving me a nice
sharp edge on that even though the chair rung is out of focus too. Kind of an
interesting property, right?
I'm not going to pursue this too much right now, but let me say you can kind of
almost look behind things using this. Just not very much behind them.
Okay. Computational photography using point spread functions. So what did I
do? Well, after playing around with this stuff and convincing myself that, well,
gee, there really is a lot that I don't know about lenses and optics and I'd like to
find out more, I basically went out, as I said, bought basically about 100 lenses
on eBay and essentially went and characterized them. These are the point
spread functions and all that.
And this is what I've basically been able to do with my nice little database of 100
lenses. So one thing is that you can do depth-from-focus or defocus much more
easily. I also can do refocus or converting things to all-in-focus very easily
because, again, I have a pretty good model of point spread function.
I can diagnose lens defects like contamination or fabrication flaws like
decentering, for example. You can forensically identify the type of lens. This is a
little bit of a surprise. If you look in the field of forensics for cameras, they're usually
looking at things like sensor pattern noise, issues having to do with the, oh, for
example, [inaudible] information, JPEG tagging, things like that. I haven't really
seen people looking at the out-of-focus portions to try to recognize structure
there.
You can also identify the specific lens because, as I said, again, dust, dirt, what
have you, or even aspherical anomalies can actually mark the thing.
Point spread function substitution. Basically, in general, if I want to, I can always
change the point spread function to something else. So you want a Gaussian
blur? Hey, no real lens is going to generate a Gaussian blur. I can make it a
Gaussian blur. I recognize this and I go and replace it with my nice blur.
I can also do things with structured apertures and apodization, and that's part of
what I'm going to be talking about today.
But before I get into some of the additional theory that I have to go to to the main
thing that I'm talking about today, let me just take a little break and give you some
low-hanging fruit.
How am I doing time-wise here?
>> Andy Wilson: You're doing fine.
>> Hank Dietz: Okay. Good.
So it turns out Minolta and now Sony are known for this thing called the STF lens,
the Smooth Trans Focus lens. It's a 135 millimeter f/2.8, which is a T4.5. It's
basically not as transmissive as a normal lens. T-stops are basically an F-stop
adjusted by how much light is actually being passed through, what the
transmissivity is.
And the reason that everybody loves this lens, it's absolutely the Bokeh king, is
because when you take a look at the lens, it actually has this very funny little
element in it. If you look at this, this is basically a flat piece of glass. And, in fact,
this piece and this piece are made of exactly the same optical index, et cetera. It
has exactly the same properties.
The difference? This one is kind of smoke-colored. It's basically gray. Since
that is now machined in a very precise way as this spherical curve, what you
really get is essentially there's less of the gray in the middle, there's more here,
you get this beautiful smooth Bokeh. Very cool idea, right?
So Minolta did this. It's still available. I think it's about $1200 to buy this lens.
It's a cool lens. But now here's the fun part.
Just before Minolta went out of the film/camera business they produced this
camera, the Maxxum 7. And basically this was kind of the ultimate achievement for
Minolta. This was their highest-end camera in terms of the sophistication of the
control logic and such in the camera.
And it turns out among the custom modes in this camera, custom mode 25-2 --
yeah, that's not buried somewhere, right -- so custom mode 25-2 -- and, yeah,
you really had to go looking for this to find it. It wasn't something they really
advertised. It turns out what it does is it fakes the smooth trans focus
mechanism by taking multiple exposures varying the aperture.
So essentially, if you think about it, if I have the aperture
relatively small for a relatively long fraction of the exposure, that's going to make
a bright center; then opening up a little adds a little bit of a diffuse larger circle around
that, then a little bit more, and such. So basically by taking something like seven
multiple exposures, they've constructed something that looked kind of like a
Gaussian blur point spread function.
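The synthesized PSF is easy to model: each sub-exposure contributes a uniform disk whose radius follows the aperture and whose weight is that sub-exposure's share of the total light, and summing them gives a stepped approximation to a smooth falloff. A rough sketch of that summation (my own illustration; the radii and weights below are made up, not Minolta's or any CHDK settings):

    import numpy as np

    def disk(radius, size=65):
        y, x = np.mgrid[:size, :size] - size // 2
        d = (x * x + y * y <= radius * radius).astype(float)
        return d / d.sum()

    def stf_psf(radii, weights, size=65):
        # Sum of disk PSFs, one per sub-exposure: the small-radius,
        # heavily weighted disks keep the center bright, the larger,
        # lighter ones add a fading halo, approximating a smooth
        # (apodized) out-of-focus spot.
        w = np.asarray(weights, float)
        w = w / w.sum()
        return sum(wi * disk(r, size) for r, wi in zip(radii, w))

    # e.g. seven exposures, aperture opening up as the light share drops:
    psf = stf_psf(radii=[6, 10, 14, 18, 22, 26, 30],
                  weights=[0.30, 0.22, 0.16, 0.12, 0.09, 0.07, 0.04])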
Actually, it wasn't really so much Gaussian blur, but here's the fun thing. Oh, I
guess I should say one more thing about that. One of the executives at Minolta
was asked at some point exactly why they don't do this in any of their digital
cameras. And he had a very nice little statement that I should kind of frame that
basically said, oh, because it doesn't apply to digital cameras because they don't
do multiple exposures. Good. Since he said that, now I'm free to do it for digital,
right?
So I took the good old CHDK, right, and on an A640, which is actually -- it's not
this one, but, you know, just to show, a little PowerShot like this. Basically stuck
that in, and here's the native point spread function at f/4, and basically by doing a
controlled varying of the aperture during exposure I get this with a Gaussian blur
approximation. And I think we can agree that's kind of miserable, right?
Gaussian blur is too hot in the middle.
And this is basically closer to the spherical pattern that is actually what they did in
the STF lens. So believe it or not, the spherical actually tends to look better than
the Gaussian that everybody thinks it should look like.
All right. Well, that was cool, right? So we've got some low-hanging fruit already.
And keep in mind, you can do that on just about any camera where you can control
the aperture. And it's left to the reader to find out if Minolta is going to sue people
if they do that.
>>: What was the low-hanging fruit there? The fact that you can go [inaudible]?
>> Hank Dietz: The fruit is that I can basically get myself a nice, smooth,
out-of-focus blur pattern with lenses that don't naturally create that by
dynamically varying the aperture while I'm taking my pictures.
>>: Thanks.
>> Hank Dietz: Which is a huge thing.
Okay. So -- well, let's put it another way. It's a huge thing to photographers. I'm
not sure it's a huge thing to everybody, but there's at least a sizable
market that cares about that.
So let's talk about apertures a little bit more. Okay. The aperture of a lens
represents the area through which the light is admitted by the lens, and I'm not
going to get into the admitted -- excuse me, the point at which the lens has its
admission and so forth. We don't need to talk about those details.
For a circular opening, the F number is basically the focal length divided by the
diameter of the circular opening. If it's not quite circular, you really should adjust
this, but people usually don't.
Light is admitted in proportion to 1 over the F number squared. So basically if we
start at 1, .7 is going to give you twice as much light, 1.4 is half as much, 2 is half
as much as that, again, and so forth.
The effective F number for a given transmission is the T number. So you just
basically divide the F number by the square root of the transmissivity, which is a
fraction.
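Spelled out as arithmetic (the ~39% transmission below is just a number I picked so that the f/2.8 STF example above works out to roughly T4.5):

    def relative_light(f_number):
        # Light admitted scales as 1 / N^2: f/1 -> 1.0, f/0.7 -> ~2.0,
        # f/1.4 -> ~0.5, f/2 -> 0.25, and so on.
        return 1.0 / f_number ** 2

    def t_number(f_number, transmittance):
        # T-stop: the F number adjusted for the fraction of light the
        # glass actually passes, T = N / sqrt(transmittance).
        return f_number / transmittance ** 0.5

    print(t_number(2.8, 0.39))   # ~4.48, i.e. roughly the T4.5 quoted above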
So here's an aperture on a nice old lens, 20 blades. You don't see them like this
anymore. Why not? Because if you try to open and close that fast, there's way
too much friction. Can't really do that with wimpy little motors that you're using
now.
On the other hand, when it's a thing you're turning like this, no problem. Right?
Okay. Aperture can be implemented by the lens barrel. That's what you get
when the lens is wide open. Nice circular lens barrel, very easy to make.
Movable blades. An iris. That's what most of us are used to. And it may or may
not be circular, but, you know, it's got the familiar hexagonal or pentagonal or
whatever shapes.
It can also be done with a Waterhouse stop, which is basically just a hole in a
plate. Right?
It could also be done with liquid crystal or other controllable opacity materials.
So you can basically make yourself a variable aperture that way.
The aperture cannot be placed arbitrarily. Basically if you put it in the wrong
place it vignettes rather than becoming an aperture.
The correct placement for an aperture makes it form a stop. Now, I'm not going
to go get into the details of how you figure out where that is. Let me just say,
though, that there's more than one place that is okay in most lenses to have the
aperture appear. It's not just that one spot in the middle.
And in fact, traditionally it wasn't in the middle. It was very often in the front
instead.
Usually there's more than one place that it can go, as I said, depending upon the
optical formula of the lens. In front of the lens is really great because I don't have
to open up the lens. In other words, I can make my own aperture without actually
having to modify the camera or lens in any way. Very nice feature. It means
users can add and remove the modification harmlessly on any camera.
Inside the lens. This is really what you see in most cameras now, and it works
great in a lot of ways, but, frankly, this is really awkward for us to get at. Right?
This is not someplace that you really want to be messing around with.
Behind the lens. A few camera bodies did this so that they didn't have to have
an aperture in every lens, they could just have it in the body. This is kind of
problematic in some ways, but suffice it to say it can work too.
Where you put the aperture does change the optical aberrations that you see on
most lenses, and I'm not going to get into whether it's a really good thing or a bad
thing to place it in different spots. Just suffice it to say at least you can and it will
kind of work in multiple spots.
To be a stop, the aperture effectively must be no larger than any other stop. So if
you have multiple apertures that you're trying to treat as stops, whichever one is
the smallest is basically going to act as the real aperture.
Now, vignetting. Remember when we talked before about the little cat's eye
stuff? There we go. How do you get the cat's eye vignetting? Take a lens, and
if you look through the lens sideways, what you see is no longer a circle, right?
Well, when you're projecting out to the sides, guess what? It's like looking through the
lens sideways. That's why it vignettes. Because essentially it's not a lens with
an infinitely thin material, it's a lens that has thickness and you're bumping into
other portions of that thickness that are blocking the light. All right?
This is a really bad thing. Artificial vignetting for the stuff that I'm talking about
here is a killer. It makes some of the stuff that I'm talking about not work.
On the other hand, for recognizing the actual shapes and the point spread
function in an image, it actually helps on that because it's, again, distinctive. It's
another distinctive feature.
It is annoying, by the way, because that also is a feature that, again, it varies
over all the different points in the image. So you have to actually take a look at
that.
On the other hand, that also means that you can do things like, for example, you
can tell the image has been cropped, right? Because the vignetting will be
different.
Natural vignetting. This is the cosine-fourth fall-off. I really don't care. It's bad. It
causes the corners to be dark. It's a bigger problem with wide-angle lenses,
especially non-inverted-telephoto type things, non-retrofocal lenses, but the
long and the short of it is this is there, you live with it, no big deal.
Mechanical vignetting. This is the one that everyone knows is bad, right? You
decide, gee, wouldn't it be fun if I put this filter on the front of my lens, ah,
let's put another five filters there, and the next thing you know you're doing
something that looks like you're looking through a little keyhole, right? Because
everything is all shadowed on the edges.
Basically mechanical vignetting you want to stay away from too, but it's kind of a
matter of common sense. Most people are not going to accidentally have
mechanical vignetting and leave it there. It's very easy to recognize that. The
artificial vignetting is the killer.
Okay. So remember before I said that the point spread function gets larger as
we get more out of focus? Why is that?
Well, the point spread function echoes the shape of the aperture. It's clipped by
the shape of the aperture. That's a really important point.
So when a point is in focus, all the rays from that point pass through the lens and
end at the same spot on the film or sensor. But when it's not in focus, they
don't really land at the same spot. That's why the point spread function gets
larger.
So out of focus, the rays are going to different positions, and what we're really
doing is we're actually separating the different views through that single lens.
So let's take a look at it here. So here's some object far away that we're focusing
on, and basically if it's in focus, everything's all at one point. If it's not, we're
going to get a bigger image because we're basically separating out the views
from the different points in the lens.
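If you want a number for how big that spot gets, simple thin-lens geometry gives it directly. A sketch, with all distances in millimeters and the usual thin-lens idealization (ignoring pupil magnification and the wave effects mentioned earlier):

    def blur_circle_mm(focal_mm, f_number, focus_mm, subject_mm):
        # Diameter of the out-of-focus spot on the sensor for a point at
        # subject_mm when the lens is focused at focus_mm:
        #   c = A * |s2 - s1| / s2 * f / (s1 - f),  with A = f / N.
        aperture = focal_mm / f_number
        return (aperture * abs(subject_mm - focus_mm) / subject_mm
                * focal_mm / (focus_mm - focal_mm))

    # The 135mm f/2.5 focused at 3 m, point light at 10 m: ~1.8 mm spot.
    print(blur_circle_mm(135, 2.5, 3000, 10000))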
Different points of view is how you do stereo vision. That's how you do
plenoptics. All that information is already there in an ordinary everyday image
that has an out-of-focus region. It's just a matter of can we recover it efficiently.
All right. So the aperture blocks some of the rays. Right? And that's why we
see its shape being imposed on the point spread function.
So if we make that shape easy to recognize, that should help us. Right? And
you'll see that's generally referred to in astronomy as coded apertures, and in the
computational photography field you'll see it usually talked about as that, but it's
also known as structured or shaped apertures. Same concept. It's deliberately
making the aperture have some structure that makes it easy to recognize or
causes some other desirable property. In astronomy it's actually used because if
you shape the aperture properly, you can basically resolve stars that are too
close otherwise.
Okay. So what kinds of aperture shaping has been done in the past? Well,
there's the good old famous Imagon and this Fujinon lens that has this built-in
sink strainer-like thing. These are photos from MF Lenses and from M42.org
because I am not about to buy these lenses. They're way too expensive. But
suffice it to say, what does this really do? Well, these are supposed to be soft-focus
lenses. What they're actually doing is really just superimposing a whole bunch
of images from slightly different points of view.
That's really what we're getting for all of these things in the out-of-focus regions.
And that does actually give a fairly interesting character to it because it gives you
some sharpness structure, but it's not really a sharp sharpness structure. It
basically gives you these very fine artifacts that tend to be more textural rather
than some disturbing bright pattern.
So this is actually somewhat effective, but I have to say personally I think it's a
sort of disappointing effect. I don't think it's really that great a way of doing soft
focus.
The cool thing is, though, by the way, if you do soft focus like this, what happens to
the image in focus? It's still in focus. Right? This only affects the things that are
actually in the out-of-focus region.
Okay. How many of you have seen this kind of thing? There's a whole pile of
instructables talking about this sort of thing. There's all sorts of stuff out there.
And basically what it is is you cut some pretty picture, you know, some shape
into an aperture and basically, voila, your out-of-focus points now take that on
and the in-focus stuff still looks kind of in focus and you can do all these cute little
things that are probably really cool to use for one day and then you never want to
do that again.
Okay. Then there's also the more serious stuff in the computational photography
community where they're doing shaped apertures that look like this, structured
apertures that basically have relatively easy-to-recognize patterns of one kind
or another. And, again, these are images from MIT. They're not from me.
I'll make a public scary statement and see if anyone calls me on it. I think this is
dumb. And the reason I think it's dumb is because they have square corners.
Because the square corners cause it to not behave in that nice, wonderful way of
actually having the light add that basically causes interference patterns. So
when I make mine, I do them with rounded corners.
But almost everything I've seen out there has square corners on everything. So
we'll see if the community out there tells me that I'm an idiot and I'm screwing up
on this.
But, at any rate, the point is you can computationally recognize these. They're
usually doing it in the frequency domain for matching these patterns. It's pretty
straightforward stuff.
Okay. Point spread function substitution. Replacing a bad point spread function
with a good one. Good one being defined as what I want rather than what I got.
Commonly attempted for image refocus, right? And there is that wonderful little
camera that's come out now that is doing plenoptics for refocus. There may or
may not be a market. I guess we'll see soon.
You can improve the image Bokeh by replacing the native PSF with a Gaussian
blur or whatever other function. I would say you probably actually want a
spherical blur instead of a Gaussian.
And you can directly synthesize 3D stereo pairs and enhance their apparent
depth. Sorry about that. More next lecture. I merged slides and I didn't catch
myself doing that.
All right. So why would we bother doing this? Well, soft focus effects, we said.
Cool Bokeh effects? Yeah. Recognizing point spread functions in a captured
image means that we know the depth of the points in the scene. Anything that
you can do with plenoptics you can actually do if you can properly match the
out-of-focus point spread functions, and you get to do it with basically a stock
camera, and you get to do it with higher resolution.
Okay. So how do you recognize these? Well, deconvolution. That's the
standard technique, or at least my impression is that's the standard technique.
Usually in a frequency domain. So you're basically trying to find these patterns,
right?
By the way, that doesn't really work because remember the patterns can be
partially occluded by things that were in front of them and so forth. So there are
all sorts of issues. But at least this gives you a pretty good estimate as a starting
point.
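For reference, the frequency-domain starting point being described looks roughly like Wiener deconvolution: divide the image spectrum by each candidate PSF's spectrum with a small noise term, and see which candidate gives the most plausible result. A sketch, assuming img and psf are same-shaped grayscale arrays with the PSF centered:

    import numpy as np

    def wiener_deconvolve(img, psf, k=1e-2):
        # Frequency-domain deconvolution with a simple noise constant k;
        # a candidate PSF close to the true blur yields the least-ringing
        # reconstruction, which is one way to score candidates.
        otf = np.fft.fft2(np.fft.ifftshift(psf / psf.sum()))
        filt = np.conj(otf) / (np.abs(otf) ** 2 + k)
        return np.real(np.fft.ifft2(np.fft.fft2(img) * filt))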
What I'm actually doing is trying to generate them. So I'm a big
fan of genetic algorithms.
Let me put this a slightly different way. I'm a super-computing weenie. I have
two machine rooms full of super-computing hardware and they're just sitting there
waiting to do whatever I feel like doing. I have no problem with running really
expensive GAs and so forth to just try things out, see how well I can make it
work.
So basically what I've been doing is instead of trying to recognize these
structures, what I've been doing is I've been trying to use a genetic algorithm to
essentially evolve the pattern that, when I go through the transmission, generates
the same image.
So basically I'm doing a search for the parameters for each individual pixel. And
you can see this is really a much more powerful technique because any weird
things that happen spatially over the frame or any other kinds of issues can be
accurately modeled this way, whereas trying to get a closed-form solution thing,
forget it. You're not going to be able to model these things.
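Just to make the shape of that search concrete, here is a bare-bones version of the generate-and-compare loop (a toy evolutionary search over two made-up PSF parameters for a single patch; the real thing is richer, per-pixel, and is what the supercomputers are for):

    import numpy as np

    rng = np.random.default_rng(0)

    def render_psf(radius, rim_gain, size=33):
        # Toy parametric PSF: a uniform disk with an adjustable bright rim.
        y, x = np.mgrid[:size, :size] - size // 2
        r = np.hypot(x, y)
        psf = (r <= radius).astype(float)
        psf += rim_gain * ((r <= radius) & (r >= radius - 1.5))
        return psf / psf.sum()

    def evolve_psf(observed, pop=60, gens=200):
        # observed: a 33x33 patch containing one out-of-focus point.
        # Keep the parameter sets whose rendered PSF best matches the
        # observed patch, then mutate them and repeat.
        genomes = np.column_stack([rng.uniform(2, 15, pop),   # radius
                                   rng.uniform(0, 2, pop)])   # rim gain
        for _ in range(gens):
            errs = np.array([np.sum((render_psf(*g) - observed) ** 2)
                             for g in genomes])
            elite = genomes[np.argsort(errs)[:pop // 4]]
            kids = elite[rng.integers(0, len(elite), pop - len(elite))]
            kids = kids + rng.normal(0.0, [0.3, 0.05], kids.shape)
            genomes = np.vstack([elite, np.clip(kids, [1, 0], [16, 3])])
        return genomes[0]   # best parameters found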
I can still use this, by the way, as a starting point to speed this up. Why did I start
doing it this way? Well, there's a guy named Jim Allenbach [phonetic] back at
Purdue University, and I was faculty at Purdue for 13 years before I went to UK
11 years ago. And basically I was helping him with some diffractive optic design
stuff using direct binary search, and essentially he was doing better diffractive
optic design by doing something very similar to this, essentially trying to reverse
the direction of the design process. And so that's why I thought, hey, this is a
reasonable thing to try. And for the record, yeah, it's a reasonable thing to try,
and right now it's really, really, really, really expensive, but I don't care because
I've got supercomputers.
You can also use spectral color coding on the point spread function. That makes
it easy. I can recognize colors. The only problem with this -- and I thought I was
so clever at coming up with this -- is that it was actually patented in 1973. Don't
you hate it when you find things like that? Of course, 1973 is a good year for it to
have been patented in because that's gone now, right? But basically Songer
[phonetic] had this wonderful little patent on doing single-shot anaglyphs with a
conventional lens that was modified only in that it had a funny looking aperture
stuck in the middle of it.
And the funny looking aperture -- he was very specific about exactly what the
shapes were. This was his preferred shape. Oh, well. Didn't get everything
right. Then these were the other choices.
We got one good shape there, right? If you take a look at these, these don't
really line up very well on top of each other because, remember, if you're trying to
fuse them for stereo vision, these shapes are going to be superimposed.
So when you look at the images generated from something like this, they look
kind of blurry and yucky, and it's not a big surprise, but basically this was used in
the old Vivitar Qdos lens. How many of you have heard of that lens? Yeah.
Very famous lens. It was not in production for very long. They didn't sell a whole
lot of them. And mainly the reason they didn't sell them is because if you take a
look at the images that it generates, they look really kind of sad. They're very
blurry.
But the reason is really because they had a very fancy internal filter like this that was
split in two parts -- actually, it's split in four parts -- and you turn it and it
would come together kind of like an iris to form this dual filter. Very complex
mechanical mechanism. Works very well mechanically, but optically it kind of
mucks it up.
So here's what I do. It turns out there are problems with the color choice and
with all sorts of other issues about this. So I basically went, okay, I'm interested
in this as a digital computing system. I'm not interested in this as just, you know,
some optical thing. I don't care whether I'm generating an image to be viewed as
an anaglyph or not. I'm trying to use anaglyphs as a way to capture information
that I can reprocess, like you reprocess a plenoptic image, for example, right?
So what I'm doing is basically I'm encoding the left and right views by color. And,
in fact, I'm using green and magenta. And it's very important that I'm using green
and magenta. It doesn't work with red and cyan, which is kind of weird, but, you
know -- I'll even give you the little sneak preview.
Think about what the Bayer color filters look like. Very bad choice to have red
singled out. Also think about how JPEG compression works. Really, really bad
idea to try and separate red and blue on different channels. Because JPEG
screws that up.
So conversion from an anaglyph to a full color stereo pair, it turns out, does
not require stereo matching. Let me say that again.
You would expect that if I was trying to take this one image that basically had the
left and right views color coded and try and separate that out into two full color
images, I would have to do stereo matching to figure out what colors each thing
was. Turns out that's not necessary, and in fact, when I tried it, that wasn't even
the technique that worked best.
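The crudest version of the separation is literally a channel split, since the green channel only saw the scene through one half of the stop and red plus blue only through the other. A minimal sketch (which half is left versus right depends on how you orient the stop, and recovering full color per side, the actual trick, is the part not shown here):

    import numpy as np

    def split_green_magenta(img):
        # img: H x W x 3 RGB frame shot through a green/magenta split stop.
        # The green channel is the view through one half of the aperture,
        # red+blue the view through the other, so a rough per-eye
        # separation is just a channel split (grayscale views).
        green_view = img[..., 1].astype(float)
        magenta_view = 0.5 * (img[..., 0].astype(float) + img[..., 2].astype(float))
        return green_view, magenta_view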
Of course, because I can actually put my aperture anywhere so long as I'm
careful about it -- well, I shouldn't say anywhere, but I can put it in any of multiple
places, including in front of the lens, guess what? I get to do this with a nice
user-addable, user-removable stop in front of the lens. So I can do this with just about any
camera.
All right. So what are the big issues? Well, the color choices. I gave you a little
bit of a hint about that. It's not about how it looks when you view it as an
anaglyph. I don't care what it looks like when you view it as an anaglyph other
than it's kind of nice to be able to check it out by just putting on the funny
glasses, right?
And, by the way, I do have a box of the funny glasses that I can pass
around and show you. I have some anaglyph images here.
You want to balance the average value for the pixels. Why? Most cameras have
the kind of bold assumption in them that all the pixels are seeing about the same
amount of light even though they're filtered to basically get different portions of
the spectrum.
So if you screw that up, basically what you end up with is higher noise levels on
certain color channels and it messes things up.
Color isolation. As I said, the Bayer matrix and JPEG encoding both say you do
not want to be using red and cyan. They mess that up horrifically.
So it turns out green and magenta happen to have a nice property that way
because green is basically your luminance channel and it's pretty much
preserved independent of the color information, which is basically the magenta
pair of channels.
You also want the same pixel count per spectral band for the left and right sides.
Guess what? Bayer filter has two greens for every red and blue. So we balance
them out again.
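As a rough illustration of the balancing idea -- a toy sketch only, assuming a defocused calibration shot of a neutral target, not the speaker's actual calibration procedure:

import numpy as np

def channel_balance_gains(flat_frame):
    """Estimate gains from a defocused shot of a neutral target taken through
    the green/magenta stop (hypothetical calibration frame).

    flat_frame: HxWx3 float array in linear units.
    The gains equalize the mean signal of the green side (G) and the magenta
    side (R+B), which is the 'balance the averages' point from the talk.
    The factor of 2 reflects the RGGB Bayer layout having two green sites
    for every red and every blue one.
    """
    mean_r = flat_frame[..., 0].mean()
    mean_g = flat_frame[..., 1].mean()
    mean_b = flat_frame[..., 2].mean()

    green_side = 2.0 * mean_g          # two G sites per RGGB cell
    magenta_side = mean_r + mean_b     # one R + one B site per cell
    target = 0.5 * (green_side + magenta_side)

    return {
        "green_gain": target / green_side,
        "magenta_gain": target / magenta_side,
    }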
Point spread function shaping and location of the stop. Well, locating the stop, I
actually have a little web tool that I've put up for that called anaperture that, given
the parameters for the lens and so forth, designs what the stop is, tells you how
to do that, and basically generates an image that you can then feed to a laser
cutter or to a paper cutter -- which is what I use
because I have no budget -- and it will cut these stops out for
you nice and precisely.
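For illustration only -- this is not the anaperture tool, and the two-hole layout and all the parameters below are assumptions -- here is a sketch of the kind of geometry such a tool has to work out, emitting an SVG outline that a craft or laser cutter could take:

def stop_svg(thread_diameter_mm, focal_length_mm, f_number, separation_mm):
    """Emit a toy SVG outline for a two-hole front stop.

    Each hole diameter comes from focal_length / f_number, and the two holes
    sit separation_mm apart, centered inside a disc sized to drop into the
    filter thread. One hole would be covered with green filter film, the
    other with magenta. All values are things you would pick for your own
    lens; the real tool also accounts for vignetting and point spread shape.
    """
    hole_d = focal_length_mm / f_number
    if separation_mm + hole_d > thread_diameter_mm:
        raise ValueError("holes do not fit inside the filter thread")
    r_outer = thread_diameter_mm / 2.0
    r_hole = hole_d / 2.0
    cx = cy = thread_diameter_mm / 2.0
    dx = separation_mm / 2.0
    return f"""<svg xmlns="http://www.w3.org/2000/svg"
     width="{thread_diameter_mm}mm" height="{thread_diameter_mm}mm"
     viewBox="0 0 {thread_diameter_mm} {thread_diameter_mm}">
  <circle cx="{cx}" cy="{cy}" r="{r_outer}" fill="none" stroke="black" stroke-width="0.2"/>
  <circle cx="{cx - dx}" cy="{cy}" r="{r_hole}" fill="none" stroke="black" stroke-width="0.2"/>
  <circle cx="{cx + dx}" cy="{cy}" r="{r_hole}" fill="none" stroke="black" stroke-width="0.2"/>
</svg>"""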
The shaping -- where was I here? The shaping, of course: if you actually go through
and analyze it, you really want the shape to be the same on the left and
right sides, and essentially a circle is really good.
So circular shapes seem to be the winner here, especially if you
ever want to view it as an anaglyph straight out of the camera, or perhaps
even on the display on the back of the camera.
Vignetting. Artificial, natural, and mechanical. As I said before, the natural and
mechanical ones, natural I don't really care much. Mechanical, you definitely
don't want to ever have problems with that. Artificial actually causes big
problems. Why? Because if part of the view at an angle is clipped, guess what?
That says I don't have the same depth on the angular measurements because
it's basically saying I don't have those views there. All right? It's literally
removing some of the views.
So I only really get good stereo information from the portion of the lens's aperture
that isn't clipped. That's actually part of what's incorporated in figuring out the
shape, sizing, and placement in the anaperture tool.
Depth ambiguity versus the aperture width. Basically, any aperture is going to
contain different views. The wider the
aperture, the more different views I have in there, and the more ambiguity in the depth.
Same problem, right? Because an aperture inside of an aperture is still an
aperture. It still has the same properties.
Depth of focus. People usually like to see things in stereo views with very large
depth of focus. So we have a little bit of an issue there. Depending upon how
we choose the aperture sizes, et cetera, you can have different amounts of depth
of focus. Although, of course, since you're recognizing the patterns anyway, you
can always substitute a different point spread function.
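A sketch of the standard thin-lens geometry behind both points -- this is textbook geometry, not the speaker's model, and the numbers at the bottom are made up: the blur-disc diameter grows with the full aperture width, and the left/right shift between the color-coded views grows the same way with the sub-aperture separation.

def image_distance(f_mm, subject_mm):
    """Thin-lens image distance for a subject at subject_mm (both in mm)."""
    return f_mm * subject_mm / (subject_mm - f_mm)

def defocus_and_disparity(f_mm, aperture_mm, baseline_mm, focus_mm, object_mm):
    """Blur-disc diameter and left/right image shift on the sensor.

    aperture_mm is the full width of the opening, baseline_mm is the
    center-to-center separation of the two color-coded sub-apertures.
    Both scale with the same defocus factor, which is why a wider opening
    buys depth discrimination at the cost of depth of focus.
    """
    v_focus = image_distance(f_mm, focus_mm)    # where the sensor sits
    v_obj = image_distance(f_mm, object_mm)     # where this object focuses
    defocus = abs(v_obj - v_focus) / v_obj
    return aperture_mm * defocus, baseline_mm * defocus

# Example with made-up numbers: 50 mm lens focused at 1.5 m, subject at 3 m,
# 20 mm wide sub-apertures spaced 15 mm apart.
blur, disp = defocus_and_disparity(50.0, 20.0, 15.0, 1500.0, 3000.0)
print(f"blur disc ~{blur:.2f} mm, left/right shift ~{disp:.2f} mm on sensor")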
Bokeh shape issues. Basically, again, if you're viewing it, you want things that
line up naturally.
Okay. So basically here's what I did. As I say, here's the tool. It does look like
this. And to be very physical about it -- did that get shuffled just enough to lose
it?
Well, I will have to dig around a little bit. But I actually have a couple of the
apertures that I cut here. But they look like this. They're basically just a simple
circle with a little tab on it so that you can literally stick that inside of the filter
thread. And I'm just cutting them out of black cardboard. There's nothing really
tricky here.
And what are these? Basically you take a pair of these glasses and you cut out
the film from them and you take a little Scotch Tape and you basically -- and I
know you're going, ugh! Yeah, I know. Low budget, okay? If you want to build a
real filter that does this, by all means, do. I did too, but, you know, I only did, like,
one or two of them that are real filters like that whereas I can do these without
even trying. Very cheap. Costs less than a buck to make these filters.
Okay. So this is really accessible, right? So last year I figured, okay, I haven't
really got a publication record in doing computational photography stuff, and, you
know, as the phrase goes, I'm tenured, so, you know, I'm established in another
field. I've got something like 200 publications in high-performance computing,
that kind of thing. So I figured, okay, well, let's try and get this thing out and see
whether the public thinks this is a big deal, see if there's really interest in this.
So I wrote an instructable telling people how to do 3D anaglyphs as a single shot
this way. And as you can see, it's had about 20,000 readers. And, more
importantly, there are about 7,000 people who have used the design tool to
design the apertures for their lenses. So this tells me that people are actually
trying it.
So at this point, time for me to throw out some of the funny-looking glasses,
right? And for anybody who is conveniently hidden behind a computer monitor
watching this online, they get to do this without looking funny in public. Assuming
they have the glasses, right?
Of course, I have my special one that has the Linux penguin in the middle.
>> Andy Wilson: Do you have one more?
>> Hank Dietz: I have plenty more, yes. Have some more. Because we must
have the entire audience looking funny, right?
Now, I will say I did not really clean up any of these images. These are basically
raw from the camera. And, remember, there is some internal vignetting, some
artificial vignetting, that causes some defects on these. That is kind of a small
image, so, here, here's a big one.
And this was shot, again, with an unmodified camera, just sticking that stupid
filter in front. I haven't even reprocessed this. This is just the raw stuff.
Here's another one. I think you can agree, this is really pretty good depth from a
camera that hasn't been modified in any permanent way, and the total cost of the
modification is well under a dollar.
>>: Which camera did you say it is?
>> Hank Dietz: The majority of the stuff that I've been shooting has been shot
with either a Sony A350, a Sony A55, or a Sony NEX-5.
>>: Are these point-and-shoot class cameras?
>> Hank Dietz: No. So these are -- the stuff that I'm usually using is like this. So
basically a little bit higher-end camera. It does work for point-and-shoots and it
also works for things like camcorders, but the precision with which you need to
make the apertures gets touchy. So for something that's being homemade and
cut on a paper cutter, a craft cutter -- actually, I'm cutting them on a Craft ROBO, if
you've ever heard of that particular model; it's about a $200 paper cutter -- it really
pushes the technology to be doing it at those finer sizes.
Keep in mind also that the stereo separation is going to be relatively small. So if
you have a point-and-shoot -- here, I'll give you another. Here. If you have a
point-and-shoot camera, if it's got a high enough resolution, you're still fine. But
keep in mind your ability to resolve the depths and to separate the views is going
to be dependent on having a lot of pixels there.
So in some sense we do have that similarity to plenoptics, right? If you want to
get a high resolution light field, you really need a lot of pixels that you're kind of
wasting to get the plenoptic field for each point. We're kind of doing the same
thing here except they're not really completely wasted. We still get relatively high
resolution. The catch is that we've kind of lost some of the color information.
>>: When you say a lot of pixels, I mean, 20 megapixels? What is a lot of
pixels?
>> Hank Dietz: Let's put it this way. It ain't gonna work with a web cam. All
right? If you have something that is typical and little, like the PowerShots or
something like that, your typical hundred-dollar camera, or perhaps even some of
the latest versions of cell phone cameras, you should be fine. If you're at 6 or 8
megapixels, you're probably going to be okay. What it really depends on is the
physical size constraints of how large the lens is and all of that, so it gets into a
lot of the raw optics design. And I guess you're more of an optics person. Can
you kind of say something about that or -- yeah, basically what it really boils
down to is it's very specific to the properties of the optics that you have and the
sensor size and all of that, but you do need a relatively high resolution sensor to
be able to get good information when you have relatively small stereo separation.
And relatively high resolution, you know, 8 megapixel is probably more than you
would need for almost anything.
>>: The effect of color matching with respect to the spectral properties would
presumably impact the clarity, but you probably wouldn't have to do much
more than that in order to get all the information out.
>> Hank Dietz: So, again, yeah, I'm doing a really crappy job here, right?
Because I'm having you look at this on an uncalibrated projector. Everything's
uncalibrated here. But I don't care because, again, I'm not really doing this for
people to look at with these funny glasses, I'm doing this for me to reprocess into
full stereo pairs and things like that.
So I think a good way of thinking of it is if you were shooting plenoptics, not only
would you be stuck with the lower resolution kind of no matter what, but you
would also have the problem that basically if you're doing it with a plenoptic, how
do you view that? Do you just ignore all the other view information? It gets kind
of funny.
Whereas, here there happens to be this convenient way that you can get at least
an approximate view. And I will say, by the way, keep in mind this works on live
view. So it may look funny when I'm taking the pictures, but I can actually view it
that way while I'm taking the pictures too.
It doesn't work so well with an electronic viewfinder, because I can't get two eyes into
an electronic viewfinder. But for live view on the back of a camera, it's fine. And
I'll even give a completely unwarranted, unjustifiable plug: I love the Sony
NEX cameras. So I hope Sony recovers and continues building them.
But, at any rate, may the floods recede.
Okay. Let's go through a few more here. These, by the way, are not really great
images, but I kind of threw together stuff. These are actually some of the older
images. These are mostly the ones that I had when I published the instructable a
year ago.
You can see they're all pretty good. They're pretty convincing.
This I actually shot a few days ago. And I see it conveniently had Seattle-type
weather. These are actually in my office.
Am I allowed to show a Linux penguin? Never mind. Okay. I know I can show
that, right? Because that's a UK thing. So just remember, UK, University of
Kentucky! Woo!
All right. So, again, why am I dealing with anaglyphs? It's not because I really
want to be viewing anaglyphs, although, I admit, it is a cool thing to be able to
look out at an audience that has actually all put on the stupid glasses. What
power I have up here, right?
So why am I doing that? I'm doing it because single-shot anaglyph capture is
easy and it gets me stereo information. It gets me more than stereo information.
There are lots of anaglyph images out there.
Ah. Now here comes the fun little thing. Let me say this very specifically.
I don't care how you got your anaglyph. So, for example, if it's already an
existing anaglyph movie, guess what? Now I can reprocess that. I can do the
whole thing on that too. I don't care whether it was shot through two matched
lenses or if it was shot through pieces of one lens, although I will admit, one lens
is usually fairly well matched to itself. Right? Not as well matched as you might
think because you're looking through edge portions of it, but it's still fairly well
matched. Right?
So this is really a potential win because there are a lot of stereo TVs out there
and such, and there aren't really a whole lot of stereo things that they might want
to watch, but there are a fair number of anaglyphs out there.
So I'm trying to use anaglyphs basically like people use light fields. So I can do
refocusing, I can do point spread function substitutions, I can get full color stereo
pairs. That's the sort of thing that I'm really doing here.
And how far along am I? Well, I'm going to show you some kind of old stuff, but,
you know -- let's take the basic one, getting a stereo pair from this.
So remember I was saying you don't have to do stereo matching to get the stereo
pair full color? Well, one thing that people have tried is just doing blurring and
masking. It's cheap. It doesn't work. You get all sorts of horrible artifacts if you
try and blur and mask to restore the colors from one channel to the other.
Stereo matching, basically you have trouble finding matches. Why? Because
unless you're photographing a gray scene, there are going to be marked
differences in the features that you find in the different color channels. So you
really don't have something to align it on that's all that great.
Modified superpixel or shape matching worked okay. So I have about 15
different ways that I tried of doing superpixels modified to follow shapes, and I
also tried just doing literal shape matching directly.
And this has some promise. This actually in some ways works better, but it's
much more computationally complicated than the stuff that I'm going to be
pointing at having to do with colors.
Point spread function matching in theory could work, but, again, really matching
them is a little bit computationally complex right now. So believe it or not, what
actually worked best was this weird color analysis.
And now I point to another thing that's not one of my pictures, but how many of
you have seen this before?
So this is from the cover of Scientific American in 1959 -- November of 1959,
actually, the month I was born -- and this was done by Land. And what it is,
basically, is you see there's a projector there. It's projecting two monochrome
slides, one with white light and one with red light. Now, how many of you see
blue and yellow and green? What's up with that?
Basically this inspired a whole pile of work from Land. He spent about 20 years
trying to explain what the heck was going on with this, did not really fully
succeed. One of the artifacts from this is something called the retinex theory, the
retinex algorithm.
Basically what I'm doing is very akin to what he was doing with this. What I'm
doing is essentially I'm saying, okay, based on the color adjacency properties
that I see in the image, what I'm doing is I'm saying, okay, let me take the side
that has two different colors and try and make a guess at what the third color
should be for each of those points based on nothing but the color analysis.
Then what I do is I take that solution and I plug that in to refine the color estimate
on the other channel. Then I take that solution and I plug that back in to refine
the estimate on the first channel, and I flip it back and forth a bunch of times and
that's basically how I'm getting my colors, which is kind of weird because you go,
where'd they come from? They came out of thin air. Right?
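A toy sketch of that flip-back-and-forth idea -- emphatically not the actual, pre-publication algorithm -- assuming the left view is coded in green and the right view in red plus blue, and using local box-filtered channel ratios as a crude stand-in for the color-adjacency analysis:

import numpy as np
from scipy.ndimage import uniform_filter

def box(x, size=15):
    """Local mean (simple box filter) used as a stand-in for 'color adjacency'."""
    return uniform_filter(x, size=size, mode="nearest")

def refine_views(anaglyph, iterations=5, eps=1e-4):
    """Alternate-and-refine split of a green/magenta anaglyph (illustration only).

    anaglyph: HxWx3 float array in [0, 1]; G carries the left view,
    R and B carry the right view.
    """
    a = anaglyph.astype(np.float64)
    g_left = a[..., 1]                         # known left channel
    r_right, b_right = a[..., 0], a[..., 2]    # known right channels

    # Crude starting guesses: borrow the other view's channels outright.
    r_left, b_left = r_right.copy(), b_right.copy()
    g_right = g_left.copy()

    for _ in range(iterations):
        # Left view: rescale the borrowed magenta channels so their local
        # ratios to green match the current right-view estimate.
        ratio_r = box(r_right) / (box(g_right) + eps)
        ratio_b = box(b_right) / (box(g_right) + eps)
        r_left = np.clip(g_left * ratio_r, 0.0, 1.0)
        b_left = np.clip(g_left * ratio_b, 0.0, 1.0)

        # Right view: plug the refined left estimate back in to update green.
        ratio_g = box(g_left) / (0.5 * box(r_left + b_left) + eps)
        g_right = np.clip(0.5 * (r_right + b_right) * ratio_g, 0.0, 1.0)

    left = np.stack([r_left, g_left, b_left], axis=-1)
    right = np.stack([r_right, g_right, b_right], axis=-1)
    return left, right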
The thing that people keep forgetting is color is not a property of wavelength.
Color is a perceptual thing. Right?
And basically -- by the way, it's even more disturbing because, notice, this is a
color photograph of this thing and it even photographs, right? So it's perceptual,
but it's actually something that you can photograph with an ordinary camera.
I'm not pretending that I've solved in the past couple of years this problem that
Land worked on for 20 years and didn't solve, but at least I was smart enough to
notice, hmm, there's something to this, and so that's why I started doing these
little weird things in color space.
>>: [inaudible]
>> Hank Dietz: At least as far as I'm concerned, yeah. I mean, is it? Because, I
mean, I have never seen anyone who had a really great explanation of it.
Actually, the best explanation of this that I have seen is, oddly enough, at the
website of a woman named Wendy Carlos, who's better known for her music.
And it turns out she got into this. She's a true artist. She does all sorts of things.
And she actually ended up simulating this where what she did was she took two
images -- well, I should actually say one color image, split it up such that she
basically created two gray images from different color bands for it, projected one
with white light and one with red, by basically making line patterns where
one line was gray and the next line was red -- actually, what the heck.
I'll show you. I have it in here, but I wasn't sure whether I should show you. So
I'll show you because it's coming up. It's this slide.
Oh, that's not working. Unfortunately, it's getting scaled.
It doesn't give you the good effect -- well, actually, you can still see it. This still
looks kind of yellow, right? It's not as good because this is being interpolated
and reworked in all sorts of weird ways.
But to make a long story short, if you actually do alternating stripes of red and
monochrome with just two monochrome source images that are from different
color bands, even if they're taken as close as something like two yellows that are
only 10 nanometers apart, you actually see full color, which is really freaky.
And, again, that actually was primarily out of Land's work, but it's kind of
interesting that you can digitally do this.
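A minimal sketch of that digital version of the demo -- the file names and band choices here are arbitrary assumptions:

import numpy as np
from PIL import Image

def land_two_color_stripes(rgb_path, out_path="land_stripes.png"):
    """Digitally mimic the two-projector Land demo described above: take one
    color photo, derive two grayscale 'records' from different spectral bands,
    then interleave rows -- even rows show the long-wave record rendered in
    pure red, odd rows show the other record rendered as neutral gray.
    """
    img = np.asarray(Image.open(rgb_path).convert("RGB"), dtype=np.float64) / 255.0

    long_record = img[..., 0]       # 'red-filter' record
    short_record = img[..., 1]      # 'white-light' record (green band)

    out = np.zeros_like(img)
    out[0::2, :, 0] = long_record[0::2]            # red-lit stripes
    out[1::2, :, :] = short_record[1::2, :, None]  # gray (white-lit) stripes

    Image.fromarray((out * 255).astype(np.uint8)).save(out_path)
    return out_path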
And so the long and the short of it is that colors have much more stable properties
than people think. So it is possible to recover them.
Now, this is not the world's best image. You can see there's a fair amount of
vignetting impacting this. You've got some fringing on the two sides, but it's not
terrible.
Here's another one. It's not particularly great, but it's not terrible.
Well, here are the stereo pairs generated from those. Only by looking at the
color.
And you can see they're not perfect. There's that shading there that was really
the defect from the vignetting. And I can tell you, I've done a lot better than this.
These are kind of old images, but you can see this actually works.
And if you have a better set of anaglyphs to begin with, obviously it works a lot
better.
So the long and the short of it is -- I guess I should say one more thing on that
too.
These are not great, and I'm not giving you the full algorithms or anything like
that, but in January at the Electronic Imaging conference, I'm going to be
presenting some of this. So it's a little bit pre-publication that I'm talking about
this. That's why I'm not giving you too many of the details. But at least I can kind
of whet your appetites and also hopefully find out if I'm doing anything bad before
I do it in front of the audience there.
Right. So basically, in conclusion, there's a lot more information in out-of-focus
image regions than people have been using in general -- much more than people
have been using. And there is low-hanging fruit here: there are things that you
can trivially, easily do.
Dynamic aperture control using point spread function shaping, anaglyph capture
and reprocessing. These are not hard things to do. These are definitely feasible.
The reprocessing, okay, it might be another six months or a year before I have
something that I really like that can run at video frame rates, but, you know what?
I will. Absolutely confident of that now.
More accurate models are needed if I'm going to be doing this point spread
function recognition and substitution and all of these nasty things. And in fact, I
can actually do it, believe it or not, on a completely ordinary image. If you just
give me an image that was taken with an ordinary camera, no filters, nothing, no
a priori information about it, I can actually get all the same information out of it,
but basically when I've tried it even on simple test images, I'm talking about a
week plus -- actually something more on the order of a month on a
supercomputer to process one image.
So not really a useful technology yet, but, you know, give me time. Or, even
better, give me time and funding [laughter].
But in any case -- so this has a lot of promise, but there's still work to be done.
And as I said, I'm really just starting to publish in this field. I'm
known as a parallel supercomputing guy, a compiler guy, and an architecture guy. So
this is a new branch for me. And because of that, I'm, as I say, quite confident
that I don't really know what I'm talking about here. And if any of you have
anything nice to contribute that way, in terms of telling me, you know, that thing you
said on the fifth slide, that's totally wrong -- please, go right ahead and tell me,
because it's better to find that out now, before I've actually published a whole bunch
of papers, than later.
So thank you very much.
[applause]
>> Hank Dietz: Yes?
>>: Your photographs of the point spread functions for the different lenses look
very much like shooting wide open at a diffuse point source. But how did you
actually set those up?
>> Hank Dietz: Okay. Now, setting them up is actually harder than it sounds
because you really need a good point source. And it turns out that -- so, okay,
you can sample a point spread function out of focus past the focus point or
before the focus point. And you'd really like to sample both to make sure that the
inversion is really correct and all of that.
Sampling them before the focus point is very hard because any minor detail in
that point source, any flaw in that point source is going to show up in the point
spread function.
And to be honest with you, I haven't found a really satisfactory way of doing
those yet. What I've been playing with is I've been using some very fine
fiberoptic fibers and using that as my point sources, but I'm very open to
suggestion on that.
>>: [inaudible].
>>: Lasers?
>> Hank Dietz: No, a laser won't have the same properties because you'll get all
the -- yeah. So I think that the fiberoptic cable is not a bad way to do it, but I'm
not happy with it yet.
The ones that I have been using, the ones that I've shown you -- most of those
are really easy, because once you get far enough away, if you have a relatively
clean light source, it's not that hard to make something that's a good
approximation to a point source. So my standard thing that I've been doing is at
a distance of 10 meters with the camera focused basically at something like five
feet, so like a meter, a meter and a half, something like that. I forget what I had
exactly.
Actually, I have a web page that gives the guidelines for how to do that, because I
was trying to get other people to sample lenses that I didn't feel like paying for -- yeah.
At any rate -- did I say low budget? Yeah. Okay.
So basically what I've found is that, believe it or not, if you choose wisely, one of
the little white LED pen lights can actually be a reasonable point source.
>>: So the LED's got enough spectral breadth to it that it doesn't -- the
wave effects are the ones you'd be concerned about, right, to get a clean
spectral source?
>> Hank Dietz: So the wave effects are really not going to be that relevant
so long as I have a relatively large aperture, et cetera. So basically it's not -- it's
not touchy in that way. It's that if there's structure there, that structure gets
magnified in much the same way that the lens defects get magnified.
So basically if you have something that's a relatively clean white LED source -- don't get me wrong, I'm not saying it's a good source; I obviously need a better
source. But that's good enough that I've been able to repeatedly see the same
patterns on different lenses multiple times over periods of months. It's been good
enough that way.
Let me say the --
>>: [inaudible]
>> Hank Dietz: I did that -- well, actually I tried using halogen light sources
through the fiberoptic from a distance, and that wasn't too bad. But the LED pen
lights, believe it or not -- just at 10 meters, the LED pen light, if you choose wisely
which LED pen light, actually works pretty well.
And if you think about it, it makes sense. Because if you think about the physics
of how the LED pen lights are actually making the light, you really do have a pretty
clean point source.
So it works better than you would originally think, but when you think about it, it
really is a microstructure that's quite accurate.
The issue that I actually ran into that I wasn't expecting but I obviously should
have expected this, as I've gotten all these old lenses and I'm changing lenses all
the time, I'm having to clean my sensor like three times a day. And if you've ever
cleaned a sensor, this is not fun.
So basically there's always dust on every sensor, and I'm not doing this in clean
room circumstances, I'm honestly doing it in my basement because that's the
only place I could get it dark enough that had a long enough shot.
And, by the way, just to tell you how bad it is, I'm doing it in my basement and the
target is actually in my shop area where I have woodworking tools and all that.
All right?
And, by the way, I do have a lab that's long enough at school, but it's got
windows the whole length of it. So I can't make it dark. So, at any rate, the long
and the short of it is that the dust has been a problem. And the way that I've
gotten around that is for many of the point spread functions that I've captured and
that I've shown you, I actually go through a procedure where I essentially capture
the point spread function multiple times and then basically rotate the camera to
essentially eliminate the defects from some of the sensor dust. And if you rotate
or move the camera slightly, you can basically compensate.
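A toy sketch of that compensation idea, assuming several PSF captures taken with the camera aimed slightly differently, so the PSF image moves across the sensor while the dust shadows stay put:

import numpy as np

def centroid(img):
    """Intensity-weighted centroid (row, col) of a monochrome PSF capture."""
    total = img.sum()
    rows = np.arange(img.shape[0])[:, None]
    cols = np.arange(img.shape[1])[None, :]
    return (rows * img).sum() / total, (cols * img).sum() / total

def combine_psf_captures(captures):
    """Shift every capture so its PSF centroid lands at the frame center,
    then take a per-pixel median; fixed-position dust shadows no longer
    line up after the shift and get voted out.

    captures: list of HxW float arrays (hypothetical, same exposure).
    Note np.roll wraps around edges, which is fine as long as the PSF is
    framed well inside the frame.
    """
    h, w = captures[0].shape
    aligned = []
    for img in captures:
        r, c = centroid(img)
        dr, dc = int(round(h / 2 - r)), int(round(w / 2 - c))
        aligned.append(np.roll(np.roll(img, dr, axis=0), dc, axis=1))
    return np.median(np.stack(aligned), axis=0)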
So the marks that you've seen in those I'm fairly confident are the marks on the
lens, but it gets convolved with the dust on the sensor and that sort of thing.
That's been more of a problem than the quality of the point light source.
>>: Well, some of the cameras have an automatic dedusting thing, which I
presume is -- is it good for this, or is that probably not worthwhile?
>> Hank Dietz: Some of the cameras have a marketing device.
>>: [inaudible]
>> Hank Dietz: Well, so like the Sonys, what they do is they do ultrasonic
vibration of the glass over the sensor.
And, by the way, I should be precise. We're never touching the sensor. The
glass is on top of the -- but at any rate, so Sony actually does this vibration, and
on the NEX-5 it's supposedly ultrasonic. On some of the others it's actually
they use their image stabilization thing. They just literally shake it violently. And
there are a whole bunch of other techniques, some of them using ultrasonic
pieces of plastic that basically try and projectile vomit the dust away. I don't
know how else to put it.
But from what I've seen, none of these techniques really work. And do they get
the dust off? Yes. But the dust that they get off -- there's another device that is
really high tech that I can show you that gets that dust off really well. It's called
this.
So the dust that comes off with this is the dust that they tend to be really good at
removing. The dust that is stickier or whatever and, you know -- I'm not proud
here, right? So, again, I get no money from any of these people, which is an
unfortunate theme here, but at any rate, this is a lens pen. And I'm using this to
clean my sensor all the time.
And even so, I don't know how many of you have interchangeable lens cameras,
but if you ever want to really be truly disgusted, just take your camera, set it on a
nice slow shutter speed, set it to overexpose a little bit, point it at a blank wall,
and basically take the shortest focal length lens you have, stop it down as far as
it will go, just move the camera violently while you're taking the picture so that
there's total blur, there's nothing there for it to actually interfere with the dust
pattern, and then take a look at the image and you're going to see just -- it's
incredibly scary. There's just tons and tons of dust there.
By the way, you usually don't see it, because when you stop down you're getting
very narrow shafts of light, and that casts nice sharp shadows even of dust that's
sitting some distance above the surface of the actual sensor.
But if you're wide open, the shadows aren't sharp, so you don't really see it. So, yeah,
dust is much more of a problem than I ever anticipated.
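If you want to turn that blank-wall test into an actual dust map, here is a simple flat-fielding sketch; the parameters are guesses, not anything prescribed above:

import numpy as np
from scipy.ndimage import gaussian_filter

def dust_map(test_frame, blur_sigma=50, drop=0.03):
    """Make the dust from the blank-wall test easier to see.

    test_frame: HxW float array from the stopped-down, deliberately blurred
    wall shot (single channel). Dividing by a heavily blurred copy removes
    the smooth illumination falloff, so dust shadows show up as small dips
    below 1.0; anything dipping by more than 'drop' is flagged.
    """
    frame = test_frame.astype(np.float64)
    flat = frame / (gaussian_filter(frame, blur_sigma) + 1e-6)
    return flat < (1.0 - drop)        # boolean map of likely dust shadows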
I'll also tell you too, it really hurts, because in terms of doing things like
macrophotography and such, I've actually found that good old bellows actually
work better than a lot of macro lenses. The catch is bellows are like the ultimate
dust infuser because -- yeah.
So to make a long story short, I obviously need a clean room, don't I? But it's
worked out okay. And as I said, I really go by how repeatable the measurements
are that I take. If the measurements are repeatable over a period of months and
so forth using a similar setup but with minor changes, then I feel like I must have
actually compensated pretty well.
>>: In your paper would you be going into the actual PSF reshaping?
>> Hank Dietz: So --
>>: That's -- I don't understand how you even go about doing that, frankly.
>> Hank Dietz: Oh, it's very simple. So you recognize the structure. Basically
what you're going to do is you're going to take the -- well, I'll tell you directly what
the idea is. My representation, the thing that the GA is trying to come up with, is
basically for each pixel, what the total energy is at that pixel in the scene, you
know, corresponding to that pixel in the scene, in each of the color channels and
basically a positive or a negative diameter for the spread at that point based on
whatever the point spread function for the lens is.
So what I'm really looking at is a red, a green, a blue, and a diameter for each pixel. And the
spreads don't actually have to be circular, because if I know that the pixel is
physically off to the side, and I know what the point spread function is for that lens, I
can recreate that.
So all that I'm really doing is when I have that set of data for the image, when I
have the red, green, blue and basically that spread diameter, that point spread
function diameter, that is sufficient for me to then perform any kind of
transformation I want to change the diameter systematically -- for example,
enlarge them and then re-render with a wider stereo separation with a point
spread function that looks like two dots instead of one or a dot that's offset for the
left side or a dot that's offset for the right side, so two separate images.
So it's basically really all I need -- the gold image for me is really that red, green,
blue, and point spread function diameter with negative before the focal point and
positive after the focal point. And if I know that, I can reconstruct everything.
And, by the way, the reconstruction is more complicated than you might think in
that it is a true reconstruction. So what I'm actually doing is, for example, when
something is in front of something else, it clips the point spread function for the
thing behind. So that actually happens. Basically you solve that by just doing a
rendering order. It's really very simple.
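A naive sketch of that representation and of the rendering-order idea -- a circular disc stands in for the lens's measured point spread function, and everything else here is an assumption, not the actual reconstruction code:

import numpy as np

def rerender(rgb, diameter, scale=1.0):
    """Re-render from a per-pixel (R, G, B, signed PSF diameter) description.

    rgb:      HxWx3 float array of per-point values.
    diameter: HxW float array of signed PSF diameters in pixels (negative in
              front of the focal plane, positive behind it).
    scale:    multiply every diameter, e.g. to simulate a different aperture.

    The signed diameter doubles as a crude depth ordering: splat from the most
    positive (far) to the most negative (near), so nearer points paint over --
    i.e. clip -- the spread of points behind them, as described above.
    """
    h, w, _ = rgb.shape
    out = np.zeros_like(rgb)
    order = np.argsort(-diameter, axis=None)          # far (positive) first
    ys, xs = np.unravel_index(order, diameter.shape)

    for y, x in zip(ys, xs):
        r = max(abs(diameter[y, x]) * scale / 2.0, 0.5)
        y0, y1 = max(0, int(y - r)), min(h, int(y + r) + 1)
        x0, x1 = max(0, int(x - r)), min(w, int(x + r) + 1)
        yy, xx = np.mgrid[y0:y1, x0:x1]
        disc = (yy - y) ** 2 + (xx - x) ** 2 <= r * r
        # Painter's-order overwrite; a real renderer would also spread the
        # point's energy over the disc area and use the measured PSF shape.
        out[y0:y1, x0:x1][disc] = rgb[y, x]
    return out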
>> Andy Wilson: Okay. I think we're just about out of time. So I think Hank will
be around today and again for another talk this afternoon, so he's available if you
want to chat with him.
>> Hank Dietz: Right. And if you want, I guess I can say one more thing. So I
have a research exhibit at the supercomputing conference, which is how I could
swing being here today. Actually, I've had a major research exhibit at the
supercomputing conference -- well, my group
has had its own research exhibit ever since 1994, when we first showed off
a whole bunch of Linux PC clusters. As I say, we were the first people to build
one, eight months before the [inaudible] guys. And I'm the author of the Parallel
Processing HOWTO for the Linux Documentation Project, which, ironically, even though
I've published like 200 technical papers, there are 30 million plus copies of the
parallel processing how-to in circulation. It doesn't count as a peer-reviewed
publication. But it's obviously the biggest impact document I've ever written.
So there's all this weirdness, but, at any rate, yes, I will be at
supercomputing, and we're Booth 202. So I think we're three over from where
Hewlett Packard is.
>> Andy Wilson: Thank you.
>> Hank Dietz: Thank you.
[applause]