>> Rick Szeliski: Good morning, everyone. It is my great pleasure to introduce Bill Freeman to give our talk this morning. Bill is a Professor of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, where he's been on the faculty since 2001. Before that, he was a research scientist at Mitsubishi Electric Research Labs just across the street. He obtained his Ph.D. from MIT in 1992, and before that he also worked at Polaroid. So he's alternated between industry and academia. Bill is extremely well known in the fields of computer vision and computer graphics, having done a lot of very seminal work. And today he's going to be giving us a little tour through time and photography.
>> Bill Freeman: Thanks a lot. It's a pleasure to be here. So yeah, this was a really fun talk to make and I'm glad to share it with you. I'm interested in photography and how photography tells stories over different time scales. So I thought it would be illuminating to organize it all and go through from practically the shortest possible photograph you could take to practically the longest possible one, and talk about how photography tells stories over these different time scales.
So let's start off with the shortest possible photo you could take. And that would be a picture of light itself. So this is a photograph of a very short pulse of laser light. You can see it here. It comes through a diffraction grating and splits into the primary lobe and then the first-order and second-order diffraction lobes. So how do you take a picture of light? Well, this was done about ten years ago using holography. You have the pulse you're going to photograph going wherever you want to have it go, in this case through ground glass. Then there's a second reference beam which exposes the photographic plate and makes a hologram at the same time. And they're both very short pulses. The photographic film only records the coherent beats of the two waves when this one passes over the holographic plate, so this acts as a sort of gate for the photograph. And then if you take the developed hologram, look at it, and move your viewpoint across space, you get a picture over time of this light wave progressing. And that's quite remarkable.
Recently there's been another remarkable thing: actually recording photographs of light using electronic means, not just holography. And I wasn't going to talk about this because it's sort of unpublished work, but people pointed out to me, look, it's on YouTube. It's not my work; it's by Ramesh Raskar at the Media Lab. And let's see, what do they do? They have an extremely short laser pulse, and then they have a special sensor which is just a one-D sensor; I guess an electron beam scans across it very quickly, and it's only sensitive to light when the beam scans it. So you can get temporal resolution on the order of ten to the minus 12 seconds, but you can only record one horizontal row at a time. So the photograph that's going to show you light traveling actually took hours to expose, because they had to do the laser pulse over and over again, and they recorded one row, and then the next row, and the next row, and so forth. So let me show you these. These are by Ramesh Raskar's group. So here's a light beam passing through --
>>: Should we turn the lights down?
>> Bill Freeman: For this one maybe we should.
>>: Could you turn them all off for one second?
>> Bill Freeman: It's going to go through it a couple times.
And I don't know if you -- I don't know if you can see him, but there's Ramesh down there describing his work at the International Conference on Computational Photography. And again, they can record this by measuring, with very fine temporal precision, one row at a time. Now, in this next one they overlay what we just saw on top of a static photograph of the scene that was there. So now you can see it all in context, and this light beam comes through the Coke bottle -- they could have picked a better subject -- but anyway, you can see it progress. So this just blows my mind, actually. And you can go to that Web page and look at the videos yourself. So that's in the very, very fast range, on the order of --
>>: I think that was the darkest video. So we'll bring the lights back up.
>> Bill Freeman: On the order of ten to the minus 12 seconds. Now let's slow it down by a factor of a thousand, to around ten to the minus nine seconds. What sort of photographs can you make there? Is it useful to make them there? So this is a realm that's useful for time-of-flight depth imaging. There's a camera which I bought for my lab eight years ago, from a company called 3DV Systems. I think Microsoft might own it now; I'm not sure. They send a slab of light, about 50 centimeters long, out to the object, and then this bounces back. 50 centimeters is on the order of ten to the minus nine seconds worth of light travel. It bounces back, and now the parts that hit an object close to you are ahead of the parts that hit objects further away. So you've distorted this wave form. And then at the camera end they quench the detection with a very fast shutter, so the camera only receives this amount of light. So now you've traded depth coding for intensity coding: the intensity at each position is proportional to the depth away from you. And you can make a depth image this way.
>>: Proportional -- modulated by [inaudible]?
>> Bill Freeman: So they actually have to take -- they use two exposures to get this, right. And so here's the RGB version and here's the depth camera version. It's pretty noisy, but it's serviceable as a depth camera.
Okay. So that's the very, very fast. Now we're going to slow it down again by another factor of a thousand, roughly to the one-to-two-microsecond range. And this is the range of high speed flash photography and ordinary flash photography. Of course the way you take a picture this way is by keeping the shutter open in a dark room and then, just for a brief moment, exposing the subject with a flash of that duration, and you can capture events that happen over that sort of time scale. So how do you make a flash that short? Well, there's two ways. You can do it chemically; these people got the Nobel Prize for the fast chemical reactions they studied. You can also make it electronically. Harold "Doc" Edgerton, the well-known MIT professor, pioneered these electronic flashes. In preparing this talk I discovered the wonderful Web pages that have his lab notes online. So here's a page of his lab notebook showing the design for this very fast strobe: 1930, Harold Edgerton scope tests. He was a master of taking beautiful photographs with these high speed strobes. So here's water as it cascades down. Here's a little assortment of photos. A bullet going through an apple. Many photos over time of a diver. Schlieren photography of a bullet going through a candle. A bullet going through a card.
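A minimal numerical sketch of the time-of-flight intensity-coding idea described a moment ago. This is not the 3DV camera's actual processing pipeline; it simply assumes two captures per pixel, one gated by the fast shutter and one ungated for normalization, and maps the intensity ratio to depth (Python/NumPy assumed):

```python
# Sketch of depth from gated time-of-flight intensity coding (not 3DV's pipeline).
#   gated -- exposure cut short by the fast shutter (nearer surfaces return
#            earlier, so more of their light falls inside the gate)
#   full  -- an ungated exposure, used to normalize away surface reflectance
import numpy as np

def gated_tof_depth(gated, full, gate_length_m=0.5, eps=1e-6):
    """Estimate per-pixel depth in meters, relative to the start of the gate."""
    ratio = np.clip(gated / (full + eps), 0.0, 1.0)  # fraction of the light slab inside the gate
    # With the gate set so that nearer surfaces contribute more light,
    # depth decreases linearly as the ratio grows.
    return gate_length_m * (1.0 - ratio)

# Example with tiny synthetic images:
full  = np.array([[1.0, 0.8], [0.6, 1.0]])
gated = np.array([[0.9, 0.2], [0.3, 0.5]])
print(gated_tof_depth(gated, full))
```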
This tells you when these photos were taken: this is a footballer of the day kicking a football. This is one I really like. It actually tells a story over a number of different time scales. Here's the bullet, and here are three different balloons that the bullet has passed through. So the exposure is on the order of ten microseconds, but at the same time we get a picture of the roughly two milliseconds it took the bullet to travel this length, by having the identical structure repeated. Now we sort of code time over space, and we see the progression of the destruction of the balloon as a function of time, displayed over space.
And, of course, now high speed flashes are commonplace and anyone can do them. So there are Flickr groups on flash photography. We have the updated version of the bullet going through the Yoplait yogurt. There's the bullet there. I find them so delightful because they relate to everyday experience but they let you see it in an entirely new way. Here's somebody putting their finger in water, and you see it in a way that you've never seen it before. Everyone likes bullets going through things. Here's a bullet going through a cookie, which looks like a cookie sneezing.
>>: [inaudible].
>> Bill Freeman: Yeah.
>>: Just kind of wondering how society at large related to that work, they sort of --
>> Bill Freeman: I think they embraced it. I think he was like -- at least at MIT they tell us he was like this hero, really well known.
>>: That's the dogma, but I don't know if that's the truth, though.
>> Bill Freeman: Well, he took part in the war effort, too. He made these -- there's a photograph of Cambridge from up in the air at night. He just had a humongous flash that lit up the entire city, and they used these in World War II for reconnaissance.
Okay. So that's flash. Now, let's slow it down again. Now finally we're in the realm of sort of conventional photography, say 1/5,000th of a second to 1/25th of a second, ballpark. And of course there are billions of photographs one could show that were taken over this time range. It's sort of interesting to look at the very first photographs or movies that were taken over these sorts of time scales. So, of course, then we have the names Marey and Muybridge and Edison who, in the late 1800s, made motion pictures and photographs in these kinds of time frames. So here's a photographic rifle that Marey made; it shoots 12 frames a second. He's fascinating: trained as a doctor, interested in circulation and how things move, studying hummingbirds. He came to photography as a way to study his passion of how birds fly and animals move. So he took a lot of photographs of birds in flight, and they recorded these 12 frames around the circle of this photographic film. So here's one of his shots. And then he made these beautiful sculptures, I don't know if it's taking off or landing, but a bird in flight, and also this photograph of a pelican landing, which again reveals a whole new viewpoint on an everyday thing. He made bronze and plaster sculptures of these. These are photos of people hammering, jumping. And of course Muybridge was a contemporary, and they talked to each other. So Muybridge addressed the question of the day, which was: when a horse gallops, is there ever a moment when all the hooves are off the ground at the same time? And this wasn't known before these photographs were taken. And now you can see that there is a moment when all the hooves are in the air.
So now let's slow it down yet again, to telling stories over the time scale of 1/25th of a second up to one second. So this is sort of the realm of photographic blur. And blur tells a story. Here are some blurry photos that tell you a story of what's going on over that sort of a time scale. And there's a Web page I want to point you to of Ernst Haas photographs: a whole series of blurred photographs, gorgeous, artistically done. Here's a bullfighter, a bird in flight. And on a blog I was pointed to a Web page by this anonymous Flickr photographer called Just Big Feet, who made delightful photos of a marathon, taking exposures on the order of a second long. And you get these beautiful stories of runners.
So this talk is almost all not my work, but there are insertions of my work, like product placements in a movie. Like in a movie, when you see a product placement, you're like, why are they focusing on that Coke can so much? Here's the first product placement. This is joint work with Ayan Chakrabarti, a graduate student at Harvard, and Todd Zickler. And you can use blur to help you learn about the image, use it to help segment out the blurred from non-blurred objects, if you have a carefully designed prior model for how images ought to look and how blurred images ought to look. You can make a one-dimensional search over all possible motions, find the most probable speed of any one region, and segment out the individual region according to how it was blurred. Here's a segmentation of the blurred runner.
>>: Why aren't the shoes blurred, with the foot attached to them?
>> Bill Freeman: Why aren't they? Because they're on the ground.
>>: No, there's one shoe off the ground.
>> Bill Freeman: Oh, this one. I see. That could be a number of things. There are a lot of steps in this segmentation algorithm, and it could well be because the contrast between the shoe and the grass is not as strong as the contrast between the skin and the grass.
>>: Or it could be measuring horizontal --
>> Bill Freeman: Yes, we were. Yes.
>>: Even the curve of the foot and shoe go together.
>> Bill Freeman: That's true, they do go together. As I said, the local blur is just one input to this segmentation algorithm, and there are other inputs as well.
So now let's slow it down even more, down to the time frame of seconds to hours. This, of course, is the world we all live in. This is the world that movies are in. And there are a lot of wonderful ways to tell stories over these time scales, too. Typically you have time one and then time two, maybe several hours, minutes, or seconds later. And there are a number of ways you can describe what's going on over this span. You might think what's going on between these two time frames is a stationary process, and you just want to describe this constant process that's happening between these two times; so it might make sense to average images over those times, and I'll show you some of those. You can also, again, think it's a constant process but select particular frames that stand out in some way, so there's useful work to do with what you would call image selection. And then finally you might think there's a process that's actually somewhat changing over this time, and you might analyze how things have changed over that time. So let's go through these one at a time. First let's describe what's going on between these two times, separated by seconds or hours, through averaging; a minimal sketch of that follows.
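In code, this averaging strategy is just a per-pixel mean over the frame stack. A minimal sketch, assuming the frames are already aligned and loaded as NumPy arrays; the sliding-window variant is one simple way to get long-exposure-style composites:

```python
# Describe a roughly stationary process between two times by per-pixel averaging.
# Assumes pre-aligned frames of shape (T, H, W) or (T, H, W, C).
import numpy as np

def average_frames(frames):
    """Per-pixel mean of a stack of frames."""
    return np.mean(np.asarray(frames, dtype=np.float64), axis=0)

def sliding_average(frames, window):
    """Long-exposure-style composites over a sliding temporal window."""
    frames = np.asarray(frames, dtype=np.float64)
    return np.stack([frames[t:t + window].mean(axis=0)
                     for t in range(len(frames) - window + 1)])
```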
And now you get to another artist, Jason Salavon, who has made these wonderful pictures that are averages of, well, averages of many things, but these are averages of the late night talk show hosts. So here's an average of many images of Jay Leno. Can you tell who this one is? Conan O'Brien. And Dave Letterman. And they're each different in a different way, and I can recognize them. Here's another averaging photo: an eight-hour photograph by the artist Atta Kim, which again tells a nice story about urban life, really. So that's averaging.
And then, as I mentioned, there's selection. So how are you going to select which frames to show of the many frames you might collect between these two times? Well, one way to do it is by which one is closest to you, or which pixel is closest to you, at each time. So this brings up something called shape time photography, which is our second product placement, I should say. So this is something I was involved with. Here's the deal. You take stereo images of something, many times a second, and so you record both depth and image information. And you can use that to tell the story of things moving over time. So, for example, suppose you have five frames from a video of the death rattle of a quarter on a table. How might you composite those together to describe the action of the coin rattling on the table? Well, you can just average them all together with the averaging method, and that kind of works; it lets you see roughly what happened. But there are a number of problems with it. You can't get any sense of the depth of things or the temporal order of things, and you have reduced contrast where things are averaged together. Oftentimes computer vision methods are used to extract the foreground thing from the table. And then you can layer by time: you can put the first things in first and then layer on top of that the thing that happened next, and so forth. So that fixes the contrast problem. But now you've got another problem. The shapes are all wrong; the thing that's on the bottom is actually on the top in this photograph, so it doesn't really tell you the story of how the shapes relate to each other. So instead you can make your selection of which pixel to show from each time according to their shape. At every pixel you show the intensity corresponding to the thing that was closest to you out of all of them. So this gives you sort of an approximation to what you would have seen if you looked at the union of all those shapes at the same time. And we call this shape time photography. And you can use it to tell little stories of short-term things. So here are three photos of my wife sewing something, and then here's the little composite picture telling you how to sew. And here are two pictures of my brother-in-law's head, and I'm sure you're all wondering what would his head look like if it were in the same place at the same time. It would look like that.
And this relates more generally to another selection method, called lucky imaging. So here the story is that there's this process going on over this time, but maybe there's something obscuring it at some moments and not at others. Or maybe what you really are looking for occurs at some moments but not at other moments. So you want to select out those lucky shots. And this is used in astronomy with great success. So here's a single exposure of a distant astronomical object, obviously under very noisy conditions. And here's another exposure.
When the atmospheric turbulence was a little bit better, you could see the object a little bit better. And it's enough better that you could actually measure, out of all your photos, this is a good one and that's not a good one. You could imagine looking at perhaps the local variance or something. So if you take an average of 50,000 of these, you get this. And if you take an average of just 500 of these selected good ones, then you get this. And so this is called lucky imaging: you are just grabbing those moments when the turbulence happens to line up just right to give you a better view of things, and you either show those or average over those to get your lucky picture.
>>: It's turbulence, because the noise you'd expect to be fairly --
>> Bill Freeman: Yeah, I think that's how they put it. Yeah. And Microsoft researchers have exploited this. So Michael in the back really should be telling the story, but I'll just say it anyway. This is work he and Neel did in Rick's group looking at Mount Rainier. So, first of all, I'm told just getting a picture of it at all is one lucky thing right there. But on top of that -- so let's see, here's an input, hazy; you can apply image processing techniques to it and get this dehazed version. So this is pretty good.
>>: Actually a very clear day.
>> Bill Freeman: This one was.
>>: So far away.
>> Bill Freeman: Clear day, right. Clear day, and image processing makes it look even clearer. Not quite there yet, it's still rather noisy. So let's take an average over many of these dehazed images, and we get this. But again we're subject to the atmospheric turbulence that slightly disturbs the light paths over that long distance. Instead, what they did was a local shifting of each patch, so that the patches they're averaging over line up much better, making local adjustments rather than a single global adjustment. And so by doing that sort of local lucky imaging and adjustment, they can make a photograph like this of Mount Rainier. Did I miss my lines?
>>: One more after this. This is without the lucky -- this is just aligned. There's one more that's aligned with all of them. Anyway, I'll show you those after.
>> Bill Freeman: Thank you.
>>: One more slide or --
>> Bill Freeman: I'll get it right. This is very helpful, to have the author here. So here's another one, again with the same author. This is, again, a form of lucky imaging, although in a different kind of context. So again this is Michael's work. You have a group shot. And again this is sort of, if you will, a continuous process over a lot of time where people randomly smile. But they don't all randomly smile together; you can't get that one shot that you want. And you want to combine the different locations at different times to give you a single composite shot where everyone is smiling and everyone looks good. This is back in the days when Michael was still in the witness protection program, so we don't see his face here. But this was, I guess, to protect the anonymity of a submission. So that's another form of lucky imaging.
And then one thing that I always wanted to do was a form of lucky imaging: go to a large plaza and get a movie of just the countless people walking along, and then construct a single composite image where everybody was walking on their left foot, as if they're all marching together. Well, it turns out an artist has done this, and of course I'm sure done it much better than I could have.
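A minimal sketch of the lucky-imaging selection step described above: score each frame with a simple sharpness measure (variance of the Laplacian, standing in for the "local variance" idea mentioned), keep the best fraction, and average only those. This assumes grayscale, pre-registered frames and is not the astronomers' pipeline or the Mount Rainier system; NumPy and SciPy assumed:

```python
# Lucky imaging: keep only the frames where the turbulence happened to cooperate.
import numpy as np
from scipy.ndimage import laplace

def sharpness(frame):
    """Higher when the frame happens to be sharp (variance of the Laplacian)."""
    return laplace(frame.astype(np.float64)).var()

def lucky_average(frames, keep_fraction=0.01):
    """Average only the sharpest keep_fraction of the pre-registered frames."""
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    scores = np.array([sharpness(f) for f in frames])
    n_keep = max(1, int(len(frames) * keep_fraction))
    best = np.argsort(scores)[-n_keep:]          # indices of the "lucky" frames
    return np.mean([frames[i] for i in best], axis=0)
```

The per-patch local alignment used for Mount Rainier would go one step further, scoring and shifting small patches instead of whole frames before averaging.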
There's an artist named Peter Funch who has wonderful photos with just that idea in mind. So here's one where everybody's in the air, except this one guy here who is wondering what's going on. And, again, he doesn't say how he made this photo, but I assume it was a form of lucky imaging: he stood there with a tripod and photographed many people who came across, and whenever anybody was in the air, he composited all the appropriate photos together to get this single collage, although it doesn't look like a collage because I'm sure it was all taken at the same place. There's just a whole series of these. So here's another one: everybody carrying a manila envelope. And this one's really nice: this is everybody in Times Square taking a picture. And can you tell what the story is here? World of children. Everyone's young. So those are again lucky imaging, but done by an artist.
There's another artist whose work I really like, and she makes found animations. So here's a photo of a horse. She went and collected lots of photos of horses and made a movie out of it. So you can think of it as lucky imaging; the time goes over many hours, and I'm sure the photos were taken over many months, but they're all composited to make a single story. And then she has another one. This next one doesn't fit in with pictures over time, but it fits in with the story of random selection, and I like it so much I'm just going to insert it into the talk here. So we're going to take a two-slide break from the talk about time and show you two other images from Cassandra C. Jones. So there's one. I just love it. It's random selection. Here's lightning forming a little bunny rabbit. And here's lightning forming a little squirrel, a chipmunk. So these are, again, random selection. They're not telling you a story about time. But they're telling you a story about --
>>: Does her process go all the way --
>> Bill Freeman: I'm sure it's not. Okay.
>>: A lot of modifications, moving the shapes around to make something.
>> Bill Freeman: Yeah. So you can also show how things have changed. So now we're on to describing changes over time. Again, showing things by one of my favorite scientists, Michael, and collaborators. So this is selecting images from a short sequence and compositing them together to tell a story of changes over time: daughter on the swing set, monkey bars. This is another one, just like the balloons by Edgerton, that makes a story of time separated by spatial position. So here's a sequence of photos of a building being blown up, but here they've composited them together with the latest ones on the left and the earlier ones on the right. You now code time spatially, and you get to see in one photo of the whole spatial structure how it deforms over time. Then, just to show off, they went and did it in the other direction: now time is later on the right and earlier on the left.
In case there are a few of you in the room who haven't seen this, I'd like to show you this thing we call motion magnification, which analyzes small motions and exaggerates them. This is my wife on the swing behind our house. We track feature points carefully to avoid occlusion artifacts, and then we cluster them according to similar tracks over time, to get a layered representation of the motion. And the user says take the red stuff and amplify its motion by a factor of 40. So we'll make a motion microscope; we'll let you see small motions. You put the pixels back in and push them around that way.
But now we have holes where we didn't see data before. You use texture synthesis methods to fill in the missing holes. Now we have a motion microscope that lets us see the small motions as they would have appeared if they were amplified by some factor. She looks at this and asks if I feel the video makes her look heavy. [laughter]. And here's the motion that someone undergoes when they balance upside down and push down on these aluminum supports, and the structure actually moves. Oftentimes parents of perfectly healthy newborns wonder, is the baby breathing? So now we can tell that. And you can actually use it for science. So here are small deformations of a membrane as an acoustic wave goes across it, in the ear of some small animal. Magnified, you can see the deformation, but in the original you really can't see the deformation at all. So we're publishing a paper on that membrane.
So that's taking something and exaggerating its difference relative to zero motion. Recently we've gone a little further with that and tried to exaggerate the difference of one motion from another motion. So here are two videos: the same car going over the same speed bump, once with its trunk empty and once with its trunk full. You might imagine why a defense department sponsor would be interested in this, but here it is again. And so we track each one carefully. And now we're going to exaggerate not the motion relative to zero, but the motion of the full-trunk car relative to the empty-trunk car, to see if we can see a difference in how it moves. And indeed you can: you can sort of see an exaggerated version of those differences in the motion.
So now we're going to slow it down even further, down to hours to months. This is in the realm of time lapse photography. So let me start with a nice time lapse. This is from Planet Earth, the BBC documentary. This is a beautiful time lapse, and great care was taken to make it, I think, because it's really hard to get it so smooth. Of course there's camera motion, very slowly, at the same time. You've got these two different temporal processes. It feels like just a conventional pan, but that panning took a long time to do, of course.
>>: Several days?
>> Bill Freeman: The same time scale as it took for these flowers to open up.
>>: It may have been shot -- like there's a -- so there's a -- well, there may have been some pausing going on there, because there's a making-of thing for a different sequence where they actually talk about the fact that, you have [inaudible] and the sun comes in; to do it slowly, they actually bring all the flowers into the studio.
>> Bill Freeman: Great, this is very helpful. I'll look at that.
>>: It's well worth watching. [inaudible] BBC. They have this wonderful tracking shot, going through I think like a glen or something like that. And it's astounding, but it's all about time.
>> Bill Freeman: Great. Good. That fits in with my point, actually, that there's really a lot of room for computational photography to make an impact here. Typically with time lapse, there are all these things happening over this period of time. Maybe the lighting's changing, maybe the object is moving around in a way you don't want. So you've really got to understand the sort of higher level -- you want controls over things which are normally difficult to control in a photograph, such as changing the lighting or changing the position exactly.
And so if we have better computational understanding of those things, we can do a better job of recording events over these long time scales. So a first attempt at this was made by researchers at Harvard and at MERL; I guess they're now at Harvard. [inaudible]. So they worked with time lapses. They first developed a method to remove cast shadows from the time lapse; you can identify them because the intensity goes way down. With the cast shadows removed, they did a low-rank factorization of the sequence to separate it into different components: time-of-day and lighting components. So here they're rerendering the sequence without shadows. They can also then manipulate their low-rank decomposition further. So this is kind of a first step at the sort of thing you want to do. Of course it requires a stationary camera and a stationary subject in this case.
And a key piece of making good photographs over long time sequences, I believe, is tracking. The better we can do tracking, the more flexibility we have at rendering things sharply that are moving over time, and so forth. Just to address the point of what's the state of the art in computer vision tracking, here's what we think it is. So we took a good candidate for the best tracker, Brox and Malik's tracker, reimplemented by my extremely good graduate student Michael Rubinstein. This is kind of a picture of what the state of the art is in the published literature, as opposed to whatever Brox and Malik have in code they haven't released. So here it is. Okay. Each track is coded with the color of when the track was lost for that piece, and you can tell what the color code means by looking at the color of these things as they slide off the end. Let me play it again. So these last all the way until the red stuff slides off the end, but others not quite as long. And ideally all the dots here would be the same red color throughout the whole tracking of the cheetah. Is that clear, this color-coding? It's demonstrating that these tracks are actually much more short term than you'd like them to be. You'd like them to be stuck on for the whole length of time that the cheetah is in view, and even to remember when they pass through occluders.
>>: The features are not -- is there adaptation in the feature tracks?
>> Bill Freeman: Right. I don't believe there is an appearance model that changes over time for this one. And of course that's what we're working on.
So here's just another piece of work that we've been involved with. In a time lapse, there are things going on at all these different time scales. So here are some sprouts growing, and you've got the short time scale of the sprout ends flickering back and forth, and you've got the longer time scale process of the plants growing themselves. And you'd like to -- I think you'd like to be able to make a photo that took those things and treated them separately. You'd like to be able to just see the long-term effect by itself, and that would maybe clarify this time lapse for you. So without actually tracking, we've made what we call a motion denoiser that addresses that problem. Let me just go through this in a little bit more detail. The game is that we want to make a new video that's going to use pixels only from this video and just reorganize them in space and in time. So we're not going to use a pixel we've never seen before; we're just going to put it into a different position.
So the desired output of our algorithm is a warp map which tells us where we've grabbed each pixel from for each pixel we're rendering. So what do we want the warp map to look like? W is a function of the position p, and we have several terms that tell us what a good warp map is. Number one, it more or less respects the original video: the intensity of the warped video minus the intensity of the original is small. But that would just give us the original video back if we didn't do anything else. So we're also going to say that we want the warped video, the output video, to change very little over time: the output at one time should be close to the output at the next time. And finally, we want this warp map, where we grab our pixels from, to be spatially and temporally smooth, so there's another term there. And these three terms define a Markov random field, and you can find the optimal -- find an approximation to the optimal warp map which optimizes this objective function we've created. You can do it a number of different ways. Iterated conditional modes is one solution method, graph cuts is another, and loopy belief propagation is another. We've tried all three; for this particular problem, we believe loopy belief propagation worked best.
Here, then, is the approximation to the optimal warp according to this objective function we've made. Here's the original video and here's our motion denoised video output. So what we're trying to do is show only the long-term effects and not the short-term ones. And here's a little story that tells us how this video was made. Here's the spatial displacement at every position, and here's a color code telling how the color displayed here corresponds to a spatial displacement from the center of this figure. And here's a color map showing the temporal displacement of every pixel. So you take these pixels, you grab across space and across time according to this map, and you get this output video. Let me just play it again. And the thing you might notice is that it looks pretty good, but we've clipped off some of the ends. That's because in the state space that you're solving for, the translated pixel position in space or time, there are many different translations we have to consider. That slows down solving this thing, and we had to consider only a relatively small volume in space and time. If we just take a little crop of this video and allow ourselves a bigger search space in space and time, then we get output frames which look much closer to the desired ones. So it's just a matter of computational time to fix that artifact. But so now you can take your input video and separate it into the long-term, low-frequency motion components and the short-term, high-frequency components.
>>: Is it possible to get some sort of temporal moire effect?
>> Bill Freeman: Sure. That would come in actually right at the time lapse itself. And so that might mask a high frequency thing as something that's low frequency, and then we would treat it as low frequency and not smooth it out; that's true.
Here's just a comparison of different ways you might do this motion denoising problem. Here's the source. Here's just taking the average value at each position over some temporal window; here's taking the median value over the temporal window; and here's our motion denoised output.
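A schematic rendering of the three-term objective just described, not the authors' exact formulation. The sketch only evaluates the energy for a candidate warp field over a grayscale video; the weights alpha and beta are illustrative, and the actual minimization (iterated conditional modes, graph cuts, or loopy belief propagation, as mentioned above) is the hard part. NumPy assumed:

```python
# video: (T, H, W) grayscale; warp: (T, H, W, 3) integer offsets (dt, dy, dx)
# saying where each output pixel is grabbed from in the source video.
import numpy as np

def warp_video(video, warp):
    """Build the output video J(p) = I(p + W(p)), clamping at the borders."""
    T, H, W = video.shape
    t, y, x = np.meshgrid(np.arange(T), np.arange(H), np.arange(W), indexing="ij")
    ts = np.clip(t + warp[..., 0], 0, T - 1)
    ys = np.clip(y + warp[..., 1], 0, H - 1)
    xs = np.clip(x + warp[..., 2], 0, W - 1)
    return video[ts, ys, xs]

def motion_denoise_energy(video, warp, alpha=1.0, beta=0.1):
    out = warp_video(video, warp)
    data     = np.abs(out - video).sum()                 # 1. respect the original video
    temporal = np.abs(np.diff(out, axis=0)).sum()        # 2. output changes slowly over time
    smooth   = sum(np.abs(np.diff(warp, axis=ax)).sum()  # 3. warp field is smooth in
                   for ax in (0, 1, 2))                  #    time and space
    return data + alpha * temporal + beta * smooth
```

The temporal mean and median baselines in the comparison correspond to replacing the warped output with a simple per-pixel average or median over a temporal window, with no warp field at all.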
As you might expect, taking the average over time gives you something that's kind of smooth but blurry. The median's a little bit better, and the motion denoising is a little bit better still. Here on the bottom we show a space/time drawing of this, so you can kind of see better what's going on. Here's one scan line displayed over multiple times. So in the original it just wiggles back and forth, as we saw in the original video. The mean and median are blurred out somewhat, and here's the motion denoised version. Let me just show you a few more of these. And this will appear at CVPR in June. Here's a source, its long-term components, and its short-term components. Here's a swimming pool being dug. And you can see in the output this grill cover is stabilized, and you can again separate the video into the long-term and the short-term components. And this isn't perfect, but it's taking steps in the direction that I think computational photography should go for long time scale events: giving you kind of independent controls over these different components of the video. Here we're looking at the low frequency motions and the high frequency motions. You can also imagine -- there's a beautiful set of time lapse images made by the Extreme Ice Survey of glaciers. Here's the original time lapse of a glacier. It's really noisy in many ways, but here we've applied our method to pull out just the long-term and the short-term components of it, and we think it gives a better rendering of it.
So now let's go beyond time lapse, up to years and centuries. How do you tell stories with photography over years and centuries? One way to do it is just to look at photographs from very long ago. I mean, that tells a story over time. So I just want to show you these photos I like so much. These are some of the earliest color photos made. They were separations, made by sort of temporally multiplexing to obtain the color: you make a black and white photo through three different color filters, and if you combine them together, you can get a color photo. Now, with digital methods, we can combine them to get much richer color pictures than they could have seen when they took them back then. And we also have some of the world's first color artifacts, from the fact that the moving waves, of course, didn't stay stationary over the three different exposures.
Another way to tell stories over years and centuries is to compare photographs that were taken a long time ago with photographs that you take now. And so there's a really nice -- there's a book called "New York Changing," rephotographs of old photos. So a person went and retook photographs from many different locations. This original set was taken in the mid-1930s and the new set was taken around 2000. So it's very illuminating to compare the two. So here's the Manhattan Bridge looking up. It changes very little over the course of time. Here's looking from on the bridge, 1935. And how do you think it's going to change in 2000?
>>: Wider.
>> Bill Freeman: Pardon?
>>: I was going to say wider.
>> Bill Freeman: But actually this is the pedestrian part of it. Constraints on what you can do. Fences. And here's an old photo of a street corner, 1936. And 2001. And just even something as simple as this would make a nice sort of master's thesis computational photography project.
You'd like to have these two as inputs and get a separate picture of what was there in one picture but not in the other, and what's there in the other but not in the one, and do it in a nice artifact-free way. I think it would be nontrivial and I think it would be nicely useful. Another way you can apply computational techniques to these problems is to use the computer to help you make these rephotographs. So this is work by my colleague Frédo Durand, his student Soonmin Bae, and Aseem Agarwala. They use computer vision methods to help you line up your camera at the right position to match an input photo that you want to make a rephotograph of. This uses the kind of computer vision methods you might expect: local feature detectors and knowledge of the geometry to tell you how to move the camera, how to adjust the focus. So this is a test: here's a reference photo, a rephotograph using their method, and a rephotograph made kind of naively, just trying to make it match.
>>: How many years ago did they do this work?
>> Bill Freeman: No more than four.
>>: The only reason I'm asking is that back then they might have had to do it with a laptop and a camera. Now it could be an iPhone app.
>> Bill Freeman: True. Definitely I know it wasn't just on the iPhone. And so here are some real test cases. Here are -- let's see, reference photographs. Okay. Each row is the same place: a reference photo, another reference photo, their rephotography result, and a comparison against a professional photographer who also rephotographed the same thing. You can see they get it slightly better, although it's a shame they had that car blocking the view there. But maybe that's part of the story, too.
And then again, another way to tell stories over hundreds of years is not just to compare individual photos but to compare aggregates of photos. So here again is the artist Salavon, this time showing an average of high school yearbook portraits from 1967 and from 1988. And again it's just kind of delightful that you can sort of see a story in these average pictures and then see how the averages change over time. One of my favorites, and a kind of forehead-slapper, wish-I-had-thought-of-that thing for telling stories over long periods of time, is this Picasa face movie application. So here's a YouTube video of it. It's really simple.
>>: I think this is coming out as a SIGGRAPH paper this summer; it was in the preview yesterday.
>> Bill Freeman: Let me just finish this and I'll take your question. So you take all your pictures, put them in a shoebox as it were; I guess you have to tell it what the ordering is. And then the only computer vision technology, really, is identifying the face and lining things up so that the transitions work well. But it's such a compelling story over many years; it tells the story of this woman growing up. And anyway, again, it's a nice use of computational methods to help tell stories over long periods of time. Yes?
>>: Two questions. One, there's a series of four sisters photographed over, I think, 25 years -- the Nixon sisters, I'm not sure of the artist -- photographed, and they tried to keep it, what do you call it, canonical: the same sister in the same position. It's the something sisters. Nixon, maybe not.
>> Bill Freeman: I haven't seen that.
>>: Has anybody tried to rephotograph the original building that was photographed in the 1820s?
>> Bill Freeman: Which ones in the 1820s?
>>: In the 1820s, I believe, the first photograph, by Niépce. Maybe it's still standing?
>> Bill Freeman: I'm not aware. Okay.
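Picking up the "master's thesis" idea from the start of this passage, here is a naive baseline sketch: align the rephotograph to the old photo with a feature-based homography, then mask out the regions that differ. OpenCV is assumed, the filenames are hypothetical, a homography is only a crude approximation for a 3D street scene shot from a slightly different viewpoint, and the result would be nowhere near artifact-free, which is exactly the speaker's point about why the real project is nontrivial:

```python
# Naive "what changed between the old photo and the rephotograph" baseline.
import cv2
import numpy as np

old = cv2.imread("nyc_1936.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical filenames
new = cv2.imread("nyc_2001.jpg", cv2.IMREAD_GRAYSCALE)

# Match local features and fit a homography mapping the new photo onto the old one.
sift = cv2.SIFT_create()
k1, d1 = sift.detectAndCompute(old, None)
k2, d2 = sift.detectAndCompute(new, None)
matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
good = [m for m, n in matches if m.distance < 0.7 * n.distance]   # Lowe's ratio test

src = np.float32([k2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([k1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
new_aligned = cv2.warpPerspective(new, H, (old.shape[1], old.shape[0]))

# Crude change mask: blurred absolute difference, thresholded.
diff = cv2.absdiff(old, new_aligned)
changed = cv2.threshold(cv2.GaussianBlur(diff, (21, 21), 0), 40, 255,
                        cv2.THRESH_BINARY)[1]
gone_since_1936 = cv2.bitwise_and(old, old, mask=changed)          # there then, different now
new_since_1936  = cv2.bitwise_and(new_aligned, new_aligned, mask=changed)
```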
Now finally I'm going to slow it down again; I'm going to go beyond centuries. How can you take photos beyond centuries? There are two ways you might look at it. One way is to take photos over a centuries-long time scale of human-scale things. Instead of a time capsule, can we make a time capsule camera that records things over hundreds or thousands of years? And this becomes as much a hardware project as a computational project. It reminds me of the 10,000-year clock; I don't know if they're still working on that project or if they've launched it or what, but the goal was to make a clock that would keep accurate time for 10,000 years. So if you want to make a camera like that, I think you'd have similar obstacles to work against.
But then the second way to make photographs of very, very long ago is to give up on the notion of taking pictures of things at a human scale, and go back to relying on the finite travel time of light and look at astronomical images. Again, if we give up on taking pictures ourselves, this is from 5,000 years in the past, because it's an astronomical object 5,000 light years away. Let me run through a little bit of this. Let me jump ahead here. This is 50 million years in the past. This is 200 million years in the past. And this is 3.8 billion years in the past. There was actually an event that occurred over the course of just a day or so. They think some star slipped into some black hole and made this huge gamma ray burst, which astronomers then directed their telescopes at, looked at, and said, here's a picture of the area where this catastrophe occurred 3.8 billion years ago from our frame of reference.
>>: How did they date how long ago the light left this galaxy, or how far away it is?
>> Bill Freeman: I believe there are a number of ways. There are sort of these reference galaxies where they think they know how far away they are and how fast they're moving away from us, so from the redshift you can tell how far away they are. I don't know all the details, but tricks like that are used. And you're not going to get a photograph of something much older than this, because that's, you know, less than an order of magnitude away from the age of the universe itself. So that's the other extreme of the photograph you're going to take.
So we've covered the gamut, then, from the very, very shortest -- taking a photograph of the fastest anything can be moving -- to a photograph of as long ago as we can see. And photography lets us take pictures anywhere in between, really. As far as where the research is to be pushed, I just don't think we're going to beat the short time edges and the beauty of these photographs by Edgerton and others. But I do think there's a lot to be done in the area of long time frame photography, because again these things that you want to remove are lighting effects or changes in position, and handling those properly is a computer vision problem. And I think we can -- there's a lot to be done in this realm of the problem. So that's it. Thanks.
[applause]
>> Rick Szeliski: Before we take questions I want to say one thing that I forgot to say at the beginning, which is that Bill is here for the whole week. Bill is a consulting researcher with us, so you can talk to him about anything you're doing in the company. He's signed nondisclosure agreements. So please either drop by his office, which is in our hallway, or send an e-mail. He's got an interim e-mail account, or he reads his MIT e-mail.
So please take advantage of his visit to chat with him. So are there any questions now on the talk?
>> Bill Freeman: I think you've been good about asking questions during the talk.
>>: Can I ask you very quickly -- because it's recorded directly on film, two beams of light can be way more than black, but the rest can be done --
>> Bill Freeman: I think that's the story. I mean, also, with these two beams of light, it's not just a nonlinear interaction; it's the fact that they're coherent with respect to each other. So that would give you a different signal than just averaging each one by itself.
>> Rick Szeliski: Any other questions? Okay. Thanks a lot, Bill.
[applause]