
>> Dan Fay: For those of you who were here last year for Jeff's initial talk, this is similar work
from Jeff Dozier from UCSB who had done some work on the snowpack but wanted to kind of
update us on the latest activities from his visiting portion this year. Thanks Jeff.
>> Jeff Dozier: Okay. I've decided I like this title, being an intern. It's sort of good [laughter].
So what we're looking at is the idea of trying to really do snow hydrology in areas where runoff
from snow is really important and yet there's not much of a surface infrastructure to really
make any sort of measurements. So this is the Hindu Kush range of Afghanistan. We're looking
at an area that's eight times the size of the state of Washington, so we're trying to run these
models over pretty large areas. Okay. And the problem that we face, or that the people there
face, is illustrated by this advisory that came out from the UN's Integrated Regional Information Networks, which monitors conditions all over the world and tries to
alert the international community if something is troubling. And you can see this fairly sort of
desperate warning and the problem is if you look at the date on it, it came in in September
after the harvest had failed. So the question is could you have done a better job, and even
looking at passive microwave data we see that this year in this basin just in terms of total
amount of snow it was a pretty low number, and so the idea is that we could have given that
warning in April rather than in September and therefore could have better organized a
response to it. In fact, it was ironic that the following winter, in 2012, was a fairly big snow year
and this caused the problem of not being able to get food supplies into places where people
were starving. One year of not enough snow and then the next year of too much snow and the
combination of those led to a lot of starvation. The problem with this, though, is one of the
questions is: do passive microwaves in fact give you a reasonable signal of the snowpack in
the mountain environment. We've done some of this work in the Sierra where we've got
measurements at the surface to compare with.
>>: Can you back up? How do you do these passive microwave measurements?
>> Jeff Dozier: Oh. What a passive microwave signal does is…
>>: Is it a plane flying over?
>> Jeff Dozier: No. It's a satellite. And it's 25-kilometer pixels or so. The reason is that the
emission from Earth's surface at those long wavelengths is very small, so you are not getting
many photons to count, so the only way to do it is you got to open up the -- you can't get a very
good resolution. The principle upon which it operates is in the microwave part of the spectrum,
ice is not very absorptive, but it does scatter radiation. So you get radiation being emitted from the soil and then being scattered by the snowpack above it, and scattering, of course, causes extinction and that's why on a cloudy
day there's less sunlight under the clouds. So what happens is that by looking at the emission
from the longer wavelengths where you are seeing through the snowpack and then looking at
the shorter wavelengths you're actually trying to estimate how much attenuation is coming
from the snowpack and therefore the snow water equivalent. The problem is if you compare
this with a method called reconstruction that we've worked a lot on, you can see that it's an
order of magnitude less if you look at the numbers on the y-axis. In the mountains at least the
passive microwave estimates are only seeing about 10 percent of the total volume and there
are some physical reasons for why. Therefore, we've tended to focus on this idea of
reconstruction, but you can see this problem. This is a time series map of the passive
microwaves. I'd like to make it go just a little faster. Okay. This is a daily map and we know
snow doesn't really behave this way. It's flickering on and off and so one of the questions is
well, are there ways that we can still use it. And in order to figure that out we have to have
ways of estimating the spatial distribution of snow. The way I do this is I've got the passive
microwave data over here. The nice thing about them is that they are timely. You get the
estimate of the snowpack right away, but with a lot of uncertainty and very coarse resolution. From
Modis, or other satellites, but from Modis I can get estimates, daily estimates of snow cover
and the reflectivity of the snow. Then from the global land data assimilation system, we can
estimate solar radiation and longwave radiation and so forth, and therefore we can put that all
together and we can model the snow melt day by day. We can't tell with this how much snow
there is, because in this part of the spectrum we don't see through the snowpack; we just see
the surface. On the other hand, if you can model the melt and if you can tell when it
disappears, then you can back up that calculation and figure out how much there was on a
previous day. So the idea is, to do that, we want to get an estimate of the snow water equivalent, trying to figure out if we can correct the passive microwave data. Then, you
know, the way that this will be used in an operational sense is we could put that year into the
historical context. Automatically, we can sort of say well is there reason for concern or not, and
if the answer is yes, then the Army takes that information and, you know, looks at what
happened in previous years and then issues warnings. Now what I worked on this summer is a computationally intensive part of the problem, figuring out how to use cloud computing to help with this. The issue is I actually need to have a daily value of the snow-covered area, and along with that I get an estimate of the grain size and the albedo. How do
we do this? We start with, we kind of try to go to basic physics. This is a graph of the optical
properties of ice and water and this is the index of refraction, the kind of thing that you learned
about in high school about how light bends when it goes through a substance. This is the
absorption coefficient and I've actually got a slide that kind of explains what these are. This is
what the, the definition of the refractive index, how the light bends as it goes through a
material. This is the definition of the absorption coefficient, that is, you get a decay as
you're passing the light through a pure substance and you normalize it by the wavelength so
that the absorption coefficient can be dimensionless. Then if you solve that differential
equation, you get a…
>>: I found you.
>> Jeff Dozier: Hello Tony [laughter]. So we were just defining the absorption coefficient. So
we get an exponential decay. Now, in order to kind of explain what a number of an absorption
coefficient might mean, I simply can take this exponent and solve for the distance at which this
number is going to be -1, and so we can call that an e-folding distance for light as it's passing through snow, or, excuse me, through pure ice. Let me back up and do that. And so what we see is if we look at the e-folding distance for ice that it varies by seven orders of magnitude
over the distance of the solar spectrum. In the visible part of the spectrum that number is tens
of meters, so when you go diving in Hawaii and you're under the water, you can see a long way.
And similarly if you were frozen in bubble free ice you'd also be able to see a long way.
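The definition he just gave can be turned into a few lines of code: with the dimensionless absorption coefficient kappa (the imaginary part of the complex refractive index), intensity decays as exp(-4*pi*kappa*z/wavelength), so setting the exponent to -1 gives an e-folding distance of wavelength/(4*pi*kappa). A minimal sketch, using rough order-of-magnitude values for ice that are assumed here purely for illustration:

```python
import math

def e_folding_distance(wavelength_m, kappa):
    """Distance over which intensity falls by 1/e in a pure absorber:
    solve 4*pi*kappa*z / wavelength = 1 for z."""
    return wavelength_m / (4 * math.pi * kappa)

# Rough order-of-magnitude values of the imaginary index of ice,
# assumed here for illustration (not measured data).
visible = e_folding_distance(500e-9, 1e-9)    # visible: tens of meters
swir = e_folding_distance(2000e-9, 1e-3)      # shortwave IR: under a millimeter
```

The spread between those two numbers is the seven-orders-of-magnitude range across the solar spectrum that he describes.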
Whereas, when you get out to the longer wavelengths, that number, you know, gets down to
less than a millimeter. The consequence of that is we can then put those kinds of numbers into
a calculation of the scattering properties of an individual snow grain. This used to be a really
interesting and difficult computational problem. That is Gustav Mie published his equations in
1908, the first really fast and useful solution to those equations by computer was published in
1980, 70 years after the equations themselves appeared. But I don't have to worry about that.
That's been done now and been done really well. What you can do then is take those
properties for scattering from a single grain and you can do a multiple scattering solution for
what's going on in the snowpack. What you end up with is something that is intuitively pretty
obvious. What this graph shows is the reflectivity of snow, of deep snow for the wavelengths of
the solar spectrum for a variety of grain sizes, going from very fine grains to coarse grains.
>>: Is .05 deep powder?
>> Jeff Dozier: Pardon?
>>: .05 [indiscernible]
>> Jeff Dozier: .05 is pretty small, yeah. That's about as small, maybe .03 is about as small as
snow gets. On the right-hand y-axis I've got that absorption coefficient plotted and what you
see is where the absorption is low the reflectivity of snow is really high and it's not very
sensitive to the grain size. Across the visible spectrum, snow is white, right? And
it's white -- but if you look at an individual snow grain, it's not white; it's transparent. But the
multiple scattering makes the reflection white. If a child asks you where does the white go
when the snow melts, you've got a way to answer it.
>>: It turns black in my driveway [laughter]. I've got an actual question. Is there an intuitive
reason why certain wavelengths have the high end of wavelength, weird spikes. What does
that mean?
>> Jeff Dozier: Those are rotational and vibrational modes in the quantum mechanics of the
absorption of ice.
>>: The rotational states of the water molecule.
>> Jeff Dozier: Or of the ice molecule. Ice and water are only shifted a little bit in this part of the spectrum, whereas out in the microwave ice and water are really different. What you
see here is a couple of things. One is as you get out to the region where the absorption is
moderate, then the grain size makes a difference, so what happens as a result of that is as the
snow ages and the grains grow, it becomes less reflective. Remember, that about half of the
sun's energy is out beyond the wavelengths of the visible. And then out here snow is pretty
dark. That helps us distinguish snow from clouds, because clouds have little particles. That's
why they're still up in the sky. I mean, that's really the difference between an ice cloud and the snowpack: the snowpack is a cloud where the particles got big enough that they fell out of the sky and landed on the ground. What this means is that if you compare snow to the
other things that occur on Earth's surface, here's vegetation. Here's different kinds of soil, first
of all, there's a lot of variability in snow. If you go out beyond the visible snow is one of the
most colorful substances in nature, but you also see that if you compare it to the wavelength
bands of Modis that are here that it really is distinctive. It allows us to distinguish snow from
the other elements. And out here we can distinguish it from clouds. In other words, snow is
about the only thing that is really bright in the visible part of the spectrum and really dark in
what we would call the shortwave infrared and sensitive to grain size in the middle between
those. What that allows us to do then is if we have satellites that have this sort of spectral
information, we can distinguish snow from other substances and this is with Landsat and this is
very nice. This is at 30 meter resolution, but it's got a 16 day repeat pass because the swath is
only 185 kilometers, and so therefore you miss opportunities. A lot can happen in 16 days, and if that day happens to be cloud covered, now you are 32 days between acquisitions, so we'd
like to do something a little better. This use of the shortwave infrared part of the spectrum
allows us to distinguish clouds, so here's the visible bands and you can see the clouds and the
snow are a little hard to tell apart. But there's the, if you use the bands out in the further end
of the spectrum, you can see that the clouds are pretty distinctive.
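The contrast he's using here, snow bright in the visible and dark in the shortwave infrared while clouds stay bright in both, is what the standard normalized-difference snow index (NDSI) encodes. A minimal sketch with invented reflectances (not values from the talk):

```python
def ndsi(green, swir):
    """Normalized-difference snow index. Snow is bright in the visible
    (green) band and dark in the shortwave infrared, so NDSI is high for
    snow; clouds stay bright in both bands, so their NDSI is low."""
    return (green - swir) / (green + swir)

# Invented band reflectances, for illustration only
snow_pixel = ndsi(green=0.90, swir=0.10)   # high index: snow
cloud_pixel = ndsi(green=0.85, swir=0.80)  # near zero: cloud
```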
>>: The clouds are the blue?
>> Jeff Dozier: No. The clouds are [laughter]
>>: [indiscernible]
>>: Is that Eastern Sierra?
>> Jeff Dozier: Yeah, that's Mono Lake. What Modis has is it's got very similar bands to what's
on Landsat but it's got a swath of 2300 kilometers and so this is a Modis tile that is 1200 x 1200
kilometers, so that's, you know, 1.44 million square kilometers. The state of Washington is
180,000 square kilometers, so this is eight times the size of the state of Washington. This one
we get daily coverage, but the spatial resolution is 500 meters and so what we're able to do is
by this, with this spectrum we can at least compensate for this coarser spatial resolution by
estimating a fractional cover of snow within each pixel. What this does is we can calculate what
are called end members, so in this case this is the green is the concentration of vegetation. The
red is the concentration of soil and the blue is the concentration of snow and so we can solve
for each. That's also a pretty computationally intensive problem, but it can be done as a
parameter sweep. Each pixel's computation is separate from the neighbors, so it's a nice
application for trying to put onto a cloud. It works pretty well. This is comparing what we get
with at 500 meters with what we get from 30 meters and this is the scatter diagram. There are
some things that I want to do with this part of the process but that's not really what I focused
on this summer. The problem is I do want to get a measurement every day, and sometimes
what you see is you get clouds. What I'm thinking here, this is now a data cube, so this is, the
plane is the spatial dimension, and it's in a projection so these are kilometers north and south
on the left and east and west and then date on that axis. The red color shows the absence of
data, so I picked it. I'm not sure it was a great choice, but on the other hand, there are holes in
my data so it's bleeding [laughter]. And so we can look through the year, or in this case just a
32 day period, and so my first step in trying to get to the daily data is to use a three-dimensional Laplacian to try to fill in those holes. But I don't mess with, I don't change any of
the observations themselves. You can see that we get something where the holes are all filled
in, but where it still looks pretty messy. We need to have, and this is the thing that makes it
kind of a harder computational problem is that these really are three-dimensional data where
there's lots of neighborhood effects because sometimes we want to slice the data this way.
Sometimes we want to drill down through the column and so in trying to sort of fix this, to
make it smoother, we want to be able to use some knowledge about what we have. There's a
couple of different kinds of glitches in the data. This is a pretty clear day, so what we end up
with though, is we have both low frequency dropouts caused by the clouds, but we also have
high frequency because some of the, one of the Modis bands is starting to go bad a little bit.
It's got some periodic noise in it indicated by these little red dots. And if we look at it in, if we
zoom in on some of these areas we can see, again, both this low frequency noise that we can
identify, but also some high frequency noise. Not only that, we see this break in this image
right in the middle and the reason for that is in this case this image was stitched together from
two different orbits. Part of the cause of this variability is the fact that we're getting this wide
swath, you know, of more than 2000 kilometers from only a 700 kilometer orbit and so that
means you've got to be looking at things at a pretty high viewing angle. So this is a map of that image and it's what we would call the sensor viewing angle, so that's the angle up to the
satellite if you were standing on the surface. If it were a plane parallel system, that would be
the same as the nadir angle from the sensor, but because the earth is curved, those two
numbers are different. Now the problem is what happens as your -- so where it's blue it means
that that place was right underneath us at this time of the orbit. Where it's red, you know,
we're up to 60 degrees or so off nadir, off the zenith where we're looking at it and this thing in
the middle is where the two orbits were stitched together. What that means is that at the edge
of the swath, so the pixel right underneath the satellite is a half a kilometer square. The pixel at
the edge of the swath is about 5x1 kilometer, so it's 10 times the area at the edge of the swath.
That's part of the problem that is introducing some of this noise into the images that on
different days you are actually looking at a different piece of real estate on the ground and you
want to try to put together a -- how do you put a picture together? I think what this is -- it's a
class of smoothing problems that, where I have more confidence in some of the data than I do
in other measurements, and so I want to adapt a smoothing method that in fact takes
advantage of the fact that I have a physical reason for having more confidence in some points
than in others. What I do is I actually use a smoothing spline, but I weight the smoothing spline
inversely to that viewing angle. Were you raising your hand?
>>: Yes. Do you have to deal with [indiscernible] air column and then more moisture
[indiscernible]
>> Jeff Dozier: We actually start with an atmospheric corrected value, yeah. I guess the point is
that if I have a bunch of clear days, then the off nadir shots don't contribute very much to the
signal. They pretty much get ignored in the smoothing algorithm, but on the other hand, if
that's the only view I have in a two-week period then I'll use it. That's what results.
>>: This is after smoothing?
>> Jeff Dozier: What?
>>: This is after smoothing?
>> Jeff Dozier: This is each [indiscernible]. This is the first time I've seen this [laughter]. So what it is, is that same cube that we looked at before, and it's showing every day over a 32 day period.
>>: As a result of the smoothing you did?
>> Jeff Dozier: As a result of the smoothing and I think it's pretty good.
>>: Yeah, it actually looks much better than the other cube.
>> Jeff Dozier: Yeah. [laughter]. That's the idea, yeah.
>>: Yeah, definitely. You can see the, where before there was [indiscernible]
>> Jeff Dozier: Yeah. Okay. So it works really well and I'm really pleased to have done this.
The only problem is this is just 800 pixels by 800 and it's only a 32 day slice, so this is a ninth of
a Modis image and a 12th of a year and this took about two hours.
>>: Computationally or it's a clock time?
>> Jeff Dozier: Yeah.
>>: Wall time the Azure process [indiscernible]
>> Jeff Dozier: Actually I ran this just on one node, but I know now how to break this up and
that's [indiscernible] and we're going to do that. I got another week. In other words, what I'm
going to do is use, so I do the Laplacian smoothing over the full tile, you know, of 2400 x 2400,
and then what I'm going to do is divide that into nine parts and then take the whole year
column and do the smoothing on the whole year. And that's a way of actually taking advantage of multiple things. And then the way that the reconstruction works is once I
have that then I can run a snowmelt model, and the way I do that is illustrated with the measurements from a snow pillow: even if you don't have a snow pillow, if you know what day the snow goes away and you can calculate the rate of melt, then you can back
up and figure out how much there would have been. And so this gives us a couple of things. It
gives us a spatially distributed estimate of how much snow there was back to about the peak of
the snow cover. And so that allows us then to compare with passive microwave data. It also
allows us to compare with models, because one of the problems especially in precipitation
models is you've got a grid. You are modeling at some spacing on the grid of 10 kilometers or
150 kilometers or something like that and how do you compare that to a measurement? What
is it that you compare to? Now we've got something that we can use to compare. The way this
works, we compared in the Sierra where we've got some surface measurements with
measurements at snow courses which are the ones done monthly by people skiing through the
mountains and poking a tube in the snow and weighing it, and then also with snow pillows
which are an automatic measurement. The good thing is, obviously there is some error in that,
but the error is centered around 0, so there doesn't appear to be a bias. And part of the error is
that the snow pillow is only representing a point within a half a kilometer pixel, so the snow
pillow is not a perfect measurement either. And then if we compare the inputs from -- we
estimate the incoming solar radiation pretty well. Air temperature we do pretty well. A little
bit of a bias in the incoming longwave radiation. That could also be a measurement problem.
That's a difficult thing to measure at the surface and in the Sierra Nevada there are really only
three long-term stations that do it. The reason it's difficult is that your instrument is emitting the same thing you are trying to measure [laughter] and so
the temperature compensation has proved to be hard with those. Okay. So is this
reconstruction giving us a good answer, because we have other methods at least in well
instrumented places of getting alternatives? One is if we have a lot of surface measurements
we can just do a spatial interpolation, or if we have a lot of surface measurements we can do a data assimilation model. And the reconstruction is, in fact, showing greater amounts of snow
than any of those. So are we right, or are they right? Here's our estimate that shows the
reconstruction is right. What we've done is to use the stream flow in these basins, do a
calculation of evapotranspiration and then the change in storage. In other words,
taking the hydrologic balance equation and then estimating the precipitation from that, backing
it out, and both the interpolation method and the assimilation are giving you some negative
numbers, and negative precipitation can't happen.
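The consistency check he's describing rests on the basin water balance, P = Q + E + dS: given measured discharge, estimated evapotranspiration, and the storage change, the implied precipitation has to come out positive. A toy version with made-up numbers:

```python
def implied_precip(q, et, d_storage):
    """Back precipitation out of the water balance P = Q + E + dS, where
    Q is discharge, E evapotranspiration, dS the change in storage."""
    return q + et + d_storage

# Made-up annual basin values in mm; a snow estimate that implies a
# negative P for some basin fails this sanity check.
p = implied_precip(q=400.0, et=350.0, d_storage=-20.0)  # 730.0 mm
```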
>>: [indiscernible] are actually [indiscernible]
>> Jeff Dozier: Q is discharge in the river. E is evapotranspiration, and Delta S is the change in
storage from groundwater. Usually over long timescales that's going to be small. It's a way of
backing out the precipitation estimate and you hope that when you do that your precipitation
estimate ends up being positive. In the case of the reconstruction, for 12 years and 19 drainage basins, the numbers are all positive, whereas some of the other methods are showing negative values for some. And then we've got other -- and so the way that we…
>>: Time is running backwards.
>> Jeff Dozier: Pardon?
>>: Time is running backwards. I'm joking. [laughter]
>> Jeff Dozier: Okay. So the way we do this, again this is another computationally intensive
problem, but this is, we can do day by day, so it's a pretty easy thing to move onto Azure. We
take the short wave radiation estimate from either the national or the global land data
assimilation system. The upper left shows the resolution at which those data come in, which in this case is an eighth of a degree. We smooth those to the size of the Modis pixel and then
we scale it based on the topography. Because the problem is in that eighth degree pixel there's
a lot of topographic variation and so we scale it just doing a pressure scaling relationship. Then
we bring in the slope and exposure, and then we correct for attenuation by vegetation.
We've got then a map of the albedo. We then estimate how much solar radiation is being
reflected back upward and what we end up with is a net. That's what energy that goes in the
bell. And then we do something similar with the longwave radiation but I'll, in the interest of
time I'll skip that detail. Again, that takes a lot of computing, but that's pretty easy to
parallelize because each day is independent of the other days. If we do the same thing for the
Hindu Kush, in this case this is just showing the data for a day, we can get over a mountain
range that's very large. Some of these drainage basins, the Amu Darya itself is slightly larger
than the state of Washington. It's 200,000 square kilometers, so we can get each of these
inputs to the model over every pixel and then we can calculate how much melt is coming from
the radiation, what's coming from the sensible and latent heat flux, which is a function of
temperature and humidity, and then we can get all the melt for a particular day and then we
can do it for every day for every year. So this is showing the variability that we've seen in the
years [laughter]. 2008 is missing. This is my -- my colleague Karl Richter was running this on
the Linux cluster at Santa Barbara and he -- there's a difference in Linux between the rm command and the mv command [laughter] and he typed rm instead of mv after he had done
all of these calculations for 2008. On the other hand, it made it an easier slide with, only having
to put four years in [laughter] instead of five. Okay. That's it.
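The core of the reconstruction he's been describing can be sketched in a few lines: convert each day's net energy to a melt depth using the latent heat of fusion, then sum the melt backwards from the disappearance date to recover the snow water equivalent on earlier days. The daily energy values here are invented for illustration:

```python
SECONDS_PER_DAY = 86400.0
LATENT_HEAT_FUSION = 334000.0  # J/kg to melt ice

# Invented daily net energy available for melt, W/m^2
net_energy = [50.0, 80.0, 120.0, 150.0, 100.0]

# 1 W/m^2 sustained for a day melts 86400/334000 kg/m^2, about 0.26 mm
# of water equivalent.
melt_mm = [e * SECONDS_PER_DAY / LATENT_HEAT_FUSION for e in net_energy]

# Reconstruction: once the snow disappears (after the last day here), the
# water equivalent on any earlier day is all the melt still to come.
swe_mm = [sum(melt_mm[t:]) for t in range(len(melt_mm))]
```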
>>: That's the Hindu Kush?
>> Jeff Dozier: That's the Hindu Kush. That's the Wakhan Corridor up on the right. This is a
sinusoidal projection that is, so it…
>>: [indiscernible]
>> Jeff Dozier: Okay. I've learned how to use Azure sort of with [indiscernible]'s help and I
think I learned, I figured out how to deal with the hardest parallel part of the problem, which is
the fact that we're dealing with something that -- but it's a general problem of dealing with
three-dimensional data where you want to sometimes slice this way and sometimes you want
to slice that way. And then the rest of it is I think easier to run in parallel because once we do
that then we can run day by day.
>>: So the problems of mounting a disk multiple [indiscernible]
>> Jeff Dozier: You can't [laughter]
>>: That's not the problem then.
>> Jeff Dozier: Well. I mean, now I'm wandering into territory where most of you know more
than I do. I guess the issue is that with the blob store, you can't reach into it and read part of a file. You actually have to either -- I don't
know. I don't want to say -- it's too bad, because I'm storing these results as HDF5 files, which support block compression so that you can read a piece of a file even though it's compressed, but you've got to get it out of the store in order to read it. So I think the alternative is to take
those chunks, that is to turn every image into a 9 x 9, or rather a 3 x 3, so turn every image into nine images, and then I can parcel those out to individual machines.
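The chunking he proposes, cutting each image into a 3 x 3 grid so that the per-pixel time-series smoothing can be parceled out to separate machines, is easy to sketch with NumPy; the array here is scaled down from the real 2400 x 2400 tile just to keep the example small:

```python
import numpy as np

def split_tile(image, n=3):
    """Cut a square image into an n x n grid of equal chunks, each of
    which can carry its full time series to a separate worker."""
    return [block
            for row in np.split(image, n, axis=0)
            for block in np.split(row, n, axis=1)]

def join_tile(chunks, n=3):
    """Reassemble the n x n chunks into the original image."""
    return np.vstack([np.hstack(chunks[i * n:(i + 1) * n]) for i in range(n)])

# Stand-in for a 2400 x 2400 Modis tile
tile = np.arange(24 * 24, dtype=float).reshape(24, 24)
chunks = split_tile(tile)
```

Because each pixel's column through time is independent of its neighbors at this stage, the nine chunks can be processed in parallel and stitched back together losslessly.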
>>: This was all the calculations you needed to do to compute the snow cover of this particular region for four years, and what do I learn?
>> Jeff Dozier: Oh. Okay. So what you now have is, let me go back to this. Sorry about that. I
think it was right at the beginning here. Yeah.
>>: I sort of missed this one.
>> Jeff Dozier: So what I've got is methods with which I can estimate the snow cover during the year in real-time, because with this reconstruction you get the answer, but you only get it at the end. On the other hand, what we show is it's a really good
answer. What we now have is something that we can use to compare with say estimates from
passive microwave. And NOAA is developing a Central Asian snow accumulation model, but
they have no way of validating it. So this gives us a method of validation for that kind of a
model, and so they -- in fact, I'm meeting with them toward the end of this month because they
are really, they keep asking us for the reconstruction results for the past decade. Then if we
can figure out how to help with the passive microwave data, as I showed before you came in,
that only sees about 10 percent of the snow. But if we can figure out how to correct it, then
that geophysical time series goes back to 1978 and so we can then do a better job of sort of
putting any current condition into the historical bracket as part of the historical narrative. A lot
of what we can -- I mean, in general, management of water works pretty well when you are
kind of near the median [laughter], you know, and so part of the idea with this is can you
identify the years that are at the two tails of the distribution. Is this…
>>: Do those four years you show the Hindu Kush look more or less the same?
>> Jeff Dozier: In the Kabul part of the watershed there was flooding in 2007.
>>: I don't know where Kabul is on that map.
>> Jeff Dozier: It's, it would have been the part that's draining to the south and that does show
in 2007. By and large one of the things that really helps in sort of management especially in
places where simply giving a volume in cubic kilometers or acre-feet or any sort of unit thing
isn't going to mean much, but if you can put things in historical context, you know, if you can
say that this year is say comparable 2007 when there was flooding, or comparable 2011 when
there was drought, then even the villagers will remember what things were like in those
conditions and therefore, you know, knew what got flooded then or, in the case of drought,
how badly the crops did then.
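Putting a year into historical context, as he describes, amounts to ranking it against the record and flagging the tails of the distribution. A minimal sketch with an invented snow water equivalent record:

```python
def percentile_rank(history, current):
    """Fraction of historical years at or below the current value."""
    return sum(1 for h in history if h <= current) / len(history)

def assess(history, current, tail=0.2):
    """Flag years that fall in either tail of the historical distribution,
    where near-the-median management rules stop working."""
    p = percentile_rank(history, current)
    if p <= tail:
        return "drought concern"
    if p >= 1.0 - tail:
        return "flood concern"
    return "near normal"

# Invented basin-total snow water equivalent record, km^3
record = [12.0, 15.0, 9.0, 14.0, 11.0, 16.0, 13.0, 10.0]
```

A very low current value lands in the lower tail and would trigger the kind of early warning he argues could have been issued in April rather than September.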
>>: And then mobilize resources early on.
>> Jeff Dozier: And then you can mobilize, yeah. Again, what happened in 2011 was there was a big drought but they…
>>: Can you show those the final answer again?
>> Jeff Dozier: It's probably easier to…
>>: So 2007 was a flood?
>> Jeff Dozier: 2007 in the southern part here was a flood, yeah.
>>: And the other, I don't see much difference.
>> Jeff Dozier: That's true. The interesting thing about that year and about 2011 is the snow-covered area was pretty similar even though the depth of the snow was different.
>>: An interesting part is that it's a piece of layering kind of with the basin and where the water
sort of goes out of and then also the populations [indiscernible]
>>: That you can use [indiscernible]
>>: [indiscernible]