>> Jie Liu: So I'm happy to introduce Dan Work from the Department of Civil and Environmental
Engineering at the University of California at Berkeley. Dan Work is a Ph.D. student there. He's
graduating really soon, in a couple of months. And after that he's going to be joining UIUC as
a faculty member in the civil engineering department there. He'll describe his work on the
Mobile Millennium project, a project estimating traffic based on smart phone data. Go ahead.
>> Dan Work: Thank you for the introduction, and thank you for having me here today. I'm
going to talk today about real time estimation of distributed parameter systems, and specifically
with application to traffic monitoring, based on some work we've been doing at Berkeley through
the Mobile Millennium project, which is a large deployment we have of users in the Bay Area who
have downloaded mobile applications, collecting GPS data and sending it into our system, and
then we use that information to estimate traffic conditions and send it back to them.
This is part of my Ph.D. research at U.C. Berkeley where I work with Professor Alexandre Bayen in the
Department of Civil and Environmental Engineering. Parts of the talk are also joint work with
Nokia Research Center in Palo Alto. I spent a couple of years both as an intern and visiting
researcher working on parts of this problem.
So I seem to have some problem with the slides. Let's see. Let's see if that helps. So before I
get into talking about traffic monitoring, in particular, I want to give some context about distributed
parameter systems and why they're important, especially as we're interested in managing and
monitoring both the built infrastructure and the natural environment.
So a distributed parameter system is just a system in which the spatial variation of the system
plays an important role in describing the evolution of the system in time. So this is true, for
example, in the context of monitoring air quality, looking at how contaminants propagate in rivers,
even describing how buildings respond to wind or seismic loads, and of course what I'll talk about
today is how congestion propagates on roadways. So
the commonality amongst each of these systems is the fact that again they have a spatial
component which is very important that we like to characterize so we can understand how the
system will evolve forward in time.
Basically the game that we play here is that we have the physical world, and we
want to come up with some mathematical abstraction for our distributed parameter system. One
way to represent a distributed parameter system is in the form of a partial differential equation.
You take the model, build the abstraction or take a physical world, you build your mathematical
model. If you describe the appropriate initial conditions and boundary conditions and model
parameters you can completely characterize how the system will evolve.
Okay. This is really useful, for example, if you want to understand how to evacuate
an urban corridor, or what's going to happen if you change the average sea
temperature of the world by, say, two degrees. The problem is you're only loosely coupled with the
world, though, because in practice initial conditions are almost never available. The boundary
conditions are unknown.
And model parameters have a lot of uncertainty in them. So in the direct problem, where we
basically specify this stuff initially, and then run our forward simulations, we lose this coupling
again with the physical world.
The problems I'm interested in, though, are estimation problems, where basically we try to
compensate for the uncertainty in our mathematical model with additional sensor data from the
physical world.
If we can augment the model with that sensor data, we can solve estimation problems in
which we try to get a better estimate of the state of the system than either the data alone would
give us or the mathematical model alone would give us.
And in this way we can create a tighter coupling between both the computational or cyber
side of our modeling infrastructure and the physical side, which is the actual world that
we're interested in monitoring or controlling. If we can create this tight coupling, it allows us to
feed our estimate of what's happening in the world back into the system, either for
direct control, in the traffic context by changing things like traffic signals, or by giving that
information to the users and letting them respond based on the new information.
So basically the game with these estimation problems is that you either estimate the state of the
system, which in some communities is called data assimilation, or you estimate parameters,
which is also known as inverse modeling.
But if you can correctly estimate these things, and again you can feed that back into the system.
Okay. The major problem with doing estimation on distributed parameter systems is going out
and collecting sensor data. It's extremely expensive to sense in large distributed areas. And I
think there's really two things that are starting to change how we do sensing on a lot of these
things. And the first is the mobile Internet.
The fact that by 2013 there are expected to be something like 1.3 billion smart phones worldwide. I'm guessing
that's probably the world's largest sensor network that has communication, computation, and
some form of sensing embedded in it.
Now that platforms are starting to open up, we can actually have access to the applications that
run on mobile phones, and that gives us a platform for developing a bunch of rich applications that
connect our physical environment, and the people those cell phones travel with in the physical
world, with the cloud.
The other thing is Sensor 2.0, an emerging paradigm based on some of the work
done at MSR, where instead of deploying a sensor network and then hiding the
sensors inside your proprietary application, you make them available to the outside world. Let the
sensors be a platform as a whole.
And so although you may have a sensor, for example, that's used for traffic, someone
else may want to use that information for estimating something totally different. In order
to get that information, having a common platform for accessing the sensor data becomes
useful, especially in building these large distributed systems.
Okay. So what I want to talk about today in the context of traffic is really how to combine this
modeling and this sensing for an estimation problem, both online and in real time. Online
basically I want to do this as data becomes available, I want to be able to continue to do my
estimation piece by piece as the information is available and in real time I want to be able to
produce an estimate fast enough that the physical system hasn't changed before I make that
information available.
And sort of one common approach to solving these problems is basically: I look at the model that I
have of the physical world and the estimation problem that I want to solve,
and I go out and I specify what type of sensor I should design to solve this problem, or
where I should place these sensors so I get a good estimate of what's happening in this distributed
system. In today's talk I'm going to do the exact opposite, specifically because of the availability
of GPS data. As I'll show later most of the mathematical models for traffic are density-based and
the sensing that comes from GPS smart phones is velocity information. So there's some
incompatibility here but there's so much GPS data that's going to become available soon we'd
like to take advantage of that. And to make the estimation problem easier, rather than specifying
and designing the sensor that I'd like, what I'm going to do is I'm going to change the
mathematical representation of the model so it becomes easier to integrate the GPS data into the
estimation problem.
So I'll show how that's done. That's really the core idea is to recast the model in a way that's still
physically consistent with the work that's been done in the transportation community for 50 years
in a way that can take advantage of this new data that's becoming available from smart phones.
So a little bit more concretely, the things I'm going to talk about. First I'll provide background on
traffic monitoring technologies: what does the existing sensing infrastructure look like, and what are the
issues when you start using GPS data from smart phones.
Then I'll talk a little bit about the mathematical models of traffic. So first what do these historic
models look like and why do transportation practitioners like these models. I mean, what do they
model and measure.
And then I'll talk about how I can transform those models from density, which is the state these
models work on, how many cars are stacked on a stretch of roadway, into a velocity
evolution equation.
I'll talk about how you can expand that to networks. Finally I'll talk about the real time estimation
problem, how do you solve this integration of uncertain measurements in an uncertain model in
real time.
And it's solved using a variant of Kalman filtering known as ensemble Kalman filtering. I'll show
experimental work we've done to validate that the algorithm works well in practice.
I'll just conclude with a few summarizing remarks and some future directions for where we'd like
to take this work. Okay. So let's talk about traffic monitoring technologies. If you've ever been
stuck in traffic there's no real need to motivate the problem. The average driver in the United
States right now, I think, wastes something like a week stuck in traffic every year. It's a huge annoyance
but it also has a huge impact on the economy.
Departments of transportation have been very active and they're trying to build monitoring
systems. We know we can't build our way out of the congestion problems. We need to start
managing this infrastructure better. So they built systems that either face inward for
transportation practitioners that basically record any of the count data that becomes available,
how many cars are using a roadway during the day. They can record that and store it in a
database and throw it on a map and use this for planning.
And public facing tools like changeable message signs that may tell you how long it takes to get
to an airport based on real time traffic data.
So the sensors that feed this, basically the vast majority of traffic sensing technology relies on
inductive loop detectors. It's a sensor that basically you put a coil of wire in the ground and as
cars drive over it you register a signal and you can count the number of vehicles.
And there's been magnetometers that are developed that are wireless and do the basic idea of
counting the number of cars. The problem with magnetometers and inductive loops, if you put a
sensor in the roadway and you have several thousand vehicles cross over it every day, over time
those sensors fail. The only way to replace them is to shut down a lane of traffic and then dig it
up and replace it.
This is very expensive. In some areas it's not feasible to shut down the traffic to replace these
sensors.
>>: Does the loop give you just the presence of the vehicle, also the speed?
>> Dan Work: So it depends on the type of loop. In this case, if you see these here, there are
actually two loops. There's one small loop here and one here. They just use the
time difference between when the signals register; they know how far apart the loops are
spaced, and from that they can get the speed -- what's that?
>>: Single loop?
>> Dan Work: There's some people that have done some work to try to get estimates of the
speeds from the single loop based on how long the vehicle was sitting over the sensor. But, of
course, this depends on how long the vehicle is.
So this -- you can get some speed measurements out of them. They're not the most accurate.
Double loops are more accurate but they still have some problems as well.
So a lot of the departments of transportation have been saying let's stop putting sensors in the
roadway, this is too expensive to install and maintain, it's not practical, let's get them off the road.
Either they use radar detection or video detection to identify license plates or just to show people
the images of whether traffic is congested or not. These systems are -- I mean, if you know
NAVTEQ, they own traffic.com. This is the sensor they deploy, a radar sensor on
freeways in the United States. People started to say let's go one step beyond that, let's put
the sensors on the vehicles themselves. Either taking advantage of the GPS units that are in
fleet vehicles like taxis or FedEx trucks or UPS, or in consumer vehicles with toll tags: on the
East Coast it's E-ZPass, and on the West Coast, or at least in the Bay Area, it's FasTrak.
So these have an RFID transponder. You use it to pay your tolls, but they also deploy readers in
the transportation network and record basically when you passed one sensor and when you
passed the other, and that gives them an idea how long it took you to travel on that stretch of
roadway.
But what everybody is really excited about now is the GPS data. And why is that? Well, to see
what fleet data and GPS are making available to monitor traffic right now, just look at
these images.
So this is the San Francisco Bay Area. This is one day of taxi data. There's 500 taxis, roughly,
that we have access to in Mobile Millennium. Each taxi reports its position once a minute. And
so each red dot in this image -- this is the San Francisco Bay Area, this is zoomed into a
particular area, corresponds to one measurement. You can already start to make out -- we
talked a little bit earlier about whether you can identify the roads.
I think it's pretty obvious, you can start to -- it wouldn't take much work to be able to identify what
the network topology looks like there. There's opportunities to even potentially build the network.
If you're really familiar with the area you might recognize this as really the San Francisco airport,
and if you go to Terminal A or B or C, you can drive through there.
Okay. So fleet data is already available. It's already being used by several commercial
companies to try to estimate traffic conditions. And what's got people really excited is that fleet data is
relatively small. You're talking about, I think these numbers are from NAVTEQ, they're looking at
basically 100 million points worldwide annually.
But in a couple of years, based on just GPS data from cell phones, it's going to explode to more
than a billion points.
It's kind of funny, really early when we started this project, I had to convince people that GPS was
going to be a feature on your phone. And the only analogy I could come up with would be like it
would be looking back five years and saying we're going to have cameras on phones. This is just
preposterous. But now it's sort of obvious this is where things are going based on location-based
search, location-based advertising. There's too much inertia and too many services that can be
built on this stuff. Every phone is going to have it. Most cars will have it soon.
And this is really what we want to start taking advantage of. How are we going to build models
and estimation algorithms that can take advantage of this GPS data. Because it's just going to be
so much cheaper to acquire than the fixed sensing infrastructure that we currently go out and
deploy.
So the first real problem I worked on at Nokia and as part of my Ph.D. was really how do we
collect GPS data in a way that manages the privacy of users. And the paradigm that we came up
with, which was really led by Baik Hoh, a researcher at Nokia Research Center in Palo Alto, was the notion
of a virtual trip line.
I'll explain a little bit later what that is. To motivate the privacy problem, okay, this is a small video of
me in 2008 driving in Berkeley, California, in my car, with the first prototype application on a cell
phone just recording GPS trajectory data.
As you saw at the very beginning of the video, there's a stop right here. I'm picking up another
Ph.D. researcher, Juan Carlos Herrera, now a professor in Chile. It's easy to identify which home
he lives in based on the trajectory alone. You can also tell we left Berkeley, if you know the area
well. Even though the data is anonymous, even though it doesn't say anything about the car, it's
pretty easy to reidentify who might be in that car based on just where the trip started, where the
trip stopped, and where there might have been anomalies along the trajectory.
Of course, these green bars correspond to the GPS measurements: where did the phone
actually send a measurement and how fast was it going. And we did this initially because we
wanted to see -- can you get lane resolution? Do you have problems identifying whether or
not you're on a freeway or a frontage road or a nearby road? Over time we came to realize that
with the GPS data, even in dense urban areas, you can get good resolution most of the time, even with
these low quality GPS units that are put in cell phones.
Okay. Temporal sampling with smart phones has some problems. And there have been a number of
researchers -- John Krumm here at Microsoft, in particular, has done a lot of work -- really
highlighting this problem and proposing ways to prevent reidentifying users
who are sending information from vehicles.
So you can do things like basically adding noise to the data that's sent from the cell phone or
anonymizing it or preventing certain areas from sending measurements.
The approach that we took on this project is a paradigm known as a virtual trip line. Okay. A
virtual trip line, the way to think of it is just a virtual sensor. It tells the cell phone where it should
send measurements. It can be used as a trigger for the phone to say it's okay for me to send a
measurement here.
I want to explain exactly what this looks like. The first step as the phone is running its traffic
application, it's downloading map tiles, downloading traffic data. It's also going to download these
virtual trip lines.
It makes a request to the database stored at Nokia and says give me the virtual trip lines of the
road I'm on. You can see what a virtual trip line is, two latitude/longitude coordinates and a line
segment between those two latitude/longitude coordinates. They lie across the roadway. As it
drives down the road, it checks: does my local GPS trajectory that I have stored on my phone
cross one of the trip lines? If it crosses one you can send a measurement: this virtual trip
line ID, at a particular time, and here's the speed I went.
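To make the trigger concrete, here is a minimal sketch of the kind of client-side check being described, assuming the phone keeps just its previous GPS fix; the names (Tripline, maybe_send_measurement) and the flat lat/lon geometry are illustrative, not the actual Nokia client code.

```python
from dataclasses import dataclass

@dataclass
class Tripline:
    vtl_id: int
    p1: tuple  # (lat, lon) of one endpoint of the line segment
    p2: tuple  # (lat, lon) of the other endpoint

def _ccw(a, b, c):
    # Orientation test used by the segment-intersection check.
    return (c[1] - a[1]) * (b[0] - a[0]) > (b[1] - a[1]) * (c[0] - a[0])

def segments_intersect(a, b, c, d):
    # True if segment a-b crosses segment c-d (ignores collinear edge cases).
    return _ccw(a, c, d) != _ccw(b, c, d) and _ccw(a, b, c) != _ccw(a, b, d)

def maybe_send_measurement(prev_fix, curr_fix, speed, timestamp, triplines, send):
    # prev_fix / curr_fix: (lat, lon) of the two most recent GPS fixes on the phone.
    # Only when the local trajectory crosses a trip line is anything transmitted,
    # and the payload is just the trip line ID, a timestamp, and the speed.
    for vtl in triplines:
        if segments_intersect(prev_fix, curr_fix, vtl.p1, vtl.p2):
            send({"vtl_id": vtl.vtl_id, "time": timestamp, "speed": speed})
```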
As the vehicle travels down the road it continues to send measurements. The important thing
here is that, okay, the game in terms of understanding, does this preserve my privacy or not,
basically can I look at this virtual trip line ID, time stamp, and the speed and identify that these
measurements came from the same user or not.
And based on sort of posing the problem in this way, we can identify how far apart these virtual
triggers need to be. And the phone can run local rules that say, basically, I will only
probabilistically send measurements on this virtual trip line. The important piece is that it's
really a framework from which you can better manage the information that's being collected into
the system.
You don't put virtual trip lines in an area that's like a residential area, because most of the
information that you would collect there from the beginning would be something that might be
extremely privacy invasive. But even if you're on a major freeway, say I-90 out of Seattle, if you're
the only person that sent a measurement on a virtual trip line, it doesn't matter how anonymous you
are, at the next virtual trip line we're going to know with 100 percent certainty that that measurement
came from you.
>>: So how often do you sample the GPS? Because I assume maybe you realize you crossed
the virtual trip wire only after you actually crossed it, maybe you're 500 meters --
>> Dan Work: So the current client, or the client that we used in this work, wasn't designed to be
the most energy efficient client. We were basically looking at let's get the data and see if it's
going to work, proof of concept.
So there the phone was running, basically it was polling the GPS at either a one second interval or
a three second interval, something like this.
And from there you're right, at 60 miles an hour there will still be a period where you're not hitting
the measurement directly on. But you can interpolate between it or, again, sort of viewed in the
broader context, I can say send me the measurement right before or send me the measurement
right after assuming that it's relatively close to the trip line.
And the same way, you know, it doesn't -- in the first version of this work, and in all the stuff we've
done for highways, basically we only send a point measurement. If you cross a virtual trip line
you send a measurement here.
It turns out on surface streets that's pretty impractical, because if you send me a stopped
measurement I don't know if it's because the traffic stopped or the traffic light is red. There it
makes sense to put a virtual trip line on either side of the intersection and measure the
time that it takes you to go between these virtual trip lines.
And I can still use these virtual sensors as sort of markers, and as a framework on which I can
design an architecture to preserve the privacy or anonymity of the users that are sending the
measurements in this way.
Did I answer the question? So that's a little bit about the sensing on the cell phone side with
respect to traffic. I want to talk now a little bit about the mathematical models.
Okay. The seminal model in the transportation community for traffic is known as the Lighthill-
Whitham-Richards (LWR) partial differential equation. The state of the system is denoted by ρ(x,t).
That's the density of vehicles.
Take a stretch of roadway and count how many cars there are on that stretch of roadway, divide it
by the length. That gives you the density of vehicles. Q is just a flux function; the
flux, again, is: take a point in space and count how many cars are crossing that point per unit time.
That's the flux.
Okay. So the LWR PDE is a conservation law. It conserves the number of cars on the roadway,
and it relates basically how the density changes in time with how the flux changes in space.
So how does this work? If I take a stretch of roadway with vehicles entering at A and leaving at B, I
look at the rate at which they're entering at A and leaving at B. Based on the spatial
variation in these rates, I can say either the density should be going up, because more cars are entering
than leaving, or the density should be going down, because more are exiting than entering. When you take
this to the limit, that gives you the LWR PDE.
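In symbols, the standard statement of that limit (the textbook form, not copied from the slides) is:

```latex
\frac{d}{dt}\int_{A}^{B} \rho(x,t)\,dx \;=\; Q\big(\rho(A,t)\big) - Q\big(\rho(B,t)\big)
\qquad\Longrightarrow\qquad
\frac{\partial \rho}{\partial t} + \frac{\partial Q(\rho)}{\partial x} = 0,
```

where the left equation is the "cars in at A minus cars out at B" balance on the stretch, and shrinking the stretch gives the LWR PDE on the right.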
So all it describes is mass conservation. It's very simple. But yet this is quite rich in the features
that it can capture. And I'll describe those in a second. So first, basically, we have the LWR PDE.
You specify an initial condition and boundary conditions; on a minor technical note,
the boundary conditions aren't implemented as simply as I've shown here. They have to be
implemented in what's known as a weak sense.
I'll avoid the mathematical details, but it's been well studied in the literature: there are
conditions on when the boundary conditions will apply and when they won't, so that you have a unique,
well-posed problem here.
Okay. So we have a flux function, which still needs to be defined. And the flux function is
given by the constitutive relationship: the density times the velocity. That gives you the flux. You
can check the units to make sure it works. In order to have a flux function that's a function of
density only, that means we have to embed a relationship between the velocity and density.
And this is the fundamental assumption of the LWR PDE: that I can describe the
velocity as a function of density only.
This guy Greenshields, in 1935, basically went and studied two separate roads in Ohio, on
two different days, and came up with this relation that says basically, okay, let's assume that velocity
is linearly decreasing as a function of the density. So when there are no cars on the roadway, traffic
is moving fast; when there is the maximal number of cars that fit on the roadway, nobody moves
anywhere.
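Written out, with v_max the free-flow speed and ρ_max the jam density (standard notation, assumed here rather than read off the slide), the Greenshields relation and the resulting flux are:

```latex
v(\rho) = v_{\max}\left(1 - \frac{\rho}{\rho_{\max}}\right),
\qquad
Q(\rho) = \rho\, v(\rho) = v_{\max}\,\rho\left(1 - \frac{\rho}{\rho_{\max}}\right),
```

a quadratic in ρ with its maximum at ρ = ρ_max / 2.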
It's fairly simple, yet gives us this quadratic flux function, and this causes all the nonlinearities in
traffic evolution. This is exactly what we want to capture. Basically when you're free flowing,
when you're at low density -- this is the vehicle density here and the vehicle flux is on this axis.
At low densities you don't have many people around you, you increase the number of cars on the
roadway and everybody seems to continue at roughly the same speed. The slope of this line is
the velocity.
So at low densities, lots of cars moving, you increase the number of cars on the roadway. You
just increase the flux because there's more people moving at the same speed.
Then you reach some critical point where once you start adding more people on the roadway,
something bad happens. The flux starts to decrease. The throughput of the roadway starts to go
down. That's bad news, because now you're at the position where more people on the roadway
just simply cause the flux to decrease because the speed -- that's basically this line here coming
through -- starts to decrease as well.
Okay. And the problem is if you know -- if you study transport laws at all, this PDE basically is
just a transport law that says that information is propagating at different speeds, the speed is the
slope of this line. Well, you can have information propagating forward or backward. The fact that
information can propagate forward or backward creates shockwaves which then cause all the
mathematical difficulties with this model.
Why are shockwaves important, though? They're interesting from a mathematical standpoint, but
they're actually practical in studying traffic. That's because the shockwaves are exactly what we
want to try to track. They're the congestion waves, when you have a bunch of people piling into a
queue, the rate at which that queue backs up is the shockwave. You have a bunch of very dense
vehicles, and basically a stretch of roadway that's not dense at all. And you want to know, is that
shockwave going to propagate back down the roadway or is it going to start clearing? And this
very simple model, why it's so powerful, is it tells you exactly when the queue should be building
and decreasing.
If there's enough cars feeding into the bottleneck, then that queue is going to keep piling back.
But if you're only adding a few vehicles and more vehicles are actually being able to get put
through this bottleneck, you actually start to see the shockwave clear. That model is very useful,
for example, if I want to understand, I've got some measurements of congested traffic now. I
want to know basically is it reasonable to expect that I should be getting additional slow
measurements in the future, or is it possible that it could clear quite quickly? This can tell you if
you've got the whole freeway, I don't know if you saw on the news, there was a 60-mile traffic jam
in China.
This model will tell you how long after you remove the bottleneck those cars will clear.
>>: Interesting.
>> Dan Work: Yeah. So the problem is you have this model that describes the evolution of these
shocks quite well. That's useful. But because of these shocks, it makes the estimation problem
really difficult.
Okay. The first problem is that basically because of these discontinuities, classical solutions to the partial
differential equation don't exist anymore. You have a PDE that assumes differentiability of the states, yet
the shocks are points of nondifferentiability. That causes a problem. The mathematical story is
that instead of looking for classical solutions to the PDE, you actually have to
solve a more general form of the problem, which basically relaxes some condition on the
smoothness of the solutions that you seek; you look for what's known as a weak solution.
That solves the existence problem, but now you can have too many solutions to the PDE.
And the only way to get around this is basically to further embed some condition: I'm
going to look for a more general solution to the problem, because I know the smooth
solutions don't exist, but then I need another condition to say not all of the more general solutions will
work. I need some entropy condition to isolate a physically meaningful solution to this PDE.
So the entropy condition, all it means for practice is that this is the solution of this PDE, this mass
conservation law, which actually corresponds to what you see in practice for traffic.
What it basically says is if there's a shockwave in the traffic, the only way that this is possible is
there better be congestion downstream and free flowing traffic upstream. It prevents the opposite
from occurring, where you would have congestion upstream and free flowing traffic downstream.
You wouldn't expect to see a shock there, because people would start to leave that shockwave
and it would start to smooth out the density profile. So the weak solution would permit this in the
first place. The entropy condition basically says, no, only shockwaves that have congestion
downstream and free flow upstream are physically permissible.
So that's the summary of the mathematical formulation in terms of the density evolution model.
The important thing there is the entire evolution of the traffic is described purely in terms of how
many vehicles are stacked along the roadway.
As I mentioned before, we're looking at integrating all this velocity data from cell phones. There's not
really a clear way to do this. So in order to solve this problem, the approach that I
proposed is basically to transform this density-based partial differential equation into a velocity
equation, and this will simplify the estimation problem because we'll directly have as the state the
velocity which is the same thing that our measurements are in.
Okay. So in the general case, it's not always possible to transform this
density PDE into a velocity partial differential equation; you must go into discrete space to solve
this problem. What does it look like? We start with the LWR PDE, this partial differential
equation. And we want to get to discrete velocity for our estimation problem.
So we have this relationship that relates density to velocity. So presumably we can just take the
density partial differential equation and use this substitution, right: write the velocity as a function of the
density, substitute it in, and get a velocity partial differential equation. Then we can
discretize it.
Here, again, here's the velocity function from Greenshields, this fellow in Ohio that did two
different roads on two different days and came up with this relation. You solve it for the density and
substitute that back into your PDE anywhere you have a density. You can do some
manipulations and write a new conservation law where velocity is the conserved quantity. Instead of having a
flux function Q, you have a flux function R, the velocity flux, which looks like this.
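For the Greenshields case the algebra is short; a sketch of it (my own working, using the same v_max, ρ_max notation as above) is:

```latex
\rho(v) = \rho_{\max}\left(1 - \frac{v}{v_{\max}}\right)
\;\;\Rightarrow\;\;
\frac{\partial \rho(v)}{\partial t} + \frac{\partial}{\partial x} Q\big(\rho(v)\big) = 0
\;\;\Rightarrow\;\;
\frac{\partial v}{\partial t} + \frac{\partial}{\partial x}\big(v^2 - v_{\max}\, v\big) = 0,
```

so the velocity flux is R(v) = v² − v_max·v, and the velocity itself satisfies a conservation law.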
The problem is it only works for linear velocity functions. And, not
surprisingly, people have shown that nonlinear velocity functions tend to work better than the linear one
from the pioneering work of Greenshields.
Basically, if you want to use a nonlinear velocity function, you have to discretize the PDE. And the
discretization that you use is a scheme known as the Godunov discretization
scheme. This discretization scheme is important because it embeds the entropy condition into
the discretization. You start with the PDE, and the Godunov finite difference approximation
basically isolates, in the discrete space, the entropy solution to that PDE.
So I have a discrete density model which completely characterizes the density evolution in the
discrete space with the entropy rules embedded in it; it isolates the physically meaningful solution to the PDE.
Then I take the discrete model and apply my velocity transformation. There I've already got the
physically meaningful solution I want in the density domain, and now I'll map it into the velocity
space. If I tried to do it the other way around, I would have lost the consistency with the weak entropy
solution of the density problem.
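The standard form of that Godunov update for the density (before mapping it into velocity space) is:

```latex
\rho_i^{n+1} = \rho_i^{n} - \frac{\Delta t}{\Delta x}\Big[ G\big(\rho_i^{n}, \rho_{i+1}^{n}\big) - G\big(\rho_{i-1}^{n}, \rho_i^{n}\big) \Big],
```

where G is the Godunov numerical flux; for a concave traffic flux it reduces to the minimum of what the upstream cell can send and what the downstream cell can receive, which is exactly where the entropy condition lives. The velocity model is then obtained by writing ρ_i^n = V^{-1}(v_i^n) and pushing this update through V.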
Okay. So basically, in terms of what it looks like on the roadway: you take your stretch of
the network and you discretize it into discrete space steps of length delta x and into discrete time
steps of length delta t.
And you build a big velocity vector v^n, where v_i^n is the velocity in cell i at time n. You assume
the velocity is constant in each of these cells.
So you build a long velocity vector that just describes the discrete state, the speed in each of
these cells.
And you have an evolution equation which basically takes the velocity at time n to time n plus 1,
and it's denoted here on this slide by M^e. The M is the model, this velocity evolution equation,
which I'll describe in a second. The e just corresponds to the fact that it's only for a single edge
in the network.
And basically it's given by this velocity evolution equation. The important piece is that
you can see that the velocity at time n plus 1 is just a function of the velocity at time n in the
same cell i, and in the cells one upstream and one downstream, v_(i-1) and v_(i+1).
There are details I'm not going to go into precisely, but I want to point out a couple of things:
basically, in this velocity evolution equation, we have this function G tilde. This is the numerical
velocity flux in the discretization scheme.
The only important piece is that basically this is where the entropy condition is embedded. It
determines all the properties of which way the shocks move. And there's a minimization term
here. The min function makes the evolution equation nonlinear and nondifferentiable. That matters
for the estimation algorithms that you can apply on top of it, because that nondifferentiability
eliminates a class of algorithms that you would like to use.
Of course, we have the inverse velocity function which maps the velocity to the density, and it's
given by a nonlinear model here, which is hyperbolic in one regime, in free flow, and
linear in the other. And of course we have the velocity function as well.
Again, the important piece here is that it's a velocity evolution equation that describes basically the
velocity at the next time step using only the velocity at the previous time step in the cell you're in,
plus the immediate stretch of roadway upstream of you and immediately downstream of you.
And it's nonlinear and nondifferentiable.
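As a concrete illustration of that single-edge update, here is a minimal sketch assuming the Greenshields relation; the production model uses a different (hyperbolic-linear) velocity function and gets its boundaries from the vertex solver described next, so v_max, rho_max, and the flux below are placeholder choices, not the project's actual parameters.

```python
import numpy as np

V_MAX = 30.0      # free-flow speed (m/s); placeholder value
RHO_MAX = 0.125   # jam density (veh/m); placeholder value

def vel(rho):
    # Greenshields velocity function V(rho).
    return V_MAX * (1.0 - rho / RHO_MAX)

def inv_vel(v):
    # Inverse velocity function V^{-1}(v), mapping speed back to density.
    return RHO_MAX * (1.0 - v / V_MAX)

def flux(rho):
    # Flux Q(rho) = rho * V(rho); concave with a single maximum.
    return rho * vel(rho)

def godunov_flux(rho_up, rho_down):
    # Godunov numerical flux for a concave flux: the min of upstream demand
    # and downstream supply. This min is where the entropy condition is embedded.
    rho_crit = RHO_MAX / 2.0
    demand = flux(min(rho_up, rho_crit))    # what upstream can send
    supply = flux(max(rho_down, rho_crit))  # what downstream can accept
    return min(demand, supply)

def edge_step(v, dt, dx, v_in, v_out):
    # One time step of the velocity evolution on a single edge.
    # v is the vector of cell speeds; v_in / v_out are boundary speeds
    # that would come from the vertex (junction) solver.
    rho = inv_vel(np.asarray(v, dtype=float))
    rho_ext = np.concatenate(([inv_vel(v_in)], rho, [inv_vel(v_out)]))
    g = np.array([godunov_flux(rho_ext[i], rho_ext[i + 1])
                  for i in range(len(rho_ext) - 1)])
    rho_new = rho - dt / dx * (g[1:] - g[:-1])
    return vel(rho_new)
```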
The network problem is slightly more involved. I'll summarize the important aspects. For
modeling the traffic across a city or state or country, basically we have to take the road
network and model it as a directed graph.
Each edge in the network corresponds to a stretch of roadway and each vertex in the network
corresponds to a point where you have like vehicles merging or diverging.
And the problem with expanding this model to a network is basically that there are already lots of
complications with the boundary conditions for each edge in the first place. Well, now you have points on the
roadway where you have to have consistency.
This point has a shared boundary condition with two incoming edges and one outgoing edge.
And in order to solve this consistency problem, you have to actually solve a linear program that
basically looks at more or less how many vehicles can be accepted by this downstream stretch of
roadway versus how many are upstream and would like to go into this downstream stretch from
this link and how many are available to be sent from this link.
So you can't have obviously more cars go through the intersection than are available. And you
can't have more cars go through the intersection than can be held by the other side of the
intersection.
So that's sort of in words what this vertex linear program solves for. It solves for the boundary
conditions for each edge, such that it's consistent with what's actually happening on each edge.
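In rough form (a generic junction formulation, with turning ratios α and demand/supply functions standing in for whatever the project actually uses), each vertex problem looks something like:

```latex
\max_{q \ge 0} \;\; \sum_{i \in \mathrm{in}} q_i
\quad \text{s.t.} \quad
q_i \le \delta_i \;\; (\text{demand of incoming edge } i),
\qquad
\sum_{i \in \mathrm{in}} \alpha_{ij}\, q_i \le \sigma_j \;\; (\text{supply of outgoing edge } j),
```

i.e. maximize the flow through the junction without sending more than any incoming edge has available or more than any outgoing edge can accept; the resulting flows become the boundary conditions for the adjacent edges.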
So to fully evolve the velocity model from one time step to the next, basically you start with the
initial velocity everywhere on the network. And at each vertex you solve one of these linear
programs to solve for the appropriate boundary conditions.
Once you have the boundary conditions for each edge, then you solve the edge evolution
equation, which is this nonlinear, nondifferentiable finite difference scheme which I skimmed across in the
previous slide. That evolves velocity from time n to time n plus 1. Yeah?
>>: I was -- for California you have thousands and thousands of linear systems that you're solving.
>> Dan Work: That's a good question. I think I have a slide that has the actual numbers. But I
think in terms of -- I think we've got maybe in northern California, which is the current network that
we run, I think there were about 15,000 states. I think there are about 7,000 vertices and roughly
5,000 or 6,000 edges. And the edges are further discretized.
And the linear programs -- part of the reason I show this slide is because that's actually
the slowest part of the algorithm, solving all these optimization problems. It turns out that the
commercial codes to solve these linear programs have so much overhead in terms of the
precomputation that they do that it was actually faster to implement my own linear program
solver for these things.
Basically, if you think about most linear program solvers, they're built for solving really,
really large problems quickly. And here I have the exact opposite problem. I have really, really
small problems. But I have thousands of them.
So it was actually faster to build that out. It made the code run something like two orders of
magnitude faster. So in terms of real time, it was something like 100 seconds for evolving a
15,000 state network from one time step to the next using commercial and open source linear
program codes; with the custom stuff that we implemented, basically you're able to do it in a
fraction of a second.
>>: This process is run periodically or [inaudible].
>> Dan Work: Yes, so basically every time step. Every time you want to evolve the state from n to
n plus 1, and in practice we're talking about six seconds based on some particular properties of the
discretization scheme which I didn't really discuss.
So every six seconds you have to solve every one of these linear programs in your network at all
the vertices and then evolve the PDE from one time step to the next.
But it's actually -- I mean, it's fast enough that this can be done on a laptop computer in a fraction
of a second.
>>: Solve the space [inaudible].
>> Dan Work: In fact, for the estimation algorithm, which I'll talk about in a minute, instead of
doing it just once every time step we have to do it several times, because we're using a Monte Carlo
estimation algorithm. So this has to be done hundreds of times every six seconds just to move
the state forward, but these problems are very fast.
If you look at the structure, the other reason I want to put this here, because look at the structure
here, basically each vertex you have to solve this linear program, but this linear program doesn't
depend on this one. It doesn't depend on this one, doesn't depend on this one. It can be nicely
decoupled there. Once you have the linear programs solved, you have the boundaries for this
edge and this edge, and so on. So each of the edges can be evolved independently. It has a
nice structure for basically either multi-threading, if you've got multiple cores, or distributed across
multiple machines if needed.
But, in practice, you know, the schemes are fast enough that you can run them in real time on
very cheap commodity computers.
Okay. So that's a good segue into -- so now you have this model, which is nonlinear and
nondifferentiable, but at least the state of the system is velocity. That's exactly what you're going to
measure. How do you solve the real time estimation problem?
And so this is the third thing that I worked on as part of my Ph.D. was basically how can we solve
this estimation problem accurately but how can we do it in a way that's fast enough that it can run
in real time? And without necessarily relying on tons and tons of computational infrastructure.
Okay. So in order to fully specify the problem, I have to give one more piece of information, and
that's the network observation model. Basically, what are the measurements, and what does the model
for our measurements look like.
And all the work that I just described in terms of transforming this density model into a velocity
model, the payoff is right here in the network observation equation. So the measurements that
we get from cell phones are stacked in this vector of observations, y^n at time n. v^n is the state
of the system. And H is just an operator that maps the state to the
measurements.
And H is now linear. Because the velocity vector is just the velocity at each discrete point
everywhere on the network. And H just basically says which cells in the network did
you get a measurement from. If you got a measurement, then H is just a matrix of 1s and
0s that picks off the locations where you got measurements from your cell phones. The cell phones
send velocity. So it's just a linear mapping.
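So the observation equation has the simple form

```latex
y^n = H^n v^n + \eta^n,
```

where each row of H^n contains a single 1 in the column of the cell that produced a measurement at time n (so H^n changes as different trip lines fire), and η^n collects the measurement noise discussed next.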
Of course you pick up some noise from the fact that the GPS has errors. You're also
assuming that the velocity is constant across the cell in space and over the time step. There are some
other more subtle issues with the fact that spatial sampling induces some bias, in that you
sense faster vehicles more often.
You treat all of that in the noise term for your observation operator. Okay. And the recursive
state update equation, to summarize again, is this network velocity equation: you take the full
velocity state, you break it up into each of the different edges, solve the optimization problems at the
vertices, get boundary conditions, and evolve each PDE forward independently. And, of course,
there's noise involved in this process as well, both from the fact that, okay, well, there are
parameters in the model which aren't completely specified, and the model doesn't completely
capture every possible detail. The model is an approximation of reality.
It doesn't capture things like accidents. So there's uncertainty that's introduced into the model
because of the fact that there's imprecision in the boundary conditions and things like this.
Okay. So you have the observation equation. You have the recursive state update. There are lots
of recursive estimation methods that are available to solve these problems. Particle filtering is
sort of like: when nothing else works, pull out your particle filter. It's fully nonlinear, a Monte Carlo
method. It's really useful for solving highly nonlinear problems, especially ones that are
nondifferentiable. But the problem is, for large scale systems, these become very
computationally intensive and very difficult to run in real time in practice.
And there's been some applications where there's been some success in this. But others, it's
really quite difficult. And extended Kalman filtering, if you're familiar with Kalman filtering:
basically you have a nonlinear system, you linearize it, and you apply
Kalman filtering to this problem. This is the generic approach when you have a nonlinear system and
nothing else works: you apply extended Kalman filtering. But I point out we're dealing with such a difficult
model in terms of the nonlinearities and nondifferentiabilities, which are in fact caused by the
physics of the system, that extended Kalman filtering can't be applied to this problem. Instead what I
use is a technique called ensemble Kalman filtering; it combines particle filtering and Kalman
filtering together.
Basically what you do is you come up with a Monte Carlo method for integrating the state through
your model. And you compute, statistically, the mean and the covariance from these samples,
and then you use that mean and covariance that you derive from your samples to do a standard
Kalman update.
Okay. So, again, what this looks like: basically you have two steps to the Kalman filter algorithm.
The first step is basically use your model to predict what the mean of the system looks like and
what the covariance looks like. If it were Kalman filtering or extended Kalman filtering, you'd
linearize your system and you'd have an analytic description of how the mean and covariance
evolve. For a nonlinear system, you basically generate lots and lots of samples of your velocity
state from a distribution that has the mean and covariance of whatever it was at the previous
time step.
You run those samples through your model. And that gives you a distribution of what the velocity
looks like at the next time step. And you can compute a mean and covariance from that. Then
you get your measurements, and you compute the Kalman gain. The Kalman gain is a
minimum variance estimator that combines the information that's contained in the estimate from
the model with the information that's contained in the measurements.
Then you use that to correct or update the estimate based on this additional information. Then
you feed that updated state back into the model.
So, again, just to go into a little bit more of the details of the algorithm. First you initialize. You
come up with a distribution with a mean speed, say, v-bar a 0, where the a denotes the fact that it's from
the previous analyzed state, or measurement update state, with covariance P. And you just
generate K samples from this. K is the number of ensemble members, or samples, that you want to use;
in ensemble Kalman filtering, the samples drawn from this distribution are called the ensemble.
You take each of those ensemble members and run it through your nonlinear,
nondifferentiable model to get that many samples of your prediction at the next time step.
You can compute the mean of that ensemble. You can compute the covariance of that
ensemble. And then you use that to do your Kalman gain computation. Again, it's just a minimum
variance estimator. You can then use that to update the velocity at time n based on what the
model forecasted compared to what new information was
contained in the measurements. The difference here between the measurement y and the
observation operator applied to the forecast is the new information that's contained in those
measurements.
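A compact sketch of that forecast/analysis cycle, with `model` and `H` standing in for the network velocity model and observation operator described above (ensemble inflation and the other production details are left out), might look like this:

```python
import numpy as np

def enkf_step(ensemble, model, H, y, R, rng):
    """One forecast/analysis cycle of an ensemble Kalman filter.

    ensemble: (K, n) array, one velocity-state sample per row
    model:    function evolving a state vector forward one time step
    H:        (m, n) observation matrix of 0s and 1s
    y:        (m,) vector of measured speeds
    R:        (m, m) observation-noise covariance
    """
    # Forecast: push every ensemble member through the nonlinear model.
    forecast = np.array([model(member) for member in ensemble])

    # Sample mean and covariance of the forecast ensemble.
    mean = forecast.mean(axis=0)
    anomalies = forecast - mean
    P = anomalies.T @ anomalies / (len(forecast) - 1)

    # Kalman gain computed from the sample covariance.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)

    # Analysis: update each member against a perturbed copy of the data.
    analysis = np.empty_like(forecast)
    for k, member in enumerate(forecast):
        y_pert = y + rng.multivariate_normal(np.zeros(len(y)), R)
        analysis[k] = member + K @ (y_pert - H @ member)
    return analysis
```

In the production system this whole cycle repeats at every six-second time step.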
So you use that to update the state. Okay. So I'll switch now to talk a little bit about some of the
experimental work that we've done to test out just how well this works in practice.
So the first experiment that we ran was an experiment titled Mobile Century. It got the name
Century because we used 100 vehicles. Basically we hired something like 165 grad students and
had them drive these cars around a ten-mile stretch of freeway in the Bay Area, if you're familiar with this.
It's right between the Dumbarton and San Mateo bridges in California. And they drove these cars
for like eight hours, sending measurements on the virtual trip lines. And it was a huge operation.
We had tons of support staff that were doing things to make sure that --
I mean, when someone offers you a car, an $800 cell phone, and not much supervision, we didn't
want people driving off to the beach. So we had infrastructure to make sure that for the purposes
of this experiment we could track where people were, in addition to having it feed into our virtual
trip line infrastructure.
The site location is important. Because anyone can estimate traffic conditions in free flow. That's
pretty trivial. You already have a pretty good notion if it's an empty road what the conditions will
look like.
But this site in particular has recurring congestion in the Bay Area. It gets lots of periods of free
flowing traffic, congested traffic, and accidents. And all the data that we collected from this
experiment, which is like 100 vehicles with three-second GPS measurements on this stretch of
roadway, is all public. It's all open. If you go to traffic.berkeley.edu, you can download it and
use it for whatever you like.
>>: [inaudible].
>> Dan Work: That's a great question. Yes. There were. There were accidents.
>>: Rental cars?
>> Dan Work: In fact, this is a big issue. In terms of making things work, one of the big
challenges was getting the research approved by the university. And the
recommendation was basically: hire all the students as employees and have them drive rental
cars, and the rental car companies will manage the insurance and you won't have to deal with it.
So that's the approach that we took. We had people rent their vehicles. Fortunately, none of
the drivers that we hired were actually involved in any of the accidents that occurred that day.
But this is sort of -- this is a time space diagram of the traffic. You've got time on the X axis. This
starts at 10:00 in the morning and goes to about 4:00 in the afternoon. And the post mile is just a
mile marker along the roadway. People are driving up here. Starts about mile marker 21 goes to
a little bit beyond mile marker 27.
And the blue dots correspond with traffic that's moving fast. 70 miles an hour. And the red or
yellow correspond to slower moving vehicles. You can see in the afternoon, congestion, there's
lots of recurring congestion. We see a lot of back-up here.
But in the morning, 11:00 in the morning, this is unusual. This was the one thing we weren't
expecting to see. That's caused by an accident coming just up at the top side here. And caused
this huge shockwave to propagate down the roadway. This is exactly what these models, of
course, that we're working with are supposed to be good at capturing.
That's why you can see this really sharp change from the free flowing traffic to the congested
traffic. Okay. This is all the raw data that was stored locally on the phones. In
terms of the size of the network, it's pretty modest. It's something like 13 edges, 14 vertices, and
I think about a 70-dimensional state. So pretty small.
But this accident, as I mentioned, in the morning was sort of the anomaly we weren't expecting. I
mentioned it was a big experiment. There was a press event and there were several people up
talking about how important this was for the future of traffic.
And we had these real time displays showing the output of this ensemble Kalman filtering
algorithm on a monitor. And in the morning we were assuming free flowing traffic, so no real risk here
that the algorithm could do something catastrophically wrong and embarrass us. And
sure enough it started showing this bright red spot in the middle of the press event.
And everybody here is a traffic expert. So they start calling up their departments of transportation
or their traffic monitoring systems to say, you know, what's going on with these guys, there's
all this red stuff here. And fortunately we were able to show that this was caused by
like a five car accident that morning, and we were able to redeem ourselves by saying that in fact,
you know, this is precisely why this type of technology and this type of monitoring is useful.
If you could tell people that hadn't left yet that this accident had occurred, that would be quite
useful information.
And part of the reason that we also chose this site is that, I mentioned these inductive
loop detectors which are quite widely deployed in California, this site also has some of the
densest coverage of these inductive loop detectors anywhere in the state.
There's something like 17 of these inductive loop detectors, more or less at a quarter of a mile
spacing -- quarter mile, half mile, something like this -- so you have extremely dense
coverage of this existing fixed sensing infrastructure. That's because we're hoping to show that this
might be a candidate way for some traffic monitoring applications to not have to use this stuff.
And you can see, at least at this level, that at least some of the main
features are retained. The morning accident, of course, shows up in these inductive loop
detectors, and you can see the recurring congestion; this is much higher resolution data. When
you look at these things you have to keep in mind that these sensors have errors of their own, just as
the GPS data has sampling bias, things like this.
The data that we actually used to run our estimation algorithms is much, much less dense than
what you saw in those GPS phone logs that were stored locally on the phone. In fact, for the
simulations I've shown we used ten virtual trip lines. So it's a little bit harder to reconstruct the
picture of what happened that day for traffic using only this information.
So the goal of these estimation algorithms is to sort of fill in the gaps: what happened in the
places where we don't have information.
And in order to assess how well we do when reconstructing the velocity, it's really hard to get sort
of the true state of traffic, the velocity everywhere on the roadway. Even though we had
100 vehicles, that was something like anywhere from 2 to 5 percent of the traffic on that day.
That doesn't really give us an accurate measurement of what the rest of the traffic looks like.
We had a couple of teams sit on some bridges and film the traffic with HD video cameras, so we
could identify the license plates of the vehicles that were not participating in the
study and get their travel times from one end of the experiment site to the other.
So in the video I'm going to show next, basically we have travel times reidentified
from this video data through license plate reidentification, and we'll compare those with the travel
times that we compute using the velocity field that we've estimated.
Okay. So here's the setup. Basically the experiment site is in the Bay Area. Zooms in here.
And you'll see sort of the standard -- the side that we're doing the estimation on is going north,
so up the screen. You'll see that the congestion starts to occur from this accident in the morning
right away; the map interface turns a dark red, and the screens on the right are actually
showing what the estimation algorithm is doing.
So on the bottom here, basically the green lines correspond to the estimated travel time,
computed from this velocity field, plus or minus three standard deviations on that estimate.
And the blue curve corresponds to the mean travel time that was collected from the video data.
So I mentioned we had these video cameras. Each pink crosshair here corresponds to one vehicle
that was reidentified, and their travel time across the stretch of the experiment.
The blue curve here is just the mean of these pink marks, the individual data measurements. So
we do like a five-minute moving average window of this data to get the mean estimate of travel
time. And that's what we try to track with our estimate from our velocity field.
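As an aside on how a travel time can be read off an estimated velocity field: a minimal sketch, assuming piecewise-constant speeds per cell and ignoring the fact that a real vehicle moves through a field that changes during its trip, is just

```python
def travel_time(cell_speeds, cell_lengths, min_speed=1.0):
    # Sum of (length / speed) over the cells between the two camera sites,
    # with a floor on speed so a stopped cell doesn't give an infinite time.
    return sum(length / max(speed, min_speed)
               for speed, length in zip(cell_speeds, cell_lengths))
```

One way to get an uncertainty band like the one shown would be to evaluate the same quantity over the ensemble members rather than just the mean, though the exact procedure used in the experiment isn't spelled out here.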
So right now it's about 3:00 in the afternoon. You can see the congestion clearing in the morning.
I told you these models were good for understanding how traffic builds and clears. We do a
pretty good job of tracking the mean travel time as the congestion clears. Around 2:00 in the afternoon,
travel times are starting to increase here. At the low point here you're looking at travel times of
about eight minutes across the stretch of roadway. Now it's about 3:00 in the afternoon. You can
see the congestion is building again. It's taking almost double that. It's about 18 minute travel
times that are showing here.
In congestion the variance of the estimated travel time starts to increase because of some of the
nonlinearities associated with the model, but overall we're able to quite accurately reconstruct the
travel time of the vehicles that were not participating in the study using the velocity field that we
estimated from the GPS phones sending these virtual trip line measurements.
>>: [inaudible] faster than.
>> Dan Work: Yes, there's an HOV lane. That's one of the problems. The model
doesn't describe the evolution of different lanes of traffic; it just says that everybody's going at the
same speed in all lanes.
And in most cases that's a good approximation, but in the case where you have an HOV lane,
that adds quite a bit of uncertainty or error into the model.
So the next big thing we did was try to scale this up: that was nice for a small stretch of
roadway, but what would it actually look like to deploy at a large scale?
So for about a year we opened up our system and let people in the Bay Area download an
application onto their mobile phone. They could see the real time traffic estimates coming out of
this velocity Kalman filtering algorithm on their phone, and in return for that traffic
information we'd collect GPS measurements from these virtual trip lines.
We had about 5,000 users download the application over the course of the experiment, which
concluded last November. The real challenge was scaling this up, to show that this could be not
just a fun academic experiment, but that on networks of a meaningful size you could still get
these algorithms to run in real time.
I mentioned some of the features earlier, but I'll just recap. For the Bay Area, we have an
automated algorithm that builds the network topology for us, using a database of the underlying
road geometry constructed by NAVTEQ. We have access to their map database, so through an
interface we can just draw the area we want to do traffic estimation on, and it will build the
network topology and let us run the algorithm on top of that infrastructure.
Here are the exact numbers for the network we ran on: about 4,000 edges, 3,000 vertices, and a
state dimension of over 15,000. And again, because we're using an ensemble Kalman filtering
algorithm, we have to solve that system every six seconds,
hundreds of times, once for each sample realization in the estimation algorithm. But these
algorithms are fast enough that the entire Bay Area computation runs on my ThinkPad, which has now
finally died, but it was a three-year-old ThinkPad by the time this production code was running on
my machine.
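To make the ensemble filtering step more concrete, here is a minimal ensemble Kalman filter forecast-and-analysis cycle of the kind described, as a sketch rather than the production implementation: advance_model stands in for one six-second step of the discretized velocity model, the observation operator is a simple selection of observed state components, and forming the covariance matrix explicitly is only workable at illustrative dimensions, not the 15,000-dimensional Bay Area state.

```python
import numpy as np

def enkf_cycle(ensemble, obs, obs_idx, obs_std, advance_model, rng=None):
    """One forecast/analysis step of an ensemble Kalman filter.

    ensemble : (n_members, n_state) velocity states, one row per realization
    obs      : (n_obs,) velocity measurements from virtual trip lines
    obs_idx  : state indices observed by those measurements
    """
    rng = rng or np.random.default_rng()
    n_members, n_state = ensemble.shape
    n_obs = len(obs_idx)

    # Forecast: push every realization through the nonlinear traffic model.
    forecast = np.array([advance_model(member) for member in ensemble])

    # Sample covariance of the forecast ensemble (a real implementation at
    # this state dimension would avoid forming P explicitly).
    mean = forecast.mean(axis=0)
    anomalies = forecast - mean
    P = anomalies.T @ anomalies / (n_members - 1)

    # Linear observation operator: pick out the observed state components.
    H = np.zeros((n_obs, n_state))
    H[np.arange(n_obs), obs_idx] = 1.0
    R = (obs_std ** 2) * np.eye(n_obs)

    # Kalman gain and analysis update with perturbed observations.
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    perturbed = obs + rng.normal(0.0, obs_std, size=(n_members, n_obs))
    return forecast + (perturbed - forecast @ H.T) @ K.T
```

Repeating a cycle like this every six seconds, with hundreds of ensemble members, is essentially the workload that has to run in real time.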
There are obviously some other pieces of the system that didn't run locally on the laptop, such as
the map database, which wasn't all stored on my machine. But the core estimation algorithm that
takes the data, runs the Kalman filter, solves all the PDEs, solves all the linear programs, and
puts out the estimate runs in real time. I think it takes about three seconds to do a six-second
update, so you have plenty of margin even for machine hiccups and things like that. In terms of
scalability, it's highly scalable: with a rack of servers you could cover most of the freeways in
the continental United States.
Okay. I just want to take a few minutes to summarize what I think some of the interesting
extensions of this work are, as we move away from traffic and into other areas.
I focused a lot today on the traffic problem: how the availability of GPS data coming from smart
phones is changing how we do traffic estimation and greatly increasing the coverage, but also how
we have a problem reconciling this information with density-based models.
So I showed how to transform the density model into a velocity evolution equation and solve that
problem using ensemble Kalman filtering.
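For smooth solutions, the change of variables behind that transformation can be sketched as follows; the specific fundamental diagram used in the actual model may differ, so treat this as the generic calculation under an invertible velocity function.

```latex
% LWR density model with flux Q(\rho) = \rho\, V(\rho):
\partial_t \rho + \partial_x Q(\rho) = 0 .

% With an invertible velocity function v = V(\rho), so \rho = V^{-1}(v),
% the chain rule gives \partial_t v = V'(\rho)\,\partial_t\rho and
% \partial_x v = V'(\rho)\,\partial_x\rho, hence for smooth solutions
\partial_t v + Q'\!\bigl(V^{-1}(v)\bigr)\,\partial_x v = 0 ,

% a velocity evolution equation that can be discretized and run inside
% the ensemble Kalman filter.
```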
In terms of next steps, I think where this stuff gets really powerful is when you start to look at
how to combine information from traffic with other cyberphysical systems. Just as we built Mobile
Millennium, this traffic information system that takes cell phone data from all the cars, sends it
back to the cloud, does all kinds of computation, and feeds the information back, you can do the
same thing for air quality.
You go out and deploy air quality sensors all over the place, a nice environmental engineering
application, and you build a system that solves the highest-end contaminant transport models for
air quality.
But these physical systems are obviously related: congested roadways create emissions, which
degrade air quality. If the physical systems are coupled, there's no reason we shouldn't also
couple the computational infrastructure. In addition to just making the sensor data available, if
you've got a robust algorithm for estimating the traffic conditions, making that state available
for other services on the Web to take advantage of starts to make these cyberphysical systems much
more powerful in terms of what they can estimate and infer about the physical environment.
So I think getting to that level is something that's going to be really interesting.
Okay. In terms of the main challenges, and some of the things I'm interested in working on:
platform-based design, designing these systems so they can easily share both raw real time sensor
data and the real time best estimates coming out of state-of-the-art algorithms from different
areas, is definitely one of them.
And how to deal with authentication, security, and privacy as you try to merge these systems is,
I think, going to open up a host of problems that will be interesting to study.
There are lots of interesting problems in estimation, specifically related to moving the sensors
around: anything from integrating this moving sensor data into models that, instead of tracking
the users, track some aggregate quantity, which you might want to do either for computational
efficiency or for privacy (a sketch follows below), to trying to understand how to move your
sensors efficiently. That's especially hard when you're on directed graphs with partial
differential equations on the edges that you have to solve; it makes the sensor planning problem
quite difficult, especially with uncertainty in the state.
There's also designing real time estimation algorithms for streaming data that can handle both the
mathematical models and the huge volumes of data when you start to take advantage of a lot of this
crowdsourced information.
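As one small illustration of tracking an aggregate quantity rather than individual users, a sketch along the following lines (the record format and link identifiers are invented for the example) collapses anonymous probe speed reports into per-link means before they ever reach an estimator:

```python
from collections import defaultdict

def aggregate_speeds(reports):
    """Collapse anonymous (link_id, speed_mps) reports into per-link mean speeds.

    No user identifier is kept, so downstream estimation only ever sees
    an aggregate quantity for each road link."""
    total = defaultdict(float)
    count = defaultdict(int)
    for link_id, speed in reports:
        total[link_id] += speed
        count[link_id] += 1
    return {link: total[link] / count[link] for link in total}

# Three reports on two hypothetical links, with no record of which vehicle sent which.
print(aggregate_speeds([("link-880N-12", 24.0), ("link-880N-12", 26.0), ("link-880N-13", 10.0)]))
```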
And then, more on the mathematical side, there's deciding whether a problem is even well posed:
given boundary condition data and trajectories in the data streams, how do you work out whether
the problem can be solved at all, or whether it's overdetermined or underdetermined?
As for applications, I'll just mention a few. Obviously I'm interested in traffic based on my
previous work at Berkeley, but there are lots of extensions of the same ideas to smart buildings
and air quality monitoring: understanding how people move, how to do participatory sensing of
buildings using smart phones, sensors that we deploy, and mobile sensors. I mentioned before how
to integrate real time traffic estimates into air quality. These are a few of the applications
that motivate me in terms of where the next steps might be in solving these distributed parameter
system estimation problems.
And with that, I'll conclude and I'll take any questions that you have. Thank you.
[applause]
>>: I have a question. Do you have a sense of how much more accurate your model is, using the
sensors at these virtual trip lines, than if you just took the velocity you record at each of
them, assumed it held over the whole link, and computed the total time it takes?
>> Dan Work: Yeah, it really depends. As you increase the data volumes, the model becomes
increasingly less useful, which is to say that if you have enough data, you don't need any model
at all. The data tells you everything; it contains all the information that a model would have,
and then some.
>>: [inaudible] like you had an experiment with 5,000 people, but how do you manage the accuracy
when it's only like a thousand or five hundred?
>> Dan Work: Yeah, these are great questions. The real challenge with a lot of these systems, I
think, is validation. We developed a well-defined small experiment specifically so that we could
validate on one test segment, but validating these systems in practice over large networks is very
difficult.
You can obviously send test drivers out to assess how well you're performing here and there, and
over time build up some reliability estimates, but the larger your system and the more
time-varying it becomes, the harder it is to even validate how well these things are performing in
practice.
And even in the case where you have plenty of data, models are still useful in the sense that they
can help you identify things that are wrong with the data.
To put it in perspective, a lot of companies right now are saying: I want to build a traffic
monitoring application, or a new crowdsourcing application, or a new mobile service, but I don't
have any of the data, so I'm going to buy it from somebody else, from a FedEx, from Verizon, from
somebody else.
So you buy this data, and you have no contract or guarantee about the authenticity of the data
except for someone's word that they didn't make it up.
By having a model in the background, even if the data alone could tell you all you need to know,
the model may still help you identify whether you're being given information that is physically
inconsistent. That may let you detect things like a malfunctioning sensor, whether it's malicious,
someone deliberately giving you bad information, or whether the sensor just happens to be behaving
poorly. A mathematical model will still help you identify or classify which sensor data is junk
and which is physically consistent with what's actually happening.
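As a toy example of the kind of sanity check a physical model enables, a sketch like this one (the thresholds and record fields are assumptions, not the Mobile Millennium logic) flags reported speeds that are physically implausible for a freeway segment:

```python
def flag_inconsistent_reports(reports, v_max_mps=40.0, a_max_mps2=5.0):
    """Flag speed reports that violate simple physical bounds.

    reports: iterable of dicts with 'sensor_id', 'time_s', and 'speed_mps',
    assumed ordered in time. Returns a list of (report, reason) pairs."""
    flagged = []
    last_seen = {}  # sensor_id -> (time_s, speed_mps)
    for r in reports:
        sid, t, v = r["sensor_id"], r["time_s"], r["speed_mps"]
        if v < 0.0 or v > v_max_mps:
            flagged.append((r, "speed outside physical range"))
        elif sid in last_seen:
            t0, v0 = last_seen[sid]
            if t > t0 and abs(v - v0) / (t - t0) > a_max_mps2:
                flagged.append((r, "implied acceleration too large"))
        last_seen[sid] = (t, v)
    return flagged
```

A real system would check consistency against the full traffic model rather than static bounds, but the idea is the same: data that cannot be reconciled with the physics gets classified as suspect.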
So I don't know if that answers your question.
>>: So you talked about pedestrian monitoring. Maybe you covered this in the talk, but what's the
challenge there? How is it different from vehicles? I guess you don't have the same model.
>> Dan Work: You know, there are some similarities between the problem of traffic estimation and
pedestrian modeling at the right scale. One thing I didn't touch on during the traffic monitoring
part of the talk is that the boundary conditions are hard to estimate. You can deploy sensors and
try to measure them, but what you really are after is where people are starting their trips and
where they want to go. If you can generate the trip demands, then you can pretty easily estimate
what the boundary conditions need to be based on what all those demands look like.
In the same way, for traffic or for pedestrian modeling, a lot of understanding how people want to
move through a building or through an urban area again comes back to: where are they, where are
they starting from, and what are they trying to do? So again it's an origin-destination estimation
problem; you need to know what they've been doing and what they're going to do in the future, and
at that level the estimation problems are very similar. Once you start breaking it down to try to
identify how an individual will move from one place in a room to the next, you can still work with
transition models that describe how a person or a group of people will move from one part to the
next, but you don't have the transition laws. It's not obvious that just because there's space in
front of you, that's the direction you'll take in your next steps. So there I think a lot more
work has to be done, in terms of both model identification and parameter identification for those
types of problems.
All right. Well, thank you.
>> Jie Liu: Thank you.