Document 17868092

>> David Wilson: Well, today we're pleased to have David Aldous speaking. He's going to be with us for a couple more weeks, so if anybody would like to talk to him afterwards about random spatial networks, I'm sure he'd be happy to talk.
>> David Aldous: Okay. Thanks. So this is the first talk on what I've been thinking about for a while, and the disadvantage of being a first talk is that there are some, you know, fancy graphics that aren't there that you'll have to imagine. Okay. What are we doing here? The actual content of the talk is math theory: inventing and studying a class of slightly abstract, but hopefully not too much general-nonsense, random processes, a kind of axiomatic setup, et cetera, et cetera. Before getting to that, there's some background. For the first part I want to make a kind of obvious conceptual point: what's the difference between data, a specific model for data, and classes of processes? Kind of obvious but maybe easy to forget. So the next two slides are going to deal with that as well as life, the universe and everything. Pretty impressive in two slides. Now, the other bit of general background is, you know, we talk about road networks, so I want to talk about the view of road networks from paper maps and the view of road networks from online services. Okay, that's a weird thing to talk about, but it actually influences how we're setting up the math model. So here are four things, chosen to be very different, real things that we can choose to look at as data. Different aspects of the world. One aspect of the world is the physical world, independent of human beings. So the top left is a bit of the coastline of Italy. Another part of the world is the human social world, relations between different people. So the bottom left is a graphical representation of a social network. The third bit of the world: look around you, what do you see? 
Well, you see people's faces, hands. Everything else you see is a human artifact. We kind of forget that. We live in a world of human artifacts. So there's the engineered world: the top right is the national highway system of the United States, part of the engineered world. Of course, there's a fourth world we live in as well, the world of ideas, emotions, motivations, illustrated by a quote from Hamlet. Okay. So there you are: life, the universe and everything, the four aspects of it. Of course, none of these things are mathematics at all; they're actual things, a priori nothing to do with mathematics. But, you know, we're all math nerds here, so everyone can approach anything with a mathematical eye and try to find some mathematical aspects of it. And we all know that -- the particular things here I chose rather carefully -- there are mathematical aspects of them. So let me come back to them. They all fit, you know, four of these five classes of things. But let me jump slightly here. Again, the difference between data, a particular process as a model for data, and a class of processes. There are some familiar classes of processes. The top two, stationary processes and finite-state Markov chains, are very explicitly defined math objects. But they're classes of processes. So think of a particular process: a particular stationary process, or if you're pedantic a finite-valued stationary ergodic process, or an irreducible finite-state Markov chain. If I take a particular process in that class, it has several interesting numbers, and I'm focusing on one of them. For a stationary process the interesting number is the entropy; for a finite-state Markov chain the interesting number is the mixing time. So we go through a list of different things here where you have classes of processes. 
The trouble is if I start off talking about classes of processes, it kind of sounds like general abstract nonsense, and maybe it is. So how do I stop it being general abstract nonsense and turn it into something more concrete? The first two things illustrate that for every process in this class there's this particular number. The number is interesting as theory, because the entropy of a process says something about how well we can compress it, and the mixing time of a Markov chain says something about other properties of the Markov chain. And as well as telling me theoretical things, you can actually try to estimate them from data: if you have data from something, you can try to estimate what these theoretical numbers are. So we go down the list. A random fractal isn't such a precise class of objects, but we recognize it when we see it, and somewhat tautologically the interesting number associated with a fractal is the fractal dimension. To give an interesting thing I don't know how to talk about: social network models. We recognize that a particular model is something you might use to model a social network. But there's no actual general definition of a class of social network models analogous to the other classes. It would be interesting to have one. And it would be interesting to fit it into this schema whereby you say, well, there's a class of models, and for every particular model in this class there's an interesting number. And of course people do this badly by thinking of these as generic complex networks, thinking of degree distributions as having power-law tails, and using the power-law exponent as a statistic. But that's kind of dopey. Everybody does it, but it's still dopey. So anyway, this is something where I don't know a good way of doing it. 
And if you found a good way of doing it, you would become famous and Jennifer Chayes would offer you a job. So this is a kind of interesting open area. What I am going to talk about is something in the same genre. The point of all this spiel is to get you to view what I'm actually talking about as a class of processes rather than a precise model within that class. So I'm going to talk about a class of processes, and it has the same feature that for every process in this class there's an interesting number, an interesting statistic associated with it. I'd like a good name for this, so again, a small prize or fame for anyone who suggests a name for something that I'm unimaginatively calling p(1). So anyway. Okay. That's the conceptual stuff. So we go on to actual road networks, et cetera. What kind of math object should we model the road network as? At one level this is simple. This is the simple level, on the bottom. A few of these slides people who have seen previous talks of mine will have seen before. The bottom, a Xerox out of the Rand McNally road atlas, is a schematic for the intercity road network of the northern part of the United States: cities, lines, and distances between cities marked on them. So that's real data. And on the top is a mathematical thing. There's a set of points in the plane, which is actually chosen at random, and then edges between points. It isn't pointing to a particular area of the real world, but it's clear that in going from the real-world thing to a mathematical representation, as Rand McNally well knew, the mathematical representation of real data here is this type of math object. There isn't any mystery as to the way of thinking about this. Of course having an accurate specific model is hard, but the type of object isn't mysterious. 
So this is one thing where it's conceptually clear what we want to do, and there's another extreme: the road network within a city, Manhattan or Redmond. We know what maps of the road network for a city look like, and it's not particularly mysterious to write down a particular model: a grid, variants of grids, et cetera. On the other hand, I want to think big, and I want to be -- forgetting what -- let me take that back a second. I want to imagine what an online map has, which actually understands the entire road network of the United States. I want to think about what the mathematical model is for the entire road network of the United States. Let me pull out and do a slightly better graphic here. I have to do my -- where is everything here? Escape. How does this work? Somewhere up on here -- this may end up as a disaster. No, it's not working here. Oh, it is working here. I should never have tried this. Went to the wrong thing here. Why am I not going to -- I got the wrong one. I've done the dopey thing of having an actual slide of what I wanted to look at and the actual Web page. Okay. I'm going to stick with this. So this is my slide version of something -- of an actual Web page, yes.
>>: You say a road network. How much does it depend on the politics of the country? Like the US being the [inaudible] from being very heavily centralized --
>> David Aldous: That's a good, good question. And I don't have it, but even if you look at different states, say Iowa and Indiana, which are in fact midwestern states where you might not expect the road networks to look different, they actually look very different, because Indianapolis is sort of the Paris of Indiana and all roads lead to it in the atlas. And in Iowa there are these Cartesian grids, and they kind of have this thing. 
So yes, for historical reasons. That's precisely my point: if you want to write down one model that fits everything, then you have this sort of trouble. But I'm not trying to fit one model, just as the English language and the French language are different. So instead of a particular model for this, I want a class of models, like stationary processes as models of natural language. Anyway, so this is again doing something that a priori sounds crazy. This is the entire road network represented as 25 million or so line segments and nothing else. That's what's comprehensive about this. It shows road segments and nothing else. It doesn't show borders, it doesn't show coastline, it doesn't show cities, mountains, rivers, anything else, just roads. Nonetheless you can infer the borders of the United States from the roads; you can infer the boundaries from there. What should cities look like? Well, in cities there are lots of roads, so the dark blobs over on the right are cities, and you can really test your knowledge of American geography by trying to identify all the cities that are lying around here. Let me find my pointer here. So, yeah, it's actually a question of which city each of these things is. They're not so easy to figure out. So that --
>>: [inaudible].
>> David Aldous: It's just slightly clearer. I can try again and find it.
>>: [inaudible].
>> David Aldous: This is why I have two Web pages here somewhere. It's my fault. I need to find the other. Somehow it's gone. I had two tabs up.
>>: [inaudible].
>> David Aldous: And we have to close tabs. Okay.
>>: Two browsers. There's not enough browsers.
>> David Aldous: I thought I had -- I'm supposed to have two tabs and I'm kind of losing things which I probably don't want to lose here.
>>: [inaudible].
>> David Aldous: I'm sorry. I'm not sure --
>>: [inaudible].
>> David Aldous: Yeah. Okay. [inaudible] Ben Fry's All Streets. 
It's slightly clearer on the actual page but not hugely clearer. Sorry, I'm going off topic. So the other cities are easy. Over here we automatically think we're seeing mountains and topography, and of course we're not; we're seeing something completely different from what we think we're seeing, because we're actually seeing cities in the Central Valley rather than the Sierra Nevada. Anyway, [inaudible] this is a bit like jokes about mathematicians: this is literally true, but it's completely useless as an actual road map for the normal sorts of things we use road maps for. So it's just a bit of a fun digression. I want to make an obvious conceptual point. We grew up with old-fashioned paper maps, and now we have online road maps. And they differ in two ways. For a paper map you have to start off deciding the level of detail you're showing, and that's what you've got. And of course the point of online maps is that you can zoom in and out. There's a large-scale map that shows major roads, and I can zoom in. I might try and do a demo; I should practice these demos. You can zoom in and you see a small area and you see more and more detail. There's a spectrum of major to minor roads we'll talk about later. There's the obvious common sense [inaudible]: we see the minor roads as we zoom in. So you don't have this prespecified level of detail. Secondly, you can ask your online service for actual routes between two addresses. And again, we're not talking cities, we're talking individual street addresses, 100 million, 150 million, that order of magnitude of street addresses. And you can actually get a route between any two of those. So there are roads and there are routes, obviously related but slightly different. So the idea I'm working towards, in setting up a model, is that I want to think of routes instead of roads as the primary object. 
I want to abstract Bing Maps or Google Maps as an oracle. Instead of 100 or 150 million points in the plane, being a mathematician, I want to think about the continuous plane. So I want to imagine Bing Maps as an oracle: for any two points in the plane I'm going to get a route between them, and that's the way I understand this process. So again, this is analogous to the ergodic theory of English text: Hamlet is one realization from a stationary source. I want to think that what's underneath my abstraction of Bing Maps, that is, the actual US road network, is one realization of some complicated abstract process, and we get at this realization by asking it for routes. So that's the way I'm going to do a math model. Again, we're going to state a bunch of assumptions, but the interesting one is the scale-invariance assumption; in fact that's what makes this a theory. So let me say it intuitively. You have to imagine low tech. I put seven points on a bit of transparent paper, the old transparency, and now I get a paper map and I take my transparency and randomly position it somewhere on the paper map. Then I look at my points as corresponding to real positions in the United States, find the nearest address, and ask for the routes in the actual road network between each pair of points, and we get something like this: seven points, seven choose two pairs of points, and I put all these routes together. So this is a subnetwork of the real network, but looked at on random points somewhere. So there's some statistical distribution for what I see in a subnetwork like this. And the idea of scale-invariance is that I haven't told you the scale of the map I'm looking at. The width here could be five miles or 25 miles or 100 miles. 
Scale-invariance is saying that the statistical properties of what we see here don't depend on the scale. You couldn't tell from this. It wouldn't be more likely that this came from something five miles across than from something 100 miles across. So [inaudible] whether this is empirically realistic is dubious. It's not actually as crazy as it sounds. It's like the [inaudible] music. Famous line. [inaudible] music. It's not as bad as it sounds. So this is the same sort of thing. It sounds completely crazy and it's not completely crazy. So, comments about scale-invariance. This is the screenplay for a movie I haven't made. You know, the screenplay for Avatar is maybe not as exciting as the thing itself. So Wikipedia, I'm going to try again with my little thing here. Somehow I'm not doing this. Okay. I'm not -- I just lost it. I am not -- I have a bad instinct for these things. Okay. We're going to abandon these things anyway. You use all your imagination and you have to imagine four things, and there are three things here. So Wikipedia, in the Wiener process article, has this dynamic picture that zooms in to the Brownian motion sample path, and so you're seeing the scale-invariance as you actually zoom in here. And yes, we can see that at the end. So that's a dynamic demo of scale-invariance as you zoom in on zero. If you then took something you might actually model by Brownian motion, like the S&P 500 today, you could attempt to do the same thing with data. It might work for a while, but of course eventually there's some bottom line where the Brownian motion model breaks down at some fine scale. Though we don't care too much about that because models don't work the same way. 
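The Brownian zoom relies on the standard scaling property that B(ct) has the same distribution as sqrt(c) times B(t). The following is a minimal simulation sketch of that property, not something shown in the talk; the function names, step count and sample sizes are hypothetical choices made for illustration.

```python
import random
import statistics

def brownian_endpoint(t, steps, rng):
    """Approximate B(t) as a sum of `steps` independent N(0, t/steps) increments."""
    sd = (t / steps) ** 0.5
    return sum(rng.gauss(0.0, sd) for _ in range(steps))

def endpoint_variance(t, samples, rng, steps=50):
    """Empirical variance of B(t) over `samples` independent simulated paths."""
    return statistics.pvariance([brownian_endpoint(t, steps, rng) for _ in range(samples)])

if __name__ == "__main__":
    rng = random.Random(0)
    for c in (1.0, 4.0, 16.0):
        v = endpoint_variance(c, 2000, rng)
        # Dividing by c (i.e. rescaling space by sqrt(c)) should give roughly 1 each time.
        print(c, round(v / c, 2))
```

Here time and space scale with different exponents (1 and 1/2). The road-network scaling in this talk is simpler: all lengths are multiplied by the same factor.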
So anyway, we can imagine -- real road maps are like the stock market data: you can actually zoom in some number of levels, probably 15 levels between the world as a whole and the finest resolution of individual houses. There are finitely many levels because in the end you can't zoom in [inaudible] arbitrarily small roads, but that's like zooming in on the stock market down to one day. In the end, just as there's a mathematical Brownian motion on Wikipedia, we could go back to a realization of the models I'm talking about, and then of course you could zoom in forever. They're mathematical objects; you can zoom in forever. So, future movie there. The final point to emphasize is that scaling often comes in the context of scaling exponents. We are doing naive Euclidean scaling without any scaling exponents: if I shrink everything by two, then everything shrinks by the same factor. And we'll come back to that in a moment. Before I get to axiomatics, I'll end the introductory part with speculations on the same theme of maps and scalings by a famous mathematician from 120 years ago. Who is the most well-known person who was also a mathematician 120 years ago? So I'll let you read this piece. Anyway, I'm not sure what Lewis Carroll would think of our modern technology, but that's his thought on paper maps. Okay. So, going to math, there's an axiomatic setup. Details are technical in a boring way, so I'm not going to tell you. But we're thinking of, you know, we're given a process. So how are we given these mathematical objects? Well, going back to the old-fashioned way of looking at stochastic processes, things like Brownian motion, you're given finite-dimensional distributions, FDDs in this context. Yes?
>>: US in [inaudible] intersection, do you really expect it to be rotation invariant? 
So many old field boundaries going north, south, and east --
>> David Aldous: Yeah, you're right. The thing isn't really rotation invariant. In some sense rotation invariance is just laziness, like assuming stationary processes are ergodic, because you can just add a random rotation first and it doesn't change anything. So it's just a tidying, bookkeeping thing more than anything that's actually important. Yeah. So it's just, yeah, a big [inaudible] in a simpler -- yeah. It's not actually important. I mean, you know, yeah. Okay. So we imagine -- again, it's this idea that we can query the oracle. And what this basically comes down to is that I could put in a finite set of points and get routes between each pair of them. And again, I'm seeing one realization of a random process, but I'm modeling the random process, and so the distributions of these structures are Kolmogorov consistent in the natural way as the points vary. The assumptions, again: translation and rotation invariance. Of course things aren't literally that, because of population density, et cetera. It's just simplifying things. And scale-invariance is the real meat of the assumptions, because invariance under Euclidean scaling means that if I take two typical points distance r apart, there's some random route length; the distance by road between them is some D_r, bigger than r, and this just scales linearly. In distribution, I can just scale points distance one apart to points distance r apart, so the route length between two points distance r apart is distributed as r times the route length for distance one.
>>: [inaudible] non-intersecting paths?
>> David Aldous: On non-intersecting paths. Okay. Good. I'm going to come to that later. Yes. So thank you. Yes.
>>: What about that FDD and [inaudible].
>> David Aldous: FDD is finite-dimensional distributions. So the way you think about --
>>: [inaudible]. 
>> David Aldous: Kolmogorov consistency theorem.
>>: Consistency [inaudible].
>> David Aldous: Kolmogorov consistency is the idea that, in the simplest version, in order to define the distribution of an infinite sequence of random variables it's enough to define the distribution of every finite subset, with the obvious consistency condition, yeah. So it's sort of [inaudible] theory operating systems of probability. Yeah. Since we're modeling roads and not something else, we want to assume that these distances are finite, which isn't as -- which links to a bit of work by David Wilson et al. 10 years ago, and much subsequent work, showing that if you start off with various models of random spanning trees in the plane, random spanning trees of the lattice, and then refine the lattice and take limits, you can get random trees in the plane. And of course trees are a special sort of road network, if you want to think about them that way. But these trees have fractal paths, so the actual distances along a tree-like road network in the plane are going to be infinite, and we want to exclude that. If there were a fractal road, I wouldn't want to drive along it. So that's finiteness. The second thing, again going to what [inaudible] was saying: intuitively we think that what Bing Maps is telling us is that in the real road network there are many paths from one place to another, and it's picking out one of the many paths as the route. It's programmed to do that by minimizing something, not necessarily route length, because maybe [inaudible] put you on fast routes, so it's minimizing something we can think of as actual driving time. 
To put that into an axiomatic setting would need a reasonable layer of assumptions, so we assume something weaker. The idea is that for two points here there's a route, and for two points here there's a route, and what isn't allowed is that this route does that. We don't allow that for the obvious reason that both routes are going between those two points, so they should be using the same route between those two points. So that's the compatibility property, and it's also implicitly not allowing paths to self-intersect. So these are weak compatibility conditions we throw in. Again, we're thinking of routes between everywhere and everywhere else, which is rather hard to get our minds around. And the way we start thinking about this is, instead of trying to think of routes from everywhere to everywhere else, we sample points in the plane according to a Poisson point process and just look at routes between pairs of points within the process. So we're imagining, going back to my pictures here, something like this, but now the points are spread over the infinite plane. And again we're looking at routes between each pair of them, but we ought to say that within a window we don't fill the whole window with roads; [inaudible] a discrete pattern of roads within a given window. And this discrete pattern of roads is saying that when I look at all the routes between points of a Poisson process, I still have some finite mean length per unit area; it's not going off to infinity. Again, there is in a sense a network where for every pair of points in the plane I draw a straight line between them. But that's kind of silly. I don't want that. So, a final assumption. In these technical things, the main assumption so far is the scale-invariance. 
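The Poisson sampling step can be tried numerically. The following is a minimal sketch, not anything shown in the talk: it samples a rate-lambda Poisson point process in a square window and measures the length per unit area of the "silly" network that joins every pair of sampled points by a straight line. The function names, the window size and the rates are illustrative assumptions.

```python
import math
import random

def poisson_count(mean, rng):
    """Poisson(mean) variate: number of unit-rate exponential arrivals in [0, mean]."""
    count = 0
    total = rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count

def sample_poisson_points(rate, width, rng):
    """Homogeneous Poisson point process of intensity `rate` on the square [0, width]^2."""
    n = poisson_count(rate * width * width, rng)
    # Given the count, the points are independent and uniform on the window.
    return [(rng.uniform(0.0, width), rng.uniform(0.0, width)) for _ in range(n)]

def complete_network_length_per_area(points, width):
    """Length per unit area when every pair of points is joined by a straight segment."""
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += math.hypot(points[i][0] - points[j][0],
                                points[i][1] - points[j][1])
    return total / (width * width)

if __name__ == "__main__":
    rng = random.Random(0)
    for rate in (10, 40, 160):
        pts = sample_poisson_points(rate, 1.0, rng)
        print(rate, len(pts), round(complete_network_length_per_area(pts, 1.0), 1))
```

Within a fixed window the number of pairs grows like the square of the number of points, so this length per unit area keeps growing as the rate increases. That unbounded growth is what the finite-mean-length requirement is meant to exclude.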
There's one more assumption that turns out to be the interesting thing. Again I'll say it informally and then resay it in a nicer way. So I want to go back to my Poisson point process of rate lambda in the plane. But I'm just going to show the points of the process in these two regions. And then I'm just going to show the routes between some point in the left square and some point in the right square. So three points and two points, I've got six routes and I put them all together. And now the idea is to go on increasing lambda, so I'm sampling more and more points in the left square and right square and looking at the routes between them. And I got lazy at drawing my picture at this point. But what I'm asking is what's going to happen as I look up more and more street addresses here, more and more street addresses here, and keep asking for the routes. Well, I'm going to see more and more bits of road in here, more and more bits of road in here, and occasionally some road segment from a point near here joining that. But basically, in the real-world situation there's only a fixed number of routes, say the three I've got here, only a fixed number of places where there's a major road which connects somewhere in here to somewhere in there. And so that's something that intuitively clearly happens in the real world. We're going to put this into our model: as lambda goes to infinity, however many points we sample, we just end up with a finite mean number of crossings of an intervening line.
>>: So [inaudible]?
>> David Aldous: Yes, they're separated of course, yes. And again, that's the intuitive way of saying it. There's a nicer way of saying it. 
Again, this relates to both technical and conceptual points. Kolmogorov consistency works fine for countable index sets. When we go to continuous-time processes, with uncountable index sets, we have to start worrying about measure-theoretic niceties, sets of measure zero, uncountable unions and things like that. The standard way of doing this, since Skorokhod in the '50s started thinking about weak convergence, is to say that in order to talk about the process as a whole you want to think of the process not as a collection of random variables but as one object taking values in a function space: continuous functions, càdlàg functions, et cetera. So that's the standard way we now think about continuous-time processes as a whole rather than just the FDDs. It's hard to know how to do this directly in our world, because firstly X_t is typically something simple like a number, whereas our individual routes are slightly complicated graphical objects; they're paths, so they're in some space of paths that I've spared you from giving a definition of. And then if you want to think of sample path properties, you're asking what continuity properties the path from z1 to z2 has as a function of z1 and z2, and building this in at the beginning is a pain in an axiomatic setting. So there's one technical issue: to talk about this kind of continuous-parameter process as a whole, we need some regularity conditions to hitch together the finite-dimensional distributions. If you think about real-world networks --
>>: [inaudible] model left to right across the [inaudible]
>> David Aldous: Okay. Yeah, I think there's a -- right, there are lots of technical ways of doing it. I'm going to claim a conceptually simple way of doing it. Yes. 
So the conceptual point, the real-world point, is that we have an intuitive notion of major roads, minor roads, freeways, dirt tracks, et cetera. To actually quantify that in the real world, there are several ways of doing it. How big is 148th Avenue out there? You can quantify that as the number of lanes. You can work within the numbering system of interstates, US roads, state roads, county roads. You could do it more continuously by a measure of traffic volume. These are all roughly correlated with each other. And of course maps show this. This is actually an important point: unlike real maps, we're doing a model where the model just gives us routes as mathematical paths in the plane. It isn't saying that here I'm on a freeway and here I'm not on a freeway. We're putting minimal things into our model. But the interesting thing, the first interesting consequence, is that we can define a notion of size. Again, size, I should think of a better word; I mean position on this minor-road-to-major-road spectrum. We can derive a notion which isn't quite as natural as the real-world notions, but it's still sensible. Again, to be technical, we're given FDDs to start off with. So we can't talk about every point simultaneously, but we can talk about the network on our Poisson process of points, because that just depends on the FDDs. So conceptually what I'm doing is taking a small piece of asphalt, a small part of the road network, and then fixing a distance R, and then asking: are there two points, both of which are at least distance R from this particular bit of asphalt, such that the route from one to the other goes past where I'm standing now? 
So the point is, if I'm just out on 36th Street here, then no one driving from somewhere 10 miles away to somewhere else 10 miles away goes along 36th Street outside the building. Or maybe there's some maximum distance, I'm not sure; there are probably points five miles away to the north and south such that someone driving between those two points would be going down 148th past the building. So for each R there's a subnetwork of the Poisson sampled network, consisting of points on roads that are on a route where both end points are more than R from where you are. And the point is that that rigorously makes sense in terms of FDDs. Again, now we can let lambda go to infinity; everything is monotone, so there's a lambda-equals-infinity limit, which we can interpret as part of the entire network, the network on the continuum, with the same intuitive interpretation: these are points on a road that are on the route between some pair of points in the continuum that are at distance more than R from where we are standing. Again, everything is stationary, so this subnetwork has some mean length per unit area, depending on R. Scale-invariance, which does all our work for us, shows this scales in a particular way. And of course the final assumption, which implies a previous assumption, is that these rates are actually finite; [inaudible] it might be infinite. But our final assumption is that these are finite. So the assumptions, of which the important ones are scale-invariance and this finiteness one, define the class of processes. We haven't defined a particular model here, but it's a class of processes, like stationary processes are a class of processes. So the jargon is scale-invariant random spatial network. 
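The particular way in which the mean length per unit area scales can be written out. This is a sketch under assumed notation rather than a formula quoted from the talk: write \mathcal{E}(r) for the limiting subnetwork at threshold r and p(r) for its mean length per unit area, so that the interesting number is p(1).

```latex
% Notation assumed for this sketch:
%   \mathcal{E}(r) = points lying on a route between two points that are
%                    both at distance more than r from that point;
%   p(r)           = mean length per unit area of \mathcal{E}(r).
\begin{align*}
  &\text{The scaling } z \mapsto cz \text{ maps } \mathcal{E}(r)
    \text{ onto the threshold-}cr\text{ subnetwork of the rescaled network,} \\
  &\text{multiplying lengths by } c \text{ and areas by } c^{2}.
    \text{ Since the rescaled network has the same distribution,} \\
  &p(cr) \;=\; \frac{c}{c^{2}}\, p(r) \;=\; \frac{p(r)}{c},
    \qquad\text{hence}\qquad p(r) \;=\; \frac{p(1)}{r}.
\end{align*}
```

So under these assumptions, p(1) being finite is enough to make p(r) finite for every r > 0.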
And, you know, since I'm claiming the interesting number as my philosophical thing at the beginning, any particular process in this class has an interesting number, which in this context is this P of 1. I'll say why this is interesting; let me say it first and come back to the previous one. I'm doing bad ordering. The condition that P of 1 is finite is the mathematically tidy version of what we were saying back here. That's basically the same condition as saying that, however many points I sample on the left and the right, there's only a finite number of Poisson points on the intermediate line. >>: You have an assumption here that these two squares are far apart? >> David Aldous: It doesn't matter because everything's scale-invariant. So it doesn't matter, you know. It's just some distance: this is one, this is one, this is one and a half or two, and however they scale doesn't matter. >>: [inaudible]. >> David Aldous: Bigger than zero, yeah. Bigger than zero. >>: But a [inaudible]. >> David Aldous: No, you're going to have an infinite number of things across the boundary if they're adjacent, yes. >>: [inaudible] few inches maybe those could be finite number of [inaudible]. >> David Aldous: Well, no, but this ratio, I mean, it's kind of constant, you know. The ratio of this length and the separation has to be not going to zero or infinity, but it's just -- yeah. Okay.
So this final assumption corresponds to that picture at the conceptual level, but it also works as a technical regularity assumption, analogous to regularity conditions on paths for continuous-time processes. The FDDs plus the assumption give us this limit structure. In this limit structure we're still sampling, but some of the effect of the Poisson sampling goes away at this point, and we've got a picture in which, as R comes down from infinity [inaudible], this is showing what we imagine is just the interstate freeways, and as R comes down we're showing more and more minor roads that are slowly filling in the entire plane. And again, this is a math object whose existence follows from the previous setup. And finally, of course, the actual number P of 1 is a number for the real-world network. It's two and a half or something like that. Again, it's a pure number and everything is scale-invariant, so it can't depend on whether you're dealing with miles or kilometers, et cetera. It's a characteristic of a particular model. And saying it backwards, this way of looking at things assigns any road segment a real number R: the largest R such that the segment is in the route between two points that are both more than R away. And that's where you are on the spectrum from minor road to major road. Going back to -- again, the last bit about the axiomatic setup is that there are technical reasons for doing it this way.
So we chose one of three possible ways of starting out here, which is to imagine the oracle gave us routes. Another common-sense way of doing it is to start off by putting down the major roads and then fill in minor roads, et cetera. And a more mathematical way is to say, well, let's just make up a random metric on the plane and define routes as geodesics in the random metric. So two and three are perfectly sensible ways of trying to invent particular models in this class. When you try to do rigorous constructions, the actual technical problem, as far as I can see, is proving uniqueness of routes, and somehow two and three don't help you with the uniqueness of routes. So if you start there, you have to assume uniqueness of routes too in general. And if you're going to do that, then only assuming uniqueness of routes to begin with seems conceptually tidier. So let me go back to the beginning and show the slide again, where the bottom stuff is what we did first: there's an axiomatic setup. Again, a class of processes. What are the properties of any given process in this class? And I started saying some with this emergence of the road structure. There are some more mathematical ones. I'll say very briefly here that they relate to issues in percolation-type theory. So there's a notion of a geodesic, which is slightly fussy in detail in this context: a semi-infinite or a doubly infinite geodesic is an infinite line such that arbitrarily long finite sections of it are in routes. So it turns out to be elementary that there aren't any doubly infinite geodesics, whereas the analog of this in more standard first passage percolation problems is a hard open problem. But here the scale-invariance and symmetry give us a lot for free. When it comes to semi-infinite geodesics, let me take one picture, at one point, without loss of generality.
Take the origin. There's a route from here to anywhere, so I could let that end point go to infinity and take limits, so there has to be at least one semi-infinite geodesic out to infinity. A priori there might be more than one. And of course, a moment's thought says that the only possibility is that they have to form some tree-like structure. That's just the nature of geodesics. The only thing that they could be is a tree-like structure. On the other hand, if I then ask how many of these are crossing a circle of some radius: on the one hand, because they are tree-like, the number of crossings increases as you branch; on the other hand, scale-invariance tells you magically that it has to be constant. So the number of crossings has to be constant across -- couldn't it be infinite? Well, it could be infinite, except that P of 1 less than infinity stops it being infinite. So there's some finite number. Could it be seven? Well, you know, maybe it could be seven. There may be some weird examples where almost every point has exactly seven such geodesics. It's kind of like having seven infinite components in a percolation problem. So it sounds very implausible somehow. >>: [inaudible]. >> David Aldous: Say it again. >>: [inaudible]. >> David Aldous: I'm not sure how it rules out seven, actually. It may do. But I don't quite see it. >>: [inaudible]. >> David Aldous: Okay. Well, it didn't short it. A geodesic -- there's a route from zero to anywhere, and then the idea is that the route from zero to here may go along here, and then the route from zero to here might go around there. So geodesics have this odd property that things are only precisely defined almost everywhere, and roads have measure zero. So there are a lot of technicalities I'm hiding here.
>>: [inaudible] if you have seven, then there would be a closest one to the origin, and wouldn't the distance -- >> David Aldous: [inaudible]. If there are seven [inaudible] branch to the origin, yes. If the seven [inaudible] that's the problem, you're right. There might be seven but they branch [inaudible] the origin, and so you might somehow have a construction based on four quadrants where there's always one going northeast and one going northwest, et cetera. Actually, who knows. Again, I have my weird [inaudible] process. In the finite world you can do that. It's an amazing thing that on a Poisson process you can put edges such that actually one goes in each of the four quadrants. It's very implausible. It actually can. Probably not here. Anyway, so again, I believe that any nice model in this class does have a unique semi-infinite geodesic. It's one of many technical open questions, because I'm the only person that's ever thought about this, so far as -- >>: [inaudible]. >> David Aldous: Probably. I just haven't, you know -- I like pictures and data, at least somewhere three steps away in the background. I don't really have any pictures of this in three dimensions. So is it different in D dimensions? Well, let's say three. But I'm merely thinking of road maps. And it's very hard. I don't know what I want to think about, because the interesting three-dimensional networks are the neurons in your brain, and those are just different from road networks. So if you actually want to think about three-dimensional things, you should think about neurons and something concrete and not just make math stuff up. That's my take. You want some guidance. [Inaudible] make math stuff up. You want some guidance as to what you're actually doing. Yeah, so [inaudible] I'm sticking to the planar case; the planar case sets some constraints on things.
So the uniqueness of semi-infinite geodesics sounds like a weird pure math property, but because of scale-invariance, assuming they are unique, what it's basically saying is that if I take a circle of radius one and now take things very far away, then there's one path from zero to the circle of radius one such that every path to a far away point starts off in that same particular way. So that's basically what unique semi-infinite geodesics are saying. And now I can scale everything back in, and this is equivalent to saying -- let me wander over here -- is basically equivalent to saying that if I start here and look at paths out to the boundary of the unit circle, then these actually coincide within some small epsilon here. And this is getting us to a path continuity property. So I want to say, here's the route from this point to this point. What happens if I start near here and end near here? I want to say those routes converge, converge in the strong sense of being identical outside of neighborhoods. And somehow it's almost the same property as unique semi-infinite geodesics, because any path that goes a fixed distance has to start off the same way. And again, at a technical level -- it's actually the same condition, but we're almost in the same ball park, talking about continuity of paths as a function of the end points and more abstract things like semi-infinite geodesics. Okay. So that's a sort of sales pitch. So here's a sales pitch which isn't entirely serious. It could be done better, but it's about how your car GPS device finds routes. So there are two extremes of what it might do. [inaudible] all of the 100 million street addresses it's going to find the route between, you could imagine a 100 million by 100 million table that's precomputed and you just look it up.
Or you can imagine it's just representing the graph and doing no precomputation, and you're just using the classical Dijkstra-style algorithm to find it. And so first of all, neither of these will be optimal. And after people thought about this for a while -- there are many practical things that go into it -- one of the theoretical-practical ideas that these guys came up with in 2007, in the modern style where you have a good idea, so it's a Science paper and a patent, is this notion of what they call transit nodes, which is basically just intersections of major roads. The idea is, if you have any two points a long way apart, hundreds of miles apart, then we have these precomputed and prerecorded 10,000 special points, and you have [inaudible] theoretical guarantee that the optimal route between your two points is going to go through one of these special points close to the start and another special point close to the end. So now you run the algorithm to find the shortest route from your house to some number, 10 or so, of these special points in the neighborhood, and similarly at the other end, and then you just look up in your table the length of the route between each of those 10 times 10 equals 100 possible pairs of local transit nodes. And this is something you can do in a 10th of a second [inaudible] five seconds for the other ways. People aren't going to wait five seconds anymore, so a 10th of a second makes a difference. Anyway, this is just what I've been told by other people. And so at one level, of course, from a practical point of view, the fact that there are 10,000 of these points rather than 1,000 or 100,000 is an empirical fact. Why is this theoretically true?
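The transit-node query described here (a local search at each end, then a lookup among pairs of nearby special points) can be sketched on a toy graph. This is a minimal illustration only, not the published implementation: the graph, the node names, the choice of transit nodes and the `access` sets are all made up, and a real system would run a truncated local search rather than the full Dijkstra used below.

```python
import heapq


def dijkstra(graph, source):
    """Shortest travel times from source in a weighted graph {u: {v: w}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def build_table(graph, transit_nodes):
    """Precompute travel times from every transit node (done once, offline)."""
    return {a: dijkstra(graph, a) for a in transit_nodes}


def transit_query(graph, table, access, start, end):
    """Best time over all (access node near start, access node near end) pairs."""
    from_start = dijkstra(graph, start)  # stand-in for a short local search
    from_end = dijkstra(graph, end)      # valid because the toy graph is undirected
    return min(
        from_start[a] + table[a][b] + from_end[b]
        for a in access[start]
        for b in access[end]
    )


# Hypothetical toy road network; weights are travel times.
graph = {}


def add_road(u, v, minutes):
    graph.setdefault(u, {})[v] = minutes
    graph.setdefault(v, {})[u] = minutes


add_road("home", "A", 1)
add_road("home", "C", 3)
add_road("A", "B", 10)
add_road("C", "B", 20)
add_road("B", "work", 2)

transit_nodes = ["A", "B", "C"]                # assumed "major intersections"
access = {"home": ["A", "C"], "work": ["B"]}   # assumed nearby transit nodes
table = build_table(graph, transit_nodes)
answer = transit_query(graph, table, access, "home", "work")
```

With 10 or so access nodes at each end the query inspects about 100 table entries, which is the 10 times 10 lookup mentioned above. On this toy graph the answer agrees with a plain Dijkstra search from `"home"` to `"work"`; in general the agreement relies on the guarantee that far-apart end points route through their access nodes.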
So at TechFest I heard a talk by Goldberg about this paper I'm quoting here, and they're looking at worst-case analysis, so they have this notion of highway dimension, which takes a bit of a while to parse, except that it's similar to what I'm doing. So somehow if I fix an R and then say, well, for every ball of radius 4R, so wherever I have centered the ball, I want to say there's some number, H, of vertices such that every shortest route which isn't too short passes through some vertex in the set. This is close to what I am talking about, without the probabilistic framework. And the point is that, a priori, H depends on R; they are implicitly taking the [inaudible] R, [inaudible] sensible thing to be doing, you want something like scale-invariance to show that. Because if H was changing directly with R, you'd only have to do it on the worst scale, not on every scale. So there you have some rather intricate worst-case analysis. >>: [inaudible]. >> David Aldous: It isn't. They have made up the word. There's nothing -- it's not a dimension. Someone said that at the TechFest. Maybe Yuval. Yuval said that. There you go. Great minds think alike. Yes. It isn't a dimension. They just made it up. Yes. Okay. So we can do a little back-of-envelope calculation within our model. Of course, in order to map our model, which has infinite scales all the way up and down, to the real world, you have to cut off at some low distance, but that doesn't really affect anything. But basically -- again, I'm not sure I want to defend this too much because [inaudible] arbitrary -- I arbitrarily compare the costs of these two different algorithms: a minimum cost path algorithm, and doing something with precomputing and looking up K by K matrices. And these are sort of rather directly comparable.
But the point is one could do the details of the cost comparison in some other way and you'd get a different answer of the same general format. So within the model of an SIRSN, a little back-of-envelope calculation here tells you how the algorithmic cost, the time to do the algorithm, scales with the various characteristics of the model, and the point is it comes out simple and very interesting. It depends on the number of edges of the graph, that is, the number of road segments, and our magic number P of 1, and implicit constants in the costs of the algorithms here. So we actually can fit this slightly into algorithmic problems. I'm running out of time, so I'll take one minute for my last thing. So people paying attention to the logic of what I'm doing will have noticed that I've defined a class of models but not shown it's not empty. I've written down a bunch of axioms, and if we prove things from the axioms there had better be something that actually satisfies the axioms. So let me say in the last two slides what that something is. Again, it's conceptually simple, though annoyingly hard to nail down in detail. For people who have paid attention to the road network here, 148th is a main street. 140th Avenue is the next main avenue. Then 132nd Avenue, then 116th Avenue because there's a park in the way. So actually the road designers here made every eighth street and avenue, roughly, a major one. So I'm kind of doing the -- yeah? >>: [inaudible]. >> David Aldous: These are the what? >>: The quarter sections. Fields half a mile by half a mile. >> David Aldous: I don't think this was ever fields. I don't think they were literally fields. I think the cities were laid out rather separately from pure -- I don't know. But I think this was all forest and didn't get to be fields. But I could be wrong. Yes.
So the model, as I [inaudible], is my regular Manhattan grid, but all of the odd-numbered streets and avenues have relative speed limit one, and then the ones that are divisible by two and not four have a faster speed limit, and streets and avenues divisible by four but not eight have a faster speed limit still. So you have this binary hierarchy of speed limits. And now you define routes as shortest-time routes, which basically means that when you start off with a starting point and an ending point, then somewhere between these two there's a fast route that way, somewhere a fast route that way, and you're probably going to want to go along here and along here for a while and then take off somewhere into there. So one has routes within this world, but they're preferentially going to the high-speed roads, which are arranged in this binary hierarchical way. And so that's the basic construction, and the key feature of this construction is that it's actually invariant under scaling by two. It's precisely invariant under scaling by two. Take the route from one point to another point and scale by two: that map takes it onto some route from twice C1 to twice C2, and that's actually the optimal route from 2C1 to 2C2. So we have a deterministic invariance under scaling by two here. And that gets us started, and that's a very strong and special property, so we keep going by refining by a half, and this consistency property means that doesn't alter the routes we've already got. We just get in more points as we refine by half. Intuitively, we can now define routes between points in the continuum by taking limits from the lattice points. That turns out to be technically the hard way of doing it. So you get deep limits. But it eventually works. And then of course we have no translation invariance.
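The binary hierarchy of speed limits can be made concrete in a few lines. This is a sketch under stated assumptions, not the construction from the talk: the talk only says that more-divisible lines are faster, so the rule "speed is `gamma` to the power of the number of factors of two", the value `gamma = 1.5`, and the finite 8 by 8 grid are all choices made here for illustration.

```python
import heapq


def height(n):
    """Number of times 2 divides n (n a nonzero integer): odd lines have height 0."""
    if n == 0:
        raise ValueError("height is only defined for nonzero line numbers")
    h = 0
    while n % 2 == 0:
        n //= 2
        h += 1
    return h


def speed(n, gamma=1.5):
    """Assumed speed limit on street or avenue number n: gamma ** height(n)."""
    return gamma ** height(n)


def fastest_time(start, goal, size, gamma=1.5):
    """Shortest travel time between lattice points on the grid {1..size}^2."""
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, (x, y) = heapq.heappop(heap)
        if (x, y) == goal:
            return d
        if d > dist[(x, y)]:
            continue  # stale heap entry
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if not (1 <= nx <= size and 1 <= ny <= size):
                continue
            # A horizontal step runs along street y; a vertical step along avenue x.
            line = y if dy == 0 else x
            nd = d + 1.0 / speed(line, gamma)
            if nd < dist.get((nx, ny), float("inf")):
                dist[(nx, ny)] = nd
                heapq.heappush(heap, (nd, (nx, ny)))
    return float("inf")


# Travel time from (1, 1) to (7, 7); at speed one everywhere it would be 12.
t = fastest_time((1, 1), (7, 7), size=8)
```

Doubling a line number raises its height by one, so `height(2 * n) == height(n) + 1`. Under the assumed speed rule, doubling a lattice path doubles each segment's length and multiplies its speed by `gamma`, so every travel time is multiplied by the same factor `2 / gamma`. That is consistent with the statement above that scaling by two carries routes to routes.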
These are special, but you just randomly translate a long way away and take limits, and general abstract nonsense gives you translation invariance. Artificially apply a random rotation and you get rotation invariance, and the fact that everything is invariant under scaling by two deterministically means you just add a random scaling factor between one and two and that gives you exact scale-invariance, and at the end of the day you've got two little calculations to show our parameters are finite. So this all takes, you know, 12 pages to say the details. The hard part is that what you think ought to be easy, just taking limits, isn't: it's hard to show you get unique limits. Partly because we don't understand them algorithmically, or I don't understand them algorithmically, well enough to know what the optimal routes are. Anyway, I am going to stop there. So thank you for your attention. Lots of technical open problems for people who like technical open problems. Thank you. [applause]. >>: So you define some conditions, some axioms [inaudible] I guess in [inaudible]. I'm not quite sure exactly what the [inaudible]. >> David Aldous: [inaudible]. [brief talking over]. >> David Aldous: It's a special situation. My propaganda is the philosophy here -- let's go back to that. Math modeling started with, you know, Newton and the law of gravity and Kepler and motion, and math modeling, probability modeling, has gone a long way from basic physics. And we've still kept the style of imagining that complicated things can be modelled simply. And my philosophical point is we do a bit too much of that. John Doyle has this argument about complexity theory and that -- >>: You said the [inaudible] satisfy this and that. >> David Aldous: No, I'm -- okay. I'm going to a math model.
I'm doing sort of -- I'm not claiming this is a very literal model of a road network. So this is a bit of mathematically playing around inspired by road networks, but doing things with numbers at the end which are kind of checkable, rather than order of log N theory, which doesn't correspond so much to anything that's directly checkable. So it's playing around with theoretical structures for things that are hard and maybe impossible to model sharply. >>: [inaudible] have a question. >> David Aldous: Yes? >>: If you take the road map of the US, what's P of 1? >> David Aldous: I think about two and a half, yes. I mean, the more important issue is to what [inaudible] is it, in fact, roughly scale-invariant? We have a theory of things being roughly scale-invariant, and is it? And again, I don't know, because it's hard to find out. The best evidence is [inaudible] part 2 here; they did something slightly absurd. They actually asked maps for routes between points a long way apart and then looked at how much was on one named road, how much of the journey was on I-5, I-90, et cetera. So there's some proportion of the journey on the largest road with the same number, et cetera. And, well, if it's scale-invariant, all these proportions would be the same on different scales, and the first one varies a little. The other ones are remarkably close to each other. So that's one bit of evidence for scale-invariance. Something I'll have an undergraduate do, when I find an undergraduate to do it, is just actually do my little transparency project: look at four points in a square, put them down at random, look at the actual subnetwork on the four points.
There's some bizarre number of topological shapes the subnetwork could be, like 45, but probably a few of them come up more often than others. Scale-invariance predicts that the frequency of different topologies shouldn't vary with scale. And again, one can do something similar and see if they do. And my guess is they don't vary too much because, again, intuitively, if these things varied it would be in one direction; things would be getting more extreme in one way. But it's very hard to see in which way. What is the actual difference between looking at things a few miles apart and a few hundred miles apart? [inaudible] getting more or less straight, you know -- yeah. So it's hard to imagine ways in which it changes drastically. >>: [inaudible] the numbers that you give, the US has 100 million addresses and 10,000 points [inaudible] does that apply to other countries, developing countries? >> David Aldous: Yeah, with the [inaudible] Europe. So the people who are doing this stuff, they tried out the US road network and the European road network and they get very similar -- >>: [inaudible]. >> David Aldous: No. I mean, obviously it's not a -- yeah. I mean, the numbers aren't the same. Just like the entropy of different languages isn't the same. But we're not assuming it is. Again, this whole business is that it's not a model but a class of models. Yeah. Right. So that's getting back to my conceptual point that one doesn't want to try and model complicated things like language or road networks as something with three parameters or 17 parameters; one wants to think of them as something in a large class of processes defined by structural properties, and, you know, playing around with that idea. Yeah. >> David Wilson: Well, why don't we save our questions for afterwards. [applause]
