
Yuval Peres: And so thanks, all, for coming. Some of you have seen versions of this
talk before. This is yet another attempt to make the ideas comprehensible.
So this is joint work with Jian Ding from Berkeley and James Lee from UW. Please.
>>: [inaudible]
Yuval Peres: What?
>>: Where is your picture?
Yuval Peres: Well -- [laughter]. Okay. So please let me know if there are questions
along the way. We're going to consider random walks on graphs, and you can
think of having conductances on the edges or just think of simple random walk; that's
general enough.
So here are some motivating pictures from two dimensions. This is drawn by Raissa.
This is random walk on a 2-dimensional lattice, run for time about the square of the
side length, and colored by the number of visits to the different sites. Here we're running it on a
torus until everything is covered, and coloring by the time it took to reach the different vertices.
So the basic quantities for random walks are the hitting time, the expected number of steps
to hit one vertex from another; the commute time, the expected time to go from u to v and back to u;
and the one we'll be focusing on, the cover time, the expected time for the random walk to
visit all the vertices. And, of course, the starting point comes in here, so we're going to
start from the worst vertex, that is, take the maximum of the expectation over all
starting vertices.
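To make these definitions concrete, here is a minimal simulation sketch (the 8-cycle, the trial count, and the function names are illustrative choices, not from the talk) that estimates the cover time from the worst starting vertex:

```python
# Illustrative sketch (not from the talk): estimate the cover time of a small
# graph by direct simulation of simple random walk.
import random
import statistics

def cover_time_once(adj, start):
    """Number of steps a simple random walk started at `start` needs
    to visit every vertex of the connected graph `adj`."""
    unvisited = set(adj) - {start}
    v, steps = start, 0
    while unvisited:
        v = random.choice(adj[v])
        steps += 1
        unvisited.discard(v)
    return steps

n = 8
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}  # the 8-cycle
# Cover time = max over starting vertices of the expected time to cover.
estimates = {s: statistics.mean(cover_time_once(cycle, s) for _ in range(2000))
             for s in cycle}
print(max(estimates.values()))  # Monte Carlo estimate; exact value is n(n-1)/2 = 28
```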
So just to get acquainted, here are the orders of magnitude of cover times for some
different graphs. If you have a path of length n, it's n squared. Complete graph:
n log n. Expander: still n log n. Two-dimensional grid: n (log n) squared. And, again,
n here is the number of vertices. For some of these examples the constants in front
are known, but I'm not going to focus on the constants here.
For the 3-dimensional grid, and also for higher dimensions, it's n log n. And then for
the complete d-ary tree it's, again, n (log n) squared, up to a log d factor; one could
remove the log d because I'm not worrying about the constant, but you can think of d
growing, and then you want the log d.
For some of these examples, for instance this one, the constant was later found by
David Aldous, but we're just going to focus on the order of magnitude.
So people were interested in cover times for different reasons. One is as a simple
algorithm for determining connectivity in a network: you know the size of the
network, but suppose the operations you can do are very limited, so you just walk
around the vertices, record where you've been, and see when you've visited all vertices.
But I think then this study took on a life of its own, and people in computer science,
combinatorics, and probability got really interested in this, perhaps beyond whatever
practical motivations it had. So several people here worked a lot on cover times, and
I'll come back to that.
Now, one reason for interest is that hitting times are very easy to calculate, explicitly
and deterministically, using linear equations -- I'll come back to this point -- while the
cover time is more elusive.
Here are the general bounds that are available. The cover time of a graph is always
bounded above by 2 times the number of vertices times the number of edges. This is a very nice
argument based on spanning trees due to Aleliunas et al. And there's a
lower bound of order n log n; it's somewhat tricky to get the right
constant here -- this was done by Feige in '95 -- but a variation of
Matthews' older argument gives it up to a constant factor.
And these are hitting times. These are cover times. And we'll connect them to electrical
resistance in a moment.
So a crucial tool in analyzing both hitting times and cover times is the notion of
electrical resistance. You think of the edges as unit resistors, and there
is an effective resistance between one vertex and another. If I want to define it,
think of sending a flow from one vertex to the other, a flow that has to satisfy the constraint that the
incoming flow at every vertex equals the outgoing flow, except at the source and the
sink. I send a unit of flow from the source to the sink and then I calculate the energy of
the flow, the sum of squares of the flow over the edges, and try to minimize that. That's one
definition of the effective resistance: the minimum energy of a unit flow that goes from
one vertex to the other.
And there's a connection between the effective resistance and the commute time.
Remember, the commute time is the expected time to go from u to v and back, and the commute
time is just equal to twice the number of edges times the effective resistance. So a
general and very convenient formula.
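As a sanity check on this formula, here is a small sketch (the toy graph is a placeholder) that computes effective resistances from the Moore-Penrose pseudoinverse of the graph Laplacian and evaluates 2|E| times R_eff(u,v), which by the formula above is the commute time:

```python
# Illustrative sketch (toy graph is a placeholder): effective resistance from the
# Moore-Penrose pseudoinverse of the graph Laplacian, and the commute time
# via the formula C(u, v) = 2 |E| R_eff(u, v).
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # unit resistors on the edges
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1
L = np.diag(A.sum(axis=1)) - A        # graph Laplacian
Lplus = np.linalg.pinv(L)             # Moore-Penrose pseudoinverse

def effective_resistance(u, v):
    e = np.zeros(n)
    e[u], e[v] = 1.0, -1.0
    return e @ Lplus @ e

u, v = 0, 2
print(2 * len(edges) * effective_resistance(u, v))   # = commute time between u and v
```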
>>: [inaudible]
Yuval Peres: Right.
>>: I don't see how that works, because what if the graph is disconnected [inaudible].
Yuval Peres: Okay. So we're looking at connected graphs. So this is for connected
graphs. So that's part of the assumption, that the graph is connected.
>>: [inaudible]
Yuval Peres: What?
>>: [inaudible]
Yuval Peres: So if you --
>>: [inaudible]
Yuval Peres: There are no -- with this formula, there are no weights.
>>: [inaudible]
Yuval Peres: What? Right. The graphs are undirected. So this is for -- these formulas
are for undirected graphs, and everything extends to reversible Markov chains, but let's
just think of simple random walks on undirected graphs. That's a large enough class to
focus on.
And we'll see that this connection between the effective resistance and the commute
time gives both of them some nice properties. So the question of computation of cover
time I think was highlighted by Aldous and Fill in '94. If you look at hitting times,
they are easy to compute explicitly, because they satisfy linear
equations. If I want the hitting time from u to v, well, there's a first step from u, and
then I have to average over the neighbors w of u the time to go from w to v.
So I can write down these linear equations, and it's not hard to check that there are
enough equations and that the system is non-singular, so one can use them to determine all the
hitting times. So there's a very fast algorithm to determine the hitting times explicitly.
Of course, you could run the Markov chain, sample hitting times at random, and
average, but there's no need to do that. Linear equations are better.
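Here is a minimal sketch of that linear system (the toy graph and function name are my choices, not the talk's): for a fixed target v, set H(v,v) = 0 and solve H(u,v) = 1 + (1/deg u) times the sum over neighbors w of H(w,v):

```python
# Illustrative sketch: hitting times from the linear equations
#   H(v, v) = 0,   H(u, v) = 1 + (1/deg u) * sum_{w ~ u} H(w, v)   for u != v.
import numpy as np

adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # a 4-cycle, as a toy example

def hitting_times_to(v):
    others = [u for u in adj if u != v]
    idx = {u: i for i, u in enumerate(others)}
    M = np.eye(len(others))
    b = np.ones(len(others))
    for u in others:
        for w in adj[u]:
            if w != v:
                M[idx[u], idx[w]] -= 1.0 / len(adj[u])
    h = np.linalg.solve(M, b)
    return {u: h[idx[u]] for u in others}

print(hitting_times_to(0))   # on the 4-cycle: H(1,0) = H(3,0) = 3, H(2,0) = 4
```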
Now, for cover time, it's more challenging because there are no simple equations like that.
One comment is that, of course, you can think of the cover time problem as a hitting time
problem, but on a much larger space. You can pass to a non-reversible chain which records
the set of vertices visited so far: the set visited so far by the
base chain itself forms a Markov chain, you start with a single
[inaudible], and you want to hit the whole space. So it is a hitting time problem.
This linear equation connection doesn't really depend on the fact that it's reversible.
>>: [inaudible]
Yuval Peres: Right. The state contains the set and where you are now and where the
[inaudible], and that's all the information you need -- the past is irrelevant -- to
determine the progress of this set-valued chain. So you can see when the set hits
the full space; that would be a hitting time. So you reduce it to a hitting time
problem, but on an exponentially sized state space, so it's not very pleasant computationally.
The other option is that we can estimate cover times just by running a random walk.
Since the cover time, both in expectation and in standard deviation, is
always bounded by order n cubed on an n-vertex graph, then just by running many
times and averaging the results you get a polynomial time algorithm to approximate the
cover time to any degree of precision. But, A, there are statistical errors, so you don't
get it exactly, and, B, there's the randomness.
So one question, which, again, as far as we've been able to trace goes
back to Aldous-Fill, '94, though Lovasz was also propounding it later, is to find a
deterministic order-1 approximation of the cover time for general graphs. So no
randomization.
>>: When you say order of one, do you mean difference or the ratio?
Yuval Peres: Ratio.
So here's the strongest result of this type known before. If H max is the maximal
hitting time, then there's a result of Matthews -- I'll remind you why -- that says
the cover time is bounded by H max times a log factor. But there's also a
lower bound of this type, and I'll say that in a minute.
The key thing is that Kahn, Kim, Lovasz, and Vu showed that if you
combine the two lower bounds -- of course the cover time is bigger than the maximal hitting
time, and it's also bigger than this Matthews-type quantity -- you get a pretty good
approximation. Indeed, it's sharp up to (log log n) squared in general.
So let me quickly go over those. First, for the upper bound, if we don't care about the
sharp constant, there's a very easy argument for the upper bound on the cover time.
If I start at a vertex and wait for time twice H max, then for any vertex the probability
that I've hit it must be at least a half. So the expected number of vertices I've hit by
time 2 H max is at least half the graph, and at most half the graph remains.
Now I repeat from where I am, and, again, just looking at these leftover vertices, each
one is hit with probability at least a half, so the expected number of leftovers is at most a
quarter after the second round. If we run a logarithmic number of rounds, the expected number
of vertices that have not been hit is going to be less than 1, so with high probability we've
covered, and this way we can bound the cover time.
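In symbols, the doubling argument: by Markov's inequality, from any starting vertex the walk hits any fixed vertex within time 2 H_max with probability at least 1/2, so after k rounds of length 2 H_max,

```latex
\[
\mathbb{E}\big[\#\{\text{unvisited vertices}\}\big] \;\le\; n\,2^{-k},
\]
```

which drops below 1 once k exceeds log_2 n, giving t_cov = O(H_max log n).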
What Peter Matthews gave is a more elegant argument that gives the right constant: the
cover time is bounded by H max times, approximately, the harmonic series, 1
plus a half plus ... up to 1 over n. And his argument was based on taking the sequence of
vertices, putting them in random order -- so adding randomness to the system -- and then
looking at which vertices in this random sequence you hit. It turns out that each time,
when you go from vertex k to vertex k plus 1, there's some chance that the (k plus 1)st vertex
in your sequence was already hit before. The chance it wasn't hit before, because of the
random permutation, is 1 over k plus 1. So that's the chance with which you'll need to pay that
hitting time.
Making that rough idea precise, Matthews proved this upper bound, and the same
idea proves this lower bound: for any set S you get a lower bound by looking at
the hitting times within the set times the harmonic series whose length is the size of
the set. You take a set you're interested in; in particular, if you're going to cover the graph,
you have to cover this set, so you have to travel between the elements of the set. You
take them in random order; we go from one to the next to the third, and so on. Each time we
lower bound the hitting time by the minimal hitting time within the set, but these hitting times
will only be paid with some probability, because with some chance the vertex has already been
hit before. So you get this kind of lower bound.
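Written out, the two bounds just described read, with H(u,v) the hitting times and H_max their maximum:

```latex
\[
t_{\mathrm{cov}} \;\le\; H_{\max}\Big(1+\tfrac12+\cdots+\tfrac1n\Big),
\qquad
t_{\mathrm{cov}} \;\ge\; \max_{S\subseteq V}\;\Big(\min_{\substack{u,v\in S,\; u\neq v}} H(u,v)\Big)
\Big(1+\tfrac12+\cdots+\tfrac{1}{|S|-1}\Big).
\]
```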
And then it turns out -- it's not obvious, but it was proved in this paper KKLV -- that if you
replace the hitting times here by commute times, it's the same up to a constant. And
this quantity is, again, sharp up to (log log n) squared: you take the maximum
of this quantity and the maximal hitting time, multiply by (log log n) squared, and get an
upper bound on the cover time. This was a non-trivial argument. And that yielded a deterministic
approximation up to this factor.
So the question of a constant approximation remained open.
>>: Your title is polynomial time because of the max over S. This doesn't seem to be
polynomial time.
Yuval Peres: Thank you.
So another thing that was done in this KKLV paper -- again, this is not
obvious -- is that you can estimate this quantity up to a constant in polynomial time. So
there's an algorithm for this. It's related to questions on packing and covering
of spaces: if you look for the best packing of balls in a metric space and you
want to estimate the largest packing of disjoint balls you could do, it's very
hard.
But if you are ready to forgo some constants, or especially some factors of [inaudible]
the log of the size of the set, then a greedy algorithm will give you a sharp enough result,
and that's roughly what is shown. So this part of the argument -- showing that this quantity,
which, as you say, is defined by a max over exponentially many sets, can actually be calculated
efficiently -- is a relatively easy part of their paper, not the hard part, but I won't do it here.
Thanks.
All right. And a very recent result is that for trees one can do a recursive approximation
that gives the cover time up to 1 plus epsilon: for any epsilon you get a polynomial
time algorithm. And this was actually predicted. There was a paper by
David Aldous that worked out the order of the cover time, with some further information,
for random trees.
So you take a random labelled tree on n vertices, picked uniformly among all such labelled
trees, and the cover time of that is of order n to the three halves. David's proof was
based on a recursion, and he wrote that he expected this type of recursion could be
used for an algorithm for general trees.
That turns out to be a correct prediction, although there's a lot of sweat and tears in actually
carrying it out for general trees. This can be found in the paper from September
by Feige and Zeitouni. However, trees still seem very special in that you can do these
kinds of recursions.
So now I want to come to a second motivation: suppose a probabilist is
interested in cover times, but not in algorithmic calculations. What's the relevance of
this quest?
Well, the KKLV paper, besides giving the algorithm, also gave a result on a question
of Winkler and Zuckerman. This is about blanket times. Blanket times are times
when we cover a graph in an approximately uniform way. Let me be more precise.
Look at a graph like the complete graph where the cover time problem is just a coupon
collector problem. So you just walk at random, you have n vertices, you wait until you
cover. We know that time is n log n. Even the constant is 1. And at the cover time
itself you have most vertices visited order log n times, and, of course, the last vertex to
be reached is visited just once. So you have this disparity.
So suppose I want a stronger notion of covering where all vertices will be visited about
the same. So what's the right definition there? Well, if I just run to infinity, I know that
the number of visits to every vertex is going to be according to the stationary
distribution, which is just the degrees.
So the right thing to look at is what we call the local times, which are the number of
visits to a vertex normalized by the degree. And if we look at these quantities, they
should balance out. So we know that if we let time go to infinity, the ratios of
all these local times to each other are going to tend to 1. But one could ask [inaudible]:
suppose I don't want to wait until infinity, I have only finite time, and I want those
ratios to be bounded by some constant -- 10, 2, 1.5, you name it. Let's fix a constant
like 2, and then we want all these ratios to be at most 2. How long after the cover time do we
need to wait?
So they examined this in some examples. Here's the formal definition: the beta
blanket time is the expected first time when all local times are within a factor beta of
each other. So beta is some fixed number bigger than -- I'm sorry, a fixed number less
than 1. You could take it to be a half.
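One way to write this formally (this is my formalization of what was just said, not a quote from the slide): with N^x_t the number of visits to x by time t, L^x_t = N^x_t / deg(x) the local time, and beta in (0,1) fixed,

```latex
\[
\tau_{\mathrm{bl}}(\beta) \;=\; \max_{v}\;\mathbb{E}_v\Big[\min\big\{\,t \;:\; L^{x}_{t} \,\ge\, \beta\, L^{y}_{t}\ \text{ for all } x,y\in V \,\big\}\Big].
\]
```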
And so the conjecture of Winkler and Zuckerman is that the blanket time is actually
equivalent to the cover time. The constants in this equivalence will
depend on beta, but once you fix beta, then you only have to wait a constant
multiple of the cover time for things to even out. That may be a little surprising.
They proved it for some special cases, actually not that many: the complete graph,
where it's an easy classical thing to analyze, but they also did it for the cycle and for tori. And in
all these cases they found that it holds, and then they made the brave conjecture that it's true
in general and that the constant won't depend on anything except beta.
>>: [inaudible]
Yuval Peres: It would what?
>>: [inaudible]
Yuval Peres: Yeah.
>>: [inaudible]
Yuval Peres: The --
>>: [inaudible]
Yuval Peres: The number of visits --
>>: [inaudible] it becomes like Gaussian, but there's got to be some [inaudible] number of them because --
Yuval Peres: So we have to wait for quite a while. On graphs you have to wait more than time -- certainly more than time n log n. But order n log n
could suffice if the cover time is n log n.
So, well, there is something surprising to this. You have to look at a few examples to
even convince yourself this is plausible. But that's what they did. And it turned out that
the estimates that were given by this paper KKLV that I mentioned before work for
blanket time just as they work for cover time. So since they estimated the cover time up
to log log n squared and the same estimation works for blanket times, then you
conclude -- even if you started with a goal of algorithmic calculation, you got a formula,
and this formula is good for blanket times too. So the arguments work there. So it
follows that the Winkler-Zuckerman conjecture is true up to log log n squared.
Okay. So now I'm going to shift gears and start telling you about our work on this. And
so to do that I need to introduce one more object, the Gaussian free field on a graph.
So this is a Gaussian process. So we have a sequence of Gaussian variables indexed
by the vertices. Gaussian process means all linear combinations are also Gaussian, or
you can think of this as obtained from an independent Gaussian vector by multiplying by
some matrix.
So that's a Gaussian process. This is a special Gaussian process, and it has several
equivalent descriptions. Here is one of them. The process is fixed to be 0 at some
vertex v0; that's an anchor vertex where the process is 0. And then, to know the
process everywhere else, it's enough to know these L2 distances; they determine the
process. And the squared L2 distance between the variables at two vertices x and y is
just the effective resistance between x and y.
Here's an equivalent definition, and I'll give you a couple more orally. A Gaussian
process is completely determined by its covariances, and the covariance between g_x
and g_y is the Green function, or Green kernel, of the random walk killed when it hits
the vertex v0: you start at x and count the expected number of visits to y, normalized
by the degree of y. That's a symmetric, positive definite function, and so there always
exists a Gaussian process with that as its covariance.
Let me tell you more intuitively what this is. This is a two-dimensional picture. In one
dimension -- suppose my graph is a long interval and v0 is one endpoint of the
interval -- the Gaussian free field is just a Gaussian random walk going along the
interval. That's all it is in that case.
More generally, if the graph is a tree and v0 is some fixed vertex, say the root of
the tree, then the Gaussian free field is obtained as follows. You give the edges
i.i.d. normal variables, and then to determine the value at a vertex you just sum these
variables along the path from the root to that vertex. So that's the Gaussian free field
on a tree.
Now, on a general graph, we want to do the same thing. So if we have a graph, we fix
some v0, assign the edges labels which are i.i.d. normal, and then define the value at a
point w to be: take some path to w and sum the values along this path, and that gives
me the value at w. That's a nice choice. But what if I choose a different path? I would
get a different value.
So to define the Gaussian free field, one thing we could do is use this definition but
insist that we get the same value no matter which way we go. In other words,
condition on the event that the sum of these Gaussian variables along every directed
cycle is 0. So you start with i.i.d. Gaussian variables, but then I impose this very strong
condition on them.
This is conditioning on a measure 0 event, but with Gaussians it's easy to do such
conditioning: conditioning is just projection for Gaussians. So you project on the
subspace where the sum along every cycle is 0, and once you've done this
conditioning you can do what you did before on the tree. So these definitions, and
several others, are equivalent definitions of the Gaussian free field, and it's been used
a lot in recent years by Dave [inaudible], [inaudible], and others, especially in the
continuum limit.
So in the example I did on the interval, if you take a limit of that, you'll get Brownian
motion. And in the plane -- on the lattice -- you could take a Gaussian free field and
take a limit of that; you'll get the Gaussian free field in the continuum, which is very
important, but not the topic I'll discuss today.
>>: Is there an easier way to sample from it? Maybe just computing these by the
resistances and [inaudible].
Yuval Peres: That's basically it. I mean, the Green function is kind of an inverse of a
Laplacian, so you can use that.
>>: Okay.
Yuval Peres: So you find Moore-Penrose inverses rather than computing the Green function
explicitly, but then [inaudible].
>>: Isn't it also a stationary measure if you just take the average of the neighbors and
then add a Gaussian to each [inaudible] simultaneously?
Yuval Peres: Right.
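Along the lines of this exchange, a minimal sampling sketch (the graph is a placeholder): the covariance of the field pinned at v0 is the inverse of the Laplacian with the row and column of v0 removed, which is the Green-function covariance described above; the last lines check empirically that E[(g_x - g_y)^2] equals the effective resistance.

```python
# Illustrative sketch (toy graph is a placeholder): sample the Gaussian free field
# pinned at v0.  Its covariance is the inverse of the Laplacian with the row and
# column of v0 removed, i.e. the Green function of the walk killed at v0.
import numpy as np

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
n, v0 = 4, 0
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1
L = np.diag(A.sum(axis=1)) - A
keep = [v for v in range(n) if v != v0]
Sigma = np.linalg.inv(L[np.ix_(keep, keep)])     # covariance of (g_x) for x != v0
C = np.linalg.cholesky(Sigma)

def sample_gff():
    g = np.zeros(n)                               # g_{v0} = 0 by construction
    g[keep] = C @ rng.standard_normal(len(keep))
    return g

# Empirical check: E[(g_x - g_y)^2] should equal the effective resistance R_eff(x, y).
samples = np.array([sample_gff() for _ in range(20000)])
x, y = 1, 3
print(np.mean((samples[:, x] - samples[:, y]) ** 2))
e = np.zeros(n); e[x], e[y] = 1.0, -1.0
print(e @ np.linalg.pinv(L) @ e)                  # R_eff(x, y), for comparison
```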
Okay. So here's our main result, or one form of it, which says that the cover time is
equivalent, up to a constant, to the following: you take the maximum of this Gaussian
free field, take the expectation of that, square it, and multiply by the number of edges.
This is also equivalent to the blanket time, but there the constant depends on beta.
So in particular, already stated this way, the Winkler-Zuckerman conjecture is true: the
blanket times are equivalent to the cover times. But we can't see this directly; we have to
make this detour via the Gaussian process.
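Spelled out, with g the Gaussian free field on G = (V, E) pinned at v0, t_cov the cover time and tau_bl(beta) the beta-blanket time, the statement is (as I transcribe it; c is universal, C_beta depends on beta):

```latex
\[
c\,|E|\,\Big(\mathbb{E}\max_{v\in V} g_v\Big)^{2}
\;\le\;
t_{\mathrm{cov}}(G)
\;\le\;
\tau_{\mathrm{bl}}(\beta)
\;\le\;
C_\beta\,|E|\,\Big(\mathbb{E}\max_{v\in V} g_v\Big)^{2}.
\]
```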
Any question on this statement?
>>: Have you determined the [inaudible] for estimating this?
Yuval Peres: Yes. So this is one randomized quantity. This is another randomized
quantity. So what have we gained? Well, what we've gained is now we're ready to step
on the shoulders of giants who have been analyzing Gaussian processes for 50 years.
So before getting to the intuition, let me give you another interpretation of this which I
like. Given a graph, how can we somehow see the cover time in the graph? Given a
graph, I want to give it a metric. The effective resistances form a metric -- they satisfy
the triangle inequality -- and if you have any metric, the square root of it is also a metric,
by concavity.
So let's endow the graph with this metric, the square root of the effective resistance.
It turns out that with this metric the graph embeds in Hilbert space, in Euclidean space:
you can take the vertices and put them in Euclidean space where the distances are
exactly the square roots of the effective resistances. In fact, the Gaussian free field is
such an embedding, because it gives an L2 metric, and the L2 distance between the
variables at two vertices is the square root of the effective resistance.
So think of actually taking the graph and putting it in space with these distances --
think of this graph as spinning in space. The maximal effective resistance is just the
squared diameter of the graph in this metric, but the cover time is also there, up to a
constant, as follows. Take this graph and project it onto a random line: take a line in a
random direction, project the graph onto it, and you get some random diameter D. Take
the expected diameter, square it, multiply by the number of edges, and you've got the
cover time.
So just by scaling by these constants you get from the diameter of this projection to the
cover time. So this graph, as embedded in Hilbert space, geometrically contains the
cover time. This is just a restatement of the previous theorem, because projecting
something in high-dimensional space onto a random line is essentially the same as taking
its inner product with a Gaussian vector: a Gaussian vector of dimension n has almost
constant norm, so multiplying by a Gaussian vector is the same -- up to a small error -- as
multiplying by a uniform element of the sphere, which is projecting onto a random line.
So it's easy to unravel this and see that it's equivalent to the previous one, but I like this
geometric picture.
>>: Can you get the deterministic algorithm in this way, by computing this embedding,
and then maybe some randomization principles applied to --
Yuval Peres: We don't know any way to do a deterministic [inaudible] this way. So that
would be exciting.
>>: So you can get the embedding? That part is okay?
Yuval Peres: Right.
>>: But this [inaudible].
Yuval Peres: The problem is this expected -- right. Right. This expected diameter in a
random direction. That's the trouble.
>>: [inaudible]
Yuval Peres: Okay. Well, first of all, I just claim that the graph with this metric can be
embedded, and you can see it in various ways. The way I like to think of it is that the
Gaussian free field is itself the embedding: to every vertex I attach its Gaussian
variable and I use the L2 metric, and the definition -- let's go back -- the definition of the
Gaussian free field says that the L2 distance between these two variables is the square
root of the effective resistance.
This is in Hilbert space. Now, it's a Hilbert space with n points, so you can always
think of it as R^n.
>>: So it's possible to go from the distances to some kind of location in a Euclidean
space [inaudible].
Yuval Peres: Yes. Yes, it is possible. If you know the metric and you
know it's Euclidean, then you can write down a matrix and make this more explicit.
Okay. So the key in the theorem is to relate the cover time to the maximum of the
Gaussian process, and so I have to tell you something about maxima of Gaussian
processes.
This is a highly studied subject. We're just going to take centered Gaussian
processes, so mean 0, and given any Gaussian process, it turns out the right way to
analyze it is to first look at the metric in which the distance between two indices is the
L2 distance between the corresponding Gaussian variables. So that's an L2 metric.
And then, in terms of that metric, the classical problem is to estimate the expected maximum.
Actually, the way this occurred classically is that people were really interested in processes
in the continuum and wanted to know: is the process continuous, is it bounded? But in
order to do that they took nets and tried to estimate the maximum on the net, and
that's how this problem arose.
So in a way this goes back to [inaudible] and [inaudible]'s proofs of continuity of Brownian
motion, this type of question, in the 1920s.
I'll tell you a little bit about the history of this, but let me jump to the answer. The final
answer was obtained in '87 by Talagrand, following earlier work of Fernique: the
expected maximum can be estimated by a quantity that is a deterministic function of the
metric space, given by a formula. The explicit form involves a quantification over all
partitions, or sequences of partitions, of the space -- certainly an exponential
quantification. But once you have a [inaudible] formula you can play with it. And in fact,
we do find a polynomial time algorithm to calculate this Talagrand gamma 2. I'll give you
the definition of gamma 2 later, but the key thing right now is that there is this formula.
So this formula was first suggested by Fernique, who gave an upper bound, and by far
the harder part was to prove the lower bound, which was done by Talagrand in '87.
So in view of this theorem, our cover time result can be stated as saying that the cover
time is the number of edges times this quantity gamma 2 of Fernique and Talagrand,
applied to the vertex set with the square-root-of-effective-resistance metric, squared.
And there is a deterministic polynomial time algorithm, which we construct in the paper,
to approximate gamma 2. So this answers the Aldous-Fill question.
>>: I don't know if this question makes sense, but is your [inaudible] for a gamma 2 for
any Gaussian process or just ones that arise in this way? Or does every Gaussian
process arise in this way?
Yuval Peres: Not every Gaussian process arises in this way. I think -- what we have is
intermediate. It's more general than this case, but it doesn't cover the completely
general case.
In fact, this work really started from some earlier work that Jian and I did together with
Asaf Nachmias and Martin Barlow at UBC, and in retrospect it struck us that this
connection should have been spotted before.
Let me give you some examples of the parallel between a Gaussian process and its
maximum on the one hand, and hitting times, local times, and cover times on the other
hand. But remember that the parallel is really between the cover time and the squared
maximum of a Gaussian process.
So what are the obvious lower bounds? The cover time is bigger than the maximal hitting
time, and, well, the expected maximum is bigger than the maximum of the expectations.
So what about upper bounds?
So I told you about Matthews' bound, which, if I don't worry about the constant, is really
immediate, and even if I do, it's pretty easy. And if I want to estimate the maximum of a
Gaussian process, I can just use the union bound on the tails of the variables, and if you
use the union bound, you lose a factor of root log n. The Gaussian tail is e to the minus
x squared (over 2), and the inverse of that is a root log. So just the union bound, for a
bunch of Gaussians, to estimate the probability that any of them is bigger than something,
will give you a root log n, or, in terms of the maximum squared, a factor of log n.
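In symbols, the union-bound computation: for n centered Gaussians g_1, ..., g_n with standard deviations at most sigma,

```latex
\[
\mathbb{E}\max_{i\le n} g_i \;\le\; \sigma\sqrt{2\log n},
\qquad\text{hence}\qquad
\Big(\mathbb{E}\max_{i\le n} g_i\Big)^{2} \;\le\; 2\,\sigma^{2}\log n .
\]
```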
Now, I told you about Matthews' lower bound, which lower bounds the cover time in
terms of hitting times within some subset times the log of the size of the subset. Here's
an important inequality for Gaussian processes which says that the maximum squared of
a Gaussian process can be bounded below by the squared distances between the variables
times the log of the size of the set I'm considering. So that's another parallel.
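The inequality referred to here is Sudakov's minoration; in one standard form, if t_1, ..., t_k are indices with pairwise distances at least epsilon, then for a universal constant c > 0,

```latex
\[
\mathbb{E}\max_{i\le k} g_{t_i} \;\ge\; c\,\varepsilon\,\sqrt{\log k},
\qquad\text{i.e.}\qquad
\Big(\mathbb{E}\max_{i\le k} g_{t_i}\Big)^{2} \;\ge\; c^{2}\,\varepsilon^{2}\,\log k .
\]
```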
Let me go further. There is classical Gaussian concentration, and there is a very similar
[inaudible], which I'll state exactly, for concentration of local times, proved by Jeff
Kahn and his collaborators. And then there's an integral to bound the maximum
of a Gaussian process, introduced by Dudley in '67, and there was a related sum, or
integral, in this recent paper with Asaf Nachmias and Martin Barlow,
and this is actually what rang the bell for us and made us see that these two are parallel.
And then we thought, well, in the study of cover times we are where the Gaussian folks
were in '67, so luckily we can use a time machine and ask where they were 20 years
later. Let's try and catch up.
So I'll give these parallels in a minute. Again, Sudakov minoration is what I told you
just a moment ago, and that's the parallel of Matthews' lower bound.
This is just concentration for Gaussian variables: the probability that a difference is bigger than
lambda is bounded by e to the minus lambda squared, normalized by
the variance, and the variance is this squared distance.
Here the corresponding concentration result was proved by KKLV, and, again, this fact is fairly
easy: the probability that the local time at one vertex and the local time at another
vertex differ by alpha can be bounded by e to the minus alpha squared over -- instead of the
squared distance, here you have the effective resistance times this local time.
Now, in this story L is the local time at one vertex, say u. We run the walk
until u has accumulated that local time; so we actually
run to a random time, capital T, which is the global time needed to accumulate local time
L at the vertex u.
And then the probability that these local times differ by more than alpha can be bounded this way.
Just to explain the philosophy here: if I have two vertices u and v, and I'm measuring
local time at u, I look at the excursions from u -- I go from u and wait until I return to u.
The number of visits to v in each such excursion is a random variable, and these random
variables are all independent, because in different excursions I'm adding i.i.d. things. And
the distribution is very simple: starting from u, to visit v I first have to hit v, and the
probability of that is essentially governed by the effective conductance between u and v;
and then, if I do hit v, I make a geometric number of visits to v before coming back to
u. So this is just a large deviation inequality for sums of essentially geometric random
variables, or geometric times [inaudible], and that's what's proved there and then used
to prove their bounds. So you see the parallels.
Finally, here is what kind of rang the bell for us. This is the Dudley 1967 integral. Given
any metric space, an important quantity is its metric entropy: the covering
number is the minimal number of balls of radius epsilon you need to cover the set. You
can also ask for the maximum number of disjoint balls in the set, and if you're
only interested in logarithmic asymptotics, these two things are essentially the same.
So anyway, that's for any metric space. And his bound was: given a Gaussian process,
use the metric that's adapted to the process and then integrate the square root of the log of
the covering number over all scales. It may look scary the first time you see it, but it's actually a very
easy bound to prove, and it's also sharp in many cases -- in all symmetric cases.
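For reference, Dudley's 1967 bound in the notation just introduced, with N(T, d, epsilon) the epsilon-covering number of the index set T in the intrinsic metric d and C a universal constant:

```latex
\[
\mathbb{E}\sup_{t\in T} g_t \;\le\; C\int_{0}^{\infty}\sqrt{\log N(T,d,\varepsilon)}\;d\varepsilon .
\]
```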
In the work I mentioned, B, D, and P, we were actually interested in specific graphs:
the cover time of nearly critical random graphs, which are kind of delicate. So in order
to prove the estimates there, we had to write down a pretty good general estimate, and
after we did, we looked at it again and recognized that it was very close to Dudley's,
and that's what brought the connection to mind. And then we said, well, Dudley's is a
good upper bound, but in the Gaussian setting the sharp answer is known -- maybe the
sharp upper bound from there fits our problem.
Okay. And this is a reminder. This is the sharp result by -- again, for the life of us, we
cannot find a picture of Fernique so you just have to see Talagrand twice.
So that's his theorem that gives the exact estimate, and here is the definition of the
gamma 2 that appears in it. It's rather elaborate if you've never seen it before. You
look at a sequence of partitions, each one a refinement of the previous, with sizes
growing doubly exponentially, 2 to the 2 to the k. Then, given a point x, you look at the
diameter of the partition element A_k(x) that contains x and add those up with the
weights 2 to the k over 2. By the way, it's hard to parse where this comes from, but 2
to the k over 2 is the square root of the log of the size here; that's what's coming in
here.
And so you take sup over all points and inf over partitions. So the way this is defined, it
doesn't look algorithmic at all, and there's quite a bit of effort to prove that this can be
calculated in polynomial time. I won't go into that now.
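In symbols, the definition just described (one of several equivalent standard forms): take partitions A_0, A_1, A_2, ... of T, each refining the previous one, with |A_0| = 1 and |A_k| at most 2^(2^k), write A_k(x) for the element containing x, and set

```latex
\[
\gamma_2(T,d) \;=\; \inf_{\{\mathcal{A}_k\}}\;\sup_{x\in T}\;\sum_{k\ge 0} 2^{k/2}\,\operatorname{diam}\big(A_k(x)\big),
\]
```

so that Talagrand's theorem reads: the expected supremum of the process is comparable to gamma_2(T, d), with universal constants.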
All right. So I'll give you the rough reason these kinds of estimates arise. This is the
method of chaining. If you have the maximum of independent standard Gaussians, you
expect the union bound to be tight: for the maximum over k points you expect order
square root of log k. And, indeed, that's easy to prove in the independent case; in the
independent case all these points are far apart.
More generally, we can prove something like that for processes that are nearly
independent, by combining Gaussian concentration and Sudakov minoration.
Now, once the process is not independent, a union bound is too crude. So
the idea, which really goes back to Kolmogorov, is to use chaining. Suppose the
geometry of my space looks like this, and we want to bound the Gaussian process.
Remember, the distance between two points is the standard deviation of the difference
of the corresponding Gaussians.
So what you do is first bound the process on these gray points using a union bound;
those aren't that many. And then say that at a red point the process is not too far
from the corresponding gray point. So this is two stages of chaining: you bound
on the gray points using a union bound, and there aren't too many of them; and then
for the red ones, if you just look at their distance from your base point, that's
too big and there are too many of them, but if you compare each one to the
corresponding gray one, well, we have lots of red points, but now each is tightly
coupled to its gray one.
And you just have to iterate this again and again. So two levels of chaining in general
won't be enough. You have to do a growing number of levels of chaining.
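A two-level version of the chaining step in symbols, with N the set of gray points and pi(t) the gray point assigned to t:

```latex
\[
\max_{t\in T} g_t \;\le\; \max_{s\in N} g_s \;+\; \max_{t\in T}\big(g_t - g_{\pi(t)}\big),
\]
```

and each term is handled by a union bound: the first costs roughly the square root of log|N| times the diameter, the second roughly the square root of log|T| times the small distances d(t, pi(t)). Iterating over a growing number of scales produces exactly the weighted sums appearing in gamma 2.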
Now, how do we technically connect the Gaussian world with the random walks? Well,
if all you're interested in is upper bounds for the cover time, you can do it directly using this
KKLV concentration I told you about, which is the analog of the Gaussian concentration:
just repeating Talagrand's argument for his upper bound immediately gives you the upper
bound here.
But that method, A, won't be as sharp as the one I'll tell you about, and, B, it doesn't
help with lower bounds. With Gaussians, a key thing that allowed Talagrand to
prove lower bounds -- well, there are several key things, but one is that you can
understand very well what happens under conditioning. If you have a Gaussian
process and you've already conditioned on a bunch of the variables, you understand easily
the distribution of the remaining ones.
Now, with local times, if I know the local time at one vertex, I can still understand pretty
well the distribution of the others, because I can just think of excursions from that vertex.
But once I start conditioning on local times at 17 different vertices, I'm in trouble,
because of the combinatorial intricacy of in what order I visited those vertices; computing
the effect of that on other vertices is hard.
So the key is to make a more direct connection from local times to Gaussian processes.
The first such connection is due to Ray and Knight in the '60s, and then Dynkin in the
'80s -- there was some intermediate related work, but Dynkin realized the general
connection -- and this was continuously refined. And this is the best theorem we know,
due to Eisenbaum, Kaspi, Marcus, Rosen, and Shi; that and Talagrand's theorem are
the most important technical tools.
So here's the statement. Fix a vertex and look at the local time L at that vertex: we're
going to run the chain until we accumulate local time L at that vertex, and capital T is
the time we run it to.
Let g be the Gaussian free field on the graph. Then we have the following identity in
distribution -- this is not an estimate, it's an identity in law of two things. The
right-hand side is very simple: we take this constant L that we fixed, look at g_x minus
root 2L, square it, and divide by 2. So this is essentially the Gaussian free field: you
shift it by a constant and square it. The right-hand side is pretty benign.
Now, the left-hand side is a convolution, a sum of two independent terms: on the
left-hand side this Gaussian is independent of this local time. So you take the Gaussian
free field, square it (and halve it), and add the local times of an independently run
random walk.
So the Gaussian free field is connected to the walk in the sense that the metric defining
it is this square root of effective resistance metric, but note
that the Gaussians here are not affected by this L, the local time we run to. So if
you run for a very long time, then this is going to grow, and, of course, this is going to
grow, but these Gaussians are not going to change.
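In symbols, as I read the statement (the sign in front of g is immaterial, since the field is symmetric): the walk starts at the fixed vertex v0, tau(L) is the first time its local time at v0 reaches L, the L^x_{tau(L)} are the local times at that moment, and g is the Gaussian free field pinned at v0, independent of the walk. Then, jointly in x,

```latex
\[
\Big\{\, L^{x}_{\tau(L)} \;+\; \tfrac12\, g_x^{2} \,\Big\}_{x\in V}
\;\overset{d}{=}\;
\Big\{\, \tfrac12\,\big(g_x - \sqrt{2L}\,\big)^{2} \,\Big\}_{x\in V}.
\]
```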
This is kind of an amazing identity -- yes?
>>: I just got a little bit confused. So this Ray-Knight theorem was actually about, like,
local times of random walks --
Yuval Peres: No. Ray-Knight was about local times of Brownian motion. Ray-Knight
said that if you look at local times of Brownian motion, and you stop the Brownian
motion properly, you can write them as the square of another Gaussian process.
>>: [inaudible]
Yuval Peres: So that was -- Ray-Knight was on Brownian motion.
Now, Dynkin's theorem was for general Markov processes. And the reason people
were interested in this is mostly because they were studying more and more general
processes and more and more general spaces. They were not interested in random
walks on graphs. And they wanted continuity criteria for local times on the general
spaces.
So that was the impetus. But to get the sharp conditions, they really needed very
precise estimates. It turns out that these estimates are very useful in our
problems, which involve random walks on finite graphs.
So this is a key identity. Let me quickly sketch at least some things you can see from it.
Say you want to bound blanket times or cover times from above. What you need to show
is that if the time is much bigger than the number of edges times the expected maximum
squared, then the local times are about uniform.
Rescaling between global time and local times, what you need to show is that if the
local time at the fixed vertex is much bigger than the expected maximum squared, then
the local times are approximately uniform. By scaling, that's basically what we need to
show.
And, remember, we have this identity. Even the people -- Dynkin, Rosen, and others --
who discovered and used these identities (there's a book by Marcus and Rosen basically
centered around this connection) say themselves that they don't really understand why it
works. Basically, the proof of the equality is by calculating joint [inaudible] transforms of
both sides; they turn out to be big determinants which can be identified.
And there are other proofs, too, which are even more intricate. But there's no intuitive
understanding of why this holds.
Okay. So we have this. Now let's just open the square on the right-hand side, so we
can write it this way. And now I'm going to copy this, but using bigger type for the
bigger quantities.
The idea is that we want to run to a time which is large, so this local time L will
be large, and then these local times will also be large. So this is the same
formula you saw before, but emphasizing the fact that we're running to a local time L that's
much bigger than the maximum squared of this Gaussian process.
So this Gaussian process becomes like noise, and we have this identity. L here is a
constant -- the local time at the one vertex we fixed -- while here is the local time field at all
the vertices, which is what we're trying to control. What this identity is telling us is
that this local time field is approximately constant, which is what we want: we want
uniformity. So we get this upper bound quite cheaply from the identity.
So here's the same picture. Here we have this constant local time and we add to it this
noise; here we have this unknown field and we add to it this noise. These two things have
the same law, so this field must be approximately constant.
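A back-of-the-envelope version of this step: expanding the square, the identity reads, jointly in x,

```latex
\[
L^{x}_{\tau(L)} \;+\; \tfrac12\, g_x^{2}
\;\overset{d}{=}\;
L \;-\; \sqrt{2L}\, g_x \;+\; \tfrac12\, g_x^{2},
\]
```

so when L is much larger than the square of the expected maximum of |g|, the terms involving g are of order the square root of L times the maximum of |g|, which is much smaller than L, and the local time field is forced to be close to the constant L, uniformly in x.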
The lower bound on the blanket time follows a similar idea, and then it gets stuck, so I'll
quickly tell you about that -- maybe I'll jump to the picture in a moment.
We start again with this identity, and just throw away these squares. So you have a
bound: the square root of the local time can be bounded by root 2L minus g_x. Now I'm
going to write that down and compare: this is the square root of the local time, and this
is root 2L minus -- you know, compared to g_x.
And now we use the fact that the maximum of the Gaussian is concentrated, so it's close
to its expectation. So --
>>: Finish in five minutes.
Yuval Peres: So it's close to its expectation, so this difference is relatively small. And so
there must be some vertex where the square root of the local time is of smaller order than
this height, and this height is the square root of, well, twice the local time at some
other vertex. Because of that, we can pretty easily lower bound the blanket time.
But lower bounding the cover time is much harder, because that involves finding a place
where the local time actually is 0.
So in the last three minutes let me sketch the issue there. Three quarters of the technical
work starts here, but I won't be able to do it: how to exhibit a vertex where the local time
is actually 0. These types of estimates you just can't get from the isomorphism theorem
directly.
The observation is that the local time can't be positive and
extremely small: if you actually visit a vertex, and let's say it's far away from
your base vertex, then you're going to visit it several times before you come
back to your base vertex. So, given that it's positive, you can bound it below, up to an
exponentially small exceptional probability.
So the idea: in order to give a lower bound for the cover time, we take this time, because
we want to show it is less than the cover time, and we want to find a local time that's 0 there.
What are we going to do? We're going to find a local time that's very small and then argue
that if it's very small, it must actually be 0.
So how do we find a local time that's very small? Before, we used the fact that the maximum
of the Gaussian is close to its expectation, but that's not close enough for us. The
key is to put a threshold that's much lower -- I drew it here at half the height, but
actually you need, say, one millionth of the height; epsilon is some constant. If you look
at mountains that are not the tallest, but much lower than that, then there are lots of
such mountains, so you can find one that's close to your target.
And here's another picture. If you think of the levels of the Gaussian particles as
competing, this is the winner, and it's pretty near its expectation. But if we go below,
then, as the winner pulls ahead, there are lots of stragglers, and we can find some very
close to a precise target.
Actually, doing that involves ideas from percolation on trees, which I don't have time to
go into. The paper is on the arXiv for those interested. We have to use both
specific properties of this Gaussian process, and electrical network theory, and ideas
from percolation on trees to find this kind of point.
So I'm going to end the sketch of the proof here and finish with a couple of open
questions -- really one main open question. What about more precise asymptotics for the
cover time?
These are available in not too many examples. Here is one kind of striking
connection. If you look at the Gaussian free field on a 2-D lattice, the expected sup was
calculated by Bolthausen et al. to be logarithmic, with constant the square root of 1 over 2 pi.
On the other hand, in the paper with Dembo, Rosen, and Zeitouni, we showed that for the
2-D torus the cover time is asymptotic to 1 over pi times n (log n) squared. So the relation
that I mentioned before as holding up to a constant is, in this example, actually an asymptotic
equality: the ratio of these two things is 1 plus little o of 1.
Well, is this general? If I want to talk about asymptotics, I want the cover time to be
concentrated. And here, luckily, a theorem of David Aldous comes to help. He proved
that for any sequence of graphs where the hitting times are negligible compared to the
cover times, the cover times are concentrated: the cover time, seen as a
random variable and divided by its mean, goes to 1 in probability.
So let's only look at sequences of graphs like that, where the cover time is concentrated.
Then there's more meaning to estimating the constant. And we don't have a
counterexample to this relation holding asymptotically. On the other hand, the examples
where we can really calculate both sides are limited, so it's not a lot of evidence. But actually,
using the isomorphism theorem, we can prove it in one direction.
So I said this is the open question. It holds in a few examples. Probably it can be
checked whether it holds for general trees; we didn't do that.
Let me -- you know, ignore this slide, but just look at this squared quantity. We do know
that the cover time is bounded by the right-hand side up to 1 plus little o of 1. So in this
asymptotic conjecture, one side is true and follows from the isomorphism theorem; I
won't belabor the argument at this hour. But let me end with two
questions. Is there a deterministic polynomial-time 1 plus epsilon approximation to the
cover time?
We don't know that. And even if that asymptotic relation is true, it doesn't solve this, because
we don't know a 1 plus epsilon approximation to the right-hand side: Talagrand's theorem,
or our version of it, only gives you an approximation up to a constant.
So these two challenges are both -- are related but are different. Is there a 1 plus
epsilon approximation?
Another question which is related to Aldous's result is can you make the concentration
in David's result more explicit? Is the standard deviation of the time to cover, the cover
time seen as a random variable, is that actually bounded by the maximum hitting time?
That would correspond to things we know about Gaussian processes, but the
connection we have between Gaussians and local times is not strong enough to control
fluctuations, so this is open.
Thanks for your attention.
[applause]
Yes?
>>: So you may have answered this in a language that I don't recognize, but to what
extent can this be applied to [inaudible] compact manifolds and [inaudible].
Yuval Peres: I expect it can be applied, but this hasn't been done. All the tools, the
isomorphism theorem and so on, apply in that setting, but it hasn't been done. The way
the cover time has been estimated exactly, say for squares in Z2, was first
to look at Brownian motion on the torus and then use approximation.
Sometimes the Brownian questions can be easier because of additional symmetry.
>>: That Talagrand gamma, you calculated it exactly or --
Yuval Peres: No, no, approximately. But even Talagrand's gamma is really only defined
up to constants. If you look in different papers of Talagrand, each time gamma is
different -- or on different pages of one paper [laughter] -- because there is a definition using
packing trees, using covering trees; there are many different definitions. They're not
the same, but they're all equivalent up to universal constants. And the algorithm that we
have, again, doesn't calculate any of these gammas at all; it's yet another quantity
that's equivalent to them up to a constant. So it's really loosely defined.
The only things that are kind of rigid are the cover time and the maximum of a Gaussian
process.
>>: I know you have this time machine [inaudible].
Yuval Peres: Yes [laughter]. But the time machine is restricted to the cover time versus
Gaussian connection so far.
>>: [inaudible]
Yuval Peres: What?
>>: [inaudible]
Yuval Peres: Okay. Thanks.
[applause]