>> David Wilson: Good afternoon, everyone. We're especially delighted to welcome this year's Birnbaum Speaker. Professor Varadhan has transformed quite a number of areas in probability; to give just a small sample, there is the Varadhan theory of local times. More generally, he has worked on large deviations, which we'll hear something about today, on hydrodynamic limits, and one of my favorites is the introduction of the self-intersection local time. I could go on, but instead let's have the speaker tell us about the role of compactness in large deviations.

>> Srinivasa Varadhan: Thank you very much. It's a pleasure to be here. There are many problems in analysis where we use compactness of the space in a routine manner. Sometimes it's not there, and when it's not there we have to work harder, and the hard work takes different forms. So I'll look at a couple of examples of various types, things that we do to avoid the issue, and then I'll give one example that leads to some interesting calculation. In large deviation theory, one of the things we want to do is to analyze the asymptotic behavior of certain integrals. These are integrals of the exponential of a function, with a large parameter N multiplying it, against certain familiar probability measures P_N. Then you want to look at the tilted measure: you throw in the density exp(N F(x)) with respect to P_N and normalize by the expectation, so that Q_N is now a probability distribution, which somehow tries to maximize F. Now P_N itself has its own behavior. It's not uniform; it decays exponentially, locally, at different rates at different points. So there is a fight going on between the function F, which wants to be large, and the cost that has to be paid for picking a point x, which is I(x). This fight has to be resolved, and one can conclude that Z_N is asymptotically like the exponential of N times the supremum of F minus I, which is one of the results of large deviation theory. It is not necessarily easy to prove. As for the statement that dP_N decays like exp(-N I(x)), you have to give a meaning to it, because the left-hand side is a measure defined on sets and the right-hand side is a function defined at points. So that has to have an interpretation. And one of the things we want to prove with that interpretation is that the measure Q_N concentrates at a single point a, provided the variational problem has a unique solution: if at some point a the value of F minus I attains its maximum, strictly larger than the other values, then we want to show that the measure Q_N concentrates at the point a. So what do we mean by the expression that dP_N is like exp(-N I(x))? Well, we are interested in log P_N: we want log P_N(A), normalized by 1 over N, to behave like minus the infimum of I(x) over A. This cannot be true for all sets, because if we look at a single-point set, I could be finite at that point while P_N of a single point is usually 0. So this has to be interpreted carefully, and the interpretation is: you get a lower bound for open sets and an upper bound for closed sets, just like in the limit theorems of probability. For open sets it's easy, because usually the results are obtained for small balls by some tilting argument: you make the law of large numbers for the tilted measure localize at the point x, you figure out the cost of the tilting, and that gives you the exponential lower bound.
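For concreteness, here is the setup just described, written out in symbols; the notation (Z_N, Q_N, I) is chosen here for definiteness and is not fixed in the talk:

\[ Z_N = \int_X e^{N F(x)}\, dP_N(x), \qquad dQ_N = \frac{e^{N F(x)}}{Z_N}\, dP_N(x), \qquad \frac{1}{N}\log Z_N \;\longrightarrow\; \sup_{x\in X}\bigl[F(x) - I(x)\bigr], \]

and the precise meaning of "dP_N decays like e^{-N I(x)}" is the pair of bounds

\[ \liminf_{N\to\infty}\frac{1}{N}\log P_N(G) \ge -\inf_{x\in G} I(x) \quad (G \text{ open}), \qquad \limsup_{N\to\infty}\frac{1}{N}\log P_N(C) \le -\inf_{x\in C} I(x) \quad (C \text{ closed}). \]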
The lower bound you get is for a small ball; for an open set you can just optimize the location of the small ball, and then you get the bound for open sets. The upper bound is for closed sets, and how do you obtain the upper bound? In probability, what you do is use Chebyshev's inequality: the probability that some F is bigger than a level l is bounded by e to the minus N l times the expectation of the exponential of N times F, a Chernoff-type bound. Then, if F is a nice function, you can find a ball around some x of radius epsilon contained in the half space where F is at most l; that gives you an upper bound for a small ball, and the upper bound for the small ball is basically e to the minus N times F(x) times the expected value of the exponential. The expected value of the exponential grows at most like the exponential of N times some constant which depends on F. So in the end the best possible upper bound you can get locally is the decay rate coming from the best Chebyshev inequality you can dream of. But this is only a local bound for a very tiny ball, and you cannot control the size of the ball, because it's a soft argument in some sense; it depends on where the [indiscernible] is obtained, what kind of bad F you have. Then what do you do? You use the inequality for the probability of a union of two sets: you can bound it by the sum, and the sum by twice the maximum. The beauty of large deviations is that when you take the log and divide by N, the 2 goes away. Once the 2 goes away, you can cover a compact set with a finite number of these small balls, so immediately you get an upper bound for compact sets: a finite covering. So you can estimate the probabilities in question for compact sets, and the rate of decay will be okay. As soon as the space is compact, we're done; the problem is solved. The problem is that often these spaces are not compact, and you need to do something else. What are the things you could do? Well, you can find a big compact set and show that what happens outside it is not relevant. Now, in probability theory, in order to show that something is not relevant, you have tightness estimates to ensure that the probability of the complement can be made arbitrarily small. Here we are interested in large deviations at rate N, so we have to show that outside the compact set the probability decays faster than any exponential you can think of; basically the constant in the decay rate has to be very large outside a compact set. You can sometimes do that, and those are called exponential tightness estimates. Or you can compactify the space; any space can be compactified, that is point-set topology. But then there are some problems, because once you compactify the space it's not clear whether your function extends continuously to the compactification, and you have to get your local upper bound at all the points that were added in the compactification. So you have to work with a concrete compactification, not an abstract one, in order to do it. Or you can modify the problem: if you're interested in estimating an integral, you could simply say that this integral is less than or equal to some other integral somewhere else, which happens to live on a compact space, and estimate that instead. All of this has been tried, and I will illustrate it just for Brownian motion and its local time; in different contexts you can do different things.
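A short sketch of the two steps just described, in symbols (standard Chernoff and union-bound manipulations; the constant c_F is the one the talk alludes to, depending on F):

\[ P_N(F \ge l) \le e^{-Nl}\,\mathbb{E}^{P_N}\!\bigl[e^{NF}\bigr] \le e^{-N(l - c_F)}, \]

which gives the local upper bound on a small ball B(x,\varepsilon) \subset \{F \le l\}; and for the covering step,

\[ \frac{1}{N}\log P_N(A\cup B) \le \frac{\log 2}{N} + \max\Bigl(\frac{1}{N}\log P_N(A),\ \frac{1}{N}\log P_N(B)\Bigr), \]

so finitely many small balls covering a compact set give the upper bound there. Exponential tightness then asks for compact sets K_L with

\[ \limsup_{N\to\infty}\frac{1}{N}\log P_N\bigl(K_L^{\,c}\bigr) \le -L \quad \text{for every } L. \]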
The Brownian motion local time here is just a random measure: L_T(A) is the amount of time the Brownian path spends in the set A, and the space of measures is the ambient space, because you can think of L_T, suitably normalized, as a random probability measure. So you have a family of probability distributions on the space of measures on R^d, and this is where you want to do large deviations. What the rate function is locally is no mystery: it's a reversible process, and by a standard computation the local rate function is the Dirichlet form of the square root of the density f. So the question really is how to handle the noncompactness. The first attempt is to estimate: for example, using the [indiscernible] formula for integrals like this, for potentials that go to infinity as x goes to infinity. Then the Brownian path does not want to go too far, because of the potential. So it's clear that in this problem you can handle the noncompactness, because the noncompact part just doesn't count, and the asymptotic rate is just the supremum of the integral; we have the supremum because of the minus I and so on [indiscernible]. But you can also compactify the space. The easiest compactification you can think of is the one-point compactification: add the point at infinity, and the space is compact. The rate function for the delta measure at infinity is 0, because the Brownian path is happy to wander off to infinity; a neighborhood of infinity really means most of the mass is very far away, so it's clear that the rate function at that point is zero. Now everything is compactified, but the price you pay is that you have to assume the potential V goes to 0, or has a limit, as x goes to infinity; then you can subtract the limit and it's basically going to 0. So the class of functions you can handle gets reduced considerably because of the compactification used. Another thing you can do is replace R^d by the torus. You can project the Brownian path that's wandering in R^d to a Brownian path on the torus, and the functional you're trying to estimate is larger on the torus for the projected path than for the original path, so if you estimate it on the torus you get a good estimate on the whole space; the torus is compact, so the estimates work there. So you can do the problem for a large torus and then let the size of the torus go to infinity; there is no trouble interchanging the two limits, because you actually have domination in one of them. And that has been tried; that was the approach to the Wiener sausage problem, because if you project the sausage, the sausage on the torus has smaller volume than the sausage on R^d. So for problems like determining the range of a random walk and things like that, projecting onto the torus is a good method of compactification, because the projection gives you the estimates. Now the problem I want to look at is the following functional of the local time: in d dimensions, you look at the double integral of the potential 1 over |x - y| against the local time in both variables, and you want to analyze this. Here what works is replacing Brownian motion by the [indiscernible] process, because the [indiscernible] process is strongly recurrent and you can get super-exponential estimates for what happens outside a compact set; and because the function 1 over |x| is positive definite, the integral is larger for the [indiscernible] process than it is for Brownian motion. So comparison works here. But it's not quite satisfactory, for the following reason.
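In symbols, the objects in this example look roughly as follows; this is a hedged reconstruction in the standard normalization for this polaron-type functional, with the coupling constant set to 1, and it may differ from the exact expression on the speaker's slides:

\[ L_T(A) = \int_0^T \mathbf{1}_A\bigl(\beta(s)\bigr)\,ds, \qquad I(f) = \frac{1}{2}\int_{\mathbb{R}^d}\bigl|\nabla\sqrt{f}\bigr|^2\,dx = \frac{1}{8}\int_{\mathbb{R}^d}\frac{|\nabla f|^2}{f}\,dx, \]

\[ Z_T = \mathbb{E}\Bigl[\exp\Bigl(\frac{1}{T}\int_0^T\!\!\int_0^T \frac{ds\,dt}{|\beta(s)-\beta(t)|}\Bigr)\Bigr] = \mathbb{E}\Bigl[\exp\Bigl(T\iint \frac{\mu_T(dx)\,\mu_T(dy)}{|x-y|}\Bigr)\Bigr], \qquad \mu_T = \frac{L_T}{T}. \]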
When you modify problems like this, using comparison methods and so on, it is good enough for estimating the partition function Z_T and how it grows. But if you want to prove that the measure Q_T defined by this tilting, the new measure, actually concentrates... I think I made a mistake: there's no 1 over T here, because I used the local time already; that should be a T, clearly. Okay. You cannot analyze the concentration by these modifications, because then a double limit is involved and it's hard to control. So what I want to describe now is joint work with Mukherjee, a postdoc in Munich. Let's look at this local time; we have a functional of the local time that we want to analyze. But this functional is translation invariant: if you translate the measure mu, it doesn't change. There are examples of this: our function, which is a function of two variables, or more generally functions of k variables which are invariant under rigid translations, which you then integrate with respect to the k-fold product of mu with itself. These functionals are invariant with respect to translations, so they have this property. And let's suppose these functions are tame, in the sense that whenever the separation between a pair of points goes to infinity, the function vanishes. So they're essentially supported on configurations x_1 through x_k which are tightly bound together; as the points drift apart, the function vanishes. The function 1 over |x - y| has that property. And then we want to analyze the expectation of something like this. So now we have to go to the quotient space, because the problem has this translation invariance, and the natural space on which to do large deviations is the quotient space of orbits of measures under translation. This space is still not compact. So the question is: what is the compactification of this space? You can't get away with the one-point compactification here, because then all the orbits collapse; at infinity everything meets in that compactification. So you can't use that; you have to have some other kind of compactification, and the question is how to compactify. The one-point compactification is not suitable; it's not translation invariant. So let's take a function F which is translation invariant, continuous, and has this property of decaying when the mutual distances go to infinity. Look at all such functions, not just for one k but for all the various values of k, functions of several variables for every k, and consider the linear functional Lambda(F, mu), which is the integral of this function against the k-fold product of mu. Now just an elementary bit of point-set topology. These quantities, for each F, are bounded, and these spaces F_k are all nice separable spaces, because a function of this type is really a function of k minus 1 variables, the differences, vanishing at infinity. So each is a nice separable Banach space; a countable number of separable Banach spaces. Take a countable dense subset that works for all of them, and then by a diagonalization procedure I can choose a subsequence along which all these functionals converge. So a countable set of test functions is enough. One more thing that is interesting to note: F_{k-1} is in some sense a limit of F_k. Because if I take a function of k minus 1 variables, I can view it as a function of k variables by multiplying by a function phi of x_1 minus x_k. This function phi has to vanish at infinity for the product to be in our class, but I can take the limit as phi goes to 1.
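In symbols, the class of functionals just described (a hedged transcription; F_k denotes the translation-invariant continuous functions of k points that vanish as the points separate):

\[ \Lambda(F,\mu) = \int_{(\mathbb{R}^d)^k} F(x_1,\dots,x_k)\,\mu(dx_1)\cdots\mu(dx_k), \qquad F(x_1+a,\dots,x_k+a) = F(x_1,\dots,x_k)\ \ \text{for all } a\in\mathbb{R}^d, \]

with the tameness condition that F(x_1,\dots,x_k)\to 0 as \max_{i,j}|x_i-x_j|\to\infty. The embedding of F_{k-1} into limits from F_k is via

\[ \Lambda\bigl(F(x_1,\dots,x_{k-1})\,\varphi(x_1-x_k),\,\mu\bigr) \;\longrightarrow\; \mu(\mathbb{R}^d)\,\Lambda(F,\mu) \qquad \text{as } \varphi \uparrow 1, \]

by bounded convergence, which is the point made at the start of the next paragraph.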
If you take the limit as phi goes to 1, it's not a uniform limit, but it is a pointwise bounded limit, and that's okay for integrals, because you can use the bounded convergence theorem. So the integral of F against the k-fold product measure will converge to the total mass of the space times the (k-1)-dimensional integral; the (k-1)-dimensional integrals are recoverable from the k-dimensional ones by this limiting procedure. So now what I want to do is choose a subsequence along which all these limits exist; that can be done. Then I have compactified the space in principle, but I have to know what this limiting functional Lambda(F) is. The answer is very simple. There is another way of thinking about it: you define a metric on probability measures. Given mu_1 and mu_2, you look at Lambda(F_j, mu_1), which is a k-dimensional integral of F_j, where k depends on j because the functions in our countable sequence live in different dimensions; you compute the integrals, take the difference, and then it's the usual trick you do for countable products of metric spaces: you put some weights in front so that the series converges, nothing special about it. And you try to complete this metric, which is really asking that anything for which each functional converges should have some limiting object. One way of choosing the weights is the usual choice. So here's the answer. The limit points consist of countable collections of orbits mu-tilde together with their total masses; note that the total mass doesn't depend on which measure you choose in the orbit. These are sub-probability measures: each orbit carries some mass, and when you add them all up you get something less than or equal to 1. That's the limit object. And the representation of the limiting functional is: you compute the same integral as before for each member of the collection, one representative per orbit, and add them up over the different orbits. The intuition behind this is very simple. How can a sequence mu_n fail to be compact? The whole mass can run away to infinity. Or the mass can split into two pieces that run off to infinity in different directions; that can happen. Or you can have a Gaussian distribution with a very high variance, so the mass just spreads away. In the first case the orbit actually converges; in fact, the orbit doesn't change at all, because the translates are all the same mu up to translation. In the second case, in the limit I get two pieces, mu_1 and mu_2, with mass one-half each; and if they are translates of the same measure, the same orbit, one-half mu, appears twice in the list. In the third case the measure just becomes dust. So the compactification consists of the various pieces you can pull together, which form a set of orbits, plus something that disappears into dust; that's why the total masses, when you add them up, give some p which is less than or equal to 1. And the same orbit can occur twice, so you have to count: you list the orbits with the understanding that the same thing can be listed twice. Of course, it's possible that your collection is empty, which means that everything went to dust; then for every function F, your linear functional goes to 0. Or you can have just a finite number of orbits, or a countable number of orbits whose masses get smaller and smaller and add up to something less than or equal to 1. So that's the compactification, intuitively.
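Written out, the limit objects and one possible choice of metric (the weights 2^{-j} below are just one standard choice, as the speaker says; any summable weights would do):

A point of the compactification is a countable collection \xi = \{\alpha_1, \alpha_2, \dots\} of orbits of sub-probability measures with \sum_j \alpha_j(\mathbb{R}^d) \le 1, and

\[ \Lambda(F,\xi) = \sum_j \int_{(\mathbb{R}^d)^k} F(x_1,\dots,x_k)\,\alpha_j(dx_1)\cdots\alpha_j(dx_k), \]

where any representative of each orbit may be used, by translation invariance of F, and

\[ D(\xi_1,\xi_2) = \sum_{j\ge 1} 2^{-j}\,\frac{\bigl|\Lambda(F_j,\xi_1)-\Lambda(F_j,\xi_2)\bigr|}{1+\bigl|\Lambda(F_j,\xi_1)-\Lambda(F_j,\xi_2)\bigr|}, \]

with \{F_j\} the countable dense family chosen above.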
And you have a metric there: you look at Lambda(F, xi), which is the integral summed over all the orbits in the collection, and the metric is the same metric as before, except that instead of a single measure mu you now sum over each member of the collection. It's clear this is a pseudometric, but if you want to show it's really a metric, you have to show that the distance is 0 only if the two sets of orbits are identical. You have a problem here, because you have one set of orbits here and another set of orbits there, and certain sums that are equal; but the sums are collective objects, and you have to match orbit with orbit if you want to show the two collections are identical. So there's a matching problem to be done. So let's see: if you take a function of 2k variables which consists of a function of k variables repeated in two blocks, with the blocks linked by something, the linking factor can be made to tend to 1, and then by the bounded convergence theorem what you converge to is the sum of the squares of these integrals. That means if the integrals are identical for two collections, then the sums of squares are also identical, and you can repeat this for any power, so the sums of all powers are the same. That means you can identify the individual pieces. So you would think that, having identified the individual pieces, you're done, but not quite: all I know is that any integral that appears here appears somewhere in the other list, but for different functions it may appear in different places, and I have to locate them all in the same place. That requires an argument; I will come to it in a minute. But suppose you succeed in doing that, so you know the multiple integral of each F is the same for two orbits. Then you have to conclude that the two orbits are the same, and what does that mean for the measures? The real problem is to show that if there are two measures mu and nu which satisfy this relation for every F in F_k and for every k greater than or equal to 2, then nu is a shift of mu. That's what you have to prove. How do you do that? Well, let's take characteristic functions. You can't use the characteristic function at all arguments, because you have to be shift invariant. That means you can look at the product of values of the characteristic function provided the sum of the t's is 0; that would be a bounded limit of functions that we admit. So you have two characteristic functions, phi and psi, whose products are equal at all arguments whose sum is 0, and you want to show one is an imaginary exponential times the other. The easiest choice of two t's is t and minus t: that gives phi(t) times phi(-t), and we know that phi(-t) is the conjugate of phi(t), so it tells you the absolute values are equal. Then you write psi(t) as chi(t) times phi(t), and you'd like to prove that chi(t) is an exponential. You took two t's; now take three, and that tells you chi(t_1 plus t_2) equals chi(t_1) times chi(t_2), but only for t's where the characteristic function is not 0; if phi is 0 there, chi is not defined. Near the origin they're not 0, and away from the origin you can prove that chi(nt) is chi(t) to the nth power, because n copies of t and one copy of minus nt work. And from that one concludes that one measure is a shift of the other. So now we are faced with the problem of matching the individual pieces: that is, for each F, Lambda(F, mu) occurs somewhere on the other list. So let's take one orbit and compute for that orbit the whole bunch of numbers Lambda(F, mu).
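A hedged sketch of the characteristic-function step just outlined, for measures mu and nu on R^d with characteristic functions phi and psi; only shift-invariant test functions are allowed, so what one actually knows is

\[ \varphi(t_1)\cdots\varphi(t_k) = \psi(t_1)\cdots\psi(t_k) \qquad \text{whenever } t_1+\cdots+t_k = 0. \]

Taking (t,-t) gives |\varphi(t)| = |\psi(t)|. Writing \psi(t) = \chi(t)\varphi(t) near the origin, where \varphi \ne 0, and taking (t_1, t_2, -(t_1+t_2)) gives

\[ \chi(t_1+t_2) = \chi(t_1)\,\chi(t_2), \]

and taking n copies of t together with -nt extends this relation away from the origin via \chi(nt) = \chi(t)^n. Hence \chi(t) = e^{i\langle a, t\rangle} for some a, which says exactly that \nu is the translate of \mu by a.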
All these numbers are going to appear somewhere on the other list, which is a collection of orbits; but I want to show that they all occur within a single orbit. Then I can match those two orbits, peel them away, and repeat, and I'll be able to match all the orbits. This is where the Baire category theorem helps. Again, F can tend to 1: the constant 1 is translation invariant and can be obtained as a limit of these functions. So the collection of total masses is the same for the two sets of orbits; it tells you the sizes, the total masses, agree on both sides. In particular, given a mass, there can be only finitely many orbits with that mass. So if I know Lambda(F, mu) for all F for one mu, I know its total mass, and there are only finitely many things on the other side with which I could possibly match it. So I consider, for each candidate on the other side, the set of all functions F in some F_k for which the integral coincides with the one I get from my mu. These are closed sets, and their union is all of F_k. By the Baire category theorem, one of them has nonempty interior. But if two linear functionals agree on an open set, they're the same. So that's the end of that. Then you have to worry about what happens for different k; but notice that I said F_{k-1} is a descendant of F_k, and there are only finitely many candidates, so I have a matching for every k. Eventually some candidate has to match for infinitely many k's, and then, coming down, it matches for all k. Okay. Now, if you want to show that the object I have constructed is really a compactification, I have to show that the original space is dense in it. That's not very hard, because what I have in the completion is a bunch of measures with masses adding up to something less than or equal to 1. Take these measures, take compact sets outside of which their masses are very small, keep only the pieces of significant mass, and spread them out as far from one another as you can. It's clear that that approximates the limit object, because your mental picture of an object approximating it is just different pieces that are very far away from each other. So you can embed your mu_1 and mu_2 and so on into a single measure with pieces that are very far apart, and it's clear that the original space is dense here. But we didn't really prove the representation theorem, if I remember correctly; I just mentioned it. I mentioned some consequences of the representation, and I proved the uniqueness of the representation, but not the existence, and that is the crux of the compactification: I need to show that if you give me any sequence of measures, I can somehow extract these pieces from them. And you use concentration functions to do that. So you define the concentration function of a measure to be the supremum... how much time do I have? Plenty; I won't need that. You look at the ball of radius r, translate it around, and take the maximum probability you can capture; that's the concentration function at radius r. For any probability measure, by just a tightness argument, this goes to 1 as r goes to infinity. Now, if you give me a sequence of measures, I can choose a subsequence along which the concentration functions converge. The limit q(r) is monotone in r, but its limit as r goes to infinity need not be 1.
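In symbols, the concentration function just introduced, and the subsequential limit the next part uses:

\[ q_\mu(r) = \sup_{x\in\mathbb{R}^d}\mu\bigl(B(x,r)\bigr), \qquad q_\mu(r)\uparrow \mu(\mathbb{R}^d) \ \text{ as } r\to\infty, \]

and along a subsequence q_{\mu_n}(r) \to q(r) for every r, with

\[ Q = \lim_{r\to\infty} q(r) \in [0,1]. \]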
That limit Q is something less than or equal to 1. In some sense you should think of Q as the biggest recoverable piece. All these quantities depend only on the orbit, so the computation is already happening on the quotient space. If Q equals 1, that means that after suitable translations the measures themselves are tight, so a subsequence of the translated measures converges, and the orbit converges to a single orbit. If Q is 0, the probability of any ball goes to 0 no matter where the ball is; that means the measures converge to pure dust, and the limit is the empty collection of orbits, with total mass zero. If Q is in between, you can recover a big piece, because the concentration function takes values close to Q: you can translate back and capture a piece of mass close to Q. Well, make it at least Q over 2; you can get almost Q, but you have to work a little harder and you don't need to. You capture a piece of mass at least Q over 2, and let it grow as large as it can as you increase the radius. Okay, you have collected this biggest piece; it's not of mass 1, it's something less than 1. Then you look at the measure that's left over. The leftover mass can't be anywhere near the vicinity of the piece you removed: if it were, then by increasing the radius a little you would recover more, and the fact that this was the maximum amount you could recover means the rest of the mass is very, very far away. Then you remove this piece, work with the rest of the mass, and remove piece after piece. You have to do diagonalization many times, but at the end you get a subsequence that converges, and the convergence is in the metric we defined. That is the compactification, concretely: repeat and exhaust. So now we have to worry about doing large deviations in this space. First, there is one issue: the orbit itself is not compact, because it is a copy of R^d. A neighborhood of an orbit is much bigger than a neighborhood of the original measure, because it contains all the translates. What saves us is that a Brownian path is not going to wander too far. In time T it typically wanders a distance of order the square root of T. If I allow it a distance of order T, I'm in the large deviation range; if I allow it a distance of order T squared, I'm beyond the large deviation range. So I am willing to say my Brownian path is allowed to wander a distance of at most T squared. Then the local time is essentially supported in a region of size T squared, whose volume is polynomial in T, and that only affects the exponential rate by an extra polynomial factor, which I don't care about. So the fact that the orbit is big doesn't matter: I can limit myself to a definite-sized piece of the orbit. That is one way of doing it. Now, the way you compute the large deviation rate function is that you need some integrals you can control, and the integrals you can control come from the Feynman-Kac formula. The Feynman-Kac formula helps you control expectations of exponentials: if the exponent is of the form one-half times the Laplacian of u over u, with a minus sign, integrated along the path, then by the martingale property the expectation of the exponential is bounded, by one basically. So there are lots of exponential integrals of Feynman-Kac type whose expectations are bounded by a constant. You use all of them, take the best you can, and that gives the large deviation upper bound; that calculation gives you the rate function one-eighth times the integral of the square of the gradient of f divided by f. It's a simple calculus of variations.
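A hedged sketch of the Feynman-Kac/martingale bound just invoked and the resulting rate, in the usual Donsker-Varadhan normalization; the assumption that u is smooth, positive, and bounded above and below is made here for the sketch:

\[ M_T = \frac{u(\beta(T))}{u(\beta(0))}\,\exp\Bigl(-\int_0^T \frac{\Delta u}{2u}\bigl(\beta(s)\bigr)\,ds\Bigr) \ \text{ is a martingale}, \qquad \mathbb{E}[M_T] = 1, \]

so \mathbb{E}\bigl[\exp\bigl(-\int_0^T \tfrac{\Delta u}{2u}(\beta(s))\,ds\bigr)\bigr] \le \sup u / \inf u, a constant independent of T. Writing L_T/T \approx f\,dx and optimizing over u (the optimum is at u = \sqrt{f}) gives the upper bound with rate

\[ \sup_{u>0}\int\Bigl(-\frac{\Delta u}{2u}\Bigr)f\,dx = \frac{1}{8}\int\frac{|\nabla f|^2}{f}\,dx = \frac{1}{2}\int\bigl|\nabla\sqrt{f}\bigr|^2\,dx. \]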
So that's what we need to do, except we can't use just one function u: we have to use a function here, a function there, a function somewhere else, to capture all the bits and pieces. So you have to take a supremum over all possible locations. A little perturbation doesn't change much, because all these functions are continuous, so there's no problem allowing a little variation; and since the relevant volume is only of order T squared, you can get by with a polynomial number of these test functions. It's a mess to write down, because there are many parameters, but you look at this integral, the expectation of the exponential normalized by 1 over T, and it's bounded by some number; then you look at the supremum of these things, and that's the upper bound you can get. These integrals are all bounded by the Feynman-Kac formula, small variations change very little, and you don't have to shift by more than a distance T squared, so the supremum is over polynomially many shifts a_i. And if you do the [indiscernible], you can separate out all the components, and then what you get for the rate function is just the sum of the rate functions of the pieces. In the end, we haven't really discovered anything new: we have just justified the compactification, so that we can pretend the space is compact, that's all. Now you go back to the problem in question. I want to look at this functional. The function 1 over |x| has a singularity, which is an additional headache, but it's not a bad singularity in this dimension, so the usual estimation methods tell you that you don't have to worry about it; for all practical purposes you can pretend it's a bounded continuous function. So you do the variational problem. Normally you would do it on the original space: f is an L^1 function, you can think of it as the square of an L^2 function, and then the rate is one-eighth the integral of the square of the gradient of f over f [indiscernible]; here the different components are different functions, and this is the variational problem I get in the compactified space. And one can then prove that the variational problem is attained at a single f of unit mass, unique up to translation, so there is a unique optimizer for the variational problem on the quotient space X-tilde; and that tells you the mass under Q_T concentrates in a neighborhood of that orbit. The uniqueness for this variational problem was proved by Elliott Lieb, originally in a different context; it is useful here. So the moral of the story is that in the end you really need all this, but you needed it only to be sure that you didn't need it. Thank you.

[applause]

>> David Wilson: Questions or comments?

>>: But regarding the statement you made a minute ago: would these precautions have been so unnecessary had they not been taken?

>>: This variational problem in your theory, the large deviations for this potential: is there some physical origin, some motivation? Tell us a little bit about that.

>> Srinivasa Varadhan: I think it is the polaron problem: a question of electrons reacting to the field that they created themselves, somewhat in the past.

>>: Because I know that if you fix one variable and just look at the potential as a function of the other variable, the concentration of that along the random walk path has been used extensively by others in studying intersection properties of random walks, which is tightly related to what you're doing here.

>> David Wilson: Any other comments or questions? Let's thank... Thank you.

[applause]