Lectures on Stochastic Analysis
Autumn 2012 version

Xue-Mei Li, The University of Warwick
Typeset: April 9, 2013

Contents

1 Prologue
  1.1 What do we cover in this course and why?
  1.2 Exams
  1.3 References
2 Preliminaries
  2.1 Basics
    2.1.1 Sets
    2.1.2 Measurable Spaces
    2.1.3 The Monotone Class Theorem
    2.1.4 Measures
    2.1.5 Measurable Functions
  2.2 Integration with respect to a measure
  2.3 Lp spaces
  2.4 Notions of convergence of Functions
  2.5 Convergence Theorems
    2.5.1 Uniform Integrability
  2.6 Pushed Forward Measures, Distributions of Random Variables
  2.7 Lebesgue Integrals
  2.8 Total variation
3 Lectures
  3.1 Lectures 1-2
  3.2 Notation
  3.3 The Wiener Spaces
  3.4 Lecture 3: The Pushed Forward Measure
  3.5 Basics of Stochastic Processes
  3.6 Brownian Motion
  3.7 Lectures 4-5
    3.7.1 Finite Dimensional Distributions
    3.7.2 Gaussian Measures on Rd
  3.8 Kolmogorov's Continuity Theorem
    3.8.1 The existence of the Wiener Measure and Brownian Motion
    3.8.2 Lecture 6: Sample Properties of Brownian Motions
    3.8.3 The Wiener measure does not charge the space of finite energy*
  3.9 Product σ-algebras
    3.9.1 The Borel σ-algebra on the Wiener Space
    3.9.2 Kolmogorov's Extension Theorem
4 Conditional Expectations
  4.1 Preliminaries
  4.2 Lectures 7-8: Conditional Expectations
  4.3 Properties of conditional expectation
  4.4 Conditioning on a Random Variable
  4.5 Regular Conditional Probabilities
  4.6 Lecture 9: Regular Conditional Distribution and Disintegration
    4.6.1 Lecture 9: Conditional Expectation as Orthogonal Projection
    4.6.2 Uniform Integrability of conditional expectations
5 Martingales
    5.0.3 Lecture 10: Introduction
  5.1 Lecture 11: Overview
  5.2 Lecture 12: Stopping Times
    5.2.1 Extra Reading
    5.2.2 Lecture 13: Stopped Processes
  5.3 Lecture 14: The Martingale Convergence Theorem
  5.4 Lecture 15: The Optional Stopping Theorem
  5.5 Martingale Inequalities (I)
  5.6 Lecture 16: Local Martingales
6 The Quadratic Variation Process
  6.1 Lectures 18-20: The Basic Theorem
  6.2 Martingale Inequalities (II)
7 Stochastic Integration
  7.1 Lecture 21: Integration
  7.2 Lecture 21: Stochastic Integration
    7.2.1 Stochastic Integral: Characterization
    7.2.2 Properties of Integrals
    7.2.3 Lecture 22: Stochastic Integration (2)
  7.3 Localization
  7.4 Properties of Stochastic Integrals
  7.5 Itô Formula
  7.6 Lévy's Martingale Characterization Theorem
8 Stochastic Differential Equations
    8.0.1 Stochastic processes defined up to a random time
  8.1 Lecture 24: Stochastic Differential Equations
  8.2 The definition
  8.3 Examples
    8.3.1 Pathwise Uniqueness and Uniqueness in Law
    8.3.2 Maximal Solution and Explosion Time
    8.3.3 Strong and Weak Solutions
    8.3.4 The Yamada-Watanabe Theorem
    8.3.5 Strong Completeness, Flow
    8.3.6 Markov Process and its Semi-group
    8.3.7 The Semi-group Associated to the SDE
    8.3.8 The Infinitesimal Generator for the SDE
  8.4 Existence and Uniqueness Theorems
    8.4.1 Gronwall's Lemma
    8.4.2 Main Theorems
    8.4.3 Global Lipschitz Case
    8.4.4 Locally Lipschitz Continuous Case
    8.4.5 Non-explosion
  8.5 Girsanov Theorem
  8.6 Summary
  8.7 Glossary
Chapter 1

Prologue

The objective of the course is to understand and learn the theory of martingales and the basics of stochastic differential equations (SDEs). We will also study relevant topics on Brownian motion and on Markov processes. We cover the basic theory of martingales, of Brownian motion, of stochastic integration, and of stochastic differential equations, and offer topics on stochastic flows and on the geometry of stochastic differential equations.

1.1 What do we cover in this course and why?

What are Brownian motions? They result from summing many small independent influences over a time interval $[0, t]$, $t \ge 0$ (a central limit theorem effect), so they have Gaussian laws that change with the time $t$.

What are martingales? A stochastic process is a martingale if, roughly speaking, the average value at a future time $t$, conditioned on its values up to the current time $s$, is its value at $s$: on average you expect to see what is already statistically known. Continuous martingales and local martingales can be represented as stochastic integrals with respect to a Brownian motion by the integral representation theorem (the Clark–Ocone formula from Malliavin calculus gives an explicit representation).

What are Markov processes? The conditional average of the future value of a Markov process given knowledge of its past up to now is the same as the conditional average of the future value given knowledge of its present state only.

The Dubins–Schwarz theorem says that a continuous martingale is a time change of a Brownian motion, i.e.
a Brownian motion run at a random clock. The random clock is the quadratic variation of the martingale. The time change need not be Markovian, and hence a martingale, as a stochastic process, need not have the Markov property. Markov processes relate to second order parabolic differential equations.

The following selected topics give guidance on what we intend to work on; we will cover some topics in depth.

• Brownian motion. The definition(s), Lévy's characterization theorem, the martingale property, the Markov property, properties of sample paths.
• Stochastic integration with respect to Brownian motions and with respect to semi-martingales.
• Rudiments of the Wiener space.
• Diffusion processes. Infinitesimal generators, Markov semigroups, construction by SDEs.
• SDEs. Strong and weak solutions of SDEs, pathwise uniqueness and uniqueness in law of solutions, existence of a local solution, existence of a global solution, the infinitesimal generator and associated semi-groups, the Markov property and the cocycle property, the non-explosion problem, the flow problem, and the Girsanov transform for SDEs.
• Martingale theory. Martingales, local martingales, semi-martingales, change of measures and the Girsanov transform.
• Itô's formula.

Prerequisites: Measure Theory and Theory of Integration (MA359) or advanced probability theory. Good working skills in Analysis (MA244 Analysis III) and Metric Spaces (MA222). Knowledge of Functional Analysis (MA3G7, MA3G8) would be helpful. The following course would be especially helpful: Brownian Motion (MA4F7).

Leads to: This module leads to Topics in Stochastic Analysis (MA595) and Stochastic Partial Differential Equations.

Related to: This module is related to Markov Processes and Percolation Theory (MA3H2), Quantum Mechanics: Basic Principles and Probabilistic Methods (MA4A7), and Stochastic Partial Differential Equations.

1.2 Exams

This module is assessed by one single exam (at the beginning of Term 3).
I hand out a problem sheet each week. The best way to review for the exam is to work on the problems.

1.3 References

For a comprehensive study of martingales we refer to "Continuous Martingales and Brownian Motion" by D. Revuz and M. Yor [23]. An enjoyable introduction to martingales is the book "Probability with Martingales" by D. Williams [29]. For further reading on Brownian motion see M. Yor's recent books, e.g. [30], also [18] by R. Mansuy and M. Yor, and [19] by P. Mörters and Y. Peres.

For an overall reference for stochastic differential equations, we refer to "Stochastic Differential Equations and Diffusion Processes, second edition" by N. Ikeda and S. Watanabe [13]. The small book [16] by H. Kunita is nice to read. There are two lovely books by A. Friedman, "Stochastic Differential Equations and Applications" [9, 10], and "Stochastic Differential Equations" by I. Gihman and A. V. Skorohod [11]. Another book that is good for working out examples is "Stochastic Stability of Differential Equations" by R. Z. Khasminskii [12]. Two books that are good for beginners are "Stochastic Differential Equations" by B. Øksendal [20] and "Brownian Motion and Stochastic Calculus" by I. Karatzas and S. E. Shreve [15]. The book by Øksendal has six editions; I like editions three and four, which are neat and compact. For further studies there is "Diffusions, Markov Processes and Martingales" by C. Rogers and D. Williams [25, 24]. Another lovely book is "Foundations of Modern Probability" by Kallenberg [14]; it works great as a reference.

For SDEs driven by space-time martingales see "Stochastic Flows and Stochastic Differential Equations" by H. Kunita [17]. For SDEs on manifolds see "Stochastic Differential Equations on Manifolds" by K. D. Elworthy [4]. For work from the point of view of random dynamics see "Random Dynamical Systems" by L. Arnold [1] and "Random Perturbations of Dynamical Systems" by M. I. Freidlin and A. D. Wentzell.
For further work on the geometry of SDEs have a look at the books "On the Geometry of Diffusion Operators and Stochastic Flows" [7] and "The Geometry of Filtering" [5] by K. D. Elworthy, Y. LeJan and X.-M. Li. For the theory of Markov processes, and especially the treatment of the martingale problem, see "Multidimensional Diffusion Processes" by D. Stroock and S. R. S. Varadhan [27]. There are a number of nice slim books by the two authors; see D. W. Stroock [26] and S. R. S. Varadhan [28].

If you wish to review the theory of integration, try Royden's book "Real Analysis"; it is easy to read and useful as a reference. For further study of measures see "Real Analysis" by Folland [8]. Have a read of "Probability Measures on Metric Spaces" by Parthasarathy [21] for a deep theory of measures. The books "Measure Theory, vol. 1 & 2" by Bogachev [3] are quite useful. For some aspects of measures on the Wiener space see "Convergence of Probability Measures" by Billingsley [2].

Chapter 2

Preliminaries

2.1 Basics

2.1.1 Sets

De Morgan's laws. For any index set $I$,
$$\Big(\bigcap_{\alpha\in I} E_\alpha\Big)^c = \bigcup_{\alpha\in I} E_\alpha^c, \qquad \Big(\bigcup_{\alpha\in I} E_\alpha\Big)^c = \bigcap_{\alpha\in I} E_\alpha^c.$$

If $\{A_n\}$ is a sequence of sets, define
$$\limsup_{n\to\infty} A_n = \bigcap_{m=1}^{\infty}\bigcup_{k=m}^{\infty} A_k, \qquad \liminf_{n\to\infty} A_n = \bigcup_{m=1}^{\infty}\bigcap_{k=m}^{\infty} A_k.$$

2.1.2 Measurable Spaces

Definition 2.1.1 Let $\Omega$ be a set. A σ-algebra $\mathcal{F}$ on $\Omega$ is a collection of subsets of $\Omega$ satisfying the following conditions:

(1) $\emptyset \in \mathcal{F}$;
(2) $\mathcal{F}$ is closed under complements: if $A \in \mathcal{F}$ then $A^c \in \mathcal{F}$;
(3) $\mathcal{F}$ is closed under countable unions: if $A_n \in \mathcal{F}$ then $\bigcup_{n=1}^{\infty} A_n \in \mathcal{F}$.

The pair $(\Omega, \mathcal{F})$ is a measurable space and elements of $\mathcal{F}$ are measurable sets.

Definition 2.1.2 Let $\mathcal{G}$ be a collection of subsets of $\Omega$. The σ-algebra generated by $\mathcal{G}$ is the smallest σ-algebra that contains $\mathcal{G}$. It is denoted by $\sigma(\mathcal{G})$.

Definition 2.1.3 Let $(X, d)$ be a metric space with metric $d$. Denote by $B_a(x)$ the open ball centred at $x$ with radius $a$. The Borel σ-algebra, denoted by $\mathcal{B}(X)$, is defined to be $\sigma(\{B_a(x) : x \in X, a > 0\})$. Elements of $\mathcal{B}(X)$ are Borel sets.
If not otherwise mentioned, this is the σ-algebra we will use. Single points, and therefore countable subsets of a metric space, are always measurable, since $\{x_0\} = \bigcap_n \{x : d(x, x_0) < \tfrac{1}{n}\}$.

2.1.3 The Monotone Class Theorem

Definition 2.1.4 Let $\Omega$ be a set. A non-empty family $\mathcal{G}$ of subsets of $\Omega$ is called a monotone class if limits of monotone sequences in $\mathcal{G}$ belong to $\mathcal{G}$:

1. $\bigcup_{n=1}^\infty A_n \in \mathcal{G}$ if $A_n$ is an increasing sequence in $\mathcal{G}$;
2. $\bigcap_{n=1}^\infty A_n \in \mathcal{G}$ if $A_n$ is a decreasing sequence in $\mathcal{G}$.

Finite additivity together with the monotone class property is equivalent to σ-additivity. A σ-algebra is a monotone class. An intersection of monotone classes containing a common element is a monotone class.

Theorem 2.1.5 (Monotone Class Theorem) If $\mathcal{A}$ is an algebra, then the smallest monotone class $m(\mathcal{A})$ containing $\mathcal{A}$ is the σ-algebra $\sigma(\mathcal{A})$ generated by $\mathcal{A}$.

Definition 2.1.6 A collection $\mathcal{A}$ of subsets is a π-system if $A, B \in \mathcal{A}$ implies $A \cap B \in \mathcal{A}$. The intersection of π-systems is a π-system. Note that some authors insist that the empty set $\emptyset$ belongs to a π-system.

Definition 2.1.7 A family $\mathcal{B}$ of subsets of a set $\Omega$ is a σ-additive class, also called a Dynkin system, if

1. $\Omega \in \mathcal{B}$;
2. if $B_1 \subset B_2$ with $B_1, B_2 \in \mathcal{B}$, then $B_2 \setminus B_1 \in \mathcal{B}$;
3. if $A_n$ is an increasing sequence in $\mathcal{B}$, then $\bigcup_{n=1}^\infty A_n \in \mathcal{B}$.

Note that properties 1 and 2 imply that a σ-additive class is closed under taking complements. A σ-additive class is a monotone class. The intersection of any number of σ-additive classes is a σ-additive class.

Theorem 2.1.8 (Dynkin's Lemma) If $\mathcal{A}$ is a π-system, then the σ-additive class (Dynkin system) generated by $\mathcal{A}$ is the σ-algebra generated by $\mathcal{A}$.

2.1.4 Measures

Definition 2.1.9 Let $(\Omega, \mathcal{F})$ be a measurable space. A measure $\mu$ is a function $\mu : \mathcal{F} \to [0, \infty]$ with the following properties:

1. $\mu(\emptyset) = 0$;
2. if $\{A_i\}$ is a pairwise disjoint sequence from $\mathcal{F}$, that is $A_i \cap A_j = \emptyset$ whenever $i \neq j$, then
$$\mu\Big(\bigcup_{i=1}^\infty A_i\Big) = \sum_{i=1}^\infty \mu(A_i).$$

Definition 2.1.10 The triple $(\Omega, \mathcal{F}, \mu)$ is called a measure space.
The measure $\mu$ is finite if $\mu(\Omega) < \infty$. It is a probability measure if $\mu(\Omega) = 1$, in which case the triple $(\Omega, \mathcal{F}, \mu)$ is called a probability space. The measure is σ-finite if there are measurable sets $\{A_n\}_{n=1}^\infty$ of finite measure such that $\Omega = \bigcup_{n=1}^\infty A_n$. A measure on a Borel σ-algebra $\mathcal{B}(\Omega)$ is called a Borel measure.

Property (2) is called σ-additivity. It implies that if $A_n$ is a non-decreasing sequence of measurable sets, $A_1 \subset A_2 \subset \dots$, then
$$\lim_{n\to\infty} \mu(A_n) = \mu\Big(\bigcup_{n=1}^\infty A_n\Big). \tag{2.1}$$
Similarly, if $A_n$ is a non-increasing sequence of measurable sets, $A_1 \supset A_2 \supset \dots$ (and $\mu$ is finite), then
$$\lim_{n\to\infty} \mu(A_n) = \mu\Big(\bigcap_{n=1}^\infty A_n\Big).$$

Definition 2.1.11 Consider the measure space $(\Omega, \mathcal{F}, \mu)$. We say that $\mathcal{F}$ is complete if any subset of a measurable set of null measure is in $\mathcal{F}$.

Standard Assumption. We assume that the measure is a probability measure and that the σ-algebra is complete. In this case, if $f = g$ almost surely, that is they agree on a set of full measure, then they are either both measurable or both not measurable.

Remark 2.1.12 A finite measure on $\mathcal{F}$ is determined by its values on a generating set of $\mathcal{F}$ that is a π-system.

2.1.5 Measurable Functions

Definition 2.1.13 Given measurable spaces $(\Omega, \mathcal{F})$ and $(S, \mathcal{B})$, a function $f : \Omega \to S$ is said to be measurable if $f^{-1}(B) = \{\omega : f(\omega) \in B\}$ belongs to $\mathcal{F}$ for each $B \in \mathcal{B}$. Measurable functions are known as random variables, especially if $S = \mathbb{R}$.

Let $(\Omega, \mathcal{F})$ be a measurable space.

Example 2.1.14 The indicator function of a set $A$ is the real valued function
$$\mathbf{1}_A(\omega) = \begin{cases} 1, & \omega \in A, \\ 0, & \omega \notin A. \end{cases}$$
The function $\mathbf{1}_A$ is measurable if and only if $A \in \mathcal{F}$.

Example 2.1.15 A function $f : E \to \mathbb{R}$ that takes only a finite number of values has the form
$$f(x) = \sum_{j=1}^n a_j \mathbf{1}_{A_j}(x),$$
where the $a_j$ are the values $f$ takes and $A_j = \{x : f(x) = a_j\}$. Then $f$ is measurable if and only if the $A_j$ are measurable sets.

Definition 2.1.16 A simple function is a function of the form
$$f(x) = \sum_{j=1}^n a_j \mathbf{1}_{A_j}(x),$$
where $a_j \in \mathbb{R}$ and the $A_j$ are measurable sets.
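As a concrete toy illustration (our own, not from the notes), a simple function on a finite ground set can be coded directly as a list of value/set pairs; the names `omega`, `pieces` and `f` are ours:

```python
# A simple function f = sum_j a_j * 1_{A_j} on a finite ground set Omega,
# represented as (value, set) pairs.
omega = range(10)
pieces = [(2.0, {0, 1, 2}), (-1.0, {3, 4}), (5.0, {7, 8, 9})]

def f(x):
    """Evaluate f(x): add a_j for every piece whose set A_j contains x."""
    return sum(a for a, A in pieces if x in A)

assert f(1) == 2.0 and f(4) == -1.0 and f(5) == 0.0
# With disjoint A_j and distinct a_j, A_j is the preimage f^{-1}({a_j}):
assert {x for x in omega if f(x) == 5.0} == {7, 8, 9}
```

The last assertion checks the remark from Example 2.1.15 that, for distinct values, $A_j = f^{-1}(\{a_j\})$.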
If $\{E_\alpha, \alpha \in I\}$ is a collection of subsets of $Y$ and $f$ is a map into $Y$, then
$$f^{-1}(A^c) = (f^{-1}(A))^c, \quad f^{-1}\Big(\bigcap_{\alpha\in I} E_\alpha\Big) = \bigcap_{\alpha\in I} f^{-1}(E_\alpha), \quad f^{-1}\Big(\bigcup_{\alpha\in I} E_\alpha\Big) = \bigcup_{\alpha\in I} f^{-1}(E_\alpha).$$

Proposition 2.1.17
• If $f_1 : (\Omega, \mathcal{F}) \to (E_1, \mathcal{A}_1)$ and $f_2 : (\Omega, \mathcal{F}) \to (E_2, \mathcal{A}_2)$ are measurable functions, define the product σ-algebra $\mathcal{A}_1 \otimes \mathcal{A}_2 = \sigma\{A_1 \times A_2 : A_1 \in \mathcal{A}_1, A_2 \in \mathcal{A}_2\}$. Then $h = (f_1, f_2) : (\Omega, \mathcal{F}) \to (E_1 \times E_2, \mathcal{A}_1 \otimes \mathcal{A}_2)$ is measurable.
• If $f : (X_1, \mathcal{B}_1) \to (X_2, \mathcal{B}_2)$ and $g : (X_2, \mathcal{B}_2) \to (X_3, \mathcal{B}_3)$ are measurable functions, then so is the composition $g \circ f : (X_1, \mathcal{B}_1) \to (X_3, \mathcal{B}_3)$.
• If $f, g, f_n : (X, \mathcal{B}) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$ are measurable functions, then the following functions are measurable: $f + g$, $fg$, $\max(f, g) = \frac{(f+g) + |f-g|}{2}$, $f^+ = \max(f, 0)$, $f^-$, $\limsup_n f_n$, $\liminf_n f_n$, $\sup_{n\ge 1} f_n$, $\inf_{n\ge 1} f_n$, and $\lim_{n\to\infty} f_n$ (when it exists).

Proposition 2.1.18 If $f : E \to [0, \infty]$ is a positive measurable function, there is a sequence $\{f_n\}$ of simple functions such that $0 \le f_1 \le f_2 \le \dots \le f_n \le \dots \le f$ and $f_n \to f$ pointwise. Furthermore $f_n \to f$ uniformly on any set on which $f$ is bounded. Here both $E$ and $[0, \infty]$ are equipped with the relevant Borel σ-algebras.

Theorem 2.1.19
1. For every measurable function $f$ there is a sequence of simple functions $f_n$ such that $\lim_{n\to\infty} f_n(x) = f(x)$ for every $x$. If $f$ is bounded, the $f_n$ can be chosen to be uniformly bounded.
2. If $f$ is positive, the $f_n$ can be chosen as in Proposition 2.1.18: increasing, converging pointwise, and uniformly on any set on which $f$ is bounded.

Proposition 2.1.20 Let $(X, \mathcal{B}(X))$ and $(Y, \mathcal{B}(Y))$ be metric spaces with their Borel σ-algebras. If $f : X \to Y$ is continuous then it is Borel measurable.

2.2 Integration with respect to a measure

Let $(E, \mathcal{F}, \mu)$ be a measure space. Let
$$\mathcal{E} = \Big\{ f(x) = \sum_{j=1}^n a_j \mathbf{1}_{A_j}(x) : A_j \in \mathcal{F},\ a_j \in \mathbb{R} \Big\}$$
be the set of (measurable) simple functions.
We define the integral of $f = \sum_{j=1}^n a_j \mathbf{1}_{A_j}$ with respect to $\mu$ by
$$\int_E f \, d\mu = \sum_{j=1}^n a_j \mu(A_j).$$
If the $a_j$, $j = 1, \dots, n$, are distinct values, then $A_j = f^{-1}(\{a_j\})$.

Definition 2.2.1 Let $f : E \to [0, \infty]$ be a positive measurable function. Define
$$\int f \, d\mu = \sup_{g \in \mathcal{E}:\ g \le f} \int g \, d\mu.$$

Definition 2.2.2 Let $f : E \to \mathbb{R}$ be a general measurable function. Write $f = f^+ - f^-$. If both $\int f^+ d\mu$ and $\int f^- d\mu$ are finite, we say that $f$ is integrable with respect to $\mu$, and the integral is defined to be
$$\int_E f \, d\mu = \int_E f^+ \, d\mu - \int_E f^- \, d\mu.$$
The set of all integrable functions is denoted by $L^1(\mu)$; its elements are said to be $L^1$ functions. If $A$ is a measurable set, the following notations denote the same integral:
$$\int_A f \, d\mu, \qquad \int_A f(x)\, \mu(dx), \qquad \int_E \mathbf{1}_A(x) f(x) \, d\mu(x).$$

Proposition 2.2.3 Assume that $f, g \in L^1(\mu)$.
1. $|f| \in L^1(\mu)$, and $\big| \int f \, d\mu \big| \le \int |f| \, d\mu$.
2. If $a, b \in \mathbb{R}$, then $af + bg \in L^1(\mu)$ and $\int (af + bg)\, d\mu = a \int f \, d\mu + b \int g \, d\mu$.
3. If $f \le g$, then $\int f \, d\mu \le \int g \, d\mu$.
4. For a measurable set $A$, define $\int_A f \, d\mu = \int \mathbf{1}_A f \, d\mu$. Then $\int_A f \, d\mu = \int_A g \, d\mu$ for all measurable sets $A$ implies that $f = g$ almost surely.
5. If $f \ge 0$ and $\int f \, d\mu = 0$, then $f = 0$ almost everywhere.

2.3 $L^p$ spaces

Let $(E, \mathcal{B}, \mu)$ be a measure space. Two functions $f, g : E \to \mathbb{R}$ are equivalent if $f = g$ almost surely. For $1 \le p < \infty$ we define the $L^p$ space
$$L^p(E, \mathcal{B}, \mu) = \Big\{ f : E \to \mathbb{R} \text{ measurable} : \int_E |f(x)|^p \, d\mu(x) < \infty \Big\},$$
with $\|f\|_{L^p} = \big( \int_E |f(x)|^p \, d\mu(x) \big)^{1/p}$. Let $L^\infty$ be the set of essentially bounded measurable functions, with $\|f\|_{L^\infty} = \inf\{a : \mu(|f| > a) = 0\}$. The $L^p$ spaces are Banach spaces. The notations $L^p(E, \mathcal{B})$, $L^p(E)$ or $L^p$ may be used for simplicity.

Proposition 2.3.1 (Hölder's inequality; Cauchy–Schwarz if $p = q = 2$)
$$\int |fg| \, d\mu \le \Big( \int |f|^p \, d\mu \Big)^{1/p} \Big( \int |g|^q \, d\mu \Big)^{1/q}, \qquad \frac{1}{p} + \frac{1}{q} = 1.$$
Minkowski's inequality, for $p \ge 1$: $\|f + g\|_{L^p} \le \|f\|_{L^p} + \|g\|_{L^p}$.

Suppose that $\mu$ is a finite measure. Then, by Hölder's inequality, $L^q \subset L^p$ for $p < q$.
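Hölder's inequality and the inclusion $L^q \subset L^p$ on a finite measure space are easy to sanity-check numerically. The sketch below is our own illustration on a finite probability space (the names `mu`, `lp_norm`, etc. are ours, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# A finite probability space: n points with weights mu summing to 1.
n = 1000
mu = rng.random(n)
mu /= mu.sum()
f = rng.normal(size=n)
g = rng.normal(size=n)

def lp_norm(h, p):
    """||h||_{L^p(mu)} = (sum_x |h(x)|^p mu({x}))^(1/p)."""
    return (np.abs(h) ** p * mu).sum() ** (1.0 / p)

# Hölder: \int |f g| dmu <= ||f||_p ||g||_q with 1/p + 1/q = 1.
p, q = 3.0, 1.5
lhs = (np.abs(f * g) * mu).sum()
assert lhs <= lp_norm(f, p) * lp_norm(g, q)

# On a probability space, p < q gives ||f||_p <= ||f||_q, i.e. L^q ⊂ L^p.
assert lp_norm(f, 2) <= lp_norm(f, 4)
```

The last assertion is the monotonicity of $L^p$ norms in $p$ on a probability space, which is the quantitative form of the inclusion $L^q \subset L^p$ stated above.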
A function $\varphi : \mathbb{R} \to \mathbb{R}$ is convex if $\varphi(\sum_i p_i x_i) \le \sum_i p_i \varphi(x_i)$ whenever the $p_i$ are non-negative numbers with $\sum_i p_i = 1$. If $\varphi$ is twice differentiable, then $\varphi$ is convex if and only if $\varphi'' \ge 0$. Examples of convex functions are $\varphi(x) = x^p$ for $p > 1$ (on $[0, \infty)$) and $\varphi(x) = e^x$.

Theorem 2.3.2 (Jensen's Inequality) If $\varphi$ is a convex function, $\mu$ a probability measure, and $f$ an integrable function, then
$$\varphi\Big( \int f \, d\mu \Big) \le \int \varphi(f) \, d\mu.$$

Proposition 2.3.3 (Chebyshev's inequality) If $X$ is an $L^p$ random variable, then for $a > 0$,
$$P(|X| \ge a) \le \frac{1}{a^p} E|X|^p.$$

Theorem 2.3.4 If $f \in L^1$, then for any $\varepsilon > 0$ there is a simple function $\varphi$ such that $\int |f - \varphi| \, d\mu < \varepsilon$.

Theorem 2.3.5 (The monotone convergence theorem) If $f_n$ is a non-decreasing sequence converging to $f$ almost surely, with $f_n^- \le g$ for some integrable function $g$, then
$$\lim_{n\to\infty} \int f_n \, d\mu = \int f \, d\mu.$$
Proof. This is shown when $f_n \ge 0$. Otherwise take $g_n = f_n + g$; then $g_n \ge 0$ and $\{g_n\}$ is a non-decreasing sequence. □

Theorem 2.3.6 (Dominated Convergence Theorem) If $|f_n| \le g$ where $g$ is an integrable function, and $f_n \to f$ a.e., then $f$ is integrable and
$$\lim_{n\to\infty} \int f_n \, d\mu = \int f \, d\mu.$$

Theorem 2.3.7 (Fatou's lemma) If the $f_n$ are bounded below by an integrable function, then
$$\int \liminf_{n\to\infty} f_n \, d\mu \le \liminf_{n\to\infty} \int f_n \, d\mu.$$

2.4 Notions of convergence of functions

Definition 2.4.1 Let $(E, \mathcal{B}, \mu)$ be a measure space and $f_n$ a sequence of measurable functions.
• $f_n$ converges to $f$ almost surely if there is a set $\Omega_0$ with $\mu(\Omega_0) = 0$ such that $\lim_{n\to\infty} f_n(\omega) = f(\omega)$ for all $\omega \notin \Omega_0$.
• $f_n$ converges to $f$ in measure if for any $\delta > 0$, $\lim_{n\to\infty} \mu(|f_n - f| > \delta) = 0$.
• $f_n$ converges to $f$ in $L^p$ if $\int |f_n - f|^p \, d\mu \to 0$.

Proposition 2.4.2 Let $A_n$ be measurable sets. The indicator functions $\mathbf{1}_{A_n} \to 0$ in measure if and only if $\mu(A_n) \to 0$.

2.5 Convergence Theorems

Proposition 2.5.1
1. If $f_n$ converges to $f$ in $L^p$, $p \ge 1$, then it converges to $f$ in measure.
2. If $f_n$ converges to $f$ in measure, then there is a subsequence of $f_n$ which converges almost surely.
3.
$f_n$ converges to $f$ in measure if and only if every subsequence has a further subsequence that converges almost surely.

Theorem 2.5.2 (Egoroff's Theorem*) Suppose that $\mu(E) < \infty$, and let $f_n$ be a sequence of functions converging to $f$ almost surely. Then for any $\varepsilon > 0$ there is a set $E_\varepsilon \subset E$ with $\mu(E \setminus E_\varepsilon) \le \varepsilon$ such that $f_n$ converges to $f$ uniformly on $E_\varepsilon$. In particular, $f_n$ converges to $f$ in measure.

2.5.1 Uniform Integrability

Let $(\Omega, \mathcal{F}, \mu)$ be a measure space.

Definition 2.5.3 A family of real-valued functions $(f_\alpha, \alpha \in I)$, where $I$ is an index set, is uniformly integrable (u.i.) if
$$\lim_{C\to\infty} \sup_{\alpha} \int_{\{|f_\alpha| \ge C\}} |f_\alpha| \, d\mu = 0.$$

Proposition 2.5.4 If $\sup_\alpha E|f_\alpha|^p < \infty$ for some $p > 1$, then the family $(f_\alpha, \alpha \in I)$ is uniformly integrable.
Proof. This follows from
$$\sup_\alpha \int_{\{|f_\alpha| \ge C\}} |f_\alpha| \, d\mu \le \frac{1}{C^{p-1}} \sup_\alpha \int_{\{|f_\alpha| \ge C\}} |f_\alpha|^p \, d\mu \to 0 \quad \text{as } C \to \infty. \ \Box$$

Definition 2.5.5 A family $\{f_\alpha\}$ is uniformly absolutely continuous if for any $\varepsilon > 0$ there is a $\delta > 0$ such that
$$\sup_{\alpha\in I} \int_A |f_\alpha| \, d\mu < \varepsilon \quad \text{for all } A \in \mathcal{F} \text{ with } \mu(A) < \delta.$$

Lemma 2.5.6 Assume that $\mu(\Omega) < \infty$ and $f \in L^1$. Given any $\varepsilon > 0$ there is a $\delta > 0$ such that $\mu(A) < \delta$ implies $\int_A |f| \, d\mu < \varepsilon$.

Suppose that the measure $\mu$ is finite. The lemma above shows that a single $L^1$ function is (uniformly) absolutely continuous; by the proposition below it is then also uniformly integrable.

Proposition 2.5.7 Let $\mu$ be a finite measure. The family $(f_\alpha, \alpha \in I)$ is uniformly integrable if and only if the following two conditions hold:
1. ($L^1$ boundedness) $\sup_{\alpha\in I} \int |f_\alpha| \, d\mu < \infty$;
2. the family $\{f_\alpha\}$ is uniformly absolutely continuous.

Proof. We first assume that $(f_\alpha, \alpha \in I)$ is uniformly integrable. For any $0 < \varepsilon < 1$, take $C$ such that
$$\sup_\alpha \int_{\{|f_\alpha| \ge C\}} |f_\alpha| \, d\mu \le \frac{\varepsilon}{2}.$$
Then, since $\mu(\Omega) < \infty$, we have $L^1$ boundedness:
$$\sup_\alpha \int |f_\alpha| \, d\mu \le \sup_\alpha \int_{\{|f_\alpha| \ge C\}} |f_\alpha| \, d\mu + \sup_\alpha \int_{\{|f_\alpha| < C\}} |f_\alpha| \, d\mu \le \frac{\varepsilon}{2} + C \mu(\Omega) < \infty.$$
Also,
$$\sup_\alpha \int_A |f_\alpha| \, d\mu \le \sup_\alpha \int_{A \cap \{|f_\alpha| \ge C\}} |f_\alpha| \, d\mu + \sup_\alpha \int_{A \cap \{|f_\alpha| < C\}} C \, d\mu \le \frac{\varepsilon}{2} + C \mu(A) \to 0$$
as $\mu(A) \to 0$. Thus $\sup_\alpha \int_A |f_\alpha| \, d\mu \to 0$ as $\mu(A) \to 0$ (uniform absolute continuity).
We next prove the converse. Assume that $(f_\alpha)$ is uniformly absolutely continuous and $L^1$ bounded. Then
$$\sup_\alpha \mu(|f_\alpha| \ge C) \le \frac{1}{C} \sup_\alpha \int |f_\alpha| \, d\mu,$$
which is small when $C$ is large. By choosing $C$ large, and using the uniform absolute continuity, we see that $\sup_\alpha \int_{\{|f_\alpha| \ge C\}} |f_\alpha| \, d\mu$ is as small as we want. □

Proposition 2.5.8 Let $f_n, f \in L^1$.
a) Suppose that $f_n \to f$ in measure and $\{f_n\}$ is uniformly integrable. Then $\int |f_n - f| \, d\mu \to 0$.
b) Suppose that $\int |f_n - f| \, d\mu \to 0$ and $\mu(\Omega) < \infty$. Then $f_n \to f$ in measure and $\{f_n\}$ is uniformly integrable.

Proof. Part a) (recall the Standard Assumption that $\mu$ is a probability measure). By uniform integrability, for any $\varepsilon > 0$ choose $C > 1$ such that
$$\sup_n \int_{\{|f_n| \ge C\}} |f_n| \, d\mu + \int_{\{|f| \ge C\}} |f| \, d\mu < \varepsilon.$$
Since $f_n \to f$ in measure, choose $N$ such that for $n > N$, $\mu(|f_n - f| > \varepsilon) < \frac{\varepsilon}{2C}$. It follows that, for $n > N$,
$$\int |f_n - f| \, d\mu \le \int_{\{|f_n - f| \le \varepsilon\}} |f_n - f| \, d\mu + \int_{\{|f_n - f| > \varepsilon\}} \big(|f_n| + |f|\big) \, d\mu$$
$$\le \varepsilon + \int_{\{|f_n| \ge C\}} |f_n| \, d\mu + \int_{\{|f| \ge C\}} |f| \, d\mu + 2\int_{\{|f_n - f| > \varepsilon\}} C \, d\mu \le \varepsilon + \varepsilon + 2C \, \mu(|f_n - f| > \varepsilon) \le 3\varepsilon,$$
where the middle step splits the integrals of $|f_n|$ and $|f|$ over $\{|\cdot| \ge C\}$ and $\{|\cdot| < C\}$. Hence $\int |f_n - f| \, d\mu \to 0$; in particular $\big| \int f_n \, d\mu - \int f \, d\mu \big| \le \int |f_n - f| \, d\mu \to 0$.

Proof of part b). The convergence in measure follows from
$$\mu(|f_n - f| > \varepsilon) \le \frac{1}{\varepsilon} \int |f_n - f| \, d\mu \to 0.$$
For the uniform integrability, we observe that
$$\int_A |f_n| \, d\mu \le \int_\Omega |f_n - f| \, d\mu + \int_A |f| \, d\mu.$$
Taking $A = \Omega$ shows that $\{f_n\}$ is $L^1$ bounded. By Lemma 2.5.6, for any $\varepsilon > 0$ there is a $\delta > 0$ such that $\mu(A) < \delta$ implies $\int_A |f| \, d\mu < \varepsilon$. This means that
$$\int_A |f_n| \, d\mu \le \int_\Omega |f_n - f| \, d\mu + \varepsilon,$$
so $\{f_n\}$ is uniformly absolutely continuous. Apply Proposition 2.5.7 to conclude the uniform integrability. □

2.6 Pushed Forward Measures, Distributions of Random Variables

Let $(\Omega, \mathcal{F}, \nu)$ be a measure space, $(S, \mathcal{B})$ a measurable space, and $f : \Omega \to S$ a measurable function. The pushed forward measure $f_*(\nu)$ on $(S, \mathcal{B})$ is defined by
$$f_*(\nu)(B) = \nu\{\omega : f(\omega) \in B\}, \qquad B \in \mathcal{B}.$$

Lemma 2.6.1 Let $\alpha : S \to \mathbb{R}$ be a bounded Borel measurable function. Then
$$\int_\Omega \alpha \circ f \, d\nu = \int_S \alpha \, d(f_*(\nu)). \tag{2.2}$$

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X : \Omega \to S$ a measurable function.
Let $\alpha : S \to \mathbb{R}$ be a bounded measurable function. Define
$$E\,\alpha(X) = \int_\Omega \alpha \circ X \, dP.$$
The measure $X_*(P)$ on $S$ is called the probability distribution, or the probability law, of $X$. Let us denote this measure by $\mu_X$, so $\mu_X(B) = P(\omega : X(\omega) \in B)$. The change of variable formula says that
$$E\,\alpha(X) = \int_S \alpha(y) \, d\mu_X(y).$$
If $S = \mathbb{R}$, then $\mu_X((-\infty, a]) = P(\omega : X(\omega) \le a)$. In particular,
$$EX = \int_S y \, d\mu_X(y), \qquad \mathrm{var}(X) = \int_S (y - EX)^2 \, d\mu_X(y).$$
If $X_1, \dots, X_n$ are real valued random variables, then $(X_1, \dots, X_n)$ is an $\mathbb{R}^n$-valued random variable. The joint distribution of $X_1, \dots, X_n$ is the measure on $\mathbb{R}^n$ induced by $(X_1, \dots, X_n)$.

2.7 Lebesgue Integrals

Definition 2.7.1 Let $(\Omega, \mathcal{F}, P)$ be a measure space. We say that $\mathcal{F}$ is complete if any subset of a measurable set of null measure is in $\mathcal{F}$. The sets in the completion of the Borel σ-algebra are said to be Lebesgue measurable.

The Lebesgue measure $L : \mathcal{B}(\mathbb{R}) \to [0, \infty]$ is determined by
$$L(A) = \inf\Big\{ \sum_j (b_j - a_j) : \bigcup_{j=1}^\infty (a_j, b_j) \supset A \Big\}, \qquad A \in \mathcal{B}(\mathbb{R}).$$
The Lebesgue measure of an interval is the length of the interval. The integrals developed in the last section for functions $f : \mathbb{R} \to \mathbb{R}$ are Lebesgue integrals.

Proposition 2.7.2 A bounded function $f : [a, b] \to \mathbb{R}$ is Riemann integrable if and only if its set of discontinuities has Lebesgue measure zero.

Proposition 2.7.3 If a bounded function $f : [a, b] \to \mathbb{R}$ is Riemann integrable, then it is Lebesgue measurable and Lebesgue integrable. Furthermore, the integrals have the common value
$$\int_{[a,b]} f \, d\mu = \int_a^b f(x) \, dx.$$

For a right continuous increasing function $F$ there is a unique Borel measure on $\mathbb{R}$, called the Lebesgue–Stieltjes measure associated to $F$, such that $\mu_F((a, b]) = F(b) - F(a)$. Furthermore,
$$\mu_F(A) = \inf\Big\{ \sum_j (F(b_j) - F(a_j)) : \bigcup_{j=1}^\infty (a_j, b_j] \supset A \Big\}.$$
It is customary to write $\int h \, dF$ for the integral $\int h \, d\mu_F$. If $F = f - g$, where $f, g$ are right continuous increasing functions, define $\mu_F = \mu_f - \mu_g$. The measure $\mu_F$ is not necessarily positive valued; it is a signed measure.
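The notation $\int h \, dF$ can be made concrete numerically: on a fine partition, the Stieltjes sums $\sum_j h(t_{j-1})\,(F(t_j) - F(t_{j-1}))$ approximate $\int h \, d\mu_F$, which for a smooth increasing $F$ equals $\int h(t) F'(t)\, dt$. The sketch below is our own illustration; the names `stieltjes_sum`, `ts` etc. are ours:

```python
import numpy as np

def stieltjes_sum(h, F, ts):
    """Approximate \\int h dF by sum_j h(t_{j-1}) * (F(t_j) - F(t_{j-1}))
    over the partition ts."""
    return float(np.sum(h(ts[:-1]) * np.diff(F(ts))))

# F(t) = t^2 is increasing on [0, 1], with mu_F((a, b]) = b^2 - a^2, and
# \int_0^1 h dmu_F = \int_0^1 h(t) * 2t dt.  Take h(t) = t:
ts = np.linspace(0.0, 1.0, 10_001)
approx = stieltjes_sum(lambda t: t, lambda t: t ** 2, ts)
exact = 2.0 / 3.0   # \int_0^1 t * 2t dt
assert abs(approx - exact) < 1e-3
```

The same routine applied to a step function $F$ would put point masses at the jumps, which is exactly how $\mu_F$ behaves on atoms.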
Definition 2.7.4 Let $f, g : [a, b] \to \mathbb{R}$ be bounded functions. We say $f$ is Riemann–Stieltjes integrable with respect to $g$ if there is a number $l$ such that for all $\varepsilon > 0$ there is a $\delta > 0$ such that for all partitions $\Delta : a = t_0 < t_1 < \dots < t_n = b$ with mesh $|\Delta| := \max_{1\le i\le n}(t_i - t_{i-1}) < \delta$,
$$\Big| \sum_{j=1}^n f(t_j^*) \, [g(t_j) - g(t_{j-1})] - l \Big| < \varepsilon,$$
where $t_j^*$ is any point in $[t_{j-1}, t_j]$. If so, we define $l$ to be the Riemann–Stieltjes integral of $f$ with respect to $g$, denoted by $\int_a^b f(t) \, dg(t)$.

Take a function $F$ that is absolutely continuous on $[a, b]$. Then
$$F(x) = F(a) + \int_a^x g(t) \, dt,$$
with $g$ Lebesgue integrable; $F$ is differentiable at almost every point, and $F' = g$ almost everywhere. In this case,
$$\int_a^b h(x) \, \mu_F(dx) = \int_a^b h(x) F'(x) \, dx.$$

A right continuous increasing function $F$ on $[0, \infty)$ has finite total variation on each bounded interval; we may assume that $F(0) = 0$. The measure $\mu_F$ is absolutely continuous with respect to the Lebesgue measure if and only if $F$ is absolutely continuous.

2.8 Total variation

I will relate this to semi-martingales later.

Definition 2.8.1 Denote by $P([a, b])$ the set of all partitions $a = t_0 < t_1 < \dots < t_n = b$ of $[a, b]$. The total variation of a function $g$ on $[a, b]$ is defined by
$$g_{TV([a,b])} = \sup_{\Delta \in P([a,b])} \sum_j |g(t_j) - g(t_{j-1})|.$$
If $g_{TV([a,b])}$ is finite we say $g$ is of bounded variation on $[a, b]$.

If $f : \mathbb{R} \to \mathbb{R}$ is increasing, then it has left and right limits at every point and only a countable number of points of discontinuity; furthermore $f$ is differentiable almost everywhere.

For a function $f$, define $f^+(x) = \max(f(x), 0)$, its positive part, and $f^-(x) = -\min(f(x), 0)$, the absolute value of its negative part. Note that
$$|f(t_{i+1}) - f(t_i)| = [f(t_{i+1}) - f(t_i)]^+ + [f(t_{i+1}) - f(t_i)]^-,$$
whence the name "total variation".

Proposition 2.8.2 A real valued function on the interval $[a, b]$ is of bounded variation if and only if it is the difference of two monotone functions.
Such a function is differentiable almost everywhere with respect to the Lebesgue measure.

Proposition 2.8.3 If $f$ is an increasing function, then its derivative exists almost everywhere. Furthermore $f'$ is Borel measurable.

Example 2.8.4 Let $f$ be a real valued integrable function defined on $[a, b]$.
1. Its indefinite integral $\int_a^x f(t) \, dt$ is of bounded variation and continuous.
2. The derivative of its indefinite integral equals $f$ almost everywhere.

Chapter 3

Lectures

3.1 Lectures 1-2

The objectives of the course are: at the end of the course we should be able to understand and work with martingales and with stochastic differential equations. Here is a stochastic differential equation (SDE) of Markovian type on $\mathbb{R}^d$, in differential form:
$$dx_t = \sum_{i=1}^m \sigma_i(t, x_t) \, dB_t^i + b(t, x_t) \, dt. \tag{3.1}$$
Here $(B_t^1, \dots, B_t^m)$ is an $\mathbb{R}^m$-valued "Brownian motion" on a "filtered probability space" $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$. Each $x_t$ is a function from the probability space $\Omega$ to $\mathbb{R}^d$. We assume that each $\sigma_i : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$ and $b : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$ are Borel measurable. Suitable conditions will be introduced to ensure that there is a "solution" and that "uniqueness of solutions" holds.

If $\sigma_i \equiv 0$ for all $i$, this is an ordinary differential equation: $\dot{x}_t = b(t, x_t)$. Please review results concerning ODEs, in particular the existence and uniqueness theorem which you learnt.

The integral form of the SDE (3.1), given below, is more instructive:
$$x_t = x_0 + \sum_{i=1}^m \int_0^t \sigma_i(s, x_s) \, dB_s^i + \int_0^t b(s, x_s) \, ds.$$
We will soon define "stochastic integrals". The notation $\int_0^t \sigma_i(s, x_s) \, dB_s^i$ denotes the stochastic integral, an Itô integral in this case, of $\sigma_i(s, x_s)$ with respect to the one dimensional Brownian motion $(B_s^i)$.

Let $(f_t)$ be a suitable stochastic process, $f_t : \Omega \to \mathbb{R}$, and $(B_t)$ a Brownian motion.
The “stochastic integral ∫_0^t f_s dB_s” is a “local martingale”. We will need to study martingales, local martingales and semi-martingales. By the “integral representation theorem for martingales”, all “sample continuous” local martingales are stochastic integrals of the form above. The Clark–Ocone formula, a popular formula in Malliavin calculus, gives an explicit formula for the integrand. The L² chaos expansion takes this further.

You might want to ask why we study SDEs. We will be able to answer this question better later in the course and at the end of the course. For the moment just believe that SDEs are popular and useful.

3.2 Notation

In this section we fix the notation. Let (Ω, F) be a measurable space: Ω is a set and F is a σ-algebra of subsets of Ω. By “F is a σ-algebra” we mean that: ∅ ∈ F; the complement A^c of a set from F is in F; the union ∪_{i=1}^∞ A_i of a sequence of sets A_i ∈ F is in F. A signed measure µ on the measurable space is a function from F to R such that µ(∅) = 0 and µ(∪_{n=1}^∞ A_n) = ∑_{n=1}^∞ µ(A_n) whenever {A_n}_{n=1}^∞ is pairwise disjoint. Unless otherwise stated, by a measure we mean one that takes non-negative values. The measure is said to be finite if µ(Ω) < ∞, and it is a probability measure if µ(Ω) = 1. A finite measure can always be normalised to a probability measure: µ̃(A) := µ(A)/µ(Ω). A measure is σ-finite if there is a countable cover of Ω by measurable sets of finite measure.

In this paragraph we briefly review metric spaces. The restriction of d to a subset A of a metric space is a distance function on A. A subset of a metric space is complete if any Cauchy sequence in it converges to a point of the set. A closed subset of a complete metric space is complete. A metric space is separable if it has a countable dense set. The product metric on (X_1, d_1) and (X_2, d_2) is max(d_1, d_2).
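Returning to the SDE (3.1): its precise meaning is fixed only once stochastic integrals are defined, but a minimal Euler–Maruyama discretisation sketch (with illustrative coefficients not taken from the notes: d = m = 1, σ(t, x) = 0.2, b(t, x) = −x, an Ornstein–Uhlenbeck type equation) shows how the integral form is used in practice.

```python
import numpy as np

# Euler-Maruyama sketch for dx_t = b(t, x_t) dt + sigma(t, x_t) dB_t:
# replace dB_t by independent N(0, dt) increments on a grid.
rng = np.random.default_rng(0)

def euler_maruyama(x0, b, sigma, T=1.0, n=1000):
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        dB = rng.normal(0.0, np.sqrt(dt))   # Brownian increment over [k dt, (k+1) dt]
        x[k + 1] = x[k] + b(k * dt, x[k]) * dt + sigma(k * dt, x[k]) * dB
    return x

path = euler_maruyama(1.0, b=lambda t, x: -x, sigma=lambda t, x: 0.2)
```

The scheme itself presupposes nothing about stochastic integration; convergence of such schemes is a separate matter not treated here.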
Definition 3.2.1
• A collection T of subsets of a set X is a topology if ∅ ∈ T, X ∈ T, and T is closed under arbitrary unions and finite intersections. Sets from T are called open sets. Complements of open sets are closed sets.
• The Borel σ-algebra of a topological space X is the smallest σ-algebra that contains all open sets and is denoted by B(X). Elements of the Borel σ-algebra are Borel sets. Borel sets include: open sets, closed sets, and countable unions and intersections of closed sets.

Example 3.2.2 Let (X, d) be a metric space. The collection of open sets defined by the distance is the metric topology on X. The metric topology is determined by the open balls. The following are examples of metric spaces: [0, 1], R^n, Banach spaces such as L^p((Ω, µ); R), Hilbert spaces, and any finite dimensional manifold. In this course we are concerned only with metric spaces that are separable and complete. Single points {x} = ∩_{n=1}^∞ B(x, 1/n) are Borel sets.

Definition 3.2.3
• Let (X_1, F_1) and (X_2, F_2) be measurable spaces. A function f : X_1 → X_2 is measurable if the pre-image of a measurable set is measurable: f^{−1}(A) ∈ F_1 if A ∈ F_2.
• Let (X_1, τ_1) and (X_2, τ_2) be topological spaces. A function f : X_1 → X_2 is continuous if the pre-image of an open set is open: f^{−1}(U) ∈ τ_1 if U ∈ τ_2.

Proposition 3.2.4 All continuous maps are Borel measurable.

Proof For i = 1, 2, let (X_i, τ_i) be topological spaces and B_i the Borel σ-algebras. Let f : X_1 → X_2 be a continuous function. Let

f^{−1}(B_2) = {f^{−1}(A) : A ∈ B_2},  f^{−1}(τ_2) = {f^{−1}(A) : A ∈ τ_2}.

Then f^{−1}(B_2) is a σ-algebra and f^{−1}(τ_2) ⊂ τ_1 ⊂ B_1 by the continuity of f. It is also easy to show that σ(f^{−1}(τ_2)) = f^{−1}(B_2).

Definition 3.2.5 Two complete separable metric spaces are measurably isomorphic if there is a bijection φ between them such that both φ and φ^{−1} are measurable.
Definition 3.2.6 The space ([0, 1], B([0, 1])), and any measurable space isomorphic to it, is a standard Borel space.

Let (X, F) be a measurable space. A Borel set E inherits a σ-algebra by restriction: {E ∩ A : A ∈ F}.

Theorem 3.2.7 (Theorem 2.13, [21]) Let X_i, i = 1, 2, be complete separable metric spaces, and E_i ⊂ X_i Borel sets. Then E_1 and E_2, with the restricted Borel σ-algebras, are measurably isomorphic if and only if they have the same cardinality.

The cardinality of [0, 1]^Z is that of the continuum. The cardinality of C([0, 1]; R^d) is that of the continuum. This is because continuous functions are determined by their values at the rational numbers (a countable set), and hence the cardinality of the set of continuous functions is at most that of R^Q, a countable product of copies of R. The cardinality of L^p(R; R) is that of the continuum: first consider L^p spaces of functions [2πn, 2π(n + 1)] → R. The Fourier series of a function in L^p converges to it in L^p, hence its values are determined by a countable number of quantities (its Fourier coefficients) together with the sines and cosines with respect to which the expansion is made.

For non-Borel σ-algebras a more general concept exists. The moral of this story is: “most” probability spaces without atoms are “measurably isomorphic” to the standard Borel space. All “reasonable” measure spaces that are rich enough to support a Brownian motion are standard Borel spaces.

3.3 The Wiener Spaces

Define
W^d ≡ C([0, 1]; R^d) := {ω : [0, 1] → R^d continuous}
and
W_0^d ≡ C_0([0, 1]; R^d) := {ω : [0, 1] → R^d continuous, ω(0) = 0}.

Then W^d, with the uniform norm

‖ω‖ = sup_{0≤t≤1} |ω(t)|,

is a separable Banach space, and W_0^d is a closed subspace. With appropriate modification the discussion below applies to both W^d and W_0^d. The cardinality of both spaces is that of the continuum; hence they, with their Borel σ-algebras, are standard Borel spaces.

Let {ω_i} be a dense set of W_0^d, and let {t_k} be an enumeration of Q ∩ [0, 1].
Then the topology of W_0^d is determined by the balls {ω ∈ W_0^d : ‖ω − ω_i‖ < 1/n}, which form a base of the topology, and for each k

{ω ∈ W_0^d : |ω(t_k) − ω_i(t_k)| < 1/n} = W_0^d ∩ {ω ∈ (R^d)^{[0,1]} : |ω(t_k) − ω_i(t_k)| < 1/n}.   (3.2)

A cylindrical set is of the form

{ω ∈ W_0^d : ω(t_1) ∈ A_1, . . . , ω(t_k) ∈ A_k} = ∩_{i=1}^k {ω(t_i) ∈ A_i},

where A_i ∈ B(R^d) and 0 ≤ t_1 < t_2 < · · · < t_k ≤ 1. The collection Cyl of cylindrical sets generates B(W_0^d) and is measure determining (two finite measures on (W_0^d, B(W_0^d)) agreeing on Cyl agree). In fact any π-system that generates a σ-algebra is measure determining.

Denote by p_t the heat kernel on R^d: for x, y ∈ R^d,

p_t(x, y) = (2πt)^{−d/2} e^{−|x−y|²/(2t)}.

Then p_t(x, y) dy is a probability measure. Let π_t : W^d → R^d be the projection, π_t(ω) = ω(t).

Theorem 3.3.1 There is a unique probability measure µ on (W_0^d, B(W_0^d)) such that for 0 < t_1 < t_2 < · · · < t_n ≤ 1 and A_1, . . . , A_n ∈ B(R^d),

µ(ω : π_{t_1}(ω) ∈ A_1, . . . , π_{t_n}(ω) ∈ A_n)
 = ∫_{A_1} · · · ∫_{A_n} p_{t_1}(0, x_1) p_{t_2−t_1}(x_1, x_2) · · · p_{t_n−t_{n−1}}(x_{n−1}, x_n) ∏_{i=1}^n dx_i.   (3.3)

This measure is commonly known as the Wiener measure. The space (W_0^d, B(W_0^d), µ) is the Wiener space. We will give a proof of the existence in section 3.8.1. The process (π_t) on the probability space (W_0^d, B(W_0^d), µ) is a Brownian motion, which we define shortly.

Note. The interval [0, 1] can be replaced by [0, T] where T is a positive number.

3.4 Lecture 3. The Pushed Forward Measure

Let (X, B) and (Y, G) be two measurable spaces and µ a measure on (X, B). A measurable map φ : X → Y induces a measure φ_*µ on (Y, G) such that for any C ∈ G,

(φ_*µ)(C) = µ({x : φ(x) ∈ C}).

Denote φ^{−1}(C) = {x : φ(x) ∈ C}.

Lemma 3.4.1 Let f : Y → R be in L¹(Y, φ_*µ). Then

∫_X f(φ(x)) dµ(x) = ∫_Y f(y) d(φ_*µ)(y).

Idea of Proof: First take f = ∑_{i=1}^n a_i 1_{A_i}; then

∫_Y f(y) d(φ_*µ) = ∫_Y ∑_{i=1}^n a_i 1_{A_i} d(φ_*µ) = ∑_{i=1}^n a_i (φ_*µ)(A_i) = ∑_{i=1}^n a_i µ(φ^{−1}(A_i)).
On the other hand,

∫_X f(φ(x)) dµ(x) = ∫_X ∑_{i=1}^n a_i 1_{A_i}(φ(x)) dµ(x) = ∑_{i=1}^n a_i µ({x : φ(x) ∈ A_i}).

Next take f a positive function and take an increasing sequence of simple functions converging to f:

f_n(x) = ([2^n f(x)]/2^n) ∧ n = ∑_k (k/2^n) 1_{A_k}(x),

where A_k = {x : f(x) ∈ [k/2^n, (k + 1)/2^n)}. Note that [2^n f(x)] = k if k ≤ 2^n f(x) < k + 1. For f ∈ L¹ consider the integrals of f^+ and f^− separately.

Suppose φ : U ⊂ R^d → φ(U) ⊂ R^d is a diffeomorphism onto its image, and let dx denote the Lebesgue measure. Then φ^{−1} induces the pushed forward measure (φ^{−1})_*(dx), the push-back of the Lebesgue measure by φ. If f : R^d → R is any bounded measurable function,

∫_{φ(U)} f(x) dx = ∫_U f(φ(x)) d(φ^{−1})_*(dx).

On the other hand, by the change of variables formula,

∫_{φ(U)} f(x) dx = ∫_U f(φ(x)) |det Tφ(x)| dx.

Taking, for example, f = 1_{φ(A)} for a Borel set A, so that f(φ(x)) = 1_A(x), this shows that

d(φ^{−1})_*(dx)/dx = |det Tφ(x)|.

3.5 Basics of Stochastic Processes

Let (Ω, F, P) be a measure space with P(Ω) = 1. Two measurable sets A and B are independent if P(A ∩ B) = P(A)P(B).

Definition 3.5.1
1. Let {F_α, α ∈ Λ} be a family of σ-algebras. We say that {F_α, α ∈ Λ} are independent if for any finite index set {α_1, . . . , α_n} ⊂ Λ and any sets A_1 ∈ F_{α_1}, . . . , A_n ∈ F_{α_n},
P(∩_{i=1}^n A_i) = ∏_{i=1}^n P(A_i).
2. A family of random variables are mutually independent if the σ-algebras generated by them are mutually independent.

Let (Ω, F, P) be a probability space. If X : Ω → R^d is measurable (i.e. a random variable), then X_*P is the probability distribution of X and is denoted by µ_X. And

EX = ∫_Ω X dP = ∫_{R^d} y dµ_X(y).

Let π_i : R^n → R be the projection to the ith component. The tensor σ-algebra ⊗^n B(R) is

⊗^n B(R) = σ{π_i^{−1}(A) : A ∈ B(R)}.

For example π_1^{−1}(A) = A × R × · · · × R. If each X_i is an R-valued measurable function, then (X_1, . . . , X_n) : Ω → R^n is measurable with respect to ⊗^n B(R). The joint distribution of (X_1, . . . , X_n) is the measure on R^n pushed forward by the map (X_1, . . . , X_n).
Definition 3.5.2 The random variables {X_1, . . . , X_n} are independent if

µ_{(X_1,...,X_n)} = µ_{X_1} ⊗ · · · ⊗ µ_{X_n}.

Independence holds if the two measures in the identity above agree on cylindrical sets:

P(X_1 ∈ A_1, . . . , X_n ∈ A_n) = ∏_{i=1}^n P(X_i ∈ A_i),  A_i ∈ B.

Equivalently, for any g_i : R → R bounded measurable,

E(∏_{i=1}^n g_i(X_i)) = ∏_{i=1}^n E g_i(X_i).

Let (Ω, F, P) be a probability space. Let S be a metric space, considered to be endowed with its Borel σ-algebra B unless otherwise stated. In this course we take S = R^d. Let T be a subset of R, e.g. R, [0, 1], [a, b], [a, b), or {1, 2, . . . }.

Definition 3.5.3 A function X : Ω × T → S is a stochastic process if for each t ∈ T, X(·, t) : (Ω, F) → (S, B) is measurable. The process is denoted by (X_t, t ∈ T), or (X_t) for short. A point t ∈ T is referred to as time, and the stochastic process is said to be a continuous time stochastic process if T is an interval. If T = Z we have a discrete time stochastic process, denoted by (X_n).

Example 3.5.4
• Take Ω = [0, 1], F = B([0, 1]), and P the Lebesgue measure. Take T = {1, 2, . . . } and define X_n(ω) = ω^n. These are continuous functions from [0, 1] to R and hence measurable.
• Take T = [0, 3]. Let X, Y : Ω → R be two random variables. Then X_t(ω) = X(ω)1_{[0, 1/2]}(t) + Y(ω)1_{(1/2, 3]}(t) is a stochastic process.

Definition 3.5.5 A stochastic process (X_t, t ∈ T) is said to have independent increments if for any n and any 0 = t_0 < t_1 < · · · < t_n, t_i ∈ T, the increments {X_{t_{i+1}} − X_{t_i}}_{i=0}^{n−1} are independent.

Definition 3.5.6 A sample continuous stochastic process (B_t : t ≥ 0) on R¹ is the standard Brownian motion if B_0 = 0 and the following hold:
1. For 0 ≤ s < t, the distribution of B_t − B_s is N(0, t − s).
2. (B_t) has independent increments.

Definition 3.5.7 A stochastic process (X_t, t ≥ 0) with state space S is said to be sample path continuous if t ↦ X_t(ω) is continuous for almost all ω.
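Definition 3.5.6 can be probed by simulation. A sketch (not part of the notes): build a discretised standard Brownian motion from independent N(0, dt) increments and check the two defining properties empirically, namely B_t − B_s ∼ N(0, t − s) and the uncorrelatedness of increments over disjoint intervals.

```python
import numpy as np

# Simulate many discretised Brownian paths on [0, 1] and test the
# increment properties of Definition 3.5.6 statistically.
rng = np.random.default_rng(0)
n_paths, n_steps, T = 50000, 100, 1.0
dt = T / n_steps

increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(increments, axis=1)          # B[:, k] approximates B_{(k+1) dt}

inc1 = B[:, 49] - B[:, 24]                 # B_{0.5} - B_{0.25}, variance 0.25
inc2 = B[:, 99] - B[:, 49]                 # B_{1.0} - B_{0.5}, disjoint interval
var1 = np.var(inc1)                        # should be close to t - s = 0.25
corr = np.corrcoef(inc1, inc2)[0, 1]       # should be close to 0
```

Uncorrelatedness is of course weaker than independence in general; for these jointly Gaussian increments the two coincide, as discussed in section 3.7.2.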
The terminology time continuous can also be used. Remark: another regularly encountered class of stochastic processes is the class of càdlàg processes: for almost all ω, the sample path t ↦ X_t(ω) has left limits and is right continuous at every time t. Such functions have jumps at their points of discontinuity.

Definition 3.5.8 A stochastic process (X_t) is said to have stationary increments if the distributions of X_t − X_s and X_{t+a} − X_{s+a} are the same for all a > 0 and 0 ≤ s < t.

3.6 Brownian Motion

A Brownian motion models randomness with the following properties: the randomness over disjoint time intervals is independent, and it is Gaussian. Let I = [0, 1] or I = [0, ∞).

Definition 3.6.1 A stochastic process (B_t : t ≥ 0) on R^d is a Brownian motion with initial value x if the following hold:
1. B_0 = x.
2. t ↦ B_t(ω) is continuous for almost all ω.
3. For s ≥ 0 and t > 0, the distribution of B_{t+s} − B_s is N(0, tI).
4. (B_t) has independent increments, i.e. for any 0 = t_0 < t_1 < · · · < t_k, the family {B_{t_{i+1}} − B_{t_i}}_{i=0}^{k−1} is a family of independent random variables.

If B_0 = 0 and d = 1 this is the standard (linear) Brownian motion.

Remark*: If A is a d × d matrix and a ∈ R^d, then AB_t + at is N(at, AA^T t) distributed. Fix t > 0 and observe that

B_t − B_0 = ∑_{i=0}^{2^n−1} (B_{(i+1)t/2^n} − B_{it/2^n}).

Assume that the increments {B_{(i+1)t/2^n} − B_{it/2^n}} are independent and identically distributed. This suggests that B_t ∼ N(0, t): indeed, for each t, B_t then has an infinitely divisible law and (B_t) is a Lévy process. Together with sample continuity this forces a Gaussian distribution N(at, tC), for some a ∈ R^d and some positive semi-definite symmetric matrix C. Up to a linear transformation the process is indeed a Brownian motion.

3.7 Lecture 4-5

Proposition 3.7.1 The joint distribution of (B_s, B_t), s < t, for a standard Brownian motion has density p_s(0, x) p_{t−s}(x, y) with respect to the Lebesgue measure dx dy on (R^d)².
Proof For any A_1, A_2 ∈ B(R^d),

P(B_s ∈ A_1, B_t ∈ A_2) = E 1_{A_1}(B_s) 1_{A_2}(B_t)
 = E 1_{A_1}(B_s) 1_{A_2}((B_t − B_s) + B_s)
 = ∫_{R^d} ∫_{R^d} 1_{A_1}(x) 1_{A_2}(z + x) p_s(0, x) p_{t−s}(0, z) dz dx
 = ∫_{R^d} ∫_{R^d} 1_{A_1}(x) 1_{A_2}(y) p_s(0, x) p_{t−s}(0, y − x) dy dx
 = ∫_{A_1×A_2} p_s(0, x) p_{t−s}(x, y) dy dx.

3.7.1 Finite Dimensional Distributions

Let I = [0, T]. Let (X_t, 0 ≤ t ≤ T) be a stochastic process with values in a separable metric space (S, B(S)).

Definition 3.7.2 For 0 ≤ t_1 < t_2 < · · · < t_n, the measurable map

ω ↦ (X_{t_1}(ω), X_{t_2}(ω), . . . , X_{t_n}(ω))

from (Ω, F) to (S^n, B(S^n)) induces a Borel measure µ_{t_1,...,t_n} on S^n. These are the finite dimensional distributions of the stochastic process (X_t); they are also known as marginal distributions. For A ∈ B(S^n),

µ_{t_1,...,t_n}(A) = P(ω : (X_{t_1}(ω), X_{t_2}(ω), . . . , X_{t_n}(ω)) ∈ A).

Exercise. Compute the finite dimensional distributions of a Brownian motion.

Given a stochastic process (X_t), we define a map X_· : Ω → (R^d)^{[0,T]} by X_·(ω)(t) = X_t(ω). It is a measurable map. The measure (X_·)_*(P) is the probability distribution (also called the law) of the process. Finite dimensional distributions determine the distribution of the process.

3.7.2 Gaussian Measures on R^d

Given a measure µ, its characteristic function is

µ̂(λ) := ∫ e^{i⟨λ,x⟩} dµ(x),  λ ∈ R^d.

Two finite measures agree if their characteristic functions agree; see §3.8 (page 197), volume I of [3]. A probability measure is Gaussian if for some a ∈ R^d and some symmetric positive semi-definite d × d matrix C,

µ̂(λ) = e^{i⟨λ,a⟩ − ½⟨Cλ,λ⟩}.

This is denoted by N(a, C). The measure is absolutely continuous with respect to the Lebesgue measure if and only if C is invertible. Restricting to the subspace spanned by the eigenvectors of C with non-zero eigenvalues, we see that the Gaussian measure is non-degenerate on this subspace; hence there is little need to study degenerate Gaussian measures on a finite dimensional vector space separately. Let a ∈ R^d and let C be a positive definite d × d matrix.
The (non-degenerate) Gaussian measure N(a, C) on R^d is defined as the measure that is absolutely continuous with respect to dx with density

dµ/dx = (2π)^{−d/2} (det C)^{−1/2} e^{−½⟨C^{−1}(x−a), x−a⟩}.   (3.4)

Taking C = tI_{d×d}, the diagonal matrix with entries t, gives

dµ/dx = (2πt)^{−d/2} e^{−|x−a|²/(2t)} = p_t(a, x).

Definition 3.7.3 A random variable whose distribution is Gaussian is a Gaussian random variable. A stochastic process (X_t) is Gaussian if its marginal distributions are Gaussian.

Proposition 3.7.4 Let X = (X_1, . . . , X_d) ∼ N(a, C) where C = (C_{kl}). Then EX = a and cov(X_k, X_l) = C_{kl}.

Proof If C is diagonal this is easy to see. Otherwise, let A be a square root of C, C = AA^T. Then det C = (det A)² and ⟨C^{−1}ξ, ξ⟩ = |A^{−1}ξ|² for any ξ ∈ R^d. Let {e_j} be the standard orthonormal basis of R^d. Then

EX = ∫_{R^d} x (2π)^{−d/2} (det C)^{−1/2} e^{−½⟨C^{−1}(x−a), x−a⟩} dx
 = ∫_{R^d} (y + a) (2π)^{−d/2} (det C)^{−1/2} e^{−½⟨C^{−1}y, y⟩} dy    (y = x − a)
 = ∫_{R^d} (Az + a) (2π)^{−d/2} e^{−½|z|²} dz = a,    (z = A^{−1}y, dy = det(A) dz)

since ∫ Az e^{−½|z|²} dz = 0. The key ingredient is that the measure is turned into a product of measures on each factor, and Az is a linear function of z. In components, ⟨Az + a, e_k⟩ = ∑_j A_{k,j} z_j + a_k, so

EX_k = E⟨X, e_k⟩ = ∫_{R^d} (∑_j A_{k,j} z_j + a_k) (2π)^{−d/2} e^{−½∑_i z_i²} ∏ dz_i = a_k.

Similarly,

cov(X_k, X_l) = E⟨X − a, e_k⟩⟨X − a, e_l⟩
 = ∫ ⟨x − a, e_k⟩⟨x − a, e_l⟩ (2π)^{−d/2} (det C)^{−1/2} e^{−½⟨C^{−1}(x−a), x−a⟩} dx
 = ∫_{R^d} ⟨Az, e_k⟩⟨Az, e_l⟩ (2π)^{−d/2} e^{−½|z|²} dz
 = ∫_{R^d} (∑_j A_{k,j} z_j)(∑_i A_{l,i} z_i) (2π)^{−d/2} e^{−½|z|²} dz
 = ∑_i A_{k,i} A_{l,i} ∫_{R^d} z_i² (2π)^{−d/2} e^{−½|z|²} dz = (AA^T)_{k,l} = C_{k,l}.

A family of real valued random variables {X_1, . . . , X_k} are uncorrelated if cov(X_j, X_k) = 0 when j ≠ k. For Gaussian random variables, being uncorrelated and being independent are equivalent. In fact if C_{i,j} = 0 when i ≠ j, then C is diagonal and the Gaussian measure is a product measure.
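The proof's substitution X = AZ + a, with C = AA^T and Z standard Gaussian, is also how one samples N(a, C) in practice. A sketch (not part of the notes, using a Cholesky factor as the square root A) checking Proposition 3.7.4 empirically:

```python
import numpy as np

# Sample N(a, C) as X = A Z + a with C = A A^T, then verify
# EX = a and cov(X_k, X_l) = C_{kl} from the samples.
rng = np.random.default_rng(3)
a = np.array([1.0, -2.0])
C = np.array([[2.0, 0.5],
              [0.5, 1.0]])               # symmetric positive definite
A = np.linalg.cholesky(C)                # lower triangular, C = A A^T

Z = rng.standard_normal((500000, 2))     # rows are standard Gaussian vectors
X = Z @ A.T + a                          # rows are samples of N(a, C)

emp_mean = X.mean(axis=0)
emp_cov = np.cov(X, rowvar=False)
```

Any square root A of C works here; Cholesky is merely a convenient choice for positive definite C.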
A family of stochastic processes are independent if their joint probability distribution is the product of the individual probability distributions. This can be tested by evaluation at a finite number of times, i.e. by examining their finite dimensional distributions.

Remark 3.7.5 A process (B_t) = (B_t^1, . . . , B_t^d) is a d-dimensional Brownian motion if and only if the components (B_t^1), . . . , (B_t^d) are independent one dimensional Brownian motions. For each t, {B_t^1, . . . , B_t^d} are independent random variables, since C = tI_{d×d}. The independence of the component processes follows from this and the independent increment property.

3.8 Kolmogorov’s Continuity Theorem

In the following definitions, the Euclidean space R^d can be replaced by a Banach space.

Definition 3.8.1 Let α ∈ (0, 1) and let I be an interval of R. A function f : I → R^d is Hölder continuous of exponent α if there is a constant C such that for all s, t ∈ I,

|f(t) − f(s)| ≤ C|t − s|^α.

Definition 3.8.2 Let α ∈ (0, 1) and let I be an interval of R. A function f : I → R^d is locally Hölder continuous of exponent α if on any compact subinterval [a, b] ⊂ I,

sup_{t≠s, t,s∈[a,b]} |f(t) − f(s)| / |t − s|^α < ∞.

Let T ∈ R_+ ∪ {∞}.

Theorem 3.8.3 (Kolmogorov’s Continuity Theorem) Let (x_t, 0 ≤ t < T) be a stochastic process taking values in a Banach space (E, ‖·‖). Suppose that there exist positive constants p, δ and C such that

E‖x_t − x_s‖^p ≤ C|t − s|^{1+δ}.

Then there is a continuous modification of (x_t), denoted by (x̃_t), which is locally Hölder continuous with exponent α for any α ∈ (0, δ/p).

For a proof, and a refined version of this theorem (due to Garsia–Rodemich–Rumsey 1970), see §2.1 (pages 47-51) of Stroock–Varadhan [27].

Definition 3.8.4 Two stochastic processes (X_t) and (Y_t) on the same probability space are modifications of each other if for each t, P(X_t = Y_t) = 1. Here the exceptional set {ω : X_t(ω) ≠ Y_t(ω)}, of zero measure, may depend on t.
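The moment hypothesis of Theorem 3.8.3 can be checked numerically for Brownian motion. A sketch (not part of the notes): since B_t − B_s ∼ N(0, t − s), we have E|B_t − B_s|⁴ = 3(t − s)², so the hypothesis holds with p = 4 and δ = 1, giving Hölder exponents up to δ/p = 1/4 (and up to 1/2 as p grows).

```python
import numpy as np

# Empirically verify E|B_t - B_s|^4 = 3 (t - s)^2 for a Brownian increment,
# the p = 4 moment bound feeding Kolmogorov's continuity theorem.
rng = np.random.default_rng(4)
h = 0.1                                           # h = t - s
samples = rng.normal(0.0, np.sqrt(h), 2000000)    # B_t - B_s ~ N(0, h)
emp_fourth = np.mean(samples ** 4)                # should be ~ 3 h^2 = 0.03
```

The same experiment with any even p recovers E|B_t − B_s|^p = E(|B_1|^p)|t − s|^{p/2}, the computation in Theorem 3.8.6 below.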
Definition 3.8.5 Two stochastic processes (X_t) and (Y_t) on the same probability space are indistinguishable from each other if P(X_t = Y_t, ∀t) = 1.

Theorem 3.8.6 Let (x_t, 0 ≤ t < T) be a real valued stochastic process such that x_t − x_s is N(0, t − s) distributed. It has a continuous modification, and the paths t ↦ x̃_t(ω) of the modification are Hölder continuous on any interval [a, b] ⊂ [0, T) of exponent α for any α < 1/2.

Proof For any p > 0,

E|x_t − x_s|^p = (2π(t − s))^{−1/2} ∫_{−∞}^{∞} |x|^p e^{−|x|²/(2(t−s))} dx
 = |t − s|^{p/2} (2π)^{−1/2} ∫_{−∞}^{∞} |z|^p e^{−|z|²/2} dz    (x = √(t−s) z)
 = E(|x_1|^p) |t − s|^{p/2} < ∞.

By Kolmogorov’s continuity criterion there is a continuous modification of the sample path. The Hölder exponent α = (p/2 − 1)/p = (p − 2)/(2p) can be taken arbitrarily close to 1/2 by taking p large.

3.8.1 The Existence of the Wiener Measure and Brownian Motion

Fix T > 0. We come back to prove Theorem 3.3.1, which states that there is a unique probability measure on W^d such that

µ(ω : π_{t_1}(ω) ∈ A_1, . . . , π_{t_n}(ω) ∈ A_n)
 = ∫_{A_1} · · · ∫_{A_n} p_{t_1}(0, x_1) p_{t_2−t_1}(x_1, x_2) · · · p_{t_n−t_{n−1}}(x_{n−1}, x_n) dx_n . . . dx_2 dx_1.   (3.5)

We used π_t(ω) = ω_t to denote the evaluation map at t. By Kolmogorov’s extension theorem there is a unique probability measure µ_0 on the product σ-algebra of (R^d)^{[0,T]}; see §3.9.2 for detail. Treat ((R^d)^{[0,T]}, ⊗_{[0,T]} B(R^d), µ_0) as a probability space, on which π_t : (R^d)^{[0,T]} → R^d is a measurable function. Since π_t − π_s ∼ N(0, t − s), by Theorem 3.8.6 (π_t) has a continuous modification, and hence the set of non-continuous paths is contained in a set of µ_0-measure zero.

The product σ-algebra ⊗_{[0,T]} B(R^d) is smaller than the Borel σ-algebra of the product topology: since the projections π_t are continuous, the Borel σ-algebra of the product topology contains the product σ-algebra. Continuous functions are determined by their values at rational numbers, and on W^d this yields B(W^d) = W^d ∩ ⊗_{[0,T]} B(R^d). However, W^d (or W_0^d) itself is not a measurable set in the tensor σ-algebra! We refer to Revuz–Yor [23].
Corollary 3.8.7 There is a measurable map φ : ((R^d)^{[0,T]}, ⊗_{[0,T]} B(R^d)) → (W_0^d, B(W_0^d)) such that the measure µ = φ_*µ_0 satisfies (3.5).

In conclusion, the Wiener measure exists and Brownian motion exists. In fact (π_t) is a Brownian motion on the Wiener space.

The map φ is constructed as follows. If ω is not uniformly continuous on Q ∩ [0, T] we define φ(ω) ≡ 0; let Ω_0 be the set of such ω, which is clearly measurable. If ω is uniformly continuous on Q ∩ [0, T] we define φ(ω) to be the continuous function that agrees with ω on Q ∩ [0, T]. Now for Borel sets A_i and t_i ∈ Q,

{ω : φ(ω)_{t_i} ∈ A_i} = {ω : ω_{t_i} ∈ A_i}  or  {ω : φ(ω)_{t_i} ∈ A_i} = {ω : ω_{t_i} ∈ A_i} ∪ Ω_0.

Both are measurable sets. By Theorem 3.8.6, µ_0(Ω_0) = 0, so on cylindrical sets with rational times φ_*µ_0 agrees with µ_0.

3.8.2 Lecture 6. Sample Properties of Brownian Motions

We wish to define ∫_0^t f_s dB_s. Could we use the theory of Lebesgue–Stieltjes integration? To make sense of ∫_a^b f_s dg_s as a Lebesgue–Stieltjes integral, g is assumed to be of finite total variation on [a, b]. Functions of finite total variation are differentiable almost everywhere, while a typical Brownian path is nowhere differentiable. If f_s is reasonably smooth we may overcome the non-differentiability of t ↦ B_t by the elementary integration by parts formula:

∫_a^b f_s dB_s = f_b B_b − f_a B_a − ∫_a^b B_s df_s.

The stochastic integral ∫_0^t f_s dB_s we define later will not require such regularity of f_s.

Since a d-dimensional Brownian motion consists of d independent one dimensional Brownian motions, in this section we assume that (B_t) is a one dimensional Brownian motion.

Proposition 3.8.8 Let ∆_n : a = t_0^n < t_1^n < · · · < t_{M_n+1}^n = b be a sequence of partitions of [a, b] with |∆_n| → 0. Then

lim_{n→∞} E ( ∑_{i=0}^{M_n} (B_{t_{i+1}^n} − B_{t_i^n})² − (b − a) )² = 0.

In particular, T_n ≡ ∑_{i=0}^{M_n} (B_{t_{i+1}^n} − B_{t_i^n})² converges in probability to b − a, and there is a sub-sequence of partitions ∆_{n_k} along which, for almost all ω, T_{n_k} → b − a.
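Before the proof, a numerical illustration (a sketch, not part of the argument): along dyadic partitions of [0, 1], the sum of squared increments of a simulated Brownian path approaches b − a = 1 as the mesh shrinks.

```python
import numpy as np

# Simulate one Brownian path on a fine dyadic grid of [0, 1], then compute the
# sum of squared increments T_n over coarser dyadic sub-partitions.
rng = np.random.default_rng(5)
N = 2 ** 18
B = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / N), N))])

def T(level):
    step = N // 2 ** level          # partition of mesh 2**-level
    pts = B[::step]                 # path values at the partition points
    return np.sum(np.diff(pts) ** 2)

qv_coarse, qv_fine = T(4), T(18)    # qv_fine should be close to b - a = 1
```

The coarse value fluctuates (its variance is of order the mesh), while the fine value concentrates near 1, in line with the L² estimate in the proof below.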
Proof First,

E T_n = ∑_{i=0}^{M_n} E(B_{t_{i+1}^n} − B_{t_i^n})² = ∑_{i=0}^{M_n} (t_{i+1}^n − t_i^n) = b − a.

Note that B_t − B_s has the same distribution as √(t − s) B_1 (exercise). Hence

E(T_n − (b − a))² = var(T_n) = ∑_{i=0}^{M_n} var((B_{t_{i+1}^n} − B_{t_i^n})²)
 = ∑_{i=0}^{M_n} var((t_{i+1}^n − t_i^n) B_1²)
 = ∑_{i=0}^{M_n} (t_{i+1}^n − t_i^n)² var(B_1²) ≤ |∆_n|(b − a) var(B_1²) → 0.

The first statement of the Proposition holds. Now L² convergence implies convergence in probability, and so there is a sub-sequence that converges almost surely. Taking, for example, the dyadic partitions (dividing each interval in two each time), the whole sequence converges almost surely [14].

Definition 3.8.9 Let (x_t, 0 ≤ t < ∞) be a stochastic process. Suppose that for any sequence ∆_n of partitions of [0, t] with lim_{n→∞} |∆_n| = 0,

T^{(n)}(x_·) := ∑_{i=0}^{M_n} (x_{t_{i+1}^n} − x_{t_i^n})²

converges in probability to a finite limit, which we denote by ⟨x, x⟩_t. We say that (x_t) has finite quadratic variation, and ⟨x, x⟩_t is its quadratic variation process. We also denote the convergence above by

(P) lim_{n→∞} ∑_{i=0}^{M_n} (x_{t_{i+1}^n} − x_{t_i^n})² = ⟨x, x⟩_t.

The quadratic variation process is an increasing function of t. By Proposition 3.8.8, the Brownian motion (B_t, t ∈ [0, ∞)) has finite quadratic variation and its quadratic variation process is t. We’ll see below that a sample continuous stochastic process (x_t, t ∈ [0, ∞)) whose quadratic variation is finite and not identically zero cannot have finite total variation over the corresponding finite time interval. This is the content of the following proposition, in which the Brownian motion (B_t) can be replaced by any continuous process with these properties.

Proposition 3.8.10 For almost all ω, the Brownian path t ↦ B_t(ω) has infinite total variation on every interval [a, b], and cannot be Hölder continuous of order α > 1/2.

Recall that the total variation of a function g : [a, b] → R is

|g|_{TV([a,b])} = sup_∆ ∑_{i=0}^{n−1} |g(t_{i+1}) − g(t_i)|,

where ∆ : a = t_0 < t_1 < · · · < t_n = b ranges through all partitions of [a, b].

Proof Fix an ω.
Since (B_t) has almost surely continuous paths, we need only consider ω with t ↦ B_t(ω) continuous. Suppose that |B(ω)|_{TV([a,b])} < ∞. For the partitions of Proposition 3.8.8 along which T_n converges almost surely,

∑_{i=0}^{M_n} (B_{t_{i+1}^n} − B_{t_i^n})² ≤ max_i |B_{t_{i+1}^n}(ω) − B_{t_i^n}(ω)| · ∑_{i=0}^{M_n} |B_{t_{i+1}^n}(ω) − B_{t_i^n}(ω)|
 ≤ max_i |B_{t_{i+1}^n}(ω) − B_{t_i^n}(ω)| · |B(ω)|_{TV([a,b])} → 0,

where the convergence follows from the fact that t ↦ B_t(ω) is uniformly continuous on [a, b]. This contradicts the fact that ∑_i (B_{t_{i+1}^n}(ω) − B_{t_i^n}(ω))² converges to b − a.

Next suppose that |B_t(ω) − B_s(ω)| ≤ C(ω)|t − s|^α for some α > 1/2, where C(ω) is a constant for each ω. Then

∑_{i=0}^{M_n} |B_{t_{i+1}^n}(ω) − B_{t_i^n}(ω)|² ≤ C²(ω) ∑_i |t_{i+1}^n − t_i^n|^{2α}
 ≤ C²(ω) |∆_n|^{2α−1} ∑_i (t_{i+1}^n − t_i^n) ≤ C²(ω)(b − a)|∆_n|^{2α−1} → 0,

as 2α − 1 > 0. This contradicts Proposition 3.8.8.

3.8.3 The Wiener measure does not charge the space of finite energy*

Let f : R → R be an increasing function. Then f has only a countable number of points of discontinuity and f is differentiable almost everywhere. We may modify the value of f at the points of discontinuity so that the modified function f̃ is right continuous. Then f′ = f̃′ wherever both are differentiable, and this holds a.e.

A right continuous increasing function F induces a measure µ_F on R, determined by µ_F((a, b]) = F(b) − F(a). It is easy to see that this is indeed a measure; the continuity of the measure along a monotone sequence of sets is assured by the right continuity of F. This is the Lebesgue–Stieltjes measure associated to F.

If F : [c, d] → R is of finite total variation, write TV(F)(x) for its total variation on [c, x]. Then both F_1 = ½(TV(F) + F) and F_2 = ½(TV(F) − F) are increasing and F = F_1 − F_2. If F is right continuous then so are TV(F), F_1 and F_2.

If the measure µ_F is absolutely continuous with respect to the Lebesgue measure, denote by p(x) its density:

∫ f(x) dµ_F(x) = ∫ f(x) p(x) dx

for any f bounded measurable. In particular,

∫_a^b p(x) dx = F(b) − F(a).
Suppose that F ∈ H¹, the space of finite energy, also known as W^{1,2}:

H¹ = {F ∈ W_0^1 : F absolutely continuous with ∫_0^1 |F′(s)|² ds < ∞}.

Then the weak derivative satisfies F′ = p and

F(b) − F(a) = ∫_a^b F′(x) dx.

By Cauchy–Schwarz,

|F(b) − F(a)| ≤ (∫_a^b |F′(x)|² dx)^{1/2} √(b − a),

so F is Hölder continuous of order 1/2. Since almost every path in W^d fails to be Hölder continuous of order 1/2, the Wiener measure does not charge H¹: µ(H¹) = 0.

3.9 Product σ-algebras

Let I be an arbitrary index set and let (E_α, F_α), α ∈ I, be a family of measurable spaces. The tensor σ-algebra, also called the product σ-algebra, on E = ∏_{α∈I} E_α is defined to be

⊗_{α∈I} F_α = σ{π_α^{−1}(A_α) : A_α ∈ F_α, α ∈ I}.

Here π_α : E = ∏_{α∈I} E_α → E_α is the projection (also called the coordinate map): if x = (x_α, α ∈ I) is an element of E, then π_α(x) = x_α. The tensor σ-algebra is the smallest σ-algebra such that for all α ∈ I the mapping π_α : (E, ⊗_{α∈I} F_α) → (E_α, F_α) is measurable. For example, we may take (E_α, F_α) = (S, B), a metric space with its Borel σ-algebra; we often take S = R or S = R^n.

3.9.1 The Borel σ-algebra on the Wiener Space

The set of all maps from [0, 1] to R^d can be denoted as the product space (R^d)^{[0,1]}. The tensor σ-algebra ⊗_{[0,1]} B(R^d) is the smallest σ-algebra such that the projections π_t(ω) = ω(t) are measurable. The product topology is the smallest topology such that the projections are continuous; hence the Borel σ-algebra of the product topology is larger than the tensor σ-algebra.

The subset W^d of (R^d)^{[0,1]} is the set of continuous functions. The tensor σ-algebra of (R^d)^{[0,1]} is generated by the sets {π_t^{−1}(B) : B ∈ B(R^d)}. Note that B(W^d) ⊂ W^d ∩ ⊗_{[0,1]} B(R^d) by (3.2). On the other hand, π_t : W^d → R^d is continuous with respect to the uniform topology:

|π_t(ω_1) − π_t(ω_2)| ≤ ‖ω_1 − ω_2‖.

This means that the uniform topology is finer than the product topology, so B(W^d) contains the Borel σ-algebra of the product topology restricted to (R^d)^{[0,1]} ∩ W^d, and hence contains W^d ∩ ⊗_{[0,1]} B(R^d).
In conclusion, B(W^d) = W^d ∩ ⊗_{[0,1]} B(R^d).

Any measurable set in the tensor σ-algebra is determined by a countable number of projections. Whether a function is continuous or not cannot be determined by a countable number of coordinates; hence W^d is not a measurable set in the tensor σ-algebra. (Do not be confused by the following: once we know a function is continuous, it is determined by its values on a countable dense set.)

3.9.2 Kolmogorov’s Extension Theorem

For each ω, we may view t ∈ I ↦ X(ω, t) ∈ S as an element of S^I. Define X_· : Ω → S^I by X_·(ω) = (t ↦ X_t(ω)). Then X_· : (Ω, F) → (S^I, ⊗_I B) is measurable if and only if each X_t : (Ω, F) → (S, B) is measurable.

Definition 3.9.1 The map X_· : Ω → S^{[0,T]} induces a probability measure on ⊗_{[0,T]} B. This is the law, or the distribution, µ_X of the stochastic process. The projection map from S^{[0,T]} to S^n defined by

π_{t_1,...,t_n} : X_· ∈ S^{[0,T]} ↦ (X_{t_1}, . . . , X_{t_n}) ∈ S^n

induces from µ_X the marginal distribution µ_{t_1,...,t_n}.

The question is whether the marginal distributions determine the law µ_X. A cylindrical set in ∏_{α∈I} S is a product set each of whose factors, with possibly a finite number of exceptions, equals S. Let E denote the collection of cylindrical sets:

E = {∏_{α∈I} A_α : A_α ∈ B, A_α = S except for finitely many α}.

Then σ(E) = ⊗_I B. Let J ⊂ I be a sub-index set and let π_J : ∏_{α∈I} S → ∏_{α∈J} S be the projection; if J ⊂ K ⊂ I let π_{KJ} : ∏_{α∈K} S → ∏_{α∈J} S be the projection map. Let µ be a measure on the product space ∏_I S, and let µ_J be the measure induced by the projection π_J. Then (µ_J) is a projective family of measures, which means (π_{KJ})_* µ_K = µ_J whenever J ⊂ K. The family of marginal distributions of a stochastic process forms a projective family.

Definition 3.9.2 Let S be a separable metric space. Let {µ_{J_f}} be a collection of probability measures on (S^{J_f}, ⊗_{J_f} B), where J_f runs through the finite subsets of I. They are called cylindrical measures.
They form a projective family of measures if (π_{KJ})_*(µ_K) = µ_J for all finite J ⊂ K.

Theorem 3.9.3 (Kolmogorov’s Extension Theorem) Given a projective family of cylindrical measures, there is a unique probability measure µ on ⊗_I B such that (π_{J_f})_* µ = µ_{J_f}. In particular there is a stochastic process X : Ω × I → S such that the distribution of (X_{t_1}, . . . , X_{t_n}) is µ_{(t_1,...,t_n)}.

Remark 3.9.4 This theorem does not settle questions about convergence of stochastic processes, especially weak convergence. If there is a sequence of measures µ_n, µ on ⊗_I B such that the marginal distributions µ^n_{t_1,...,t_k} of µ_n converge weakly to those of µ, it does not follow that µ_n converges to µ weakly. See Billingsley on convergence of probability measures.

Chapter 4

Conditional Expectations

All measures are assumed to be σ-finite, which means the total space is the union of a countable number of measurable sets of finite measure. The Lebesgue measure is σ-finite, and so, trivially, are finite measures. Shorthand: we denote by f ∈ G the statement that ‘f is a function measurable with respect to G’.

4.1 Preliminaries

This section is assumed known and not covered in the lectures. A measure is not a canonical object: if there is another way of measuring a set, how do the two measures compare?

Definition 4.1.1
1. Let (Ω, F) be a measurable space and let P and Q be two measures on it. The measure Q is said to be absolutely continuous with respect to P if Q(A) = 0 whenever P(A) = 0, A ∈ F. This is denoted by Q << P.
2. They are said to be equivalent, denoted by Q ∼ P, if each is absolutely continuous with respect to the other.

Theorem 4.1.2 (Radon–Nikodym Theorem) If Q << P, there is a non-negative measurable function Ω → R, which we denote by dQ/dP, such that for each measurable set A we have

Q(A) = ∫_A (dQ/dP)(ω) dP(ω).

The function dQ/dP : Ω → R is called the Radon–Nikodym derivative of Q with respect to P. We also say that dQ/dP is the density of Q with respect to P. This function is unique up to a set of P-measure zero.
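A numerical sketch of the Radon–Nikodym theorem (not from the notes, with the illustrative density dQ/dP(x) = 2x on ([0, 1], Lebesgue)): for an interval A = (a, b], integrating the density against P recovers Q(A) = b² − a².

```python
import numpy as np

# Verify Q(A) = int_A (dQ/dP) dP for dQ/dP(x) = 2x and A = (a, b] in [0, 1].
density = lambda x: 2.0 * x

a, b = 0.2, 0.9
x = np.linspace(a, b, 100001)
dx = x[1] - x[0]
Q_A = np.sum(0.5 * (density(x[:-1]) + density(x[1:])) * dx)   # trapezoid rule
exact = b ** 2 - a ** 2                                       # int_a^b 2x dx
```

The trapezoid rule is exact for the linear density up to rounding, so the two values agree to machine precision.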
Note that if Q is a finite measure then dQ/dP ∈ L^1(Ω, F, P). If P is a probability measure and ∫_Ω (dQ/dP)(ω) dP(ω) = 1, then Q is a probability measure. If furthermore dQ/dP > 0, then
∫_A (dQ/dP)^{-1} dQ = ∫_A (dQ/dP)^{-1} (dQ/dP) dP = ∫_A dP = P(A).
It follows that P << Q, so the two measures are equivalent, and (dP/dQ) · (dQ/dP) = 1.

Example 4.1.3 Let Ω = [0, 1) and P the Lebesgue measure. Let A^n_i = [i/2^n, (i+1)/2^n), i = 0, 1, ..., 2^n − 1, and F_n = σ{A^n_0, A^n_1, ..., A^n_{2^n−1}}. Let µ be a measure on F_n. Check that
(dµ/dP)(x) = Σ_i (µ(A^n_i)/P(A^n_i)) 1_{A^n_i}(x), x ∈ [0, 1).

Two measures Q_1 and Q_2 are (mutually) singular if there is a measurable set A with Q_1(A) = 0 and Q_2(A^c) = 0; in other words, they are supported on disjoint measurable sets.

Example 4.1.4 Let Ω = [0, 1] and P the Lebesgue measure. Define Q(A) = ∫_A 2x dx, so dQ/dP(x) = 2x. Define Q_1 by dQ_1/dP = 2 · 1_{[0,1/2]}. Then Q_1 << P, but P is not absolutely continuous with respect to Q_1. Define Q_2 by dQ_2/dP = 2 · 1_{[1/2,1]}. The two measures Q_1 and Q_2 are singular.

4.2 Lecture 7-8: Conditional Expectations

Question: What is the conditional distribution of (B_t) given B_1 = 0?

Let (Ω, F, P) be a probability space. Let G be a sub-σ-algebra of F. The random variables here are either real valued, R^d valued, or vector space valued.

Definition 4.2.1 Let X ∈ L^1(Ω, F, P). The conditional expectation of X given G is a G-measurable function, denoted by E{X|G}, such that
∫_A X(ω) dP(ω) = ∫_A E{X|G}(ω) dP(ω), ∀A ∈ G. (4.1)
Standard brackets can be used instead of the curly ones to denote conditional expectation: E(X|G).

Theorem 4.2.2 The conditional expectation of X ∈ L^1(Ω, F, P) exists and is unique up to a set of measure zero.

Proof
• Existence. Define Q(A) = ∫_A X(ω) dP(ω) for A ∈ G. Now P restricts to a measure on G and Q << P. Let E{X|G} = dQ/dP. (Treat Q as a signed measure, or assume X ≥ 0 in the first instance; for X = X^+ − X^− define E{X|G} = E{X^+|G} − E{X^−|G}.)
• Uniqueness. Let g and g̃ be two such functions.
Since g − g̃ is G-measurable, A_1 := {g − g̃ > 0} ∈ G and ∫_{A∩A_1} g dP = ∫_{A∩A_1} g̃ dP for all A ∈ G. Thus (g − g̃)1_{A_1} = 0 a.s., so g ≤ g̃ a.s., and by symmetry g = g̃ a.s.

Proposition 4.2.3 For every bounded G-measurable function g,
∫_Ω g(ω) X(ω) dP(ω) = ∫_Ω g(ω) E{X|G}(ω) dP(ω). (4.2)

This follows from the Monotone Class Theorem. By the uniqueness and by checking equation (4.1) directly, the following properties are intuitive and their proofs are straightforward.

• If X ∈ G, then E{X|G} = X a.s.
• If X is independent of G, then E{X|G} = EX a.s.
• If g ∈ G is bounded and X ∈ L^1, then E{gX|G} = g E{X|G} a.s.
• For all a, b ∈ R, E{aX + bY|G} = a E{X|G} + b E{Y|G}.
• If X ≤ Y, then E{X|G} ≤ E{Y|G}.

Exercise 4.2.1 The family of random variables {E{X|G} : G a sub-σ-algebra of F} is L^1 bounded.

Proof E|E{X|G}| ≤ E(E{|X| | G}) = E|X|.

Conditional Probability: Given B ∈ F, P(B|G) := E(1_B|G) is the conditional probability of B given G.

4.3 Properties of conditional expectation

The following is given as a handout, not part of the lectures.

Proposition 4.3.1 Let X, Y ∈ L^1(Ω, F, P) and G a sub-σ-algebra of F.
1. For all a, b ∈ R, E{aX + bY|G} = a E{X|G} + b E{Y|G}.
2. E(E{X|G}) = EX.
3. If X ≤ Y, then E{X|G} ≤ E{Y|G}.
4. (Tower property) If G_1 is a sub-σ-algebra of G_2 then E{X|G_1} = E{E{X|G_1}|G_2} = E{E{X|G_2}|G_1}.
5. If X is G-measurable and XY ∈ L^1, then E{XY|G} = X E{Y|G}. In particular E{X|G} = X a.s.
6. (Jensen’s inequality) Let φ : R^d → R be a convex function, e.g. φ(x) = |x| or |x|^p, p > 1; for d = 1, e^x is convex. Then φ(E{X|G}) ≤ E{φ(X)|G}. For p ≥ 1, ||E(X|G)||_p ≤ ||X||_p.
7. (Dominated convergence) If X_n → X a.s. and |X_n| ≤ g ∈ L^1, then E(X_n|G) → E(X|G).
8. If E|X_n − X| → 0, then E|E{X_n|G} − E{X|G}| → 0.
9. (Monotone convergence) If X_n ≥ 0 and X_n increases with n, then E(X_n|G) increases to E(lim_{n→∞} X_n|G).
10. (Fatou) If X_n ≥ 0, then E(lim inf_{n→∞} X_n|G) ≤ lim inf_{n→∞} E(X_n|G).
11. If the σ-algebra A is independent of σ(X) ∨ G, the σ-algebra generated by σ(X) and G, then E(X|A ∨ G) = E(X|G). In particular if X is independent of G then E{X|G} = EX.

Proof The proof of 1-4 is straightforward.
For 5, check it holds for X = 1B where B ∈ G and apply Monotone class theorem. For 6, note that φ(x) = sup{f(p,q) : f(p,q) (x) = px+q, f (p,q) ≤ φ, p, q ∈ Q}. For 7, note that E{|Xn | |G} ≤ g ∈ L1 . For 8, |E{Xn − X G}| ≤ E{|Xn − X| G}. Check L1 boundedness and uniformly absolutely continuity of E{|Xn − X| |G} on G. For 9-10, noteR that Xn 1B is positive and increasing. The last statement: R R for A ∈ A, B ∈ G, A∩B XdP = E(X1B )P (A), A∩B E{X||G}dp = A E{X1B |G}dP = RP (A)E(X1BR ). Since {A ∩ B} forms a π-system, show that C = {D ∈ G ∨ A : D XdP = D E{X|G}dP } is a Dynkin system and conclude that C = G ∨ A. 4.4 Conditioning on a Random Variable Let η : (Ω, F) → (S, B) be a measurable map. Let G = σ(η). Let ξ : ω → (S, B) be F-measurable with E|ξ| < ∞. Denote E(ξ|η) := E(ξ|σ(η)). Let S be a vector space. E(ξ|σ(η)) is σ(η)-measurable, there exists a Borel measurable function φ : S → R such that E(ξ|σ(η)) = φ(η). 47 Write E{ξ|η = y} for φ(y). This can be understood from another point of view. Let P̂η = η∗ (P ), the distribution measure of η on R. We define a Borel measure µ on R as below. Let C ∈ B(R). Z µ(C) = ξdP. η∈C If P (η ∈ C) = P̂η (C) = 0, µ(C) = 0. Hence µ << P̂η and there is a function dµ (y) on R such that dP̂η Z dµ µ(C) = C Hence dµ (y) dP̂η C dµ (η) dP̂η (y)dP̂η (y). is a version of P (ξ|η = y). By the change of variable formula, Z and dP̂η dµ dP̂η Z (y)dP̂η (y) = η∈C dµ dP̂η (η)dP = E{ξ|η}. (D) Example 4.4.1 Suppose that (ξ, η) = f (x, y)dxdy. Define fξ|η (x, y) = if the integrand is not zero. Otherwise set it to be zero. Then Z P (ξ ∈ A|η = y) = fξ|η (x, y)dx. A This is the conditional distribution of ξ given η. Also Z E(ξ|η = y) = xfξ|η (x, y)dx. R Proof Let A, B(R). Then Z P (ξ ∈ A, η ∈ C) = Z On the other hand, Z η∈C Z fξ|η (x, η)dxdP = η∈C P (ξ ∈ A|η)dP. 1ξ∈A dP = η∈C fξ|η (x, y)dx(η)∗ P (dy). C 48 R f (x,y) R f (x,y)dx R Since (η)∗ P (dy) = R f (x, y)dx. Z Z Z f (x, y)dxdy = P (ξ ∈ A, η ∈ η). 
∫_{η∈C} ∫_A f_{ξ|η}(x, η) dx dP = ∫_C ∫_A f(x, y) dx dy = P(ξ ∈ A, η ∈ C).
It follows that P(ξ ∈ A|η) = ∫_A f_{ξ|η}(x, η) dx.

In the case where X and Y are independent, any function of the variable X is independent of Y, and P̂_{(X,Y)} = P̂_X × P̂_Y. For all y, we take µ^y(B) = P̂_X(B). Clearly µ^y is a probability measure (the same for all y) and
∫_{Y∈A} µ^y(B) dP = ∫_{Y∈A} P̂_X(B) dP = P(Y ∈ A, X ∈ B).
Hence µ^y(B) is a version of the conditional probability of {X ∈ B} given Y.

Example 4.4.2 Compute P(B_t ∈ A|B_1 = 0). Note that (B_t, B_1) has joint density p_t(0, x) p_{1−t}(x, y). Then
P(B_t ∈ A|B_1 = y) = ∫_A p_t(0, x) p_{1−t}(x, y) / (∫_R p_t(0, x) p_{1−t}(x, y) dx) dx = ∫_A p_t(0, x) p_{1−t}(x, y) / p_1(0, y) dx.
We have used the fact that
∫_R p_t(0, x) p_{1−t}(x, y) dx = p_1(0, y),
which is a Chapman-Kolmogorov equation. The Brownian bridge from 0 to 0 in time 1 has, at time t, the density
p_t(0, x) p_{1−t}(x, 0) / p_1(0, 0).

Question: For 0 < t_1 < t_2 < 1, does P(B_{t_1} ∈ A_1, B_{t_2} ∈ A_2|B_1 = 0) have a density? Check out
p_{t_1}(0, x) p_{t_2−t_1}(x, y) p_{1−t_2}(y, 0) / p_1(0, 0).

Example 4.4.3 Let (B_t) be a Brownian motion and G_s = σ{B_r : 0 ≤ r ≤ s}. Let t ≥ s. Prove that E{B_t|G_s} = B_s.

If t = s, B_s is G_s-measurable and E{B_s|G_s} = B_s. If s < t,
E{B_t|G_s} = E{B_t − B_s|G_s} + E{B_s|G_s} = E{B_t − B_s|G_s} + B_s.
Let us now prove that E{B_t − B_s|G_s} = 0. Since G_s is generated by cylindrical sets, we only need to test on functions of the form Π_i 1_{{B_{t_i} ∈ A_i}}, which are associated to ∩_i {B_{t_i} ∈ A_i}. Here 0 = t_0 < t_1 < t_2 < ... < t_k = s. To make it simple to read, let g_i : R → R be bounded Borel functions and consider
E[(B_t − B_s) Π_{i=1}^k g_i(B_{t_i})].
Since B_{t_j} = Σ_{i=0}^{j−1} (B_{t_{i+1}} − B_{t_i}), we may use the independent increments property to see that
E[(B_t − B_s) Π_{i=1}^k g_i(B_{t_i})] = 0,
which means E{B_t − B_s|G_s} = 0.

4.5 Regular Conditional Probabilities

This section is not delivered in class. We have previously seen that in the case when G is generated by a finite partition of Ω, there is a probability measure P^ω such that P(B|G)(ω) = P^ω(B). This construction works for Ω = {1, 2, ..., n, ...} with the discrete topology.
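For a finite partition the measure P^ω can be written out explicitly: on the atom A containing ω, P^ω(B) = P(B ∩ A)/P(A). A minimal sketch with exact arithmetic (the six-point sample space and the particular partition are illustrative choices, not from the notes):

```python
from fractions import Fraction

# Finite sample space {0,...,5} with uniform probability P.
P = {w: Fraction(1, 6) for w in range(6)}

# G generated by the partition {0,1,2} | {3,4} | {5}.
partition = [{0, 1, 2}, {3, 4}, {5}]

def P_omega(w, B):
    """Regular conditional probability P^w(B) = P(B ∩ A)/P(A),
    where A is the atom of the partition containing w."""
    A = next(atom for atom in partition if w in atom)
    return sum(P[x] for x in B & A) / sum(P[x] for x in A)

B = {1, 2, 3}
p0 = P_omega(0, B)   # atom {0,1,2}: (2/6)/(3/6) = 2/3
p3 = P_omega(3, B)   # atom {3,4}:   (1/6)/(2/6) = 1/2
```

Averaging P^ω(B) over the atoms against their probabilities recovers P(B), which is exactly the defining property of conditional probability.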
Let C ∈ F denote P (C|G}(ω) = E(1C |G}(ω). (4.3) This is the condition probability of C given G. Definition 4.5.1 A system of probability measures {Q(B)(ω) : ω ∈ Ω} is called a regular conditional probability given G if (1) Each B ∈ F, Q(B)(ω) is a version of P (B|G)(ω). (2) For almost all ω, Q(·)(ω) is a probability measure. We write Qω for the measure Q(·)(ω). Note that P ω (B) is a version of E{1B |G}(ω) means that for all B ∈ F and A ∈ G, Z P (A ∩ B) = Q(B)(ω)dP (ω). A Note that for all f ∈ L1 Z E{f |G}(ω) = f (ω̃)Qω (dω̃). Ω We now consider the task of constructing a regular conditional probability. Note that P (B|G)(ω) : F × Ω → R is G measurable and satisfies that • For any version of the the conditional expectation, P (Ω|G)(ω) = 1 • ‘Countable’ additivity. Let us fix a countable number of disjoint sets Ck and any versions of the conditional expectations of Ck , then outside of a set of measure zero: P (∪nk=1 Ck |G)(ω) = n X k=1 51 P (Ck |G)(ω). Hence P (−|G)(ω) is a natural candidate for a probability measure. For each B, P (B|G)(ω) is defined outside of a null set N B and the value of the function on N B is undetermined. It still makes sense when a countable number of sets are used. The problem arises when there are an uncountable number of countably disjoint sets. Throughout this section let E be a Polish space ((separable ands completely metrizable ) ). We have in mind [0, 1], Rn , finite dimensional manifolds and the Wiener space W . Borel σ-algebras for such spaces are countably generated, a property which allows effective management of exceptional sets in the definition of conditional expectations. Definition 4.5.2 A measurable space (E, A) is a Borel space also known as a standard space if there is a bijection φ from E to [0, 1] such that φ and φ−1 are measurable while [0, 1] is equipped with the Borel σ-algebra. A Polish space with its Borel σ-algebra is a standard space. 
A measurable subset of a standard topological space together with the induced sigma field is a standard probability space. Theorem 4.5.3 (Thorem 7.1, p.145 Parthasarathy [21]) If (Ω, F) is a standard probability space, then a regular conditional expectation given any sub-σ -algebra exists. For the proof please consult Parthasarathy [21] (pp145-146). Example 4.5.4 Take a measurable function Y : Ω → E = {y1 , y2 , . . . }. Define Ai = Y −1 ({yi }) = {ω : Y (ω) = yi }. Then σ(Y ) = σ{Ai } and Ai ∩ Aj = ∅ if i 6= j. Let X ∈ L1 (Ω, F, P ). Define ( E(1Ai X) P (Ai ) , if P (Ai ) 6= 0 . φ(yi ) = 0, if P (Ai ) = 0 Then ( φ(Y (ω)) = E(1Ai X) P (Ai ) , 0, if ω ∈ Ai and P (Ai ) 6= 0 . if ω ∈ Ai and P (Ai ) = 0 Then E(X|Y )(ω) = φ(Y (ω)). Indeed for any Ak ∈ σ(Y ), Z Z φ(Y )dP = Ak Ak ! Z X E(1A X) i 1Aj (ω) dP = XdP. P (Ai ) Ak i 52 We may define φ̃ such that φ̃(yi ) = ai , ai ∈ R, if P (Y −1 (yi )) = 0 and φ = φ̃ otherwise. Then φ̃(Y ) is a version of the conditional expectation. And φ̃(Y ) = φ(Y ) almost surely. 53 4.6 Lecture 9: Regular Conditional Distribution and Disintegration Let X : (Ω, F) → (S, B) be a measurable map. Let X ∈ L1 (Ω, F, P ) and G a sub-σ-algebra of F. For B ∈ B, let X −1 (B) = {X ∈ B}. Denote P {X −1 (B)|G} = E{1X −1 (B) |G}. Can we choose a version P (X −1 (B)|G), so that for almost all ω, P (B)(ω) = P {X −1 (B|G)(ω} is a probability measure on (S, B)? Theorem 4.6.1 (Theorem 8.1 (p147) Parthasarathy [21]) If (S, B) and (Ω, F) are separable standard Borel spaces and X : Ω → S a measurable map. Let G ⊂ F. There are a family, µ(ω, ·), of probabilities measures, called the regular conditional distribution of X given G, such that the following holds. • For almsot all ω, µ(ω, ·) is a probability measure on (S, B). • For each A ∈ B, µ(ω, A) is a version of P (X −1 (A) · |G)(ω). The measures are denoted by P̂X (·|G)(ω). For uniqueness and a proof please read Parthasarathy [21], pages 147-150. 
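The construction of Example 4.5.4 can be checked by simulation. A sketch, assuming an illustrative discrete Y with values {0, 1, 2} and X = Y^2 plus Gaussian noise (both choices are ours, not from the notes): φ(y_i) = E(1_{A_i} X)/P(A_i) is the empirical mean of X on each level set, and φ(Y) satisfies the defining averaging property of E(X|Y).

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative: Y uniform on {0,1,2}; X = Y^2 + standard Gaussian noise.
y = rng.integers(0, 3, size=200_000)
x = y.astype(float) ** 2 + rng.normal(size=y.size)

# phi(y_i) = E(1_{A_i} X)/P(A_i): empirical mean of X on A_i = {Y = y_i}.
phi = {i: x[y == i].mean() for i in range(3)}
lookup = np.array([phi[i] for i in range(3)])
cond_exp = lookup[y]                      # E(X|Y)(omega) = phi(Y(omega))

# Defining property (4.1) on the atoms A_k of sigma(Y):
# the integrals of X and of phi(Y) over each A_k agree.
lhs = np.array([cond_exp[y == k].sum() for k in range(3)])
rhs = np.array([x[y == k].sum() for k in range(3)])
```

Since the noise has mean zero and is independent of Y, φ(y) should also be close to y^2.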
Note that for any integrable function f : R → R,
E{f(X)|G}(ω) = ∫_R f(x) µ(ω, dx).

Let S, S̃ be two complete metric spaces with their Borel σ-algebras and G a sub-σ-algebra of F.

Theorem 4.6.2 (Disintegration, Thm 6.4, Kallenberg [14]) Let X : Ω → S be such that there is a regular version, µ^ω, of P(X ∈ ·|G). Let Y : Ω → S̃ be G-measurable. Let f : S × S̃ → R be such that f(X, Y) is integrable. Then
E{f(X, Y)|G}(ω) = ∫_{x∈S} f(x, Y(ω)) dµ^ω(x).

4.6.1 Lecture 9: Conditional Expectation as Orthogonal Projection

Let L^2 := L^2(Ω, F, P), a Hilbert space, and let K := L^2(Ω, G, P), a closed subspace of L^2. Let f ∈ L^2 and let π : L^2(Ω, F, P) → L^2(Ω, G, P) denote the orthogonal projection defined by the projection theorem, given by the orthogonal decomposition
f = π(f) + f′, π(f) ∈ K, f′ ⊥ K.
By f′ ⊥ K we mean that f′ is orthogonal to K.

Since π(f) is the only element of K such that f − π(f) is orthogonal to K,
⟨f − π(f), g⟩_{L^2} = 0, ∀g ∈ L^2(Ω, G, P).
Writing this out:
∫_Ω f g dP = ∫_Ω π(f) g dP.
This holds in particular for g = 1_A with A ∈ G.

Conclusion. The orthogonal projection of an L^2 function f is a version of its conditional expectation, i.e. π(f) = E{f|G} a.s.

We next assume that f ≥ 0, not necessarily in L^2. Let f_n be a sequence of bounded functions (hence in L^2), increasing with n and converging to f pointwise. Then π(f_n) is well defined and
∫ 1_A f_n dP = ∫ 1_A π(f_n) dP, A ∈ G.
Since 1_A f_n is positive and increasing with n, so is 1_A π(f_n). Hence both sides have limits and we may exchange limits with integration:
∫ 1_A f dP = lim_{n→∞} ∫ 1_A f_n dP = lim_{n→∞} ∫ 1_A π(f_n) dP = ∫ 1_A lim_{n→∞} π(f_n) dP.
This means E{f|G} = lim_{n→∞} π(f_n). For f ∈ L^1 let f = f^+ − f^− and define E{f|G} = E{f^+|G} − E{f^−|G}.

Remark 4.6.3 Observe that π(f) is the unique element of K such that
||f − π(f)|| = min_{g ∈ L^2(Ω,G,P)} ||f − g||.
Please refer to §II.2 of Functional Analysis [22] for the Hilbert space projection theorem.
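On a finite sample space the projection picture above can be computed directly. A minimal sketch, assuming an illustrative uniform eight-point Ω and a two-atom partition generating G (both are our choices): the projection onto the G-measurable functions replaces f by its average over each atom, the residual integrates to zero over each atom, and the projection minimises the L^2 distance.

```python
import numpy as np

# Finite Omega = {0,...,7} with uniform P; f an arbitrary L^2 "function".
f = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])

# G generated by the partition {0..3} | {4..7}; K = G-measurable functions,
# i.e. functions constant on each atom. Projecting onto K replaces f by its
# average over each atom, which is exactly E{f|G}.
atoms = [np.arange(0, 4), np.arange(4, 8)]
proj = np.empty_like(f)
for A in atoms:
    proj[A] = f[A].mean()

# f - proj(f) is orthogonal to K: its integral over each atom vanishes.
residual_means = [float((f - proj)[A].mean()) for A in atoms]

# Minimality: any other G-measurable g is at least as far from f.
g = np.where(np.arange(8) < 4, 2.0, 6.0)   # an arbitrary competitor in K
d_proj, d_g = np.linalg.norm(f - proj), np.linalg.norm(f - g)
```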
4.6.2 Uniform Integrability of conditional expectations

Let (Ω, F, µ) be a measure space.

Definition 4.6.4 A family of real-valued functions (f_α, α ∈ I), where I is an index set, is uniformly integrable (u.i.) if
lim_{C→∞} sup_α ∫_{{|f_α|≥C}} |f_α| dµ = 0.

Lemma 4.6.5 (Uniform Integrability of Conditional Expectations) Let X : Ω → R be in L^1. Then the family of functions {E{X|G} : G is a sub-σ-algebra of F} is uniformly integrable.

Proof Let A be any set in G. Then
E|1_A E{X|G}| ≤ E(1_A E{|X| | G}) = E(E{1_A |X| | G}) = E(1_A |X|).
We have used Jensen’s inequality in the first step. Taking A = Ω we see that {E{X|G}} is L^1 bounded:
sup_G E|E{X|G}| ≤ E|X|.
Take A = {|E{X|G}| ≥ C}, where C is a positive number (that A ∈ G is clear). Then
P(|E{X|G}| ≥ C) ≤ (1/C) E|E{X|G}| ≤ E|X|/C,
which converges to 0 as C goes to infinity. Since X ∈ L^1, for any ε > 0 there is δ > 0 such that E(1_A|X|) < ε whenever P(A) < δ. Consequently, for any ε, take C > E|X|/δ; then
E|1_A E{X|G}| ≤ E(1_A|X|) < ε.
Since the choice of C does not depend on G, the proof is complete.

Chapter 5 Martingales

Let (Ω, F, P) be a probability space.

5.0.3 Lecture 10: Introduction

Definition 5.0.6 A family {F_t}_{t∈I} of non-decreasing sub-σ-algebras of F is called a filtration:
F_s ⊂ F_t, ∀s, t ∈ I, s < t,
where I is the index set. We say that (Ω, F, (F_t), P) is a filtered probability space.

Define F_{t+} = ∩_{h>0} F_{t+h}. A filtration is right continuous if F_{t+} = F_t. The filtration {G_t : G_t = F_{t+}} is right continuous. The natural filtration of a continuous process is not necessarily right continuous. Let F_∞ = ∨_{t≥0} F_t = σ(∪_{t≥0} F_t), the smallest σ-algebra containing every σ-algebra F_t, t ≥ 0. The completion of a σ-algebra F_t is normally obtained by adding all sets in F_∞ of measure zero; the result is called the augmented σ-algebra. The standard assumption on the filtration is that it is right continuous and each σ-algebra is complete.

Definition 5.0.7 A stochastic process (X_t : t ∈ I) is F_t-adapted if each X_t is F_t-measurable.
Typically we take I = [a, b], [a, ∞), or I = N = {0, 1, 2, ...}.

Definition 5.0.8 Let F_t be a filtration on (Ω, F, P). An adapted stochastic process (M_t, t ∈ I)
• is a martingale if each M_t ∈ L^1 and E{M_t|F_s} = M_s for all s ≤ t;
• is a sub-martingale if each M_t ∈ L^1 and E{M_t|F_s} ≥ M_s for all s ≤ t;
• is a super-martingale if each M_t ∈ L^1 and E{M_t|F_s} ≤ M_s for all s ≤ t.

For a sub-martingale, the integrability condition can be replaced by the integrability of its positive part; see Revuz-Yor [23]. Note that if (X_t) is a super-martingale then (−X_t) is a sub-martingale.

Example 5.0.9
1. Let {X_n, n ∈ N} be a sequence of independent integrable random variables with mean zero. Define F_n = σ{X_1, ..., X_n} and S_n = Σ_{j=1}^n X_j. Then E{S_n|F_{n−1}} = EX_n + S_{n−1} = S_{n−1}, and so (S_n) is a martingale. If X_n : Ω → {1, −1} are Bernoulli variables with P(X_n = 1) = 1/2, then S_n is the simple random walk on the real line.
2. Let X_n be as above and let X̃_n = X_n + 1. Then S̃_n = Σ_{k=1}^n X̃_k = Σ_{k=1}^n X_k + n = S_n + n is a sub-martingale.
3. If X_1, X_2, ... is a sequence of independent non-negative integrable random variables with mean 1, let M_n = Π_{i=1}^n X_i. It is a discrete time martingale with respect to F_k = σ{X_1, X_2, ..., X_k}.
4. Let f ∈ L^1 and f_t = E{f|F_t}; then (f_t) is a martingale.

Proposition 5.0.10 Let G_t be a filtration with G_t ⊂ F_t. If (X_t) is an (F_t)-martingale which is G_t-adapted, then it is a G_t-martingale.

Proof Let A ∈ G_s ⊂ F_s. Since (X_t) is an F_t-martingale, E[X_t 1_A] = E[X_s 1_A]. As X_s is G_s-measurable, it follows that (X_t) is a G_t-martingale.

We have already seen that the set of martingales is a vector space. Is there a Hilbert structure associated to it? We will see later that there is indeed one on the subspace of L^2-bounded martingales (on which stochastic integrals can be defined through the Riesz representation theorem). For the moment we content ourselves with the following stability result.

Proposition 5.0.11 Let {M^n(t) : t ≥ 0} be a sequence of F_t-martingales.
If for each t, limn→∞ M n (t) = M (t) and {Mn (t), n = 1, 2, . . . } is uniformly integrable then (Mt ) is a martingale. 59 Proof Let s < t, A ∈ Fs . Then E[Msn 1A ] = E[Mtn 1A ]. Since {Msn 1A } and {Mtn 1A } are uniformly integrable, we may taking n → ∞, exchange limit with taking expectation and conclude that E[Ms 1A ] = E[Mt 1A ]. Example 5.0.12 Take Ω = [0, 1] and define F1 to be the Borel sets of [0, 1] and P the Lebesgue measure. Define Ft to be the σ-algebra generated by the collection of functions which are Borel measurable when restricted to [0, t] and constant on [t, 1]. Let f : [0, 1] → R be an integrable function. We may define Mt = E{f |Ft }, f (x), if x ≤ t R1 Mt (x) = 1 1−t t f (r)dr if x > t. Check that for s < t, f (x), if x ≤ s R1 Rt 1 1−s [ s f (r)dr + t Mt (r)dr] if x > s. f (x), if x ≤ s R1 1 R1 Rt 1 1−s [ s f (r)dr + t ( 1−t t f (u)du)dr] if x > s. f (x), if x ≤ s R1 1 1−s [ s f (r)dr] if x > s. E{Mt |Fs }(x) = = = = Ms (x). Take f (x) = x2 + 1 and compute Mt = E{f |Ft }. 5.1 Lecture 11: Overview Besides martingales there is the concept of “local martingales”. Let us assume all stochastic processes concerned in this section are sample continuous. A sample continuous stochastic process is a local martingale if there is a sequence of stopped stochastic processes (cut-offs) Mtn := Mt∧Tn , which are martingales. Here Tn is a sequence of “stopping times” with limn→∞ Tn = ∞ and so limn→∞ Mtn = Mt . Definition 5.1.1 Let I = [0, ∞) or I = {0, 1, 2, . . . }. A measurable random function T : Ω → I ∪ {∞} is a (Ft , t ∈ I) stopping time if {ω : T (ω) ≤ t} is Ft measurable for all t ∈ I. 60 Definition 5.1.2 The stopped process X T is defined by XtT (ω) = XT (ω)∧t (ω). Definition 5.1.3 An adapted stochastic process (Xt : t ≥ 0) is a local martingale if there is an increasing sequence of stopping times Tn with limn→∞ Tn = ∞ a.s. and is such that for each n, (XtTn − X0 , t ≥ 0) is a uniformly integrable martingale. We say that Tn reduces X. 
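Returning to Example 5.0.9, the martingale property of the simple random walk can be tested by Monte Carlo through its defining integral form: E[S_t 1_A] = E[S_s 1_A] for A ∈ F_s. A sketch (the particular test event A = {S_3 > 0} and the sample size are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
# 10^6 sample paths of the simple random walk S_n = X_1 + ... + X_n,
# with P(X_i = +1) = P(X_i = -1) = 1/2, as in Example 5.0.9.
steps = rng.choice([-1, 1], size=(1_000_000, 10))
S = steps.cumsum(axis=1)          # column k holds S_{k+1}

# Martingale property tested against A = {S_3 > 0}, an event in F_3:
# E[S_10 1_A] should equal E[S_3 1_A].
A = S[:, 2] > 0
lhs = (S[:, 9] * A).mean()        # E[S_10 1_A]
rhs = (S[:, 2] * A).mean()        # E[S_3 1_A]
```

The two empirical means differ only by the Monte Carlo error of E[(S_10 − S_3) 1_A], which vanishes by independence of the increments.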
We will also discuss further local martingales in a later section. A one dimensional Brownian motion is a martingale with respect to its own filtration. A sample continuous local martingale with its “quadratic variation process” satisfying hM, M i∞ = ∞ is a time changed Brownian motion: Mt = BhM,M it . This is the context of Dubins-Schwartz theorem (P181, [23]) which we do not discuss further. See definition 3.8.9 for the definition of quadratic variation process. We will be also discussing this in detail later. A stochastic integral with respect to a Brownian motion or with respect to a local martingale is a martingale. We shall see this later. On the other hand, letting B = (B 1 , . . . , B d ) be an Rd valued Brownian motion. A Brownian local martingale, one that is an FtB local martingale is of the form: d Z t X Mt = C + Hsi dBsi . i=1 0 This is the integrable representation theorem for martingales. See Thm (3.5), page 201 [23] for detail. The Clark-Ocone formula express H i in terms of Malliavin derivatives of M . Further expansions leads to L2 chaos decomposition of the space of L2 functions. Content of this paragraph will be outside of the remit of the lectures. Definition 5.1.4 An n dimensional stochastic process (Xt1 , . . . , Xtn ) is a Ft localmartingale if each component is a Ft local-martingale. A remarkable result application of martingales is Lévy’s martingale representation for Brownian motions which we do discuss later. This theorem trivialises among other things the difficult task of proving the components of a candidate stochastic processes begin independent. 5.2 Lecture 12: Stopping Times Consider the time that an event has arrived. This time is ∞ if the event does not arrive. Let I = [0, ∞) or I = {0, 1, 2, . . . }. 61 Definition 5.2.1 A random function T : Ω → I ∪ {∞} is a (Ft , t ∈ I) stopping time, also called an optional time, if {ω : T (ω) ≤ t} ∈ Ft for all t ∈ I. If I = {0, 1, 2, . . . }, T is a stopping time if {T (ω) = n} ∈ Fn . 
A constant time is a stopping time. T (ω) ≡ ∞ is also a stopping time. Write S ∨ T = max(S, T ) and S ∧ T = min(S, T ). Proposition 5.2.2 (1) If S, T are stopping times, then max(T, S), min(T, S) are stopping times. In particular S ∧ t and S ∨ t are stopping times for any t ∈ I. (2) If Tn is a sequence of increasing stopping times then T := supn Tn is a stopping time. If Sn is a sequence of decreasing stopping times then S := inf n Sn is a stopping time. (3) If Tn are stopping times then lim supn→∞ Tn and lim inf n→∞ Tn are stopping times. Proof These are easily seen from the following, {ω : max(S, T ) ≤ t} = {S ≤ t} ∩ {T ≤ t} ∈ Ft {ω : min(S, T ) ≤ t} = {S ≤ T } ∪ {T ≤ t} ∈ Ft {sup Tn ≤ t} = ∩n {Tn ≤ t} ∈ Ft n {inf S ≤ t} = ∪n {Sn ≤ t} ∈ Ft lim sup Tn = inf sup Tn , n→∞ n≥1 k≥n lim inf Tn = sup inf Tn n→∞ n≥1 k≥n Assume that inf(∅) = +∞. Definition 5.2.3 We say that T : Ω → I ∪ {∞} is a weakly optional time if for all t ∈ I, {T < t} ∈ Ft . Recall Ft+ ≡ Ft+ = ∩s>t Fs . Lemma 5.2.4 (1) A optional time is weakly optional. (2) If T is Ft -weakly optional then T is Ft+ optional. 62 Proof (1) For any t > 0, {T < t} = ∪∞ n=1 {T ≤ t − 1 } ∈ Ft . n (2) We show that If {T < t} ∈ Ft then T is a Ft+ stopping time: {T ≤ t} = ∩∞ n=1 {T < t + 1 Note that ∩N n=1 {T < t + n } = {T < t + 1 N} 1 }. n it follows that {T ≤ t} ∈ Ft+ . Let (Xt ) be a stochastic process with values in a measurable space (S, B). Let B ∈ B. Let TB (ω) = inf{t > 0 : Xt (ω) ∈ B}. We say TB is the hitting time of the set B by the process Xt For a discrete time process (Xn ), TB (ω) = inf {ω : Xn (ω) ∈ B}. n Example 5.2.5 Suppose that (Xn , n = 0, 1, 2, . . . ) is Fn -adapted. Let B be a measurable set. Then TB is an Fn stopping time: {TB ≤ n} = ∪k≤n {ω : Xk (ω) ∈ B} ∈ Fn . For continuous time stochastic process this is more complicated and in general we request some kind of separability of the process. For example of the process is continuous, the process is determined on its values on rational numbers. 
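In discrete time the identity {T_B ≤ n} = ∪_{k≤n} {X_k ∈ B} of Example 5.2.5 can be verified directly on simulated paths. A sketch (the random walk and the set B = [2, ∞) are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
# Simple random walk paths; column k holds the position at step k.
paths = rng.choice([-1, 1], size=(10_000, 30)).cumsum(axis=1)
hit = paths >= 2                          # the events {X_k in B}, B = [2, oo)

# T_B = first index k with X_k in B; argmax finds the first True.
# On paths that never enter B within the horizon, set T_B to 30 ("infinity").
T = np.where(hit.any(axis=1), hit.argmax(axis=1), 30)

# Stopping-time identity: {T_B <= n} = union over k <= n of {X_k in B},
# so whether T_B <= n is decided by the path up to time n alone.
n = 10
lhs_event = T <= n
rhs_event = hit[:, : n + 1].any(axis=1)
```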
Theorem 5.2.6 Let (X_t, t ≥ 0) be (F_t)-adapted with values in (S, B). We assume that S is a topological space.
1. Let (X_t) be left continuous and let U be an open set. Then T_U is weakly optional.
2. Let (X_t) be right continuous and let U be an open set. Then T_U is weakly optional.
3. Let S be a metric space and let (X_t) be continuous. If B is a closed set, then T_B is weakly optional.

Proof (1) Suppose (X_t) is left continuous. If s = T_U > 0, there is 0 < ε < s/2 such that X_t ∈ U for t ∈ [s − ε, s]. Hence
{T_U < t} = ∪_{r∈Q∩(0,t)} {X_r ∈ U} ∈ F_t.
If T_U = 0 then {T_U < t} = Ω. Hence T_U is weakly optional.

(2) Suppose (X_t) is right continuous. Let t > 0. If T_U = s < t then X_s ∈ U, and since U is open there is ε > 0 such that X_r ∈ U for r ∈ [s, s + ε]. Hence again
{T_U < t} = ∪_{r∈Q∩(0,t)} {X_r ∈ U} ∈ F_t,
and T_U is weakly optional.

(3) Since B is closed, for any t > 0,
{T_B ≤ t} = ∪_{m=1}^∞ ∩_{n=1}^∞ ∪_{r∈Q∩[1/m,t]} {d(X_r, B) < 1/n}.
If ω belongs to the right hand side, there is a sequence r_n in [1/m, t] such that d(X_{r_n}(ω), B) ≤ 1/n. There is a subsequence r_{n_k} → s_0 ∈ [1/m, t]; by continuity X_{s_0}(ω) ∈ B, so T_B(ω) ≤ t. If on the other hand s_0 = T_B(ω) ≤ t, then since B is closed, X_{s_0}(ω) ∈ B, and by continuity for any n there is a rational r_n ∈ (s_0/2, s_0] such that d(X_{r_n}(ω), B) < 1/n. Finally
{T_B ≤ 0} = ∩_n {T_B < 1/n} ∈ F_{0+}.

Theorem 5.2.7 For the Brownian filtration completed with null sets, F_{0+} = F_0. See page 38 of Mörters-Peres [19] for a nice proof.

Theorem 5.2.8 Let T be an almost surely finite stopping time. Then (B_{T+s} − B_T, s ≥ 0) is a Brownian motion. This can be proved by first assuming that T takes a countable number of values.

Remark 5.2.9 From now on we assume the standard assumptions on the filtration: it is right continuous and complete. We assume that our stochastic processes are càdlàg: almost every path has left limits and is right continuous. In fact, if F_t satisfies the usual conditions and X_t is a super-martingale such that t ↦ E[X_t] is continuous, then X_t has a càdlàg modification which is an F_t super-martingale.
This follows from the martingale convergence theorem. For each t consider Xqi where qi ∈ Q decreasing (increasing) to t. We see that Xt has finite left limit Xt− and finite right limit Xt+ . It can also be seen that the process (Xt+ , t ≥ 0) is an integrable Ft+ -super-martingale. 64 5.2.1 Extra Reading Note that ω ∈ {TB ≤ 0} means within any small neighbourhood of 0, there are points where Xt (ω) is arbitrarily small. In particular if sn ↓ 0, Xsn = 0 will imply that TB = 0. This we cannot tell at time 0 without further information. Let DB the first entrance time (passage time) DB = inf{t ≥ 0 : Xt ∈ B}. The difference between DB and TB is what happens at time 0. Suppose that B is closed. If X0 = 0 then DB = 0. If X0 6= 0 and (Xt ) is right continuous then DB 6= 0. So DB in this case is a Ft -stopping time. In the case of X0 ∈ B where B is an open set, then DB = TB = 0 if Xt right continuous. If X0 and Xt walk out of B the next instant and stays away from it, then TB = ∞, Example 5.2.10 Let Xt : Ω → R be sample continuous and a ∈ R. Let Da (ω) = inf {Xt (ω) ≥ a}. t≥0 Let Ft = σ{Xs : s ≤ t}. Then Da is a stopping time. By the continuity of the path, {DA ≤ t} = { sup Xs ≥ a} = { sup Xs ≥ a} 0≤s≤t s∈[0,t]∩Q = ∩∞ n=1 ∪s∈[0,t]∩Q {Xs ≥ a − 1 } ∈ Ft . n Example 5.2.11 Let a ∈ R. Let Ta = inf{t > 0 : Xt ≥ a}. Let (Xt ) be continuous. For t > 0 ∞ {Ta ≤ t} = ∪∞ m=1 ∩n=1 ∪r∈[ 1 ,t]∩Q {ω : Xr (ω) ≥ a − m 1 } ∈ Ft . n However {Ta = 0} = ∩{Ta ≤ n1 } is in general only F0+ measurable. 5.2.2 Lecture 13: Stopped Processes Consider (Xn ). Then {XT ∈ A} = ∪∞ k=1 ({Xk ∈ A} ∩ {T = k}) . Definition 5.2.12 Let T be a stopping time. Define FT = {A ∈ F∞ : A ∩ {T ≤ t} ∈ Ft , ∀t ≥ 0}. 65 This is the information available when an event arrives. It is clear that if T ≡ t then FT = Ft . Proposition 5.2.13 Let S, T be stopping times. (1) If S < T , then FS ⊂ FT . (2) If S ≤ T , for any A ∈ FS , S1A + T 1Ac is a stopping time. 
(3) If S, T are stopping time then FS ∩{S ≤ T } ⊂ FS∧T and FS∧T = FS ∩FT . Proof (1) If A ∈ FS , A ∩ {T ≤ t} = (A ∩ {S ≤ t}) ∩ {T ≤ t} ∈ Ft and hence A ∈ FT . (2) Since FS ⊂ FT , {S1A + T 1Ac ≤ t} = ({S ≤ t} ∩ A) ∪ ({T ≤ t} ∩ Ac ) ∈ FT . (3) Take A ∈ FS . Then A ∩ {S ≤ T } ∩ {S ∧ T ≤ t} = (A ∩ {S ≤ T } ∩ {S ∧ T ≤ t}) ∩ {S ≤ t} ∈ Ft . Hence A ∩ {S ≤ T } ∈ FS∧T . Furthermore A = (A ∩ {S ≤ T }) ∪ (A ∩ {T ≤ S}) ∈ FS∧T . In particular FS ⊂ FS∧T . By symmetry, FT ⊂ FS∧T . For a nice account of stopping times see Kallenberg [14]. Definition 5.2.14 A stochastic process X : [0, R) × Ω → E is progressively measurable if for each t, (s, ω) 7→ Xs (ω) as a map from ([0, t] × Ω, B([0, t]) ⊗ Ft ) to (E, B(E)) is measurable. We often assume in addition that X : R+ × Ω → E is measurable. Theorem 5.2.15 If T is a stopping time and Xt is progressively measurable then XT if FT -measurable. For a proof see Revuz-Yor [23]. Let 0 = t0 < t1 < · · · < tn < t, and H−1 ∈ F0 and Hi ∈ Fti be measurable functions. Let n−1 X H(t) = H−1 1{0} (t) + Hi 1(ti ,ti+1 ] . i=0 66 It is an elementary process. It is easy to check that it is progressively measurable. A left continuous (or right continuous ) adapted process is progressively measurable: Given a partition of R, we may approximate such a process by processes that is a constant random function on each time interval. If T is a stopping time and Xt : Ω → E is progressively measurable then 1{T <∞} XT is FT measurable. Proposition 5.2.16 If T is weakly optional, then there is stopping times Tn such that Tn takes only a finite number of values and Tn decreases to T . Proof Define Tn = Tn (ω) = 1 n 2n [2 T + 1], j+1 j j+1 , if T (ω) ∈ , , 2n 2n 2n Then Tn decreases to T and for j 2n j+1 2n , ≤t< {Tn ≤ t} = {Tn ≤ j = 0, 1, 2 . . . . j j } = {T < n }. n 2 2 and Tn are stopping times. 5.3 Lecture 14: The Martingale Convergence Theorem Let Hn be a process such that Hn ∈ Fn−1 (previsible). 
It is the stake one puts down at time n − 1, betting that X_n, an adapted process, goes up. It is determined by events up to time n − 1. The winning at time n is H_n(X_n − X_{n−1}). Define a new process H · X (the total winnings up to time n), called the martingale transform of X by H, by:
(H · X)_0 = 0,
(H · X)_n = H_1(X_1 − X_0) + ... + H_n(X_n − X_{n−1}), n ≥ 1.
This can be considered a discrete ‘stochastic integral’, which can be symbolically denoted by ∫_0^n H_s dX_s.

Lemma 5.3.1 Let H_n be measurable with respect to F_{n−1}, with |H_n(ω)| ≤ K for some K > 0.
(1) If H_n ≥ 0 and (X_n) is a super-martingale, then ((H · X)_n) is a super-martingale.
(2) If (X_n) is an F_n-martingale, then ((H · X)_n) is a martingale.

Proof (1) Write Y_n = (H · X)_n. The last winning is Y_n − Y_{n−1} = H_n(X_n − X_{n−1}). Since H_n is bounded, (H · X)_n ∈ L^1 and
E{Y_n − Y_{n−1}|F_{n−1}} = H_n E{X_n − X_{n−1}|F_{n−1}}.
Since H_n ≥ 0 and (X_n) is a super-martingale, E{Y_n − Y_{n−1}|F_{n−1}} ≤ 0 and (Y_n) is a super-martingale.
(2) In case (X_n) is a martingale, E{Y_n − Y_{n−1}|F_{n−1}} = H_n E{X_n − X_{n−1}|F_{n−1}} = 0.

Let a < b. By ‘an upcrossing’ of [a, b] by (X_n) we mean a journey starting from below a and ending above b: say X_n < a and m = inf{m > n : X_m > b}; then connecting the points X_n, X_{n+1}, ..., X_m gives an upcrossing in the graph.

Lemma 5.3.2 (page 107, Williams [29]) Let X be a super-martingale and let U_N[a, b](ω) be the number of upcrossings of [a, b] made by {X_n} by time N. Then
(b − a) E U_N([a, b]) ≤ E(X_N − a)^−.

Proof Let H be the betting strategy that plays 1 unit from the first time X gets below a, keeps playing until X gets above b, then stops playing until X is again below a, and so on. Then
(H · X)_N ≥ (b − a) U_N([a, b]) − [X_N(ω) − a]^−.
Taking expectations and using the fact that (H · X) is a super-martingale starting from 0, so that E(H · X)_N ≤ 0, we see that
0 ≥ E(H · X)_N ≥ (b − a) E U_N([a, b]) − E[X_N − a]^−.

If (a_n) is a sequence that crosses from a to b infinitely often for some a < b, then a_n cannot have a limit. Conversely, if a_n does not have a limit, there are numbers a < b such that a_n crosses [a, b] infinitely often.
This is the philosophy behind the following martingale convergence theorem.

Theorem 5.3.3 Let (X_n) be a discrete time super-martingale with sup_n E(X_n)^− < ∞. Then lim_{N→∞} X_N exists almost surely. If moreover sup_n E|X_n| < ∞, then X_∞ := lim_{N→∞} X_N is in L^1.

Proof Let A be the set of ω such that lim_{N→∞} X_N(ω) does not exist. If ω ∈ A, there are two rational numbers a < b such that
lim inf_{N→∞} X_N(ω) < a < b < lim sup_{N→∞} X_N(ω).
It is clear that
A = ∪_{a,b∈Q, a<b} {ω : lim inf_{N→∞} X_N(ω) < a < b < lim sup_{N→∞} X_N(ω)} = ∪ Λ_{a,b}.
We will show that P(Λ_{a,b}) = 0. If ω ∈ Λ_{a,b}, there must be infinitely many upcrossings of [a, b]: lim_{N→∞} U_N([a, b])(ω) = ∞. Hence
Λ_{a,b} ⊂ {ω : lim_{N→∞} U_N([a, b])(ω) = ∞}.
On the other hand, by the upcrossing lemma,
(b − a) lim_{N→∞} E U_N([a, b]) ≤ sup_N E(X_N − a)^− ≤ sup_N E(X_N)^− + |a| < ∞.
By the monotone convergence theorem,
E lim_{N→∞} U_N([a, b]) = lim_{N→∞} E U_N([a, b]) < ∞.
In particular lim_{N→∞} U_N([a, b]) < ∞ almost surely and P(Λ_{a,b}) = 0. To see that lim_{N→∞} X_N is integrable when (X_n) is L^1 bounded, apply Fatou’s lemma:
E|lim_{N→∞} X_N| = E lim_{N→∞} |X_N| ≤ lim inf_{N→∞} E|X_N| ≤ sup_N E|X_N| < ∞.

Note that E|X_t| = E(X_t^+) + E(X_t^−), so L^1 boundedness is a stronger condition than sup_n E(X_n)^− < ∞.

Let F_t be a right continuous complete filtration and X_t an L^1 bounded super-martingale. Assume that t ↦ E[X_t] is right continuous. Then there is a càdlàg modification of X_t which is an F_t super-martingale; see Revuz-Yor [23], Chapter 2, section 2, for detail. By the above theorem, such a continuous time super-martingale converges along every increasing sequence {t_k}: {X_{t_k}} is a discrete time super-martingale.

Let s_1 > s_2 > s_3 > ... be a decreasing sequence of numbers, so that
E{X_{s_k}|F_{s_{k+1}}} = X_{s_{k+1}}.
Let Y_n = X_{s_n} and G_n = F_{s_n}. Then G_n is a decreasing sequence of σ-algebras. By the same argument as before, Y_{−∞} := lim_{n→∞} Y_n exists almost surely. This motivates the study of backward martingales, as follows. Consider a decreasing family of σ-algebras ... ⊂ F_{−m} ⊂ ... ⊂ F_{−2} ⊂ F_{−1}. Let Z_{−n} = E{Z_{−1}|F_{−n}}.
Let F_{−∞} = ∩_n F_{−n}.

Remark 5.3.4 Since Z_{−1} ∈ L¹, (Z_{−n}) is L¹ bounded and uniformly integrable. It follows that for any A ∈ F_{−∞} ⊂ F_{−m}, E[Z_{−1} 1_A] = E[Z_{−m} 1_A]. Take m → ∞ to see that E(Z_{−1} 1_A) = E(Z_{−∞} 1_A) and hence Z_{−∞} = E{Z_{−1} | F_{−∞}}.

Let f : R_+ → R be a function. Then lim_{t→T} f(t) exists if and only if for every sequence t_n → T the limit lim_{n→∞} f(t_n) exists, in which case all these limits agree. If the function is furthermore right continuous, it is determined by its values on the rational numbers. This leads to the following theorem.

Theorem 5.3.5 If (X_t, t ∈ [0, T)), where T ∈ R_+ ∪ {∞}, is a right continuous super-martingale with sup_{t<T} E[X_t^-] < ∞, then lim_{t→T} X_t exists almost surely.

Proof Let {t_n} be an increasing sequence of rational numbers converging to T. The discrete time theorem applies to show that lim_{t↑T, t∈Q} X_t exists almost surely; right continuity then gives the limit along all of [0, T).

The above theorem also holds if (X_t) is a sub-martingale with sup_{t<T} E[X_t^+] < ∞.

Let I = [0, T) where T is finite or ∞. If f ∈ L¹ we may define a martingale f_t := E{f | F_t}. The following theorem says that all uniformly integrable L¹ martingales arise this way. For instance, let f ∈ L¹ with f ≥ 0 and Ef = 1, and define a probability measure Q on F by dQ/dP = f. Then f_t is the density of the two measures restricted to F_t. Conversely, given a martingale (f_t, t < 1), can we define a measure Q on F_1 such that the density of Q with respect to P, restricted to F_t, is f_t? The following theorem says yes if (f_t, t < 1) is uniformly integrable. The theorem is taken from Revuz-Yor [23] (Chapter 2, Section 3, Theorem 3.1).

Theorem 5.3.6 (The End Point of a Martingale) Let (X_t, t ∈ [0, T)) be a right continuous martingale. The following are equivalent; below we write X_T for the end point.

(1) X_t converges in L¹, as t approaches T, to a random variable X_T.

(2) There exists an L¹ random variable X_T such that X_t = E{X_T | F_t}.

(3) (X_t, t < T) is uniformly integrable.

Proof In case (1), X_t converges in L¹, which implies that {X_t} is uniformly integrable, c.f. Proposition 2.5.8.
Conversely, uniform integrability implies that (X_t) is L¹ bounded, so by Theorem 5.3.5 there is an L¹ random variable X_T such that lim_{t→T} X_t = X_T almost surely. Note that lim_{t→T} E|X_t − X_T| = 0 if and only if X_t → X_T in probability and {X_t} is uniformly integrable. Hence (3) is equivalent to (1).

Assume (2). By Lemma 4.6.5, (X_t, t ≥ 0) is uniformly integrable, hence (3) holds. Finally, assume (1) and (3); we prove (2). By the martingale property, for any u > t, X_t = E{X_u | F_t}. By the uniform integrability,

X_t = lim_{u→T} E{X_u | F_t} = E{X_T | F_t}.

5.4 Lecture 15: The Optional Stopping Theorem

We say that S ≤ T if S(ω) ≤ T(ω) for a.s. ω. We say that T is bounded if there is C > 0 such that T(ω) ≤ C for all ω.

Theorem 5.4.1 (Doob's Optional Stopping Theorem) Let S, T : Ω → {0, 1, 2, . . . } be bounded stopping times with S ≤ T.

(1) If (X_n) is a super-martingale, then E X_T ≤ E X_S.

(2) If (X_n) is a sub-martingale, then E X_S ≤ E X_T.

(3) If (X_n) is a martingale, then E X_S = E X_T.

Moreover, for any stopping time T the stopped process X^T is a martingale.

Proof We only prove part (3). Recall the martingale transform:

(H · X)_0 = 0, (H · X)_n = H_1(X_1 − X_0) + · · · + H_n(X_n − X_{n−1}), n ≥ 1.

Define H_n^T = 1_{n≤T} (the last bet is on the T-th game). Then

(H^T · X)_n = Σ_{k=1}^n 1_{k≤T} (X_k − X_{k−1}) = X_{n∧T} − X_0.

By Lemma 5.3.1, (H^T · X) is a martingale and so is X^T. If T ≤ N, take n = N; then E(H^T · X)_N = E(X_T − X_0) = 0, hence E(X_T) = E(X_0). If (X_n) is uniformly integrable, then X_∞ = lim_{n→∞} X_n exists, and we know that (H^T · X)_n = X_{n∧T} − X_0. For the super- and sub-martingale parts, take H_n^{S,T} = 1_{S<n≤T}.

We assume that (F_t) satisfies the standard assumptions. Recall that if (X_t : t ∈ I) is progressively measurable then X_T ∈ F_T for any stopping time T.

Proposition 5.4.2 Let (X_t : t ∈ I) be an integrable progressively measurable process.

(1) Suppose that for all bounded stopping times S ≤ T, E X_T = E X_S. Then E{X_T | F_S} = X_S.
(2) Suppose that for all bounded stopping times S ≤ T, E X_T ≤ E X_S. Then E{X_T | F_S} ≤ X_S.

Proof Let A ∈ F_S and define τ = S 1_A + T 1_{A^c}, a bounded stopping time with τ ≤ T. Then

E X_T = E[X_T 1_A] + E[X_T 1_{A^c}], E X_τ = E[X_S 1_A] + E[X_T 1_{A^c}].

(1) For the first statement, E X_τ = E X_T implies that E[X_T 1_A] = E[X_S 1_A] for all A ∈ F_S, and the conclusion holds.

(2) For the second statement, the assumption gives E X_τ ≥ E X_T, so E[X_T 1_A] ≤ E[X_S 1_A]. Since E[X_T 1_A] = E[E{X_T | F_S} 1_A], we have E[(E{X_T | F_S} − X_S) 1_A] ≤ 0 for every A ∈ F_S. Hence E{X_T | F_S} ≤ X_S.

Theorem 5.4.3 (Optional Stopping Theorem)

1. Let (X_t, t ≥ 0) be a right continuous martingale. Then for all bounded stopping times S ≤ T, E{X_T | F_S} = X_S, a.s.

2. If furthermore {X_t, t ≥ 0} is uniformly integrable, define X_T = X_∞ on {T = ∞}. Then for all stopping times S ≤ T (not necessarily bounded),

X_S = E{X_∞ | F_S}, E{X_T | F_S} = X_S.

3. Let (X_t, t ≥ 0) be a right continuous super-martingale and let S ≤ T be two bounded stopping times. Then E{X_T | F_S} ≤ X_S almost surely. If furthermore {X_t, t ≥ 0} is uniformly integrable, E{X_T | F_S} ≤ X_S for all stopping times S ≤ T.

Proof 1) Let K ∈ R be such that S(ω) ≤ T(ω) ≤ K. Let S_n = 2^{−n}(⌊2^n S⌋ + 1). Then X_{S_n} = E{X_K | F_{S_n}} by Doob's optional stopping theorem (together with Proposition 5.4.2), and E X_{S_n} = E X_K. By Lemma 4.6.5, (X_{S_n}, n = 1, 2, . . . ) is a uniformly integrable family. Hence E X_S = lim_{n→∞} E X_{S_n} = E X_K. That E{X_T | F_S} = X_S follows from Proposition 5.4.2.

2) If {X_t} is a uniformly integrable right continuous martingale then, by Theorem 5.3.6, there is an L¹ random variable X_∞ with X_t = E{X_∞ | F_t}. Let S_n = 2^{−n}(⌊2^n S⌋ + 1) and A_k = {S_n = k/2^n} ∈ F_{S_n}. Take t = k/2^n. Since A_k ∈ F_t,

E(X_{k/2^n} 1_{A_k}) = E(X_∞ 1_{A_k}), so E(X_{S_n} 1_{A_k}) = E(X_{k/2^n} 1_{A_k}) = E(X_∞ 1_{A_k}).

Summing over k we see that E X_{S_n} = E X_∞ and X_{S_n} = E{X_∞ | F_{S_n}}. The family (X_{S_n}) is uniformly integrable. For A ∈ F_S ⊂ F_{S_n}, E[X_{S_n} 1_A] = E[X_∞ 1_A]; taking n → ∞ we obtain E X_S = E X_∞ and X_S = E{X_∞ | F_S}.
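The conclusion E X_T = E X_S of the optional stopping theorems above can be checked by simulation. The sketch below (our illustration, with arbitrary parameters; the cap keeps the stopping time bounded) stops a simple symmetric random walk, a martingale, on first exit from an interval or at a deterministic time:

```python
import random

def stopped_mean(n_paths=20000, lo=-3, hi=5, cap=100, seed=1):
    """Monte Carlo estimate of E[X_T] for a simple symmetric random walk X,
    where T = min(first exit time of (lo, hi), cap) is a bounded stopping time."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_paths):
        x = 0
        for _ in range(cap):
            x += 1 if rng.random() < 0.5 else -1
            if x <= lo or x >= hi:
                break   # stopped on exit; otherwise stopped at the cap
        total += x
    return total / n_paths
```

Optional stopping gives E X_T = E X_0 = 0, so the estimate should be near 0 up to Monte Carlo error.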
Hence E{X_T | F_S} = E{E{X_∞ | F_T} | F_S} = X_S.

For super-martingales we only prove the case when (X_t) is uniformly integrable. For s ≤ t the super-martingale property gives X_s ≥ E{X_t | F_s}. Since lim_{t→∞} E{X_t | F_s} = E{X_∞ | F_s}, we get X_s ≥ E{X_∞ | F_s}. The rest of the proof is as above.

5.5 Martingale Inequalities (I)

The following martingale inequality is inspired by the following identity for a Brownian motion B_t:

P(sup_{s≤t} B_s ≥ a) = 2P(B_t ≥ a) = P(|B_t| ≥ a) ≤ a^{−p} E|B_t|^p.

Proposition 5.5.1 Let (X_t, t ∈ I) be a right continuous martingale or a positive sub-martingale, where I is an interval.

1. Maximal inequality. For p ≥ 1 and λ > 0,

P(sup_{t∈I} |X_t| ≥ λ) ≤ λ^{−p} sup_{t∈I} E|X_t|^p.

2. L^p inequality. For p > 1,

E sup_{t∈I} |X_t|^p ≤ (p/(p−1))^p sup_{t∈I} E|X_t|^p.

Remark 5.5.2

• If (X_t) is a martingale with values in a Banach space, ‖X_t‖ is a positive sub-martingale; this follows since the norm is a convex function.

• Let s < t. If (X_t) is a martingale, E|X_s|^p = E(|E{X_t | F_s}|^p) ≤ E|X_t|^p. In particular E|X_s|^p increases with time and sup_{s≤t} E|X_s|^p ≤ E|X_t|^p.

• If (X_t) is a sub-martingale, then X_s ≤ E{X_t | F_s}. If it is positive, |X_s|^p ≤ |E{X_t | F_s}|^p and the discussion above remains valid.

• By the Markov inequality, for any p > 0,

P(sup_{t∈I} |X_t| ≥ λ) ≤ λ^{−p} E sup_{t∈I} |X_t|^p.

This inequality holds for any process; for a martingale it is weaker than the maximal inequality.

• If (X_t) satisfies the conditions of the proposition, the Markov inequality combined with the L^p inequality 'almost' recovers the maximal inequality for p > 1:

P(sup_{t∈I} |X_t| ≥ λ) ≤ λ^{−p} E sup_{t∈I} |X_t|^p ≤ λ^{−p} (p/(p−1))^p sup_{t∈I} E|X_t|^p.

See Revuz-Yor [23] for a proof. The standard convention for the maximum is X* = sup_{t∈I} |X_t|.

Corollary 5.5.3 Let (X_t) be a right continuous martingale such that sup_{t∈[0,∞)} E|X_t|^p < ∞ for some p > 1. Then X_t converges to X_∞ in L^p as t → ∞.

Proof By the martingale convergence theorem, X_∞ = lim_{t→∞} X_t exists and is finite a.s.
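Proposition 5.5.1 can be checked numerically for a discretised Brownian motion, which is a martingale. The sketch below (our construction, with illustrative parameters) estimates P(sup_{s≤t} |B_s| ≥ λ) and the p = 2 maximal-inequality bound λ^{−2} sup_{s≤t} E B_s² = λ^{−2} E B_t²:

```python
import random, math

def maximal_inequality_check(t=1.0, lam=2.0, n_steps=200, n_paths=4000, seed=2):
    """Estimate P(sup_{s<=t} |B_s| >= lam) and the Doob bound E[B_t^2]/lam^2
    for a Brownian motion sampled on a grid of n_steps points."""
    rng = random.Random(seed)
    step_sd = math.sqrt(t / n_steps)
    hits, second_moment = 0, 0.0
    for _ in range(n_paths):
        b, running_max = 0.0, 0.0
        for _ in range(n_steps):
            b += rng.gauss(0.0, step_sd)
            running_max = max(running_max, abs(b))
        hits += running_max >= lam        # tail event for the running maximum
        second_moment += b * b            # contributes to E[B_t^2]
    return hits / n_paths, second_moment / (n_paths * lam * lam)
```

The estimated tail probability should sit well below the bound, with room to spare since the grid maximum underestimates the continuous one.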
Note that |X_t|^p ≤ sup_t |X_t|^p, and the latter belongs to L¹ by the L^p inequality. Thus (|X_t|^p) is uniformly integrable, E|X_∞|^p ≤ lim inf_{t→∞} E|X_t|^p < ∞, and by the dominated convergence theorem E|X_t − X_∞|^p → 0.

5.6 Lecture 16: Local Martingales

Definition 5.6.1 An adapted stochastic process (X_t : t ≥ 0) is a local martingale if there is an increasing sequence of stopping times T_n with lim_{n→∞} T_n = ∞ a.s. such that (X_t^{T_n} − X_0, t ≥ 0) is a uniformly integrable martingale for each n. We say that T_n reduces X.

A sample continuous martingale is a local martingale (take T_n = n). A local martingale (M_t) is a martingale if {M_T : T a bounded stopping time} is uniformly integrable. This is the case if {|M_t|, t ≥ 0} is bounded by an L¹ random variable Z.

Remark 5.6.2 If X_t is a martingale then E X_t = E X_0. If X_t is only a local martingale this no longer holds; indeed, given any function m(t) of bounded variation there is a local martingale whose expectation process is m(t). A local martingale which is not a martingale is called a strictly local martingale; otherwise it is a true martingale. See Elworthy-Li-Yor [6] for related discussions.

Let T > 0.

Theorem 5.6.3 Let M_t be a continuous local martingale. If M_t has finite total variation on [0, T], then M_t = M_0 for every t ≤ T.

Proof The proof is the same as that for Brownian motion; we did not cover it in the lectures. We may assume M_0 = 0. Let t ≤ T and set

M^{TV}(t, ω) = sup_∆ Σ_{j=0}^{N−1} |M_{t_{j+1}}(ω) − M_{t_j}(ω)|,

where ∆ ranges through all partitions 0 = t_0 < t_1 < · · · < t_N = t of [0, t]. It is increasing and continuous in t. Let T_n = inf{t : M^{TV}(t) ≥ n}.

Fix n and write X_t = M_t^{T_n}. Then X_t is bounded by n and is a martingale. For a martingale, E X_t² = E Σ_{i=0}^{N−1} |X_{t_{i+1}} − X_{t_i}|². Indeed,

E Σ_{i=0}^{N−1} |X_{t_{i+1}} − X_{t_i}|² = Σ_{i=0}^{N−1} E E{|X_{t_{i+1}} − X_{t_i}|² | F_{t_i}}
= Σ_{i=0}^{N−1} E E{X_{t_{i+1}}² − 2 X_{t_{i+1}} X_{t_i} + X_{t_i}² | F_{t_i}}
= Σ_{i=0}^{N−1} (E X_{t_{i+1}}² − E X_{t_i}²) = E X_t².

Hence

E X_t² = E Σ_{i=0}^{N−1} |X_{t_{i+1}} − X_{t_i}|² ≤ E [ max_i |X_{t_{i+1}} − X_{t_i}| Σ_{i=0}^{N−1} |X_{t_{i+1}} − X_{t_i}| ] ≤ n E max_i |X_{t_{i+1}} − X_{t_i}|.

Since X_t is uniformly continuous on [0, t] and bounded, by the dominated convergence theorem E max_i |X_{t_{i+1}} − X_{t_i}| → 0 as the partition is refined. Hence E[X_t²] = 0 and M_t^{T_n} = 0 almost surely. This means that M_t = 0 on {t < T_n}. Since T_n → ∞ as n → ∞, the conclusion holds.

Chapter 6 The Quadratic Variation Process

Definition 6.0.4 A stochastic process of finite variation is an adapted stochastic process with sample paths of finite total variation on every finite time interval.

Definition 6.0.5 A semi-martingale is an adapted stochastic process of the form X_t = X_0 + M_t + A_t, where M_t is a local martingale, A_t is of finite total variation and M_0 = A_0 = 0. A continuous semi-martingale X_t is of this form with M_t and A_t continuous.

Definition 6.0.6 Let H_0² be the space of L² bounded continuous martingales that vanish at 0. For M ∈ H_0², define ‖M‖ = (E(M_∞)²)^{1/2}.

Proposition 6.0.7 The vector space H_0², with the inner product associated to the norm ‖·‖, is a Hilbert space.

Proof Let (M_t^n) be a Cauchy sequence in H_0². Then (M_∞^n) is a Cauchy sequence in L². Let X = lim_{n→∞} M_∞^n; then X ∈ L². Define M_t = E{X | F_t}. Then (M_t) is a martingale, and by Jensen's inequality

E M_t² ≤ E (E{M_∞ | F_t})² ≤ E M_∞².

Since t ↦ E M_t is continuous, we may take a right continuous version of (M_t); then lim_{t→∞} M_t = X by Theorem 5.3.6. By Doob's L² inequality,

E sup_t |M_t^n − M_t|² ≤ 4 E|M_∞^n − X|² → 0.

A subsequence of M^n converges to M a.s. uniformly in t. Hence (M_t) is a continuous martingale and M_0 = 0. Since {M_∞^n, n = 1, 2, . . . } is uniformly integrable, taking the limit n → ∞ in M_t^n = E{M_∞^n | F_t} gives M_t = E{M_∞ | F_t} and M_∞ = X a.s.

6.1 Lectures 18-20: The Basic Theorem

Let (M_n) be a martingale. Then M_n = M_0 + Σ_k (M_k − M_{k−1}) is a sum of orthogonal increments, and E(M_n)² = E M_0² + E Σ_{k=1}^n (M_k − M_{k−1})².
It follows that (M_n) is bounded in L² if and only if Σ_{k=1}^∞ E(M_k − M_{k−1})² < ∞.

For a martingale (M_n), define the discrete stochastic integral M · M by

(M · M)_n = Σ_{k=0}^{n−1} M_k (M_{k+1} − M_k).

It is a (local) martingale and, since 2a(b − a) = b² − a² − (b − a)²,

2 (M · M)_n = (M_n)² − (M_0)² − Σ_{k=1}^{n} (M_k − M_{k−1})².

Theorem 6.1.1

(1) For any continuous local martingales M and N there exists a unique continuous process ⟨M, N⟩_t of finite variation, vanishing at 0, such that M_t N_t − ⟨M, N⟩_t is a local martingale. This process is called the bracket process, or the quadratic covariation of M and N.

(2) The process ⟨M, N⟩ has the following properties:

(a) It is symmetric and bilinear, and

⟨M, N⟩ = ¼ [⟨M + N, M + N⟩ − ⟨M − N, M − N⟩].   (6.1)

(b) ⟨M − M_0, N − N_0⟩_t = ⟨M, N⟩_t.

(3) ⟨M⟩_t ≡ ⟨M, M⟩_t is increasing.

(4) If (M_t) is bounded, M_t² − ⟨M⟩_t is a martingale.

Proof We first establish the uniqueness. Let A_t and A'_t be two processes of finite variation such that M_t N_t − A_t and M_t N_t − A'_t are local martingales. Then A_t − A'_t is a continuous local martingale of finite variation; by Theorem 5.6.3, A_t − A'_t = 0 a.s.

The uniqueness is the key to the properties in (2). (2a): M_t N_t − ⟨M, N⟩_t is a local martingale and so is N_t M_t − ⟨N, M⟩_t; the symmetry follows. Similarly, if a, b ∈ R and M' is another local martingale, ⟨aM + bM', N⟩ = a⟨M, N⟩ + b⟨M', N⟩. Next note that ¼(M + N)² − ¼(M − N)² = MN, so

MN − ¼ (⟨M + N, M + N⟩ − ⟨M − N, M − N⟩)

is a local martingale; by the uniqueness of the bracket process, (6.1) follows. For (2b), note that for any real number a, (M − a)N − MN = −aN is a local martingale, hence ⟨M − a, N⟩ = ⟨M, N⟩; and since M_0 N is a local martingale, ⟨M_0, N⟩ = 0. Part (2b) follows.

The proof of the existence of the quadratic variation process rests on two intuitive ideas:

• Itô's formula:

M_t² = M_0² + 2 ∫_0^t M_s dM_s + ⟨M, M⟩_t.
• For any sequence of partitions ∆_n with mesh |∆_n| → 0,

⟨M⟩_t = lim_{n→∞} Σ_i (M_{t_{i+1}^n} − M_{t_i^n})², in probability.

We first assume that M_t is bounded; by part (2b) we may also assume M_0 = 0. We approximate M_t by a family of stochastic processes that are piecewise constant in time: for each n we construct a process H^n with |H_t^n − M_t| ≤ 2^{−n} for all t. The approximation depends on the path. Define a family of stopping times:

τ_0^n = 0, τ_{k+1}^n = inf{t > τ_k^n : |M_t − M_{τ_k^n}| ≥ 2^{−n}}.

Then lim_{k→∞} τ_k^n = ∞. Define

H_t^n = Σ_{k=0}^∞ M_{τ_k^n} 1_{(τ_k^n, τ_{k+1}^n]}(t).

Then sup_t |H_t^n − M_t| ≤ 2^{−n}. We define the stochastic integral

(H^n · M)_t ≡ ∫_0^t H_s^n dM_s = Σ_{k≥0} M_{τ_k^n} (M_{t∧τ_{k+1}^n} − M_{t∧τ_k^n}).

Let

Q_t^n = Σ_{k=0}^∞ (M_{t∧τ_{k+1}^n} − M_{t∧τ_k^n})².

Then

M_t² = 2 ∫_0^t H_s^n dM_s + Q_t^n

(check! this is a simple algebraic manipulation). The process ∫_0^· H_s^n dM_s is a martingale (check!) and belongs to H_0² (check!). Furthermore {(H^n · M), n = 1, 2, . . . } is a Cauchy sequence in H_0², see Lemma 6.1.2 below. Let Z = lim_{n→∞} (H^n · M). Then Z ∈ H_0² and M_t² − 2Z_t is a continuous process. Define ⟨M⟩_t = M_t² − 2Z_t. By Doob's L² inequality,

sup_t |Q_t^n − ⟨M⟩_t| = sup_t |M_t² − 2 ∫_0^t H_s^n dM_s − ⟨M⟩_t| = 2 sup_t |Z_t − ∫_0^t H_s^n dM_s| → 0

in probability. In what follows, by passing to an almost surely convergent subsequence we may assume the convergence holds for P-almost every ω.

We prove part (3). Since Q_t^n = Σ_{k=0}^∞ (M_{t∧τ_{k+1}^n} − M_{t∧τ_k^n})², we have Q_{τ_{k+1}^n}^n ≥ Q_{τ_k^n}^n for each n, so Q^n is non-decreasing along {τ_k^n(ω)}; and since {τ_k^n} ⊂ {τ_k^{n+1}}, ⟨M⟩_t is non-decreasing on the closure of the set {τ_k^n(ω), n, k = 1, 2, . . . }. Fix ω and let (a, b) be an interval in the complement of this closure. Then |M_s(ω) − M_a(ω)| ≤ 2^{−n} for all s ∈ [a, b] and all n, so M_s(ω) = M_a(ω) and hence ⟨M⟩_s(ω) = ⟨M⟩_a(ω) there. It follows that ⟨M⟩_t is increasing. Part (4) is clear from the proof.

Now we consider the case when M_t is not bounded.
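The approximation of ⟨M⟩ by sums of squared increments can be illustrated for Brownian motion, where ⟨B⟩_t = t. Below is a quick sketch (ours, with arbitrary mesh size) that samples one path on a fine grid and sums the squared increments:

```python
import random, math

def squared_increment_sum(t=1.0, n=2**14, seed=3):
    """Sample one Brownian path on a grid of n steps over [0, t] and return
    the sum of squared increments, which approximates <B, B>_t = t."""
    rng = random.Random(seed)
    step_sd = math.sqrt(t / n)  # each increment is N(0, t/n)
    return sum(rng.gauss(0.0, step_sd) ** 2 for _ in range(n))
```

The sum has mean t and variance 2t²/n, so it concentrates around t as the partition is refined.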
Let τ_n be the first time that |M_t| ≥ n. Then ⟨M^{τ_n}⟩ is defined, and ⟨M^{τ_n}⟩^{τ_m} = ⟨M^{τ_m}⟩^{τ_m}. For all m < n, ⟨M^{τ_n}⟩ = ⟨M^{τ_m}⟩ on {t < τ_m}. We may therefore define ⟨M⟩_t = ⟨M^{τ_m}⟩_t for ω ∈ {t < τ_m}. Since lim_{n→∞} τ_n = ∞, this defines ⟨M⟩_t everywhere. Since

(M^{τ_n})_t² − ⟨M^{τ_n}⟩_t = (M² − ⟨M⟩)_t^{τ_n}

is a martingale, M² − ⟨M⟩ is a local martingale.

Lemma 6.1.2 {(H^n · M), n = 1, 2, . . . } is a Cauchy sequence in H_0².

Proof Let m < n. Then

‖H^n · M − H^m · M‖² = ‖(H^n − H^m) · M‖² = E ( Σ_{k≥0} C_k (M_{τ_{k+1}^n} − M_{τ_k^n}) )²,

where C_k is either 0 (when the indices for m and n coincide) or of the form M_{τ_k^n} − M_{τ_j^m} with τ_j^m < τ_k^n. The sum consists of orthogonal terms and |C_k| ≤ 2^{−(m−1)}. It follows that ‖H^n · M − H^m · M‖² ≤ 2^{−2(m−1)} E(M_∞)².

Proposition 6.1.3 Let T be a stopping time and let M, N be continuous local martingales. Then ⟨M^T, N^T⟩ = ⟨M, N⟩^T = ⟨M, N^T⟩.

Proof We observed earlier that ⟨M, M⟩^T = ⟨M^T, M^T⟩. Now

M_t^T N_t^T − M_0 N_0 − ¼ [⟨M + N, M + N⟩_t^T − ⟨M − N, M − N⟩_t^T]

is a local martingale, hence ⟨M^T, N^T⟩ = ⟨M, N⟩^T. Similarly ⟨M, N^T⟩^T = ⟨M^T, N^T⟩ = ⟨M, N⟩^T, and it follows that ⟨M, N^T⟩ = ⟨M, N⟩^T.

Proposition 6.1.4 If M is a continuous local martingale, then ⟨M, M⟩_t = 0 if and only if M_s = M_0 for s ∈ [0, t].

Proof Assume that ⟨M, M⟩_t = 0. First suppose that M_t is bounded and M_0 = 0. For s ≤ t, E[M_s² − M_0²] = E⟨M, M⟩_s = 0, so E(M_s²) = 0 for all s ≤ t. In general, let T_n be a reducing sequence of stopping times; then M_t^{T_n} − M_0 = 0 almost surely for each n, and taking n → ∞ completes the proof. See Revuz-Yor, Proposition 1.13.

Theorem 6.1.5 Let M and N be two continuous local martingales. For any sequence of partitions with |∆_n| → 0,

⟨M, N⟩_t = lim_{|∆_n|→0} Σ_{j=0}^∞ (M_{t∧t_{j+1}^n} − M_{t∧t_j^n})(N_{t∧t_{j+1}^n} − N_{t∧t_j^n}),

with convergence in probability.

Definition 6.1.6 Let X and Y be two continuous processes.
If for any sequence of partitions with |∆_n| → 0 the limit

lim_{n→∞} Σ_{j=0}^∞ (X_{t∧t_{j+1}^n} − X_{t∧t_j^n})(Y_{t∧t_{j+1}^n} − Y_{t∧t_j^n})

exists, we define it to be ⟨X, Y⟩_t. See Revuz-Yor for a proof. The right hand side gives another definition of the quadratic variation process, which we use for the remainder of the section.

Proposition 6.1.7

• If A_t is a continuous process of finite variation and X_t is a continuous semi-martingale, then ⟨X, A⟩_t = 0.

• If X_t = M_t + A_t and Y_t = N_t + C_t are two continuous semi-martingales with local martingale parts M and N, then ⟨X, Y⟩_t = ⟨M, N⟩_t.

These assertions follow from the continuity of X and the estimate

| Σ_{j=0}^∞ (X_{t∧t_{j+1}^n} − X_{t∧t_j^n})(A_{t∧t_{j+1}^n} − A_{t∧t_j^n}) | ≤ max_j |X_{t∧t_{j+1}^n} − X_{t∧t_j^n}| Σ_{j=0}^∞ |A_{t∧t_{j+1}^n} − A_{t∧t_j^n}|.

Remark 6.1.8 The bracket process of M, N is in some sense determined by the correlation of the two processes; this statement can be made precise with the help of the multi-dimensional martingale representation theorem. If M and N are independent and bounded, E[⟨M, N⟩_t²] = 0, as one sees by computing

E [ Σ_{j=0}^∞ (M_{t∧t_{j+1}^n} − M_{t∧t_j^n})(N_{t∧t_{j+1}^n} − N_{t∧t_j^n}) ]².

If M, N are unbounded and T is a stopping time, M^T and N^T are not necessarily independent; for example, T could be the first time they meet. However, let T_n = inf{t : |M_t| ≥ n} and S_n = inf{t : |N_t| ≥ n}. Then M^{T_n} and N^{S_n} are independent, so E[M^{T_n} N^{S_n}] = 0 and ⟨M^{T_n}, N^{S_n}⟩ = 0. By the properties of the martingale bracket,

⟨M^{T_n}, N^{S_n}⟩ = ⟨M, N⟩^{T_n ∧ S_n} = ⟨M^{T_n ∧ S_n}, N^{T_n ∧ S_n}⟩ = 0,

which shows that ⟨M, N⟩ = 0 and E(MN)_{T_n ∧ S_n} = 0.

6.2 Martingale Inequalities (II)

Theorem 6.2.1 (Burkholder-Davis-Gundy Inequality) For every p > 0 there exist universal constants c_p and C_p such that for all continuous local martingales vanishing at 0,

c_p E⟨M, M⟩_T^{p/2} ≤ E(sup_{t<T} |M_t|)^p ≤ C_p E⟨M, M⟩_T^{p/2},

where T is a finite number or infinity.
For a stopping time τ we have

sup_{t<∞} |M_{t∧τ}|^p ≤ sup_{t<∞} |M_t|^p, sup_τ |M_τ|^p ≤ sup_{t<∞} |M_t|^p.

If (τ_n) is a reducing sequence for a local martingale, then {M_t^{τ_n} − M_0} is uniformly integrable.

Corollary 6.2.2 Let (M_t) be a continuous local martingale. If M_0 ∈ L¹ and sup_{t<∞} |M_t| ∈ L¹, then (M_t) is a martingale.

Corollary 6.2.3 Let (M_t) be a continuous L² bounded martingale. Then M_t² − ⟨M, M⟩_t is a true martingale and {M_T² − ⟨M, M⟩_T : T a stopping time} is a uniformly integrable family. In particular, E M_T² = E⟨M⟩_T for every stopping time T.

Proof We may take M_0 = 0. By Doob's inequality,

E sup_{s<∞} M_s² ≤ 4 sup_{s<∞} E(M_s²) < ∞.

By the Burkholder-Davis-Gundy inequality, E⟨M, M⟩_∞ < ∞ and {⟨M⟩_τ : τ a stopping time} is uniformly integrable. By the Burkholder-Davis-Gundy inequality again, the family {(M_t^τ)² − ⟨M, M⟩_t^τ : τ a stopping time} is uniformly integrable, and hence M_t² − ⟨M, M⟩_t is a uniformly integrable martingale.

Remark 6.2.4 A stochastic process is L^p bounded if sup_t E|M_t|^p < ∞. Let f ∈ L² and let (F_t) be a filtration satisfying the usual assumptions; then M_t = E{f | F_t} is a martingale. By the conditional Jensen inequality, M_t² ≤ E{f² | F_t}, so (M_t², t ≥ 0) is L¹ bounded and uniformly integrable. Thus

lim_{t→∞} E(M_t)² = E(M_∞)².

On the other hand, if sup_t E M_t² < ∞, then (M_t) is uniformly integrable and M_∞ = lim_{t→∞} M_t exists with M_t = E{M_∞ | F_t}. By Fatou's lemma,

E(M_∞)² ≤ lim_{t→∞} E(M_t)² = sup_t E(M_t)² < ∞.

If the filtration is augmented and right continuous, we may take the càdlàg version of the martingale. Then L²(Ω, F, P) is the 'same' as the space of L² bounded càdlàg martingales.

For two independent Brownian motions B, W (which are L² bounded on any finite time interval [0, t]), B_t W_t is a true martingale and E B_T W_T = 0 for any bounded stopping time T ≤ t. However, if T is the first time they meet, E B_T W_T ≠ 0!
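Remark 6.1.8's claim that independent martingales have vanishing bracket can be illustrated for two independent Brownian motions B and W: the sums Σ ΔB ΔW approximating ⟨B, W⟩ should be small. A minimal Monte Carlo sketch (ours, with illustrative parameters):

```python
import random, math

def cross_bracket_estimate(t=1.0, n_steps=256, n_paths=4000, seed=4):
    """Monte Carlo estimate of E[sum_i dB_i dW_i] for independent Brownian
    motions B, W on [0, t]; the bracket <B, W>_t should vanish."""
    rng = random.Random(seed)
    step_sd = math.sqrt(t / n_steps)
    total = 0.0
    for _ in range(n_paths):
        path_sum = 0.0
        for _ in range(n_steps):
            # independent increments dB and dW over the same subinterval
            path_sum += rng.gauss(0.0, step_sd) * rng.gauss(0.0, step_sd)
        total += path_sum
    return total / n_paths
```

Each path's sum has mean 0 and variance t²/n_steps, so the estimate should be very close to 0.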
Chapter 7 Stochastic Integration

We aim to define ∫_0^t f_s dX_s, where X_s = X_0 + M_s + A_s is a right continuous semi-martingale and f_t a left continuous measurable process. Here we limit ourselves to the case where (M_s) is continuous, in which case the bracket process is also continuous. In the definitions, and throughout this section, ∞ can be replaced by a finite number T_0. This matters because Brownian motion is not L² bounded on R_+, though it is L² bounded on any bounded time interval.

7.1 Lecture 21: Integration

Let (A_s, s ≥ 0) be a right continuous function of finite variation with A_0 = 0. There is an associated Borel measure µ_A on [0, ∞) determined by µ_A((c, d]) = A(d) − A(c). Note that µ_A({d}) = A(d) − A(d−); if A is continuous, the measure does not charge singletons. Write

A_s = (A^{TV}(s) + A_s)/2 − (A^{TV}(s) − A_s)/2,

where A^{TV} is the total variation process. Recall that a signed measure µ decomposes as a difference of two positive measures: µ = µ^+ − µ^-. For the Radon measure µ_A, |µ_A| is the measure determined by A^{TV}. For integrable f : R_+ → R, denote the integral by ∫_{[0,∞)} f_s dA_s, and let ∫_0^t f_s dA_s = ∫_{[0,∞)} 1_{(0,t]}(s) f_s dµ_A(s). If f : R_+ → R is left continuous,

∫_0^t f_s dA_s = lim_{|∆_n|→0} Σ_j f(t_j^n) (A(t_{j+1}^n) − A(t_j^n)).

We may allow f and A_s to be random: the above procedure works for each ω for which A_·(ω) is of finite variation. There is however the added complication of measurability. We assume that f is progressively measurable (by progressive measurability we include the assumption that f : R_+ × Ω → R is measurable with respect to B(R_+) ⊗ F_∞) and that A is a right continuous finite variation process (recall that, in particular, A_s is adapted). Then ∫_0^t f_s(ω) dA_s(ω) is a right continuous process of finite variation; the integral is furthermore continuous if (A_s) is sample continuous.

Recall that ⟨M, M⟩ corresponds to a positive measure, and ⟨M, N⟩ to a signed measure, written as µ^+ − µ^- where µ^+, µ^- are positive measures.
By |⟨M, N⟩| we mean the measure corresponding to µ^+ + µ^-.

For any a, ⟨M − aN⟩_t ≥ 0, which means ⟨M, M⟩_t + a²⟨N, N⟩_t ≥ 2a⟨M, N⟩_t. Taking a = (⟨M, M⟩_t / ⟨N, N⟩_t)^{1/2} shows that ⟨M, N⟩_t ≤ (⟨M, M⟩_t ⟨N, N⟩_t)^{1/2}. A similar proof shows that for s < t,

⟨M, N⟩_t − ⟨M, N⟩_s ≤ (⟨M, M⟩_t − ⟨M, M⟩_s)^{1/2} (⟨N, N⟩_t − ⟨N, N⟩_s)^{1/2}.

Let H_s, K_s be measurable processes. Approximating them by elementary functions gives the following theorem.

Theorem 7.1.1 Let M and N be two continuous local martingales. Let H and K be measurable processes, i.e. measurable with respect to B(R_+) ⊗ F_∞. Then for t ≤ ∞,

∫_0^t |H_s||K_s| d|⟨M, N⟩|_s ≤ ( ∫_0^t |H_s|² d⟨M, M⟩_s )^{1/2} ( ∫_0^t |K_s|² d⟨N, N⟩_s )^{1/2}, a.s.

To see that the theorem holds, first take H and K to be elementary processes. Applying the Hölder inequality to the above yields, for 1/p + 1/q = 1:

Corollary 7.1.2 (Kunita-Watanabe Inequality) For t ≤ ∞,

E ∫_0^t |H_s||K_s| d|⟨M, N⟩|_s ≤ ( E ( ∫_0^t |H_s|² d⟨M, M⟩_s )^{p/2} )^{1/p} ( E ( ∫_0^t |K_s|² d⟨N, N⟩_s )^{q/2} )^{1/q}.

7.2 Lecture 21: Stochastic Integration

Let H² be the space of L² bounded continuous martingales.

Definition 7.2.1 For M ∈ H², define L²(M) to be the space of progressively measurable stochastic processes (f_t), f : R_+ × Ω → R, such that

‖f‖²_{L²(M)} := E ∫_0^∞ (f_s)² d⟨M, M⟩_s < ∞.

This is an inner product space: for H, K ∈ L²(M) define

⟨H, K⟩_{L²(M)} = E ∫_0^∞ H_s K_s d⟨M⟩_s,

which is finite by the Kunita-Watanabe inequality. Let P_M be the measure on R_+ × Ω determined by, for Γ ∈ B(R_+) ⊗ F_∞,

P_M(Γ) = E ∫_0^∞ 1_Γ d⟨M, M⟩_s.

Then L²(M) = L²(R_+ × Ω, B(R_+) ⊗ F_∞, P_M), and we see that L²(M) is a Hilbert space.

Let N be an integer and 0 = t_0 < t_1 < · · · < t_{N+1}.

Definition 7.2.2 An elementary process (K_t) is a bounded stochastic process of the form

K_t(ω) = K_{−1}(ω) 1_{{0}}(t) + Σ_{i=0}^N K_i(ω) 1_{(t_i, t_{i+1}]}(t),

where K_{−1} ∈ F_0 and K_i ∈ F_{t_i}. The family of elementary processes is denoted by E. Elementary processes are left continuous and hence progressively measurable.
For M ∈ H²,

E ∫_0^∞ K_s² d⟨M, M⟩_s ≤ |K|_∞² E⟨M, M⟩_{t_{N+1}} < ∞,

so E ⊂ ∩_{M∈H²} L²(M).

Proposition 7.2.3 The set of elementary processes is dense in L²(M).

Proof We first prove the case when f ∈ L²(M) is left continuous. Assume first that f is bounded and let

g_n(s, ω) = f_0(ω) 1_{{0}}(s) + Σ_{j≥0} f_{j/2^n}(ω) 1_{(j/2^n, (j+1)/2^n]}(s).

Since f is left continuous and bounded, g_n → f in L²(M). Note that f_{j/2^n} is F_{j/2^n} measurable. In general, let f_n(s) = f_s 1_{|f_s|≤n}. Then

‖f_n − f‖²_{L²(M)} ≤ E ∫_0^∞ f_s² 1_{|f_s|≥n} d⟨M, M⟩_s → 0.

Now suppose f is only assumed to be progressively measurable. Let f ∈ L²(M) be orthogonal to E. Then for any s < t and any bounded K ∈ F_s,

0 = ⟨f, K 1_{(s,t]}⟩_{L²(M)} = E ∫_0^∞ f_r K 1_{(s,t]}(r) d⟨M, M⟩_r = E [ K ∫_s^t f_r d⟨M, M⟩_r ].

Hence ∫_0^t f_r d⟨M, M⟩_r, which is in L¹(Ω) by the Kunita-Watanabe inequality (Corollary 7.1.2), is a martingale of finite variation, so it vanishes and f = 0.

7.2.1 Stochastic Integral: Characterization

All separable infinite dimensional Hilbert spaces are isomorphic (take a basis in each space to construct an isometry); in particular L²(M) ≅ H_0². Below we construct an explicit isometric map K ↦ I(K, M). We will call I(K, M) the Itô integral and denote it by

∫_0^t K_s dM_s.

The terminology 'integral' will be seen to be justified when K is an elementary process.

Definition 7.2.4 Let M ∈ H², and let

K_t(ω) = K_{−1}(ω) 1_{{0}}(t) + Σ_{i=0}^N K_i(ω) 1_{(t_i, t_{i+1}]}(t)

be elementary. Define the elementary integral

(K · M)_t = Σ_{i=0}^N K_i (M_{t_{i+1}∧t} − M_{t_i∧t}).

The term K_{−1} is irrelevant for the elementary integral with respect to a continuous process (M_t). Denote by H_0² the subspace of H² whose elements M satisfy M_0 = 0.

Theorem 7.2.5 Given M ∈ H² and K ∈ L²(M), there is a unique process I ≡ I(K, M) ∈ H², vanishing at 0, such that for any N ∈ H² and any t ≥ 0,

⟨I, N⟩_t = ∫_0^t K_s d⟨M, N⟩_s.   (7.1)

Proof If (7.1) holds for all N ∈ H_0², it holds for all N ∈ H².

(a) We first prove the uniqueness.
Suppose that there are two martingales I_1, I_2 ∈ H_0² such that

⟨I_1, N⟩_t = ∫_0^t K_s d⟨M, N⟩_s, ⟨I_2, N⟩_t = ∫_0^t K_s d⟨M, N⟩_s.

Then ⟨I_1 − I_2, N⟩ = 0 for every N ∈ H_0², and I_1 − I_2 = 0 a.s. by standard Hilbert space results (or take N = I_1 − I_2).

(b) The existence. (For another proof of the existence, by approximation, see Proposition 7.2.10.)

(b1) We define a real valued linear map

U : H_0² → R, U(N) = E ∫_0^∞ K_s d⟨M, N⟩_s.

(b2) By the Kunita-Watanabe inequality,

|U(N)| = | E ∫_0^∞ K_s d⟨M, N⟩_s | ≤ ( E ∫_0^∞ K_s² d⟨M, M⟩_s )^{1/2} ( E⟨N, N⟩_∞ )^{1/2} ≤ ‖K‖_{L²(M)} ‖N‖_{H_0²}.   (7.2)

By the Riesz representation theorem for bounded linear functionals, there is a unique element I of H_0² such that U(N) = ⟨I, N⟩_{H_0²} for all N ∈ H_0², that is,

E ∫_0^∞ K_s d⟨M, N⟩_s = E [ I(K, M)_∞ N_∞ ].   (7.3)

(b3) Define

X_t := I_t N_t − ∫_0^t K_s d⟨M, N⟩_s.

We prove that (X_t) is a martingale; by the defining property of the bracket process, ∫_0^t K_s d⟨M, N⟩_s, which vanishes at 0, must then equal ⟨I, N⟩_t. Let τ be any bounded stopping time. Then

E[I_τ N_τ] = E[I_∞ N_∞^τ] = E ∫_0^∞ K_s d⟨M, N^τ⟩_s (by (7.3)) = E ∫_0^∞ K_s 1_{s≤τ} d⟨M, N⟩_s = E ∫_0^τ K_s d⟨M, N⟩_s.

This shows that E X_τ = E X_0 = 0 for every bounded stopping time τ, so (X_t) is a martingale.

(c) We show that the map K ↦ I(K, M) is a linear isometry. The linearity is clear. For K, K' ∈ L²(M),

⟨I(K, M), I(K', M)⟩_{H²} = E ∫_0^∞ K_s d⟨M, I(K', M)⟩_s = E ∫_0^∞ K_s d( ∫_0^s K'_r d⟨M, M⟩_r ) = E ∫_0^∞ K_s K'_s d⟨M, M⟩_s = ⟨K, K'⟩_{L²(M)}.

7.2.2 Properties of Integrals

The following follows from the proof of the theorem.

Corollary 7.2.6 For H ∈ L²(M), K ∈ L²(N), and M, N ∈ H²,

⟨ ∫_0^· H_s dM_s, ∫_0^· K_s dN_s ⟩_t = ∫_0^t H_s K_s d⟨M, N⟩_s.

Proof Since K · N ∈ H²,

⟨I(H, M), I(K, N)⟩_t = ∫_0^t H_s d⟨M, K · N⟩_s = ∫_0^t H_s d( ∫_0^s K_r d⟨M, N⟩_r ) = ∫_0^t H_s K_s d⟨M, N⟩_s.

Taking H = K, M = N and t → ∞ yields the identity commonly known as the Itô isometry:

E ( ∫_0^∞ K_s dM_s )² = E ∫_0^∞ (K_s)² d⟨M, M⟩_s.
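The Itô isometry can be checked by simulation. The sketch below (our construction, not from the notes) takes M = B, a Brownian motion, and the integrand H_s = B_s on [0, 1], for which both sides equal ∫_0^1 s ds = 1/2; left-point Riemann sums keep the integrand adapted:

```python
import random, math

def ito_isometry_check(n_steps=256, n_paths=4000, seed=5):
    """Compare E[(∫ H dB)^2] with E[∫ H_s^2 ds] for H_s = B_s on [0, 1],
    using left-point Riemann sums so that H stays adapted."""
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    step_sd = math.sqrt(dt)
    lhs, rhs = 0.0, 0.0
    for _ in range(n_paths):
        b, integral, energy = 0.0, 0.0, 0.0
        for _ in range(n_steps):
            db = rng.gauss(0.0, step_sd)
            integral += b * db    # left-point sum: H_{t_i} (B_{t_{i+1}} - B_{t_i})
            energy += b * b * dt  # Riemann sum for ∫ H_s^2 ds on the same path
            b += db
        lhs += integral ** 2
        rhs += energy
    return lhs / n_paths, rhs / n_paths
```

Both returned averages should be close to 1/2, and close to each other, up to Monte Carlo and discretisation error.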
Corollary 7.2.7 If M ∈ H², H ∈ L²(M) and K ∈ L²(I(H, M)), then

∫_0^t K_s d( ∫_0^s H_r dM_r ) = ∫_0^t K_s H_s dM_s.

Proof For any N ∈ H_0²,

⟨ ∫_0^· K_s dI_s(H, M), N ⟩_t = ∫_0^t K_s d⟨I(H, M), N⟩_s = ∫_0^t K_s H_s d⟨M, N⟩_s = ⟨I(HK, M), N⟩_t.

7.2.3 Lecture 22: Stochastic Integration (2)

Let M ∈ H² and let (K_t) be an elementary process,

K_t(ω) = K_{−1}(ω) 1_{{0}}(t) + Σ_{i=0}^N K_i(ω) 1_{(t_i, t_{i+1}]}(t), (K · M)_t = Σ_{i=0}^N K_i (M_{t_{i+1}∧t} − M_{t_i∧t}).

Proposition 7.2.8 Let M ∈ H². If K is an elementary process, K · M ∈ H_0². The map K ∈ E ↦ K · M ∈ H² is linear and

‖K · M‖_{H²} = ‖K‖_{L²(M)} (Itô isometry).

Proof Take s ∈ [t_i, t). Since M_t is an F_t martingale and K_i ∈ F_{t_i} ⊂ F_s,

E{K_i (M_{t_{i+1}∧t} − M_{t_i∧t}) | F_s} = K_i (M_{t_{i+1}∧s} − M_{t_i∧s}).

For s < t_i,

E{K_i (M_{t_{i+1}∧t} − M_{t_i∧t}) | F_s} = E{ K_i E{M_{t_{i+1}∧t} − M_{t_i∧t} | F_{t_i}} | F_s } = 0.

Consequently E{(K · M)_t | F_s} = Σ_i K_i (M_{t_{i+1}∧s} − M_{t_i∧s}), and K · M is a martingale. For the Itô isometry,

‖K · M‖²_{H²} = E ( Σ_{i=0}^N K_i (M_{t_{i+1}} − M_{t_i}) )² = Σ_{i=0}^N E [ K_i² (M_{t_{i+1}} − M_{t_i})² ]
= Σ_{i=0}^N E [ K_i² E{ M_{t_{i+1}}² − M_{t_i}² | F_{t_i} } ] = Σ_{i=0}^N E [ K_i² ( ⟨M, M⟩_{t_{i+1}} − ⟨M, M⟩_{t_i} ) ]
= E ∫_0^∞ K_s² d⟨M, M⟩_s = ‖K‖²_{L²(M)},

where we used the fact that M_t² − ⟨M, M⟩_t is a martingale.

Proposition 7.2.9 Let M, N ∈ H² and K ∈ E. Then for all t, ⟨K · M, N⟩_t = ∫_0^t K_s d⟨M, N⟩_s.

Proof Note that (K · M)_t = Σ_{i=0}^N K_i (M^t_{t_{i+1}} − M^t_{t_i}), where M^t denotes M stopped at t. For all N ∈ H_0²,

⟨K · M, N⟩_t = Σ_{i=0}^N ⟨K_i (M^{t_{i+1}} − M^{t_i}), N⟩_t = Σ_{i=0}^N K_i ( ⟨M^{t_{i+1}}, N⟩_t − ⟨M^{t_i}, N⟩_t )
= Σ_{i=0}^N K_i ( ⟨M, N⟩_{t_{i+1}∧t} − ⟨M, N⟩_{t_i∧t} ) = ∫_0^t K_s d⟨M, N⟩_s.

Given that ⟨K · M, N⟩_t = ∫_0^t K_s d⟨M, N⟩_s, we have

E⟨K · M, K · M⟩_∞ = E ∫_0^∞ K_s d⟨M, K · M⟩_s = E ∫_0^∞ K_s² d⟨M, M⟩_s.

One may also observe that

(K · M)_t = Σ_l ( Σ_{i=0}^{l−1} K_i (M_{t_{i+1}} − M_{t_i}) + K_l (M_t − M_{t_l}) ) 1_{(t_l, t_{l+1}]}(t),

and, for N ∈ H²,

⟨K · M, N⟩_{t_{l+1}∧t} − ⟨K · M, N⟩_{t_l∧t} = K_l ( ⟨M, N⟩_{t_{l+1}∧t} − ⟨M, N⟩_{t_l∧t} ).

Proposition 7.2.10 Let M ∈ H².
The map K ↦ K · M extends to an isometry from L²(M) into H_0², and for all t,

⟨K · M, N⟩_t = ∫_0^t K_s d⟨M, N⟩_s for all N ∈ H_0².

Proof Take elementary processes f_n converging to K ∈ L²(M). Then

‖(f_n · M) − (f_m · M)‖_{H²} = ‖f_n − f_m‖_{L²(M)} → 0 as n, m → ∞.

Hence f_n · M converges in H² and we define K · M to be the limit. Two isometries agreeing on a dense set must agree: K · M = I(K, M), as required.

7.3 Localization

Proposition 7.3.1 Let τ be a stopping time. Then for M ∈ H² and H ∈ L²(M),

∫_0^{τ∧t} H_s dM_s = ∫_0^t H_s dM_s^τ = ∫_0^t H_s 1_{[0,τ]}(s) dM_s.

Proof Take N ∈ H². Then for any t ∈ [0, ∞],

⟨ ∫_0^{τ∧·} H_s dM_s, N ⟩_t = ⟨ ( ∫_0^· H_s dM_s )^τ, N ⟩_t = ∫_0^t H_s d⟨M, N^τ⟩_s = ∫_0^t H_s d⟨M^τ, N⟩_s = ⟨ ∫_0^· H_s dM_s^τ, N ⟩_t.

The first required equality follows; we have used the property ⟨M, N^τ⟩ = ⟨M^τ, N⟩ of martingale brackets. For the second equality note that

⟨ ∫_0^· H_s 1_{s≤τ} dM_s, N ⟩_t = ∫_0^t H_s 1_{s≤τ} d⟨M, N⟩_s,

and, by the properties of Lebesgue-Stieltjes integrals,

∫_0^t H_s 1_{s≤τ} d⟨M, N⟩_s = ∫_0^t H_s d⟨M^τ, N⟩_s = ⟨ ∫_0^· H_s dM_s^τ, N ⟩_t.

The second required equality follows.

Let S ≤ T be stopping times. Then ∫_0^t H_s dM_s^S = ∫_0^{t∧S} H_s dM_s, and on {t < S ∧ T},

∫_0^t H_s dM_s^S = ∫_0^t H_s dM_s^T.

Definition 7.3.2 Let M be a continuous local martingale. Let L²_{loc}(M) be the space of progressively measurable processes H for which there is a sequence of stopping times T_n increasing to infinity such that

E ∫_0^{T_n} H_s² d⟨M, M⟩_s < ∞.

Equivalently, L²_{loc}(M) consists of all progressively measurable H such that

∫_0^t H_s² d⟨M, M⟩_s < ∞ a.s., for every t.

Proposition 7.3.3 Let M be a continuous local martingale with M_0 = 0. If H ∈ L²_{loc}(M), there exists a local martingale H · M with (H · M)_0 = 0 and

⟨H · M, N⟩ = H · ⟨M, N⟩

for all continuous local martingales N. (Details of the proof were not covered in class.)

Proof Let

T_n = inf { t ≥ 0 : ∫_0^t (1 + H_s²) d⟨M, M⟩_s ≥ n }.
Then $(T_n)$ is an increasing sequence of stopping times, increasing to infinity, such that $M^{T_n} \in \mathcal{H}^2$ and $\|H\|_{L^2(M^{T_n})}$ is bounded by $n$. If $n < m$,
\[ \int_0^{t \wedge T_n} H_s\, dM^{T_m}_s = \int_0^{t \wedge T_n} H_s\, dM^{T_n}_s, \]
and so $\int_0^t H_s\, dM^{T_n}_s$ agrees with $\int_0^t H_s\, dM^{T_m}_s$ on $\{t < T_n\}$. Define
\[ \int_0^t H_s\, dM_s = \int_0^t H_s\, dM^{T_n}_s \quad \text{on } \{t < T_n\}. \]
Since $(H \cdot M)^{T_n}_t = \int_0^t H_s\, dM^{T_n}_s$, $(H \cdot M)$ is a local martingale. Now let $N$ be a continuous local martingale with reducing stopping times $S_n$ such that $N^{S_n}$ is bounded. Define $\tau_n = T_n \wedge S_n$. We see that
\[ \langle H \cdot M, N\rangle^{\tau_n} = \langle (H \cdot M)^{\tau_n}, N^{\tau_n}\rangle = \int_0^{\cdot \wedge \tau_n} H_s\, d\langle M, N\rangle_s. \]
Taking $n \to \infty$ we see that $\langle H \cdot M, N\rangle_t = \int_0^t H_s\, d\langle M, N\rangle_s$.

Definition 7.3.4 A progressively measurable process $f$ is locally bounded if there is an increasing sequence of stopping times $T_n$ increasing to infinity and constants $C_n$ such that $|f^{T_n}| \le C_n$ for all $n$. Both continuous and convex functions are locally bounded, and any locally bounded process is in $L^2_{loc}(M)$ for any continuous local martingale $M$.

7.4 Properties of Stochastic Integrals

Definition 7.4.1 If $X_t = M_t + A_t$ is a continuous semi-martingale and $f$ a progressively measurable, locally bounded stochastic process, we define
\[ \int_0^t f_s\, dX_s = \int_0^t f_s\, dM_s + \int_0^t f_s\, dA_s. \]

Proposition 7.4.2 Let $X, Y$ be continuous semi-martingales and $f, g, K$ locally bounded and progressively measurable.
1. $\int_0^t (a f_s + b g_s)\, dX_s = a \int_0^t f_s\, dX_s + b \int_0^t g_s\, dX_s$.
2. $\int_0^t f_s\, (dX_s + dY_s) = \int_0^t f_s\, dX_s + \int_0^t f_s\, dY_s$.
3. $\int_0^t f_s\, d\big( \int_0^s g_r\, dX_r \big) = \int_0^t f_s g_s\, dX_s$.
4. For any stopping time $\tau$, $\int_0^{\tau} K_s\, dX_s = \int_0^\infty 1_{s \le \tau} K_s\, dX_s = \int_0^\infty K_s\, dX^\tau_s$.
5. If $X_s$ is of bounded total variation on $[0,t]$, so is the integral $\int_0^\cdot K_s\, dX_s$; and if $X_s$ is a local martingale, so is $\int K_s\, dX_s$. In particular, for a semi-martingale $X_t$ this gives the Doob-Meyer decomposition of $\int_0^\cdot K_s\, dX_s$.

Definition 7.4.3 Let $X, Y$ be continuous semi-martingales. The Stratonovich integral is defined as
\[ \int_0^t Y_s \circ dX_s = \int_0^t Y_s\, dX_s + \frac{1}{2} \langle X, Y\rangle_t. \]

Note that for continuous processes, the Riemann sums corresponding to a sequence of partitions whose modulus goes to zero converge to the stochastic integral in probability. This convergence does not help with computation: although there are subsequences that converge a.s., we do not know which subsequence of the partitions works, and this subsequence is likely to differ for different integrands and different times.

Proposition 7.4.4 If $K$ is left continuous and $\Delta_n : 0 = t^n_0 < t^n_1 < \dots < t^n_{N_n} = t$ is a sequence of partitions of $[0,t]$ whose modulus goes to zero, then
\[ \int_0^t K_s\, dX_s = \lim_{n \to \infty} \sum_{i=0}^{N_n - 1} K_{t^n_i} \big( X_{t^n_{i+1}} - X_{t^n_i} \big). \]
The convergence is in probability.

Proposition 7.4.5 (Dominated Convergence Theorem) Let $K^n$ be locally bounded progressively measurable processes converging to $K$. Suppose there is a locally bounded, progressively measurable $F$ such that $|K^n| \le F$. Then
\[ \lim_{n \to \infty} \sup_{s \le t} \Big| \int_0^s K^n_r\, dX_r - \int_0^s K_r\, dX_r \Big| = 0. \]
The convergence is in probability.

7.5 Itô Formula

Intuition. Let $x_t$ solve the ODE $\dot{x}_t = \sigma(x_t)$. The Fundamental Theorem of Calculus: if $f$ is $C^1$, then
\[ f(x_t) = f(x_0) + \int_0^t f'(x_s)\, \sigma(x_s)\, ds. \]
Let $f : \mathbb{R} \to \mathbb{R}$ be a $C^3$ function. Taylor's Theorem with Lagrange remainder says that
\[ f(y) = f(y_0) + f'(y_0)(y - y_0) + \frac{1}{2} f''(y_0)(y - y_0)^2 + \frac{1}{6} f^{(3)}(\theta)(y - y_0)^3. \]
We apply it to a semi-martingale $(x_t)$ along a partition of $[0,t]$:
\[ f(x_t) - f(x_0) = \sum_i \big( f(x_{t_{i+1}}) - f(x_{t_i}) \big) = \sum_i f'(x_{t_i})(x_{t_{i+1}} - x_{t_i}) + \frac{1}{2} \sum_i f''(x_{t_i})(x_{t_{i+1}} - x_{t_i})^2 + \frac{1}{6} \sum_i f^{(3)}(x_{\theta_i})(x_{t_{i+1}} - x_{t_i})^3. \]
Heuristically the right hand side converges in probability to $\int_0^t f'(x_s)\, dx_s + \frac{1}{2} \int_0^t f''(x_s)\, d\langle x\rangle_s$. The first two terms converge in probability; if all relevant terms are bounded, the convergence holds in $L^2$. The third term converges to zero because the process is of finite quadratic variation. We define
\[ \int_s^t f_r\, dX_r = \int_0^t f_r\, dX_r - \int_0^s f_r\, dX_r. \]

Theorem 7.5.1 (Standard Form) Let $X_t = (X^1_t, \dots, X^n_t)$ be an $\mathbb{R}^n$-valued continuous semi-martingale and $f$ a $C^2$ real valued function on $\mathbb{R}^n$. Then for $s < t$,
\[ f(X_t) = f(X_s) + \sum_{i=1}^n \int_s^t \frac{\partial f}{\partial x_i}(X_r)\, dX^i_r + \frac{1}{2} \sum_{i,j=1}^n \int_s^t \frac{\partial^2 f}{\partial x_i \partial x_j}(X_r)\, d\langle X^i, X^j\rangle_r. \]
In shorthand,
\[ f(X_t) = f(X_s) + \int_s^t (Df)(X_r)\, dX_r + \frac{1}{2} \int_s^t (D^2 f)(X_r)\, d\langle X, X\rangle_r. \]
If $T$ is a stopping time, applying Itô's formula to $Y_t = X_{T \wedge t}$ we see that
\[ f(X_{T \wedge t}) = f(X_0) + \int_0^{T \wedge t} (Df)(X_r)\, dX_r + \frac{1}{2} \int_0^{T \wedge t} (D^2 f)(X_r)\, d\langle X, X\rangle_r. \]
A special case is:

Proposition 7.5.2 (The product formula) If $X_t$ and $Y_t$ are real valued semi-martingales,
\[ X_t Y_t = X_0 Y_0 + \int_0^t X_s\, dY_s + \int_0^t Y_s\, dX_s + \langle X, Y\rangle_t. \]

If $X = Y$, this was proven in the construction of the bracket process when $X$ is an $L^2$ bounded continuous martingale; it extends by polarisation to the product formula. To prove the theorem we first assume that $X_t$ is bounded and real valued. Using the product formula, we prove that if Itô's formula holds for a function $f$ then it holds for $x f(x)$; this shows the formula holds when $f$ is a polynomial. If $f$ is not a polynomial, we first assume that $x_t$ takes values in a compact set $K$ and let $P_n$ be polynomials converging to $f$ in $C^2$. Finally we take a sequence of stopping times $T_n$ to prove that the formula holds in general.

Example 7.5.3 Let $M_t$ be a continuous semi-martingale with $M_0 = 0$. Let $f(x) = e^x$ and apply Itô's formula to the process $M_t - \frac{1}{2}\langle M, M\rangle_t$. Then
\[ e^{M_t - \frac{1}{2}\langle M,M\rangle_t} = 1 + \int_0^t e^{M_s - \frac{1}{2}\langle M,M\rangle_s}\, dM_s - \frac{1}{2} \int_0^t e^{M_s - \frac{1}{2}\langle M,M\rangle_s}\, d\langle M, M\rangle_s + \frac{1}{2} \int_0^t e^{M_s - \frac{1}{2}\langle M,M\rangle_s}\, d\langle M, M\rangle_s = 1 + \int_0^t e^{M_s - \frac{1}{2}\langle M,M\rangle_s}\, dM_s. \]
If $M_t$ is a local martingale, $e^{M_t - \frac{1}{2}\langle M,M\rangle_t}$ is a local martingale, called the exponential martingale of $M_t$. If $M_t = \int_0^t f_s\, dB_s$, then the exponential martingale is an integral with respect to the Brownian motion $B_t$.

Remark 7.5.4 Let $M_t$ be a continuous local martingale. The exponential martingale $N_t := e^{M_t - \frac{1}{2}\langle M,M\rangle_t}$ is a martingale if and only if $E(N_t) = 1$ for all $t$. Let $M_t$ be a continuous local martingale with $E|M_0| < \infty$.
Suppose that the family $\{M^-_T : T \text{ a bounded stopping time}\}$ is uniformly integrable. Then $M_t$ is a supermartingale; it is a martingale if and only if $EM_t = EM_0$. See Proposition 2.2 in Elworthy-Li-Yor (Probability Theory and Related Fields, 1999, volume 115).

If $M_t = (M^1_t, \dots, M^n_t)$ we denote by $\langle M, M\rangle_t$ the matrix whose entries are $\langle M^i, M^j\rangle_t$.

Theorem 7.5.5 Let $X_t$ be a semi-martingale.
1. Assume that $\frac{\partial}{\partial t} F(t,x)$ and $\frac{\partial^2}{\partial x_i \partial x_j} F(t,x)$, $i,j = 1, \dots, d$, exist and are continuous. Then
\[ F(t, X_t) = F(0, X_0) + \int_0^t \frac{\partial F}{\partial s}(s, X_s)\, ds + \int_0^t DF(s, X_s)\, dX_s + \frac{1}{2} \int_0^t D^2 F(s, X_s)\, d\langle X, X\rangle_s. \]
2. Itô's formula holds for complex valued functions.

7.6 Lévy's Martingale Characterization Theorem

Definition 7.6.1 An $\mathcal{F}_t$-adapted stochastic process $(X_t)$ is an $(\mathcal{F}_t)$ Brownian motion if it is a Brownian motion and for each $s \ge 0$, $(X_{t+s} - X_s)_{t \ge 0}$ is independent of $\mathcal{F}_s$. A Brownian motion is an $\mathcal{F}^B_t$-martingale, where $\mathcal{F}^B_t = \sigma\{B_s, s \le t\}$.

Theorem 7.6.2 (Lévy's Martingale Characterization Theorem) An $\mathcal{F}_t$-adapted, sample continuous stochastic process $X_t$ in $\mathbb{R}^d$ vanishing at $0$ is a standard $\mathcal{F}_t$-Brownian motion if and only if each $X^i_t$ is an $\mathcal{F}_t$ local martingale and $\langle X^i, X^j\rangle_t = \delta_{ij}\, t$. (The proof was not covered in class.)

Proof. For the 'if' part: for any $\lambda \in \mathbb{R}^d$, $Y_t = \langle \lambda, X_t\rangle_{\mathbb{R}^d} = \sum_j \lambda_j X^j_t$ is a local martingale with bracket $|\lambda|^2 t$. The exponential local martingale $e^{i\langle\lambda, X_t\rangle + \frac{1}{2}|\lambda|^2 t}$ is a martingale, as it is bounded on any compact time interval, hence
\[ E\big\{ e^{i\langle\lambda,\, X_t - X_s\rangle} \mid \mathcal{F}_s \big\} = e^{-\frac{1}{2}|\lambda|^2 (t-s)}. \]
This is sufficient to show that $X_t - X_s$ is independent of $\mathcal{F}_s$ and
\[ E\, e^{i\langle\lambda,\, X_t - X_s\rangle} = e^{-\frac{1}{2}|\lambda|^2 (t-s)}, \]
which implies that $X_t - X_s \sim N(0, (t-s)\,\mathrm{Id})$.

For the 'only if' part: first, for $s < t$,
\[ E\{X^i_t \mid \mathcal{F}_s\} = E\{X^i_t - X^i_s + X^i_s \mid \mathcal{F}_s\} = E(X^i_t - X^i_s) + X^i_s = X^i_s, \]
so each $X^i_t$ is a martingale. For $s < t$,
\[ E\{(X^i_t)^2 - (X^i_s)^2 \mid \mathcal{F}_s\} = E\{(X^i_t - X^i_s)^2 \mid \mathcal{F}_s\} = E(X^i_t - X^i_s)^2 = t - s. \]
Then $\langle X^i, X^i\rangle_t = t$, and the bracket of independent Brownian motions is zero.
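The two ingredients of Lévy's characterization can be illustrated numerically. The sketch below (an assumed illustration, not part of the notes) simulates Brownian increments and checks that the quadratic variation along a fine partition of $[0,t]$ is close to $t$ and that the increments have the Gaussian characteristic function $e^{-\frac{1}{2}\lambda^2 t}$.

```python
import numpy as np

# Monte Carlo check of the two sides of Levy's characterization for a
# simulated standard Brownian motion (illustration only; the constants
# n_steps, n_paths and lam are arbitrary choices).
rng = np.random.default_rng(0)
t, n_steps, n_paths = 1.0, 2000, 5000
dt = t / n_steps
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))

# Quadratic variation along the partition: sum of squared increments.
qv = (increments ** 2).sum(axis=1)
print(qv.mean())                    # close to t = 1

# Characteristic function of X_t - X_0 for a fixed lambda.
lam = 1.3
x_t = increments.sum(axis=1)
emp = np.exp(1j * lam * x_t).mean()
err = abs(emp - np.exp(-0.5 * lam ** 2 * t))
print(err)                          # small Monte Carlo error
```

The quadratic variation concentrates around $t$ as the partition is refined, which is exactly the hypothesis $\langle X, X\rangle_t = t$ of the theorem.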
Chapter 8

Stochastic Differential Equations

Let $m, d \in \mathbb{N}$ and let $\{B^i_t, i = 1, \dots, m\}$ be independent 1-dimensional Brownian motions on a filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ satisfying the standard assumptions of completeness and right continuity. A stochastic differential equation (of Markovian type) is of the form
\[ dx_t = \sum_{i=1}^m \sigma_i(t, x_t)\, dB^i_t + b(t, x_t)\, dt. \]
The functions $\sigma_i, b : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$ are assumed to be Borel measurable. A solution $(x_t, 0 \le t < \zeta(x_0))$ is a stochastic process that satisfies the integral equation
\[ x_t = x_0 + \sum_{i=1}^m \int_0^t \sigma_i(s, x_s)\, dB^i_s + \int_0^t b(s, x_s)\, ds, \]
and in particular the integrals must be defined. The value $x_0$ is the initial value and $\zeta(x_0) : \Omega \to \mathbb{R}_+$ is the life time of $(x_t)$. For clarity we write the above SDE, just this once, in the long form. The components of the $\mathbb{R}^d$-valued random variable and functions are $x_t = (x^1_t, \dots, x^d_t)$, $\sigma_i = (\sigma^1_i, \dots, \sigma^d_i)$, $b = (b^1, \dots, b^d)$. Then for each $j = 1, \dots, d$,
\[ x^j_t = x^j_0 + \sum_{k=1}^m \int_0^t \sigma^j_k(s, x_s)\, dB^k_s + \int_0^t b^j(s, x_s)\, ds. \]
We also introduce the short form of the SDE, in which we let $\sigma = (\sigma_1, \dots, \sigma_m)$ and $B_t = (B^1_t, \dots, B^m_t)$:
\[ dx_t = \sigma(t, x_t)\, dB_t + b(t, x_t)\, dt. \]
To motivate the definition of solutions, let us recall the Central Limit Theorem: the law of the sum of $n$ independent random variables (satisfying suitable integrability conditions), after scaling with respect to $n$, converges to a Gaussian distribution. Hence a system governed by $dx_t = b(x_t)\, dt$ subject to many small independent influences can be described by adding a Gaussian random variable. The variation with time is embedded in the Brownian motion, and the dependence of the randomness on the location is reflected in the diffusion coefficient $\sigma$. What is Brownian motion for us? It is a martingale, its probability distribution is Gaussian, and so on.
All these are built on a chosen environment $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$, an added structure which is supposed to represent the random perturbation faithfully. There is, however, no reason why one choice of such a probability space is better than another. One important question is whether the randomness in our solution is purely a function of the Brownian motion. It should be, but this is not necessarily true according to the definition below. There is also the notion of weak solutions, which describe only the statistical properties of the solution, i.e. its law, and not its sample pathwise properties.

8.0.1 Stochastic processes defined up to a random time

The stochastic process $X_t(\omega) := \frac{1}{2 - B_t(\omega)}$ is defined up to the first time $B_t(\omega)$ reaches $2$. We denote this time by $\tau$. For any given time $t$, no matter how small, there is a set of paths of positive probability (measured with respect to the Wiener measure on $C([0,t]; \mathbb{R}^d)$) which have reached $2$ by time $t$:
\[ P(\tau \le t) = P\Big( \sup_{s \le t} B_s \ge 2 \Big) = 2 P(B_t \ge 2) = \sqrt{\frac{2}{\pi}} \int_{2/\sqrt{t}}^\infty e^{-\frac{y^2}{2}}\, dy. \]
This probability converges to zero as $t \to 0$. We say that $X_t$ is defined up to $\tau$, and $\tau$ is called its life time or explosion time. Let $\mathbb{R}^d \cup \{\Delta\}$ be the one point compactification of $\mathbb{R}^d$: the topological space whose open sets are the open sets of $\mathbb{R}^d$ together with sets of the form $(\mathbb{R}^d \setminus K) \cup \{\Delta\}$, where $K$ denotes a compact set. Given a process $(X_t, t < \tau)$ on $\mathbb{R}^d$ we define a process $(\hat{X}_t, t \ge 0)$ on $\mathbb{R}^d \cup \{\Delta\}$:
\[ \hat{X}_t(\omega) = \begin{cases} X_t(\omega), & \text{if } t < \tau(\omega), \\ \Delta, & \text{if } t \ge \tau(\omega). \end{cases} \]
If $X_t$ is a continuous process on $\mathbb{R}^d$, then $\hat{X}_t$ is a continuous process on $\mathbb{R}^d \cup \{\Delta\}$. Define $\hat{W}(\mathbb{R}^d) \equiv C([0,T]; \mathbb{R}^d \cup \{\Delta\})$ whose elements satisfy: $Y_t(\omega) = \Delta$ if $Y_s = \Delta$ for some $s \le t$. The last condition means that once a process enters the coffin state it does not return.

8.1 Lecture 24. Stochastic Differential Equations

For $i = 1, 2, \dots, m$, let $\sigma_i, b : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$ be Borel measurable, locally bounded functions. Let $B_t = (B^1_t, \dots, B^m_t)$ be an $\mathbb{R}^m$-valued Brownian motion.

8.2 The definition

Definition 8.2.1 An $\mathcal{F}_t$-adapted stochastic process $(X_t)$ is an $(\mathcal{F}_t)$ Brownian motion if it is a Brownian motion and for each $s \ge 0$, $(X_{t+s} - X_s)_{t \ge 0}$ is independent of $\mathcal{F}_s$.

Definition 8.2.2 A solution to the SDE
\[ dx_t = \sigma(t, x_t)\, dB_t + b(t, x_t)\, dt \tag{8.1} \]
consists of (1) a filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$; (2) an $\mathcal{F}_t$ Brownian motion $B_t = (B^1_t, \dots, B^m_t)$; (3) an adapted continuous stochastic process $(x_t, t < \tau)$ in $\mathbb{R}^d$, where $\tau : \Omega \to \mathbb{R}_+ \cup \{\infty\}$ is measurable, such that for all stopping times $T < \tau$,
\[ x_T = x_0 + \sum_{k=1}^m \int_0^T \sigma_k(s, x_s)\, dB^k_s + \int_0^T b(s, x_s)\, ds. \]
We say that the SDE holds on $\{t < \tau(\omega)\}$. If $\tau$ can be chosen to be $\infty$, this is equivalent to saying that for all $t \ge 0$,
\[ x_t = x_0 + \sum_{k=1}^m \int_0^t \sigma_k(s, x_s)\, dB^k_s + \int_0^t b(s, x_s)\, ds. \]
If the functions $\sigma_i(t,x)$ and $b(t,x)$ do not depend on $t$, the SDE is said to be time homogeneous. We concentrate on time homogeneous SDEs (of Markovian type):
\[ dx_t = \sum_{i=1}^m \sigma_i(x_t)\, dB^i_t + b(x_t)\, dt. \tag{8.2} \]

Example 8.2.3 The following SDE is not Markovian:
\[ dx_t = \Big( \int_0^t (x_r)^2\, dr \Big)\, dB_t. \]

8.3 Examples

Example 8.3.1 Consider $\dot{x}(t) = a x(t)$ on $\mathbb{R}$, where $a \in \mathbb{R}$. Let $x_0 \in \mathbb{R}$. Then $x(t) = x_0 e^{at}$ is a solution with initial value $x_0$, defined for all $t \ge 0$. Let $\phi_t(x_0) = x_0 e^{at}$. Then $(t, x) \mapsto \phi_t(x)$ is continuous and $\phi_{t+s}(x_0) = \phi_t(\phi_s(x_0))$.

Example 8.3.2 (Linear equation) Let $a, b \in \mathbb{R}$ and $d = m = 1$. Then
\[ x(t) = x_0\, e^{a B_t + (b - \frac{a^2}{2}) t} \]
solves $dx_t = a x_t\, dB_t + b x_t\, dt$, $x(0) = x_0$. The solution exists for all time. To prove this statement, define $f : \mathbb{R}^2 \to \mathbb{R}$ by $f(x, t) = x_0\, e^{a x + (b - \frac{a^2}{2}) t}$ and apply Itô's formula to $(B_t, t) \mapsto f(B_t, t)$. Since $f(0,0) = x_0$, $\frac{\partial f}{\partial x} = a f$, $\frac{\partial^2 f}{\partial x^2} = a^2 f$ and $\frac{\partial f}{\partial s} = (b - \frac{a^2}{2}) f$,
\[ f(B_t, t) = f(0,0) + \int_0^t a f(B_s, s)\, dB_s + \frac{a^2}{2} \int_0^t f(B_s, s)\, d\langle B, B\rangle_s + \int_0^t \Big( b - \frac{a^2}{2} \Big) f(B_s, s)\, ds = x_0 + \int_0^t a f(B_s, s)\, dB_s + \int_0^t b f(B_s, s)\, ds. \]
This proves that $f(B_t, t)$ is indeed a solution. Is this solution unique?
The answer is yes: if $y_t$ is a solution starting from the same point, one can compute and prove that $E|x_t - y_t|^2 = 0$ for all $t$, which implies $x_t = y_t$ a.s. for all $t$. However, we delay the proof of uniqueness until Theorem 8.4.5.

Example 8.3.3 (Additive noise) Let $a \in \mathbb{R}$ be a number and $b : \mathbb{R} \to \mathbb{R}$ a function, and consider
\[ dx_t = b(x_t)\, dt + a\, dB_t. \]
A special case of this is a perturbation of Newtonian mechanics. A particle of mass 1, subject to a force proportional to its own speed, satisfies $\dot{v}_t = -k v_t$. The following is called the Langevin equation:
\[ dv_t(\omega) = -k v_t(\omega)\, dt + dB_t(\omega). \]
For each realisation of the noise (that is, for each $\omega$), the solution is an Ornstein-Uhlenbeck process,
\[ v_t(\omega) = v_0\, e^{-kt} + \int_0^t e^{-k(t-r)}\, dB_r(\omega). \]
We check that it satisfies $-k \int_0^t v_s\, ds = v_t - v_0 - B_t$:
\[ -k \int_0^t v_s\, ds = -k v_0 \int_0^t e^{-ks}\, ds - k \int_0^t e^{-ks} \Big( \int_0^s e^{kr}\, dB_r \Big)\, ds \]
\[ = -v_0 + v_0 e^{-kt} + \int_0^t \Big( \int_0^s e^{kr}\, dB_r \Big)\, d(e^{-ks}) \]
\[ = -v_0 + v_0 e^{-kt} + \Big[ e^{-ks} \int_0^s e^{kr}\, dB_r \Big]_{s=0}^{s=t} - \int_0^t e^{-ks}\, d\Big( \int_0^s e^{kr}\, dB_r \Big) \]
\[ = -v_0 + v_0 e^{-kt} + e^{-kt} \int_0^t e^{kr}\, dB_r - B_t = -v_0 + v_t - B_t. \]
This proves that the Ornstein-Uhlenbeck process is a solution to the Langevin equation, with life time $\infty$.

Example 8.3.4 (1) Small perturbation. Let $\epsilon > 0$ be a small number and
\[ x^\epsilon_t = x_0 + \int_0^t b(x^\epsilon_s)\, ds + \epsilon B_t. \]
As $\epsilon \to 0$, $x^\epsilon_t \to x_t$, the solution of the unperturbed equation. (Exercise.)
(2) Let $y^\epsilon_t = y_0 + \int_0^t b(y^\epsilon_s)\, ds + \sqrt{\epsilon}\, W_t$. Assume that $b$ is bounded. As $\epsilon \to 0$, $y^\epsilon$ converges uniformly in time on any finite time interval $[0,t]$: $E \sup_{0 \le s \le t} |y^\epsilon_s - y_s| \to 0$. A more difficult question to ask is: does $\lim_{\epsilon \to 0} y^\epsilon_t$ exist in a stronger sense?

8.3.1 Pathwise Uniqueness and Uniqueness in Law

Example 8.3.5 (Tanaka's SDE) Consider $dx_t = \mathrm{sign}(x_t)\, dB_t$, where
\[ \mathrm{sign}(x) = \begin{cases} -1, & \text{if } x \le 0, \\ 1, & \text{if } x > 0. \end{cases} \]
Take any probability space and any Brownian motion $(B_t)$. Define
\[ W_t = \int_0^t \mathrm{sign}(B_s)\, dB_s. \]
This is a local martingale with quadratic variation $t$, and hence a Brownian motion. Furthermore,
\[ \int_0^t \mathrm{sign}(B_s)\, dW_s = \int_0^t dB_s = B_t. \]
This means that the pair $(B_t, W_t)$ is a solution to the SDE $dB_t = \mathrm{sign}(B_t)\, dW_t$. Question: is the solution unique?
1. Answer 1: No. Both $(B_t)$ and $(-B_t)$ are solutions.
2. Answer 2: Yes. Any solution $x_t = x_0 + \int_0^t \mathrm{sign}(x_s)\, dB_s$ is a martingale with quadratic variation $t$. By Lévy's Characterisation Theorem, the distribution of $(x_s, s \le t)$ is the Wiener measure on $C([0,t]; \mathbb{R})$. The probability distribution of the solutions to Tanaka's equation is unique.

Definition 8.3.6
- If, whenever $x_t$ and $\tilde{x}_t$ are two solutions with $x_0 = \tilde{x}_0$ almost surely, the law of $\{x_t : t \ge 0\}$ is the same as the law of $\{\tilde{x}_t : t \ge 0\}$, we say that uniqueness in law holds.
- We say that pathwise uniqueness holds for an SDE if, whenever $x_t$ and $\tilde{x}_t$ are two solutions on the same probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ with the same Brownian motion $B_t$ and the same initial data ($x_0 = \tilde{x}_0$ a.s.), then $x_t = \tilde{x}_t$ for all $t \ge 0$ almost surely.

Uniqueness in law holds for Tanaka's SDE; pathwise uniqueness fails for it. Uniqueness in law implies the following stronger conclusion: whenever $x_0$ and $\tilde{x}_0$ have the same distribution, the corresponding solutions have the same law.

8.3.2 Maximal solution and Explosion Time

Definition 8.3.7 A solution $(x_t, t < \tau)$ of an SDE is a maximal solution if, whenever $(y_t, t < \bar{\tau})$ is any other solution on the same probability space with the same driving noise and with $x_0 = y_0$ a.s., then $\tau \ge \bar{\tau}$ a.s. We say that $\tau$ is the explosion time or the life time of $(x_t)$. Let $\tau^N$ be the first time that $|x_t| \ge N$. Then from part (3) of the definition of a solution, for each $N$,
\[ x_{t \wedge \tau^N} = x_0 + \sum_{k=1}^m \int_0^{t \wedge \tau^N} \sigma_k(x_s)\, dB^k_s + \int_0^{t \wedge \tau^N} b(x_s)\, ds. \]
Furthermore $\tau = \sup_N \tau^N$, and on $\{\tau < \infty\}$, $\lim_{N \to \infty} |x_{\tau^N}| = \infty$.

Example 8.3.8 Consider $\dot{x}(t) = (x(t))^2$, $x(0) = x_0$. Then $x(t) \equiv 0$ if $x_0 = 0$, and
\[ x(t) = \frac{x_0}{1 - x_0 t}, \qquad x_0 \ne 0, \]
is a solution starting from $x_0$. For example, let $x_0 = 1$; then $(x(t) = \frac{1}{1-t}, t < 1)$ is the maximal solution.
Its life time is $\tau = 1$, and $\lim_{t \to \tau} x(t) = \infty$.

Example 8.3.9 Consider the SDE on $\mathbb{R}^1$,
\[ dx_t = (x_t)^2\, dB_t + (x_t)^3\, dt. \]
Let $\tau$ be the first time that $B_t$ hits $1$. We prove that $x(t) = \frac{1}{1 - B_t}$ is a solution with initial value $1$. Let $f(x) = \frac{1}{1-x}$. On $\{|x| < 1\}$ the function $f$ is smooth, with $f'(x) = \frac{1}{(1-x)^2} = f^2(x)$ and $f''(x) = \frac{2}{(1-x)^3} = 2 f^3(x)$. Hence
\[ \frac{1}{1 - B_t} = 1 + \int_0^t \frac{1}{(1-B_s)^2}\, dB_s + \int_0^t \frac{1}{(1-B_s)^3}\, ds. \]
The functions $\sigma(x) = x^2$ and $b(x) = x^3$ are $C^1$ and so locally Lipschitz continuous. By the theorem below there is a unique strong solution, and all solutions agree before they exit the ball of radius $1$. Since
\[ \lim_{t \to \tau} \frac{1}{1 - B_t} = \infty, \]
$(\frac{1}{1-B_t}, t < \tau)$ is the maximal solution and $\tau$ is the explosion time. The above SDE is equivalent to the following SDE in Stratonovich form: $dx_t = (x_t)^2 \circ dB_t$, i.e.
\[ x_t = x_0 + \int_0^t (x_s)^2 \circ dB_s. \]

Definition 8.3.10 We say that the SDE is complete, or conservative, or does not explode, if for all initial values the maximal solution exists for all time.

8.3.3 Strong and Weak Solutions

Definition 8.3.11 A solution $(x_t, B_t)$ on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ is said to be a strong solution if for each $t$, $x_t$ is adapted to the filtration of $B_t$. By a weak solution we mean one which is not strong.

The solution of Tanaka's SDE is not a strong solution. Note that if $(B_t, \sigma, b)$ are given, any solution we could conceivably 'construct' concretely from the trio is adapted to the filtration $\mathcal{F}^B_t$ of the driving noise $B_t$, which could be strictly smaller than $\mathcal{F}_t$. The solution constructed in Tanaka's example is not constructed from a given driving Brownian motion, which explains somewhat why the construction looks slightly peculiar. This construction is typical for an SDE that has the uniqueness in law property without the pathwise uniqueness property.

Proposition 8.3.12 Suppose a random function $Y$ with values in a metric space $M$ is measurable with respect to the natural filtration of a stochastic process $x_s$, $0 \le s \le t$.
Let $B([0,T]; M)$ be the space of Borel measurable functions from $[0,T]$ to $M$. Then there exists a function $F : B([0,T]; M) \to M$ such that $Y = F(x_\cdot)$.

If $x_t$ is a strong solution, it is adapted to the filtration of $B_t$, and so $x_t$ is a function of $(B_s, s \in [0,t])$. Denote this function by $F(x_0, \cdot) : W^m_0 \to \hat{W}^d$, so that $x_t(\omega) = F_t(x_0, B(\omega))$. Here $W^m_0$ is the space of continuous functions from $\mathbb{R}_+$ to $\mathbb{R}^m$ starting from $0$. In the case where $\Omega$ is the Wiener space $W^m_0$ equipped with the Wiener measure, $B_t(\omega) = \omega_t$ is a Brownian motion and we write $F_t(x, \omega)$.

Example 8.3.13 The ODE $\dot{x}_t = (x_t)^\alpha$, $\alpha < 1$, has two solutions from zero: the trivial solution $0$ and $x_t = \big( (1-\alpha)\, t \big)^{\frac{1}{1-\alpha}}$. Both forms of uniqueness fail.

Example 8.3.14 (Dimension $d = 1$) Consider $dx_t = \sigma(x_t)\, dW_t$. Suppose that $\sigma$ is Hölder continuous of order $\alpha$: $|\sigma(x) - \sigma(y)| \le c |x - y|^\alpha$ for all $x, y$. If $\alpha \ge 1/2$ then pathwise uniqueness holds for $dx_t = \sigma(x_t)\, dW_t$; if $\alpha < 1/2$ uniqueness no longer holds. For $\alpha > 1/2$ this goes back to Skorohod (1962-65) and Tanaka (1964); the $\alpha = 1/2$ case is credited to Yamada-Watanabe.

8.3.4 The Yamada-Watanabe Theorem

Although it is not clear at first glance what the relation between pathwise uniqueness and uniqueness in law is, it becomes clear later that the former implies the latter. The following beautiful, and somewhat surprising, theorem of Yamada and Watanabe states that the existence of a weak solution for any initial distribution, together with pathwise uniqueness, implies the existence of a unique strong solution.

Proposition 8.3.15 If pathwise uniqueness holds, then any solution is a strong solution and uniqueness in law holds.

Theorem 8.3.16 (The Yamada-Watanabe Theorem) Suppose that for each initial probability distribution there is a weak solution to the SDE, and suppose that pathwise uniqueness holds. Then there exists a Borel measurable map $F : \mathbb{R}^d \times W^m_0 \to \hat{W}^d$ such that:
1. For any $B_t$ and $x_0 \in \mathbb{R}^d$, $F_t(x_0, B)$ is a solution with driving noise $B_t$.
2. If $x_t$ is a solution on a filtered probability space with driving noise $B_t$, then $x_t = F_t(x_0, B)$ a.s.

For a proof see Revuz-Yor.

First Observation: if there is a solution, there is a solution on a canonical probability space. Given a solution $(x_t, B_t)$, let $\mu$ be its joint distribution. Consider $\hat{W}^d \times W^m$ with this measure and the canonical filtration $\mathcal{G}_t$ generated by the coordinate process. Let $\omega^1_t$ be the projection of the coordinate process to $\hat{W}^d$ and $w_t$ its projection to the second component $W^m$. Then $w_t$ is a Brownian motion and $\omega^1_t$ is a solution with driving Brownian motion $w_t$, due to the following lemma.

Lemma 8.3.17 Let $f, g$ be locally bounded predictable processes (measurable with respect to the filtration generated by left continuous processes), and $B, W$ continuous semi-martingales. If $(f, B) = (g, W)$ in distribution, then $(f, B, \int_0^t f_s\, dB_s) = (g, W, \int_0^t g_s\, dW_s)$ in distribution.

Second Observation: given two solutions on two probability spaces, we can build them on the same probability space $\hat{W}^d \times \hat{W}^d \times W^m$. Let $Q_1, Q_2$ be the joint distributions of the two solutions $(x^1_t, B^1_t)$ and $(x^2_t, B^2_t)$; they are measures on $\hat{W}^d \times W^m$. Let $\tilde{Q}_i$, $i = 1, 2$, be the regular conditional probability of $\omega^i_\cdot$ given $\omega_\cdot$, so that $Q_i(d\omega^i, d\omega) = \tilde{Q}_i(\omega, d\omega^i)\, Q(d\omega)$. Then $\tilde{Q}_1(\omega, d\omega^1) \otimes \tilde{Q}_2(\omega, d\omega^2)\, Q(d\omega)$ induces a measure on $\hat{W}^d \times \hat{W}^d \times W^m$ such that the projection to the third component of the coordinate process, $w_t(\omega_1, \omega_2, \omega) = \omega_t$, is an $\mathcal{F}_t$ Brownian motion, where $\mathcal{F}_t$ is the standard filtration on the Wiener space over $\mathbb{R}^{2d+m}$. Then $w^1_t$ and $w^2_t$ are two solutions with the same driving Brownian motion. By pathwise uniqueness they are equal almost surely; being equal and conditionally independent given $w$, they are functions of $w$ and their conditional laws are delta measures.

8.3.5 Strong Completeness, Flow

Suppose that pathwise uniqueness holds and the SDE does not explode.
Definition 8.3.18 If for each point $x$ there is a solution $F_t(x, \omega)$ to
\[ dx_t = \sum_i \sigma_i(x_t) \circ dB^i_t + b(x_t)\, dt, \]
and there is a version of $F_t(x, \omega)$ such that $(t, x) \mapsto F_t(x)$ is continuous, we say that the SDE is strongly complete.

Definition 8.3.19 Let $\xi$ be $\mathcal{F}_s$-measurable. We denote by $F_{s,t}(\xi)$ the solution to
\[ x_t = \xi + \sum_i \int_s^t \sigma_i(x_r)\, dB^i_r + \int_s^t b(x_r)\, dr. \]
For simplicity let $F_t = F_{0,t}$.

Definition 8.3.20 For a stopping time $S$ we define the shift operator $\theta_S B = B_{S+\cdot} - B_S$. The process $(\theta_S B)_t := B_{S+t} - B_S$ is an $\mathcal{F}_{\cdot+S}$ Brownian motion. If $(B_t)$ is the canonical process on the Wiener space, this is $\theta_S(\omega)(t) = \omega(S+t) - \omega(S)$.

Theorem 8.3.21 Let $0 \le S \le T$ be stopping times, and assume there is a unique global strong solution $(F_t(\cdot, B), t \ge 0)$ to the SDE. Then the flow property holds:
\[ F_{S,T}(F_S(x_0, B), B) = F_T(x_0, B), \tag{8.3} \]
and the cocycle property holds:
\[ F_{T-S}(F_S(x, \omega), \theta_S(\omega)) = F_T(x, \omega). \tag{8.4} \]

Proof. The flow property follows from the pathwise uniqueness of the solution.

8.3.6 Markov Process and its Semi-group

Definition 8.3.22 A family of $\mathcal{F}_t$ adapted stochastic processes $X_t$ is a Markov process if for all real valued bounded Borel measurable functions $f$ and for all $0 \le s \le t$,
\[ E\{f(X_t) \mid \mathcal{F}_s\} = E\{f(X_t) \mid X_s\}. \]
It is strong Markov if for all stopping times $\tau$ and $t \ge 0$,
\[ E\{f(X_{\tau+t}) \mid \mathcal{F}_\tau\} = E\{f(X_{\tau+t}) \mid X_\tau\}. \]

The remainder of this sub-section was not covered in 2012-2013.

Definition 8.3.23 Let $(E, \mathcal{B}(E))$ be a measurable space. A function $P_{s,t}(x, A)$, defined for all $0 \le s \le t < \infty$, $x \in E$ and $A \in \mathcal{B}(E)$, is a Markov transition function if:
1. For all $0 \le s \le t$ and $x \in E$, $A \mapsto P_{s,t}(x, A)$ is a probability measure on $\mathcal{B}(E)$.
2. For all $A \in \mathcal{B}(E)$ and $0 \le s \le t$, $x \mapsto P_{s,t}(x, A)$ is bounded and Borel measurable.
3. For all $0 \le s \le t \le u$, all $x \in E$ and $A \in \mathcal{B}(E)$,
\[ P_{s,u}(x, A) = \int_{y \in E} P_{s,t}(x, dy)\, P_{t,u}(y, A). \]
This equation is the Chapman-Kolmogorov equation. If $P_{s,t}$ depends only on the increment $t - s$, we write $P_{t-s} = P_{s,t}$.
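The flow property (8.3) of Theorem 8.3.21, which underlies the Markov structure discussed below, can be seen concretely on a discretization. The following sketch (an assumed illustration with an Euler-Maruyama scheme, not part of the notes) checks that solving up to $T$ in one run agrees with solving up to $S$ and restarting from $F_S(x_0)$, provided the same Brownian increments are used.

```python
import numpy as np

# Discrete-time illustration of F_{S,T}(F_S(x0, B), B) = F_T(x0, B):
# an Euler scheme driven by fixed Brownian increments is a deterministic
# map of those increments, so restarting at an intermediate time from the
# intermediate state reproduces the same terminal value.
def euler(x0, dB, dt, sigma, b):
    x = x0
    for db in dB:
        x = x + sigma(x) * db + b(x) * dt
    return x

rng = np.random.default_rng(3)
dt, n = 1e-3, 1000                                  # grid on [0, T] with T = 1
dB = rng.normal(0.0, np.sqrt(dt), size=n)
sigma, b = (lambda x: 0.3 * x), (lambda x: -x)      # Lipschitz coefficients
k = 400                                             # S corresponds to step k

x_T = euler(1.0, dB, dt, sigma, b)                  # F_T(x0, B)
x_S = euler(1.0, dB[:k], dt, sigma, b)              # F_S(x0, B)
x_ST = euler(x_S, dB[k:], dt, sigma, b)             # F_{S,T}(F_S(x0, B), B)
print(abs(x_T - x_ST))
```

For the discrete scheme the two computations perform the identical sequence of floating point operations, so the difference is exactly zero; for the SDE itself the identity holds almost surely by pathwise uniqueness.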
Definition 8.3.24 A Markov process $(X_t)$ on $E$ has Markov transition function $P_{s,t}$ if for $0 \le s \le t$ and for all bounded measurable $f : (E, \mathcal{B}(E)) \to \mathbb{R}$,
\[ E\{f(X_t) \mid X_s = x\} = \int_y f(y)\, P_{s,t}(x, dy). \]
If the transition function is time homogeneous, the Markov process is a time homogeneous Markov process. For simplicity we only consider homogeneous Markov processes. We define a family of linear operators on the space of bounded measurable functions on $E$:
\[ T_t f(x) = \int_y f(y)\, P_t(x, dy). \]
The Chapman-Kolmogorov equation implies that $(T_t)$ is a semi-group of linear operators: $T_0 f = f$ and $T_{s+t} f = T_s T_t f$.

Remark 8.3.25 If $(X_t)$ is a Markov process we may be tempted to define $\{T_{s,t}, t \ge s \ge 0\}$ on bounded measurable functions by $T_{s,t} f(x) = E\{f(X_t) \mid X_s = x\}$. Indeed this is a good proposal, but it needs some thought. The conditional expectation $E\{f(X_t) \mid X_s\}$ is defined up to a set of measure zero, and this null set may differ when the function $f$ is changed. For the object $E\{f(X_t) \mid X_s = x\}$ to be well defined one needs to consider regular conditional probabilities, and such considerations are in general quite messy. However, if we begin with a transition function, there is no such problem.

A real valued Markov process $X_t$ is a diffusion process if its transition probability satisfies certain properties. Roughly speaking, given $x_s = x$,
\[ x_{s+h} - x_s \sim b(s,x)\, h + \delta x + o(h) \]
for some functions $b, \sigma$ and a random term $\delta x$ satisfying $E(\delta x) = o(h)$ and $E(\delta x)^2 = \sigma(s,x)\, h + o(h)$.

We now review a number of properties of the Markov transition function. Let $E = \mathbb{R}$ for simplicity. If $P_{s,t}(x, dy)$ is absolutely continuous with respect to $dy$, its density is denoted by $p(s, x, t, y)$.

Theorem 8.3.26 If the transition density $p(s, x, t, y)$ of a diffusion process is measurable in all its arguments, then for each $s < u < t$, each $x$, and almost all $y$,
\[ \int p(s, x, u, z)\, p(u, z, t, y)\, dz = p(s, x, t, y). \]
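The Chapman-Kolmogorov identity of Theorem 8.3.26 can be verified numerically for the Brownian transition density $p(s,x,t,y) = N(y; x, t-s)$. The sketch below (an assumed illustration, not part of the notes; the grid and time points are arbitrary choices) approximates the $z$-integral by a Riemann sum.

```python
import numpy as np

# Numerical check of int p(s,x,u,z) p(u,z,t,y) dz = p(s,x,t,y) for the
# Gaussian (heat) transition density, which is the density of Brownian motion.
def p(s, x, t, y):
    var = t - s
    return np.exp(-(y - x) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

s, u, t = 0.0, 0.7, 1.5
x, y = 0.2, -0.4
z = np.linspace(-12.0, 12.0, 20001)      # quadrature grid for the z-integral
dz = z[1] - z[0]
lhs = (p(s, x, u, z) * p(u, z, t, y)).sum() * dz
rhs = p(s, x, t, y)
print(abs(lhs - rhs))                    # of the order of the quadrature error
```

The agreement reflects the fact that convolving two Gaussian kernels of variances $u-s$ and $t-u$ yields a Gaussian kernel of variance $t-s$.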
Theorem 8.3.27 Assume that the transition density $p(s, x, t, y)$ of a diffusion process satisfies:
1. For $0 \le s < t$ with $t - s > \delta > 0$, $p(s, x, t, y)$ is continuous and bounded in $s, t, x$.
2. $p$ is twice differentiable in $x$ and once differentiable in $s$.
Then for $0 < s < t$, $p(s, x, t, y)$ satisfies the backward Kolmogorov equation:
\[ \frac{\partial p}{\partial s}(s, x, t, y) = -b(s,x)\, \frac{\partial p}{\partial x}(s, x, t, y) - \frac{1}{2}\, \sigma(s,x)\, \frac{\partial^2 p}{\partial x^2}(s, x, t, y). \]
In the backward Kolmogorov equation we differentiate in the $x$-variable. An integration by parts formula gives:
Given the existence of a strong solution and path wise uniqueness, the solution Ft (x0 , ω) to the SDE dxt = σ(t, xt )dBt + b(t, xt )dt is a strong Markov process. In fact the Cocylcle property implies the Markov property and the Cocycle property with stopping times implies the strong Markov proerty. Ef (FS+t (x, B)) = EE {f (Ft (FS (x), θS (B)))|FS (x)} = Ef (Ft (−, θS (B)))(FS (x)) = Pt f (FS (x)). 112 8.3.8 The Infinitesimal Generator for the SDE Let f : Rd → R be C 2 , we define a linear operator L : C 2 (Rd ) → C(Rd ) by ! d d m X 1 X X i ∂f ∂2f j Lf (x) = (x) + (x)bj (x). σk (x)σk (x) 2 ∂xi ∂xj ∂xj i,j=1 j=1 k=1 We also denote by Lie derivative notation the last term: d X ∂f (x)bj (x) ≡ h∇f, bi. ∂xj Lb f (x) = j=1 The above is abbreviated as m Lf = 1X Lσk Lσk f + Lb f. 2 k=1 Proposition 8.3.31 Let f : Rd → R be a C 2 function and (xt , t < τ ) a solution to the SDE E(σ, b), then for any stopping time T < τ , Z f (xT ) = f (x0 ) + T Z 0 T Lf (xs )ds. (df )xs (σ(xs )dBs ) + 0 The linear operator L, from C 2 to the set of bounded measurable functions, is called the infinitesimal generator of the SDE E(σ, b). Here we adopted the notation, Z T (df )xs (σ(xs )dBs ) = 0 = m Z X t (df )xs (σk (xs ))dBsk k=1 0 m X d Z t X k=1 j=1 0 ∂f (xs )(σkj (xs ))dBsk . ∂xj P j i Let σ T σ denotes d × d square matrix whose (i, j)-th entry is m k=1 σk (x)σk (x), it is positive definite. Let t > 0, the term Z t df Mt := f (xt ) − f (x0 ) − Lf (xs )ds 0 113 is seen to be a local martingale. (This leads to Stroock-Varadhan’s wonderful martingale method for the existence of weak solutions.) We say that f is the the domain of the generator of L, if lim t→0 Pt f − f = Lf. t For suitable functions f , given suitable conditions on σ, b (e.g. BC 2 ) d Pt f = L(Pt f ). dt ∂ The equation ∂t = L is called Kolmogorov’s forward equation. The equation ∂ ∂t = −L is called Kolmogorov’s backward equation. 
8.4 8.4.1 Existence and Uniqueness Theorems Gronwall’s Lemma Gronwall’s Lemma is also called Gronwall’s inequality is a comparison theorem 1 for one R x dimensional ordinary differential equations. For g ∈ L , any a ∈ −∞ ∪ R, a f (t)dt is absolutely continuous, of finite variation on R. Furthermore Rx d dx 0 f (t)dt = g almost surely. (Only local integrability is required if we work on a finite interval.) Rt In the following proposition we assume that t0 β(s)f (s)ds exists. Proposition 8.4.1 Let t0 ≥ 0, β ∈ L1loc , then Suppose that f (t) ≤ C(t) + Rt t0 β(s)f (s)ds. Then Z t f (t) ≤ C(t) + Rt C(s)β(s)e s β(r)dr ds. t0 Proof First R Z s Z s R d − ts βr dr − ts βr dr e 0 β(r)f (r)dr = β(s)e 0 − β(r)f (r)dr + f (s) ds t0 t0 − ≤ C(s)β(s)e Rs t0 βr dr Integrate the above from t0 to t: Z t Z t R R − t β dr − s β dr e t0 r β(r)f (r)dr ≤ C(s)β(s)e t0 r ds. t0 t0 114 Rt Hence from the assumption, f (t) ≤ C(t) + Rt βr dr t0 β(s)f (s)ds, t Z − C(s)β(s)e f (t) ≤ C(t) + e 0 Z t Rt ≤ C(t) + C(s)β(s)e s βr dr ds t0 Rs t0 βr dr ds t0 Theorem 8.4.2 (Grownall’s Inequality) Let t0 ≥ 0. Suppose that β is locally integrable and C 0 (t) exists and is locally integrable. Then from Z t f (t) ≤ C(t) + β(s)f (s)ds t0 (1) we conclude Rt f (t) ≤ C(t0 )e t0 β(r)dr Z t + C 0 (s)e Rt s β(r)dr ds. t0 (2) Let C(t) be a constant. Then C 0 (t) = 0 and Rt f (t) ≤ Ce t0 βr dr . Proof By the previous Proposition, Z t f (t) ≤ C(t) + Rt C(s)β(s)e s βr dr ds. t0 To the right hand side we apply integration by parts, Z t Rt C(t) + C(s)β(s)e s βr dr ds = C(t) − e t0 Rt t0 βr dr t Z C(s) t0 Rt = C(t) − e t0 βr dr C(t)e − d − Rts βr dr e 0 ds ds Rt t0 βr dr Z t − C(t0 ) − 0 − C (s)e Rs t0 βr dr ds t0 Rt = C(t0 )e t0 βr dr Z t + C 0 (s)e Rt s βr dr ds t0 115 8.4.2 Main Theorems For a matrix A = (aj,k ) define kAk = qP j,k |aj,k |2 . Definition 8.4.3 We say that f : Rd → Rd are Lipschitz continuous with Lipschitz constant K, if there exists a constant K > 0 such that x, y ∈ Rd . 
\[
|f(x)-f(y)|\le K|x-y|,\qquad x,y\in\mathbb{R}^d.
\]
It is locally Lipschitz continuous if for each $N$ there is $C_N>0$ such that
\[
|f(x)-f(y)|\le C_N|x-y|,\qquad x,y\in B_N.
\]
A vector valued function $f=(f_1,\dots,f_n)$ is Lipschitz continuous if and only if each $f_i$ is Lipschitz continuous. A differentiable function $f$ whose derivative $df$, equivalently whose partial derivatives $\frac{\partial f}{\partial x_i}$, is bounded is Lipschitz continuous. Any $C^1$ function is locally Lipschitz continuous.

Theorem 8.4.4 Suppose that the coefficients are locally Lipschitz continuous. For each initial value $x_0$ there is a unique strong solution $(x_t,t<\tau)$. Furthermore $\lim_{t\uparrow\tau}|x_t|=\infty$ on $\{\omega:\tau(\omega)<\infty\}$.

8.4.3 Global Lipschitz case

Below we will need Hölder's inequality: for $p,q\in(1,\infty)$ such that $\frac1p+\frac1q=1$,
\[
\int_0^t|f(s)g(s)|\,ds\le\Big(\int_0^t|f(s)|^p\,ds\Big)^{1/p}\Big(\int_0^t|g(s)|^q\,ds\Big)^{1/q}.
\]

Theorem 8.4.5 Suppose that the coefficients are Lipschitz continuous. For each initial value $x_0$ there is a unique strong solution $(x_t,t\ge0)$. Furthermore the solution is sample continuous.

Proof. Let us prove the Lipschitz case with $d=1$, $m=1$:
\[
x_t=x_0+\int_0^t\sigma(x_s)\,dB_s+\int_0^tb(x_s)\,ds.
\]

Step 1 (Uniqueness). Let $(x_t),(y_t)$ be two solutions. Then, using $(a+b)^2\le2a^2+2b^2$ and the Cauchy--Schwarz inequality,
\[
|x_t-y_t|^2\le2\Big(\int_0^t(\sigma(x_s)-\sigma(y_s))\,dB_s\Big)^2+2t\int_0^t(b(x_s)-b(y_s))^2\,ds.
\]
Take expectations and apply the Itô isometry to the stochastic integral term:
\[
E|x_t-y_t|^2\le2E\int_0^t(\sigma(x_s)-\sigma(y_s))^2\,ds+2tE\int_0^t(b(x_s)-b(y_s))^2\,ds.
\]
Since $|\sigma(x)-\sigma(y)|\le K|x-y|$ and $|b(x)-b(y)|\le K|x-y|$,
\[
E|x_t-y_t|^2\le2K^2(1+t)\int_0^tE|x_s-y_s|^2\,ds.
\]
By Gronwall's inequality, $E|x_t-y_t|^2=0$ for all $t$, and $x_t=y_t$ almost surely.

Step 2 (Existence). Fix $T>1$. Define, for all $t\in[0,T]$, the Picard iterates
\[
x_t^{(0)}=x_0,\qquad
x_t^{(1)}=x_0+\int_0^t\sigma(x_0)\,dB_s+\int_0^tb(x_0)\,ds,\qquad\dots
\]
and in general
\[
x_t^{(n+1)}=x_0+\int_0^t\sigma(x_s^{(n)})\,dB_s+\int_0^tb(x_s^{(n)})\,ds. \tag{8.6}
\]

2A. We prove that for all $t>0$,
\[
E\sup_{s\le t}|x_s^{(n+1)}|^2<\infty.
\]
This holds for $n=0$.
We assume that this holds for $n$ and prove it for $n+1$:
\[
E\sup_{s\le t}|x_s^{(n+1)}|^2\le4|x_0|^2+4E\sup_{s\le t}\Big(\int_0^s\sigma(x_r^{(n)})\,dB_r\Big)^2+4E\sup_{s\le t}\Big(\int_0^sb(x_r^{(n)})\,dr\Big)^2.
\]
Below we apply the Burkholder--Davis--Gundy inequality to the local martingale term. There is a constant $C$ such that
\[
E\sup_{s\le t}\Big(\int_0^s\sigma(x_r^{(n)})\,dB_r\Big)^2\le CE\Big\langle\int_0^\cdot\sigma(x_r^{(n)})\,dB_r,\int_0^\cdot\sigma(x_r^{(n)})\,dB_r\Big\rangle_t=CE\int_0^t\sigma^2(x_r^{(n)})\,dr.
\]
Applying Hölder's inequality gives
\[
\sup_{s\le t}\Big(\int_0^s|b(x_r^{(n)})|\,dr\Big)^2\le t\int_0^t|b(x_r^{(n)})|^2\,dr.
\]
Since $\sigma,b$ are Lipschitz continuous, there is $C_2>0$ such that $|\sigma(x)|\le C_2(1+|x|)$ and $|b(x)|\le C_2(1+|x|)$. Hence
\[
E\sup_{s\le t}|x_s^{(n+1)}|^2\le4|x_0|^2+4CE\int_0^t\sigma^2(x_r^{(n)})\,dr+4tE\int_0^t|b(x_r^{(n)})|^2\,dr
\le4|x_0|^2+8C_2^2(C+t)E\int_0^t\big(1+(x_r^{(n)})^2\big)\,dr<\infty
\]
by the induction hypothesis.

2B. For any $k\in\mathbb{N}$ and $t>0$,
\begin{align*}
E\sup_{s\le t}|x_s^{(k+1)}-x_s^{(k)}|^2
&\le2E\sup_{s\le t}\Big(\int_0^s(\sigma(x_r^{(k)})-\sigma(x_r^{(k-1)}))\,dB_r\Big)^2+2E\sup_{s\le t}\Big(\int_0^s(b(x_r^{(k)})-b(x_r^{(k-1)}))\,dr\Big)^2\\
&\le2CE\int_0^t(\sigma(x_s^{(k)})-\sigma(x_s^{(k-1)}))^2\,ds+2t\int_0^tE(b(x_s^{(k)})-b(x_s^{(k-1)}))^2\,ds\\
&\le(2CK^2+2tK^2)\int_0^tE(x_s^{(k)}-x_s^{(k-1)})^2\,ds\\
&\le(2CK^2+2tK^2)\int_0^tE\sup_{r\le t_1}(x_r^{(k)}-x_r^{(k-1)})^2\,dt_1\\
&\le(2CK^2+2tK^2)^2\int_0^tdt_1\int_0^{t_1}E\sup_{r\le t_2}(x_r^{(k-1)}-x_r^{(k-2)})^2\,dt_2\\
&\le\dots\\
&\le(2CK^2+2tK^2)^k\int_0^tdt_1\int_0^{t_1}\dots\int_0^{t_{k-1}}E\sup_{r\le t_k}(x_r^{(1)}-x_0)^2\,dt_k\\
&\le E\sup_{r\le t}(x_r^{(1)}-x_0)^2\,(2CK^2+2tK^2)^k\,\frac{t^k}{k!}=\frac{C_1D^k}{k!}.
\end{align*}
Here $C_1=E\sup_{r\le t}(x_r^{(1)}-x_0)^2$ and $D=(2CK^2+2tK^2)t$.

2C. Let $n>m$. Then
\[
|x_s^{(n)}-x_s^{(m)}|\le\sum_{k=m}^{n-1}|x_s^{(k+1)}-x_s^{(k)}|.
\]
Hence $(x_s^{(n)}(\omega))$ is a Cauchy sequence if $\sum_{k=1}^\infty|x_s^{(k+1)}(\omega)-x_s^{(k)}(\omega)|<\infty$. By 2B,
\[
E\sum_{k=1}^\infty|x_t^{(k+1)}-x_t^{(k)}|\le\sum_{k=1}^\infty\Big(E|x_t^{(k+1)}-x_t^{(k)}|^2\Big)^{1/2}
\le\sum_{k=1}^\infty\sqrt{E\sup_{s\le t}|x_s^{(k+1)}-x_s^{(k)}|^2}
\le\sum_{k=1}^\infty\sqrt{\frac{C_1D^k}{k!}}<\infty.
\]

2D. Let $x_t(\omega)=\lim_{n\to\infty}x_t^{(n)}(\omega)$. The process is continuous in time by the uniform convergence. Take $n\to\infty$ in equation (8.6), and note that
\[
\int_0^t\sigma(x_s^{(n)})\,dB_s\to\int_0^t\sigma(x_s)\,dB_s
\]
in $L^2$; there is an almost surely convergent subsequence. Hence
\[
x_t=x_0+\int_0^t\sigma(x_s)\,dB_s+\int_0^tb(x_s)\,ds.
\]
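The Picard iteration of Step 2 can be carried out numerically along a fixed (discretised) Brownian path. The sketch below is an illustration added here, not part of the notes; the Lipschitz coefficients $\sigma(x)=\frac12\cos x$, $b(x)=-x$ and the grid parameters are choices made for the example. It checks that successive iterates get closer in the sup norm, in line with the $C_1D^k/k!$ bound of Step 2B.

```python
import math
import random

random.seed(0)

# Grid on [0, T] and a fixed Brownian increment path dB_i ~ N(0, dt).
T, n_steps = 1.0, 1000
dt = T / n_steps
dB = [random.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]

sigma = lambda x: 0.5 * math.cos(x)   # illustrative Lipschitz coefficients
b = lambda x: -x
x0 = 1.0

def picard_step(x):
    """One Picard iterate: y_t = x0 + int_0^t sigma(x_s) dB_s + int_0^t b(x_s) ds."""
    y = [x0]
    s_int, b_int = 0.0, 0.0
    for i in range(n_steps):
        s_int += sigma(x[i]) * dB[i]
        b_int += b(x[i]) * dt
        y.append(x0 + s_int + b_int)
    return y

x = [x0] * (n_steps + 1)              # x^{(0)} identically x0
gaps = []
for _ in range(6):
    y = picard_step(x)
    gaps.append(max(abs(u - v) for u, v in zip(x, y)))
    x = y

# Successive sup-norm gaps between iterates shrink.
assert gaps[-1] < gaps[0]
```

Halving the Lipschitz constant of $\sigma$ (as done here) visibly speeds the contraction, matching the dependence of $D$ on $K$.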
Step 3. Each $x_t^{(n)}$ is a measurable function of the Brownian path, and hence so is the limit $x_t$. The solution is therefore a strong solution, and it is defined for all time. $\square$

8.4.4 Locally Lipschitz Continuous Case

Let $(x_t,t<\tau)$ be a continuous adapted stochastic process which is a maximal solution to the SDE $E(\sigma,b)$ with life time $\tau$. Let
\[
\tau^N(\omega)=\inf\{t:|x_t(\omega)|>N\}.
\]

Theorem 8.4.6 Assume that $\sigma,b$ are locally Lipschitz continuous. Then there is a unique strong solution $(x_t,t<\tau)$ which is a maximal solution.

Proof. Write $b=(b_1,\dots,b_d)$ in components. For each $N$, let $b_j^N$, $j=1,\dots,d$, be a globally Lipschitz continuous function with $b_j^N=b_j$ if $|x|\le N$ and $b_j^N=0$ if $|x|>N+1$. Let $b^N=(b_1^N,\dots,b_d^N)$. We define a sequence $\sigma^N$ in the same way. Let $x_t^N$ be the unique strong global solution to the SDE
\[
dx_t^N=\sigma^N(x_t^N)\,dB_t+b^N(x_t^N)\,dt.
\]
Let $\tau^N$ be the first time that $|x_t^N|$ is greater than or equal to $N$. Then $x_t^N$ agrees with $x_t^{N+1}$ for $t<\tau^N$, and $\tau^N$ increases with $N$. Define $x_t(\omega)=x_t^N(\omega)$ on $\{\omega:t<\tau^N(\omega)\}$. Then $x_t$ is defined up to $t<\tau$ where $\tau=\sup_N\tau^N$. Note that the exit time of $x_t$ from $B_N$ is $\tau^N$ and $\lim_{t\uparrow\tau(\omega)}|x_t(\omega)|=\infty$ on $\{\tau(\omega)<\infty\}$. The sets $\Omega_N:=\{\omega:t<\tau^N(\omega)\}$ increase with $N$. Let $\bar\Omega=\cup_N\Omega_N$; then
\[
P(\bar\Omega)=\lim_{N\to\infty}P(\{\omega:t<\tau^N(\omega)\}).
\]
On $\{t<\tau\}=\cup_N\Omega_N$ we have patched up a solution $x_t$. It is now clear that $x_t$ is a maximal solution for $E(\sigma,b)$, and it is a strong solution. The uniqueness is clear, as two maximal solutions $(x_t,t<T_1)$ and $(y_t,t<T_2)$ are equal up to each exit time $\tau^N$ from the ball of radius $N$; it follows that $T_1\ge\tau^N$ and $T_2\ge\tau^N$. $\square$

If for all $t$,
\[
\lim_{N\to\infty}P(\{\omega:t<\tau^N(\omega)\})=1,
\]
we have a global solution defined for all time $t$.

8.4.5 Non-explosion

Definition 8.4.7 Let $(x_t,t<\tau)$ be a maximal solution. It is a global solution if $P(\tau=\infty)=1$. We say that the SDE does not explode from $x_0$ if there is a global solution with initial point $x_0$.
We say that the SDE does not explode if there is a global solution for every starting point $x_0$.

Definition 8.4.8 A $C^2$ function $V:\mathbb{R}^d\to\mathbb{R}_+$ is a Lyapunov function (for the explosion problem associated to an infinitesimal generator $L$) if

1. $V\ge0$,
2. $\lim_{|x|\to\infty}V(x)=\infty$, and
3. $LV\le cV+K$ for some finite constants $c,K$.

Proposition 8.4.9 Assume that the $\sigma_k$ are continuous and the SDE $E(\sigma,b)$ has a solution $x_t$. If there is a Lyapunov function $V$ for the generator $L$, then the SDE does not explode.

Proof. Apply Itô's formula to $V$ and $x_{t\wedge\tau^N\wedge T^m}$, where $\tau^N$ is the first time that $V(x_t)$ is greater than or equal to $N$ and $T^m=\inf\{t:|x_t|\ge m\}$:
\[
V(x_{t\wedge\tau^N\wedge T^m})=V(x_0)+\int_0^{t\wedge\tau^N\wedge T^m}LV(x_s)\,ds+\sum_{k=1}^m\int_0^{t\wedge\tau^N\wedge T^m}dV(\sigma_k(x_s))\,dB_s^k.
\]
Since $dV$ and the $\sigma_k$ are continuous, and therefore bounded on the ball of radius $m$, the local martingale is a martingale. Take expectations to see that
\begin{align*}
EV(x_{t\wedge\tau^N\wedge T^m})&=V(x_0)+\int_0^tE\big[LV(x_s)1_{s<\tau^N\wedge T^m}\big]\,ds\\
&\le V(x_0)+\int_0^tE\big[(K+cV(x_s))1_{s<\tau^N\wedge T^m}\big]\,ds\\
&\le V(x_0)+Kt+c\int_0^tEV(x_{s\wedge\tau^N\wedge T^m})\,ds.
\end{align*}
It follows from Gronwall's lemma that
\[
EV(x_{t\wedge\tau^N\wedge T^m})\le[V(x_0)+Kt]e^{ct}.
\]
Letting $m\to\infty$, and noting that $x_t$ does not explode before time $\tau^N$,
\[
EV(x_{t\wedge\tau^N})\le[V(x_0)+Kt]e^{ct}.
\]
Now $EV(x_{t\wedge\tau^N})=E\big[V(x_t)1_{t<\tau^N}\big]+NP(t\ge\tau^N)$. In particular
\[
NP(t\ge\tau^N)\le[V(x_0)+Kt]e^{ct}
\quad\text{and}\quad
P(t\ge\tau^N)\le\frac1N[V(x_0)+Kt]e^{ct}.
\]
Take $N\to\infty$ to see that $\lim_{N\to\infty}P(t\ge\tau^N)=0$, and so $\tau\ge t$ almost surely for every $t$. $\square$

Example 8.4.10

1. Assume that $\sum_{k=1}^m|\sigma_k(x)|^2\le c(1+|x|^2)$ and $\langle b(x),x\rangle_{\mathbb{R}^d}\le c(1+|x|^2)$. Then $V(x)=1+|x|^2$ is a Lyapunov function:
\[
L(|x|^2+1)=\sum_{k=1}^m\sum_{i=1}^d(\sigma_k^i(x))^2+2\sum_{l=1}^db^l(x)x_l\le c(1+|x|^2)+2c(1+|x|^2)=3c(1+|x|^2).
\]

2. Show that the SDE below does not explode, by construction of a Lyapunov function:
\begin{align*}
dx_t&=(y_t^2-x_t^2)\,dB_t^1+2x_ty_t\,dB_t^2\\
dy_t&=-2x_ty_t\,dB_t^1+(x_t^2-y_t^2)\,dB_t^2.
\end{align*}
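Item 1 above can be checked numerically: for $V(x)=1+|x|^2$ one has $LV(x)=\sum_k|\sigma_k(x)|^2+2\langle b(x),x\rangle$, and linear growth of the coefficients forces $LV\le cV+K$. The sketch below is an illustration written for these notes; the particular coefficients $\sigma_1(x)=(\sin x_1,\cos x_2)$, $\sigma_2(x)=x$ and $b(x)=(-x_1+1,-x_2)$ are assumptions made for the example, chosen so that they have linear growth.

```python
import math

# Illustration of the Lyapunov criterion with V(x) = 1 + |x|^2 in d = 2:
#   LV(x) = sum_k |sigma_k(x)|^2 + 2 <b(x), x>,
# and for linear-growth coefficients LV <= c*V + K on all of R^2.
# The coefficients below are choices made for this sketch.

def sigma_k(x, k):
    # two driving noises: sigma_1 = (sin x1, cos x2), sigma_2 = x
    return [math.sin(x[0]), math.cos(x[1])] if k == 0 else list(x)

def b(x):
    return [-x[0] + 1.0, -x[1]]  # linear-growth drift

def LV(x):
    diffusion = sum(sum(s * s for s in sigma_k(x, k)) for k in range(2))
    drift = 2 * sum(bi * xi for bi, xi in zip(b(x), x))
    return diffusion + drift

def V(x):
    return 1 + x[0] ** 2 + x[1] ** 2

# Check LV <= c*V + K with c = 3, K = 3 on a grid of points.
grid = [(-10 + i, -10 + j) for i in range(21) for j in range(21)]
assert all(LV(x) <= 3 * V(x) + 3 for x in grid)
```

A grid check of course proves nothing, but it is a quick way to catch a wrong sign or a coefficient that secretly grows faster than linearly.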
The original proof (not given in class): the system has a global solution, as it can be transformed to $dx_t=dB_t^1$, $dy_t=dB_t^2$ on $\mathbb{R}^2\setminus\{0\}$ by the map $\varphi(x,y)=(x/r^2,-y/r^2)$, which sends points at infinity to the origin $(0,0)$ and vice versa. Then $\varphi(x_t,y_t)=(x_0+B_t^1,y_0+B_t^2)$ is the solution to the second pair of SDEs. The fact that
\[
P\big((x_0+B_t^1,y_0+B_t^2)=(0,0)\ \text{for some }t\big)=0
\]
(planar Brownian motion does not hit points) implies that $(x_0+B_t^1,y_0+B_t^2)$ does not hit the origin, and so $(x_t,y_t)$ does not explode to infinity.

3. Let $r(x)=|(x_1,\dots,x_d)|=\sqrt{\sum x_i^2}$ be the radius function. Let $\varphi$ be a harmonic function, so
\begin{align*}
\varphi(x)&=c_1|x|+c_2, &\text{dimension}&=1,\\
\varphi(x)&=c_1\log|x|+c_2, &\text{dimension}&=2,\\
\varphi(x)&=\frac{c_1}{|x|^{d-2}}+c_2, &\text{dimension}&>2.
\end{align*}
These functions satisfy $\Delta\varphi=0$ away from the origin, and convolution with the fundamental solution solves the Poisson equation $\Delta(\varphi*f)=f$. For dimensions 1 and 2, harmonic functions can be used to build Lyapunov functions for generators of the form $C\Delta$, where $C$ can be a function; modify the function inside the ball of radius one so that it is smooth. Harmonic functions in dimension 3 or greater are not useful for explosion problems.

Problem 8.4.1 What kind of conditions on $\sigma$ and $b$ imply the exponential integrability $Ee^{x_t}<\infty$?

The following is not covered in the lectures.

Proposition 8.4.11 If the coefficients have linear growth, there is no explosion. Furthermore, for any $p>1$ there are constants $c_p,C$ such that any solution satisfies
\[
E\sup_{s\le T}|x_s|^p\le\big(c_p|x_0|^p+CT^{p/2}+CT^p\big)e^{CT}
\]
for $T>0$.

Proof. Non-explosion follows from the fact that $1+|x|^2$ is a Lyapunov function. We assume that $d=m=1$. Let $x_t$ be a solution and let $p>2$; there is a constant $c_p$ such that
\[
|x_s|^p\le c_p|x_0|^p+c_p\Big|\int_0^s\sigma(x_r)\,dB_r\Big|^p+c_p\Big|\int_0^sb(x_r)\,dr\Big|^p.
\]
From this we see
\[
E\sup_{s\le t}|x_s|^p\le c_p|x_0|^p+c_pE\sup_{s\le t}\Big|\int_0^s\sigma(x_r)\,dB_r\Big|^p+c_pE\sup_{s\le t}\Big(\int_0^s|b(x_r)|\,dr\Big)^p.
\]
Below we apply the Burkholder--Davis--Gundy inequality to the local martingale term and Hölder's inequality to the second integral.
There is a constant $C_p$ such that
\begin{align*}
E\sup_{s\le t}|x_s|^p&\le c_p|x_0|^p+c_pC_pE\Big\langle\int_0^\cdot\sigma(x_r)\,dB_r,\int_0^\cdot\sigma(x_r)\,dB_r\Big\rangle_t^{p/2}+c_pt^{p-1}E\int_0^t|b(x_r)|^p\,dr\\
&=c_p|x_0|^p+c_pC_pE\Big(\int_0^t\sigma^2(x_r)\,dr\Big)^{p/2}+c_pt^{p-1}E\int_0^t|b(x_r)|^p\,dr.
\end{align*}
There is $C>0$ such that $|\sigma(x)|\le C(1+|x|)$ and $|b(x)|\le C(1+|x|)$. Hence
\[
E\sup_{s\le t}|x_s|^p\le c_p|x_0|^p+c_pC_pE\Big(\int_0^tC^2(1+|x_r|)^2\,dr\Big)^{p/2}+c_pt^{p-1}E\int_0^tC^p(1+|x_r|)^p\,dr.
\]
Since $p>2$ we may apply Hölder's inequality again: for some constant $C$,
\[
E\sup_{s\le t}|x_s|^p\le c_p|x_0|^p+Ct^{p/2}+Ct^p+C\int_0^tE\sup_{u\le r}|x_u|^p\,dr.
\]
By Gronwall's inequality,
\[
E\sup_{s\le t}|x_s|^p\le\big(c_p|x_0|^p+Ct^{p/2}+Ct^p\big)e^{Ct}. \qquad\square
\]

8.5 Girsanov Theorem

Let $(\Omega,\mathcal{F},\mathcal{F}_t,P)$ be a filtered probability space and let $N_t$ be a continuous local martingale. The exponential local martingale $e^{N_t-\frac12\langle N,N\rangle_t}$ is a true martingale if
\[
E\,e^{\frac12\langle N,N\rangle_t}<\infty\quad\text{for all }t\ge0.
\]
(This condition is called the Novikov criterion.) We define $Q$ on $\mathcal{F}_t$ by
\[
\frac{dQ}{dP}=e^{N_t-\frac12\langle N,N\rangle_t}.
\]
Then $Q$ is a probability measure.

Proposition 8.5.1 Let $f_t$ be a strictly positive continuous local martingale. There is a continuous local martingale $N_t$ such that
\[
f_t=e^{N_t-\frac12\langle N,N\rangle_t}.
\]
In fact $N_t=\log f_0+\int_0^tf_s^{-1}\,df_s$.

Proof. Let $g(x)=e^x$; then $f_t=g(N_t-\frac12\langle N,N\rangle_t)$. Apply Itô's formula to conclude. $\square$

Theorem 8.5.2 (Girsanov Theorem) Let $P$ and $Q$ be two equivalent probability measures on $\mathcal{F}_\infty$ with $f=\frac{dQ}{dP}$. Assume that $f_t=E_P\{f|\mathcal{F}_t\}$ is continuous. If $M_t$ is a real valued continuous $(\mathcal{F}_t,P)$ local martingale, then
\[
\tilde M_t=M_t-\langle M,N\rangle_t \tag{8.7}
\]
is an $(\mathcal{F}_t,Q)$ local martingale. Here $N_t=\log f_0+\int_0^tf_s^{-1}\,df_s$.

Proof. Take $T_n$ to be the first time that $\max(|f_t|,|M_t|,\langle N,N\rangle_t,\langle M,M\rangle_t)\ge n$. We only need to show that each stopped process $\tilde M^{T_n}$ is a martingale. For simplicity of notation we assume all processes concerned are bounded and prove that $(\tilde M_t)$ is a $Q$-martingale. Let $s<t$ and $A\in\mathcal{F}_s$. We want to prove that
\[
\int_A\tilde M_t\,dQ=\int_A\tilde M_s\,dQ,
\]
equivalently,
\[
\int_A\tilde M_tf_t\,dP=\int_A\tilde M_sf_s\,dP.
\]
We may assume that $M_0=0$.
Note that
\[
\langle M,N\rangle_t=\int_0^tf_s^{-1}\,d\langle f,M\rangle_s,
\]
and for any $t>0$, by integration by parts (the bracket of $\langle M,N\rangle$ with $f$ vanishes, since $\langle M,N\rangle$ has finite variation),
\begin{align*}
\langle M,N\rangle_tf_t&=\int_0^tf_r\,d\langle M,N\rangle_r+\int_0^t\langle M,N\rangle_r\,df_r\\
&=\int_0^td\langle f,M\rangle_r+\int_0^t\langle M,N\rangle_r\,df_r\\
&=\langle f,M\rangle_t+\int_0^t\langle M,N\rangle_r\,df_r.
\end{align*}
By the defining property of the bracket process, $M_tf_t-\langle f,M\rangle_t$ is a $P$-martingale, as is $\int_0^\cdot\langle M,N\rangle_r\,df_r$. Hence
\begin{align*}
\int_A(\tilde M_tf_t-\tilde M_sf_s)\,dP
&=\int_A\Big(M_tf_t-M_sf_s-\langle f,M\rangle_t+\langle f,M\rangle_s-\int_0^t\langle M,N\rangle_r\,df_r+\int_0^s\langle M,N\rangle_r\,df_r\Big)\,dP\\
&=0. \qquad\square
\end{align*}

Corollary 8.5.3 Assume the conditions of the Girsanov theorem. If $B_t=(B_t^1,\dots,B_t^m)$ is an $(\mathcal{F}_t,P)$ Brownian motion, define $\tilde B_t$ by $\tilde B_t^j=B_t^j-\langle N,B^j\rangle_t$. Then $(\tilde B_t)$ is an $(\mathcal{F}_t,Q)$ Brownian motion.

Proof. By Theorem 8.5.2, each $B_t^j-\langle N,B^j\rangle_t$ is a $Q$ local martingale. Furthermore $\langle\tilde B^i,\tilde B^j\rangle_t=\delta_{ij}t$. By Lévy's characterisation theorem, $\tilde B_t$ is an $\mathcal{F}_t$ Brownian motion with respect to $Q$. $\square$

Example 8.5.4 Let $T>0$ be a real number and let $h:[0,T]\times\Omega\to\mathbb{R}^d$ be an $L^2$ progressively measurable function such that $h_0=0$ and $Ee^{\frac12\int_0^T|h_s|^2ds}<\infty$. Let $P$ be a probability measure. Define a measure $Q$ by
\[
\frac{dQ}{dP}:=\exp\Big(\int_0^T\langle h_s,dB_s\rangle-\frac12\int_0^T|h_s|^2\,ds\Big).
\]
Then $Q$ is a probability measure by Novikov's condition. Take $N_t=\int_0^th_s\,dB_s$, where $B_s$ is a one dimensional Brownian motion with respect to $P$. Computing the bracket,
\[
\Big\langle B,\int_0^\cdot h_s\,dB_s\Big\rangle_t=\int_0^th_s\,d\langle B,B\rangle_s=\int_0^th_s\,ds.
\]
Then $B_t-\int_0^th_s\,ds$ is a Brownian motion with respect to $Q$.

Example 8.5.5 Let $\sigma,b:\mathbb{R}\to\mathbb{R}$ be bounded functions with bounded continuous first derivatives. Let $U:\mathbb{R}_+\times\mathbb{R}\to\mathbb{R}$ also be in $BC^1$. Consider two SDEs:
\begin{align*}
dx_t&=\sigma(x_t)\,dB_t+b(x_t)\,dt\\
dy_t&=\sigma(y_t)\,dB_t+b(y_t)\,dt+\sigma(y_t)U(t,y_t)\,dt.
\end{align*}
Let $x_0=y_0$ and denote by $x_t,y_t$ the solutions. By the Yamada--Watanabe theorem, there is a Borel measurable function $F_t$ such that $x_t=F_t(x_0,B_\cdot)$. Let
\[
f_t=\exp\Big(-\int_0^tU(s,y_s)\,dB_s-\frac12\int_0^tU(s,y_s)^2\,ds\Big).
\]
Fix $T>0$ and define $Q$ on $\mathcal{F}_T$ by $\frac{dQ}{dP}=f_T$. Then
\[
\tilde B_t=B_t+\int_0^tU(s,y_s)\,ds
\]
is a $Q$-Brownian motion. Note that $dy_t=\sigma(y_t)\,d\tilde B_t+b(y_t)\,dt$.
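As an aside, the constant-drift special case of Example 8.5.4 ($h_s\equiv\mu$, the Cameron--Martin shift) says that $Ef(B_T+\mu T)=E\big[f(B_T)e^{\mu B_T-\mu^2T/2}\big]$. The Monte Carlo sketch below is an illustration added here, not from the notes; the parameters $\mu$, $T$, $N$ and the choice $f(x)=x$ are assumptions made for the example.

```python
import math
import random

random.seed(1)

# Cameron-Martin / constant-drift Girsanov check (illustrative):
# for B_T ~ N(0, T) and a constant drift mu,
#   E f(B_T + mu*T)  =  E[ f(B_T) * exp(mu*B_T - mu**2*T/2) ].
mu, T, N = 0.5, 1.0, 100_000
f = lambda x: x

samples = [random.gauss(0.0, math.sqrt(T)) for _ in range(N)]
lhs = sum(f(b + mu * T) for b in samples) / N
rhs = sum(f(b) * math.exp(mu * b - 0.5 * mu**2 * T) for b in samples) / N

# Both sides estimate E[B_T + mu*T] = mu*T = 0.5.
assert abs(lhs - mu * T) < 0.05
assert abs(lhs - rhs) < 0.05
```

Using the same normal draws on both sides (common random numbers) keeps the Monte Carlo comparison tight.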
Hence $y_t=F_t(x_0,\tilde B_\cdot)$. Let $f:\mathbb{R}\to\mathbb{R}$ be bounded Borel measurable. By uniqueness in law,
\[
\int_\Omega f(F_t(x_0,B))\,dP=\int_\Omega f(F_t(x_0,\tilde B))\,dQ.
\]
This means
\[
Ef(x_t)=E\Big[f(y_t)\exp\Big(-\int_0^tU(s,y_s)\,dB_s-\frac12\int_0^tU(s,y_s)^2\,ds\Big)\Big].
\]
Similarly,
\[
Ef(y_t)=E\Big[f(x_t)\exp\Big(\int_0^tU(s,x_s)\,dB_s-\frac12\int_0^tU(s,x_s)^2\,ds\Big)\Big].
\]

8.6 Summary

List of Terminology

1. Martingales, stopping times, local martingales, sub-martingales, semi-martingales.

2. The bracket process. If $(M_t)$ and $(N_t)$ are two continuous local martingales, there exists a unique continuous process $\langle M,N\rangle_t$ of finite total variation, vanishing at 0, such that $M_tN_t-\langle M,N\rangle_t$ is a local martingale.

3. Continuous $L^2$ bounded martingales: "A continuous local martingale $(M_t,t\ge0)$ with $M_0=0$ is an $L^2$ bounded martingale if and only if $E\langle M,M\rangle_\infty<\infty$." In this case $M^2-\langle M,M\rangle$ is a uniformly integrable martingale and
\[
\|M\|_{H^2}=\sqrt{E\langle M,M\rangle_\infty}=\lim_{t\to\infty}\sqrt{EM_t^2}.
\]

4. Itô integrals with respect to a semi-martingale; Stratonovich integrals.

5. Independent increments. A stochastic process $(X_t,t\in I)$ is said to have independent increments if for any $n$ and any $0=t_0<t_1<\dots<t_n$, $t_i\in I$, the increments $\{X_{t_{i+1}}-X_{t_i}\}_{i=0}^{n-1}$ are independent.

Gaussian processes. A stochastic process $X_t$ is Gaussian if its finite dimensional distributions are Gaussian.

Brownian motion. A sample continuous stochastic process $(B_t:t\ge0)$ on $\mathbb{R}^1$ is the standard Brownian motion if $B_0=0$ and the following holds: (a) for $0\le s<t$, the distribution of $B_t-B_s$ is $N(0,t-s)$; (b) $B_t$ has independent increments. An $\mathcal{F}_t$ Brownian motion in $\mathbb{R}^d$ is a sample continuous adapted stochastic process such that $B_0=0$, $B_t-B_s$ is independent of $\mathcal{F}_s$, and $B_t-B_s$ is $N(0,(t-s)I)$ distributed.

6. Wiener space.

7. SDEs; weak and strong solutions; existence, uniqueness in law and pathwise uniqueness; non-explosion. Infinitesimal generators:
\[
L=\frac12\sum_{i,j=1}^d\Big(\sum_{k=1}^m\sigma_k^i\sigma_k^j\Big)\frac{\partial^2}{\partial x_i\partial x_j}+\sum_{j=1}^db^j\frac{\partial}{\partial x_j}.
\]

List of Theorems

1.
The Optional Stopping Theorem. For martingales: let $(X_t,t\ge0)$ be a right continuous martingale. Then for all bounded stopping times $S\le T$, $E\{X_T|\mathcal{F}_S\}=X_S$ almost surely. If $\{X_t,t\ge0\}$ is furthermore uniformly integrable, we define $X_T=X_\infty$ on $\{T=\infty\}$ and the statement holds for all stopping times.

For super-martingales: let $(X_t,t\ge0)$ be a right continuous super-martingale and let $S\le T$ be two bounded stopping times. Then $E\{X_T|\mathcal{F}_S\}\le X_S$ almost surely. If furthermore $\{X_t,t\ge0\}$ is uniformly integrable, the inequality holds for all stopping times, not necessarily bounded.

2. Martingale convergence theorem: if $X_t$ is a right continuous super-martingale (or sub-martingale) and $X_t$ is $L^1$ bounded, then $\lim_{t\to\infty}X_t$ exists a.s.

Uniform integrability lemma on conditional expectations of $L^1$ random variables: let $X:\Omega\to\mathbb{R}$ be an integrable random variable; then the family $\{E\{X|\mathcal{G}\}:\mathcal{G}\text{ a sub-}\sigma\text{-algebra of }\mathcal{F}\}$ is uniformly integrable.

3. Uniform integrability of martingales and end point theorem: for a martingale $(X_t,t\ge0)$ the following are equivalent (below we use $X_\infty$ for the end point):

(a) $X_t$ converges in $L^1$, as $t$ approaches $\infty$, to a random variable $X_\infty$;
(b) there exists an $L^1$ random variable $X_\infty$ such that $X_t=E\{X_\infty|\mathcal{F}_t\}$;
(c) $(X_t,t\ge0)$ is uniformly integrable.

4. Let $(X_t,t\in I)$ be a right continuous martingale or positive sub-martingale, where $I$ is an interval.

Maximal inequality, $p\ge1$:
\[
P\big(\sup_t|X_t|\ge\lambda\big)\le\frac1{\lambda^p}\sup_tE|X_t|^p.
\]
Doob's $L^p$ inequality, $p>1$:
\[
\Big(E\sup_{t\in I}|X_t|^p\Big)^{1/p}\le\frac{p}{p-1}\sup_t\big(E|X_t|^p\big)^{1/p}.
\]

5. Burkholder--Davis--Gundy inequality: for every $p>0$ there exist universal constants $c_p$ and $C_p$ such that for all continuous local martingales vanishing at 0,
\[
c_pE\langle M,M\rangle_T^{p/2}\le E\big(\sup_{t<T}|M_t|\big)^p\le C_pE\langle M,M\rangle_T^{p/2},
\]
where $T$ is a finite number, infinity, or a stopping time.

6. Characterisation theorem for Itô integrals. Let $M$ be a continuous local martingale with $M_0=0$.
If $H\in L^2_{\rm loc}(M)$, the stochastic integral $\int_0^tH_s\,dM_s$ is the unique local martingale vanishing at 0 such that for all $t>0$ and for all continuous local martingales $N_t$,
\[
\Big\langle\int_0^\cdot H_s\,dM_s,N\Big\rangle_t=\int_0^tH_s\,d\langle M,N\rangle_s.
\]

7. Itô's formula. Let $f$ be $C^2$ and $x_t=(x_t^1,\dots,x_t^d)$ a continuous semi-martingale. For $s<t$,
\[
f(x_t)=f(x_s)+\sum_{j=1}^d\int_s^t\frac{\partial f}{\partial x_j}(x_r)\,dx_r^j+\frac12\sum_{i,j=1}^d\int_s^t\frac{\partial^2f}{\partial x_i\partial x_j}(x_r)\,d\langle x^i,x^j\rangle_r.
\]
Variation on Itô's formula: assume that $f:[0,\infty)\times\mathbb{R}^d\to\mathbb{R}$ is $C^{1,2}$. Then
\[
f(t,x_t)=f(s,x_s)+\int_s^t\frac{\partial f}{\partial r}(r,x_r)\,dr+\sum_{j=1}^d\int_s^t\frac{\partial f}{\partial x_j}(r,x_r)\,dx_r^j+\frac12\sum_{i,j=1}^d\int_s^t\frac{\partial^2f}{\partial x_i\partial x_j}(r,x_r)\,d\langle x^i,x^j\rangle_r.
\]

8. Lévy's martingale characterisation theorem for Brownian motion. An $\mathcal{F}_t$ adapted sample continuous stochastic process $X_t$ in $\mathbb{R}^d$ vanishing at 0 is a standard $\mathcal{F}_t$-Brownian motion if and only if each $X_t^i$ is an $\mathcal{F}_t$ local martingale and $\langle X^i,X^j\rangle_t=\delta_{ij}t$.

9. Pathwise existence and uniqueness theorem: locally Lipschitz continuous coefficients imply that there is a unique strong maximal solution from each initial point; globally Lipschitz continuous coefficients imply non-explosion; a linear growth condition implies non-explosion.

10. Pathwise uniqueness implies uniqueness in law. If pathwise uniqueness holds, then any solution is a strong solution and uniqueness in law holds.

8.7 Glossary

$C([0,T];X)$, the space of continuous functions from $[0,T]$ to $X$.

$\circ$, Stratonovich integration.