PRELIMINARY CONCEPTS

Remark. The authors have made an appendix of prerequisite knowledge; it contains, notably, the following.

1. Riemann integral

Definition 1.1. A bounded real-valued function $f : [a, b] \to \mathbb{R}$ is called Riemann integrable if for every $\varepsilon > 0$ there exists a partition $P$ such that $U(P, f) - L(P, f) < \varepsilon$.

Definition 1.2. For $f$ integrable on $[a, b]$ we define $\int_a^b f(x)\,dx$ to be the common value

(1)  $\int_a^b f(x)\,dx = \inf_P U(P, f) = \sup_P L(P, f).$

Remark. The following lemma will prove useful: that any integrable function can be approximated (in some sense) by continuous functions. The proof also shows that one can alternatively use step functions.

Lemma 1.3. Suppose $f$ is integrable on the circle (i.e., a $2\pi$-periodic function on $\mathbb{R}$), bounded by $B$. Then there exists a sequence $\{f_k\}$ of continuous functions on the circle, bounded by $B$, such that

(2)  $\int_{-\pi}^{\pi} |f(x) - f_k(x)|\,dx \to 0$ as $k \to \infty$.

2. Measure Zero

Remark. A notion of “smallness” for sets.

Definition 2.1. $E \subset \mathbb{R}$ is said to be of measure 0 if, given any $\varepsilon > 0$, there exists a countable family of open intervals $\{I_k\}$ such that
(1) $E \subset \bigcup_k I_k$
(2) $\sum_{k=1}^{\infty} |I_k| < \varepsilon$

Lemma 2.2. The union of countably many sets of measure 0 is of measure 0.

Theorem 2.3. A bounded function $f : [a, b] \to \mathbb{R}$ is integrable if and only if its set of discontinuities is of measure 0.

Remark. The authors did not include the following knowledge in the appendix, but it is also assumed. You will need to use this knowledge at various points, but not prove it. Elementary concepts: convergence of sequences, Cauchy sequences, Cauchy criterion, completeness, density.

3. Pointwise convergence

Remark. Just as one can consider the convergence of a sequence of points, one can consider the convergence of a sequence $\{f_n\}$ of functions. The simplest type of convergence is “pointwise convergence”, implying that for each fixed point, the sequence of points $\{f_n(x)\}$ converges.
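The remark above can be made concrete with a small numerical sketch. It assumes the illustrative sequence $f_n(x) = x^n$ on $[0, 1]$, which is not taken from the notes; the helper names (`f_n`, `pointwise_limit`, `first_index_within`) are hypothetical. For each fixed $x$ the values $f_n(x)$ converge, but the index needed to get within a given tolerance depends on the point $x$.

```python
def f_n(n: int, x: float) -> float:
    """The n-th function of the assumed example sequence f_n(x) = x**n on [0, 1]."""
    return x ** n


def pointwise_limit(x: float) -> float:
    """Pointwise limit of x**n on [0, 1]: 0 for x < 1, and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0


def first_index_within(x: float, eps: float, max_n: int = 10**6) -> int:
    """Smallest n >= 1 with |f_n(x) - f(x)| < eps, found by direct search."""
    target = pointwise_limit(x)
    for n in range(1, max_n + 1):
        if abs(f_n(n, x) - target) < eps:
            return n
    raise ValueError("no index found below max_n")


if __name__ == "__main__":
    # The index needed for a fixed tolerance grows as x approaches 1.
    for x in (0.5, 0.9, 0.99, 0.999):
        print(x, first_index_within(x, 1e-3))
```

The search is deliberately naive; the point is only that convergence is checked one point at a time, with no single index working for all $x$ at once.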
(Reference: Rudin, Principles of Mathematical Analysis, chapter 7)

Definition 3.1. Let $E \subset X$, where $X$ is a metric space, and let $\{f_n : E \to \mathbb{C}\}$ be a sequence of functions. If $\lim_{n\to\infty} f_n(x)$ exists for each $x \in E$ (call that limit $f(x)$), then we say $\{f_n\}$ converges pointwise to $f$.

Definition 3.2 ($\varepsilon$-$N$ rephrasing). If, for each $x \in E$, we have that given any $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ (which may depend on $x$ as well as on $\varepsilon$) such that $n > N$ implies $|f_n(x) - f(x)| < \varepsilon$, then we say that $\{f_n\}$ converges pointwise to $f$ on $E$.

Remark. There are certain important qualities that are not preserved by pointwise convergence; that is, though all the $f_n$ might possess a certain property, their pointwise limit $f$ need not have that property.

Remarks.
(1) The pointwise limit of continuous functions need not be continuous.
(2) A pointwise convergent sum of continuous functions need not be continuous.
(3) Limit and derivative need not commute.
(4) The limit of a pointwise convergent sequence of integrable functions need not be integrable.
(5) Limit and integration need not commute.

4. Uniform Convergence

Remark. There is a stronger form of convergence, however: so-called uniform convergence, which turns out to preserve at least some of the above properties.
(Reference: Rudin, Principles of Mathematical Analysis, chapter 7)

Definition 4.1 (uniform convergence). Let $f_n : E \to \mathbb{C}$, $n = 1, 2, 3, \dots$, and $f : E \to \mathbb{C}$. If, for all $\varepsilon > 0$, there exists an $N$ such that $n > N$ implies $|f_n(x) - f(x)| < \varepsilon$ for all $x \in E$, then we say $\{f_n\}$ converges to $f$ uniformly.

Remark. Intuition: the tube.

Theorem 4.2 (Cauchy criterion). $\{f_n\}$ converges uniformly on $E$ $\iff$ for all $\varepsilon > 0$, there exists $N$ such that $m, n > N$ implies $|f_n(x) - f_m(x)| < \varepsilon$ for all $x \in E$.

Proof. ($\Rightarrow$) Fix $\varepsilon > 0$. Since convergence is uniform on $E$, we know there exists $N > 0$ such that if $n > N$ then $|f_n(x) - f(x)| < \varepsilon/2$ for all $x \in E$. Hence for $m, n > N$ the triangle inequality gives $|f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f(x) - f_m(x)| < \varepsilon$ for all $x \in E$.
($\Leftarrow$) Note that for each fixed $x \in E$, $\{f_n(x)\}$ is a Cauchy sequence, and so converges to some value, $f(x)$.
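By hypothesis, given $\varepsilon > 0$, there exists $N$ such that $|f_n(x) - f_m(x)| < \varepsilon/2$ for all $n, m \ge N$ and all $x \in E$. Let $m \to \infty$; we get $|f_n(x) - f(x)| \le \varepsilon/2 < \varepsilon$ for all $n \ge N$ and all $x \in E$.

Theorem 4.3 (recharacterization). Say that $\lim_{n\to\infty} f_n(x) = f(x)$ for each $x \in E$, and let $M_n := \sup_{x\in E} |f_n(x) - f(x)|$. Then $f_n \to f$ uniformly on $E$ if and only if $\lim_{n\to\infty} M_n = 0$.

Theorem 4.4 (Weierstrass M-test). Let $\{f_n : E \to \mathbb{C}\}$ be a sequence of functions. If there exist constants $\{M_n\}$ such that $|f_n(x)| \le M_n$ for all $x \in E$ and $\sum M_n$ converges, then $\sum f_n$ converges uniformly on $E$.

Proof. Use the Cauchy Criterion for sums.

5. Consequences of uniform convergence

Theorem 5.1. If $\{f_n\}$ are continuous functions that converge uniformly to some $f$, then $f$ must be continuous.

Proof. Exercise.

Theorem 5.2. Suppose $f_n : [a, b] \to \mathbb{C}$ are Riemann integrable functions. If $f_n \to f$ uniformly on $[a, b]$, then $f$ is itself Riemann integrable, and

(3)  $\int_a^b f\,dx = \lim_{n\to\infty} \int_a^b f_n\,dx.$

Proof. (Written for real-valued functions; in the complex case apply the argument to real and imaginary parts.) Let $\varepsilon_n = \sup_{[a,b]} |f_n(x) - f(x)|$. Then $f_n - \varepsilon_n \le f \le f_n + \varepsilon_n$. Thus, writing $\underline{\int} f$ and $\overline{\int} f$ for the lower and upper integrals of $f$ over $[a, b]$,

(4)  $\int_a^b (f_n - \varepsilon_n)\,dx \le \underline{\int} f \le \overline{\int} f \le \int_a^b (f_n + \varepsilon_n)\,dx$

(5)  $\Rightarrow\ 0 \le \overline{\int} f - \underline{\int} f \le \int_a^b 2\varepsilon_n\,dx = 2\varepsilon_n (b - a).$

Let $n \to \infty$; then $\varepsilon_n \to 0$, so the upper and lower integrals are equal. Moreover (4) gives $\left| \int_a^b f\,dx - \int_a^b f_n\,dx \right| \le \varepsilon_n (b - a) \to 0$, which is (3).

Remark. It is not true that uniform convergence of $f_n \to f$ implies that $f_n' \to f'$. However, one does have the following.

Theorem 5.3. Let $\{f_n : [a, b] \to \mathbb{C}\}$ be differentiable functions. If $\{f_n'\}$ converges uniformly on $[a, b]$, and $\{f_n(x_0)\}$ converges at some point $x_0 \in [a, b]$, then
(1) $\{f_n\}$ converges uniformly on $[a, b]$ to some differentiable function $f$, and
(2) $f'(x) = \lim_{n\to\infty} f_n'(x)$ for $x \in [a, b]$.

6. Fubini’s theorem

Remark. One theorem that will come in handy is the following. (Reference: Rudin, Real and Complex Analysis, pp. 165ff)

Theorem 6.1. If $f$ is a (measurable) function such that

(6)  $\int_{\mathbb{R}} \int_{\mathbb{R}} |f(x, y)|\,dx\,dy < \infty,$

then $\int_{\mathbb{R}} \int_{\mathbb{R}} f(x, y)\,dx\,dy = \int_{\mathbb{R}} \int_{\mathbb{R}} f(x, y)\,dy\,dx$; that is, the order of integration in the double integral $\int_{\mathbb{R}} \int_{\mathbb{R}} f(x, y)\,dx\,dy$ may be reversed without changing the value of the integral.

Remark.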
Note that this is not true in general; see Rudin, p. 166 for examples.

THE WAVE EQUATION: D’ALEMBERT’S FORMULA

Initial Setting; Reduction of Problem

Suppose one has a string of length $L > 0$, fixed at both ends. We let $u(x, t)$ denote the displacement of the string at position $x$ at time $t$; e.g., $u(x, 0)$ describes the string at time $t = 0$, etc. Physical considerations imply that if $u$ is twice-differentiable, it satisfies the following (the “wave equation”):

(1)  $\frac{1}{c^2} \frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2},$

for some constant $c > 0$. Changing our units (let $X = \frac{1}{a} x$, $T = \frac{1}{b} t$), and letting $U(X, T) = u(x, t)$, we get with appropriate choice ($a = \frac{L}{\pi}$, $b = \frac{L}{c\pi}$) of constants that the wave equation (1) above is equivalent to the following:

(2)  $\frac{\partial^2 U}{\partial T^2} = \frac{\partial^2 U}{\partial X^2},$

where $0 \le X \le \pi$.

[Margin note: The wave equation. $\frac{\partial}{\partial T} = b \frac{\partial}{\partial t}$ and $\frac{\partial}{\partial X} = a \frac{\partial}{\partial x}$, so (1) reads $\frac{c^2 \pi^2}{c^2 L^2} \frac{\partial^2 U}{\partial T^2} = \frac{\pi^2}{L^2} \frac{\partial^2 U}{\partial X^2}$.]

That is, without loss of generality, we may assume that the string is of length $\pi$, and that the constant $c = 1$, i.e., that

(3)  $\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}$ on $0 \le x \le \pi$, $t \ge 0$.

[Margin note: Solutions on $\mathbb{R}$ are all combinations of traveling waves. Goal: find $u(x, t)$ satisfying the above. The book gives two methods; both are relevant, one more so.]

First observation. Suppose we ignore the boundary conditions (that $u(0, t) = u(\pi, t) = 0$ for all $t > 0$). Then, for any twice differentiable function $F$, if we define either

(4)  $u(x, t) = F(x + t)$ or $u(x, t) = F(x - t),$

it is an easy calculation (by the Chain Rule) that either solves the equation. Such solutions are called traveling waves, for obvious reasons.

In fact, any twice-differentiable solution $u$ to the wave equation on $\mathbb{R}$ must be a combination of (opposing) traveling waves. For let $\xi = x + t$, $\eta = x - t$, and write

(5)  $v(\xi, \eta) = u(x, t).$
Then, again by the Chain Rule,

(6)  $\frac{\partial}{\partial x} = \frac{\partial \xi}{\partial x} \frac{\partial}{\partial \xi} + \frac{\partial \eta}{\partial x} \frac{\partial}{\partial \eta}$

(7)  $\phantom{\frac{\partial}{\partial x}} = \frac{\partial}{\partial \xi} + \frac{\partial}{\partial \eta}$

and, similarly,

(8)  $\frac{\partial}{\partial t} = \frac{\partial}{\partial \xi} - \frac{\partial}{\partial \eta},$

so the wave equation is equivalent to

(9)  $\frac{\partial^2 v}{\partial \xi^2} + 2 \frac{\partial^2 v}{\partial \xi \partial \eta} + \frac{\partial^2 v}{\partial \eta^2} = \frac{\partial^2 v}{\partial \xi^2} - 2 \frac{\partial^2 v}{\partial \xi \partial \eta} + \frac{\partial^2 v}{\partial \eta^2},$

i.e.,

(10)  $\frac{\partial^2 v}{\partial \xi \partial \eta} = 0.$

Integrating (10) first in $\eta$ and then in $\xi$, we get

(11)  $v(\xi, \eta) = F(\xi) + G(\eta),$

or, switching back to $x, t$ notation,

(12)  $u(x, t) = F(x + t) + G(x - t),$

a sum of two waves in opposite directions, as claimed.

LECTURE 2: D’ALEMBERT’S FORMULA (CONT’D); STANDING WAVES; HEAT EQUATION

Remark. Finishing up Chapter I (motivating the problems). From Chapter II onwards the presentation will be more rigorous; however this first chapter introduces some important themes.

Method I: Traveling Waves

Recall: last time we showed that every solution to the wave equation can be expressed as a sum of two traveling waves, in opposing directions.

Returning to our string of length $\pi$, let $u(x, 0) = f(x)$, $0 \le x \le \pi$, denote the initial configuration of the string; we extend $f$ first as an odd function to $[-\pi, \pi]$, and then as a $2\pi$-periodic function to all of $\mathbb{R}$; do likewise for $u(x, t)$ (also set $u(x, t) = u(x, -t)$ for $t < 0$). Then $u$ is a solution for the wave equation on all of $\mathbb{R}$, so must be a sum of traveling waves as above. What are $F$ and $G$? Well, we know

(1)  $u(x, t) = F(x + t) + G(x - t),$

so

(2)  $f(x) = u(x, 0) = F(x) + G(x).$

In fact, we haven’t imposed enough constraints to determine the functions uniquely; we impose an additional (“initial velocity”) condition, that

(3)  $\frac{\partial u}{\partial t}(x, 0) = g(x),$

where $g$ is an odd, $2\pi$-periodic function satisfying $g(0) = g(\pi) = 0$, like $f$.

[Margin note: Q. What about fixed-endpoint string motion? Q. What are those waves?]
[Margin note: Initial conditions. Not uniquely determined; we impose an initial velocity condition. That’s enough; now we can solve it.]

Thus our system is

(4)  $F(x) + G(x) = f(x)$
(5)  $F'(x) - G'(x) = g(x),$

which implies

(6)  $2F'(x) = f'(x) + g(x)$
(7)  $2G'(x) = f'(x) - g(x),$

forcing

(8)  $F(x) = \frac{1}{2} \left[ f(x) + \int_0^x g(y)\,dy \right] + C_1$
(9)  $G(x) = \frac{1}{2} \left[ f(x) - \int_0^x g(y)\,dy \right] + C_2$

and thus (note $C_1 + C_2 = 0$, by (4))

(10)  $u(x, t) = \frac{1}{2} [f(x + t) + f(x - t)] + \frac{1}{2} \int_{x-t}^{x+t} g(y)\,dy.$

[Margin note: d’Alembert’s formula. That’s one way; another, which leads directly to the main question of Fourier analysis, is the following.]

Method II: superposition of standing waves

Remark. Motivation for the notion of Fourier series - the simplest version of Fourier analysis.

[Margin note: Same problem. Standing waves.]

Again we examine the wave equation

(11)  $\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}$

with initial condition $u(x, 0) = f(x)$ and initial velocity condition $\frac{\partial u}{\partial t}(x, 0) = g(x)$ as before.

Idea: consider solutions called “standing waves,” i.e., of the simple form $u(x, t) = \varphi(x)\psi(t)$. In that case $\varphi(x)\psi''(t) = \varphi''(x)\psi(t)$, so $\psi''/\psi = \varphi''/\varphi$ is a function of $t$ alone and of $x$ alone, hence a constant $\lambda$; that is, the functions must satisfy the system

(12)  $\varphi''(x) - \lambda \varphi(x) = 0$
(13)  $\psi''(t) - \lambda \psi(t) = 0,$

and thus (tossing out the non-oscillating solutions, i.e., taking $\lambda = -m^2 < 0$)

(14)  $\psi(t) = A \cos(mt) + B \sin(mt)$
(15)  $\varphi(x) = \tilde{A} \cos(mx) + \tilde{B} \sin(mx).$

The boundary conditions now cut this down: $\varphi(0) = 0$, so $\tilde{A} = 0$; and $\varphi(\pi) = 0$, so $m \in \mathbb{Z}$ if $\tilde{B} \ne 0$. Hence solutions are of the form

(16)  $u_m(x, t) = (A_m \cos mt + B_m \sin mt) \sin mx$

for $m = 1, 2, 3, \dots$ (we combine negative indices with positive). So we have an infinite number of solutions.

Recall/observe that any linear combination of solutions is still a solution, and suppose $u(x, t)$ is some linear combination of such solutions, i.e.,

(17)  $u(x, t) = \sum_{m=1}^{\infty} (A_m \cos mt + B_m \sin mt) \sin mx.$

What are the $A_m$ and $B_m$? Recalling that $u(x, 0) = f(x)$, we see

(18)  $\sum_{m=1}^{\infty} A_m \sin mx = f(x).$

(Note we also see from the initial velocity condition that $g(x) = \sum_{m=1}^{\infty} m B_m \sin mx$.)

[Margin note: The first question.]

This motivates the question: given a sufficiently good function $f$ on $[0, \pi]$ with $f(0) = f(\pi) = 0$, can we find coefficients $A_m$ such that $f(x) = \sum_{m=1}^{\infty} A_m \sin mx$?

[Margin note: The more general question.]

More generally, the question we will really be concerned with is the following: given an arbitrary function $F$ on $[-\pi, \pi]$, can we find coefficients $a_m$ such that

(19)  $F(x) = \sum_{m=-\infty}^{\infty} a_m e^{imx}$

(note that $e^{imx} = \cos(mx) + i \sin(mx)$). Well, it is an easy calculation that

(20)  $\frac{1}{2\pi} \int_{-\pi}^{\pi} e^{imx} e^{-inx}\,dx = \begin{cases} 0 & \text{if } n \ne m \\ 1 & \text{if } n = m \end{cases}$

so, at least formally, one would expect that

(21)  $a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(x) e^{-inx}\,dx.$

[Margin note: “formally”: i.e., if one ignores problems of convergence. Infinite dimensional orthonormal basis.]

These coefficients are in fact called the Fourier coefficients of $F$, and our job will be to determine in what sense, and under what conditions, the above equality is true.

Remark. Recall from linear algebra: one may recognize this as an inner product space; then we have an orthonormal basis (infinite dimensional space). Fourier coefficients are the coefficients...

The heat equation

[Margin note: Main points: 1. introduce the terminology: heat equation, Laplacian; Dirichlet problem; polar coordinates, 2. observe that the same problem arises. Time-dependent heat equation; $\frac{\partial u}{\partial t} = 0$; Dirichlet problem.]

Suppose one has an infinite plate ($\mathbb{R}^2$) with an initial heat distribution. Let $u(x, y, t)$ denote the temperature of the plate at time $t$, position $(x, y)$.
Physical considerations yield an equation governing the evolution of that heat:

(22)  $\frac{\sigma}{\kappa} \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$

We’ll mainly be concerned with the steady-state heat equation

(23)  $\Delta u := \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$

The Dirichlet problem ($D$ some open domain (set), and $\partial D$ its boundary):

(24)  $\Delta u = 0$ on $D$, $\quad u = f$ on $\partial D$

E.g., one can consider the Dirichlet problem on

$D = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < 1\} = \{(r, \theta) \in \mathbb{R} \times S^1 : r \in [0, 1)\},$

the unit disk.

[Margin note: Laplacian in polar coordinates.]

In polar coordinates, the condition $\Delta u = 0$ becomes (exercise)

(25)  $\Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2} = 0,$

i.e.,

(26)  $r^2 \frac{\partial^2 u}{\partial r^2} + r \frac{\partial u}{\partial r} = -\frac{\partial^2 u}{\partial \theta^2}$

[Margin note: Same approach as before: separation of variables.]

We take the same “separation” approach as above: let $u(r, \theta) = F(r) G(\theta)$; then by a similar argument, we get

(27)  $G''(\theta) + \lambda G(\theta) = 0, \qquad r^2 F''(r) + r F'(r) - \lambda F(r) = 0$

As before, we obtain a family of solutions,

(28)  $u_m(r, \theta) = r^{|m|} e^{im\theta}, \quad m \in \mathbb{Z}$

and, supposing that $u$ be some linear combination

(29)  $u(r, \theta) = \sum_{m=-\infty}^{\infty} a_m r^{|m|} e^{im\theta}$

of those solutions, we see that, in this different setting, we arrive at an identical question.

[Margin note: Different setting, same question.]

For the boundary value condition requires that

(30)  $u(1, \theta) = \sum_{m=-\infty}^{\infty} a_m e^{im\theta} = f(\theta),$

so, our question is, again: “Given any reasonable function $f$ on $[0, 2\pi]$ with $f(0) = f(2\pi)$, can we find coefficients $a_m$ such that

(31)  $f(\theta) = \sum_{m=-\infty}^{\infty} a_m e^{im\theta}$?”

LECTURE 3: INTRODUCTION TO FOURIER SERIES

Basic definitions: basic function classes (read pp. 31-33 yourself), Fourier series; Examples: calculations of Fourier series, Dirichlet and Poisson kernels (pp. 34-39)

1. Basic knowledge

(1) Continuous functions on $[0, L]$.
(2) Piecewise continuous functions (only finitely many discontinuities)
(3) Riemann integrable functions (note bounded)
(4) Functions on the circle (correspondence with $2\pi$-periodic functions on $\mathbb{R}$ such that $f(0) = f(2\pi)$).

Remark.
We will assume that all of our functions are Riemann integrable.

[Margin note: Warning: only integrable functions.]

2. Definitions and examples

[Margin note: Some definitions. Important defn.: Fourier series. Fourier coefficient.]

Definition 2.1. Given an integrable function $f : [a, b] \to \mathbb{C}$, with $L = b - a$, we define the Fourier series of $f$ as

$\sum_{n=-\infty}^{\infty} \hat{f}(n) e^{2\pi i n x / L},$

where

$\hat{f}(n) := \frac{1}{L} \int_a^b f(x) e^{-2\pi i n x / L}\,dx$

denotes the $n$-th Fourier coefficient of $f$ for $n \in \mathbb{Z}$.

[Margin note: Note we don’t know if there’s any relation - think of Taylor series.]

Example (of a Fourier series): the Fourier series of the $2\pi$-periodic odd function defined on $[0, \pi]$ by $f(\theta) = \theta(\pi - \theta)$.

Well, our function is $f(\theta) = \theta(\pi - \theta)$ for $\theta > 0$ (with derivative $f'(\theta) = \pi - 2\theta$ there), and $f(\theta) = \theta(\pi + \theta)$ for $\theta < 0$ (with derivative $\pi + 2\theta$ there). Then, using integration by parts, we get for $n \ne 0$

$\hat{f}(n) := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta}\,d\theta$
$= \frac{1}{2\pi} \left\{ \left[ \frac{-f(\theta)}{in} e^{-in\theta} \right]_{-\pi}^{\pi} + \frac{1}{in} \int_{-\pi}^{\pi} f'(\theta) e^{-in\theta}\,d\theta \right\}$
$= \frac{1}{2\pi} \cdot \frac{2}{in} \left\{ \int_{-\pi}^{0} \theta e^{-in\theta}\,d\theta - \int_{0}^{\pi} \theta e^{-in\theta}\,d\theta \right\}$
$= -\frac{1}{in\pi} \int_{0}^{\pi} \theta (e^{-in\theta} + e^{in\theta})\,d\theta$  (C.O.V.: $\gamma = -\theta$)
$= -\frac{2}{in\pi} \left\{ \left[ \frac{\theta}{n} \sin(n\theta) \right]_{0}^{\pi} - \frac{1}{n} \int_{0}^{\pi} \sin(n\theta)\,d\theta \right\}$
$= \frac{2}{n^3 \pi i} [(-1)^{n+1} + 1]$
$= \frac{4}{n^3 \pi i}$ for $n$ odd, and $0$ otherwise

(for $n = 0$ the coefficient vanishes because $f$ is odd). Thus the Fourier series is

$f(\theta) \sim \sum_{k \text{ odd}} \frac{4}{\pi i k^3} e^{ik\theta}.$

[Margin note: Simple observation.]

Remark. There is, a priori, no guarantee that the Fourier series will converge at all; even if it does converge, it may not converge to $f$. In fact, if $f$ and $g$ agree everywhere except on a finite set of points, their Fourier series will be identical. So asking for pointwise convergence to the original function at every point is in fact a priori futile.

Definition 2.2. We define the $N$-th partial sum of the Fourier series of $f$, $N \in \mathbb{N}$, by

$S_N(f)(x) := \sum_{n=-N}^{N} \hat{f}(n) e^{2\pi i n x / L}.$

[Margin note: Partial sum. Define trigonometric polynomials! Only sketch it! Let them fill in the details - it’s just calculus.]

Remark. In this text, “convergence of the Fourier series to $f$” will always mean convergence of the above partial sums to $f$.

3.
Some important constructions

[Margin note: Dirichlet kernel: for now, opaque.]

Definition 3.1. We define the Dirichlet kernel $D_N$ by

$D_N(x) = \sum_{n=-N}^{N} e^{inx}$

where $x \in [-\pi, \pi]$.

Lemma 3.2. $D_N(x) = \dfrac{\sin((N + \frac{1}{2})x)}{\sin(x/2)}$

Proof. (Leave to them.) Consider the geometric sums

$\sum_{n=0}^{N} (e^{ix})^n$ and $\sum_{n=-N}^{-1} (e^{ix})^n.$

Question. (dumb) What are the Fourier coefficients of $D_N$?

Definition 3.3. We define the Poisson kernel $P_r(\theta)$ by

$P_r(\theta) = \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}$

where $\theta \in [-\pi, \pi]$ and $r \in [0, 1)$.

Remark. Note that the sum is both absolutely and uniformly convergent. (Obviously: notice the $r^{|n|}$.)

Question. (non-dumb?) What are the Fourier coefficients of $P_r(\theta)$? (Point: uniform convergence is necessary.)

Lemma 3.4. $P_r(\theta) = \dfrac{1 - r^2}{1 - 2r \cos\theta + r^2}$

[Margin note: Poisson kernel: arose in the heat equation.]

Proof. Letting $\omega = r e^{i\theta}$, we have a sum of geometric series:

$P_r(\theta) = \sum_{n=0}^{\infty} \omega^n + \sum_{n=1}^{\infty} \bar{\omega}^n$
$= \frac{1}{1 - \omega} + \frac{\bar{\omega}}{1 - \bar{\omega}} = \frac{1 - |\omega|^2}{|1 - \omega|^2}$
$= \frac{1 - r^2}{1 - 2r \cos\theta + r^2}$

4. Uniqueness of Fourier Series

Remark. The question: if the Fourier series did recover the original function uniquely, then functions with the same Fourier coefficients would have to be equal; in particular, if $\hat{f}(n) = 0$ for all $n \in \mathbb{Z}$, then $f \equiv 0$. This is of course false. However one has the following.

[Margin note: Finally: first real theorem!]

Theorem 4.1 (Uniqueness). Let $f$ be an integrable function on the circle (i.e., on $[-\pi, \pi]$, with $f(-\pi) = f(\pi)$). Suppose $\hat{f}(n) = 0$ for all $n \in \mathbb{Z}$. If $f$ is continuous at $\theta_0$, then $f(\theta_0) = 0$.

Remark. Thus $f \equiv 0$ a.e., since integrable functions are continuous except on a set of measure zero.

Remark. Observe that if $\hat{f}(n) = 0$ for all $n \in \mathbb{Z}$, then $f$ must be orthogonal to all finite linear combinations of the exponentials $e^{inx}$, i.e., orthogonal to all trigonometric polynomials.

Proof. Proof by contradiction. Suppose $f(0) > 0$ (WLOG assume $\theta_0 = 0$; the argument as written treats real-valued $f$). By continuity, we know $f(x) > \frac{1}{2} f(0)$ in some neighborhood $(-\delta, \delta)$ of 0.
We now create a sequence $\{p_k(\theta)\}$ of trigonometric polynomials (i.e., finite linear combinations of the $\{e^{inx} : n \in \mathbb{Z}\}$) as follows: Let

$p(\theta) = \cos\theta + \varepsilon,$

where $\varepsilon > 0$ is chosen so small that outside of $(-\delta, \delta)$, we still have

$|p(\theta)| < 1 - \frac{\varepsilon}{2}.$

[Margin note: Draw a picture.]

Notice that by continuity of $\cos\theta$, we can choose a small $\eta > 0$ (with $\eta < \delta$) such that inside of $(-\eta, \eta)$, we have

$p(\theta) > 1 + \frac{\varepsilon}{2}.$

Let $p_k(\theta) = [p(\theta)]^k$ (note that these are all trigonometric polynomials). Since the Fourier coefficients $\hat{f}(n) := \langle f, e^{2\pi i n x / L} \rangle$ are all 0, we have $\langle f, \sum c_n e^{inx} \rangle = 0$ for all trigonometric polynomials. However, we have just created a sequence of trigonometric polynomials for which that does not happen....(to be continued)

[Margin note: (The Idea) Point: powers of $p$ will (uniformly) shrink outside of the $\delta$ neighborhood, but grow to infinity uniformly inside the $\eta$ neighborhood.]

LECTURE 4: UNIQUENESS, PART II

1. Uniqueness of Fourier Series, continued

Recall: we were in the midst of the following theorem:

Theorem 1.1 (Uniqueness). Let $f$ be an integrable function on the circle (i.e., $f$ integrable on $[-\pi, \pi]$, with $f(-\pi) = f(\pi)$). Suppose $\hat{f}(n) = 0$ for all $n \in \mathbb{Z}$. If $f$ is continuous at $\theta_0$, then $f(\theta_0) = 0$.

Proof. Proof by contradiction. Suppose $f(0) > 0$ (WLOG assume $\theta_0 = 0$). By continuity, we know $f(x) > \frac{1}{2} f(0)$ in some neighborhood $(-\delta, \delta)$ of 0. We now create a sequence $\{p_k(\theta)\}$ of trigonometric polynomials (i.e., finite linear combinations of the $\{e^{inx} : n \in \mathbb{Z}\}$) as follows: Let

$p(\theta) = \cos\theta + \varepsilon,$

where $\varepsilon > 0$ is chosen so small that outside of $(-\delta, \delta)$, we still have

$|p(\theta)| < 1 - \frac{\varepsilon}{2}.$

Notice that by continuity of $\cos\theta$, we can choose a small $\eta > 0$ (with $\eta < \delta$) such that inside of $(-\eta, \eta)$, we have

$p(\theta) > 1 + \frac{\varepsilon}{2}.$

Let $p_k(\theta) = [p(\theta)]^k$ (note that these are all trigonometric polynomials). Since the Fourier coefficients $\hat{f}(n) := \langle f, e^{2\pi i n x / L} \rangle$ are all 0, we have $\langle f, \sum c_n e^{inx} \rangle = 0$ for all trigonometric polynomials.
However, we have just created a sequence of trigonometric polynomials for which that does not happen.

[Margin note: Draw a picture. (The Idea) Point: powers of $p$ will (uniformly) shrink outside of the $\delta$ neighborhood, but grow to infinity uniformly inside the $\eta$ neighborhood.]

The details: We estimate $\int_{-\pi}^{\pi} f(\theta) p_k(\theta)\,d\theta$ in three parts:

(1) In the $\eta$-neighborhood, we have the crude estimate

$\int_{(-\eta, \eta)} f(\theta) p_k(\theta)\,d\theta \ge 2\eta \frac{f(0)}{2} \left(1 + \frac{\varepsilon}{2}\right)^k$

which goes to infinity as $k$ does.

(2) Outside of the $\delta$ neighborhood, we have the (again crude) estimate

$\left| \int_{(-\delta, \delta)^c} f(\theta) p_k(\theta)\,d\theta \right| \le 2\pi B \left(1 - \frac{\varepsilon}{2}\right)^k$

where $B$ is the bound on (integrable) $f$; this goes to 0 as $k \to \infty$.

(3) Between the two (that is, for $\eta \le |\theta| < \delta$, where we take $\delta \le \pi/2$ so that $\cos\theta \ge 0$), $p$ and $f$ are non-negative, so the integral there is non-negative.

Together, the above prove that

$\lim_{k\to\infty} \int_{(-\pi, \pi)} f(\theta) p_k(\theta)\,d\theta = \infty,$

a contradiction, since this integral is 0 for every $k$.

2. Consequences of Uniqueness theorem

[Margin note: Some consequences.]

Corollary 2.1. Suppose $f$ is continuous on the circle. If $\hat{f}(n) = 0$ for all $n \in \mathbb{Z}$, then $f \equiv 0$. Thus if two continuous functions have the same Fourier coefficients, they must be identical.

Proof. Obvious.

[Margin note: Recovering the function from the Fourier series.]

Question. At this point, do we know that if a function is continuous, then its Fourier series converges back to the function? (No. In fact, that is false.)

Corollary 2.2. Suppose $f$ is a continuous function on the circle, and $\sum_{n=-\infty}^{\infty} |\hat{f}(n)| < \infty$ (i.e., the Fourier series of $f$ is absolutely convergent). Then

$\lim_{N\to\infty} S_N(f)(\theta) = f(\theta)$

uniformly on the circle.

[Margin note: Note we do not have this for cts. fns. in general.]

Proof. (Trivial.) Since $\sum_{n=-\infty}^{\infty} |\hat{f}(n)| < \infty$, we know (by the Weierstrass M-test) that the Fourier series converges uniformly to some continuous function, which, integrating term by term, has Fourier coefficients $\hat{f}(n)$, $n \in \mathbb{Z}$. Thus we have two continuous functions with the same Fourier coefficients, so they must, by the previous corollary, be identical.

[Margin note: Natural question.]

Question. When do we have absolute convergence of the sum of the Fourier coefficients?
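As a numerical illustration of Corollary 2.2, here is a minimal sketch using the odd function $f(\theta) = \theta(\pi - \theta)$ from the Lecture 3 example, whose coefficients are $\frac{4}{\pi i n^3}$ for odd $n$ and 0 otherwise, so that $\sum |\hat{f}(n)| < \infty$. Pairing the terms $n$ and $-n$ turns the partial sum into a sine sum with coefficients $\frac{8}{\pi n^3}$. The function names and the grid size are assumptions made for this sketch, and a maximum over a finite grid only approximates the true supremum.

```python
import math


def f(theta: float) -> float:
    """Odd extension of theta*(pi - theta) from [0, pi] to [-pi, pi]."""
    return theta * (math.pi - abs(theta))


def partial_sum(theta: float, N: int) -> float:
    """S_N(f)(theta): the terms n and -n combine to (8 / (pi n^3)) sin(n theta), n odd."""
    return sum(
        8.0 / (math.pi * n**3) * math.sin(n * theta)
        for n in range(1, N + 1, 2)
    )


def sup_error(N: int, samples: int = 2001) -> float:
    """Largest |f - S_N(f)| over an evenly spaced grid on [-pi, pi]."""
    grid = [-math.pi + 2 * math.pi * j / (samples - 1) for j in range(samples)]
    return max(abs(f(t) - partial_sum(t, N)) for t in grid)


if __name__ == "__main__":
    # The grid maximum of the error shrinks as N grows.
    for N in (1, 11, 101):
        print(N, sup_error(N))
```

The error is controlled by the tail $\sum_{n > N,\ n \text{ odd}} \frac{8}{\pi n^3}$, which is the same bound that the M-test argument in the proof uses.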
[Margin note: Introduce a useful notation.]

Definition 2.3 (Big O notation). We say $f(x) = O(g(x))$ as $x \to a$ if there exists a $C > 0$ such that

$\limsup_{x\to a} \left| \frac{f(x)}{g(x)} \right| \le C.$

Corollary 2.4. Let $f$ be a function on the circle. If $f$ is twice continuously differentiable (i.e., is of class $C^2$), then $\hat{f}(n) = O(1/|n|^2)$ as $|n| \to \infty$.

[Margin note: I.e., $\limsup_{|n|\to\infty} |\hat{f}(n)|\,|n|^2 \le C$ for some $C$. And thus we can recover the function.]

Proof. Proof is trivial: integration by parts. For $n \ne 0$,

$2\pi \hat{f}(n) = \int_0^{2\pi} f(\theta) e^{-in\theta}\,d\theta$
$= \left[ f(\theta) \frac{-e^{-in\theta}}{in} \right]_0^{2\pi} + \frac{1}{in} \int_0^{2\pi} f'(\theta) e^{-in\theta}\,d\theta$
$= \frac{1}{in} \int_0^{2\pi} f'(\theta) e^{-in\theta}\,d\theta$

[Margin note: Stop there for a second! $\widehat{f'}(n) = in \hat{f}(n)$, and so on and so forth.]

Continuing with another integration by parts,

$= \frac{1}{in} \left[ f'(\theta) \frac{-e^{-in\theta}}{in} \right]_0^{2\pi} + \frac{1}{(in)^2} \int_0^{2\pi} f''(\theta) e^{-in\theta}\,d\theta$
$= -\frac{1}{n^2} \int_0^{2\pi} f''(\theta) e^{-in\theta}\,d\theta$

So

$|\hat{f}(n)|\,|n|^2 \le \frac{1}{2\pi} \left| \int_0^{2\pi} f''(\theta) e^{-in\theta}\,d\theta \right| \le \frac{1}{2\pi} \int_0^{2\pi} |f''| \le C,$

as desired.

This is a good place to introduce the following important notion:

Definition 2.5. Let $f$ be a function for which there exists a constant $A$ such that for all $x, h$

$|f(x + h) - f(x)| \le A |h|^{\alpha}.$

Then we say that $f$ satisfies a Holder condition (or is Holder continuous) of order $\alpha$.

Remark. In fact (as you will show) if a function is Holder continuous of order $\alpha > 1/2$, then the Fourier series converges absolutely (and thus uniformly to $f$).

LECTURE 5: CONVOLUTIONS AND GOOD KERNELS

1. Convolutions

Remark. Now for a seemingly simple, but important notion....

Definition 1.1. Let $f, g : \mathbb{R} \to \mathbb{C}$ be $2\pi$-periodic functions. The convolution $f * g$ of $f$ and $g$ is the function defined on $[-\pi, \pi]$ by

$(f * g)(x) := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) g(x - y)\,dy$

Remarks.
(1) It is an easy exercise (C.O.V.) to see that $f * g = g * f$.
(2) Convolution as weighted average.
(3) Turns out that many important constructs can be expressed in terms of convolutions.
For example, consider $f * D_N$, where $D_N$ is the Dirichlet kernel:

$(f * D_N)(x) := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) \sum_{n=-N}^{N} e^{in(x-y)}\,dy$
$= \sum_{n=-N}^{N} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{in(x-y)}\,dy$
$= \sum_{n=-N}^{N} e^{inx} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iny}\,dy$
$= \sum_{n=-N}^{N} e^{inx} \hat{f}(n) =: S_N(f)(x),$

the $N$-th partial sum of the Fourier series. The question about convergence of Fourier series can be thought of as the convergence of a sequence of particular weighted averages.

[Margin note: E.g., the Hilbert transform. “Kernel” = “that which one convolves against”.]

2. Properties of Convolution

[Margin note: $L^1(\mathbb{T})$ a Banach algebra.]

Theorem 2.1 (Basic properties). Let $f, g, h$ be $2\pi$-periodic integrable functions, and $c \in \mathbb{C}$. Then
i. (Linearity I) $f * (g + h) = (f * g) + (f * h)$
ii. (Linearity II) $(cf) * g = c(f * g) = f * (cg)$
iii. (Commutative) $f * g = g * f$
iv. (Associative) $(f * g) * h = f * (g * h)$
v. (Continuity!) $f * g$ is continuous.
vi. (Interaction with Fourier transform) $\widehat{f * g}(n) = \hat{f}(n) \hat{g}(n)$

[Margin note: What is $\widehat{f * D_N}(n)$? $\hat{f}(n) \chi_{[-N,N]}(n)$. Obvious if continuous. Proof of (v), first for cts. fns. (fairly standard argument).]

Remark. If one assumes that $f, g, h$ are continuous functions, then all of the properties, excluding the fifth, are immediate calculations. E.g., (using Fubini’s theorem) one can prove (vi) for continuous functions as follows.

$\widehat{f * g}(n) := \frac{1}{2\pi} \int_{-\pi}^{\pi} (f * g)(x) e^{-inx}\,dx$
$= \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) g(x - y)\,dy \right) e^{-inx}\,dx$
$= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iny} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} g(x - y) e^{-in(x-y)}\,dx \right) dy$
$= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iny} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} g(x) e^{-inx}\,dx \right) dy$
$= \hat{f}(n) \hat{g}(n)$

Proof. We will prove the second-to-last property, initially in the case that $f, g$ are continuous. Then we will show how one obtains the result for integrable functions by approximating them with continuous ones.

Step I. Suppose $f, g$ are continuous; we want to show $f * g$ is also. I.e., we want to show that given any $\varepsilon > 0$, there exists a $\delta$ such that $|x_1 - x_2| < \delta$ implies $|(f * g)(x_1) - (f * g)(x_2)| < \varepsilon$. This is actually uniform continuity; note $[-\pi, \pi]$ is compact.
[Margin note: “Standard estimates”. Key: uniform continuity of $g$. What if we don’t have continuity? Approximate (in $L^1$) with continuous functions. Why does this suffice? (Dumb question.)]

Well, we estimate crudely with the triangle inequality:

$|(f * g)(x_1) - (f * g)(x_2)| \le \left| \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) [g(x_1 - y) - g(x_2 - y)]\,dy \right|$
$\le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(y)|\,|g(x_1 - y) - g(x_2 - y)|\,dy.$

$f$, being integrable, is bounded by some $B$ on $[-\pi, \pi]$, so we use the uniform continuity of $g$ to choose $\delta > 0$ such that $|a - b| < \delta$ implies $|g(a) - g(b)| < \frac{\varepsilon}{B}$. Then, if $|x_1 - x_2| < \delta$, we see the above is smaller than $\frac{1}{2\pi} \cdot 2\pi B \cdot \frac{\varepsilon}{B} = \varepsilon$. In other words, the convolution is (uniformly) continuous.

Question. Notice that we really needed the continuity of $g$ (but only boundedness of $f$). How are we going to do this without continuity?

Step II. Suppose $f, g$ are integrable. By the lemma in the appendix, any integrable function can be approximated in the $L^1$ sense by a sequence of continuous functions. So let $\{f_k\}$ and $\{g_k\}$ be such sequences for $f, g$, respectively. It suffices to show the following.

Claim: $f_k * g_k \to f * g$ uniformly on $[-\pi, \pi]$.

Proof. First note that $f * g - f_k * g_k = (f - f_k) * g + f_k * (g - g_k)$, so

$|f * g - f_k * g_k| \le |(f - f_k) * g| + |f_k * (g - g_k)|$

Now

$|(f - f_k) * g(x)| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x - y) - f_k(x - y)|\,|g(y)|\,dy$
$\le \frac{1}{2\pi} \sup_y |g(y)| \int_{-\pi}^{\pi} |f(y) - f_k(y)|\,dy \to 0$

as $k \to \infty$. Similarly for $|f_k * (g - g_k)|$ (the $f_k$ being uniformly bounded, by the approximation lemma); thus the bound depends on $k$ (and not on $x$); i.e., we have the desired uniform convergence. (End of proof of claim.)

Then, by Step I, since $f_k * g_k$ are continuous, their uniform limit $f * g$ is also.

LECTURE 6: CONVOLUTIONS AND GOOD KERNELS

1.
Convolutions, continued

Recall: we were proving certain properties about the convolution of two (integrable) functions; more importantly, we were showing examples of how one extends results on continuous functions to integrable functions, using the “$L^1$ approximation lemma.”

[Margin note: Second example of extending a result from continuous functions to integrable ones. Key: uniform convergence of $f_k * g_k$. We have the result already for continuous functions.]

Claim: (vi) For integrable $f, g$, $\widehat{f * g} = \hat{f} \hat{g}$.

Proof. Let $\{f_k\}$, $\{g_k\}$ be sequences of continuous functions converging to $f, g$ in $L^1$. In the previous example, we showed that $f_k * g_k$ converges to $f * g$ uniformly, so, interchanging integral and limit,

$\widehat{f_k * g_k}(n) \to \widehat{f * g}(n)$

for each fixed $n \in \mathbb{Z}$. Now, since $f_k$ and $g_k$ are continuous, we know already that $\widehat{f_k * g_k} = \hat{f_k} \hat{g_k}$. That is, we actually have

$\hat{f_k}(n) \hat{g_k}(n) \to \widehat{f * g}(n)$

as $k \to \infty$. So it suffices to show that $\hat{f_k}(n) \hat{g_k}(n) \to \hat{f}(n) \hat{g}(n)$. That follows from the $L^1$ statement: $\hat{f_k}(n) \to \hat{f}(n)$, since

$|\hat{f}(n) - \hat{f_k}(n)| = \left| \frac{1}{2\pi} \int_{-\pi}^{\pi} [f(x) - f_k(x)] e^{-inx}\,dx \right|$
$\le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x) - f_k(x)|\,dx$

which goes to 0 as $k \to \infty$ (similarly for $\hat{g_k}(n)$), so we are done.

[Margin note: Where $L^1$ convergence comes into play.]

2. Good kernels (Approximations of the Identity)

[Margin note: Convolution as multiplication: What is 1?]

Remark. One thinks of convolution as a kind of multiplication; in fact, $L^1(\mathbb{T})$ with convolution as multiplication is a so-called Banach algebra.

Question. What’s the identity element (for this multiplication)? Well, there isn’t one (it would be the Dirac delta function, which is not a function). But although there isn’t an identity element, we can create a sequence of elements that approximate the identity: a so-called good kernel.

[Margin note: An extremely useful notion: an approximation of the identity.]

Definition 2.1. Let $\{K_n : \mathbb{T} \to \mathbb{R}\}_{n=1}^{\infty}$ be a sequence of functions on the circle. If
i. (“sliding in” condition) $\frac{1}{2\pi} \int_{-\pi}^{\pi} K_n(x)\,dx = 1$ for all $n \in \mathbb{N}$,
ii. there exists $M > 0$ such that $\frac{1}{2\pi} \int_{-\pi}^{\pi} |K_n(x)|\,dx \le M$ for all $n \in \mathbb{N}$, and
iii. given any $\delta > 0$, $\int_{\delta \le |x| \le \pi} |K_n(x)|\,dx \to 0$ as $n \to \infty$,
then we say $\{K_n\}_{n=1}^{\infty}$ is a family of good kernels or (more commonly) an approximation to the identity.

[Margin note: What’s the point? They approximate the identity, i.e., the Dirac delta function.]

Theorem 2.2. Let $f$ be an integrable function on the circle, and $\{K_n\}$ a family of good kernels. Then

$\lim_{n\to\infty} (f * K_n)(x_0) = f(x_0)$

whenever $f$ is continuous at $x_0$. If $f$ is continuous everywhere, then the above limit is uniform.

[Margin note: Sliding in the function to take advantage of smoothness (here, continuity).]

Proof. Let $\varepsilon > 0$ be given. We want to control $|f * K_n(x_0) - f(x_0)|$. Well, by condition i.,

$|f * K_n(x_0) - f(x_0)| = \left| \frac{1}{2\pi} \int_{-\pi}^{\pi} K_n(y) f(x_0 - y)\,dy - f(x_0) \right|$
$\le \frac{1}{2\pi} \int_{-\pi}^{\pi} |K_n(y) [f(x_0 - y) - f(x_0)]|\,dy$

We know $f$ is continuous at $x_0$, so choose $\delta > 0$ such that $|y| < \delta$ implies $|f(x_0 - y) - f(x_0)| < \varepsilon$. Then, obviously, we break the integral into two parts, the first over $|y| < \delta$, the second its complement. For the first integral we get:

$\frac{1}{2\pi} \int_{|y| < \delta} |K_n(y) [f(x_0 - y) - f(x_0)]|\,dy \le \frac{\varepsilon}{2\pi} \int_{-\pi}^{\pi} |K_n(y)|\,dy \le \varepsilon M.$

[Margin note: Using $\|K_n\|_1 \le M$ and continuity on the first part. Using boundedness of $f$ and the shrinking of $K_n$ on the second. Q. What do the bounds depend on?]

For the second integral, with $B$ a bound for $|f|$,

$\frac{1}{2\pi} \int_{\delta \le |y| \le \pi} |K_n(y) [f(x_0 - y) - f(x_0)]|\,dy \le \frac{2B}{2\pi} \int_{\delta \le |y| \le \pi} |K_n(y)|\,dy$

By the third property of good kernels, the second integral can be made arbitrarily small by taking $n$ large; thus we are done.

Notice that the first bound is independent of $x_0$, but the second depends on $\delta$ (which depends on $x_0$). However, if $f$ were uniformly continuous, $\delta$ could be chosen independently of $x_0$, so the second bound would not depend on $x_0$; in that case the convergence would be uniform.

Remark.
Notice that if the Dirichlet kernel $D_N$ were an approximation of the identity, then we’d immediately have pointwise convergence of the Fourier series of an integrable function at every point of continuity (recall that we have no such results for integrable functions). Unfortunately, the kernel that governs convergence of the Fourier series is not an approximation of the identity.

[Margin note: The Dirichlet kernel is not good. Later: a continuous function whose Fourier series diverges.]

LECTURE 7: CESARO AND ABEL SUMMABILITY

Remark. As we are starting to see, even if the function is continuous, the Fourier series may not recover the function. On the other hand, there should be enough information in the Fourier coefficients to recover the function, especially for continuous functions: after all, we have the uniqueness theorem. How can we recover the function from the Fourier series, if not in the direct way?

[Margin note: A different way of recovering the function. There should be enough information!]

We shall present two different ways of averaging the Fourier series to recover the function.

1. Cesaro summability

Definition 1.1. Given a sequence $\{c_n\}$, let $s_n := \sum_{k=0}^{n} c_k$ be the sequence of partial sums. We define the $N$-th Cesaro mean $\sigma_N$ of the sequence $\{s_k\}$ (a.k.a. the $N$-th Cesaro sum of the series $\sum_{k=0}^{\infty} c_k$) by

$\sigma_N := \frac{s_0 + s_1 + \cdots + s_{N-1}}{N}.$

[Margin note: Cesaro sum of the series.]

Remark. Convergence implies Cesaro summability.

2. Fejer’s theorem

Definition 2.1. We define the Fejer kernel, $F_N$, by letting $F_N(x)$ denote the $N$-th Cesaro mean of the sequence $\{D_k(x)\}$; i.e.,

$F_N(x) := \frac{1}{N} \sum_{k=0}^{N-1} D_k(x).$

[Margin note: The Fejer kernel. Convolving with $F_N$ gives the Cesaro sum of the Fourier series.]

Remark. Then consider:

$f * F_N(x) := \frac{1}{N} \sum_{k=0}^{N-1} f * D_k(x)$
$= \frac{1}{N} \sum_{k=0}^{N-1} S_k(f)(x),$

the average of the first $N$ partial sums of the Fourier series of $f$.

[Margin note: The Fejer kernel is an approximation of the identity.]

Lemma 2.2.
One has
\[ F_N(x) = \frac{1}{N}\,\frac{\sin^2(Nx/2)}{\sin^2(x/2)}; \]
further, the Fejer kernel is a good kernel.

Proof. Exercise: prove the closed form. One can use the closed form to verify the good kernel criteria. Since $F_N$ is nonnegative, the fact that $\frac{1}{2\pi}\int_{-\pi}^{\pi} F_N = 1$ gives us the first two properties. The last follows from $\sin^2(x/2) \ge c_\delta > 0$ for $\delta \le |x| \le \pi$: then $|F_N(x)| \le \frac{1}{N c_\delta}$ in that region, so the integral over that region converges to 0 as $N \to \infty$.

(Margin: Since $F_N$ is a good kernel, we immediately get the following.)

Theorem 2.3. Let $f$ be an integrable function on the circle. If $f$ is continuous at $\theta_0$, then the Fourier series of $f$ is Cesaro summable to $f$ at $\theta_0$. Further, if $f$ is continuous on the entire circle, then the convergence of the Cesaro sums is uniform.

(Margin: We can recover the function!)

Corollary 2.4. Let $f$ be integrable on the circle. If $\hat{f}(n) = 0$ for all $n \in \mathbb{Z}$, then $f = 0$ at all points of continuity of $f$. (Margin: We knew this already.)

Proof. Obvious: at points of continuity, the Fourier series (namely, 0) is Cesaro summable to $f$.

Corollary 2.5. Any continuous function on the circle can be uniformly approximated by trigonometric polynomials.

Proof. The Cesaro sums are averages of the partial sums (which are trigonometric polynomials) and so are trigonometric polynomials themselves; by Theorem 2.3 they converge uniformly to $f$.

3. Abel means and summation

Definition 3.1. Let $\sum_{k=0}^{\infty} c_k$ be a series of complex numbers.
i. We define the Abel means $A(r)$ of the series $\sum c_k$ by
\[ A(r) := \sum_{k=0}^{\infty} c_k r^k. \]
(Margin: Taking the terms as coefficients of a power series.)
ii. If, for every $0 \le r < 1$, the series defining $A(r)$ converges, and
\[ \lim_{r \to 1} A(r) = s, \]
then we say the series $\sum c_k$ is Abel summable to $s$.

Example. Consider $\sum_{k=0}^{\infty} (-1)^k = 1 - 1 + 1 - 1 + 1 - \cdots$. It diverges, but is Abel summable to $s = \frac{1}{2}$:
\[ A(r) := \sum_{k=0}^{\infty} (-1)^k r^k = \frac{1}{1+r}. \]

Example. (Done in text.) Consider $\sum_{k=0}^{\infty} (-1)^k (k+1) = 1 - 2 + 3 - 4 + 5 - \cdots$. It diverges, but is Abel summable to $s = \frac{1}{4}$:
\[ A(r) := \sum_{k=0}^{\infty} (-1)^k (k+1) r^k = \frac{1}{(1+r)^2}. \]

4.
The Abel means of a Fourier series and the Poisson kernel

Definition 4.1. Suppose we know the Fourier series of a function $f$:
\[ f(\theta) \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta}. \]
We define the Abel means $A_r(f)(\theta)$ of the Fourier series of the function $f$ by
\[ A_r(f)(\theta) = \sum_{n=-\infty}^{\infty} r^{|n|} a_n e^{in\theta}. \]

Remarks.
i. If we let $c_0 = a_0$ and $c_n = a_n e^{in\theta} + a_{-n} e^{-in\theta}$ for $n \ge 1$, then the Abel means of the Fourier series above equal the Abel means of the series $\sum_{k=0}^{\infty} c_k$.
ii. For $f$ integrable, $|a_n|$ is uniformly bounded in $n$, so the series for $A_r(f)$ converges absolutely and, for each fixed $0 \le r < 1$, uniformly in $\theta$.

These means can also be recognized as a convolution.

Lemma 4.2 (Abel means as a convolution). $A_r(f)(\theta) = (f * P_r)(\theta)$.

Proof. Recall the Poisson kernel
\[ P_r(\theta) := \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}. \]
(We also saw that $P_r(\theta) = \frac{1 - r^2}{1 - 2r\cos\theta + r^2}$ for $0 \le r < 1$.) Then
\[ A_r(f)(\theta) := \sum_{n=-\infty}^{\infty} r^{|n|} a_n e^{in\theta} = \sum_{n=-\infty}^{\infty} r^{|n|} \left( \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\varphi) e^{-in\varphi}\,d\varphi \right) e^{in\theta} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\varphi) \left( \sum_{n=-\infty}^{\infty} r^{|n|} e^{-in(\varphi - \theta)} \right) d\varphi = (f * P_r)(\theta). \]
(Note we needed uniform convergence to interchange the sum and the integral.)

Lemma 4.3. The Poisson kernel is an approximation of the identity (as $r \uparrow 1$).

Proof. Recall we observed that
\[ P_r(\theta) := \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta} \]
converges absolutely. We also saw that
\[ P_r(\theta) = \frac{1 - |\omega|^2}{|1 - \omega|^2} = \frac{1 - r^2}{1 - 2r\cos\theta + r^2}, \quad \text{where } \omega = re^{i\theta}. \]
From the above, we note that $P_r(\theta) \ge 0$, and that
\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\theta)\,d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi} \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}\,d\theta = \sum_{n=-\infty}^{\infty} r^{|n|}\,\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{in\theta}\,d\theta = 1, \]
so the first two conditions for approximations of the identity are satisfied. Now, notice that
\[ 1 - 2r\cos\theta + r^2 = (1 - r)^2 + 2r(1 - \cos\theta). \]
In the region outside of $|\theta| < \delta$, the denominator is bounded below: for $\frac{1}{2} \le r \le 1$ and $\delta \le |\theta| \le \pi$ we see that
\[ 1 - 2r\cos\theta + r^2 \ge 2r(1 - \cos\theta) \ge 1 - \cos\delta =: c_\delta > 0, \]
and thus
\[ P_r(\theta) \le \frac{1 - r^2}{c_\delta} \quad \text{in } \delta \le |\theta| \le \pi; \]
thus
\[ \lim_{r \uparrow 1} \int_{\delta \le |\theta| \le \pi} P_r(\theta)\,d\theta = 0, \]
and the third condition is also satisfied.
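The three good-kernel properties of $P_r$ can also be sanity-checked numerically. The following is only an illustrative sketch, not part of the lecture: the helper names, the series cutoff `terms`, and the midpoint-rule resolution `steps` are all assumptions made for the example.

```python
import cmath
import math

def poisson_closed(r, theta):
    """Closed form (1 - r^2) / (1 - 2 r cos(theta) + r^2), for 0 <= r < 1."""
    return (1 - r * r) / (1 - 2 * r * math.cos(theta) + r * r)

def poisson_series(r, theta, terms=400):
    """Symmetric truncation of sum_n r^|n| e^{i n theta}; `terms` is an assumed cutoff."""
    total = sum(r ** abs(n) * cmath.exp(1j * n * theta)
                for n in range(-terms, terms + 1))
    return total.real

def mean_over(r, lo, hi, steps=20000):
    """Midpoint-rule value of (1/2pi) * integral of P_r over [lo, hi]."""
    h = (hi - lo) / steps
    total = sum(poisson_closed(r, lo + (k + 0.5) * h) for k in range(steps))
    return total * h / (2 * math.pi)

def tail_mass(r, delta):
    """(1/2pi) * integral of P_r over delta <= |theta| <= pi (P_r is even)."""
    return 2 * mean_over(r, delta, math.pi)

if __name__ == "__main__":
    for r in (0.5, 0.9, 0.99):
        # mean value should stay near 1; tail mass should shrink as r -> 1
        print(r, mean_over(r, -math.pi, math.pi), tail_mass(r, 0.5))
```

The mean value stays at 1 for every $r$, while the mass outside $|\theta| < \delta$ shrinks as $r \uparrow 1$, which is exactly the concentration that the third condition expresses.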
(Margin: Immediate consequence.)

Corollary 4.4. Let $f$ be an integrable function on the circle. Then the Abel means of the (Fourier series of) $f$ converge pointwise to $f$ at every point of continuity. If, further, $f$ is continuous on the circle, then the convergence is uniform. (Margin: Again we can recover the function.)

LECTURE 8: DIRICHLET PROBLEM ON THE UNIT DISC

1. The Poisson kernel and the Dirichlet Problem on the unit disc

Recall the Dirichlet problem:
\[ \Delta u = 0 \text{ inside the unit disc } D, \qquad u = f \text{ on } \partial D. \]
In Chapter I, we guessed that all solutions would be expressible as linear combinations of certain special functions; precisely, that they would be of the form
\[ u(r, \theta) = \sum_{m=-\infty}^{\infty} \hat{f}(m) r^{|m|} e^{im\theta} =: A_r(f)(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\varphi) P_r(\theta - \varphi)\,d\varphi, \]
which we now recognize as Poisson integrals. Now we shall see that this is true.

Theorem 1.1. Let $f$ be an integrable function defined on the unit circle. Then the Poisson integral
\[ u(r, \theta) := (f * P_r)(\theta) \]
solves the steady-state heat equation on the disc. That is, $u \in C^2(D)$ and satisfies:
i. $\Delta u = 0$.
ii. If $f$ is continuous at $\theta$, then
\[ \lim_{r \uparrow 1} u(r, \theta) = f(\theta), \]
and if $f$ is continuous everywhere then the convergence is uniform.
iii. If $f$ is continuous, then $u(r, \theta)$ is the unique solution to the steady-state heat equation on the disc which satisfies conditions (i) and (ii). (Margin: In fact, it is the unique solution.)

Proof. i. (Margin: Term-by-term differentiation shows the Poisson integral is infinitely differentiable and harmonic.) First off, notice that since $f$ is integrable, again, the Fourier series expansion
\[ u(r, \theta) = \sum_{m=-\infty}^{\infty} \hat{f}(m) r^{|m|} e^{im\theta} \]
of the Poisson integral converges absolutely and uniformly on any disc $0 \le r < \rho$, where $\rho < 1$.
Because of (absolute and) uniform convergence (of the derivatives of the partial sums; see Rudin, Thm 7.17), we know that $u$ is differentiable term-by-term, infinitely many times. Using the polar form of the Laplacian,
\[ \Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2}, \]
we get, by differentiating term-by-term,
\[ \Delta u(r, \theta) = \sum_{m=-\infty}^{\infty} \Big[ |m|(|m| - 1)\hat{f}(m) r^{|m|-2} e^{im\theta} + |m|\hat{f}(m) r^{|m|-2} e^{im\theta} - m^2 \hat{f}(m) r^{|m|-2} e^{im\theta} \Big] = 0, \]
so the Poisson integral is indeed harmonic.

(ii) is a restatement of the previous result. (Margin: We do recover the boundary function.)

To show (iii) (uniqueness), suppose there is another such solution, $v(r, \theta)$. It being twice continuously differentiable, for each fixed $r \in (0, 1)$ we know $v(r, \cdot)$ has a (uniformly convergent) Fourier series
\[ \sum_{n=-\infty}^{\infty} a_n(r) e^{in\theta}, \quad \text{where } a_n(r) = \frac{1}{2\pi}\int_{-\pi}^{\pi} v(r, \theta) e^{-in\theta}\,d\theta. \]
Now, that series must satisfy the steady-state heat equation, $\Delta v = 0$. Plugging the above into the polar Laplacian, and assuming that we can differentiate term-wise (Margin: Fake reasoning), we get
\[ \sum_{n=-\infty}^{\infty} \Big[ a_n''(r) + \frac{1}{r} a_n'(r) - \frac{n^2}{r^2} a_n(r) \Big] e^{in\theta} = 0 \]
and thus
\[ a_n''(r) + \frac{1}{r} a_n'(r) - \frac{n^2}{r^2} a_n(r) = 0 \quad \text{for all } n \in \mathbb{Z}. \]
The equation above implies (Exercise 11, Chapter 1) that, for $n \ne 0$, the coefficient must be of the form
\[ a_n(r) = A_n r^n + B_n r^{-n} \]
for some $A_n$ and $B_n$. Now, for $n \ge 1$, $B_n = 0$ since
\[ a_n(r) := \frac{1}{2\pi}\int_{-\pi}^{\pi} v(r, \theta) e^{-in\theta}\,d\theta \]
is bounded as $r \to 0$; to find $A_n$, take
\[ A_n = \lim_{r \uparrow 1} a_n(r) := \lim_{r \uparrow 1} \frac{1}{2\pi}\int_{-\pi}^{\pi} v(r, \theta) e^{-in\theta}\,d\theta. \]
Since $v(r, \theta)$ converges uniformly to $f$ as $r \uparrow 1$, we get $A_n = \hat{f}(n)$. (Margin: The coefficient is actually the Fourier coefficient!) Thus for $n \ge 1$, $a_n(r) = \hat{f}(n) r^n$; similarly for $n < 0$ (and for $n = 0$). All together, we have
\[ v(r, \theta) = \sum_{n=-\infty}^{\infty} \hat{f}(n) r^{|n|} e^{in\theta}, \]
and so the solution is indeed unique.

2. Chapter III: L2 convergence of Fourier Series

Definition 2.1. We define the $L^2$ norm of a function $f$ on the circle by
\[ \|f\|_2 := \left( \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(\theta)|^2\,d\theta \right)^{1/2}. \]
(Margin: $L^2(\mathbb{T})$.)

Review of basic linear algebra

Definitions: vector space, inner product, positive definite, norm, Hermitian inner product.... Some review:
i. Vector space $V$ over $\mathbb{R}$, over $\mathbb{C}$.
ii. Examples: $\mathbb{R}^d$, $\mathbb{C}^d$.
iii. Inner product $(X, Y)$ on a vector space $V$ over $\mathbb{R}$. Symmetric: $(X, Y) = (Y, X)$; positive: $(X, X) \ge 0$ (strictly positive definite if, in addition, $(X, X) = 0$ only for $X = 0$); linear in both variables.
iv. Given an inner product $(\cdot, \cdot)$, we can define a norm on $V$ by $\|X\| := (X, X)^{1/2}$. E.g., the dot product on $\mathbb{R}^d$ gives rise to the standard Euclidean norm.
v. Inner product $(X, Y)$ on a vector space $V$ over $\mathbb{C}$. Hermitian: $(X, Y) = \overline{(Y, X)}$; linear in the first variable, conjugate linear in the second; etc.

Definition 2.2 (orthogonal vectors). Let $V$ be a vector space over $\mathbb{R}$ or $\mathbb{C}$ with inner product $(\cdot, \cdot)$ and associated norm $\|\cdot\|$. If $(X, Y) = 0$, we say that $X$ and $Y$ are orthogonal and write $X \perp Y$.

It turns out (see any good linear algebra book, e.g., Leon, S. Linear Algebra) that any vector space with inner product then has the following properties (read the proofs yourselves). (Margin: Properties of (finite-dimensional?) inner product spaces.)

Theorem 2.3. Properties of inner product spaces.
i. The Pythagorean Theorem: If $X \perp Y$, then $\|X + Y\|^2 = \|X\|^2 + \|Y\|^2$.
ii. The Cauchy-Schwarz Inequality: Given any $X, Y \in V$, $|(X, Y)| \le \|X\|\,\|Y\|$.
iii. The Triangle Inequality: Given any $X, Y \in V$, $\|X + Y\| \le \|X\| + \|Y\|$.

Proof. The more complicated one is the Cauchy-Schwarz inequality. First, if $\|Y\| = 0$, then for all $t \in \mathbb{R}$,
\[ 0 \le \|X + tY\|^2 = (X + tY, X + tY) = \|X\|^2 + 2t\,\mathrm{Re}(X, Y) + 0. \]
If $\mathrm{Re}(X, Y) > 0$, then taking $t \ll 0$ gives a contradiction; similarly for $\mathrm{Re}(X, Y) < 0$; thus $\mathrm{Re}(X, Y) = 0$. Similarly for $\mathrm{Im}(X, Y)$ (replace $t$ by $it$).
Now, if $\|Y\| \ne 0$, let $c = \frac{(X, Y)}{(Y, Y)}$; then $(X - cY) \perp Y$. By the Pythagorean theorem,
\[ \|X\|^2 = \|X - cY + cY\|^2 = \|X - cY\|^2 + \|cY\|^2 \ge |c|^2 \|Y\|^2, \]
i.e., $\|X\|^2 \ge \left| \frac{(X, Y)}{(Y, Y)} \right|^2 \|Y\|^2 = \frac{|(X, Y)|^2}{\|Y\|^2}$: the C-S inequality (squared).
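The projection step in the Cauchy-Schwarz proof can be checked on a concrete pair of vectors. This is a minimal sketch on $\mathbb{C}^3$ with the standard Hermitian inner product; the vectors `X`, `Y` and the helper names are arbitrary choices for illustration.

```python
def inner(x, y):
    """Hermitian inner product on C^d: linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Norm induced by the inner product."""
    return abs(inner(x, x)) ** 0.5

# Arbitrary sample vectors (assumed for the example only).
X = [1 + 2j, -1j, 3 + 0j]
Y = [2 - 1j, 1 + 1j, -1 + 0j]

c = inner(X, Y) / inner(Y, Y)                  # coefficient of the projection of X onto Y
residual = [a - c * b for a, b in zip(X, Y)]   # X - cY, which should be orthogonal to Y

if __name__ == "__main__":
    print(abs(inner(residual, Y)))             # numerically zero
    print(abs(inner(X, Y)), norm(X) * norm(Y)) # left side should not exceed right side
```

Since $X = (X - cY) + cY$ with the two pieces orthogonal, the Pythagorean theorem gives $\|X\|^2 = \|X - cY\|^2 + |c|^2\|Y\|^2$, and discarding the first term is precisely the inequality.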
(Margin, on the case $\|Y\| = 0$ of the Cauchy-Schwarz proof: this shows that $(X, Y) = 0$ for all $X$; note the inner product is positive, but not necessarily strictly.)

LECTURE 9: THE FOURIER SERIES RECOVER THE FUNCTION IN THE L2 SENSE

1. Two important infinite-dimensional vector spaces

1.1. $\ell^2(\mathbb{Z})$.

Definition 1.1. Let $\{a_n\}_{n \in \mathbb{Z}}$ denote a (two-sided) sequence of complex numbers. We define the "little $\ell^2$ norm" of $\{a_n\}_{n \in \mathbb{Z}}$ by
\[ \|\{a_n\}_{n \in \mathbb{Z}}\|_{\ell^2} = \Big( \sum_{n \in \mathbb{Z}} |a_n|^2 \Big)^{1/2} \]
and let $\ell^2(\mathbb{Z})$ denote the vector space of all sequences whose little $\ell^2$ norm is finite. (By the two-sided sum we mean the limit of the symmetric partial sums.)

Definition 1.2 (inner product for $\ell^2(\mathbb{Z})$). For $A = \{a_n\}$ and $B = \{b_n\}$ in $\ell^2(\mathbb{Z})$, we define the inner product
\[ (A, B) := \sum_{n \in \mathbb{Z}} a_n \overline{b_n}, \]
and let $\|A\| := (A, A)^{1/2}$ denote the related norm.

Question. Is $\ell^2(\mathbb{Z})$ even a vector space?

Proof. We need to show that if $A, B \in \ell^2(\mathbb{Z})$, then so is $A + B$. That is, we need to show that
\[ \lim_{N \to \infty} \sum_{|n| \le N} |a_n + b_n|^2 < \infty. \]
This follows from the finite-dimensional triangle inequality:
\[ \Big( \sum_{|n| \le N} |a_n + b_n|^2 \Big)^{1/2} \le \Big( \sum_{|n| \le N} |a_n|^2 \Big)^{1/2} + \Big( \sum_{|n| \le N} |b_n|^2 \Big)^{1/2} \le \|A\| + \|B\|. \]
Taking the limit as $N \to \infty$ shows that $\|A + B\| \le \|A\| + \|B\| < \infty$, so we've demonstrated that $A + B \in \ell^2(\mathbb{Z})$ (and simultaneously that the infinite-dimensional triangle inequality holds). (Margin: For Cauchy-Schwarz, just use the finite-dimensional C-S and take the limit.)

Definition 1.3. An inner-product space with strictly positive-definite inner product, which is complete with respect to the induced metric, is called a Hilbert space.

Example 2: a pre-Hilbert space. $\mathcal{R}$ is not a Hilbert space. Let $\mathcal{R}$ denote the set of complex-valued Riemann integrable functions on $[0, 2\pi]$, with addition, scalar multiplication, inner product and norm defined as usual. It turns out that $\mathcal{R}$ fails to be a Hilbert space in two senses: first, the inner product is not strictly positive definite ($\|f\| = 0$ only implies $f = 0$ a.e.), and second, the space is not complete.

1.2. Checking Cauchy-Schwarz in $\mathcal{R}$.
(Margin: Cute proof of Cauchy-Schwarz for $\mathcal{R}$.)

Proof. Using the fact that $2AB \le A^2 + B^2$, we see that for any $\lambda > 0$,
\[ |f(x) g(x)| \le \frac{1}{2}\big[ \lambda |f(x)|^2 + \lambda^{-1} |g(x)|^2 \big]. \]
Then
\[ |(f, g)| \le \frac{1}{2\pi}\int_0^{2\pi} |f(x) g(x)|\,dx \le \frac{1}{2}\big[ \lambda \|f\|^2 + \lambda^{-1} \|g\|^2 \big]; \]
taking $\lambda = \frac{\|g\|}{\|f\|}$ (when both norms are nonzero) yields the Cauchy-Schwarz inequality.

2. Fourier series and orthogonality

Remark. We've already basically mentioned this before, but we can express the notion of Fourier series more simply using the language of inner products and orthogonality. (Margin: Fourier series in the language of infinite-dimensional inner product spaces.)
i. Let $\mathcal{R}$ denote the integrable functions on the circle.
ii. We define an inner product
\[ (f, g) := \frac{1}{2\pi}\int_0^{2\pi} f(\theta) \overline{g(\theta)}\,d\theta, \]
with induced norm $\|f\|_2^2 = (f, f)$ and (not-quite) metric $d(f, g) := \|f - g\|_2$.
iii. Let $e_n(\theta) := e^{in\theta}$. It is easy to see that $\{e_n\}_{n \in \mathbb{Z}}$ is an orthonormal set.
iv. Remark that the Fourier coefficients are precisely the coefficients of $f$ in terms of $\{e_n\}_{n \in \mathbb{Z}}$: $a_n = \hat{f}(n) = (f, e_n)$.
v. For example, we can express the partial sum of the Fourier series as
\[ S_N(f) = \sum_{|n| \le N} (f, e_n) e_n. \]

A few observations from linear algebra (consequences of having an orthonormal set):
i. $f - \sum_{|n| \le N} (f, e_n) e_n$ is orthogonal to $e_n$ for all $|n| \le N$.
ii. Using the above, the Pythagorean theorem implies
\[ \|f\|^2 = \|f - S_N(f)\|^2 + \|S_N(f)\|^2 = \|f - S_N(f)\|^2 + \sum_{|n| \le N} |a_n|^2. \]
iii. Given any other approximation $\sum_{|n| \le N} c_n e_n$, we have
\[ \Big\| f - \sum_{|n| \le N} c_n e_n \Big\|^2 = \|f - S_N(f)\|^2 + \Big\| \sum_{|n| \le N} (a_n - c_n) e_n \Big\|^2, \]
and thus $S_N(f)$ is the best approximation of $f$ in the $L^2$ sense among trigonometric polynomials of degree at most $N$. (Margin: The partial sum of the Fourier series is the "best approximation".)

3. Main theorem: Recovery in the L2 sense

Theorem 3.1. Let $f$ be an integrable function on the circle. Then
\[ \frac{1}{2\pi}\int_0^{2\pi} |f(\theta) - S_N(f)(\theta)|^2\,d\theta \to 0 \quad \text{as } N \to \infty, \]
i.e., the Fourier series converges to $f$ "in the $L^2$ sense."

Proof. Suppose $g$ is continuous.
By the Weierstrass approximation theorem, there exists a sequence of trigonometric polynomials $p_n$ which converge to $g$ uniformly on the circle, i.e., such that given any $\epsilon > 0$, there exists $N$ such that $n > N$ implies $|g(x) - p_n(x)| < \epsilon$ for all $x \in [0, 2\pi]$. Thus
\[ \|g - p_n\|_2 := \left( \frac{1}{2\pi}\int_0^{2\pi} |g(x) - p_n(x)|^2\,dx \right)^{1/2} < \epsilon. \]
By the "best approximation lemma", we see that then for some sufficiently large $N$ (not the same $N$ as above), $\|g - S_N(g)\| \le \epsilon$ as well. (Margin: The proof is trivial for continuous functions, using Weierstrass approximation.)

Suppose now that $f$ is an integrable function. By the $L^1$ approximation lemma, we can find a continuous function $g$ such that $g$ has the same bound ($B$, say) as $f$ and
\[ \frac{1}{2\pi}\int_0^{2\pi} |f(x) - g(x)|\,dx < \frac{\epsilon^2}{2B}. \]
(Margin: For integrable functions, use the $L^1$ approximation lemma.) The $L^2$ difference is
\[ \left( \frac{1}{2\pi}\int_0^{2\pi} |f(x) - g(x)|^2\,dx \right)^{1/2} \le \left( \frac{2B}{2\pi}\int_0^{2\pi} |f(x) - g(x)|\,dx \right)^{1/2} \le \epsilon. \]
As before, we can find a trigonometric polynomial $p$ approximating $g$ s.t. $\|g - p\|_2 < \epsilon$; thus $\|f - p\|_2 < 2\epsilon$ and, by the best approximation lemma, a partial sum $S_N(f)$ of the Fourier series of $f$, for $N$ sufficiently large, must approximate $f$ with at most that error. (Main work is in the Weierstrass approximation theorem.)

LECTURE 10: L2 RECOVERY OF INTEGRABLE FUNCTIONS AND CONSEQUENCES

1. Main theorem: Recovery in the L2 sense

Theorem 1.1. Let $f$ be an integrable function on the circle. Then
\[ \frac{1}{2\pi}\int_0^{2\pi} |f(\theta) - S_N(f)(\theta)|^2\,d\theta \to 0 \quad \text{as } N \to \infty, \]
i.e., the Fourier series converges to $f$ "in the $L^2$ sense." (Margin: Have them do the second part as exercise.)

Proof. Suppose $g$ is continuous. By the Weierstrass approximation theorem, there exists a sequence of trigonometric polynomials $p_n$ which converge to $g$ uniformly on the circle, i.e., such that given any $\epsilon > 0$, there exists $N$ such that $n > N$ implies $|g(x) - p_n(x)| < \epsilon$ for all $x \in [0, 2\pi]$. Thus
\[ \|g - p_n\|_2 := \left( \frac{1}{2\pi}\int_0^{2\pi} |g(x) - p_n(x)|^2\,dx \right)^{1/2} < \epsilon. \]
By the "best approximation lemma", we see that then for some sufficiently large $N$ (not the same $N$ as above), $\|g - S_N(g)\| \le \epsilon$ as well.

Suppose now that $f$ is an integrable function. By the $L^1$ approximation lemma, we can find a continuous function $g$ such that $g$ has the same bound ($B$, say) as $f$ and
\[ \frac{1}{2\pi}\int_0^{2\pi} |f(x) - g(x)|\,dx < \frac{\epsilon^2}{2B}. \]
The $L^2$ difference is
\[ \left( \frac{1}{2\pi}\int_0^{2\pi} |f(x) - g(x)|^2\,dx \right)^{1/2} \le \left( \frac{2B}{2\pi}\int_0^{2\pi} |f(x) - g(x)|\,dx \right)^{1/2} \le \epsilon. \]
As before, we can find a trigonometric polynomial $p$ approximating $g$ s.t. $\|g - p\|_2 < \epsilon$; thus $\|f - p\|_2 < 2\epsilon$ and, by the best approximation lemma, a partial sum $S_N(f)$ of the Fourier series of $f$, for $N$ sufficiently large, must approximate $f$ with at most that error. (Main work is in the Weierstrass approximation theorem.)

Corollary 1.2 (Parseval's Identity). Let $f$ be an integrable function, and $a_n = \hat{f}(n)$. Then $\sum_{n=-N}^{N} |a_n|^2$ converges to $\|f\|_2^2$ as $N \to \infty$.

Proof. Recall that, by the Pythagorean theorem,
\[ \|f\|^2 = \|f - S_N(f)\|^2 + \sum_{|n| \le N} |a_n|^2. \]
By Theorem 1.1, the first term on the right tends to 0 as $N \to \infty$.

(Margin: I.e., the map $f \mapsto \{a_n\}$ is an isometry. $\mathcal{R}$ is incomplete. Comment about physical meaning of sounds.)

Remark. In particular, notice that $\|\{a_n\}\|_{\ell^2} = \|f\|_{L^2}$; and one has a correspondence between $\ell^2$ and $\mathcal{R}$. However, there exist sequences in $\ell^2$ that do not arise as Fourier series of functions in $\mathcal{R}$.

Corollary 1.3 (Riemann-Lebesgue Lemma). Let $f$ be integrable on the circle. Then $\hat{f}(n) \to 0$ as $|n| \to \infty$.

(Margin: Useful variant of Parseval: the polarization identity.)

Lemma 1.4 (Polarized Parseval's identity). Let $f$ and $g$ be integrable on the circle, with Fourier coefficients $\{a_n\}$ and $\{b_n\}$, respectively. Then
\[ \frac{1}{2\pi}\int_0^{2\pi} f(\theta) \overline{g(\theta)}\,d\theta = \sum_{n=-\infty}^{\infty} a_n \overline{b_n}. \]

Proof. In any Hermitian inner product space, one has the polarization identity
\[ (f, g) = \frac{1}{4}\Big[ \|f + g\|^2 - \|f - g\|^2 + i\big( \|f + ig\|^2 - \|f - ig\|^2 \big) \Big]. \]
Using this in $\mathcal{R}$ and Parseval's identity, we get
\[ (f, g) = \frac{1}{4}\Big[ \sum |a_n + b_n|^2 - \sum |a_n - b_n|^2 + i\Big( \sum |a_n + i b_n|^2 - \sum |a_n - i b_n|^2 \Big) \Big] \]
\[ = \frac{1}{4}\Big[ \|\{a_n + b_n\}\|^2 - \|\{a_n - b_n\}\|^2 + i\big( \|\{a_n + i b_n\}\|^2 - \|\{a_n - i b_n\}\|^2 \big) \Big] = (\{a_n\}, \{b_n\})_{\ell^2}, \]
by the polarization identity again...in $\ell^2$!

2. Return to pointwise convergence

Remark. Convergence in $L^2$ does not guarantee that the Fourier series converges for any $\theta$. (Margin: HUH? (blimps). What about pointwise convergence?) We do get some small results.

Theorem 2.1. Let $f$ be an integrable function on the circle. Suppose $f$ is differentiable at $\theta_0$. Then $\lim_{N \to \infty} S_N(f)(\theta_0) = f(\theta_0)$.

Proof. Consider the "slope of the secant line" function
\[ F(t) := \begin{cases} \dfrac{f(\theta_0 - t) - f(\theta_0)}{t} & \text{if } t \ne 0 \text{ and } |t| < \pi, \\ -f'(\theta_0) & \text{if } t = 0. \end{cases} \]
Since $f$ is differentiable at $\theta_0$, $F$ is bounded near 0. It is, further, integrable on $\delta < |t| \le \pi$ for all $\delta > 0$; thus $F$ is integrable on $[-\pi, \pi]$ (see appendix). Then
\[ S_N(f)(\theta_0) - f(\theta_0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta_0 - t) D_N(t)\,dt - f(\theta_0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} [f(\theta_0 - t) - f(\theta_0)] D_N(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(t)\,t D_N(t)\,dt. \]
Now,
\[ t D_N(t) = \frac{t \sin((N + 1/2)t)}{\sin(t/2)} = \frac{t}{\sin(t/2)}\big[ \sin(Nt)\cos(t/2) + \cos(Nt)\sin(t/2) \big], \]
so we get
\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} F(t)\,t D_N(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(t)\,\frac{t}{\sin(t/2)}\big[ \sin(Nt)\cos(t/2) + \cos(Nt)\sin(t/2) \big]\,dt, \]
which goes to 0 by the Riemann-Lebesgue lemma, applied to the integrable functions $F(t)\,t\cos(t/2)/\sin(t/2)$ and $F(t)\,t$.

Remark. In fact, the conclusion is true even for $f$ only Lipschitz at $\theta_0$.

Alternate proof. (Chernoff, P. The American Mathematical Monthly, vol. 87, No. 5, May 1980, pp. 399-400.) (Margin: Even cuter proof.) WLOG assume that $x_0 = 0$ and $f(x_0) = 0$. Since $f(0) = 0$ and $f'(0)$ exists, the function $g(x) = f(x)/[e^{ix} - 1]$ is bounded near 0 (the limit exists by L'Hôpital's rule), and thus is integrable since $f$ is. Then $\hat{f}(n) = \hat{g}(n - 1) - \hat{g}(n)$, a telescoping series, and
\[ S_N f(0) = \sum_{n=-N}^{N} \hat{f}(n) = \hat{g}(-N - 1) - \hat{g}(N), \]
which tends to 0 by the Riemann-Lebesgue lemma.

(Margin: Convergence of $S_N(f)(\theta_0)$ depends only on the behavior of $f$ near $\theta_0$. Shocking!?)
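Theorem 2.1 can be watched in action on a familiar example. The sketch below is illustrative only: it uses the sawtooth $f(\theta) = \theta$ on $(-\pi, \pi)$, whose Fourier series is $2\sum_{n \ge 1} (-1)^{n+1} \sin(n\theta)/n$, and evaluates partial sums at the point $\theta_0 = 1$, where $f$ is differentiable. The function name and the sample values of $N$ are assumptions of the example.

```python
import math

def partial_sum(N, theta):
    """S_N(f)(theta) for the sawtooth f(theta) = theta on (-pi, pi)."""
    return 2 * sum((-1) ** (n + 1) * math.sin(n * theta) / n
                   for n in range(1, N + 1))

if __name__ == "__main__":
    for N in (10, 100, 1000):
        # partial sums at theta_0 = 1 should approach f(1) = 1
        print(N, partial_sum(N, 1.0))
    # at the jump theta = pi every term vanishes, so the partial sums are 0
    print(partial_sum(1000, math.pi))
```

At the jump $\theta = \pi$ the theorem says nothing; there the partial sums sit at 0, the midpoint of the jump, rather than at either one-sided limit.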
Corollary 2.2 (Localization principle of Riemann). Let $f$ and $g$ be integrable on the circle. Suppose $f \equiv g$ in some neighborhood of a point $\theta_0$. Then
\[ \lim_{N \to \infty} \big[ S_N(f)(\theta_0) - S_N(g)(\theta_0) \big] = 0. \]

Remark. Note that neither $f$ nor $g$ need be differentiable at $\theta_0$, and that this does not imply that the Fourier series of either converges at $\theta_0$, only that their convergence or divergence is connected (and, if they converge, they converge to the same limit).

LECTURE 11: FOURIER SERIES NEED NOT CONVERGE AT POINTS OF CONTINUITY

1. Continuous function with Fourier series diverging at a point

1.1. A function which is not a Fourier series of a function in $\mathcal{R}$. Consider the series
\[ \sum_{n=-\infty}^{-1} \frac{e^{in\theta}}{n}. \]
Suppose the above is the Fourier series of some Riemann integrable function, $f$. In that case, if we consider the Abel means at 0, we get
\[ |A_r(f)(0)| = \sum_{n=1}^{\infty} \frac{r^n}{n}, \]
which tends to infinity as $r \to 1$. However, we should also have
\[ |A_r(f)(0)| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(\theta)|\,P_r(\theta)\,d\theta \le \sup_\theta |f(\theta)|, \]
a contradiction, so the above is not the Fourier series of a Riemann integrable function.

1.2. A continuous function whose Fourier series diverges at a point. Let
\[ f_N(\theta) = \sum_{1 \le |n| \le N} \frac{e^{in\theta}}{n} \quad \text{and} \quad \tilde{f}_N(\theta) = \sum_{-N \le n \le -1} \frac{e^{in\theta}}{n}. \]
Claim:
(i) $|\tilde{f}_N(0)| \ge c \log N$.
(ii) $f_N(\theta)$ is uniformly bounded in $N$ and $\theta$.

(Margin: Recall Tauber's theorem: if $\sum c_n$ is Abel summable to $s$, then $\sum c_n$ actually converges to $s$ if $c_n = o(1/n)$.)

(i) is evident. To prove (ii) will require a little machinery:

Lemma 1.1. Let $\sum_{n=1}^{\infty} c_n$ be an infinite series. If
(i) the Abel means $A_r = \sum_{n=1}^{\infty} r^n c_n$ are bounded as $r \uparrow 1$, and
(ii) $c_n = O(1/n)$,
then the partial sum sequence $S_N = \sum_{n=1}^{N} c_n$ is bounded.

Proof. We will control the difference between $S_N$ and $A_r$. We have
\[ S_N - A_r = \sum_{n=1}^{N} (c_n - r^n c_n) - \sum_{n=N+1}^{\infty} r^n c_n, \]
so
\[ |S_N - A_r| \le \sum_{n=1}^{N} |c_n|\,|1 - r^n| + \sum_{n=N+1}^{\infty} |r^n|\,|c_n|. \]
Now, let us make three observations (a coarse bound on $(1 - r^n)$ and the $O(1/n)$ control):
(i) $(1 - r^n) = (1 - r)(1 + r + \cdots + r^{n-1}) \le n(1 - r)$.
(ii) Since $c_n = O(1/n)$, we have $n|c_n| \le M$ for some $M$.
(iii) We also have that for $n \ge N + 1$, $|c_n| \le \frac{M}{N}$.
Using these facts to bound the two sums, we continue:
\[ |S_N - A_r| \le M \sum_{n=1}^{N} (1 - r) + \frac{M}{N} \sum_{n=N+1}^{\infty} r^n \le MN(1 - r) + \frac{M}{N}\,\frac{1}{1 - r}. \]
If we take $r = 1 - 1/N$ (Margin: clever choice of $r$), then we get
\[ |S_N - A_r| \le 2M. \]
Since the $A_r$ are bounded for $r$ close to 1 (that is, for large enough $N$), we see that the $S_N$ are also bounded.

Recall that we wanted to prove claim (ii), i.e.:

Claim: $f_N(\theta) = \sum_{1 \le |n| \le N} \frac{e^{in\theta}}{n}$ is uniformly bounded in $N$ and $\theta$. (This follows immediately from the lemma.)

Proof. (of claim (ii)) $f_N(\theta)$ is the partial sum of the Fourier series $\sum_{n \ne 0} \frac{e^{in\theta}}{n}$, the Fourier series of the sawtooth function $f$. Since
(i) the Abel means are expressible as $f * P_r(\theta)$ (use the convolution form of the Abel means), and since $f$ is bounded (and $\|P_r\|_1 \le M$), the Abel means are bounded;
(ii) $c_n = \frac{e^{in\theta}}{n} + \frac{e^{-in\theta}}{-n} = \frac{2i\sin(n\theta)}{n}$, which is certainly $O(1/n)$, uniformly in $\theta$;
we see $S_N(f)(\theta)$ is uniformly bounded in $N$ and $\theta$.

1.3. Creating the example: the Heart of the Matter. Recall
\[ f_N(\theta) = \sum_{1 \le |n| \le N} \frac{e^{in\theta}}{n} \quad \text{and} \quad \tilde{f}_N(\theta) = \sum_{-N \le n \le -1} \frac{e^{in\theta}}{n}, \]
trigonometric polynomials of degree $N$. (Notice $f_N$ has no $n = 0$ term.) We define frequency-shifted versions of $f_N$ and $\tilde{f}_N$,
\[ P_N(\theta) = e^{i2N\theta} f_N(\theta) \quad \text{and} \quad \tilde{P}_N(\theta) = e^{i2N\theta} \tilde{f}_N(\theta), \]
which are trigonometric polynomials of degree $3N$ and $2N - 1$, respectively. Then if we consider the partial sums of $P_N$, we see

Lemma 1.2. (obvious)
\[ S_M(P_N) = \begin{cases} P_N & \text{if } M \ge 3N, \\ \tilde{P}_N & \text{if } M = 2N, \\ 0 & \text{if } M < N. \end{cases} \]

Now choose any convergent positive series $\sum \alpha_k$ and sequence of integers $\{N_k\}$ such that
(i) $N_{k+1} > 3N_k$,
(ii) $\lim_{k \to \infty} \alpha_k \log N_k = \infty$,
and let (creating the function!)
\[ f(\theta) = \sum_{k=1}^{\infty} \alpha_k P_{N_k}(\theta). \]
Since $|P_N(\theta)| = |f_N(\theta)|$, which were uniformly bounded, the above series converges uniformly to a continuous periodic function. (It's continuous. However....) On the other hand,

Key claim:
\[ |S_{2N_m}(f)(0)| \ge c\,\alpha_m \log N_m + O(1). \]

Proof.
(of this last claim) Using the lemma,
\[ S_{2N_m}(f)(0) = S_{2N_m}\Big( \sum_k \alpha_k P_{N_k} \Big)(0) = \sum_k \alpha_k S_{2N_m} P_{N_k}(0) \]
\[ = \alpha_m S_{2N_m} P_{N_m}(0) + \sum_{k > m} \alpha_k S_{2N_m} P_{N_k}(0) + \sum_{k < m} \alpha_k S_{2N_m} P_{N_k}(0) = \alpha_m \tilde{P}_{N_m}(0) + 0 + \sum_{k < m} \alpha_k P_{N_k}(0), \]
since $2N_m < N_k$ for $k > m$ and $2N_m \ge 3N_k$ for $k < m$. (Margin: Here's how those two claims come into play.) Now, $\tilde{P}_{N_m}(0) = \tilde{f}_{N_m}(0)$ which, recall, is of modulus at least $c \log N_m$, and
\[ \Big| \sum_{k < m} \alpha_k P_{N_k}(0) \Big| \le B \sum_k \alpha_k \le BA, \]
since $P_{N_k}(0) = f_{N_k}(0)$, which were uniformly bounded (by $B$, say) in $N$. (Recall that $\sum_k \alpha_k = A$ was a convergent positive series.) Because $\alpha_m \log N_m \to \infty$, the partial sums of the Fourier series diverge at 0, despite the function being continuous everywhere.

Remark. It is possible to see, using the Baire category theorem and Banach-Steinhaus theorem, that the set of continuous functions whose Fourier series diverge at a point is actually dense in the set of continuous functions (e.g., http://www.math.uchicago.edu/~may/VIGRE/VIGRE2010/REUPapers/Stratmann.pdf for an undergraduate's REU summary of the argument).

LECTURE 12: FOURIER SERIES AND THE ISOPERIMETRIC INEQUALITY

1. Basic knowledge about curves.

Definition 1.1 (parametrized curve). A $C^1$ mapping $\gamma : [a, b] \to \mathbb{R}^2$, such that $\gamma'(s) \ne 0$, is called a parametrized curve. We call the image of $\gamma$ a curve (denoted by $\Gamma$, say). If $\gamma$ is 1-1, we call $\Gamma$ simple; if $\gamma(a) = \gamma(b)$ we call $\Gamma$ closed.

Remarks.
(1) Can extend $\gamma$ to a $(b - a)$-periodic function on $\mathbb{R}$.
(2) The smoothness conditions ensure the existence of a continuous tangent vector.

Definition 1.2 (length of a curve). We define the length of the curve $\Gamma$, parametrized by $\gamma(s) = (x(s), y(s))$, by
\[ \ell = \int_a^b |\gamma'(s)|\,ds = \int_a^b \big( x'(s)^2 + y'(s)^2 \big)^{1/2}\,ds. \]

Definition 1.3 (reparametrization of a curve, arclength parametrization). Let $s : [c, d] \to [a, b]$ be a bijective, $C^1$ mapping. Then we call $\eta(t) := \gamma \circ s(t)$ a (re-)parametrization of $\Gamma$. [Note length is independent of parametrization.] If, further, $|\eta'(t)| = 1$ for all $t$ we call $\eta$ the arclength parametrization of $\Gamma$.

Definition 1.4.
(Area enclosed by a curve.) We define, as the area of the region enclosed by the simple closed curve $\Gamma$, the value
\[ A = \frac{1}{2}\int_\Gamma (x\,dy - y\,dx) = \frac{1}{2}\int_a^b \big[ x(s) y'(s) - y(s) x'(s) \big]\,ds. \]

2. The Isoperimetric inequality and Parseval's identity

Theorem 2.1. Let $\Gamma \subset \mathbb{R}^2$ be a simple closed curve. Let $\ell$ denote the length of $\Gamma$, $A$ the area of its enclosed region. Then
\[ A \le \frac{\ell^2}{4\pi}, \]
with equality if and only if $\Gamma$ is a circle.

Proof. Part I. WLOG (by scaling the plane by $(x, y) \to (\delta x, \delta y)$) one may assume that $\ell = 2\pi$; the claim is then $A \le \pi$. Take the arclength parametrization $\gamma(s) = (x(s), y(s))$, $s \in [0, 2\pi]$, of $\Gamma$; then
\[ \frac{1}{2\pi}\int_0^{2\pi} \big[ x'(s)^2 + y'(s)^2 \big]\,ds = 1, \]
i.e., $\|x'\|_2^2 + \|y'\|_2^2 = 1$. Now $x(s)$ and $y(s)$ are $2\pi$-periodic functions and have Fourier coefficients $\{a_n\}$ and $\{b_n\}$, respectively; their derivatives have coefficients $\{in\,a_n\}$ and $\{in\,b_n\}$, respectively. (Note that $a_n = \overline{a_{-n}}$ and $b_n = \overline{b_{-n}}$ since $x, y$ are real-valued.) Translating the arclength condition into Fourier series via Parseval's identity, we get
\[ (1) \quad \|\{in\,a_n\}\|_{\ell^2}^2 + \|\{in\,b_n\}\|_{\ell^2}^2 = 1, \]
i.e.,
\[ (2) \quad \sum_{n=-\infty}^{\infty} |n|^2 \big[ |a_n|^2 + |b_n|^2 \big] = 1. \]
Now, the area is by "definition"
\[ A = 2\pi \cdot \frac{1}{2} \cdot \frac{1}{2\pi}\int_0^{2\pi} \big[ x(s) y'(s) - y(s) x'(s) \big]\,ds. \]
Using (polarized) Parseval again, we get
\[ (3) \quad A = \pi \sum_{n \in \mathbb{Z}} \big[ a_n (-in\,\overline{b_n}) - b_n (-in\,\overline{a_n}) \big] \]
\[ (4) \quad \le \pi \sum_{n \in \mathbb{Z}} |n|\,\big| a_n \overline{b_n} - \overline{a_n} b_n \big|. \]
Now, let's bound this using the previous observation. Since (trivial quadratic inequality)
\[ (5) \quad \big| a_n \overline{b_n} - \overline{a_n} b_n \big| \le 2|a_n|\,|b_n| \le |a_n|^2 + |b_n|^2 \]
(with equality in the last step iff $|a_n| = |b_n|$), we see that (observe $|n| \le |n|^2$)
\[ (6) \quad A \le \pi \sum_{n \in \mathbb{Z}} |n| \big( |a_n|^2 + |b_n|^2 \big) \]
\[ (7) \quad \le \pi \sum_{n \in \mathbb{Z}} |n|^2 \big( |a_n|^2 + |b_n|^2 \big) = \pi, \]
using the statement (2) above about the arclength parametrization for the final equality. So we've shown the isoperimetric inequality.
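As a numerical illustration of the inequality $4\pi A \le \ell^2$, one can discretize the length and area integrals for a family of ellipses. This is a rough sketch under assumed names and an assumed midpoint-rule resolution `steps`; the parametrization $(a\cos s, b\sin s)$ is not by arclength, which is harmless because both length and area are independent of the parametrization.

```python
import math

def length_and_area(x, y, dx, dy, steps=20000):
    """Midpoint-rule length and enclosed area of the closed curve
    s -> (x(s), y(s)), s in [0, 2pi], given the derivatives dx, dy."""
    h = 2 * math.pi / steps
    length = 0.0
    area = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        length += math.hypot(dx(s), dy(s)) * h
        area += 0.5 * (x(s) * dy(s) - y(s) * dx(s)) * h
    return length, area

def isoperimetric_ratio(a, b):
    """4*pi*A / l^2 for the ellipse (a cos s, b sin s); equals 1 for a circle."""
    length, area = length_and_area(
        lambda s: a * math.cos(s), lambda s: b * math.sin(s),
        lambda s: -a * math.sin(s), lambda s: b * math.cos(s))
    return 4 * math.pi * area / length ** 2

if __name__ == "__main__":
    for a, b in ((1, 1), (2, 1), (5, 1)):
        # the ratio should never exceed 1, and should drop as the ellipse flattens
        print(a, b, isoperimetric_ratio(a, b))
```

The ratio is 1 for the circle and drops below 1 as the ellipse becomes more eccentric, in line with the equality case treated next.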
Part II (the case $A = \pi$). Why can $A = \pi$ only in the case of a circle? We make the following observations:
i. First: equality in (6), (7) could only occur if we have no terms for $|n| \ge 2$, i.e.,
\[ x(s) = a_{-1} e^{-is} + a_0 + a_1 e^{is} \quad \text{and} \quad y(s) = b_{-1} e^{-is} + b_0 + b_1 e^{is}. \]
ii. Further, since $a_{-1} = \overline{a_1}$ and $b_{-1} = \overline{b_1}$, (2) implies that
\[ 2\big( |a_1|^2 + |b_1|^2 \big) = 1. \]
iii. Now, by (5), equality can hold only if
\[ \big| a_1 \overline{b_1} - \overline{a_1} b_1 \big| = 2|a_1|\,|b_1| = |a_1|^2 + |b_1|^2 = \frac{1}{2}, \]
and thus, since $(|a_1| - |b_1|)^2 = 0$ implies $|a_1| = |b_1|$, we get $|a_1| = |b_1| = 1/2$. So
\[ a_1 = \frac{1}{2} e^{i\alpha} \quad \text{and} \quad b_1 = \frac{1}{2} e^{i\beta} \]
for some $\alpha, \beta$.
iv. In this notation, $1 = 2\big| a_1 \overline{b_1} - \overline{a_1} b_1 \big|$ (from (iii) above) is equivalent to
\[ \Big| \frac{1}{2}\big[ e^{i(\alpha - \beta)} - e^{-i(\alpha - \beta)} \big] \Big| = 1, \quad \text{i.e., } |\sin(\alpha - \beta)| = 1, \]
so $\alpha - \beta = k\pi/2$ for some odd $k \in \mathbb{Z}$. So
\[ x(s) = \frac{1}{2} e^{-i\alpha} e^{-is} + a_0 + \frac{1}{2} e^{i\alpha} e^{is} = a_0 + \cos(\alpha + s) \]
and
\[ y(s) = \frac{1}{2} e^{-i\beta} e^{-is} + b_0 + \frac{1}{2} e^{i\beta} e^{is} = b_0 + \cos(\beta + s) = b_0 + \cos\Big( \alpha - \frac{k\pi}{2} + s \Big) = b_0 \pm \sin(\alpha + s), \]
i.e., $\Gamma$ is a circle, as desired.

LECTURE 13: WEYL'S EQUIDISTRIBUTION THEOREM

(Margin: Another easily-attained result.)

1. Number theory: Weyl's equidistribution theorem

1.1. Basic knowledge.

Definition 1.1. Let $x$ be a real number. Then
(1) Let $[x]$, the integer part of $x$, denote the greatest integer less than or equal to $x$.
(2) Let $\langle x \rangle := x - [x]$ denote the fractional part of $x$.
(3) Given $x, y \in \mathbb{R}$, if $x - y \in \mathbb{Z}$ we say $x \equiv y \mod \mathbb{Z}$ or $x \equiv y \mod 1$.
Of course $x \equiv y \mod \mathbb{Z}$ iff $\langle x \rangle = \langle y \rangle$.

1.2. Main theorem. The problem: consider the collection $\{\langle n\gamma \rangle : n \in \mathbb{N}\}$: is it dense in $[0, 1)$? Kronecker's theorem: yes, if $\gamma \notin \mathbb{Q}$.

Definition 1.2 (Definition of equidistributed sequence). Let $\{\xi_n\}$ be a sequence in $[0, 1)$. If, for every interval $(a, b) \subset [0, 1)$,
\[ \lim_{N \to \infty} \frac{\#\{ n \in \{1, 2, \ldots, N\} : \xi_n \in (a, b) \}}{N} = b - a, \]
then we call $\{\xi_n\}$ an equidistributed sequence. (The quotient is the probability that an element of the first $N$ points of the sequence lies in the interval.)

Theorem 1.3 (Weyl's Equidistribution theorem). If $\gamma \notin \mathbb{Q}$ then $\{\langle n\gamma \rangle\}$ is equidistributed in $[0, 1)$.
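The statement is easy to probe empirically. Below is a small sketch that counts how often $\langle n\gamma \rangle$ falls in an interval $(a, b)$; the function name, the interval, and the choice $\gamma = \sqrt{2}$ are assumptions of the example, and floating-point arithmetic only approximates an irrational $\gamma$.

```python
import math

def fraction_in_interval(gamma, a, b, N):
    """Proportion of n in {1, ..., N} whose fractional part <n*gamma> lies in (a, b)."""
    hits = sum(1 for n in range(1, N + 1) if a < (n * gamma) % 1.0 < b)
    return hits / N

if __name__ == "__main__":
    # Irrational gamma: the proportion should approach b - a = 0.3.
    print(fraction_in_interval(math.sqrt(2), 0.2, 0.5, 100000))
    # Rational gamma = 1/2: the fractional parts only take the values 0 and 1/2.
    print(fraction_in_interval(0.5, 0.2, 0.5, 1000))
```

For rational $\gamma$ the sequence of fractional parts is periodic and visits only finitely many points, so the proportion has no reason to match $b - a$; this is why irrationality is needed.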
Corollary 1.4 (Kronecker's theorem). If $\gamma \notin \mathbb{Q}$ then $\{\langle n\gamma \rangle\}$ is dense in $[0, 1)$.

1.3. Translation from number theory to analysis. We rephrase the main theorem as an analysis question. Let $\chi_{(a,b)}$ denote the characteristic function of $(a, b) \subset [0, 1)$, extended as a 1-periodic function. Then we observe that
\[ \#\{ 1 \le n \le N : \langle n\gamma \rangle \in (a, b) \} = \sum_{n=1}^{N} \chi_{(a,b)}(n\gamma). \]
So the theorem can be rephrased as follows:

Theorem 1.5. Given any $\gamma \notin \mathbb{Q}$, and any $(a, b) \subset [0, 1)$,
\[ \lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} \chi_{(a,b)}(n\gamma) = \int_0^1 \chi_{(a,b)}(x)\,dx. \]

1.4. Main lemma. We first show this formula for continuous functions.

Lemma 1.6 (Main lemma). Let $f$ be continuous and periodic of period 1. If $\gamma \notin \mathbb{Q}$, then
\[ \frac{1}{N}\sum_{n=1}^{N} f(n\gamma) \to \int_0^1 f(x)\,dx. \]
Technique: the statement is trivial for trigonometric polynomials, which are $L^\infty$ dense in the continuous periodic functions.

Proof. (trivial)
i. Case: $f \in \{1, e^{2\pi i x}, \ldots, e^{2\pi i k x}, \ldots\}$. The case $f = 1$ is obvious. Otherwise, $f(x) = e^{2\pi i k x}$ with $k \ne 0$, the integral $\int_0^1 f = 0$, so we need to see that the average tends to 0. But, since $e^{2\pi i k\gamma} \ne 1$ ($\gamma$ is irrational), summing the geometric series gives
\[ \frac{1}{N}\sum_{n=1}^{N} f(n\gamma) := \frac{1}{N}\sum_{n=1}^{N} e^{2\pi i k n \gamma} = \frac{e^{2\pi i k\gamma}}{N}\,\frac{1 - e^{2\pi i k N \gamma}}{1 - e^{2\pi i k\gamma}}, \]
which goes to 0 as $N \to \infty$.
ii. Case: trigonometric polynomials. The problem is linear, so the lemma holds for linear combinations of exponentials.
iii. Case: $f$ a continuous periodic function. We know that there exists a trigonometric polynomial $P$ such that $\|f - P\|_\infty < \epsilon/3$ (by the uniform convergence of the Cesaro means for continuous functions, i.e., the goodness of the Fejer kernel; here's where the Fourier analysis enters) and, by step ii, that the lemma holds for $P$. Then
\[ \left| \frac{1}{N}\sum_{n=1}^{N} f(n\gamma) - \int_0^1 f(x)\,dx \right| \le \frac{1}{N}\sum_{n=1}^{N} |f(n\gamma) - P(n\gamma)| + \left| \frac{1}{N}\sum_{n=1}^{N} P(n\gamma) - \int_0^1 P(x)\,dx \right| + \int_0^1 |P(x) - f(x)|\,dx. \]
The first and last are average differences between $f$ and $P$ (and thus are controlled by $\epsilon/3$); the middle term is smaller than $\epsilon/3$ for sufficiently large $N$ by part ii.

1.5. Proof of main theorem.

Proof of Weyl's equidistribution theorem. For each $\epsilon > 0$, let $f_\epsilon^+$ and $f_\epsilon^-$ be continuous, 1-periodic functions which
(i) approximate $\chi_{(a,b)}$ from above and below, i.e., $f_\epsilon^- \le \chi_{(a,b)} \le f_\epsilon^+$,
(ii) are bounded by 1,
(iii) agree with $\chi_{(a,b)}$ except on intervals of total length $2\epsilon$.
(Draw picture.) Then the averages of $f_\epsilon^-$, $\chi_{(a,b)}$, and $f_\epsilon^+$ have the following relation:
\[ \frac{1}{N}\sum_{n=1}^{N} f_\epsilon^-(n\gamma) \le \frac{1}{N}\sum_{n=1}^{N} \chi_{(a,b)}(n\gamma) \le \frac{1}{N}\sum_{n=1}^{N} f_\epsilon^+(n\gamma). \]
Now, the lemma implies (taking the lim inf and lim sup as $N \to \infty$)
\[ \int_0^1 f_\epsilon^-(x)\,dx \le \liminf_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} \chi_{(a,b)}(n\gamma) \le \limsup_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} \chi_{(a,b)}(n\gamma) \le \int_0^1 f_\epsilon^+(x)\,dx. \]
Since
\[ b - a - 2\epsilon \le \int_0^1 f_\epsilon^-(x)\,dx \quad \text{and} \quad \int_0^1 f_\epsilon^+(x)\,dx \le b - a + 2\epsilon, \]
and the above is true for all $\epsilon$, we see that the desired limit exists, and equals the desired $b - a$.

1.6. Observations.

Corollary 1.7. In fact, the main lemma holds even if $f$ is merely Riemann integrable.

Proof. Given $\epsilon > 0$, choose step functions $s^- \le f \le s^+$ with $\int_0^1 (s^+ - s^-)\,dx < \epsilon$ (see the proof of the $L^1$ approximation lemma). Since each of $s^\pm$ is a finite linear combination of characteristic functions of intervals, Theorem 1.5 gives $\frac{1}{N}\sum_{n=1}^{N} s^\pm(n\gamma) \to \int_0^1 s^\pm(x)\,dx$. Arguing as in the proof of the main theorem,
\[ \int_0^1 f(x)\,dx - \epsilon \le \int_0^1 s^-(x)\,dx \le \liminf_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} f(n\gamma) \le \limsup_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} f(n\gamma) \le \int_0^1 s^+(x)\,dx \le \int_0^1 f(x)\,dx + \epsilon, \]
and $\epsilon$ was arbitrary.

Remark. Connection with dynamical systems: the system is "ergodic"; that is, for all irrational $\gamma$, denoting $\rho(\theta) = \theta + 2\pi\gamma \mod 2\pi$, the "time average"
\[ \lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} f(\rho^n(\theta)) \]
exists for each $\theta$, and equals the "space average"
\[ \frac{1}{2\pi}\int_0^{2\pi} f(\theta)\,d\theta. \]

Remark. Notice that along the way, we proved the "if" direction of the following statement:

Theorem 1.8.
Weyl's criterion: A sequence of real numbers $\{\xi_n\}$ in $(0,1)$ is equidistributed if and only if for all $k \in \mathbb{Z} \setminus \{0\}$,
\[ \frac{1}{N} \sum_{n=1}^{N} e^{2\pi i k \xi_n} \to 0 \quad \text{as } N \to \infty. \]

LECTURE 14: A CONTINUOUS, NOWHERE DIFFERENTIABLE FUNCTION

1. Cool examples

i. Weierstrass's example: Let $a \in \mathbb{N}$, $a > 1$; $b \in (0,1)$. Then if $ab > 1 + \frac{3\pi}{2}$,
\[ W(x) := \sum_{n=1}^{\infty} b^n \cos(a^n x) \]
is a nowhere differentiable function (!).
ii. Riemann's near-example:
\[ R(x) := \sum_{n=1}^{\infty} \frac{\sin(n^2 x)}{n^2} \]
is differentiable at, and only at, the points $\pi p/q$ where $p, q$ are odd integers.

Theorem 1.1. $f_\alpha(x) = \sum_{n=0}^{\infty} 2^{-n\alpha} e^{i 2^n x}$, for $\alpha \in (0,1)$, is a continuous, nowhere differentiable function.

(Margin: using a lacunary Fourier series to create a continuous nowhere differentiable function.)

2. The main idea: delayed means

Recall that we have a couple of ways of summing the Fourier series: first, the standard way, taking $S_N(g) = g * D_N$ with $D_N$ the Dirichlet kernel; second, the Cesàro way, taking $\sigma_N(g) = g * F_N$, with $F_N$ the Fejér kernel.

(Margin: how things look on the Fourier coefficient side.) If we look at the partial sum $S_N$ on the Fourier coefficient side, we see that
\[ \widehat{S_N(g)}(n) = \hat g(n)\,\widehat{D_N}(n), \quad \text{where} \quad \widehat{D_N}(n) = \begin{cases} 1 & \text{if } |n| \le N, \\ 0 & \text{if } |n| > N. \end{cases} \]
(Partial sums: truncation. Draw this!) I.e., we "chop off" the Fourier series for the terms with indices $|n| > N$.

Doing the same thing for the Cesàro sums, we see that on the Fourier coefficient side one weights them as follows (Cesàro sums: weighted truncation):
\[ \widehat{\sigma_N(g)}(n) := \widehat{g * F_N}(n) = \hat g(n)\,\frac{1}{N} \left[ \widehat{D_0}(n) + \widehat{D_1}(n) + \cdots + \widehat{D_{N-1}}(n) \right] = \begin{cases} \hat g(n)\,\frac{1}{N}\,[N - |n|] & \text{for } |n| \le N, \\ 0 & \text{for } |n| > N. \end{cases} \]
Equivalently, writing $a_n = \hat g(n)$,
\[ \sigma_N(g)(x) := \frac{S_0(g)(x) + S_1(g)(x) + \cdots + S_{N-1}(g)(x)}{N} = \frac{1}{N} \sum_{\ell=0}^{N-1} \sum_{|k| \le \ell} a_k e^{ikx} = \frac{1}{N} \sum_{|n| \le N} (N - |n|)\,a_n e^{inx} = \sum_{|n| \le N} \left( 1 - \frac{|n|}{N} \right) a_n e^{inx}. \]

Definition 2.1. We define the delayed means $\Delta_N(g)$ of the Fourier series of $g$ by
\[ \Delta_N(g) := 2\sigma_{2N}(g) - \sigma_N(g) = 2\,g * F_{2N} - g * F_N = g * (2F_{2N} - F_N). \]
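As a sanity check of Definition 2.1 on the Fourier coefficient side, the following Python sketch computes the weight that $\Delta_N$ places on frequency $n$, using the Cesàro weight $(1 - |n|/N)$ for $|n| \le N$ recalled above. The helper names `fejer_multiplier` and `delayed_multiplier` are illustrative choices, not notation from these notes, and exact rational arithmetic is used only to avoid rounding.

```python
from fractions import Fraction


def fejer_multiplier(n: int, N: int) -> Fraction:
    """Weight that sigma_N puts on frequency n: (1 - |n|/N) if |n| <= N, else 0."""
    return Fraction(N - abs(n), N) if abs(n) <= N else Fraction(0)


def delayed_multiplier(n: int, N: int) -> Fraction:
    """Weight that Delta_N = 2*sigma_{2N} - sigma_N puts on frequency n."""
    return 2 * fejer_multiplier(n, 2 * N) - fejer_multiplier(n, N)


# Hypothetical example value; any positive integer N works.
N = 4
profile = [delayed_multiplier(n, N) for n in range(0, 2 * N + 2)]
print([str(w) for w in profile])
# prints ['1', '1', '1', '1', '1', '3/4', '1/2', '1/4', '0', '0']
```

The weights equal 1 up to $|n| = N$, decay linearly between $N$ and $2N$, and vanish beyond $2N$, which is the "big triangle minus little triangle" picture.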
(Margin: delayed means: big triangle minus little triangle = trapezoid.)

On the Fourier coefficient side, it is easy to see that
\[ \widehat{\Delta_N(g)}(n) = \begin{cases} \hat g(n) & \text{if } |n| \le N, \\ 2\left(1 - \frac{|n|}{2N}\right) \hat g(n) & \text{if } N \le |n| \le 2N, \\ 0 & \text{if } |n| > 2N. \end{cases} \]

3. Getting the contradiction

Let's recall our function:
\[ f_\alpha(x) = \sum_{n=0}^{\infty} 2^{-n\alpha} e^{i 2^n x}, \quad \text{with } \alpha \in (0,1). \]

Observation: For fixed $N$, if we choose the largest $k$ for which $2^k \le N$, then
\[ \Delta_{2^k}(f_\alpha) = S_N(f_\alpha). \]
Even though $2^k \le N$, there are no frequencies between $2^k$ and $N$ anyway, because of the lacunary nature of the series (the frequencies are of the form $2^n$, $n \in \mathbb{N}$), and the next frequency, $2^{k+1}$, receives weight 0 from $\Delta_{2^k}$. We don't miss anything, because of the lacuna.

Dumb observation: If $2N = 2^n$, then
\[ \Delta_{2N}(f_\alpha) - \Delta_N(f_\alpha) = 2^{-n\alpha} e^{i 2^n x}. \]

Question. Why not do this using the partial sum operator?

Our contradiction will be obtained as follows. By the above dumb observation, for any point $x_0$,
\[ |\Delta_{2N}(f_\alpha)'(x_0) - \Delta_N(f_\alpha)'(x_0)| = |i\,2^n\,2^{-n\alpha} e^{i 2^n x_0}| \qquad (1) \]
\[ = 2^{n(1-\alpha)} = (2N)^{1-\alpha}. \qquad (2) \]
However, we shall prove the following

Lemma 3.1. Let $g$ be continuous. If $g$ is differentiable at $x_0$, then $\sigma_N(g)'(x_0) = O(\log N)$.

Since $\Delta_N(g)'(x_0) = 2\sigma_{2N}(g)'(x_0) - \sigma_N(g)'(x_0) = O(\log N)$, if $f_\alpha$ were differentiable at some point $x_0$ we would have a contradiction with (2) above, because $(2N)^{1-\alpha}$ grows faster than $\log N$. (We can catch the top term in an obvious way.)

Proof of Lemma. Recall: $\sigma_N(g)(x_0) := (F_N * g)(x_0)$ (with $F_N$ the Fejér kernel), so, taking the derivative inside and then changing variables,
\[ \sigma_N(g)'(x_0) = \int_{-\pi}^{\pi} F_N'(x_0 - t)\,g(t)\,dt = \int_{-\pi}^{\pi} F_N'(t)\,g(x_0 - t)\,dt. \]
Standard trick: use cancellation to slide in a constant to take advantage of smoothness. Using the fact that $\int_{-\pi}^{\pi} F_N'(t)\,dt = 0$, we see that we can slide in a constant to the above:
\[ \sigma_N(g)'(x_0) = \int_{-\pi}^{\pi} F_N'(t)\,[g(x_0 - t) - g(x_0)]\,dt, \]
and thus, using the differentiability of $g$ at $x_0$,
\[ |\sigma_N(g)'(x_0)| \le C \int_{-\pi}^{\pi} |F_N'(t)|\,|t|\,dt. \]

FACTS: Useful estimates on $F_N'$:
i. $|F_N'(t)| \le A N^2$
ii. $|F_N'(t)| \le \dfrac{A}{|t|^2}$

Break up the integral. Putting it all together, we see
\[ |\sigma_N(g)'(x_0)| \le C \int_{-\pi}^{\pi} |F_N'(t)|\,|t|\,dt \le C \int_{|t| \ge \frac{1}{N}} |F_N'(t)|\,|t|\,dt + C \int_{|t| \le \frac{1}{N}} |F_N'(t)|\,|t|\,dt \le C \int_{|t| \ge \frac{1}{N}} \frac{A}{|t|^2}\,|t|\,dt + C \int_{|t| \le \frac{1}{N}} A N^2\,\frac{1}{N}\,dt. \]
The first term is $O(\log N)$, and the second $O(1)$.

Proof of the second fact. We want to show that $|F_N'(t)| \le \dfrac{A}{|t|^2}$. Well,
\[ F_N(t) = \frac{1}{N}\,\frac{\sin^2(Nt/2)}{\sin^2(t/2)}, \]
so
\[ F_N'(t) = \frac{\sin(Nt/2)\cos(Nt/2)}{\sin^2(t/2)} - \frac{1}{N}\,\frac{\cos(t/2)\sin^2(Nt/2)}{\sin^3(t/2)} = \frac{1}{\sin^2(t/2)} \left[ \sin(Nt/2)\cos(Nt/2) - \frac{1}{N}\,\frac{\cos(t/2)\sin^2(Nt/2)}{\sin(t/2)} \right]. \]
Now $|\sin(Nt/2)| \le CN|t|$ and $|\sin(t/2)| \ge c|t|$ for $|t| \le \pi$, so we get
\[ |F_N'(t)| \le \frac{CN|t|\,|\cos(Nt/2)|}{c^2 |t|^2} + \frac{1}{N}\,\frac{|\cos(t/2)|\,C^2 N^2 |t|^2}{c^3 |t|^3} \ \dots \]

Remark. Something seems wrong here: both terms on the right are of order $N/|t|$, which is not the claimed bound $A/|t|^2$.

LECTURE 15: FINALLY, THE FOURIER TRANSFORM!

Remark. Normally, one defines the Fourier transform on $L^2(\mathbb{R})$. However, we cannot define this space (without first defining the Lebesgue integral). Instead, we'll work on the Schwartz class $\mathcal{S}(\mathbb{R})$. When you are older (Book III) you'll see that this space is dense in $L^2(\mathbb{R})$, and that one can uniquely extend our Fourier transform to $L^2(\mathbb{R})$. By the way, restricting ourselves to the Schwartz space "is a device that allows us to come quickly to the main conclusions, formulated in a direct and transparent fashion" (but in some sense oversimplifies the matter).

1. Basic definitions

Definition 1.1 (improper integrals). We define (if the limit exists)
\[ \int_{-\infty}^{\infty} f(x)\,dx = \lim_{N \to \infty} \int_{-N}^{N} f(x)\,dx. \]

Definition 1.2. Let $f : \mathbb{R} \to \mathbb{C}$ be continuous. If there exists $A > 0$ such that
\[ |f(x)| \le \frac{A}{1 + x^2} \]
for all $x \in \mathbb{R}$, then we say $f$ is of moderate decrease, and denote by $M(\mathbb{R})$ the (vector) space of such functions.

(Margin: vector space $M(\mathbb{R})$ of functions of moderate decrease. Could use exponent $1 + \varepsilon$ instead of 2.)

Remark. It is easy to see that the improper integral converges for $f \in M(\mathbb{R})$. (Exercise: show that then $\{I_N := \int_{-N}^{N} f\}$ forms a Cauchy sequence.
Well, given $\varepsilon > 0$, we see that for $M > N$,
\[ |I_M - I_N| \le \left| \int_{N \le |x| \le M} f \right| \le A \int_{N \le |x| \le M} \frac{1}{x^2}\,dx = 2A \left( \frac{1}{N} - \frac{1}{M} \right) \le \frac{2A}{N}, \]
which we can make smaller than $\varepsilon$ as long as $N$ is sufficiently large.)

Proposition 1.3 (Properties of the improper integral). Let $f, g \in M(\mathbb{R})$, and $\alpha, \beta \in \mathbb{C}$. Then
i. (Linearity)
\[ \int_{-\infty}^{\infty} [\alpha f + \beta g] = \alpha \int_{-\infty}^{\infty} f + \beta \int_{-\infty}^{\infty} g. \]
ii. (Translation invariance)
\[ \int_{-\infty}^{\infty} f(x - \alpha)\,dx = \int_{-\infty}^{\infty} f(x)\,dx. \]
iii. (Scaling under dilations) Given any $\delta > 0$,
\[ \delta \int_{-\infty}^{\infty} f(\delta x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx. \]
(Margin: so the $L^1$ norm doesn't change under dilation.)
iv. (Continuity)
\[ \lim_{h \to 0} \int_{-\infty}^{\infty} |f(x - h) - f(x)|\,dx = 0. \]

(Margin: Stein will start to leave out details now... but this is nothing.)

Proof. The proofs are straightforward.

Proof of ii. It suffices to show that
\[ \lim_{N \to \infty} \left[ \int_{-N}^{N} f(x - \alpha)\,dx - \int_{-N}^{N} f(x)\,dx \right] = 0. \]
Well, via change of variables,
\[ \int_{-N}^{N} f(x - \alpha)\,dx - \int_{-N}^{N} f(x)\,dx = \int_{-N-\alpha}^{N-\alpha} f(x)\,dx - \int_{-N}^{N} f(x)\,dx = \int_{-N-\alpha}^{-N} f - \int_{N-\alpha}^{N} f. \]
Now, take $N > 2|\alpha|$. Then
\[ \left| \int_{-N-\alpha}^{-N} f \right| + \left| \int_{N-\alpha}^{N} f \right| \le \int_{\frac{1}{2}N \le |x| \le 2N} \frac{A}{x^2}\,dx = \frac{3A}{N}, \]
which goes to 0 as $N \to \infty$.

Proof of iv. Given $\varepsilon > 0$, we want to show there exists $H > 0$ such that if $|h| < H$, then
\[ \lim_{N \to \infty} \int_{-N}^{N} |f(x - h) - f(x)|\,dx < \varepsilon. \]
WLOG take $|h| \le 1$ and $N_0$ large enough that
\[ \int_{|x| \ge N_0} |f| \le \frac{\varepsilon}{4} \quad \text{and} \quad \int_{|x| \ge N_0} |f(x - h)|\,dx \le \frac{\varepsilon}{4}. \]
By the uniform continuity of $f$, we can choose $H$ such that for $|h| < H$,
\[ \sup_{|x| \le N_0} |f(x - h) - f(x)| < \frac{\varepsilon}{4N_0}. \]
Then for any $N > N_0$, we see that
\[ \int_{-N}^{N} |f(x - h) - f(x)|\,dx \le \int_{-N_0}^{N_0} |f(x - h) - f(x)|\,dx + \int_{|x| \ge N_0} |f| + \int_{|x| \ge N_0} |f(x - h)|\,dx \le \int_{-N_0}^{N_0} |f(x - h) - f(x)|\,dx + \frac{\varepsilon}{2} \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Thus
\[ \int_{-\infty}^{\infty} |f(x - h) - f(x)|\,dx \le \varepsilon \]
for all $|h| < H$.

(Margin: the point is that $f \in M(\mathbb{R})$ implies $f$ lives inside a lemon.)

2. The Fourier Transform

Definition 2.1. Given $f \in M(\mathbb{R})$, we define the Fourier transform $\hat f$ of $f$ by
\[ \hat f(\xi) := \int_{-\infty}^{\infty} f(x)\,e^{-2\pi i x \xi}\,dx. \]

Remark.
It is easy to see that $\hat f$ is a bounded, continuous function that vanishes at $\infty$. However, $\hat f$ is not necessarily in $M(\mathbb{R})$ itself. This lack motivates the definition of the following class of functions.

Definition 2.2 (rapidly decreasing function). We call a function $f$ rapidly decreasing if for every $k \ge 0$, we have
\[ \sup_{x \in \mathbb{R}} |x|^k |f(x)| < \infty, \]
i.e., the function shrinks faster than the reciprocal of any polynomial function.

Definition 2.3 ($\mathcal{S}(\mathbb{R})$). Let $f$ be an infinitely differentiable ($C^\infty$) function. If $f$ and all of its derivatives are rapidly decreasing, we call $f$ a Schwartz class function and write $f \in \mathcal{S}(\mathbb{R})$.

Remark. Observe that $\mathcal{S}(\mathbb{R})$ is a vector space (over $\mathbb{C}$) and is closed under both differentiation and multiplication by polynomials.

Principle: decay of the Fourier transform corresponds to the smoothness of $f$ (connects with 2.4 in chapter II).

3. Fourier transform on $\mathcal{S}(\mathbb{R})$

Definition 3.1. For $f \in \mathcal{S}(\mathbb{R})$ we define the Fourier transform of $f$ by
\[ \hat f(\xi) = \int_{-\infty}^{\infty} f(x)\,e^{-2\pi i x \xi}\,dx. \]

Proposition 3.2. Let $f \in \mathcal{S}(\mathbb{R})$, $h \in \mathbb{R}$, and $\delta > 0$. Then (the Fourier transform maps the following):
i. $f(x + h) \longrightarrow \hat f(\xi)\,e^{2\pi i h \xi}$ (translation becomes modulation)
ii. $f(x)\,e^{-2\pi i x h} \longrightarrow \hat f(\xi + h)$
iii. $f(\delta x) \longrightarrow \delta^{-1} \hat f(\delta^{-1} \xi)$ (dilation)
iv. $f'(x) \longrightarrow 2\pi i \xi\,\hat f(\xi)$ (differentiation becomes polynomial multiplication)
v. $-2\pi i x f(x) \longrightarrow \frac{d}{d\xi} \hat f(\xi)$

Proof. The only one of interest is (v). We want to show that
\[ \frac{d}{d\xi} \hat f(\xi) + \widehat{(2\pi i x f)}(\xi) = 0, \]
i.e., that the expression below tends to 0 as $h \to 0$. Fix $\varepsilon > 0$. Consider:
\[ \frac{\hat f(\xi + h) - \hat f(\xi)}{h} + \widehat{(2\pi i x f)}(\xi) = \int_{\mathbb{R}} f(x)\,\frac{1}{h} \left[ e^{-2\pi i x (\xi + h)} - e^{-2\pi i x \xi} \right] dx + \int_{\mathbb{R}} 2\pi i x f(x)\,e^{-2\pi i x \xi}\,dx = \int_{\mathbb{R}} f(x)\,e^{-2\pi i x \xi} \left[ \frac{e^{-2\pi i x h} - 1}{h} + 2\pi i x \right] dx \]
...(to be continued).

LECTURE 16: BASIC PROPERTIES OF THE FOURIER TRANSFORM

Question (Badgering). What's the difference between the theory we're developing now and the theory we developed before? What was the "best" result we had for convergence of Fourier series? Do we have anything parallel here?

1. Fourier transform on $\mathcal{S}(\mathbb{R})$

Recall the Schwartz class $\mathcal{S}(\mathbb{R})$.
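Notice that it is closed under differentiation and under multiplication by polynomials.

Definition 1.1. For $f \in \mathcal{S}(\mathbb{R})$ we define the Fourier transform of $f$ by
\[ \hat f(\xi) = \int_{-\infty}^{\infty} f(x)\,e^{-2\pi i x \xi}\,dx. \]
(Margin: definition of the Fourier transform, analogous to the definition of Fourier coefficients.)

Proposition 1.2. Let $f \in \mathcal{S}(\mathbb{R})$, $h \in \mathbb{R}$, and $\delta > 0$. Then (the Fourier transform maps the following):
i. $f(x + h) \longrightarrow \hat f(\xi)\,e^{2\pi i h \xi}$ (translation becomes modulation)
ii. $f(x)\,e^{-2\pi i x h} \longrightarrow \hat f(\xi + h)$
iii. $f(\delta x) \longrightarrow \delta^{-1} \hat f(\delta^{-1} \xi)$ (dilation)
iv. $f'(x) \longrightarrow 2\pi i \xi\,\hat f(\xi)$ (differentiation becomes polynomial multiplication)
v. $-2\pi i x f(x) \longrightarrow \frac{d}{d\xi} \hat f(\xi)$
(Margin: interactions with the Fourier transform.)

Proof. The main proof of interest is (v). We want to show that
\[ \frac{d}{d\xi} \hat f(\xi) + \widehat{(2\pi i x f)}(\xi) = 0, \]
i.e., that the expression below tends to 0 as $h \to 0$. Fix $\varepsilon > 0$. Consider:
\[ \frac{\hat f(\xi + h) - \hat f(\xi)}{h} + \widehat{(2\pi i x f)}(\xi) = \int_{\mathbb{R}} f(x)\,\frac{1}{h} \left[ e^{-2\pi i x (\xi + h)} - e^{-2\pi i x \xi} \right] dx + \int_{\mathbb{R}} 2\pi i x f(x)\,e^{-2\pi i x \xi}\,dx = \int_{\mathbb{R}} f(x)\,e^{-2\pi i x \xi} \left[ \frac{e^{-2\pi i x h} - 1}{h} + 2\pi i x \right] dx. \]
Now, $f(x)$ and $x f(x)$ are both rapidly decreasing, so there exists $N \in \mathbb{N}$ such that
\[ \int_{|x| \ge N} |f| < \varepsilon \quad \text{and} \quad \int_{|x| \ge N} |x|\,|f(x)|\,dx < \varepsilon \]
(margin: we do need the second estimate), and, by L'Hôpital's Rule, for sufficiently small $h$ we have, for the compact set $|x| \le N$,
\[ \left| \frac{e^{-2\pi i x h} - 1}{h} + 2\pi i x \right| \le \frac{\varepsilon}{N}. \]
(Margin: for each fixed $x_0$, we can find an $h$. By continuity, that $h$ will work in a small neighborhood of $x_0$. Cover $[-N, N]$ with such neighborhoods; choose the minimum $h$, which is independent of $x$.)

Outside of $|x| \le N$, we use $|e^{ix} - 1| = 2|\sin\frac{x}{2}|$ to get the bound
\[ \left| \frac{e^{-2\pi i x h} - 1}{h} + 2\pi i x \right| \le \frac{2|\sin(\pi x h)|}{|h|} + 2\pi |x| \le 2\pi |x| + 2\pi |x| = 4\pi |x|, \]
since $|\sin u| \le |u|$. Thus we have
\[ \left| \frac{\hat f(\xi + h) - \hat f(\xi)}{h} + \widehat{(2\pi i x f)}(\xi) \right| \le \int_{-N}^{N} |f(x)| \left| \frac{e^{-2\pi i x h} - 1}{h} + 2\pi i x \right| dx + \int_{|x| \ge N} |f(x)| \left| \frac{e^{-2\pi i x h} - 1}{h} + 2\pi i x \right| dx \le C\varepsilon. \]

Corollary 1.3. $f \in \mathcal{S}(\mathbb{R})$ implies $\hat f \in \mathcal{S}(\mathbb{R})$.

Proof. Recall that $\mathcal{S}(\mathbb{R})$ is closed under differentiation and multiplication by polynomials; thus if $f \in \mathcal{S}(\mathbb{R})$, then so is
\[ \frac{1}{(2\pi i)^k} \left( \frac{d}{dx} \right)^k \left[ (-2\pi i x)^\ell f(x) \right]. \]
Thus the Fourier transform of the above, namely (by the previous proposition)
\[ \xi^k \left( \frac{d}{d\xi} \right)^\ell \hat f(\xi), \]
is bounded (since the Fourier transform of any Schwartz class function is bounded). Thus $\hat f$ and all its derivatives are rapidly decreasing, as desired. (Margin: an immediate consequence of the above and the closure properties of $\mathcal{S}(\mathbb{R})$.)

2. Creating an approximation of the identity using dilated Gaussians

Definition 2.1. We call $f(x) = e^{-x^2}$ the Gaussian.

Remark. The Gaussian is a Schwartz class function; in fact $e^{-ax^2}$ is in $\mathcal{S}(\mathbb{R})$ for all $a > 0$. The choice $a = \pi$ is special because
\[ \left( \int_{-\infty}^{\infty} e^{-\pi x^2}\,dx \right)^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\pi (x^2 + y^2)}\,dx\,dy = \int_0^{2\pi} \int_0^{\infty} e^{-\pi r^2}\,r\,dr\,d\theta = \int_0^{\infty} 2\pi r\,e^{-\pi r^2}\,dr = \left[ -e^{-\pi r^2} \right]_0^{\infty} = 1. \]
(You probably saw this in multivariable calculus.)

Theorem 2.2. Let $f(x) = e^{-\pi x^2}$. Then $\hat f = f$.

Proof. We shall show that $\hat f$ satisfies a certain initial value problem. Let
\[ F(\xi) := \hat f(\xi) = \int_{-\infty}^{\infty} e^{-\pi x^2}\,e^{-2\pi i x \xi}\,dx. \]
Then by property (v) in Proposition 1.2, we have
\[ F'(\xi) = \int_{-\infty}^{\infty} [-2\pi i x f(x)]\,e^{-2\pi i x \xi}\,dx = i \int_{-\infty}^{\infty} f'(x)\,e^{-2\pi i x \xi}\,dx \quad (\text{since } f'(x) = -2\pi x f(x)) \]
\[ = i \cdot 2\pi i \xi\,\hat f(\xi) \quad (\text{the Fourier transform of } f'(x) \text{ is } 2\pi i \xi \hat f(\xi)) \]
\[ = -2\pi \xi F(\xi). \]
Elementary differential equations (separation of variables) implies that $F(\xi) = C e^{-\pi \xi^2}$, and, plugging in $\xi = 0$, we see $C = F(0) = \hat f(0) = \int_{-\infty}^{\infty} e^{-\pi x^2}\,dx = 1$ by the previous calculation.

LECTURE 17: FOURIER INVERSION

1. Creating an approximation of the identity using dilated Gaussians

Recall:

Definition 1.1. We call $f(x) = e^{-x^2}$ the Gaussian.

Remark. The Gaussian is a Schwartz class function; in fact $e^{-ax^2}$ is in $\mathcal{S}(\mathbb{R})$ for all $a > 0$. The choice $a = \pi$ is special because
\[ \left( \int_{-\infty}^{\infty} e^{-\pi x^2}\,dx \right)^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\pi (x^2 + y^2)}\,dx\,dy = \int_0^{2\pi} \int_0^{\infty} e^{-\pi r^2}\,r\,dr\,d\theta = \int_0^{\infty} 2\pi r\,e^{-\pi r^2}\,dr = \left[ -e^{-\pi r^2} \right]_0^{\infty} = 1. \]
(You probably saw this in multivariable calculus.)

Theorem 1.2. Let $f(x) = e^{-\pi x^2}$. Then $\hat f = f$.

Proof. We shall show that $\hat f$ satisfies a certain initial value problem.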
Let
\[ F(\xi) := \hat f(\xi) = \int_{-\infty}^{\infty} e^{-\pi x^2}\,e^{-2\pi i x \xi}\,dx. \]
Then by property (v) in Proposition 1.2 of Lecture 16, we have
\[ F'(\xi) = \int_{-\infty}^{\infty} [-2\pi i x f(x)]\,e^{-2\pi i x \xi}\,dx = i \int_{-\infty}^{\infty} f'(x)\,e^{-2\pi i x \xi}\,dx \quad (\text{since } f'(x) = -2\pi x f(x)) \]
\[ = i \cdot 2\pi i \xi\,\hat f(\xi) = -2\pi \xi F(\xi) \]
(the Fourier transform of $f'(x)$ is $2\pi i \xi \hat f(\xi)$). Elementary differential equations (separation of variables) implies that $F(\xi) = C e^{-\pi \xi^2}$, and, plugging in $\xi = 0$, we see $C = F(0) = \hat f(0) = \int_{-\infty}^{\infty} e^{-\pi x^2}\,dx = 1$ by the previous calculation.

1.1. Creating the approximation of the identity.

Now, for $\delta > 0$, let $K_\delta$ denote a dilated (by $\sqrt{\delta}$) Gaussian:
\[ K_\delta(x) = \delta^{-1/2} f(\delta^{-1/2} x) = \delta^{-1/2} e^{-\pi x^2 / \delta}. \]
(Margin: miraculously (?), dilations of this eigenfunction form an approximation of the identity.)
Then, by the interaction of dilation and Fourier transform (a function dilated by $\rho$ maps to its Fourier transform dilated by $\rho^{-1}$ and scaled by $\rho^{-1}$) we see

Corollary 1.3. $\widehat{K_\delta}(\xi) = \hat f(\delta^{1/2} \xi) = e^{-\pi \delta \xi^2}$.

Theorem 1.4. $\{K_\delta\}_{\delta > 0}$ is, as $\delta \downarrow 0$, an approximation of the identity.

Proof. Easy. $K_\delta$ is a dilation of the Gaussian (a positive function), which has $L^1$ norm of 1, so the first two conditions are satisfied. For the last condition,
\[ \int_{|x| > \eta} |K_\delta(x)|\,dx = \int_{|x| > \eta} \frac{1}{\sqrt{\delta}}\,e^{-\pi x^2 / \delta}\,dx = \int_{|y| > \eta / \sqrt{\delta}} e^{-\pi y^2}\,dy \quad (\text{let } x = \sqrt{\delta}\,y), \]
which obviously (the integrand is rapidly decreasing) goes to 0 as $\delta$ does.

Remark. Consider: what would be the Fourier transform of the Dirac delta function? As expected (?), on the Fourier transform side, we see that $\widehat{K_\delta}(\xi) = e^{-\pi \delta \xi^2}$, which converges pointwise to the constant ($\equiv 1$) function.

Definition 1.5 (convolution of $\mathcal{S}(\mathbb{R})$ functions). Given $f, g \in \mathcal{S}(\mathbb{R})$, we define the convolution $f * g$ by
\[ (f * g)(x) := \int_{-\infty}^{\infty} f(x - t)\,g(t)\,dt. \]

Corollary 1.6. Given any $f \in \mathcal{S}(\mathbb{R})$,
\[ \lim_{\delta \to 0^+} (f * K_\delta)(x) = f(x) \]
uniformly in $x$. (Margin: the approximation of the identity works as before.)

Proof. First, we note that any Schwartz class function $f \in \mathcal{S}(\mathbb{R})$ is uniformly continuous on the real line.
For given any $\varepsilon > 0$, we can choose an interval $[-R, R]$ outside of which $|f(x)| \le \varepsilon/4$. $f$ being continuous, it is uniformly continuous on $[-R, R]$, and thus on all of $\mathbb{R}$.

(Margin: $f \in \mathcal{S}(\mathbb{R}) \Rightarrow f$ is uniformly continuous. Recall the Fourier coefficient result. The point is that we needed uniform continuity, first on the circle and now on $\mathbb{R}$, and the decay of the good kernel.)

Now then, using the fact that $\int_{-\infty}^{\infty} K_\delta = 1$ and that $K_\delta(x) \ge 0$,
\[ |(f * K_\delta)(x) - f(x)| = \left| \int_{-\infty}^{\infty} K_\delta(t)\,[f(x - t) - f(x)]\,dt \right| \le \left( \int_{|t| \ge \eta} + \int_{|t| \le \eta} \right) K_\delta(t)\,|f(x - t) - f(x)|\,dt. \]
Both of these can be made small, independently of $x$ (exercise). (Away from the origin, use the vanishing of the approximation of the identity; near the origin, use the uniform continuity of $f$. The end.)

2. Fourier inversion formula

Proposition 2.1 ("Multiplication formula"). Let $f, g \in \mathcal{S}(\mathbb{R})$. Then
\[ \int_{-\infty}^{\infty} f(x)\,\hat g(x)\,dx = \int_{-\infty}^{\infty} \hat f(y)\,g(y)\,dy. \]
(Margin: the multiplication formula is a forerunner to Plancherel's theorem.)

Remark. Recall Fubini's theorem: If $F(x, y)$ is a continuous function on $\mathbb{R}^2$ satisfying the condition
\[ |F(x, y)| \le \frac{A}{(1 + x^2)(1 + y^2)}, \]
then
\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(x, y)\,dy\,dx = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(x, y)\,dx\,dy. \]

Proof. (Just a consequence of Fubini's theorem.) Let $F(x, y) = f(x)\,g(y)\,e^{-2\pi i x y}$. Then certainly $F$ satisfies the decay conditions for Fubini's theorem, and
\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x)\,g(y)\,e^{-2\pi i x y}\,dy\,dx = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x)\,g(y)\,e^{-2\pi i x y}\,dx\,dy, \]
i.e.,
\[ \int_{-\infty}^{\infty} f(x)\,\hat g(x)\,dx = \int_{-\infty}^{\infty} \hat f(y)\,g(y)\,dy, \]
which is what we wanted to show.

Theorem 2.2 (Fourier inversion). Let $f \in \mathcal{S}(\mathbb{R})$. Then
\[ f(x) = \int_{-\infty}^{\infty} \hat f(\xi)\,e^{2\pi i x \xi}\,d\xi. \]
(Margin: use i. the multiplication formula, ii. that the Gaussian behaves nicely, and iii. that we have an approximation of the identity.)

Proof. We first prove this for $x = 0$. Let $G_\delta(x)$ be what should be the inverse Fourier transform of $K_\delta$: i.e., let $G_\delta(x) = e^{-\pi \delta x^2}$; then $\widehat{G_\delta}(\xi) = K_\delta(\xi)$ (see scratchwork below). Then the multiplication formula says
\[ \int_{-\infty}^{\infty} f(x)\,K_\delta(x)\,dx = \int_{-\infty}^{\infty} \hat f(\xi)\,G_\delta(\xi)\,d\xi. \]
Let $\delta \to 0$ on both sides. The LHS is $(f * K_\delta)(0)$, which converges to $f(0)$ (since the family is an approximation of the identity). The RHS "clearly" converges to $\int_{-\infty}^{\infty} \hat f(\xi)\,d\xi$ as $\delta \downarrow 0$ (exercise; recall $\hat f \in \mathcal{S}(\mathbb{R})$), so the statement is proven for $x = 0$.

For a general $x$, let $F(y) = f(y + x)$. Then by the previous case,
\[ f(x) = F(0) = \int_{-\infty}^{\infty} \hat F(\xi)\,d\xi = \int_{-\infty}^{\infty} \hat f(\xi)\,e^{2\pi i x \xi}\,d\xi. \]

Scratchwork: Let $\gamma(x) = e^{-\pi x^2}$, i.e., the normalized Gaussian. Then our above choice of $G_\delta$ is $G_\delta(x) = \gamma(\sqrt{\delta}\,x)$, and thus (scaling becomes dilation)
\[ \widehat{G_\delta}(\xi) = \frac{1}{\sqrt{\delta}}\,\hat\gamma\!\left( \frac{\xi}{\sqrt{\delta}} \right) = \frac{1}{\sqrt{\delta}}\,e^{-\pi \xi^2 / \delta} = K_\delta(\xi). \]

Definition 2.3 (inverse Fourier transform). Given $g \in \mathcal{S}(\mathbb{R})$, we define the inverse Fourier transform $\check g$ of $g$ by
\[ \mathcal{F}^*(g)(x) = \check g(x) := \int_{-\infty}^{\infty} g(\xi)\,e^{2\pi i x \xi}\,d\xi. \]
Thus the Fourier inversion theorem can be written: for $f \in \mathcal{S}(\mathbb{R})$,
\[ f(x) = (\hat f)^{\vee}(x). \]
Note that $\mathcal{F}(f)(y) = \mathcal{F}^*(f)(-y)$. It is easy to see that for $g \in \mathcal{S}(\mathbb{R})$, $(\check g)^{\wedge}(\xi) = g(\xi)$ (i.e., $\mathcal{F} \circ \mathcal{F}^* = \mathcal{F}^* \circ \mathcal{F} = I$) and thus the Fourier transform is bijective on $\mathcal{S}(\mathbb{R})$.

3. Plancherel's theorem

Theorem 3.1 (Plancherel's theorem). For any $f \in \mathcal{S}(\mathbb{R})$, we have
\[ \|\hat f\|_{L^2(\mathbb{R})} = \|f\|_{L^2(\mathbb{R})}. \]

Actually we don't really need the above properties: Plancherel can be obtained directly from the multiplication and inversion formulae as follows.

Simple Proof. (Basically just the multiplication theorem.) The multiplication theorem says that for $f, g \in \mathcal{S}(\mathbb{R})$,
\[ \int_{-\infty}^{\infty} f(x)\,\hat g(x)\,dx = \int_{-\infty}^{\infty} \hat f(y)\,g(y)\,dy. \]
So take $g$ such that $\hat g = \bar f$, i.e., $g = (\bar f)^{\vee}$. Then, since $(\bar f)^{\vee} = \overline{\hat f}$, we get
\[ \int_{-\infty}^{\infty} f(x)\,\bar f(x)\,dx = \int_{-\infty}^{\infty} \hat f(y)\,(\bar f)^{\vee}(y)\,dy = \int_{-\infty}^{\infty} \hat f(y)\,\overline{\hat f(y)}\,dy, \]
as desired.

LECTURE 18: WEIERSTRASS APPROXIMATION; HEAT EQUATION ON THE LINE

1. $\mathcal{F}$ and convolutions: Plancherel's theorem

Some further properties of the Fourier transform:

Proposition 1.1 (Fourier transform and convolutions). Let $f, g \in \mathcal{S}(\mathbb{R})$. Then
i. $f * g \in \mathcal{S}(\mathbb{R})$
ii. $f * g = g * f$
iii.
$\widehat{(f * g)}(\xi) = \hat f(\xi)\,\hat g(\xi)$

(Please read the proofs yourselves.)

These properties, plus the inversion formula, can be used to give us the following useful theorem:

Theorem 1.2 (Plancherel's theorem). For any $f \in \mathcal{S}(\mathbb{R})$, we have
\[ \|\hat f\|_{L^2(\mathbb{R})} = \|f\|_{L^2(\mathbb{R})}. \]

Proof. Let $f^\flat(x) := \overline{f(-x)}$, and $h = f * f^\flat$ (which is in $\mathcal{S}(\mathbb{R})$, by the above proposition). Then $\widehat{f^\flat} = \overline{\hat f}$, so $\hat h = |\hat f|^2$. Applying the inversion formula to $h$ at $x = 0$, we get
\[ h(0) = \int_{-\infty}^{\infty} \hat h(\xi)\,d\xi, \]
i.e.,
\[ (f * f^\flat)(0) = \int_{-\infty}^{\infty} |\hat f(\xi)|^2\,d\xi, \]
i.e.,
\[ \int_{-\infty}^{\infty} |f(x)|^2\,dx = \int_{-\infty}^{\infty} |\hat f(\xi)|^2\,d\xi. \]
(Margin: the moral is to always make your mistakes in pairs, so that one might cancel the other one out.)

2. Application: Weierstrass Approximation Theorem

Theorem 2.1. Let $f : [a, b] \to \mathbb{C}$. If $f$ is continuous, then given any $\varepsilon > 0$, there exists a polynomial $P$ such that
\[ \sup_{x \in [a,b]} |f(x) - P(x)| < \varepsilon. \]
That is, any continuous function on a closed and bounded interval can be uniformly approximated by a polynomial.

(Margin: create an extension to all of $\mathbb{R}$. Approximate the extension. Approximate the approximation. Actually we approximate the kernel using a polynomial and then notice that the convolution with the polynomial approximates the convolution with the kernel.)

Proof. Let $[-M, M]$ be an interval containing $[a, b]$; let $g$ be a continuous function on $\mathbb{R}$ that agrees with $f$ on $[a, b]$ and vanishes outside of $[-M, M]$; let $B$ denote a bound for $g$. Since $g$ is uniformly continuous, we can choose $\delta_0$ such that for all $x \in \mathbb{R}$,
\[ |g(x) - (g * K_{\delta_0})(x)| < \frac{\varepsilon}{2}. \]
Now,
\[ K_{\delta_0}(x) = \frac{1}{\sqrt{\delta_0}}\,e^{-\pi x^2 / \delta_0} = \frac{1}{\sqrt{\delta_0}} \sum_{n=0}^{\infty} \frac{(-\pi x^2 / \delta_0)^n}{n!}. \]
The series expression converges uniformly on every compact interval of $\mathbb{R}$, so there exists an $N_0$ such that
\[ \left| K_{\delta_0}(x) - \frac{1}{\sqrt{\delta_0}} \sum_{n=0}^{N_0} \frac{(-\pi x^2 / \delta_0)^n}{n!} \right| \le \frac{\varepsilon}{4MB} \]
for all $x \in [-2M, 2M]$; denote the finite sum by $R(x)$.
Then, for $x \in [-M, M]$ (so that $x - t \in [-2M, 2M]$ for $t \in [-M, M]$), using the above bound we see
\[ |(g * K_{\delta_0})(x) - (g * R)(x)| = \left| \int_{-M}^{M} g(t)\,[K_{\delta_0}(x - t) - R(x - t)]\,dt \right| \le \int_{-M}^{M} |g(t)|\,|K_{\delta_0}(x - t) - R(x - t)|\,dt \le 2MB \sup_{z \in [-2M, 2M]} |K_{\delta_0}(z) - R(z)| \le \frac{\varepsilon}{2}. \]
Putting the two bounds together, $g$ is uniformly approximated (to within $\varepsilon$) by $g * R$ on $[-M, M]$. $g$ is, of course, $f$ on $[a, b] \subset [-M, M]$.

(Margin: was that thing a polynomial?) The last thing we need to show is that $g * R$ is actually a polynomial. Well,
\[ (g * R)(x) = \int_{-M}^{M} g(t)\,R(x - t)\,dt \quad \text{and} \quad R(x - t) = \frac{1}{\sqrt{\delta_0}} \sum_{n=0}^{N_0} \frac{(-\pi (x - t)^2 / \delta_0)^n}{n!}, \]
which is a polynomial in $x$ (of the form $\sum_{n=0}^{2N_0} a_n(t)\,x^n$); integrating against $g(t)\,dt$ term by term gives a polynomial in $x$.

3. Application to PDEs

3.1. Time-dependent heat equation on the line.

Remark. Crucial property of the Fourier transform: it interchanges differentiation with multiplication by polynomials.

The problem: given an infinite rod and an initial temperature distribution $f(x)$ at $t = 0$, what is $u(x, t)$, the temperature at point $x \in \mathbb{R}$ at time $t > 0$? Physical considerations imply the heat equation
\[ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}. \]

3.2. Finding the solution via the Fourier transform.

Taking (formally, i.e., assuming that, in particular, a solution exists and that it is in $\mathcal{S}(\mathbb{R})$) the Fourier transform (in the first variable) of both sides, we get
\[ \frac{\partial \hat u}{\partial t}(\xi, t) = -4\pi^2 \xi^2\,\hat u(\xi, t), \]
and thus, fixing $\xi$, one gets a trivial differential equation, viz. (formally = assuming that everything works)
\[ \frac{\frac{\partial \hat u}{\partial t}(\xi, t)}{\hat u(\xi, t)} = -4\pi^2 \xi^2 \quad \Rightarrow \quad \hat u(\xi, t) = A(\xi)\,e^{-4\pi^2 \xi^2 t}. \]
Taking then the Fourier transform of the initial condition, we get $\hat u(\xi, 0) = \hat f(\xi)$, so $A(\xi) = \hat f(\xi)$. Thus
\[ \hat u(\xi, t) = \hat f(\xi)\,e^{-4\pi^2 \xi^2 t}, \]
and (taking the inverse Fourier transform of both sides) we see that a solution, if it exists, must be of the form
\[ u(x, t) = (f * H_t)(x), \quad \text{where } H_t(x) = K_{4\pi t}(x). \]
We call $H_t(x)$ the heat kernel of the line. (Margin: the solution is convolution with a particular kernel, which we call the heat kernel.)

Theorem 3.1.
Let $f \in \mathcal{S}(\mathbb{R})$, and let $u(x, t) := (f * H_t)(x)$. Then
i. $u(x, t)$ is $C^2(\mathbb{R}^2_+)$ and solves the heat equation,
ii. $u(x, t) \to f(x)$ uniformly in $x$ as $t \to 0$ (and thus $u$ is continuous on $\overline{\mathbb{R}^2_+}$),
iii. $u(x, t) \to f(x)$ in $L^2$ as $t \to 0$.

(Margin: the main part is showing convergence in $L^2$, using Plancherel.)

Proof. Using the Fourier inversion formula, we see that
\[ u(x, t) = \int_{-\infty}^{\infty} \hat f(\xi)\,\widehat{H_t}(\xi)\,e^{2\pi i \xi x}\,d\xi; \]
differentiation under the integral sign proves that $u$ (is not only infinitely differentiable but also) solves the heat equation. (ii) is immediate, since $\{K_\delta\}$ is an approximation of the identity.

How do we show $L^2$ convergence? By Plancherel's theorem,
\[ \int_{-\infty}^{\infty} |u(x, t) - f(x)|^2\,dx = \int_{-\infty}^{\infty} |\hat f(\xi)\,\widehat{H_t}(\xi) - \hat f(\xi)|^2\,d\xi = \int_{-\infty}^{\infty} |\hat f(\xi)|^2\,|e^{-4\pi^2 t \xi^2} - 1|^2\,d\xi. \]
Let $\varepsilon > 0$ be fixed. Bound the part away from the origin: using the rapid decrease of $\hat f \in \mathcal{S}(\mathbb{R})$ and $|e^{-4\pi^2 t \xi^2} - 1| \le 1$, choose $N$ such that for all $t > 0$,
\[ \int_{|\xi| \ge N} |\hat f(\xi)|^2\,|e^{-4\pi^2 t \xi^2} - 1|^2\,d\xi < \varepsilon. \qquad (1) \]
Bound the part near the origin: for all $t$ small enough (note that $\hat f$ is bounded), we have
\[ \sup_{|\xi| \le N} |\hat f(\xi)|^2\,|e^{-4\pi^2 t \xi^2} - 1|^2 < \frac{\varepsilon}{2N}. \qquad (2) \]
So
\[ \int_{|\xi| \le N} |\hat f(\xi)|^2\,|e^{-4\pi^2 t \xi^2} - 1|^2\,d\xi < \varepsilon \qquad (3) \]
for all small $t$, so the squared $L^2$ difference is smaller than $2\varepsilon$.

LECTURE 19: THE FOURIER TRANSFORM AND PARTIAL DIFFERENTIAL EQUATIONS, CT'D.

1. The Heat Equation, ct'd.

Recall: last time we proved the following. We defined the heat kernel by $H_t(x) := K_{4\pi t}(x)$, i.e.,
\[ H_t(x) = \frac{1}{(4\pi t)^{1/2}}\,e^{-x^2 / 4t}, \qquad \widehat{H_t}(\xi) = e^{-4\pi^2 t \xi^2}. \]
We solved the heat equation using the appropriate kernel.

Theorem 1.1. Let $f \in \mathcal{S}(\mathbb{R})$, and let $u(x, t) := (f * H_t)(x)$. Then
i. $u(x, t)$ is $C^2(\mathbb{R}^2_+)$ and solves the heat equation,
ii. $u(x, t) \to f(x)$ uniformly in $x$ as $t \to 0$ (and thus $u$ is continuous on $\overline{\mathbb{R}^2_+}$),
iii. $u(x, t) \to f(x)$ in $L^2$ as $t \to 0$.

We also saw that the solution could be expressed as
\[ u(x, t) = \int_{-\infty}^{\infty} \hat f(\xi)\,\widehat{H_t}(\xi)\,e^{2\pi i \xi x}\,d\xi, \]
and noted that for each fixed $t$, the convolution $f * H_t$ was in $\mathcal{S}(\mathbb{R})$ (a convolution of Schwartz functions). In fact, we have something stronger:

Corollary 1.2.
$u(\cdot, t) \in \mathcal{S}(\mathbb{R})$ uniformly in $t$, in the sense that given any $T > 0$,
\[ \sup_{x \in \mathbb{R};\ t \in (0, T)} |x|^k \left| \frac{\partial^\ell}{\partial x^\ell} u(x, t) \right| < \infty \]
for each $k, \ell \ge 0$.

Proof. We shall show that $u(x, t)$ is rapidly decreasing, uniformly for $t \in (0, T)$; the argument is identical for the $\frac{\partial^\ell}{\partial x^\ell} u$. (Margin: do it for $u$. Break into parts where $|x - y| \approx |x|$...) We bound $|u|$ as follows:
\[ |u(x, t)| \le \int_{|y| \le \frac{|x|}{2}} |f(x - y)|\,H_t(y)\,dy + \int_{|y| \ge \frac{|x|}{2}} |f(x - y)|\,H_t(y)\,dy. \]
So now, think: in the first integral $|y| \le \frac{|x|}{2}$, so $|x - y| \approx |x|$ (indeed $|x - y| \ge \frac{|x|}{2}$). Thus, using the rapid decay of $f$, we see that for the first integral,
\[ \int_{|y| \le \frac{|x|}{2}} |f(x - y)|\,H_t(y)\,dy \le \frac{C_N}{(1 + |x|)^N} \int_{|y| \le \frac{|x|}{2}} H_t(y)\,dy \le \frac{C_N}{(1 + |x|)^N}; \]
that is, for any $N \in \mathbb{N}$ the first integral shrinks at least as fast as $\frac{1}{(1 + |x|)^N}$ (is rapidly decreasing). (Margin: note that we couldn't have used this bound for all $y$.)

For the second integral, we see that (recall $H_t(y) := \frac{1}{(4\pi t)^{1/2}}\,e^{-y^2 / 4t}$) for $|y| \ge \frac{|x|}{2}$ we have, for $t \in (0, T)$,
\[ H_t(y) \le \frac{C}{t^{1/2}}\,e^{-c x^2 / t}, \]
thus
\[ \int_{|y| \ge \frac{|x|}{2}} |f(x - y)|\,H_t(y)\,dy \le \frac{C}{t^{1/2}}\,e^{-c x^2 / t} \int_{|y| \ge \frac{|x|}{2}} |f(x - y)|\,dy \le C'\,\frac{1}{t^{1/2}}\,e^{-c x^2 / t}, \]
which is rapidly decreasing in $x$.... (Margin: note that we couldn't have used this bound for all $y$ either.)

Theorem 1.3 (Uniqueness of solution). Suppose $u(x, t)$ satisfies the following conditions:
i. $u$ solves the heat equation on $\mathbb{R}^2_+$,
ii. $u(x, 0) = 0$,
iii. $u$ is continuous on $\overline{\mathbb{R}^2_+}$,
iv. $u(\cdot, t) \in \mathcal{S}(\mathbb{R})$ uniformly in $t$ (in the sense of Corollary 1.2).
Then $u \equiv 0$.

Proof. Define the energy of the solution $u(x, t)$ at time $t$ by
\[ E(t) := \|u(\cdot, t)\|_{L^2(\mathbb{R})}^2 = \int_{\mathbb{R}} |u(x, t)|^2\,dx. \]
We notice that $E(0) = 0$ and that $E \ge 0$; now we claim that $E'(t) \le 0$. Differentiating under the integral sign, we get
\[ \frac{dE}{dt} = \int_{\mathbb{R}} [\partial_t u(x, t)\,\bar u(x, t) + u(x, t)\,\partial_t \bar u(x, t)]\,dx. \]
Since $u$ solves the heat equation, we see that $\partial_t u = \partial_x^2 u$; so (passing the derivatives over using integration by parts)
\[ \frac{dE}{dt} = \int_{\mathbb{R}} [\partial_x^2 u(x, t)\,\bar u(x, t) + u(x, t)\,\partial_x^2 \bar u(x, t)]\,dx = -\int_{\mathbb{R}} [\partial_x u(x, t)\,\partial_x \bar u(x, t) + \partial_x u(x, t)\,\partial_x \bar u(x, t)]\,dx = -2 \int_{\mathbb{R}} |\partial_x u(x, t)|^2\,dx \le 0. \]
(Margin: this is so cool.) Hence $E$ is nonincreasing with $E(0) = 0$ and $E \ge 0$, so $E \equiv 0$ and therefore $u \equiv 0$. The end.

2. Steady-state heat equation in UHP

The boundary value problem we will examine now is the steady-state heat equation in $\mathbb{R}^2_+$:
\[ \begin{cases} \Delta u := \left( \dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2} \right) u = 0 & \text{on } \mathbb{R}^2_+, \\ u(x, 0) = f(x). \end{cases} \]
As before, we proceed formally. Taking the Fourier transform in the first variable of the data above, we get
\[ \begin{cases} -4\pi^2 \xi^2\,\hat u(\xi, y) + \dfrac{\partial^2}{\partial y^2} \hat u(\xi, y) = 0, \\ \hat u(\xi, 0) = \hat f(\xi). \end{cases} \]
As before, elementary ODE theory gives that the solution must be of the form
\[ \hat u(\xi, y) = A(\xi)\,e^{-2\pi |\xi| y} + B(\xi)\,e^{2\pi |\xi| y} \]
for some functions $A(\xi)$, $B(\xi)$. We assume $B(\xi) \equiv 0$, since otherwise we'd have to take the inverse Fourier transform of an exponentially growing function. The boundary condition $\hat u(\xi, 0) = \hat f(\xi)$ implies then that $A(\xi) = \hat f(\xi)$, i.e.,
\[ \hat u(\xi, y) = \hat f(\xi)\,e^{-2\pi |\xi| y}, \quad \text{i.e.,} \quad u(x, y) = (f * P_y)(x) \]
for $P_y$ satisfying $\widehat{P_y}(\xi) = e^{-2\pi |\xi| y}$.

Definition 2.1. We define the Poisson kernel on the UHP by
\[ P_y(x) := \frac{1}{\pi}\,\frac{y}{x^2 + y^2}. \]
(Margin: the Poisson kernel on $\mathbb{R}^2_+$; is there some relation?)

LECTURE 20: THE STEADY-STATE HEAT EQUATION IN THE UPPER HALF PLANE.

Recall the problem (the steady-state heat equation in $\mathbb{R}^2_+$):
\[ \begin{cases} \Delta u = 0 & \text{on } \mathbb{R}^2_+, \\ u(x, 0) = f(x). \end{cases} \]
Taking the Fourier transform in the first variable, we ultimately saw that
\[ \hat u(\xi, y) = \hat f(\xi)\,e^{-2\pi |\xi| y}, \quad \text{i.e.,} \quad u(x, y) = (f * P_y)(x) \]
for $P_y$ satisfying $\widehat{P_y}(\xi) = e^{-2\pi |\xi| y}$.

Definition 0.1. We define the Poisson kernel on the UHP by
\[ P_y(x) := \frac{1}{\pi}\,\frac{y}{x^2 + y^2}. \]
(Margin: the Poisson kernel on $\mathbb{R}^2_+$. Obvious question: is there some relation between the Poisson kernel on the circle and on the line?)

Lemma 0.2 ($P_y(x)$ is what we would want it to be).
\[ \int_{-\infty}^{\infty} e^{-2\pi |\xi| y}\,e^{2\pi i \xi x}\,d\xi = P_y(x), \qquad \int_{-\infty}^{\infty} P_y(x)\,e^{-2\pi i x \xi}\,dx = e^{-2\pi |\xi| y}. \]

Proof. Just a calculation.
Dividing the integral into two parts, $\int_{-\infty}^{0}$ and $\int_{0}^{\infty}$, we see:
\[ \int_0^{\infty} e^{-2\pi \xi y}\,e^{2\pi i \xi x}\,d\xi = \int_0^{\infty} e^{2\pi i (x + iy) \xi}\,d\xi = \left[ \frac{e^{2\pi i (x + iy) \xi}}{2\pi i (x + iy)} \right]_0^{\infty} = -\frac{1}{2\pi i (x + iy)}. \]
(Margin: Calculus I exercise.) Similarly,
\[ \int_{-\infty}^{0} e^{2\pi \xi y}\,e^{2\pi i \xi x}\,d\xi = \frac{1}{2\pi i (x - iy)}; \]
together these two calculations give the first identity, since
\[ \frac{1}{2\pi i} \left[ \frac{1}{x - iy} - \frac{1}{x + iy} \right] = \frac{1}{\pi}\,\frac{y}{x^2 + y^2}. \]
The second identity is equivalent to the first via the Fourier inversion theorem (which we can use since $e^{-2\pi |\xi| y}$ and $P_y$ are both of moderate decrease). (Margin: the inverse Fourier transform of $e^{-2\pi |\xi| y}$ is $P_y(x)$, together with a tautologically equivalent statement.)

As before, we are extremely lucky: the kernel with which we convolve to obtain a solution of our equation is again an approximation of the identity. (Margin: the question is, for what PDEs is one guaranteed this phenomenon? Cf. Green's functions.)

Lemma 0.3. $P_y$ is an approximation of the identity on $\mathbb{R}$ as $y \downarrow 0$.

Proof. Trivial. We have already shown above that $\|P_y\|_1 = 1$ (take $\xi = 0$ in the second identity of Lemma 0.2), and the Poisson kernel is positive, so the only property we need prove is the third, which is an(other) integral calculus problem:
\[ \int_\delta^{\infty} \frac{y}{x^2 + y^2}\,dx = \cdots = \frac{\pi}{2} - \tan^{-1}\!\left( \frac{\delta}{y} \right), \]
which goes to 0 as $y \to 0$. (Margin: more calculus.)

Now let us show that our intuited solution actually is one:

Theorem 0.4. Let $f \in \mathcal{S}(\mathbb{R})$; let $u(x, y) := (f * P_y)(x)$ for $(x, y) \in \mathbb{R}^2_+$. Then
i. $u(x, y) \in C^2(\mathbb{R}^2_+)$ and $\Delta u = 0$.
ii. $u(x, y) \to f(x)$ uniformly as $y \to 0$.
iii. $u(\cdot, y)$ converges to $f$ in $L^2$ as $y \to 0$.
iv. Letting $u(x, 0) = f(x)$, then $u$ is continuous on $\overline{\mathbb{R}^2_+}$ and vanishes at infinity in the sense that $u(x, y) \to 0$ as $|x| + y \to \infty$.

Proof of (iv). (The rest are similar to the proof in the case of the heat equation on the line, and left as exercises.) We note that if $f$ is any function of moderate decrease, then we have the following bound (first estimate):
\[ |(f * P_y)(x)| \le C \left( \frac{1}{1 + x^2} + \frac{y}{x^2 + y^2} \right). \]
Proof of the first estimate: write
\[ \int_{-\infty}^{\infty} f(x - t)\,P_y(t)\,dt = \int_{|t| < \frac{|x|}{2}} + \int_{|t| > \frac{|x|}{2}}. \]
For the first integral, $|x - t| \approx |x|$ (indeed $|x - t| \ge \frac{|x|}{2}$), so
\[ |f(x - t)| \le \frac{C}{1 + (x - t)^2} \lesssim \frac{C}{1 + x^2}. \]
For the second integral, $|t| > \frac{|x|}{2}$, so
\[ P_y(t) = \frac{1}{\pi}\,\frac{y}{t^2 + y^2} < C\,\frac{y}{x^2 + y^2}. \]
We also have the second estimate
\[ |(f * P_y)(x)| \le \frac{C}{y}, \]
since $\sup_x P_y(x) \le \frac{C}{y}$ (and we are "averaging" $P_y$ with a function of moderate decrease; exercise).

To show the vanishing at infinity, then: when $|x| \ge |y|$, as $|x| + y \to \infty$, we see that we have the bound
\[ |(f * P_y)(x)| \le C \left( \frac{1}{1 + x^2} + \frac{|y|}{x^2 + y^2} \right) \le C \left( \frac{1}{1 + x^2} + \frac{|x|}{x^2 + y^2} \right) \le C \left( \frac{1}{1 + x^2} + \frac{|x|}{x^2} \right) \to 0. \]
And if $|x| \le |y|$, then
\[ |(f * P_y)(x)| \le \frac{C}{y} \to 0. \]

Uniqueness of the solution will follow from the following important (and in fact, characterizing) property of harmonic functions. (Margin: MVP. The point is that it is a consequence of the Laplacian being equal to 0.)

Lemma 0.5 (Mean Value Property). Let $\Omega \subset \mathbb{R}^2$ be open; let $u \in C^2(\Omega)$. If $\Delta u = 0$ in $\Omega$, then given any closed disc $\overline{B_R(x, y)} \subset \Omega$, one has, for all $r \in [0, R]$,
\[ u(x, y) = \frac{1}{2\pi} \int_0^{2\pi} u(x + r\cos\theta,\ y + r\sin\theta)\,d\theta. \]

Proof. Let $U(r, \theta) = u(x + r\cos\theta,\ y + r\sin\theta)$. Then $\Delta u = 0$, expressed in polar form, is equivalent to (easy exercise)
\[ 0 = \frac{\partial^2 U}{\partial \theta^2} + r\,\frac{\partial}{\partial r} \left( r\,\frac{\partial U}{\partial r} \right). \]
Average the above over the circle; then letting $F(r) = \frac{1}{2\pi} \int_0^{2\pi} U(r, \theta)\,d\theta$, we get
\[ r\,\frac{\partial}{\partial r} \left( r\,\frac{\partial F}{\partial r} \right) = -\frac{1}{2\pi} \int_0^{2\pi} \frac{\partial^2 U}{\partial \theta^2}(r, \theta)\,d\theta. \]
Since $\frac{\partial U}{\partial \theta}$ is periodic, the RHS of the above is 0; thus the continuous function $r\,\frac{\partial F}{\partial r}$ is constant, and thus (take $r = 0$) $r\,\frac{\partial F}{\partial r} = 0$ (implying $\frac{\partial F}{\partial r} = 0$); i.e., $F$ is constant. Since $F(0) = u(x, y)$, we see that $F(r) = u(x, y)$ for all $r \in [0, R]$. The end.

LECTURE 21: THE STEADY-STATE HEAT EQUATION IN THE UPPER HALF PLANE (END); POISSON SUMMATION FORMULA

1. Mean Value Property and Uniqueness of Solutions

Recall: we were about to prove the uniqueness of solutions for the steady-state heat equation in the UHP using the following important property of harmonic functions.

Lemma 1.1 (Mean Value Property). Let $\Omega \subset \mathbb{R}^2$ be open; let $u \in C^2(\Omega)$.
If $\Delta u = 0$ in $\Omega$, then given any closed disc $\overline{B_R(x,y)} \subset \Omega$, one has, for all $r \in [0,R]$,
\[
u(x,y) = \frac{1}{2\pi}\int_0^{2\pi} u(x + r\cos\theta,\, y + r\sin\theta)\,d\theta.
\]

Remark. In fact, if a function is continuous on an (open, connected) domain in $\mathbb{R}^n$ and satisfies the Mean Value Property, then it is harmonic and in $C^\infty$.

Theorem 1.2 (Uniqueness of solutions to the steady-state heat equation on $\mathbb{R}^2_+$). Let $u$ be a solution to $\Delta u = 0$ on $\mathbb{R}^2_+$. If
i. $u$ is continuous on the closure $\overline{\mathbb{R}^2_+}$,
ii. $u(x,0) = 0$, and
iii. $u(x,y)$ vanishes at infinity,
then $u \equiv 0$.

Proof. By contradiction. Suppose $u(x,y)$ is (WLOG) real-valued and $u(x_0,y_0) > 0$ for some $(x_0,y_0) \in \mathbb{R}^2_+$. Choose a closed semi-disc $D_R^+ := \overline{D_R} \cap \overline{\mathbb{R}^2_+}$ with $R$ sufficiently large that $u(x,y) \le \frac{1}{2}u(x_0,y_0)$ in the complement (note $(x_0,y_0) \in D_R^+$). Since $u$ is continuous on $D_R^+$, it attains its maximum $M$ at some point $(x_1,y_1) \in D_R^+$. Notice that $u \le M$ throughout $\overline{\mathbb{R}^2_+}$, and that $y_1 > 0$, since $M \ge u(x_0,y_0) > 0$ while $u(x,0) = 0$. Now, by the MVP, we know that
\[
u(x_1,y_1) = \frac{1}{2\pi}\int_0^{2\pi} u(x_1 + \rho\cos\theta,\, y_1 + \rho\sin\theta)\,d\theta
\]
for, in particular, $\rho \in (0,y_1)$. Since $u(x_1,y_1) = M$ and $u \le M$, we get $u \equiv M$ on that entire circle. Let $\rho \to y_1$; we see that then (since $u$ is continuous on $\overline{\mathbb{R}^2_+}$) $u(x_1,0) = M > 0$ also, contradicting (ii).

(Margin: observe that where a max is attained, by the MVP the function must attain that max on every circle centered at that point.)

Remarks (Other properties of harmonic functions).
i. Maximum principle: If $u$ is continuous on the closure of a bounded domain $D$ and harmonic on the interior, then the maximum must be attained on the boundary (unless $u$ is constant).
ii. Liouville's theorem: A harmonic function on $\mathbb{R}^n$ which is bounded must be constant.

2. Poisson summation formula

(Margin: a connection between analysis on the circle and on $\mathbb{R}$.)

Definition 2.1. Let $f \in \mathcal{S}(\mathbb{R})$. We define the periodization of $f$ to be the (continuous) 1-periodic function $F_1 : [0,1] \to \mathbb{C}$ defined by
\[
F_1(x) := \sum_{n=-\infty}^{\infty} f(x+n).
\]

Remark.
The sum converges absolutely and uniformly on every compact subset of $\mathbb{R}$, so converges to a continuous function.

Theorem 2.2 (Poisson summation formula). Let $f \in \mathcal{S}(\mathbb{R})$. Then
\[
\sum_{n\in\mathbb{Z}} f(x+n) = \sum_{n\in\mathbb{Z}} \hat f(n)e^{2\pi i n x}.
\]
(Margin: the periodic function is the one we get via a discrete version of the Fourier transform.)

Remark. In particular,
\[
\sum_{n\in\mathbb{Z}} f(n) = \sum_{n\in\mathbb{Z}} \hat f(n).
\]

Proof. Call the second function $F_2$. Notice that $F_2$ is (because $\hat f \in \mathcal{S}(\mathbb{R})$ also) again absolutely and uniformly converging (on all of $\mathbb{R}$), and so is continuous. Since $F_1$ and $F_2$ are continuous functions on the circle, showing that they have the same Fourier coefficients (as functions on $[0,1]$) would force their difference to be 0 everywhere. So we calculate the Fourier coefficients of $F_1$: using uniform convergence, we see
\[
\int_0^1\Big(\sum_{n\in\mathbb{Z}} f(x+n)\Big)e^{-2\pi i m x}\,dx
= \sum_{n\in\mathbb{Z}}\int_0^1 f(x+n)e^{-2\pi i m x}\,dx
= \sum_{n\in\mathbb{Z}}\int_n^{n+1} f(y)e^{-2\pi i m y}\,dy
= \int_{-\infty}^{\infty} f(y)e^{-2\pi i m y}\,dy = \hat f(m),
\]
which is the $m$-th Fourier coefficient of $F_2$.

Remark. The theorem (and proof) holds if $f$, $\hat f$ are of moderate decrease.

3. Applications of Poisson summation

3.1. Heat Kernels on the Circle and on the Line. Recall from Chapter IV: given a function $u(x,t)$, where $t \ge 0$, $x \in [0,1]$, describing the temperature distribution on a ring (with initial distribution $f(x)$), it can be shown that $u$ must satisfy the following problem:
\[
\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad u(x,0) = f(x).
\]
As usual, we look for standing wave solutions $u(x,t) = A(x)B(t)$, resulting in (an infinite superposition)
\[
u(x,t) = \sum_{n\in\mathbb{Z}} a_n e^{-4\pi^2 n^2 t}e^{2\pi i n x}.
\]
Setting $t = 0$ shows us that $a_n = \hat f(n)$.

Definition 3.1. We notate
\[
H_t(x) = \sum_{n\in\mathbb{Z}} e^{-4\pi^2 n^2 t}e^{2\pi i n x}
\]
and call $H_t$ the heat kernel for the circle.
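As a numerical sanity check of the Poisson summation formula (Theorem 2.2), the following minimal sketch compares both sides for a Gaussian. It assumes the standard transform pair $f(x) = e^{-\pi s x^2}$, $\hat f(\xi) = s^{-1/2}e^{-\pi\xi^2/s}$; the function names and the truncation parameter `terms` are illustrative choices, not part of the notes.

```python
import cmath
import math


def f(x, s=1.0):
    """Gaussian f(x) = exp(-pi*s*x^2), a Schwartz function (assumed test case)."""
    return math.exp(-math.pi * s * x * x)


def f_hat(xi, s=1.0):
    """Fourier transform of f: s^(-1/2) * exp(-pi*xi^2/s)."""
    return math.exp(-math.pi * xi * xi / s) / math.sqrt(s)


def periodization(x, s=1.0, terms=50):
    """Truncated left-hand side: sum of f(x + n) over |n| <= terms."""
    return sum(f(x + n, s) for n in range(-terms, terms + 1))


def fourier_series(x, s=1.0, terms=50):
    """Truncated right-hand side: sum of f_hat(n) * exp(2*pi*i*n*x) over |n| <= terms."""
    return sum(
        f_hat(n, s) * cmath.exp(2j * math.pi * n * x)
        for n in range(-terms, terms + 1)
    )


def poisson_gap(x, s=1.0, terms=50):
    """Absolute difference between the two truncated sides."""
    return abs(periodization(x, s, terms) - fourier_series(x, s, terms))


if __name__ == "__main__":
    for x, s in [(0.0, 1.0), (0.3, 0.7), (0.5, 2.5)]:
        print(x, s, poisson_gap(x, s))
```

Because both sums have Gaussian tails, the truncation error is far below floating-point precision here, so any visible gap would indicate a mistake in the transform pair rather than in the truncation.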
Then the solution for the heat equation on $[0,1]$ with initial data $f$ can be written as $u(x,t) = (f * H_t)(x)$ (where the convolution is on $[0,1]$). Recalling that the heat kernel on the line was given by
\[
\mathcal{H}_t(x) = \frac{1}{(4\pi t)^{1/2}}e^{-x^2/4t}; \quad\text{i.e.,}\quad \widehat{\mathcal{H}_t}(\xi) = e^{-4\pi^2 t\xi^2},
\]
we note the following.

Theorem 3.2.
\[
H_t(x) = \sum_{n\in\mathbb{Z}} \mathcal{H}_t(x+n).
\]
(Margin: the heat kernel on the circle is the periodization of the heat kernel on the line. How cool!)

Proof. This is exactly the Poisson summation formula:
\[
\sum_{n\in\mathbb{Z}} \mathcal{H}_t(x+n) = \sum_{n\in\mathbb{Z}} \widehat{\mathcal{H}_t}(n)e^{2\pi i n x} = \sum_{n\in\mathbb{Z}} e^{-4\pi^2 t n^2}e^{2\pi i n x} =: H_t(x).
\]

LECTURE 22: POISSON SUMMATION FORMULA, CONTINUED

1. Review from last lecture

Recall:

Theorem 1.1 (Poisson summation formula). Let $f \in \mathcal{S}(\mathbb{R})$. Then
\[
\sum_{n\in\mathbb{Z}} f(x+n) = \sum_{n\in\mathbb{Z}} \hat f(n)e^{2\pi i n x}.
\]
(Margin: the periodic function is the one we get via a discretization of the Fourier transform.)

Definition 1.2. We notate
\[
H_t(x) = \sum_{n\in\mathbb{Z}} e^{-4\pi^2 n^2 t}e^{2\pi i n x}
\]
and call $H_t$ the heat kernel for the circle.

Then the solution for the heat equation on $[0,1]$ with initial data $f$ can be written as $u(x,t) = (f * H_t)(x)$ (where the convolution is on $[0,1]$). Recalling that the heat kernel on the line was given by
\[
\mathcal{H}_t(x) = \frac{1}{(4\pi t)^{1/2}}e^{-x^2/4t}; \quad\text{i.e.,}\quad \widehat{\mathcal{H}_t}(\xi) = e^{-4\pi^2 t\xi^2},
\]
we note the following.

Theorem 1.3.
\[
H_t(x) = \sum_{n\in\mathbb{Z}} \mathcal{H}_t(x+n).
\]

2. The Heat Kernel on the Circle is Good

A cool consequence of the above expression:

Corollary 2.1. The heat kernel $\{H_t\}$ on the circle is an approximation of the identity (on the circle) as $t \downarrow 0$.

(Margin: the proof was too hard to do until now.)

Proof. Using uniform convergence, it is immediate that
\[
\int_{-1/2}^{1/2} H_t(x)\,dx = 1.
\]
Since $\mathcal{H}_t \ge 0$, the above theorem implies that $H_t \ge 0$ (not at all obvious otherwise); so the first two properties of good kernels are satisfied. It remains to see that given any $0 < \eta < 1/2$,
\[
\int_{\eta<|x|\le\frac12} |H_t(x)|\,dx \to 0 \quad\text{as } t \to 0.
\]
Well, consider:
\[
H_t(x) = \sum_{n\in\mathbb{Z}} \mathcal{H}_t(x+n) = \mathcal{H}_t(x) + \sum_{n\in\mathbb{Z}^*} \mathcal{H}_t(x+n) =: \mathcal{H}_t(x) + E_t(x),
\]
where $\mathbb{Z}^* = \mathbb{Z}\setminus\{0\}$. (Margin: this is so cool; we can estimate the difference between the (good) heat kernel on the line and its periodization.) Since $\{\mathcal{H}_t\}$ is a good kernel, it suffices to show that
\[
\int_{|x|\le\frac12} |E_t(x)|\,dx \to 0 \quad\text{as } t \to 0.
\]
We shall see that (claim:)
\[
|E_t(x)| \le Ce^{-c/t}.
\]
Proof of claim. Consider:
\[
E_t(x) := \frac{1}{\sqrt{4\pi t}}\sum_{n\in\mathbb{Z}^*} e^{-(x+n)^2/4t} \le \frac{C}{\sqrt t}\sum_{n\in\mathbb{Z}^*} e^{-cn^2/t},
\]
since $|x| \le \frac12$ gives $|x+n| \ge |n|/2$ for $n \ne 0$. Now, for $t \in (0,1]$ (which we can assume) and $n \ne 0$ we see that
\[
\frac{n^2}{t} \ge \frac12\left(\frac1t + n^2\right)
\]
(the $n^2/t$ is greater than either of the terms averaged), and so
\[
|E_t(x)| \le \frac{C}{\sqrt t}\sum_{n\in\mathbb{Z}^*} e^{-cn^2/t} \le \frac{C}{\sqrt t}\,e^{-c/2t}\sum_{n\in\mathbb{Z}^*} e^{-cn^2/2} \le Ce^{-c/t},
\]
where the constants $c, C > 0$ may change from line to line (the last step absorbs the factor $t^{-1/2}$ into the exponential at the cost of a smaller $c$). This bound implies the desired control on $\int_{|x|\le\frac12}|E_t(x)|\,dx$ and thus the third property of good kernels.

2.1. Poisson kernels on the disc and upper half plane.
Recall the Poisson kernels on the disc and upper half plane:
\[
P_r(\vartheta) = \frac{1-r^2}{1-2r\cos\vartheta+r^2} \quad\text{and}\quad P_y(x) = \frac{1}{\pi}\,\frac{y}{y^2+x^2}.
\]
(Margin note on the preceding estimate for $E_t$: the last inequality holds only after shrinking the constant $c$; in any case, showing $|E_t(x)| \le \frac{C}{\sqrt t}e^{-c/2t}$ gives the desired bound anyway.)

Corollary 2.2. With $r = e^{-2\pi y}$,
\[
P_r(2\pi x) = \sum_{n\in\mathbb{Z}} P_y(x+n).
\]

Proof. Use the Poisson summation formula, together with $\widehat{P_y}(\xi) = e^{-2\pi|\xi|y}$. (Margin: don't need to mention this in class.)

3. Digression into analytic number theory

(Reference: Whittaker, E.T., and G.N. Watson, A Course of Modern Analysis: An Introduction to the General Theory of Infinite Processes and of Analytic Functions; With an Account of the Principal Transcendental Functions, Cambridge University Press, 1902.)

Definition 3.1. For $s > 0$, the theta function $\vartheta(s)$ is defined by
\[
\vartheta(s) := \sum_{n=-\infty}^{\infty} e^{-\pi n^2 s}.
\]

Theorem 3.2 (Functional relation for $\vartheta$; a consequence of Poisson summation). For $s > 0$,
\[
s^{-1/2}\,\vartheta\!\left(\frac1s\right) = \vartheta(s).
\]

Proof. Consider the function $f(x) = e^{-\pi s x^2}$; its Fourier transform (exercise) is
\[
\hat f(\xi) = s^{-1/2}e^{-\pi\xi^2/s}.
\]
Then, by Poisson summation,
\[
\sum_{n=-\infty}^{\infty} e^{-\pi s(x+n)^2} = \sum_{n=-\infty}^{\infty} s^{-1/2}e^{-\pi n^2/s}e^{2\pi i n x}.
\]
Evaluating at $x = 0$ yields the desired relation.

Definition 3.3 (Another theta function). We define the theta function $\Theta(z|\tau)$ for $z \in \mathbb{C}$, $\operatorname{Im}(\tau) > 0$ by
\[
\Theta(z|\tau) := \sum_{n=-\infty}^{\infty} e^{i\pi n^2\tau}e^{2\pi i n z}.
\]

Remarks.
i. $\Theta(0|is) = \vartheta(s)$.
ii. $\Theta(x|4\pi i t) = H_t(x)$.

Digressing even further....

Definition 3.4. For $s \in \mathbb{C}$ such that $\operatorname{Re}(s) > 1$, we define the celebrated Riemann zeta function by
\[
\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^s}.
\]
It can be shown that $\vartheta$, $\zeta$, and $\Gamma$ are related by
\[
\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) = \frac12\int_0^\infty t^{\frac s2-1}\big(\vartheta(t)-1\big)\,dt.
\]

Remark. This will become more relevant later (in your life).

LECTURE 23: HEISENBERG UNCERTAINTY; BACKGROUND FOR $\mathcal{F}$ ON $\mathbb{R}^d$

1. The Heisenberg Uncertainty Principle

Remark (Motivation). To what extent can one simultaneously specify the position and momentum of a particle?
In quantum mechanics, a particle has associated with it a state function $\psi$ (of $L^2$ norm 1) which governs the position in the sense that the probability that the particle lies in a particular region $(a,b) \subset \mathbb{R}$ (one-dimensional space) is $\int_{(a,b)}|\psi|^2$. (Margin: or, phrase in terms of time-frequency localization.) Then the expectation (expected position) is given by
\[
\bar x := \int_{-\infty}^{\infty} x|\psi(x)|^2\,dx,
\]
and the variance (uncertainty of the expectation, measured from the expected position) is given by
\[
\int_{-\infty}^{\infty} (x-\bar x)^2|\psi(x)|^2\,dx.
\]
One has an analogous function describing the momentum of the particle. Importantly, it turns out that the probability of the momentum belonging to an interval $(a,b)$ is $\int_{(a,b)}|\hat\psi(\xi)|^2\,d\xi$. We shall now see the Heisenberg Uncertainty Principle, i.e., that
\[
\text{Variance of position} \times \text{Variance of momentum} \gtrsim \frac{1}{16\pi^2}.
\]

Theorem 1.1. Let $\psi \in \mathcal{S}(\mathbb{R})$, and suppose $\|\psi\|_2 = 1$. Then
\[
\left(\int_{-\infty}^{\infty} x^2|\psi(x)|^2\,dx\right)\left(\int_{-\infty}^{\infty} \xi^2|\hat\psi(\xi)|^2\,d\xi\right) \ge \frac{1}{16\pi^2},
\]
and equality holds iff $\psi(x) = Ae^{-Bx^2}$ where $B > 0$, $|A|^2 = \sqrt{2B/\pi}$. In fact, we have, for every $x_0, \xi_0 \in \mathbb{R}$,
\[
\left(\int_{-\infty}^{\infty} (x-x_0)^2|\psi(x)|^2\,dx\right)\left(\int_{-\infty}^{\infty} (\xi-\xi_0)^2|\hat\psi(\xi)|^2\,d\xi\right) \ge \frac{1}{16\pi^2},
\]
with the individual terms (and, subsequently, the product) minimized when $x_0 = \bar x$, $\xi_0 = \bar\xi$.

Proof. (An easy calculation: integration by parts, followed by Cauchy-Schwarz and Plancherel's theorem.) Integration by parts implies the following:
\[
1 = \int_{-\infty}^{\infty}|\psi(x)|^2\,dx = -\int_{-\infty}^{\infty} x\frac{d}{dx}|\psi(x)|^2\,dx = -\int_{-\infty}^{\infty}\left(x\psi'(x)\overline{\psi(x)} + x\overline{\psi'(x)}\psi(x)\right)dx.
\]
Thus
\[
1 \le 2\int_{-\infty}^{\infty}|x||\psi(x)||\psi'(x)|\,dx
\le 2\left(\int_{-\infty}^{\infty} x^2|\psi(x)|^2\,dx\right)^{1/2}\left(\int_{-\infty}^{\infty}|\psi'(x)|^2\,dx\right)^{1/2}
= 2\left(\int_{-\infty}^{\infty} x^2|\psi(x)|^2\,dx\right)^{1/2}\left(4\pi^2\int_{-\infty}^{\infty}\xi^2|\hat\psi(\xi)|^2\,d\xi\right)^{1/2},
\]
using the Plancherel theorem (and the basic properties of the Fourier transform) for the equality in the last line. Squaring gives the stated inequality. Now, equality can hold only if equality held in the application of the Cauchy-Schwarz inequality,
which implies that the two functions must be scalar multiples of each other: $\psi'(x) = \beta x\psi(x)$ for some scalar $\beta$. Again, elementary ODE theory implies
\[
\psi(x) = Ae^{\beta x^2/2}.
\]
To ensure the function is in $\mathcal{S}(\mathbb{R})$, we require $\beta = -2B$ for some positive $B$; then $\|\psi\|_2 = 1$ forces $|A|^2 = \sqrt{2B/\pi}$. To get the second part of the theorem, apply the first to $e^{-2\pi i x\xi_0}\psi(x+x_0)$.

2. Fourier Transform on $\mathbb{R}^d$: Background

Remark. Not much content in this section: basically just a re-hash of the one-dimensional theory. The only thing of interest is to understand the details of how one makes that extension.

Let $x = (x_1,\dots,x_d)$ denote a vector in $\mathbb{R}^d$; let $|x|$ denote its magnitude, and $x\cdot y$ denote the standard inner product on $\mathbb{R}^d$.

Definition 2.1 (multi-index). Let $x \in \mathbb{R}^d$, and let $\alpha \in \mathbb{Z}^d$ be a multi-index, i.e., a $d$-tuple of non-negative integers, with $|\alpha| := \alpha_1 + \cdots + \alpha_d$. We notate
\[
x^\alpha := x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_d^{\alpha_d}
\]
and
\[
\left(\frac{\partial}{\partial x}\right)^{\alpha} := \left(\frac{\partial}{\partial x_1}\right)^{\alpha_1}\left(\frac{\partial}{\partial x_2}\right)^{\alpha_2}\cdots\left(\frac{\partial}{\partial x_d}\right)^{\alpha_d} = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1}\cdots\partial x_d^{\alpha_d}}.
\]

Definition 2.2 (rotations). Let $R : \mathbb{R}^d \to \mathbb{R}^d$ be a linear transformation. If $R(x)\cdot R(y) = x\cdot y$ for all $x, y \in \mathbb{R}^d$ (or, equivalently, if $R^t = R^{-1}$), then we call $R$ a rotation. Of course $|\det(R)| = 1$; if $\det(R) = 1$, we say $R$ is a proper rotation; otherwise it is called improper.

Remark. Given any orthonormal basis $\{e_1,\dots,e_d\}$, then $\{R(e_1),\dots,R(e_d)\}$ is another orthonormal basis. Conversely, given any two orthonormal bases $\{e_i\}$, $\{e_i'\}$, we can define a rotation $R$ by letting $R(e_i) = e_i'$.

Definition 2.3 (rapidly decreasing functions). Let $f : \mathbb{R}^d \to \mathbb{C}$. If
i. $f$ is continuous, and
ii. for all multi-indices $\alpha$, $|x^\alpha f(x)|$ is bounded,
then $f$ is called rapidly decreasing. (If $|x|^{\alpha}|f(x)|$ is bounded for $\alpha = d+1$, we call the function of moderate decrease.)

Definition 2.4 (integral on $\mathbb{R}^d$). Given $f$ a function of rapid (or merely moderate) decrease, we define
\[
\int_{\mathbb{R}^d} f = \lim_{N\to\infty}\int_{Q_N} f(x)\,dx,
\]
where $Q_N$ denotes the closed cube centered at the origin with sides of length $N$ parallel to the coordinate axes.

2.1. Polar coordinates.

2.1.1. $\mathbb{R}^2$.
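Recall: one can, using a change to polar coordinates, express the integral of a function over the plane as
\[
\int_{\mathbb{R}^2} f(x)\,dx = \int_0^{2\pi}\int_0^\infty f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta.
\]
With the notation
\[
\int_{S^1} g(\gamma)\,d\sigma(\gamma) := \int_0^{2\pi} g(\cos\theta,\sin\theta)\,d\theta,
\]
we can rewrite the above as
\[
\int_{\mathbb{R}^2} f(x)\,dx = \int_{S^1}\int_0^\infty f(r\gamma)\,r\,dr\,d\sigma(\gamma).
\]

2.1.2. $\mathbb{R}^3$. In $\mathbb{R}^3$, recall, using spherical coordinates, we can do a change of variables to show that
\[
\int_{\mathbb{R}^3} f = \int_0^{2\pi}\int_0^{\pi}\int_0^\infty f(r\sin\theta\cos\phi,\, r\sin\theta\sin\phi,\, r\cos\theta)\,r^2\,dr\,\sin\theta\,d\theta\,d\phi.
\]
As above, if we abbreviate (or more honestly, define)
\[
\int_{S^2} g(\gamma)\,d\sigma(\gamma) = \int_0^{2\pi}\int_0^{\pi} g(\sin\theta\cos\phi,\,\sin\theta\sin\phi,\,\cos\theta)\,\sin\theta\,d\theta\,d\phi,
\]
then the formula immediately above can be more concisely expressed as
\[
\int_{\mathbb{R}^3} f = \int_{S^2}\int_0^\infty f(r\gamma)\,r^2\,dr\,d\sigma(\gamma).
\]

2.1.3. $\mathbb{R}^d$. In general, one has the following formula:
\[
\int_{\mathbb{R}^d} f = \int_{S^{d-1}}\int_0^\infty f(r\gamma)\,r^{d-1}\,dr\,d\sigma(\gamma).
\]

3. Elementary Theory of the Fourier Transform in $\mathbb{R}^d$

Definition 3.1. Let $f : \mathbb{R}^d \to \mathbb{C}$ be an infinitely differentiable function. If, for each pair of multi-indices $\alpha$ and $\beta$,
\[
\sup_{x\in\mathbb{R}^d}\left|x^\alpha\left(\frac{\partial}{\partial x}\right)^{\beta} f(x)\right| < \infty,
\]
then we say that $f \in \mathcal{S}(\mathbb{R}^d)$.

Definition 3.2. Given $f \in \mathcal{S}(\mathbb{R}^d)$, we define its Fourier transform $\hat f : \mathbb{R}^d \to \mathbb{C}$ by
\[
\hat f(\xi) := \int_{\mathbb{R}^d} f(x)e^{-2\pi i x\cdot\xi}\,dx.
\]

Proposition 3.3. Let $f \in \mathcal{S}(\mathbb{R}^d)$, $h \in \mathbb{R}^d$, and $\delta > 0$. Then (just as in the one-dimensional case):
i. $f(x+h) \longrightarrow \hat f(\xi)e^{2\pi i\xi\cdot h}$
ii. $f(x)e^{-2\pi i x\cdot h} \longrightarrow \hat f(\xi+h)$
iii. $f(\delta x) \longrightarrow \delta^{-d}\hat f(\xi/\delta)$
iv. $\left(\frac{\partial}{\partial x}\right)^{\alpha} f(x) \longrightarrow (2\pi i\xi)^\alpha\hat f(\xi)$
v. $(-2\pi i x)^\alpha f(x) \longrightarrow \left(\frac{\partial}{\partial\xi}\right)^{\alpha}\hat f(\xi)$
vi. $f(Rx) \longrightarrow \hat f(R\xi)$ for all rotations $R$.
(Margin: interchange of differentiation and multiplication by corresponding monomials, etc.)

Proof of (vi). A calculation:
\[
\mathcal{F}[f(Rx)](\xi) := \int_{\mathbb{R}^d} f(R(x))e^{-2\pi i x\cdot\xi}\,dx
= \int_{\mathbb{R}^d} f(y)e^{-2\pi i(R^{-1}y)\cdot\xi}\,|\det(R^{-1})|\,dy \quad(\text{let } y = Rx)
= \int_{\mathbb{R}^d} f(y)e^{-2\pi i y\cdot(R\xi)}\,dy = \hat f(R\xi),
\]
as desired; the last step uses $R^{-1} = R^t$ and $|\det(R^{-1})| = 1$.

As before, we have the following:

Corollary 3.4. $\mathcal{F}$ maps $\mathcal{S}(\mathbb{R}^d)$ to itself.

3.1. Slight digression: Radial functions.

Definition 3.5.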
A function $f$ on $\mathbb{R}^d$ is called radial if its value is constant on spheres about the origin, i.e., there exists $f_0 : \mathbb{R}_{\ge0} \to \mathbb{C}$ such that $f(x) = f_0(|x|)$.

Remark. Obviously, $f$ is radial $\iff$ $f(Rx) = f(x)$ for all rotations $R$.

Corollary 3.6. If $f$ is radial, then $\hat f$ is also.

Proof. WTS $\hat f(R\xi) = \hat f(\xi)$ for all rotations $R$. By the above,
\[
\hat f(R\xi) = \mathcal{F}[f(Rx)](\xi) = \mathcal{F}(f)(\xi) = \hat f(\xi).
\]

3.2. Fourier inversion and Plancherel on $\mathbb{R}^d$.

Theorem 3.7. Let $f \in \mathcal{S}(\mathbb{R}^d)$. Then the Fourier inversion formula
\[
f(x) = \int_{\mathbb{R}^d}\hat f(\xi)e^{2\pi i x\cdot\xi}\,d\xi
\]
holds, as does the Plancherel theorem, $\|f\|_2 = \|\hat f\|_2$.

Proof. The proof is basically analogous to that of the one-dimensional case:
i. First one shows (via iteration) that the $d$-dimensional Gaussian is an eigenfunction of $\mathcal{F}$, i.e.,
\[
\mathcal{F}(e^{-\pi|x|^2})(\xi) = e^{-\pi|\xi|^2}.
\]
ii. Using the interaction of dilation and $\mathcal{F}$, we next see that
\[
\mathcal{F}[e^{-\pi\delta|x|^2}](\xi) = \frac{1}{\delta^{d/2}}e^{-\pi|\xi|^2/\delta}.
\]
iii. Then, letting $K_\delta(x) = \frac{1}{\delta^{d/2}}e^{-\pi|x|^2/\delta}$, one shows that $\{K_\delta\}$ is an approximation of the identity.
iv. Multiplication formula: for $f, g \in \mathcal{S}(\mathbb{R}^d)$,
\[
\int_{\mathbb{R}^d} f\hat g = \int_{\mathbb{R}^d}\hat f g
\]
(proof obtained via Fubini's theorem exactly as in the one-dimensional result).
Once one has (iii) and (iv) above, the proof of the inversion formula follows exactly as in the one-dimensional case; once one has (iv) and the inversion formula, one can analogously play the "hats game" to get Plancherel's theorem (or go via convolutions).

Remark. Not all that interesting. However, what we will explore next, the wave equation in $\mathbb{R}^d\times\mathbb{R}$ and how the theory differs in odd and even dimensions, is quite subtle.

LECTURE 24: THE WAVE EQUATION ON $\mathbb{R}^d\times\mathbb{R}$

1. Radial functions and the Fourier transform

Definition 1.1. A function $f$ on $\mathbb{R}^d$ is called radial if its value is constant on spheres about the origin, i.e., there exists $f_0 : \mathbb{R}_{\ge0} \to \mathbb{C}$ such that $f(x) = f_0(|x|)$.

Remark. Obviously, $f$ is radial $\iff$ $f(Rx) = f(x)$ for all rotations $R$.

Corollary 1.2. If $f$ is radial, then $\hat f$ is also.
Proof. We want to show $\hat f(R\xi) = \hat f(\xi)$ for all rotations $R$. Substituting $x = Ry$ and using $|\det R| = 1$ and $f(Ry) = f(y)$,
\[ \begin{aligned}
\hat f(R\xi) &= \int_{\mathbb{R}^d} f(x)\, e^{-2\pi i R\xi\cdot x}\, dx
= \int_{\mathbb{R}^d} f(x)\, e^{-2\pi i \xi\cdot R^{-1}x}\, dx \\
&= \int_{\mathbb{R}^d} f(Ry)\, e^{-2\pi i \xi\cdot y}\, |\det R|\, dy
= \int_{\mathbb{R}^d} f(y)\, e^{-2\pi i \xi\cdot y}\, dy = \hat f(\xi).
\end{aligned} \]

2. The Wave Equation

Definition 2.1. We define the $d$-dimensional Laplacian by
\[ \Delta = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_d^2}. \]

Definition 2.2. The Cauchy problem for the wave equation is the following:
\[ \Delta u = \frac{\partial^2 u}{\partial t^2}, \qquad u(x,0) = f(x), \qquad \frac{\partial u}{\partial t}(x,0) = g(x). \]

The formal argument to find a solution: As usual, we first run a formal argument. Taking the Fourier transform in the first variable, we see that the wave equation becomes
\[ -4\pi^2(\xi_1^2 + \cdots + \xi_d^2)\,\hat u(\xi,t) = \frac{\partial^2 \hat u}{\partial t^2}(\xi,t), \]
where the hat indicates the Fourier transform in the first variable. For each fixed $\xi$, then, one obtains an ordinary differential equation in $t$ with solution
\[ \hat u(\xi,t) = A(\xi)\cos(2\pi|\xi|t) + B(\xi)\sin(2\pi|\xi|t). \]
As usual, taking the Fourier transforms of the initial conditions yields
\[ \hat u(\xi,0) = \hat f(\xi), \qquad \frac{\partial \hat u}{\partial t}(\xi,0) = \hat g(\xi); \]
then, of course, $A(\xi) = \hat f(\xi)$ and $2\pi|\xi|B(\xi) = \hat g(\xi)$. Thus the solution is expected to be
\[ \hat u(\xi,t) = \hat f(\xi)\cos(2\pi|\xi|t) + \frac{\hat g(\xi)}{2\pi|\xi|}\sin(2\pi|\xi|t); \]
and, in fact, it is:

Theorem 2.3. Given the Cauchy problem for the wave equation
\[ \Delta u = \frac{\partial^2 u}{\partial t^2}, \qquad u(x,0) = f(x), \qquad \frac{\partial u}{\partial t}(x,0) = g(x), \]
the following is a solution:
\[ u(x,t) := \int_{\mathbb{R}^d} \left[ \hat f(\xi)\cos(2\pi|\xi|t) + \frac{\hat g(\xi)}{2\pi|\xi|}\sin(2\pi|\xi|t) \right] e^{2\pi i x\cdot\xi}\, d\xi. \]

Proof. The proof is an exercise in differentiation, using the fact that we can differentiate under the integral sign. First, differentiating with respect to $x_1, \dots, x_d$, we get
\[ \begin{aligned}
\Delta u(x,t) &= \int_{\mathbb{R}^d} \left[ \hat f(\xi)\cos(2\pi|\xi|t) + \frac{\hat g(\xi)}{2\pi|\xi|}\sin(2\pi|\xi|t) \right] (2\pi i)^2 \Big(\sum_{k=1}^d \xi_k^2\Big) e^{2\pi i x\cdot\xi}\, d\xi \\
&= \int_{\mathbb{R}^d} \left[ \hat f(\xi)\cos(2\pi|\xi|t) + \frac{\hat g(\xi)}{2\pi|\xi|}\sin(2\pi|\xi|t) \right] (-4\pi^2|\xi|^2)\, e^{2\pi i x\cdot\xi}\, d\xi.
\end{aligned} \]
Differentiating $u(x,t)$ twice with respect to $t$ yields the same expression, so our guessed solution does solve the wave equation. Checking the initial conditions is equally clear:
\[ u(x,0) = \int_{\mathbb{R}^d} \hat f(\xi)\, e^{2\pi i x\cdot\xi}\, d\xi = f(x) \]
by Fourier inversion. Similarly,
\[ \frac{\partial u}{\partial t}(x,t) = \int_{\mathbb{R}^d} \left[ \hat f(\xi)(-2\pi|\xi|)\sin(2\pi|\xi|t) + (2\pi|\xi|)\frac{\hat g(\xi)}{2\pi|\xi|}\cos(2\pi|\xi|t) \right] e^{2\pi i x\cdot\xi}\, d\xi; \]
evaluating at $t = 0$ shows the initial velocity condition
\[ \frac{\partial u}{\partial t}(x,0) = g(x) \]
is also satisfied.

Remark. In fact, the solution is unique. This fact can be shown via a conservation of energy argument.

3. Conservation of Energy

Definition 3.1. Let $u$ be a solution of the wave equation. We define the (total = kinetic + potential) energy of the solution as
\[ E(t) := \int_{\mathbb{R}^d} \left|\frac{\partial u}{\partial t}\right|^2 + \left|\frac{\partial u}{\partial x_1}\right|^2 + \cdots + \left|\frac{\partial u}{\partial x_d}\right|^2 dx. \]

Lemma 3.2. Let $a, b \in \mathbb{C}$; let $\alpha \in \mathbb{R}$. Then
\[ |a\cos\alpha + b\sin\alpha|^2 + |-a\sin\alpha + b\cos\alpha|^2 = |a|^2 + |b|^2. \]

Proof. (The Pythagorean theorem again.) The vectors $(\cos\alpha, \sin\alpha)$ and $(-\sin\alpha, \cos\alpha)$ are orthonormal in $\mathbb{C}^2$; so by the Pythagorean theorem,
\[ |(a,b)\cdot(\cos\alpha,\sin\alpha)|^2 + |(a,b)\cdot(-\sin\alpha,\cos\alpha)|^2 = |(a,b)|^2. \]

Theorem 3.3. For the aforementioned solution $u$ of the wave equation, $E(t)$ is constant.

Proof. Recall
\[ \hat u(\xi,t) = \hat f(\xi)\cos(2\pi|\xi|t) + \frac{\hat g(\xi)}{2\pi|\xi|}\sin(2\pi|\xi|t); \]
thus, using Plancherel's theorem (note that only first derivatives appear),
\[ \int_{\mathbb{R}^d} \sum_{j=1}^d \left|\frac{\partial u}{\partial x_j}\right|^2 dx = \int_{\mathbb{R}^d} \left| 2\pi|\xi|\hat f(\xi)\cos(2\pi|\xi|t) + \hat g(\xi)\sin(2\pi|\xi|t) \right|^2 d\xi \]
and
\[ \int_{\mathbb{R}^d} \left|\frac{\partial u}{\partial t}\right|^2 dx = \int_{\mathbb{R}^d} \left| -2\pi|\xi|\hat f(\xi)\sin(2\pi|\xi|t) + \hat g(\xi)\cos(2\pi|\xi|t) \right|^2 d\xi. \]
Then, by the lemma (with $a = 2\pi|\xi|\hat f(\xi)$, $b = \hat g(\xi)$), we get
\[ E(t) = \int_{\mathbb{R}^d} \left|\frac{\partial u}{\partial t}\right|^2 + \left|\frac{\partial u}{\partial x_1}\right|^2 + \cdots + \left|\frac{\partial u}{\partial x_d}\right|^2 dx = \int_{\mathbb{R}^d} \left( 4\pi^2|\xi|^2|\hat f(\xi)|^2 + |\hat g(\xi)|^2 \right) d\xi, \]
which is independent of $t$.

4. Wave Equation in $\mathbb{R}^3 \times \mathbb{R}$

Motivation. Recall d'Alembert's solution to the wave equation:
\[ u(x,t) = \frac{f(x+t) + f(x-t)}{2} + \frac{1}{2}\int_{x-t}^{x+t} g(y)\, dy. \]
"This suggests a generalization to higher dimensions, where we might expect to write the solution of our problem as averages of the initial data."

Definition 4.1. Let $f : \mathbb{R}^3 \to \mathbb{C}$. We define the spherical mean of $f$ at $x$ with radius $t$ by
\[ M_t(f)(x) = \frac{1}{4\pi}\int_{S^2} f(x - t\gamma)\, d\sigma(\gamma), \]
that is, the average of $f$ over the sphere of radius $t$ centered at $x$. [Margin: the spherical mean is a first example of a convolution of a function with a measure.]

Some useful lemmata:

Lemma 4.2. Let $f \in \mathcal{S}(\mathbb{R}^3)$. Then for each fixed $t$, $M_t(f) \in \mathcal{S}(\mathbb{R}^3)$. Further, $M_t(f)$ is infinitely differentiable in $t$, and each $t$-derivative is in $\mathcal{S}(\mathbb{R}^3)$.

Lemma 4.3 (Fourier transform of the surface measure).
\[ \frac{1}{4\pi}\int_{S^2} e^{-2\pi i \xi\cdot\gamma}\, d\sigma(\gamma) = \frac{\sin(2\pi|\xi|)}{2\pi|\xi|}. \]

Remark. Denote the left-hand side by $\widehat{d\sigma}(\xi)$. It is the Fourier transform of a radial "function" (the surface measure), and the lemma shows that it is indeed radial.

Proof. We first observe the formula is true for $\xi = (0,0,\rho)$ for any $\rho > 0$ ($\rho = 0$ is immediate); then we show the left-hand side is a radial function. By definition,
\[ \frac{1}{4\pi}\int_{S^2} e^{-2\pi i \xi\cdot\gamma}\, d\sigma(\gamma) := \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi} e^{-2\pi i \xi\cdot\gamma} \sin\theta\, d\theta\, d\varphi \]
(where $\gamma = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$ on the right-hand side). For $\xi = (0,0,\rho)$ this equals
\[ \begin{aligned}
\frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi} e^{-2\pi i \rho\cos\theta}\sin\theta\, d\theta\, d\varphi
&= \frac{1}{2}\int_0^{\pi} e^{-2\pi i \rho\cos\theta}\sin\theta\, d\theta \\
&= \frac{1}{2}\int_{-1}^{1} e^{2\pi i \rho u}\, du \qquad \text{(change of variables: } u = -\cos\theta) \\
&= \frac{1}{4\pi i \rho}\Big[ e^{2\pi i \rho u} \Big]_{-1}^{1} = \frac{\sin(2\pi\rho)}{2\pi\rho};
\end{aligned} \]
so the lemma is proven for $\xi = (0,0,\rho)$ for $\rho \ge 0$. (To be continued....)

LECTURE 25: THE WAVE EQUATION ON $\mathbb{R}^3 \times \mathbb{R}$

In order to solve the initial value problem for the wave equation in dimensions bigger than one, we shall use the method of spherical means, due to Hadamard. The intuitive grounds for this methodology lie in our conception of waves as being made of a superposition of spherically symmetric fronts emanating from point sources. Huygens was a pioneer of this view of waves, which he used to show that the laws of optics suggested that light was a wave phenomenon. (http://www.math.nyu.edu/faculty/tabak/PDEs/WE.pdf)

1. Wave Equation in $\mathbb{R}^3 \times \mathbb{R}$

Motivation. Recall d'Alembert's solution to the wave equation:
\[ u(x,t) = \frac{f(x+t) + f(x-t)}{2} + \frac{1}{2}\int_{x-t}^{x+t} g(y)\, dy. \]
"This suggests a generalization to higher dimensions, where we might expect to write the solution of our problem as averages of the initial data."

Definition 1.1. Let $f : \mathbb{R}^3 \to \mathbb{C}$. We define the spherical mean of $f$ at $x$ with radius $t$ by
\[ M_t(f)(x) = \frac{1}{4\pi}\int_{S^2} f(x - t\gamma)\, d\sigma(\gamma), \]
that is, the average of $f$ over the sphere of radius $t$ centered at $x$.

Some useful lemmata:

Lemma 1.2. Let $f \in \mathcal{S}(\mathbb{R}^3)$. Then for each fixed $t$, $M_t(f) \in \mathcal{S}(\mathbb{R}^3)$. Further, $M_t(f)$ is infinitely differentiable in $t$, and each $t$-derivative is in $\mathcal{S}(\mathbb{R}^3)$.

Lemma 1.3 (Fourier transform of the surface measure).
\[ \frac{1}{4\pi}\int_{S^2} e^{-2\pi i \xi\cdot\gamma}\, d\sigma(\gamma) = \frac{\sin(2\pi|\xi|)}{2\pi|\xi|}. \]

Remark. Denote the left-hand side by $\widehat{d\sigma}(\xi)$. It is the Fourier transform of a radial "function" (the surface measure), and the lemma shows that it is indeed radial.

Proof. We first observe the formula is true for $\xi = (0,0,\rho)$ for any $\rho > 0$ ($\rho = 0$ is immediate); then we show the left-hand side is a radial function. [Margin: both sides of the equation are radial functions, so proving it for a single vector proves it for the entire sphere.] By definition,
\[ \frac{1}{4\pi}\int_{S^2} e^{-2\pi i \xi\cdot\gamma}\, d\sigma(\gamma) := \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi} e^{-2\pi i \xi\cdot\gamma} \sin\theta\, d\theta\, d\varphi \]
(where $\gamma = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$ on the right-hand side). For $\xi = (0,0,\rho)$ this equals
\[ \begin{aligned}
\frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi} e^{-2\pi i \rho\cos\theta}\sin\theta\, d\theta\, d\varphi
&= \frac{1}{2}\int_0^{\pi} e^{-2\pi i \rho\cos\theta}\sin\theta\, d\theta \\
&= \frac{1}{2}\int_{-1}^{1} e^{2\pi i \rho u}\, du \qquad \text{(change of variables: } u = -\cos\theta) \\
&= \frac{1}{4\pi i \rho}\Big[ e^{2\pi i \rho u} \Big]_{-1}^{1} = \frac{\sin(2\pi\rho)}{2\pi\rho};
\end{aligned} \]
so the lemma is proven for $\xi = (0,0,\rho)$ for $\rho \ge 0$.

At this point, it suffices to show that the Fourier transform of the surface measure is a radial function. We recall the fact (see last page of appendix) that given any function $f$ of moderate decrease and any rotation $R$,
\[ \int_{S^{d-1}} f(R(\gamma))\, d\sigma(\gamma) = \int_{S^{d-1}} f(\gamma)\, d\sigma(\gamma). \]
Then, given any rotation $R$,
\[ \begin{aligned}
\widehat{d\sigma}(R\xi) &:= \frac{1}{4\pi}\int_{S^2} e^{-2\pi i (R\xi)\cdot\gamma}\, d\sigma(\gamma)
= \frac{1}{4\pi}\int_{S^2} e^{-2\pi i \xi\cdot R^{-1}\gamma}\, d\sigma(\gamma) \\
&= \frac{1}{4\pi}\int_{S^2} e^{-2\pi i \xi\cdot\gamma}\, d\sigma(\gamma) = \widehat{d\sigma}(\xi).
\end{aligned} \]

The next lemma basically says that the Fourier transform of the spherical averaging operator (which, for $t = 1$, is the convolution $f * d\sigma$) is the product of Fourier transforms, $\hat f(\xi)\,\widehat{d\sigma}(t\xi)$.

Lemma 1.4.
\[ \widehat{M_t(f)}(\xi) = \hat f(\xi)\,\frac{\sin(2\pi|\xi|t)}{2\pi|\xi|t}. \]

Proof. (A calculation using the previous lemma.) By definition,
\[ \begin{aligned}
\widehat{M_t(f)}(\xi) &:= \int_{\mathbb{R}^3} e^{-2\pi i x\cdot\xi}\, \frac{1}{4\pi}\int_{S^2} f(x - \gamma t)\, d\sigma(\gamma)\, dx \\
&= \frac{1}{4\pi}\int_{S^2}\int_{\mathbb{R}^3} f(x - \gamma t)\, e^{-2\pi i x\cdot\xi}\, dx\, d\sigma(\gamma) \\
&= \frac{1}{4\pi}\int_{S^2}\int_{\mathbb{R}^3} f(y)\, e^{-2\pi i (y + \gamma t)\cdot\xi}\, dy\, d\sigma(\gamma) \qquad (\text{let } y = x - \gamma t) \\
&= \hat f(\xi)\, \frac{1}{4\pi}\int_{S^2} e^{-2\pi i (\gamma t)\cdot\xi}\, d\sigma(\gamma) \\
&= \hat f(\xi)\, \frac{1}{4\pi}\int_{S^2} e^{-2\pi i \gamma\cdot t\xi}\, d\sigma(\gamma) = \hat f(\xi)\,\frac{\sin(2\pi|\xi|t)}{2\pi|\xi|t}
\end{aligned} \]
by the previous lemma.

Once we have the above lemmata, the solution becomes clear:

Theorem 1.5. In $\mathbb{R}^3 \times \mathbb{R}$, the solution to the Cauchy problem for the wave equation is
\[ u(x,t) = \frac{\partial}{\partial t}\big(t M_t(f)(x)\big) + t M_t(g)(x). \]

Proof. We break the problem into two subproblems: the case when $g = 0$, and the case when $f = 0$. Using the Fourier inversion expression of the solution that we worked out before, we see in the first case (a "Calculus I trick": $\cos$ is the $t$-derivative of $t$ times the sinc factor)
\[ \begin{aligned}
u(x,t) &= \int_{\mathbb{R}^3} \hat f(\xi)\cos(2\pi|\xi|t)\, e^{2\pi i x\cdot\xi}\, d\xi \\
&= \frac{\partial}{\partial t}\left( t \int_{\mathbb{R}^3} \hat f(\xi)\,\frac{\sin(2\pi|\xi|t)}{2\pi|\xi|t}\, e^{2\pi i x\cdot\xi}\, d\xi \right) \\
&= \frac{\partial}{\partial t}\big(t M_t(f)(x)\big),
\end{aligned} \]
and in the second case (an "elementary school arithmetic trick": multiply and divide by $t$)
\[ \begin{aligned}
u(x,t) &= \int_{\mathbb{R}^3} \hat g(\xi)\,\frac{\sin(2\pi|\xi|t)}{2\pi|\xi|}\, e^{2\pi i x\cdot\xi}\, d\xi \\
&= t \int_{\mathbb{R}^3} \hat g(\xi)\,\frac{\sin(2\pi|\xi|t)}{2\pi|\xi|t}\, e^{2\pi i x\cdot\xi}\, d\xi \\
&= t M_t(g)(x).
\end{aligned} \]
The solution to the general problem (for general $f, g \in \mathcal{S}(\mathbb{R}^3)$) is then the superposition of these two cases.

2. Cool observation about the solution: Huygens' Principle

Considering the form of the solution given above, i.e., that
\[ u(x,t) = \frac{\partial}{\partial t}\big(t M_t(f)(x)\big) + t M_t(g)(x), \]
we see that the solution at $(x,t)$ depends on the averages of $f$ and $g$ (that is, data on the boundary $t = 0$) over spheres (in $\mathbb{R}^3$) centered at $x$ of radius $t$; equivalently, "the data at a point $x_0$ in the plane $t = 0$ influences the solution on the boundary of a forward light cone originating at $x_0$." (See http://en.wikipedia.org/wiki/Huygens-Fresnel_principle)

3. The Wave Equation in $\mathbb{R}^2 \times \mathbb{R}$: Hadamard's Method of Descent

Definition 3.1. We define, for $F : \mathbb{R}^2 \to \mathbb{C}$, a weighted average $\widetilde{M}_t(F)(x)$ over the disk of radius $t$ centered at $x \in \mathbb{R}^2$ by
\[ \widetilde{M}_t(F)(x) := \frac{1}{2\pi}\int_{|y|\le 1} F(x - ty)\,\frac{1}{(1 - |y|^2)^{1/2}}\, dy. \]

Theorem 3.2. Let $f, g \in \mathcal{S}(\mathbb{R}^2)$ be initial data for the Cauchy problem for the wave equation on $\mathbb{R}^2 \times \mathbb{R}$. Then a solution is
\[ u(x,t) = \frac{\partial}{\partial t}\big(t \widetilde{M}_t(f)(x)\big) + t \widetilde{M}_t(g)(x). \]

Proof. We turn the two-dimensional problem into a three-dimensional one: we use $f, g \in \mathcal{S}(\mathbb{R}^2)$ to create a Cauchy problem for the wave equation in $\mathbb{R}^3 \times \mathbb{R}$ as follows. Fix some $T > 0$, and let $\eta \in \mathcal{S}(\mathbb{R})$ be a (bump) function such that $\eta(x) = 1$ whenever $|x| \le 3T$. We create $f^\flat, g^\flat \in \mathcal{S}(\mathbb{R}^3)$ by defining
\[ f^\flat(x_1,x_2,x_3) := f(x_1,x_2)\,\eta(x_3), \qquad g^\flat(x_1,x_2,x_3) := g(x_1,x_2)\,\eta(x_3). \]
Let, now, $u^\flat$ be the solution to the Cauchy problem for the wave equation on $\mathbb{R}^3 \times \mathbb{R}$ with initial data $f^\flat, g^\flat$. By Huygens' principle, we see that for $|t| \le T$, $u^\flat(x,t)$ is constant in $x_3$ for all $|x_3| \le T$: after all, the backwards light cone for such $(x,t)$ is contained in $\mathbb{R}^2 \times [-3T, 3T]$, over which region the initial data $f^\flat, g^\flat$ is constant in $x_3$. (Draw picture: the solution depends on data which is constant in $x_3$.) We can already solve the $\mathbb{R}^3 \times \mathbb{R}$ equation; ignoring the third variable (the solution is constant in that variable) gives the $\mathbb{R}^2 \times \mathbb{R}$ solution.
Now, define
\[ u(x_1,x_2,t) := u^\flat(x_1,x_2,0,t). \]
This $u$ solves the 2-dimensional Cauchy problem for $|t| < T$; let's call it $u_T$. Notice now that if we take $T_2 > T_1$, then $u_{T_2}$ agrees with $u_{T_1}$ for $|t| < T_1$. Since $T$ was arbitrary, we thus obtain a well-defined solution $u(x_1,x_2,t)$ for all $t > 0$.

Now we need to show that the solution actually has the desired form; that is, that the spherical averages of the data extended to $\mathbb{R}^3$ are the weighted averages of the original $\mathbb{R}^2$ data.

Lemma 3.3. Let $H$ be a function on $\mathbb{R}^3$. If there exists some two-variable function $h$ such that $H(x_1,x_2,x_3) = h(x_1,x_2)$, then
\[ M_t(H)(x_1,x_2,0) = \widetilde{M}_t(h)(x_1,x_2). \]

Proof of lemma. (A Calculus II computation.) By definition,
\[ \begin{aligned}
M_t(H)(x_1,x_2,0) &= \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi} h(x_1 - t\sin\theta\cos\varphi,\; x_2 - t\sin\theta\sin\varphi)\,\sin\theta\, d\theta\, d\varphi \\
&= \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi/2} h(x_1 - t\sin\theta\cos\varphi,\; x_2 - t\sin\theta\sin\varphi)\,\sin\theta\, d\theta\, d\varphi \\
&\quad + \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_{\pi/2}^{\pi} h(x_1 - t\sin\theta\cos\varphi,\; x_2 - t\sin\theta\sin\varphi)\,\sin\theta\, d\theta\, d\varphi.
\end{aligned} \]
Letting $r = \sin\theta$ in each of the two integrals (each contributes the same amount), we get
\[ \begin{aligned}
M_t(H)(x_1,x_2,0) &= \frac{1}{2\pi}\int_0^{2\pi}\!\!\int_0^{1} h(x_1 - tr\cos\varphi,\; x_2 - tr\sin\varphi)\,\frac{1}{\sqrt{1 - r^2}}\, r\, dr\, d\varphi \\
&= \frac{1}{2\pi}\int_{|y|\le 1} h(x - ty)\,\frac{1}{\sqrt{1 - |y|^2}}\, dy = \widetilde{M}_t(h)(x),
\end{aligned} \]
as desired (taking $y = (r\cos\varphi, r\sin\varphi)$ in the last change of variables).

Once we have the above lemma, since
\[ \begin{aligned}
u(x_1,x_2,t) &:= u^\flat(x_1,x_2,0,t) \\
&= \frac{\partial}{\partial t}\big(t M_t(f^\flat)(x_1,x_2,0)\big) + t M_t(g^\flat)(x_1,x_2,0) \\
&= \frac{\partial}{\partial t}\big(t \widetilde{M}_t(f)(x_1,x_2)\big) + t \widetilde{M}_t(g)(x_1,x_2),
\end{aligned} \]
our solution indeed has the form that we claimed.

4. Comments about the solutions

Since the propagation of light is governed by the three-dimensional wave equation, if at $t = 0$ a point of light flashes at the origin, after a finite amount of time an observer will see the flash only for an instant.
However, if we drop a stone in a lake, after a finite amount of time any point on the surface will begin to undulate and will continue to do so indefinitely (in principle). This is the difference between odd and even dimensions.

LECTURE 26: RADIAL SYMMETRY, THE FOURIER TRANSFORM, AND BESSEL FUNCTIONS; THE RADON TRANSFORM

1. $f_0$ and $F_0$, Bessel Functions, and Parity

[Margin: Seems sort of a digression for now....]

Question. Recall: if $f$ is a radial function on $\mathbb{R}^d$ (i.e., $f(x) = f_0(|x|)$), then so is $\hat f$ (i.e., $\hat f(\xi) = F_0(|\xi|)$). What is the relation between $f_0$ and $F_0$?

1.1. Case $\mathbb{R}$. In one dimension, "radial" is the same as even; so let $|\xi| = \rho$ and consider (a trivial calculation using the evenness of the function):
\[ \begin{aligned}
F_0(\rho) := \hat f(\xi) &:= \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x|\xi|}\, dx \\
&= \int_0^{\infty} f_0(r)\left( e^{-2\pi i r|\xi|} + e^{2\pi i r|\xi|} \right) dr \\
&= 2\int_0^{\infty} \cos(2\pi\rho r)\, f_0(r)\, dr.
\end{aligned} \]

1.2. Case $\mathbb{R}^3$. In $\mathbb{R}^3$, we use the polar integration formula, and then the formula for the Fourier transform of the surface element on $S^2$:
\[ \begin{aligned}
F_0(\rho) = \hat f(\xi) &:= \int_{\mathbb{R}^3} f(x)\, e^{-2\pi i x\cdot\xi}\, dx \\
&= \int_0^{\infty} f_0(r) \int_{S^2} e^{-2\pi i r\gamma\cdot\xi}\, d\sigma(\gamma)\, r^2\, dr \\
&= \int_0^{\infty} f_0(r)\, 4\pi\, \widehat{d\sigma}(r\xi)\, r^2\, dr \\
&= 4\pi \int_0^{\infty} f_0(r)\,\frac{\sin(2\pi\rho r)}{2\pi\rho r}\, r^2\, dr \\
&= \frac{2}{\rho}\int_0^{\infty} \sin(2\pi\rho r)\, f_0(r)\, r\, dr.
\end{aligned} \]

1.3. Case $\mathbb{R}^2$.

Definition 1.1. For each $n \in \mathbb{Z}$, let the $n$th Bessel function $J_n(\rho)$ denote the $n$th Fourier coefficient of $e^{i\rho\sin\theta}$; that is,
\[ J_n(\rho) = \frac{1}{2\pi}\int_0^{2\pi} e^{i\rho\sin\theta}\, e^{-in\theta}\, d\theta, \qquad \text{or} \qquad e^{i\rho\sin\theta} = \sum_{n=-\infty}^{\infty} J_n(\rho)\, e^{in\theta}. \]
[Margin: you might have seen Bessel functions in ODEs.]

Then, taking $\xi = (0,-\rho)$ and using polar coordinates $x = (r\cos\theta, r\sin\theta)$ and Fubini's theorem, we have the following relation:
\[ \begin{aligned}
\hat f(\xi) &= \int_{\mathbb{R}^2} f(x)\, e^{-2\pi i x\cdot(0,-\rho)}\, dx \\
&= \int_0^{2\pi}\!\!\int_0^{\infty} f_0(r)\, e^{2\pi i r\rho\sin\theta}\, r\, dr\, d\theta \\
&= 2\pi\int_0^{\infty} J_0(2\pi r\rho)\, f_0(r)\, r\, dr.
\end{aligned} \]
In general, the relation between $f_0$ and $F_0$ involves the Bessel function of order $\frac{d}{2} - 1$ (in odd dimensions one needs a more general definition, covering Bessel functions of fractional order).

2. The Radon Transform

2.1. Various forms of the Radon Transform.

Definition 2.1. Let $\rho$ be a function on $\mathbb{R}^2$. For each line $L \subset \mathbb{R}^2$, we define the Radon transform (the so-called X-ray transform) $X$ of $\rho$ on $L$ by
\[ X(\rho)(L) := \int_L \rho. \]

Definition 2.2. Let $G_{2,3}$ denote the Grassmannian (manifold) of two-dimensional affine planes in $\mathbb{R}^3$.

Definition 2.3 (the Radon transform in $\mathbb{R}^3$). Let $f$ be a function on $\mathbb{R}^3$ ($f \in \mathcal{S}(\mathbb{R}^3)$, say). We define the Radon transform $R(f)$ on $G_{2,3}$ by
\[ R(f)(P) = \int_P f. \]

Remark. Usually "Radon transform" means integration over planes of co-dimension 1. One uses the term $k$-plane Radon transform for integrals over $k$-planes, and the term X-ray transform for integrals over lines.

2.2. Calculation of the Radon Transform in $\mathbb{R}^3$.

Remark. One can parametrize the elements of $G_{2,3}$ as follows: for $\gamma \in S^2$ and $t \in \mathbb{R}$, let $P_{t,\gamma}$ denote the plane $\{x \in \mathbb{R}^3 : x\cdot\gamma = t\}$. I.e., $P_{t,\gamma}$ is the plane orthogonal to $\gamma$ and passing through (the terminal point of) $t\gamma$. Note that $P_{t,\gamma} = P_{-t,-\gamma}$.

Definition 2.4. Given $f \in \mathcal{S}(\mathbb{R}^3)$, we define its integral over $P_{t,\gamma}$ by
\[ \int_{P_{t,\gamma}} f := \int_{\mathbb{R}^2} f(t\gamma + u_1 e_1 + u_2 e_2)\, du_1\, du_2, \]
where $\{\gamma, e_1, e_2\}$ is an orthonormal basis of $\mathbb{R}^3$. (That is, $\{e_1, e_2\}$ is an orthonormal basis of the plane $P_{0,\gamma}$.)

Proposition 2.5. Let $f \in \mathcal{S}(\mathbb{R}^3)$. Then the above definition of $\int_{P_{t,\gamma}} f$ is independent of the choice of $e_1$ and $e_2$.

Proof. Trivial.

Lemma 2.6.
\[ \int_{-\infty}^{\infty}\left( \int_{P_{t,\gamma}} f \right) dt = \int_{\mathbb{R}^3} f(x)\, dx. \]

Proof. (Intuitively obvious; just a calculation.) Let $R$ be the rotation taking the standard basis vectors of $\mathbb{R}^3$ to $\gamma, e_1, e_2$. Then
\[ \begin{aligned}
\int_{\mathbb{R}^3} f(x)\, dx = \int_{\mathbb{R}^3} f(Rx)\, dx
&= \int_{\mathbb{R}^3} f(x_1\gamma + x_2 e_1 + x_3 e_2)\, dx_1\, dx_2\, dx_3 \\
&= \int_{-\infty}^{\infty}\left( \int_{P_{t,\gamma}} f \right) dt.
\end{aligned} \]

Definition 2.7 (Alternate definition of Radon Transform). Let $f \in \mathcal{S}(\mathbb{R}^3)$. Then for $(t,\gamma) \in \mathbb{R} \times S^2$ we define the Radon Transform of $f$ by
\[ R(f)(t,\gamma) := \int_{P_{t,\gamma}} f. \]
[Margin: Again, we note that $R(f)(t,\gamma) = R(f)(-t,-\gamma)$.]

We shall need an appropriately-defined Schwartz space.

Definition 2.8 (the relevant Schwartz class $\mathcal{S}(\mathbb{R} \times S^2)$). Let $F$ be a continuous function on $\mathbb{R} \times S^2$ that is infinitely differentiable in $t$. If
\[ \sup_{t\in\mathbb{R},\,\gamma\in S^2} |t|^k \left| \frac{\partial^\ell F}{\partial t^\ell}(t,\gamma) \right| < \infty \]
for all nonnegative $k, \ell \in \mathbb{Z}$ (i.e., $F(\cdot,\gamma)$ is in $\mathcal{S}(\mathbb{R})$ uniformly in $\gamma$), then we say $F \in \mathcal{S}(\mathbb{R} \times S^2)$.

THE key lemma: the Fourier Slice theorem.

Lemma 2.9 (Fourier Slice, or Projection Theorem). If $f \in \mathcal{S}(\mathbb{R}^3)$, then $R(f)(t,\gamma) \in \mathcal{S}(\mathbb{R})$ for each $\gamma$, and
\[ \widehat{R}(f)(s,\gamma) = \hat f(s\gamma), \]
where the hat denotes the one-dimensional Fourier transform in the first variable on the left-hand side, and the three-dimensional Fourier transform on the right-hand side.

Remark. In other words, if you project $f$ onto the line $\{t\gamma\}$ and then take the 1-D Fourier transform, it's the same as taking the slice of the 3-D Fourier transform parallel to $\gamma$.

LECTURE 27: FOURIER SLICE THEOREM AND RADON INVERSION FORMULA

1. Fourier Slice Theorem

Recall:

Definition 1.1 (Definition of Radon Transform). Let $f \in \mathcal{S}(\mathbb{R}^3)$. Then for $(t,\gamma) \in \mathbb{R} \times S^2$ we define the Radon Transform of $f$ by
\[ R(f)(t,\gamma) := \int_{P_{t,\gamma}} f. \]
[Margin: Again, we note that $R(f)(t,\gamma) = R(f)(-t,-\gamma)$.]

We shall need an appropriately-defined Schwartz space.

Definition 1.2 (the relevant Schwartz class $\mathcal{S}(\mathbb{R} \times S^2)$). Let $F$ be a continuous function on $\mathbb{R} \times S^2$ that is infinitely differentiable in $t$. If
\[ \sup_{t\in\mathbb{R},\,\gamma\in S^2} |t|^k \left| \frac{\partial^\ell F}{\partial t^\ell}(t,\gamma) \right| < \infty \]
for all nonnegative $k, \ell \in \mathbb{Z}$ (i.e., $F(\cdot,\gamma)$ is in $\mathcal{S}(\mathbb{R})$ uniformly in $\gamma$), then we say $F \in \mathcal{S}(\mathbb{R} \times S^2)$.

Lemma 1.3 (Fourier Slice, or Projection Theorem). If $f \in \mathcal{S}(\mathbb{R}^3)$, then $R(f)(t,\gamma) \in \mathcal{S}(\mathbb{R})$ for each $\gamma$, and
\[ \widehat{R}(f)(s,\gamma) = \hat f(s\gamma), \]
where the hat denotes the one-dimensional Fourier transform in the first variable on the left-hand side, and the three-dimensional Fourier transform on the right-hand side.

Remark. In other words, if you project $f$ onto the line $\{t\gamma\}$ and then take the 1-D Fourier transform, it's the same as taking the slice of the 3-D Fourier transform parallel to $\gamma$.

Proof of the formula. Writing $u = u_1 e_1 + u_2 e_2$, so that $\gamma\cdot u = 0$ and $\gamma\cdot(t\gamma + u) = t$,
\[ \begin{aligned}
\widehat{R}(f)(s,\gamma) &:= \int_{-\infty}^{\infty}\left( \int_{P_{t,\gamma}} f \right) e^{-2\pi i s t}\, dt \\
&:= \int_{-\infty}^{\infty}\int_{\mathbb{R}^2} f(t\gamma + u_1 e_1 + u_2 e_2)\, du_1\, du_2\, e^{-2\pi i s t}\, dt \\
&= \int_{\mathbb{R}^3} f(t\gamma + u)\, e^{-2\pi i s t}\, du\, dt \\
&= \int_{\mathbb{R}^3} f(t\gamma + u)\, e^{-2\pi i s\gamma\cdot(t\gamma + u)}\, du\, dt \\
&= \int_{\mathbb{R}^3} f(x)\, e^{-2\pi i s\gamma\cdot x}\, dx \qquad (\text{change of variables: } x = t\gamma + u) \\
&= \hat f(s\gamma).
\end{aligned} \]

Remark. The Fourier Slice theorem plus the (one-dimensional) inversion formula yields the following:
\[ R(f)(t,\gamma) = \int_{-\infty}^{\infty} \hat f(s\gamma)\, e^{2\pi i t s}\, ds; \]
in other words, the Radon transform of $f$ at $(t,\gamma)$ is the one-dimensional inverse Fourier transform, taken along the line $\{s\gamma : s \in \mathbb{R}\}$, of the (3-dimensional) Fourier transform of $f$, evaluated at $t$.

Corollary 1.4 (uniqueness theorem). Let $f, g \in \mathcal{S}(\mathbb{R}^3)$. If $R(f) = R(g)$ then $f = g$.

[Margin: uniqueness is not at all so simple in general; it is a topic of serious study.]

Proof. Since $R(f) = R(g)$, $R(f - g)(t,\gamma) \equiv 0$. But then Fourier slice implies $\widehat{f - g}(s\gamma) = 0$ for all $s, \gamma$; so $\widehat{f - g}(\xi) \equiv 0$. Fourier inversion implies $f - g \equiv 0$, so $f = g$.

2. Filtered Backprojection Inversion formula

2.1. The dual Radon transform.

Definition 2.1. Let $F : \mathbb{R} \times S^2 \to \mathbb{R}$. We define the dual Radon transform $R^*(F) : \mathbb{R}^3 \to \mathbb{R}$ of $F$ by
\[ R^*(F)(x) := \int_{S^2} F(x\cdot\gamma, \gamma)\, d\sigma(\gamma). \]

Remark. What is the dual Radon transform of the Radon transform? If $F(t,\gamma)$ is the Radon transform $R(f)(t,\gamma)$ then
\[ F(x\cdot\gamma, \gamma) = R(f)(x\cdot\gamma, \gamma) = \int_{P_{x\cdot\gamma,\gamma}} f, \]
the integral over the plane with normal vector $\gamma$ at distance $x\cdot\gamma$ from the origin: in other words, the plane passing through $x$ with normal vector $\gamma$. Thus $R^*(F)(x)$ is the integral of $f$ over all planes passing through $x$: the so-called backprojection of $f$.

2.2. Why is it called the dual? Let $V_1 = \mathcal{S}(\mathbb{R}^3)$, with inner product
\[ (f,g)_1 = \int_{\mathbb{R}^3} f g; \]
let $V_2 = \mathcal{S}(\mathbb{R} \times S^2)$ with inner product
\[ (F,G)_2 = \int_{\mathbb{R}}\int_{S^2} F(t,\gamma)\, G(t,\gamma)\, d\sigma(\gamma)\, dt. \]
Then we have the duality formula
\[ (R(f), F)_2 = (f, R^*(F))_1. \]

2.3. The inversion formula.

Theorem 2.2 (a.k.a. the filtered backprojection inversion formula). Let $f \in \mathcal{S}(\mathbb{R}^3)$. Then
\[ \Delta(R^* R(f)) = -8\pi^2 f. \]

Proof. The proof has three ingredients: the Fourier slice theorem, differentiation under the integral sign, and polar integration. As we noted earlier, the Fourier slice theorem (plus the one-dimensional inversion formula) implies
\[ R(f)(t,\gamma) = \int_{-\infty}^{\infty} \hat f(s\gamma)\, e^{2\pi i t s}\, ds; \]
so
\[ R^* R(f)(x) = \int_{S^2}\int_{-\infty}^{\infty} \hat f(s\gamma)\, e^{2\pi i x\cdot\gamma s}\, ds\, d\sigma(\gamma). \]
Then differentiation under the integral sign (plus the fact that $\gamma \in S^2$, so $|\gamma| = 1$) implies
\[ \begin{aligned}
\Delta(R^* R(f))(x) &= \int_{S^2}\int_{-\infty}^{\infty} \hat f(s\gamma)\,(-4\pi^2 s^2)\, e^{2\pi i x\cdot\gamma s}\, ds\, d\sigma(\gamma) \\
&= -4\pi^2\int_{S^2}\int_{0}^{\infty} \hat f(s\gamma)\, e^{2\pi i x\cdot\gamma s}\, s^2\, ds\, d\sigma(\gamma)
 - 4\pi^2\int_{S^2}\int_{-\infty}^{0} \hat f(s\gamma)\, e^{2\pi i x\cdot\gamma s}\, s^2\, ds\, d\sigma(\gamma) \\
&= -8\pi^2\int_{S^2}\int_{0}^{\infty} \hat f(s\gamma)\, e^{2\pi i x\cdot\gamma s}\, s^2\, ds\, d\sigma(\gamma) \\
&= -8\pi^2 f(x).
\end{aligned} \]
Notes on the above calculation:
1. The second-to-last equality follows (by invariance under rotation) since
\[ \begin{aligned}
-4\pi^2\int_{S^2}\int_{-\infty}^{0} \hat f(s\gamma)\, e^{2\pi i x\cdot\gamma s}\, s^2\, ds\, d\sigma(\gamma)
&= -4\pi^2\int_{S^2}\int_{0}^{\infty} \hat f(-s\gamma)\, e^{-2\pi i x\cdot\gamma s}\, s^2\, ds\, d\sigma(\gamma) \\
&= -4\pi^2\int_{S^2}\int_{0}^{\infty} \hat f(s\gamma)\, e^{2\pi i x\cdot\gamma s}\, s^2\, ds\, d\sigma(\gamma),
\end{aligned} \]
where one rotates $\gamma$ to $-\gamma$ in the last step.
2. The last equality follows from (reverting) the polar integration formula:
\[ \int_{S^2}\int_{0}^{\infty} \hat f(s\gamma)\, e^{2\pi i x\cdot\gamma s}\, s^2\, ds\, d\sigma(\gamma) = \int_{\mathbb{R}^3} \hat f(\xi)\, e^{2\pi i x\cdot\xi}\, d\xi = f(x). \]

Remark (reconstruction formula in $\mathbb{R}^d$). In general, the reconstruction formula for the Radon transform is as follows:
\[ \frac{(2\pi)^{1-d}}{2}\, (-\Delta)^{\frac{d-1}{2}} R^*(R(f)) = f, \]
where the fractional Laplacian (or inverse Riesz transform) is defined by
\[ (-\Delta)^{\alpha} f(x) := \int_{\mathbb{R}^d} (2\pi|\xi|)^{2\alpha}\, \hat f(\xi)\, e^{2\pi i \xi\cdot x}\, d\xi \]
for $f \in \mathcal{S}(\mathbb{R}^d)$. Note that the formula is (again) simpler for the odd dimensions than for the even.

Remark. In using the Schwartz class, we have actually swept an entire world of details under a rug.
For the interested reader, please see, for example, http://equinto.math.tufts.edu/research/sc-article.pdf, an introductory article to the field (from the workshop) by Todd Quinto of Tufts (whose thesis advisor was not, as first claimed here, Cormack; see http://equinto.math.tufts.edu/CV/vitatuft.pdf).

3. Wave Equation and Radon Transform

Definition 3.1. Let $F : \mathbb{R} \to \mathbb{R}$ be a $C^2(\mathbb{R})$ function, and $u$ a function on $\mathbb{R}^d \times \mathbb{R}$. If there exists a vector $\gamma \in S^{d-1}$ such that
\[ u(x,t) = F((x\cdot\gamma) - t), \]
then we call $u$ a plane wave.

Remarks. (Plane waves and the wave equation.)
i. Such functions are solutions of the wave equation in $\mathbb{R}^d$.
ii. Such $u$ are constant on planes perpendicular to $\gamma$.
iii. As $t$ increases, the wave travels in the $\gamma$ direction.
iv. In fact, for $d > 1$ the solution of the wave equation can be expressed as an integral of plane waves (by using the Radon transform).

Reference: Helgason, Sigurdur. Radon Transforms and Wave Equations, Springer Lecture Notes in Mathematics, 1996. Full text is available at http://www.springerlink.com/content/d664n43145015772/fulltext.pdf.

LECTURE 28: FINITE FOURIER ANALYSIS: BASIC DEFINITIONS

1. Background knowledge: The Group $\mathbb{Z}/N\mathbb{Z}$

Definition 1.1. Let $z \in \mathbb{C}$, $N \in \mathbb{N}$. If $z^N = 1$, we say $z$ is an $N$th root of unity.

1.1. $\mathbb{Z}(N)$. Note that the set of $N$th roots of unity, which we'll denote by $\mathbb{Z}(N)$, is exactly
\[ \{1,\; e^{2\pi i/N},\; e^{2\pi i 2/N},\; \dots,\; e^{2\pi i (N-1)/N}\}. \]
Also note that $\mathbb{Z}(N)$ is, with complex multiplication as its group law, an abelian group, i.e., it is:
i. (Closed under group law:) If $z, w \in \mathbb{Z}(N)$ then $zw \in \mathbb{Z}(N)$
ii. (Abelian:) If $z, w \in \mathbb{Z}(N)$ then $zw = wz$
iii. (Identity) $1 \in \mathbb{Z}(N)$
iv. (Inverses) If $z \in \mathbb{Z}(N)$ then there exists $z^{-1} \in \mathbb{Z}(N)$ such that $z z^{-1} = 1$.

1.2. $\mathbb{Z}/N\mathbb{Z}$.

Definition 1.2. Let $N \in \mathbb{N}$, and $x, y \in \mathbb{Z}$. If $x - y$ is divisible by $N$, we say $x$ is congruent to $y$ mod $N$:
\[ x \equiv y \mod N. \]
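As a quick numerical sanity check of the two notions just introduced, the following Python sketch builds $\mathbb{Z}(N)$ for a sample $N$, tests closure and inverses up to floating-point tolerance, and confirms that congruent integers give the same root of unity. The helper names (`root_of_unity`, `is_congruent`, `in_Z_N`), the choice $N = 6$, and the tolerances are illustrative assumptions, not part of the notes.

```python
import cmath

def root_of_unity(k, N):
    """Return exp(2*pi*i*k/N), an N-th root of unity."""
    return cmath.exp(2j * cmath.pi * k / N)

def is_congruent(x, y, N):
    """x is congruent to y mod N exactly when N divides x - y."""
    return (x - y) % N == 0

N = 6  # sample value, chosen only for illustration
Z_N = [root_of_unity(k, N) for k in range(N)]

def in_Z_N(z, tol=1e-12):
    """Membership test for Z(N), up to a floating-point tolerance."""
    return any(abs(z - w) < tol for w in Z_N)

# Closure under multiplication and existence of inverses inside Z(N).
closed = all(in_Z_N(z * w) for z in Z_N for w in Z_N)
has_inverses = all(in_Z_N(1 / z) for z in Z_N)

# Congruent integers x and x + 5N label the same root of unity.
same_root = all(
    abs(root_of_unity(x, N) - root_of_unity(x + 5 * N, N)) < 1e-9
    for x in range(-10, 10)
)
```

The last check is the numerical shadow of the fact that $k \mapsto e^{2\pi i k/N}$ depends only on the class of $k$ mod $N$.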
It is obvious that the relation of congruence mod $N$ is
i. (reflexive) $x \equiv x \mod N$ for all $x \in \mathbb{Z}$
ii. (symmetric) $x \equiv y \mod N$ implies $y \equiv x \mod N$
iii. (transitive) $x \equiv y \mod N$ and $y \equiv z \mod N$ imply $x \equiv z \mod N$;
in other words, it is an equivalence relation on $\mathbb{Z}$.

Definition 1.3. One calls the set of equivalence classes on $\mathbb{Z}$ modulo this relation the integers modulo $N$, denoted by $\mathbb{Z}/N\mathbb{Z}$.

Now, for each $x \in \mathbb{Z}$, let $R(x)$ denote the equivalence class corresponding to $x$. One can define a group law (addition) on the set of equivalence classes by defining
\[ R(x) + R(y) = R(x + y). \]
This is well defined: it is easy to see that if $x' \in R(x)$ and $y' \in R(y)$ (i.e., $x' \equiv x \mod N$, $y' \equiv y \mod N$) then $x' + y' \in R(x + y)$ (i.e., $x' + y' \equiv x + y \mod N$).

Proposition 1.4. The association
\[ R(k) \leftrightarrow e^{2\pi i k/N} \]
gives a correspondence (in fact, a group isomorphism) between $\mathbb{Z}/N\mathbb{Z}$ and $\mathbb{Z}(N)$.

Remark. In the same way, we can create an identification between the functions on $\mathbb{Z}/N\mathbb{Z}$ and on $\mathbb{Z}(N)$.

2. The characters on $\mathbb{Z}(N)$

Notation 2.1. We let $V$ denote the ($N$-dimensional) inner product space of functions $F : \mathbb{Z}(N) \to \mathbb{C}$, with the (obvious) inner product
\[ (F,G) := \sum_{k=0}^{N-1} F(k)\,\overline{G(k)} \]
and the associated norm.

Question. What should be the analogues, for Fourier analysis on $\mathbb{Z}(N)$, of the functions $e_n(x) = e^{2\pi i n x}$ on the circle?

The key (desirable) properties of those functions are the following:
i. $\{e_n\}_{n\in\mathbb{Z}}$ is an orthonormal set, with respect to the $L^2$ inner product on the circle
ii. The collection of finite linear combinations of the $\{e_n\}$ is dense in the space of continuous functions on the circle
iii. $e_n(x + y) = e_n(x)\, e_n(y)$
[Margin: Note that if we had a set whose span was dense in $V$, then the span would have to be all of $V$.]

Notation 2.2. Let $\zeta = e^{2\pi i/N}$. We define, for $\ell = 0, \dots, N-1$, the functions $e_\ell$ on $\mathbb{Z}(N)$ by
\[ e_\ell(k) := \zeta^{\ell k} = e^{2\pi i \ell k/N}. \]
[Margin: The characters...? (Not quite.)]

Lemma 2.3. The set $\{e_0, \dots, e_{N-1}\}$ is orthogonal (and thus a basis of $V$).

Proof. A calculation:
\[ (e_m, e_\ell) := \sum_{k=0}^{N-1} \zeta^{mk}\,\zeta^{-\ell k} = \sum_{k=0}^{N-1} \zeta^{(m-\ell)k}, \]
which, if $m \ne \ell$, is a geometric sum equal to
\[ \frac{1 - (\zeta^{m-\ell})^N}{1 - \zeta^{m-\ell}} = 0, \]
and which equals $N$ if $m = \ell$.

Thus if we normalize the $\{e_\ell\}$, we obtain an orthonormal basis of $V$:

Notation 2.4 (Orthonormal basis of $V$). Let $e_\ell^* = \frac{1}{\sqrt{N}}\, e_\ell$.

3. Finite Fourier Analysis

Definition 3.1 (Fourier coefficients on $\mathbb{Z}(N)$). Given $F \in V$, we define the $n$th Fourier coefficient of $F$ by
\[ a_n = \frac{1}{\sqrt{N}}\,(F, e_n^*) = \frac{1}{N}\sum_{k=0}^{N-1} F(k)\, e^{-2\pi i n k/N}. \]

LECTURE 29: FINITE FOURIER ANALYSIS: THE FAST FOURIER TRANSFORM

1. Finite Fourier Analysis

Definition 1.1 (Fourier coefficients on $\mathbb{Z}(N)$). Given $F \in V$, we define the $n$th Fourier coefficient of $F$ by
\[ a_n = \frac{1}{\sqrt{N}}\,(F, e_n^*) = \frac{1}{N}\sum_{k=0}^{N-1} F(k)\, e^{-2\pi i n k/N}. \]

Finite Fourier inversion and Plancherel are (obviously) linear algebra statements about orthonormal bases. ("Obviously" because Fourier analysis = infinite-dimensional linear algebra; so finite-dimensional Fourier analysis = linear algebra.)

Theorem 1.2 (Fourier inversion on $\mathbb{Z}(N)$). Let $F \in V$. Then
\[ F(k) = \sum_{n=0}^{N-1} a_n\, e^{2\pi i n k/N}. \]

Proof. (Do this backwards.)
\[ \sum_{n=0}^{N-1} a_n\, e^{2\pi i n k/N} := \sum_{n=0}^{N-1} \frac{1}{\sqrt{N}}\,(F, e_n^*)\, e^{2\pi i n k/N} = \sum_{n=0}^{N-1} (F, e_n^*)\, e_n^*(k) = F(k), \]
since $\{e_n^*\}$ is an orthonormal basis of $V$.

Theorem 1.3 (Parseval-Plancherel formulae).
\[ \sum_{n=0}^{N-1} |a_n|^2 = \frac{1}{N}\sum_{n=0}^{N-1} |F(n)|^2. \]

Proof. (Again, just properties of orthonormal bases.)
\[ \frac{1}{N}\sum_{n=0}^{N-1} |F(n)|^2 =: \frac{1}{N}\,\|F\|^2 = \frac{1}{N}\sum_{n=0}^{N-1} |(F, e_n^*)|^2 = \sum_{n=0}^{N-1} \left| \frac{1}{\sqrt{N}}\,(F, e_n^*) \right|^2 =: \sum_{n=0}^{N-1} |a_n|^2. \]

2. FFT (the Fast Fourier Transform)

Question. How does one best calculate the Fourier coefficients of a function $F$ on $\mathbb{Z}(N)$?

2.1. Naive calculation. It is easy to see (by straightforward counting) that if one is given the values $F(0), \dots, F(N-1)$ and $\omega_N = e^{-2\pi i/N}$, then calculating the $N$ Fourier coefficients $\{a_k^N(F)\}_{k=0}^{N-1}$ of $F$ on $\mathbb{Z}(N)$ requires at most $2N^2 + N$ operations. After all, by definition,
\[ a_k^N(F) := \frac{1}{N}\sum_{r=0}^{N-1} F(r)\,\omega_N^{kr}. \]
Then
i. Calculating $\omega_N^2, \dots, \omega_N^{N-1}$ takes at most $N - 2$ multiplications.
ii. For each $a_k^N(F)$, one needs at most $1 + N$ multiplications (first by $\frac{1}{N}$, then the $N$ products inside the sum) and $N - 1$ additions (the sum).
iii. Thus we have $N - 2$ operations at the beginning, and then $N \times 2N$ operations afterwards, totalling $2N^2 + N - 2$.
However, one can actually do much better than this.

2.2. The fast Fourier transform.

Theorem 2.1 (FFT). Given $\omega_N = e^{-2\pi i/N}$ (with $N = 2^n$), it takes at most
\[ 4\cdot 2^n n = 4N\log_2(N) = O(N\log N) \]
operations to calculate the Fourier coefficients of a function on $\mathbb{Z}(N)$.

Notation 2.2. Let $\#(M)$ denote the minimum number of operations needed to calculate all the Fourier coefficients of any function on $\mathbb{Z}(M)$.

Lemma 2.3 (The key lemma). If we are given $\omega_{2M} = e^{-2\pi i/(2M)}$ then
\[ \#(2M) \le 2\#(M) + 8M. \]

Proof of lemma. First note that to calculate $\omega_{2M}, \dots, \omega_{2M}^{2M}$ takes no more than $2M$ operations; also observe that $\omega_{2M}^2 = e^{-2\pi i/M} =: \omega_M$. Now, given any function $F$ on $\mathbb{Z}(2M)$, define $F_0$ and $F_1$ on $\mathbb{Z}(M)$ by
\[ F_0(n) = F(2n), \qquad F_1(n) = F(2n+1); \]
by definition, it is possible to calculate the Fourier coefficients of $F_0$ and $F_1$ with at most $\#(M)$ operations each.
Denoting Fourier coefficients corresponding to $\mathbb{Z}(2M)$ and $\mathbb{Z}(M)$ by $a_k^{2M}$ and $a_k^M$ respectively, we claim:
\[ a_k^{2M}(F) = \frac{1}{2}\left( a_k^M(F_0) + a_k^M(F_1)\,\omega_{2M}^k \right); \]
thus after obtaining the Fourier coefficients for $F_0$ and $F_1$, each $a_k^{2M}(F)$ can be obtained in three operations (one multiplication, one addition, one multiplication) and thus
\[ \#(2M) \le 2M + 2\#(M) + 3\times 2M = 2\#(M) + 8M \]
(steps to calculate the $\{\omega_{2M}^k\}_{k=1}^{2M}$, the Fourier coefficients of $F_0$ and $F_1$, and then finally the coefficients $a_k^{2M}(F)$), as desired.

[Margin: The main idea is to divide $F$ into even and odd parts, whose Fourier coefficients can be obtained in $\le \#(M)$ steps, and to notice that the Fourier coefficients of $F$ can be obtained from the Fourier coefficients of the even and odd parts. The first step we do nothing; the second we do even less.]

So the only thing left to do is to prove the claim. But this is basically a tautology: just break the sum into its even and odd terms:
\[ \begin{aligned}
a_k^{2M}(F) &:= \frac{1}{2M}\sum_{r=0}^{2M-1} F(r)\,\omega_{2M}^{kr} \\
&= \frac{1}{2}\left( \frac{1}{M}\sum_{\ell=0}^{M-1} F(2\ell)\,\omega_{2M}^{k(2\ell)} + \frac{1}{M}\sum_{m=0}^{M-1} F(2m+1)\,\omega_{2M}^{k(2m+1)} \right) \\
&= \frac{1}{2}\left( \frac{1}{M}\sum_{\ell=0}^{M-1} F_0(\ell)\,\omega_{2M}^{k(2\ell)} + \frac{1}{M}\sum_{m=0}^{M-1} F_1(m)\,\omega_{2M}^{k(2m+1)} \right).
\end{aligned} \]
Noticing that
\[ \omega_{2M}^{k(2\ell)} = \omega_M^{k\ell} \qquad \text{and} \qquad \omega_{2M}^{k(2m+1)} = \omega_{2M}^{2mk}\,\omega_{2M}^{k} = \omega_M^{mk}\,\omega_{2M}^{k} \]
by the observation in the first paragraph of the proof finishes the claim.

Remark. Notice that the algorithm is built into the proof of the lemma. The FFT was discovered by Cooley and Tukey in 1965; however, in 1984 it was discovered that it (as usual) had already been known to Gauss around 1805.

LECTURE 30: FINITE FOURIER ANALYSIS: THE FAST FOURIER TRANSFORM

1. Proof of FFT

Theorem 1.1 (FFT). Given $\omega_N = e^{-2\pi i/N}$ (with $N = 2^n$), it takes at most
\[ 4\cdot 2^n n = 4N\log_2(N) = O(N\log N) \]
operations to calculate the Fourier coefficients of a function on $\mathbb{Z}(N)$.
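The even/odd recursion $a_k^{2M}(F) = \frac{1}{2}\big(a_k^M(F_0) + a_k^M(F_1)\,\omega_{2M}^k\big)$ from the key lemma can be sketched directly in Python. This is a minimal illustration under stated assumptions, not an optimized FFT: the function names `naive_coefficients` and `fft_coefficients` are hypothetical, the input is a list of the values $F(0), \dots, F(N-1)$ with $N$ a power of 2, and the indices of the half-size coefficients are taken mod $M$ (the coefficients on $\mathbb{Z}(M)$ are $M$-periodic in $k$).

```python
import cmath

def naive_coefficients(F):
    """a_k = (1/N) * sum_r F[r] * omega_N**(k*r), with omega_N = exp(-2*pi*i/N)."""
    N = len(F)
    omega = cmath.exp(-2j * cmath.pi / N)
    return [sum(F[r] * omega ** (k * r) for r in range(N)) / N for k in range(N)]

def fft_coefficients(F):
    """Same coefficients via the even/odd splitting; len(F) must be a power of 2."""
    N = len(F)
    if N == 1:
        return [complex(F[0])]          # on Z(1) the only coefficient is F(0)
    M = N // 2
    a0 = fft_coefficients(F[0::2])      # coefficients of F0(n) = F(2n) on Z(M)
    a1 = fft_coefficients(F[1::2])      # coefficients of F1(n) = F(2n+1) on Z(M)
    omega = cmath.exp(-2j * cmath.pi / N)   # this is omega_{2M}
    # a_k^{2M}(F) = (a_k^M(F0) + a_k^M(F1) * omega_{2M}^k) / 2, with k reduced mod M
    return [(a0[k % M] + a1[k % M] * omega ** k) / 2 for k in range(N)]
```

Comparing `fft_coefficients(F)` with `naive_coefficients(F)` on a short list is a convenient way to see that both compute the same $a_k^N(F)$, the first with far fewer arithmetic operations for large $N$.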
Recall: the key was that calculation of the coefficients on $\mathbb{Z}(2M)$ could be done by calculating the coefficients of related functions (the odd and even parts) on $\mathbb{Z}(M)$:

Lemma 1.2 (The key lemma). If we are given $\omega_{2M} = e^{-2\pi i/(2M)}$ then
\[ \#(2M) \le 2\#(M) + 8M. \]

Once we have the key lemma, the theorem follows immediately by induction:

Proof of Theorem. Let $N = 2^n$; we induct on $n$. In the case $n = 1$ ($N = 2$), by definition
\[ a_0^N(F) = \frac{1}{2}\,[F(1) + F(-1)] \qquad \text{and} \qquad a_1^N(F) = \frac{1}{2}\,[F(1) - F(-1)], \]
which requires $5 < 8$ operations; so the case $n = 1$ is verified. For the inductive step, we assume that the theorem is true for $N = 2^{n-1}$; i.e., that $\#(N) \le 4\cdot 2^{n-1}(n-1)$. Then, by the lemma,
\[ \#(2N) \le 2\cdot 4\cdot 2^{n-1}(n-1) + 8\cdot 2^{n-1} = 4\cdot 2^n n, \]
as desired.

2. Fourier Analysis on (finite) Abelian Groups

Remark. We're just going to set up the necessary background knowledge, and see how the notions of Fourier analysis extend and what is needed to extend them. In fact one can create a theory for locally compact (i.e., each point has a neighborhood contained in a compact set) abelian groups.

Definition 2.1. An abelian group is a set $G$ with a binary operation ("group law") on pairs of elements of $G$,
\[ G \times G \to G, \qquad (a,b) \longmapsto a\cdot b, \]
that satisfies the following properties:
i. (commutativity) $a\cdot b = b\cdot a$ for all $a, b \in G$;
ii. (associativity) $a\cdot(b\cdot c) = (a\cdot b)\cdot c$ for all $a, b, c \in G$;
iii. (identity) there exists an element $u \in G$ such that $a\cdot u = a$ for all $a \in G$; and
iv. (inverses) given any $a \in G$, there exists an element $a^{-1} \in G$ such that $a\cdot a^{-1} = u$.

Examples and non-examples: $(\mathbb{R}, \cdot)$, $(\mathbb{R}^*, \cdot)$, $(\mathbb{R}, +)$, $SL_2(\mathbb{R})$, $SO_2(\mathbb{R})$, etc.

Example: $\mathbb{Z}^*(q)$. [Margin: Remark that multiplication is well-defined on the equivalence classes of $\mathbb{Z}(n)$.]

Definition 2.2. We say $n \in \mathbb{Z}(q)$ ($q \in \mathbb{N}$) is a unit if there exists an $m \in \mathbb{Z}(q)$ such that
\[ nm \equiv 1 \mod q. \tag{1} \]
The set of all units in $Z(q)$ is denoted $Z^*(q)$; it is an abelian group under multiplication mod $q$.

Definition 2.3. The number of elements in a group $G$ is called the order of $G$, and denoted $|G|$.

Definition 2.4. Let $G, H$ be (abelian) groups. If a map $f : G \to H$ "preserves the group law", i.e., satisfies
\[ f(a \cdot b) = f(a) \cdot f(b), \]
then we call $f$ a (group) homomorphism.

Definition 2.5. A homomorphism $f : G \to H$, if one-to-one and onto, is called an isomorphism; in that case one says "$G$ is isomorphic to $H$" and writes $G \approx H$.

Example: the exponential function $\exp : \mathbb{R} \to \mathbb{R}^+$ ($\mathbb{R}$ equipped with addition and $\mathbb{R}^+$ with multiplication, respectively, as the group laws) is a group isomorphism.

Remark. A homomorphism is an isomorphism if and only if it has an inverse that is itself a homomorphism.

2.1. Structure Theorem for Finite Abelian Groups.

Definition 2.6. Given two finite abelian groups $G_1$ and $G_2$, we define the direct product $G_1 \times G_2$ to be the set of cartesian pairs
\[ \{(g_1, g_2) : g_1 \in G_1,\ g_2 \in G_2\} \]
with the group law given by
\[ (g_1, g_2) \cdot (g_1', g_2') := (g_1 \cdot g_1',\ g_2 \cdot g_2'), \]
with which the set becomes (check) itself a finite abelian group.

Theorem 2.7 (Structure theorem for finite abelian groups). Any finite abelian group is isomorphic to a direct product of groups of the form $Z(N)$.

Example (of the structure theorem). Consider $Z^*(8)$, the multiplicative group of units of $\mathbb{Z}/8\mathbb{Z}$; namely, $Z^*(8) = \{1, 3, 5, 7\}$. $Z^*(8)$ can be shown to be isomorphic to $Z(2) \times Z(2)$ via, for example, the mapping under which $\{1, 3, 5, 7\}$ correspond to $\{(0, 0), (1, 0), (0, 1), (1, 1)\}$ respectively.

3. Characters

Notation 3.1. Let $S^1$ denote the unit circle in $\mathbb{C}$, equipped with complex multiplication as the group law.

Definition 3.2. Let $G$ be a finite abelian group, and $e : G \to S^1$.
If for all $a, b \in G$,
\[ e(a \cdot b) = e(a)e(b) \]
(in other words, if $e$ is a homomorphism), then we call $e$ a character.

Lemma 3.3. Let $\widehat{G}$ denote the set of all characters of $G$. If we define, for $e_1, e_2 \in \widehat{G}$, the product $e_1 \cdot e_2$ by
\[ (e_1 \cdot e_2)(a) := e_1(a)e_2(a) \quad\text{for all } a \in G, \]
then $\widehat{G}$ becomes itself an abelian group, called the dual group.

Useful lemma: any multiplicative, nonvanishing map is a character.

Lemma 3.4. Let $G$ be a finite abelian group, and $e : G \to \mathbb{C} \setminus \{0\}$. If $e$ is multiplicative, i.e.,
\[ e(a \cdot b) = e(a)e(b) \quad\text{for all } a, b \in G, \]
then $e$ is a character.

Proof. Notice that the function $|e|$ is bounded both above and below (away from 0) on $G$ (since $G$ is finite). If $|e(g)| > 1$, then $|e(g^n)| = |e(g)|^n$ would go to infinity with $n$: impossible. Similarly for $|e(g)| < 1$. Thus $e$ maps into $S^1$ and is a character.

LECTURE 31: FOURIER ANALYSIS ON FINITE ABELIAN GROUPS

Recall that we had, in the distant past, just introduced the notion of characters and defined the dual group $\widehat{G}$ of characters on $G$.

1. Characters

Definition 1.1. Let $G$ be a finite abelian group, and $e : G \to S^1$. If for all $a, b \in G$,
\[ e(a \cdot b) = e(a)e(b) \]
(in other words, if $e$ is a homomorphism), then we call $e$ a character.

Lemma 1.2. Let $\widehat{G}$ denote the set of all characters of $G$. If we define, for $e_1, e_2 \in \widehat{G}$, the product $e_1 \cdot e_2$ by
\[ (e_1 \cdot e_2)(a) := e_1(a)e_2(a) \quad\text{for all } a \in G, \]
then $\widehat{G}$ becomes itself an abelian group, called the dual group.

Lemma 1.3 (any multiplicative, nonvanishing map is a character). Let $G$ be a finite abelian group, and $e : G \to \mathbb{C} \setminus \{0\}$. If $e$ is multiplicative, i.e.,
\[ e(a \cdot b) = e(a)e(b) \quad\text{for all } a, b \in G, \]
then $e$ is a character.

2. The characters are what we want

Definition 2.1. Given a finite abelian group $G$, let $V$ denote the $|G|$-dimensional (complex) vector space of functions $f : G \to \mathbb{C}$ with inner product given by
\[ (f, g) := \frac{1}{|G|}\sum_{a \in G} f(a)\overline{g(a)} \quad\text{for } f, g \in V. \]

Theorem 2.2. With the above inner product, $\widehat{G}$ is an orthonormal family.
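The orthonormality statement can be checked numerically in the simplest case. The sketch below assumes $G = Z(N)$ under addition, with the standard characters $e_\ell(k) = e^{2\pi i \ell k/N}$ (a concrete choice made for illustration; the theorem itself is about arbitrary finite abelian groups). The names `character`, `inner_product`, and `gram_matrix` are hypothetical helpers.

```python
import cmath

def character(N, l):
    """The map k -> exp(2*pi*i*l*k/N) on Z(N); it sends sums to products."""
    return lambda k: cmath.exp(2j * cmath.pi * l * k / N)

def inner_product(f, g, N):
    """(f, g) = (1/|G|) * sum over a of f(a) * conj(g(a)), for G = Z(N)."""
    return sum(f(a) * g(a).conjugate() for a in range(N)) / N

def gram_matrix(N):
    """Matrix of inner products (e_i, e_j) over the N characters of Z(N)."""
    chars = [character(N, l) for l in range(N)]
    return [[inner_product(e1, e2, N) for e2 in chars] for e1 in chars]
```

Up to rounding error, `gram_matrix(N)` should be the identity matrix, which is exactly what orthonormality of the family says.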
Lemma 2.3 (key lemma: the cancellation property, or moment condition, for characters). Let $e \in \widehat{G}$ be a non-trivial character. Then
\[ \sum_{a \in G} e(a) = 0. \]

Proof. Since $e$ is non-trivial, there exists an element $b$ such that $e(b) \ne 1$. Then
\[ e(b)\sum_{a \in G} e(a) = \sum_{a \in G} e(ba) = \sum_{a \in G} e(a), \]
the last following since multiplying by $b$ is an invertible (and thus 1-1) map of $G$ onto itself. Hence $(e(b) - 1)\sum_{a \in G} e(a) = 0$, and since $e(b) \ne 1$, $\sum_{a \in G} e(a) = 0$.

Proof of the theorem. Let's first show that for $e \in \widehat{G}$, $(e, e) = 1$:
\[ (e, e) := \frac{1}{|G|}\sum_{a \in G} e(a)\overline{e(a)} = \frac{1}{|G|}\sum_{a \in G} |e(a)|^2 = 1, \]
since $e : G \to S^1$; the first step is done. Now it remains to be seen that $(e, e') = 0$ for $e \ne e'$; that is,
\[ \sum_{a \in G} e(a)\overline{e'(a)} = 0. \]
Well, in the group $S^1$, complex conjugation of an element corresponds to taking the inverse of that element (since $z\bar{z} = |z|^2 = 1$ for $z \in S^1$), so we can rewrite the above equivalently as
\[ \sum_{a \in G} e(a)(e'(a))^{-1} = 0. \]
I.e., expressing the above using the group law for $\widehat{G}$, we want to show that
\[ \sum_{a \in G} [e \cdot (e')^{-1}](a) = 0. \]
Since $e \cdot (e')^{-1}$ is by construction ($e \ne e'$) a non-trivial character, the previous lemma finishes the proof.

Remark. Thus the characters are linearly independent elements of $V$. Since $\dim(V) = |G|$, we see $|\widehat{G}| \le |G|$. In fact....

3. $\widehat{G}$ is an ONB of $V$

Theorem 3.1. Let $G$ be a finite abelian group. $\widehat{G}$ is an (orthonormal) basis for the vector space of functions on $G$.

3.1. Some linear algebra: the Spectral Theorem.

Definition 3.2 (unitary transformations). Let $V$ be a $d$-dimensional inner product space and $T : V \to V$ be a linear transformation. If, for all $v, w \in V$,
\[ (Tv, Tw) = (v, w), \]
then we call $T$ unitary.

Theorem 3.3 (The Spectral Theorem: unitary implies diagonalizable). Given any unitary transformation $T : V \to V$, there exists a basis $\{v_1, \dots, v_d\}$ of $V$ of eigenvectors of $T$.

Corollary 3.4 (Simultaneous diagonalization). Let $V$ be a finite-dimensional inner product space, and $\{T_1, \dots, T_k\}$ be a family of linear transformations on $V$. If all $\{T_i\}$ are unitary and commute, then there exists a basis for $V$ consisting of eigenvectors for every $T_i$.

Proof. By induction on $k$. (The base case $k = 1$ is the spectral theorem.) Suppose the corollary is true for any family of $k - 1$ commuting unitary transformations. Consider the family $\{T_1, \dots, T_k\}$. By the spectral theorem, we can decompose $V$ into a direct sum of eigenspaces of the last transformation $T_k$,
\[ V = V_{\lambda_1} \oplus \cdots \oplus V_{\lambda_s}. \]
Now, given any $T_j$, $1 \le j \le k - 1$, we notice that for any $1 \le i \le s$ and any $v \in V_{\lambda_i}$,
\[ T_k T_j(v) = T_j T_k(v) = T_j(\lambda_i v) = \lambda_i T_j(v); \]
that is, $T_j(v) \in V_{\lambda_i}$. In other words, by commutation, each of the $T_j$ preserves each of the eigenspaces of $T_k$.

Now, by the induction hypothesis (applied to the restrictions of $T_1, \dots, T_{k-1}$ to $V_{\lambda_i}$, which are again unitary and commute), the family $\{T_1, \dots, T_{k-1}\}$ is simultaneously diagonalizable on each subspace $V_{\lambda_i}$; i.e., each $V_{\lambda_i}$ is decomposable into a direct sum of subspaces which are eigenspaces for all of the $\{T_1, \dots, T_{k-1}\}$. Those eigenspaces are of course contained in $V_{\lambda_i}$, the eigenspace for $T_k$; so they provide a decomposition of $V$ into eigenspaces for the entire family.

Thus ends the linear algebra portion of this show.

LECTURE 32: FOURIER ANALYSIS ON FINITE ABELIAN GROUPS

1. $\widehat{G}$ is an ONB of $V$

Theorem 1.1. Let $G$ be a finite abelian group. $\widehat{G}$ is an (orthonormal) basis for the vector space of functions on $G$.

Corollary 1.2 (Simultaneous diagonalization). Let $V$ be a finite-dimensional inner product space, and $\{T_1, \dots, T_k\}$ be a family of linear transformations on $V$. If all $\{T_i\}$ are unitary and commute, then there exists a basis for $V$ consisting of eigenvectors for every $T_i$.

Proof of theorem.
Recall that $\dim(V) = |G|$, and that the dual group $\widehat{G}$ of characters has order $|\widehat{G}| \le |G|$. Recall also that the characters form an orthonormal family; to show they form a basis, it suffices to show that $|\widehat{G}| = |G|$.

Sketch: we consider the (finite) family of linear transformations ("left-translation") on $V$, which are not only unitary but also commute (because $G$ is abelian). We shall see that the basis of $V$ which simultaneously diagonalizes the family consists (essentially) of characters: thus the number of characters equals $\dim(V) = |G|$, as desired.

To each $a \in G$, associate a linear transformation $T_a : V \to V$ defined by, for $f \in V$,
\[ (T_a f)(x) = f(a \cdot x) \quad\text{for all } x \in G. \]
With the inner product defined on $V$, we note that
\begin{align*}
(T_a f, T_a g) &:= \frac{1}{|G|}\sum_{b \in G} T_a f(b)\,\overline{T_a g(b)} \\
&= \frac{1}{|G|}\sum_{b \in G} f(a \cdot b)\,\overline{g(a \cdot b)} \\
&= \frac{1}{|G|}\sum_{b \in G} f(b)\,\overline{g(b)} = (f, g);
\end{align*}
i.e., the $T_a$ are all unitary transformations. (Why unitary: if we shift both functions over by the same translation, then the inner product doesn't change.) Further, they commute (since $G$ is abelian); so by the previous corollary, they are simultaneously diagonalizable. Thus we have a basis for $V$ of functions $\{v_b : G \to \mathbb{C}\}_{b \in G}$, each of which is an eigenfunction for all of the $\{T_a\}_{a \in G}$. (Margin note: the previous corollary yields a basis of $V$ of eigenfunctions. In fact, those $|G|$ eigenfunctions, properly normalized, are characters; so we have enough characters to form a basis.) We now

Claim: For each of the $\{v_b : G \to \mathbb{C}\}_{b \in G}$, let
\[ w_b(x) := \frac{v_b(x)}{v_b(u)}, \]
where $u$ denotes the identity element of $G$. Then $w_b$ is a character.

Remark. Why was this the obvious thing to do?
Well, think of what a character of $G$ must do: it must, first, preserve the group law: $\chi(a \cdot x) = \chi(a)\chi(x)$; second, it must preserve the identity: $\chi(u_G) = u_{S^1} = 1$. The first statement is equivalent to saying that $\chi$ must be an eigenfunction for the family of translations $T_a$ (for all $a \in G$); the second forces the normalization required of $w$ above. Can we get such eigenfunctions? Yes, since the $T_a$ are unitary and commute.

Proof of claim: Let's use $v$ to denote one of the basis eigenfunctions, and $w$ for the corresponding normalized eigenfunction. Let $\lambda_a$ denote the eigenvalue of $v$ for $T_a$. We first notice that $v(u) \ne 0$: if $v(u) = 0$, then
\[ v(a) = v(a \cdot u) = T_a v(u) = \lambda_a v(u) = 0 \]
for all $a \in G$, so $v \equiv 0$, which is impossible for a basis element. In fact, $v$ can never vanish, for if $v(a) = 0$, then
\[ v(u) = v(a^{-1} a) = T_{a^{-1}} v(a) = \lambda_{a^{-1}} v(a) = 0. \]
Similarly, $w$ can never vanish. It thus suffices to show that $w : G \to \mathbb{C} \setminus \{0\}$ is multiplicative; the earlier lemma on multiplicative, nonvanishing maps would then imply that it must be a character. Well, we observe that $w$ is multiplicative:
\[ w(a \cdot b) := \frac{v(a \cdot b)}{v(u)} = \frac{T_a v(b)}{v(u)} = \frac{\lambda_a v(b)}{v(u)} = \frac{\lambda_a \lambda_b v(u)}{v(u)} = \lambda_a \lambda_b = w(a)w(b), \]
using the fact that
\[ w(x) := \frac{v(x)}{v(u)} = \frac{T_x v(u)}{v(u)} = \lambda_x; \]
so we're done. (Thus there are in fact $\dim(V) = |G|$ characters, so they do form an ONB of $V$.)

Once we have that the characters form an orthonormal basis of $V$, our Fourier analysis of functions on $G$ follows as follows.

2. Fourier Analysis on Finite Abelian Groups

Definition 2.1 (Fourier coefficients and Fourier series on finite abelian groups). Let $G$ be a finite abelian group. Given any function $f : G \to \mathbb{C}$ and character $e \in \widehat{G}$, we define the Fourier coefficient of $f$ w.r.t. $e$ as
\[ \hat{f}(e) := (f, e) := \frac{1}{|G|}\sum_{a \in G} f(a)\overline{e(a)}. \]
Further, we define the Fourier series of $f$ as $\sum_{e \in \widehat{G}} \hat{f}(e)\,e$.
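These two definitions can be tried out on a concrete group. The sketch below again assumes $G = Z(N)$ with the characters $e_\ell(a) = e^{2\pi i \ell a/N}$ (an illustrative choice, not part of the definition), and represents $f$ as a Python list of its $N$ values; `fourier_coefficient` and `fourier_series` are hypothetical names.

```python
import cmath

def fourier_coefficient(f, l):
    """f_hat(e_l) = (1/N) * sum_a f[a] * conj(e_l(a)), with e_l(a) = exp(2*pi*i*l*a/N)."""
    N = len(f)
    return sum(f[a] * cmath.exp(-2j * cmath.pi * l * a / N)
               for a in range(N)) / N

def fourier_series(f, x):
    """Value at x of the Fourier series: sum over l of f_hat(e_l) * e_l(x)."""
    N = len(f)
    return sum(fourier_coefficient(f, l) * cmath.exp(2j * cmath.pi * l * x / N)
               for l in range(N))
```

Since the characters form an orthonormal basis, `fourier_series(f, x)` should reproduce `f[x]` up to rounding error for every `x`.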
Of course, since the characters form an ONB, we have immediately that $f$ actually equals its Fourier series (Fourier inversion is just the statement that $\widehat{G}$ is an ONB of $V$):

Theorem 2.2 (Fourier inversion). Let $G$ be a finite abelian group. Then given any function $f : G \to \mathbb{C}$, we have that
\[ f = \sum_{e \in \widehat{G}} \hat{f}(e)\,e. \]

As before we also have a Plancherel/Parseval theorem:

Theorem 2.3 (Plancherel/Parseval). Let $G$ be a finite abelian group. Then given any function $f : G \to \mathbb{C}$, we have that
\[ \|f\|^2 = \sum_{e \in \widehat{G}} |\hat{f}(e)|^2. \]

Proof. Using Fourier inversion, we see that
\begin{align*}
\|f\|^2 := (f, f) &= \Big( \sum_{e \in \widehat{G}} \hat{f}(e)\,e,\ \sum_{\epsilon \in \widehat{G}} \hat{f}(\epsilon)\,\epsilon \Big) \\
&= \sum_{e \in \widehat{G}}\sum_{\epsilon \in \widehat{G}} \hat{f}(e)\overline{\hat{f}(\epsilon)}\,(e, \epsilon) = \sum_{e \in \widehat{G}} |\hat{f}(e)|^2,
\end{align*}
since the characters are orthonormal.

3. Dirichlet's theorem

(Margin note: a result that shows the ultimate power of Fourier analysis.)

Theorem 3.1. Let $q, \ell \in \mathbb{N}$. If $q$ and $\ell$ have no common factor, then the sequence
\[ \ell,\ \ell + q,\ \ell + 2q,\ \dots,\ \ell + kq,\ \dots \]
contains infinitely many primes.

4. Background knowledge: The Fundamental Theorem of Arithmetic

Theorem 4.1 (Euclid's algorithm). Let $a, b \in \mathbb{Z}$; $b > 0$. Then there exist unique integers $q$ and $r$, with $0 \le r < b$, such that $a = qb + r$.

Proof. Consider $S := \{a - qb : q \in \mathbb{Z};\ a - qb \ge 0\}$. $S$ is non-empty; let $r = \min S$. Of course $a = qb + r$ and $r \ge 0$; we need to show $r < b$. If $r \ge b$, then $r = b + s$ for some $s \ge 0$; so $b + s = a - qb$, and thus
\[ 0 \le s = a - (q + 1)b < a - qb = r. \]
But then $s \in S$ and $s < r$; contradiction.

Proving uniqueness: suppose that $a = qb + r = q_1 b + r_1$ with $0 \le r, r_1 < b$. Then $(q - q_1)b = r_1 - r$. Since $|\mathrm{LHS}|$ is a (nonnegative) integral multiple of $b$, and $|\mathrm{RHS}| < b$, $|\mathrm{LHS}| = |\mathrm{RHS}| = 0 \cdot b = 0$.

(Margin note: this is the basis of long division. Slight error in text: need not assume $s < r$.)

LECTURE 33: ELEMENTARY NUMBER THEORY

1. Background knowledge: The Fundamental Theorem of Arithmetic

Theorem 1.1 (Euclid's algorithm). Let $a, b \in \mathbb{Z}$; $b > 0$. Then there exist unique integers $q$ and $r$, with $0 \le r < b$, such that $a = qb + r$.

Proof. Consider $S := \{a - qb : q \in \mathbb{Z};\ a - qb \ge 0\}$. $S$ is non-empty; let $r = \min S$.
Of course $a = qb + r$ and $r \ge 0$; we need to show $r < b$. If $r \ge b$, then $r = b + s$ for some $s \ge 0$; so $b + s = a - qb$, and thus
\[ 0 \le s = a - (q + 1)b < a - qb = r. \]
But then $s \in S$ and $s < r$; contradiction.

Proving uniqueness: suppose that $a = qb + r = q_1 b + r_1$ with $0 \le r, r_1 < b$. Then $(q - q_1)b = r_1 - r$. Since $|\mathrm{LHS}|$ is a (nonnegative) integral multiple of $b$, and $|\mathrm{RHS}| < b$, $|\mathrm{LHS}| = |\mathrm{RHS}| = 0 \cdot b = 0$.

(Margin note: this is the basis of long division. Slight error in text: need not assume $s < r$.)

Definitions (elementary concepts from number theory). Let $a, b \in \mathbb{Z}$.
i. If there exists $c \in \mathbb{Z}$ such that $ac = b$, then we say $a$ divides $b$ ($a$ is a divisor of $b$) and write $a \mid b$.
ii. A prime number is a positive integer, greater than 1, which has no positive divisors besides 1 and itself.
iii. Say $a, b \in \mathbb{N}$. The greatest common divisor $\gcd(a, b)$ of $a$ and $b$ is the largest integer that divides both $a$ and $b$.
iv. If $\gcd(a, b) = 1$, we say $a$ and $b$ are relatively prime (i.e., they have no nontrivial common divisor).

Theorem 1.2 ($\gcd(a, b)$ is in the $\mathbb{Z}$-span of $\{a, b\}$). Let $a, b \in \mathbb{N}$. Then there exist $x, y \in \mathbb{Z}$ such that
\[ xa + yb = \gcd(a, b). \]

Proof. Consider the (positive) span of $a$ and $b$:
\[ S := \{xa + yb : x, y \in \mathbb{Z};\ xa + yb > 0\}. \]
Let $s = \min S$; then in fact we claim that $s = \gcd(a, b)$. Why? Well, obviously, any divisor of both $a$ and $b$ will divide $s$; so $s \ge \gcd(a, b)$. So it suffices to show $s \mid a$ and $s \mid b$. Let's show that $s \mid a$. By Euclid's algorithm, $a = qs + r$ with $0 \le r < s$; we want to show that $r = 0$. Let's think about $qs$. Since $s \in S$, $xa + yb = s$ for some $x, y \in \mathbb{Z}$ and thus $qxa + qyb = qs$. Together, that gives us $qax + qby = a - r$; i.e.,
\[ r = (1 - qx)a + (-qy)b. \]
In other words, $r$ is an integer combination of $a$ and $b$ with $0 \le r < s$; if $r$ were positive, we would have $r \in S$ with $r < s$, contradicting $s = \min S$. This forces $r = 0$. Similarly, $s \mid b$; so $s = \gcd(a, b)$.

Corollary 1.3 (characterization of relative prime-ness). Let $a, b \in \mathbb{N}$. $a$ and $b$ are relatively prime if and only if there exist $x, y \in \mathbb{Z}$ such that $ax + by = 1$.

Proof. Obvious: if relatively prime, then the theorem implies there exist such integers.
If such integers exist, then any divisor of both $a$ and $b$ must divide 1; thus the only positive common divisor is 1.

Corollary 1.4 (useful property of relatively prime numbers). Let $a, c$ be relatively prime. If $c \mid ab$, then $c \mid b$.

Proof. $\gcd(a, c) = 1$ means there exist $x, y \in \mathbb{Z}$ such that $xa + yc = 1$. Thus $xab + ycb = b$, and since $c \mid ab$ and (clearly) $c \mid ycb$, $c$ also divides $b$.

Corollary 1.5 (corollary of the previous corollary). Let $p$ be prime. If $p \mid a_1 \cdots a_r$, then $p \mid a_i$ for some $i$.

Proof. Suppose $p \nmid a_1$. Then $p$ and $a_1$ are relatively prime, so by the previous corollary $p \mid a_2 \cdots a_r$. Repeating the argument, eventually $p$ must divide one of the $a_i$.

Theorem 1.6 (Fundamental Theorem of Arithmetic). Every positive integer greater than 1 can be factored uniquely into a product of primes.

Proof. (Margin note: hilarious proof.) First we note that all integers greater than 1 have prime factorizations. For suppose not; consider the set $S$ of integers greater than 1 without prime factorizations, and let $n = \min S$. Since $n$ is not a prime, $n = ab$ for some integers $1 < a, b < n$. Since $n$ was minimal, $a, b$ both have prime factorizations; and so $n$ does: contradiction.

Now suppose $n$ has two prime factorizations, $n = p_1 p_2 \cdots p_r$ and $n = q_1 q_2 \cdots q_s$. Since $p_1 \mid n = q_1 q_2 \cdots q_s$, we know $p_1$ must divide one of the $q_i$ (relabeling, call it $q_1$); since $q_1$ is prime, $p_1 = q_1$. Cancelling and continuing in this manner, we see that $r = s$ and that the factorizations are identical up to permutation.

2. Euclid's proof of the infinitude of primes

Theorem 2.1. There are infinitely many primes.

Proof. By contradiction: suppose there are finitely many, which we label $\{p_1, \dots, p_n\}$. Then consider the number $N := \left(\prod_{i=1}^{n} p_i\right) + 1$. It is larger than any $p_i$, so must be non-prime; thus it has a prime factorization. However, if the prime $p_k$ divides $N$, then it divides $N - \left(\prod_{i=1}^{n} p_i\right) = 1$, which is impossible. Notice that we do need the Fundamental Theorem of Arithmetic for this proof.

Corollary 2.2 (variation on the theme). The number of primes of the form $4k + 3$ is infinite.

Proof.
Suppose there are only finitely many such primes: $\{p_1 = 7, p_2 = 11, \dots, p_n\}$ (leave 3 out of the list for now). Let
\[ N = 4p_1 p_2 \cdots p_n + 3. \]
Observe that the odd primes are either of the form $4k + 1$ or $4k + 3$. Since $N$ is of the form $4k + 3$, and further, $N > p_n$, we know $N$ cannot be prime, and thus has a prime factorization (not containing 2, since $N$ is odd). Observe that
\[ (4m + 1)(4n + 1) = 4(4mn + m + n) + 1; \]
so a product of primes of the form $4k + 1$ is again of that form, and $N$'s prime factors must include a prime of the form $4k + 3$. But, just as in the previous argument, none of the primes in our finite list divide evenly into $N$ (nor does 3, since 3 does not divide $4p_1 p_2 \cdots p_n$).

3. One result about infinite products

Definition 3.1 (infinite product). Given $\{A_n \in \mathbb{R}\}_{n=1}^{\infty}$, we define the infinite product $\prod A_n$ by
\[ \prod_{n=1}^{\infty} A_n = \lim_{N \to \infty}\prod_{n=1}^{N} A_n. \]

The main result we shall need is the following:

Theorem 3.2 (condition implying convergence of product). Let $A_n = 1 + a_n$. If $\sum a_n$ converges absolutely, then $\prod A_n$ converges.

Lemma 3.3. For $|x| < \frac{1}{2}$, $|\ln(1 + x)| \le 2|x|$. (The proof of the lemma is via the power series expansion of $\ln(1 + x)$ around 0.)

Proof of theorem. Since $\sum a_n$ converges, WLOG we may assume $|a_n| < \frac{1}{2}$ for all $n$. Then
\[ \prod_{n=1}^{N} A_n := \prod_{n=1}^{N} (1 + a_n) = \prod_{n=1}^{N} e^{\ln(1 + a_n)} = e^{\sum_{n=1}^{N} \ln(1 + a_n)}. \]
By the lemma, $|\ln(1 + a_n)| \le 2|a_n|$; so (here's where we use absolute convergence) $\sum_{n=1}^{N} \ln(1 + a_n)$ converges to a limit, $B$. Thus, by continuity of the exponential function,
\[ \lim_{N \to \infty}\prod_{n=1}^{N} A_n = \lim_{N \to \infty} e^{\sum_{n=1}^{N} \ln(1 + a_n)} = e^{B}. \]

Remarks.
i. Notice that if the product vanishes, then one of the initial factors (i.e., before $|a_n| < 1/2$) must have been 0, since the infinite tail is $e^{B} \ne 0$.
ii. Notice that if in addition $a_n \ne 1$ for all $n$, then
\[ \prod_{n} \frac{1}{1 - a_n} = \lim_{N \to \infty}\prod_{n=1}^{N} \frac{1}{1 - a_n} = \lim_{N \to \infty}\frac{1}{\prod_{n=1}^{N} (1 - a_n)} \]
also converges.

LECTURE 34: EULER'S PRODUCT FORMULA; FOURIER SERIES FOR CHARACTERISTIC FUNCTIONS OF POINTS

1. Euler's product formula for the Riemann zeta function

Definition 1.1 (zeta function). Let $s > 1$. We define the zeta function by
\[ \zeta(s) := \sum_{n=1}^{\infty} \frac{1}{n^s}. \]

Remark.
That this series converges can be seen via the integral test:
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} \le 1 + \sum_{n=2}^{\infty}\int_{n-1}^{n} \frac{dx}{x^s} = 1 + \int_{1}^{\infty} \frac{dx}{x^s} = 1 + \frac{1}{s - 1}. \]

Theorem 1.2 (Euler's product formula). Let $s > 1$, and let $\wp$ denote the collection of primes. Then
\[ \zeta(s) = \prod_{p \in \wp} \frac{1}{1 - 1/p^s}. \]
(Margin note: this is the analytic version of the fundamental theorem of arithmetic.)

Remark. The intuition: each of the terms in the product can be realized as a geometric series:
\[ \frac{1}{1 - \frac{1}{p^s}} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots; \qquad\text{so}\qquad \prod_{p \in \wp} \frac{1}{1 - 1/p^s} = \prod_{p} \left( 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots \right). \]
Now, by the fundamental theorem of arithmetic, each $n \in \mathbb{N}$ can be written as a product $n = p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m}$, so
\[ \frac{1}{n^s} = \frac{1}{(p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m})^s}. \]
"Thus" (at least formally)
\[ \zeta(s) := \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \left( 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots \right). \]

Proof. Let $N \in \mathbb{N}$, and consider $\sum_{n=1}^{N} \frac{1}{n^s}$. Now, pick any $M > N$. Each integer $n = 1, 2, \dots, N$ has a prime factorization in terms of primes $p \le N$, where $p$ occurs fewer than $M$ times in that factorization. Thus
\[ \sum_{n=1}^{N} \frac{1}{n^s} \le \prod_{p \le N} \left( 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots + \frac{1}{p^{Ms}} \right) \le \prod_{p \le N} \frac{1}{1 - p^{-s}} \le \prod_{p} \frac{1}{1 - p^{-s}}. \]
Letting $N \to \infty$ gives
\[ \zeta(s) \le \prod_{p} \frac{1}{1 - p^{-s}}. \]
Conversely, by the fundamental theorem of arithmetic, we have for any $N$ and $M$,
\[ \prod_{p \le N} \left( 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots + \frac{1}{p^{Ms}} \right) \le \sum_{n=1}^{\infty} \frac{1}{n^s}. \]
Thus, letting $M \to \infty$,
\[ \prod_{p \le N} \frac{1}{1 - p^{-s}} \le \sum_{n=1}^{\infty} \frac{1}{n^s}, \]
and thus, letting $N \to \infty$,
\[ \prod_{p} \frac{1}{1 - p^{-s}} \le \sum_{n=1}^{\infty} \frac{1}{n^s}. \]

Theorem 1.3 (Euler's analytic version of the infinitude of primes). The series $\sum_{p \in \wp} \frac{1}{p}$ diverges. (Margin note: take that, Euclid!)

Lemma 1.4. $\ln(1 + x) = x + E(x)$, where $|E(x)| \le x^2$ for $|x| < 1/2$.

Proof of lemma. Using the power series expansion,
\[ E(x) = \ln(1 + x) - x = -\frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + - \cdots, \]
so
\[ |E(x)| \le \frac{x^2}{2}\left( 1 + |x| + |x|^2 + \cdots \right) \le \frac{x^2}{2}\left( 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots \right) = x^2 \]
for $|x| \le \frac{1}{2}$.

Proof of theorem.
By Euler's formula
\[ \zeta(s) = \prod_{p \in \wp} \frac{1}{1 - p^{-s}}, \qquad\text{so}\qquad \ln \zeta(s) = -\sum_{p \in \wp} \ln\left( 1 - \frac{1}{p^s} \right). \]
By the lemma (margin note: slight error in book: big O mistake),
\[ \ln \zeta(s) = -\sum_{p \in \wp} \left[ -\frac{1}{p^s} + E\left( -\frac{1}{p^s} \right) \right] = \sum_{p \in \wp} \frac{1}{p^s} + C, \]
where $C$ is bounded independently of $s$, since
\[ \left| \sum_{p \in \wp} E\left( -\frac{1}{p^s} \right) \right| \le \sum_{p \in \wp} \frac{1}{p^{2s}} \le \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \]
I.e.,
\[ \sum_{p \in \wp} \frac{1}{p^s} = \ln \zeta(s) - C. \]
Then consider:
\[ \liminf_{s \to 1^+} \zeta(s) := \liminf_{s \to 1^+}\sum_{n=1}^{\infty} \frac{1}{n^s} \ge \liminf_{s \to 1^+}\sum_{n=1}^{M} \frac{1}{n^s} = \sum_{n=1}^{M} \frac{1}{n} \]
for every $M \in \mathbb{N}$. So then (as $|C| \le \pi^2/6$)
\[ \liminf_{s \to 1^+}\sum_{p \in \wp} \frac{1}{p^s} \ge \liminf_{s \to 1^+} \ln \zeta(s) - \frac{\pi^2}{6} = \infty, \]
and thus $\sum_{p \in \wp} \frac{1}{p} \ge \liminf_{s \to 1^+}\sum_{p \in \wp} \frac{1}{p^s} = \infty$ as well.

2. Dirichlet characters and the reduction of the theorem

The above approach showing that there are infinitely many primes was extended by Dirichlet to show that there are an infinitude of primes of the form $p \equiv \ell \bmod q$. It suffices to show that
\[ \sum_{p \in \wp:\ p \equiv \ell \bmod q} \frac{1}{p} \quad\text{diverges.} \]
The proof will involve using the Fourier series for functions on $Z^*(q)$, the units (invertible elements) of $Z(q)$.

Question. What is $Z^*(q)$? $a \in Z(q)$ being invertible means of course that there exists $x \in Z(q)$ such that $xa \equiv 1 \bmod q$, i.e., that $\gcd(a, q) = 1$. In other words, $Z^*(q)$ is precisely the set of (equivalence classes of) numbers relatively prime to $q$.

Definition 2.1 (minor notation: Euler phi function). Define the Euler $\varphi$-function by $\varphi(q) := |Z^*(q)|$.

Characteristic functions of singletons in $Z^*(q)$: let $\delta_\ell : Z^*(q) \to \mathbb{R}$ denote the characteristic function $\chi_{\{\ell\}}$ of the singleton $\ell \in Z^*(q)$, i.e.,
\[ \delta_\ell(n) := \begin{cases} 1 & \text{if } n \equiv \ell \bmod q \\ 0 & \text{otherwise.} \end{cases} \]
Dumb comment: we extend $\delta_\ell$ to all of $Z(q)$ (and thus all of $\mathbb{Z}$; this is just a periodic extension) by defining it as 0 on the non-units of $Z(q)$, i.e.,
\[ \delta_\ell(n) = \begin{cases} \chi_{\{\ell\}}([n]) & \text{if } n \text{ and } q \text{ are relatively prime} \\ 0 & \text{otherwise.} \end{cases} \]

Definition 2.2. Let $G = Z^*(q)$ and let $e \in \widehat{G}$ be a character. We define the Dirichlet character modulo $q$ extending $e$, denoted $\chi = \chi(e) : \mathbb{Z} \to \mathbb{C}$, by
\[ \chi(m) := \begin{cases} e([m]) & \text{if } m \text{ and } q \text{ are relatively prime} \\ 0 & \text{otherwise.} \end{cases} \]
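The objects just defined are easy to compute for small $q$. The following sketch is illustrative only: `units`, `euler_phi`, and `extend_character` are hypothetical helper names, and the character on $Z^*(4) = \{1, 3\}$ used at the end (sending $1 \mapsto 1$, $3 \mapsto -1$) is an assumed example rather than one taken from the notes.

```python
from math import gcd

def units(q):
    """Representatives of Z*(q): residues in 0..q-1 relatively prime to q."""
    return [a for a in range(q) if gcd(a, q) == 1]

def euler_phi(q):
    """Euler phi function: the order of Z*(q)."""
    return len(units(q))

def extend_character(e, q):
    """Dirichlet character mod q extending e.

    e is a dict mapping each unit residue to its (complex) value;
    the extension is q-periodic and vanishes when gcd(m, q) > 1.
    """
    def chi(m):
        r = m % q
        return e[r] if gcd(r, q) == 1 else 0
    return chi

# Assumed example: the nontrivial character on Z*(4) = {1, 3}.
chi = extend_character({1: 1, 3: -1}, 4)
```

With this choice, `chi(m)` is 1 for $m \equiv 1 \bmod 4$, $-1$ for $m \equiv 3 \bmod 4$, and 0 for even $m$.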
We denote the extension of the trivial character of $G$ by $\chi_0$.

Remark. Observe that the Dirichlet characters are multiplicative on all of $\mathbb{Z}$. (Margin note: we are extending characters to all of $\mathbb{Z}$, again periodically.)

Lemma 2.3 (expressing $\delta_\ell$ in terms of characters).
\[ \delta_\ell(n) = \frac{1}{\varphi(q)}\sum_{\chi(e):\ e \in \widehat{G}} \overline{\chi(\ell)}\,\chi(n). \]

Proof. Consider the Fourier coefficients on $G$ of $\delta_\ell$ (the Fourier series of the characteristic function of a singleton):
\[ \widehat{\delta_\ell}(e) := \frac{1}{|G|}\sum_{m \in G} \delta_\ell(m)\overline{e(m)} = \frac{1}{|G|}\overline{e(\ell)}. \]
Thus the Fourier inversion formula yields
\[ \delta_\ell(n) = \sum_{e \in \widehat{G}} \widehat{\delta_\ell}(e)\,e(n) = \frac{1}{|G|}\sum_{e \in \widehat{G}} \overline{e(\ell)}\,e(n). \]
Rewriting the above in terms of the extensions of the functions to $\mathbb{Z}$, we have the desired result.

LECTURE 35: EULER'S PRODUCT FORMULA; FOURIER SERIES FOR CHARACTERISTIC FUNCTIONS OF POINTS

1. Reduction of the problem

Recall: our goal is to show Dirichlet's theorem, i.e., that for $q, \ell$ which are relatively prime,
\[ \sum_{p \in \wp:\ p \equiv \ell \bmod q} \frac{1}{p} \quad\text{diverges.} \]
We reduced the problem as follows. First, we took the characteristic function $\delta_\ell$ of $\{\ell\}$ as a function on $Z^*(q)$, and the characters $e \in \widehat{Z^*(q)}$. By our work in the previous chapter (recall by definition $\varphi(q) := |Z^*(q)|$), we have the following Fourier series expansion:

Lemma 1.1.
\[ \delta_\ell(n) = \frac{1}{\varphi(q)}\sum_{e \in \widehat{G}} \overline{e(\ell)}\,e(n). \]

We then extended the definitions of $\delta_\ell$ and $e$ to all of $\mathbb{Z}$ periodically (defining them as 0 whenever $n$ was not relatively prime with $q$), letting $\chi$ denote the extension of $e$ to $\mathbb{Z}$. Then by the above inversion formula, since $\delta_\ell(p) = 1$ iff $p \equiv \ell \bmod q$,
\[ \sum_{p \in \wp \text{ s.t. } p \equiv \ell \bmod q} \frac{1}{p^s} =: \sum_{p \in \wp} \frac{\delta_\ell(p)}{p^s} = \frac{1}{\varphi(q)}\sum_{\chi_e:\ e \in \widehat{G}} \overline{\chi(\ell)}\sum_{p \in \wp} \frac{\chi(p)}{p^s}. \]
(Margin note: perhaps the best way is to present this backwards, i.e., realize the sum in terms of the characteristic function, which can be thought of as a function on $Z^*(q)$ and as thus possessing a Fourier series expansion.)

Breaking the first sum into the (Dirichlet extension of the) trivial character and the non-trivial ones, we get that the
above equals
\begin{align*}
& \frac{1}{\varphi(q)}\sum_{p \in \wp} \frac{\chi_0(p)}{p^s} + \frac{1}{\varphi(q)}\sum_{\chi_e \ne \chi_0} \overline{\chi(\ell)}\sum_{p \in \wp} \frac{\chi(p)}{p^s} \\
={} & \frac{1}{\varphi(q)}\sum_{p \in \wp:\ p \nmid q} \frac{1}{p^s} + \frac{1}{\varphi(q)}\sum_{\chi_e \ne \chi_0} \overline{\chi(\ell)}\sum_{p \in \wp} \frac{\chi(p)}{p^s},
\end{align*}
the last because $\chi_0(p)$ is the extension of the trivial character (taking 1 on all of $Z^*(q)$); i.e., it indicates whether or not $p \in \wp$ is relatively prime with $q$, which is equivalent to saying that $p$ does not divide $q$.

Now let's consider the first sum,
\[ \sum_{p \in \wp:\ p \nmid q} \frac{1}{p^s}. \]
Since all but finitely many primes do not divide $q$, and since $\sum_{p \in \wp} \frac{1}{p}$ diverges (Euler's theorem), we see that this sum diverges as $s \to 1^+$. Thus to show that the sum diverges, it suffices to show that
\[ \frac{1}{\varphi(q)}\sum_{\chi_e \ne \chi_0} \overline{\chi(\ell)}\sum_{p \in \wp} \frac{\chi(p)}{p^s} \]
remains bounded as $s \downarrow 1$, which we shall do by proving the following.

Theorem 1.2. Let $\chi$ be any nontrivial Dirichlet character. Then
\[ \sum_{p \in \wp} \frac{\chi(p)}{p^s} \]
is bounded as $s \to 1^+$.

2. The strategy

The key to proving the reduced problem will be the following.

Definition 2.1. For $s > 1$, $\chi$ a Dirichlet character (modulo $q$), we define the L-function $L$ by
\[ L(s, \chi) := \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \]

Theorem 2.2 (Product formula for L-functions). For $s > 1$,
\[ L(s, \chi) := \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{p \in \wp} \frac{1}{1 - \frac{\chi(p)}{p^s}}. \]

Then, formally,
\begin{align*}
\ln L(s, \chi) &= -\sum_{p \in \wp} \ln\left( 1 - \frac{\chi(p)}{p^s} \right) \\
&= -\sum_{p \in \wp} \left[ -\frac{\chi(p)}{p^s} + E\left( -\frac{\chi(p)}{p^s} \right) \right] \\
&= \sum_{p \in \wp} \frac{\chi(p)}{p^s} + C
\end{align*}
(again, $|C| \le \frac{\pi^2}{6}$); we use this relation to show that $\lim_{s \downarrow 1}\sum_{p \in \wp} \frac{\chi(p)}{p^s}$ is finite. To make the above argument rigorous, we need to do the following.
i. Extend the logarithm to complex numbers,
ii. Interpret properly the (complex, multi-valued) logarithm of the product,
iii. Show that $L(s, \chi)$ is continuous at $s = 1$, and
iv. Show that $L(1, \chi) \ne 0$.

3. Logarithms and infinite products on $\mathbb{C}$

Question. How do we extend the logarithm to $\mathbb{C}$? Via the power series expansion of the logarithm. Recall that for $|x| < 1$ one has the power series
\[ \log(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} x^n, \]
which implies that (again for $|x| < 1$)
\[ \log\frac{1}{1 - x} = \sum_{n=1}^{\infty} \frac{1}{n} x^n. \]

Definition 3.1 (extension of the natural logarithm to....somewhere). For $z \in \mathbb{C}$, $|z| < 1$, we define
\[ \log_1\frac{1}{1 - z} := \sum_{k=1}^{\infty} \frac{z^k}{k}. \]

Question. For what set of points are we actually defining the log? Consider $\{w = \frac{1}{1 - z} : |z| < 1\}$. What is this set of points?

Claim: $\{w = \frac{1}{1 - z} : |z| < 1\} = \{w \in \mathbb{C} : \Re(w) > 1/2\}$.

Proof. $\subset$: Write $z = re^{i\theta}$ with $0 \le r < 1$. Multiplying numerator and denominator by the conjugate of the denominator,
\[ w = \frac{1}{1 - z} = \frac{1}{1 - re^{i\theta}} \cdot \frac{1 - re^{-i\theta}}{1 - re^{-i\theta}} = \frac{1 - re^{-i\theta}}{1 - 2r\cos\theta + r^2}. \]
Considering the real part of the above, we see that
\[ \Re(w) = \frac{1 - r\cos\theta}{1 - 2r\cos\theta + r^2}. \]
The denominator equals $|1 - z|^2 > 0$, so $\Re(w) > 1/2$ is equivalent to $2 - 2r\cos\theta > 1 - 2r\cos\theta + r^2$, i.e., to $r^2 < 1$; since $0 \le r < 1$, $\Re(w) > 1/2$.

$\supset$: Conversely, if $\Re(w) > 1/2$, and we write $w = \frac{1}{1 - z}$, we see that $z = \frac{w - 1}{w}$. Thus
\[ |z|^2 = z\bar{z} = \frac{(w - 1)(\bar{w} - 1)}{w\bar{w}} = 1 + \frac{1 - (w + \bar{w})}{w\bar{w}}. \]
Since $w + \bar{w} = 2\Re(w) > 1$, the second term above is negative; so $0 \le |z| < 1$.

Remark. So our above definition is a generalization of the natural logarithm for $x > 1/2$.

Question. What happens if we extend the logarithm in a different way? Do we get the same function?

What properties does this new logarithm have?

Proposition 3.2 (Properties of $\log_1$).
i. If $|z| < 1$, then
\[ e^{\log_1\left(\frac{1}{1 - z}\right)} = \frac{1}{1 - z}. \]
ii. If $|z| < 1$, then
\[ \log_1\frac{1}{1 - z} = z + E_1(z), \]
where $|E_1(z)| \le |z|^2$ for $|z| \le 1/2$.
iii. If $|z| < 1/2$, then
\[ \left| \log_1\frac{1}{1 - z} \right| \le 2|z|. \]

Proof of (i). Using polar coordinates, let $z = re^{i\theta}$ ($0 \le r < 1$). We want to show that
\[ (1 - re^{i\theta})\,e^{\sum_{k=1}^{\infty} \frac{(re^{i\theta})^k}{k}} = 1. \]
Well, when $r = 0$, the left hand side does equal 1, so it suffices to show that $\frac{d}{dr}(\mathrm{LHS}) = 0$.
But this is easy (using the fact that we can differentiate term-by-term in the disk of convergence):
\begin{align*}
\frac{d}{dr}(\mathrm{LHS}) &= (1 - re^{i\theta})\,e^{\sum_{k=1}^{\infty} \frac{(re^{i\theta})^k}{k}} \cdot \left( \sum_{k=1}^{\infty} \frac{(re^{i\theta})^k}{k} \right)' + (-e^{i\theta})\,e^{\sum_{k=1}^{\infty} \frac{(re^{i\theta})^k}{k}} \\
&= \left[ -e^{i\theta} + (1 - re^{i\theta})\left( \sum_{k=1}^{\infty} \frac{(re^{i\theta})^k}{k} \right)' \right] e^{\sum_{k=1}^{\infty} \frac{(re^{i\theta})^k}{k}}.
\end{align*}
Differentiating term-by-term, we see that
\begin{align*}
-e^{i\theta} + (1 - re^{i\theta})\left( \sum_{k=1}^{\infty} \frac{(re^{i\theta})^k}{k} \right)' &= -e^{i\theta} + (1 - re^{i\theta})\,e^{i\theta}\sum_{k=1}^{\infty} (re^{i\theta})^{k-1} \\
&= -e^{i\theta} + (1 - re^{i\theta})\,e^{i\theta}\,\frac{1}{1 - re^{i\theta}} = 0.
\end{align*}

With the logarithm in hand, we can now prove an analogous theorem for infinite products of complex numbers. (Margin note: the proof is completely the same; no point in showing it.)

Proposition 3.3 (Condition for convergence). Let $\sum a_n$ be a series of complex numbers. If $\sum a_n$ converges absolutely, and $a_n \ne 1$ for all $n$, then
\[ \prod_{n=1}^{\infty} \frac{1}{1 - a_n} \quad\text{converges.} \]

Proof. As before, WLOG assume $|a_n| < 1/2$ for all $n$. Using the properties of the $\log_1$, and then of the exponential function (note that we use the complex exponential function from chapter 1 as well),
\[ \prod_{n=1}^{N} \frac{1}{1 - a_n} = \prod_{n=1}^{N} e^{\log_1\left(\frac{1}{1 - a_n}\right)} = e^{\sum_{n=1}^{N} \log_1\left(\frac{1}{1 - a_n}\right)}. \]
Just like in the earlier proof, we note that since $\sum |a_n|$ converges, and since
\[ \left| \log_1\frac{1}{1 - a_n} \right| \le 2|a_n|, \]
the sum in the exponent converges, and thus the limit exists.

LECTURE 35: EULER'S PRODUCT FORMULA; FOURIER SERIES FOR CHARACTERISTIC FUNCTIONS OF POINTS

1. Reduction of the problem

Recall: our goal is to show Dirichlet's theorem, i.e., that for $q, \ell$ which are relatively prime,
\[ \sum_{p \in \wp:\ p \equiv \ell \bmod q} \frac{1}{p} \quad\text{diverges.} \]
(As before we'll consider $\sum \frac{1}{p^s}$ as $s \downarrow 1$.) We reduced the problem as follows. First, we took the characteristic function $\delta_\ell$ of $\{\ell\}$ as a function on $Z^*(q)$, and the characters $e \in \widehat{Z^*(q)}$. By our work in the previous chapter (recall by definition $\varphi(q) := |Z^*(q)|$), we have the following Fourier series expansion:

Lemma 1.1.
\[ \delta_\ell(n) = \frac{1}{\varphi(q)}\sum_{e \in \widehat{G}} \overline{e(\ell)}\,e(n). \]

We then extended the definitions of $\delta_\ell$ and $e$ to all of $\mathbb{Z}$ periodically (defining them as 0 whenever $n$ was not relatively prime with $q$), letting $\chi$ denote the extension of $e$ to $\mathbb{Z}$. Then since $\delta_\ell(p) = 1$ iff $p \equiv \ell \bmod q$,
\[ \sum_{p \in \wp \text{ s.t. } p \equiv \ell \bmod q} \frac{1}{p^s} =: \sum_{p \in \wp} \frac{\delta_\ell(p)}{p^s} = \frac{1}{\varphi(q)}\sum_{\chi_e:\ e \in \widehat{G}} \overline{\chi(\ell)}\sum_{p \in \wp} \frac{\chi(p)}{p^s}, \]
the second equality being the above Fourier series expansion (Step 1: Fourier series expansion). (Margin note: perhaps the best way is to present this backwards, i.e., realize the sum in terms of the characteristic function, which can be realized as a function on $Z^*(q)$ and as thus possessing a Fourier series expansion.)

Step 2: break the sum into trivial and nontrivial characters. Breaking the first sum into the (Dirichlet extension of the) trivial character and the non-trivial ones, we get that the above equals
\begin{align*}
& \frac{1}{\varphi(q)}\sum_{p \in \wp} \frac{\chi_0(p)}{p^s} + \frac{1}{\varphi(q)}\sum_{\chi_e \ne \chi_0} \overline{\chi(\ell)}\sum_{p \in \wp} \frac{\chi(p)}{p^s} \\
={} & \frac{1}{\varphi(q)}\sum_{p \in \wp:\ p \nmid q} \frac{1}{p^s} + \frac{1}{\varphi(q)}\sum_{\chi_e \ne \chi_0} \overline{\chi(\ell)}\sum_{p \in \wp} \frac{\chi(p)}{p^s},
\end{align*}
the last because $\chi_0(p)$ is the extension of the trivial character (taking 1 on all of $Z^*(q)$); i.e., it indicates whether or not $p \in \wp$ is relatively prime with $q$, which is equivalent to saying that $p$ does not divide $q$. (What's $\chi_0$? It takes the value 1 at numbers relatively prime to $q$; among primes, those that do not divide $q$. Observe that almost all primes do not divide $q$.)

Now let's consider the first sum,
\[ \sum_{p \in \wp:\ p \nmid q} \frac{1}{p^s}. \]
Since all but finitely many primes do not divide $q$, and since $\sum_{p \in \wp} \frac{1}{p}$ diverges (Euler's theorem), we see that this sum diverges as $s \to 1^+$. Thus to show that the sum diverges, it suffices to show that
\[ \frac{1}{\varphi(q)}\sum_{\chi_e \ne \chi_0} \overline{\chi(\ell)}\sum_{p \in \wp} \frac{\chi(p)}{p^s} \]
remains bounded as $s \downarrow 1$, which we shall do by proving the following.

Theorem 1.2 (reduction of the problem). Let $\chi$ be any nontrivial Dirichlet character. Then
\[ \sum_{p \in \wp} \frac{\chi(p)}{p^s} \]
is bounded as $s \to 1^+$.

(Margin note for the next section: the Dirichlet L-function is a generalization of the zeta function.)

2.
Dirichlet L-functions; Product formula

Definition 2.1. For $s > 1$, $\chi$ a Dirichlet character (modulo $q$), we define the L-function by
\[
L(s, \chi) := \sum_{n=1}^\infty \frac{\chi(n)}{n^s}.
\]
The key to proving the reduced problem will be the following product formula, which will play a role analogous to that of Euler's product formula for the zeta function. (Margin note: Statement of the product formula.)

Theorem 2.2 (Product formula for L-functions). For $s > 1$,
\[
L(s, \chi) := \sum_{n=1}^\infty \frac{\chi(n)}{n^s} = \prod_{p \in \wp} \frac{1}{1 - \frac{\chi(p)}{p^s}}.
\]
Proving this theorem will involve extending the notions of logarithm and infinite product to complex numbers. What we need is not much: logs and $\prod$.

3. Logarithms and infinite products on $\mathbb{C}$

Question. How do we extend the logarithm to $\mathbb{C}$?

(Margin note: Motivation: the power series expansion of the logarithm.) Recall that for $|x| < 1$ one has the power series
\[
\log(1 + x) = \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n} x^n,
\]
which implies that (again for $|x| < 1$)
\[
\log \frac{1}{1 - x} = -\log(1 - x) = \sum_{n=1}^\infty \frac{1}{n} x^n.
\]

Definition 3.1. For $z \in \mathbb{C}$, $|z| < 1$, we define
\[
\log_1 \frac{1}{1 - z} := \sum_{k=1}^\infty \frac{z^k}{k}.
\]
(Margin note: Extension of the natural logarithm to....somewhere. Question: for what set of points are we actually defining the log? Some "boring" calculations follow; actually they were slightly fun.)

Question. Consider $\{w = \frac{1}{1-z} : |z| < 1\}$. What is this set of points?

Claim: $\{w = \frac{1}{1-z} : |z| < 1\} = \{w \in \mathbb{C} : \operatorname{Re}(w) > 1/2\}$.

Proof. $\subset$: Write $z = re^{i\theta}$ with $0 \le r < 1$. Then
\[
w = \frac{1}{1 - z} = \frac{1}{1 - re^{i\theta}} \cdot \frac{1 - re^{-i\theta}}{1 - re^{-i\theta}}
= \frac{1 - re^{-i\theta}}{1 - 2r\cos\theta + r^2}
= \frac{1 - r\cos\theta}{1 - 2r\cos\theta + r^2} + i\, \frac{r \sin\theta}{1 - 2r\cos\theta + r^2}.
\]
Considering the real part of the above, we see that
\[
\operatorname{Re}(w) - \frac{1}{2} = \frac{2(1 - r\cos\theta) - (1 - 2r\cos\theta + r^2)}{2(1 - 2r\cos\theta + r^2)} = \frac{1 - r^2}{2\,|1 - z|^2};
\]
since $0 \le r < 1$, $\operatorname{Re}(w) > 1/2$.

$\supset$: Conversely, if $\operatorname{Re}(w) > 1/2$, and we write $w = \frac{1}{1-z}$, we see that $z = \frac{w - 1}{w}$. Thus
\[
|z|^2 = z\bar{z} = \frac{(w - 1)(\bar{w} - 1)}{w\bar{w}} = 1 + \frac{1 - (w + \bar{w})}{w\bar{w}}.
\]
Since $w + \bar{w} = 2\operatorname{Re}(w) > 1$, the second term above is negative; so $0 \le |z| < 1$.

Remark. So our above definition extends the natural logarithm from the real half-line $x > 1/2$ to the half-plane $\operatorname{Re}(w) > 1/2$.

(Margin note: multi-valued logarithm.)

Question.
What happens if we extend the logarithm in a different way? Do we get the same function? Consider the complex exponential function. Since $e^{w + 2k\pi i} = e^w$, we see that the exponential is not a 1-1 function; thus there are multiple logarithms possible: $\log_1$ is one of them.

(Margin note: Does this logarithm behave as it ought to? What properties does this new logarithm have?)

Proposition 3.2 (Properties of $\log_1$).
i. If $|z| < 1$, then $e^{\log_1\left(\frac{1}{1-z}\right)} = \frac{1}{1-z}$.
ii. If $|z| < 1$, then $\log_1 \frac{1}{1-z} = z + E_1(z)$, where $|E_1(z)| \le |z|^2$ for $|z| \le 1/2$.
iii. If $|z| < 1/2$, then $\big|\log_1 \frac{1}{1-z}\big| \le 2|z|$.

Proof of (i). (The other two facts follow as before.) Using polar coordinates, let $z = re^{i\theta}$ ($0 \le r < 1$). We want to show that
\[
(1 - re^{i\theta})\, e^{\sum_{k=1}^\infty \frac{(re^{i\theta})^k}{k}} = 1.
\]
(Margin note: Use polar coordinates; writing $f(r)$ for the left-hand side, see that $f(0) = 1$ and $f'(r) = 0$, so $f(r) \equiv 1$. Note that we can't differentiate term-by-term when $|z| \ge 1$: otherwise, this would be true for all $z$.)

Well, when $r = 0$, the left-hand side does equal 1, so it suffices to show that $\frac{d}{dr}(\mathrm{LHS}) = 0$. But this is easy (using the fact that we can differentiate term-by-term in the disk of convergence):
\[
\frac{d}{dr}(\mathrm{LHS})
= (1 - re^{i\theta})\, e^{\sum_{k=1}^\infty \frac{(re^{i\theta})^k}{k}} \Big(\sum_{k=1}^\infty \frac{(re^{i\theta})^k}{k}\Big)' + (-e^{i\theta})\, e^{\sum_{k=1}^\infty \frac{(re^{i\theta})^k}{k}}
= \Big[-e^{i\theta} + (1 - re^{i\theta})\Big(\sum_{k=1}^\infty \frac{(re^{i\theta})^k}{k}\Big)'\Big]\, e^{\sum_{k=1}^\infty \frac{(re^{i\theta})^k}{k}}.
\]
Differentiating term-by-term, we see that
\[
-e^{i\theta} + (1 - re^{i\theta})\Big(\sum_{k=1}^\infty \frac{(re^{i\theta})^k}{k}\Big)'
= -e^{i\theta} + (1 - re^{i\theta})\, e^{i\theta} \sum_{k=1}^\infty (re^{i\theta})^{k-1}
= -e^{i\theta} + (1 - re^{i\theta})\, e^{i\theta}\, \frac{1}{1 - re^{i\theta}} = 0.
\]

With the logarithm in hand, we can now prove an analogous theorem for infinite products of complex numbers. (Margin note: The proof is completely the same; almost no point in showing it.)

Proposition 3.3 (Condition for convergence). Let $\sum a_n$ be a series of complex numbers. If $\sum a_n$ converges absolutely, and $a_n \neq 1$ for all $n$, then
\[
\prod_{n=1}^\infty \frac{1}{1 - a_n} \quad \text{converges.}
\]

Proof. As before, WLOG assume $|a_n| < 1/2$ for all $n$.
Using the properties of $\log_1$, and then of the exponential function (note that we use the complex exponential function from chapter 1 as well),
\[
\prod_{n=1}^N \frac{1}{1 - a_n} = \prod_{n=1}^N e^{\log_1\left(\frac{1}{1-a_n}\right)} = e^{\sum_{n=1}^N \log_1\left(\frac{1}{1-a_n}\right)}.
\]
Just like in the earlier proof, we note that since $\sum |a_n|$ converges, and since
\[
\Big|\log_1 \frac{1}{1 - a_n}\Big| \le 2|a_n|,
\]
the sum in the exponent converges, and thus the limit exists.

LECTURE 35: PRODUCT FORMULA FOR THE L-FUNCTIONS

PROOF OF DIRICHLET'S THEOREM

1. Proof of the product formula

Recall: we hope to obtain a product formula for L-functions analogous to that for the zeta function, i.e.:

Theorem 1.1. For $s > 1$,
\[
L(s, \chi) := \sum_{n=1}^\infty \frac{\chi(n)}{n^s} = \prod_{p \in \wp} \frac{1}{1 - \frac{\chi(p)}{p^s}}.
\]

Proof. (Margin note: Proof of the product formula.) First observe that both sides converge: the left since $|\chi| \le 1$ and $\sum_{n=1}^\infty \frac{1}{n^s}$ converges for $s > 1$, and the right by the previous proposition (on the convergence of infinite products), since (letting $p_n$ denote the $n$-th prime) for $s > 1$
\[
\sum_{n=1}^\infty \frac{\chi(p_n)}{p_n^s}
\]
converges absolutely, again by comparison with $\sum_{n=1}^\infty \frac{1}{n^s}$.

Now, we want to show that
\[
\sum_{n=1}^\infty \frac{\chi(n)}{n^s} - \prod_{p \in \wp} \frac{1}{1 - \frac{\chi(p)}{p^s}} = 0,
\]
i.e., that the difference is smaller in magnitude than any $\epsilon > 0$. (Margin note: Approximate both sides with finite sums.) Write
\[
L := \sum_{n=1}^\infty \frac{\chi(n)}{n^s}, \quad
S_N := \sum_{n \le N} \frac{\chi(n)}{n^s}, \quad
\Pi_{N,M} := \prod_{p \le N} \Big(1 + \frac{\chi(p)}{p^s} + \cdots + \frac{\chi(p^M)}{p^{Ms}}\Big), \quad
\Pi_N := \prod_{p \le N} \frac{1}{1 - \frac{\chi(p)}{p^s}}, \quad
\Pi := \prod_{p \in \wp} \frac{1}{1 - \frac{\chi(p)}{p^s}}.
\]
Well, by the triangle inequality, for any $N, M$,
\[
|L - \Pi| \le |L - S_N| + |S_N - \Pi_{N,M}| + |\Pi_{N,M} - \Pi_N| + |\Pi_N - \Pi| =: I + II + III + IV.
\]
Fix $\epsilon > 0$. (I and IV:) By convergence, we can choose $N$ such that, for it and all larger values, $|S_N - L| < \epsilon$ and $|\Pi_N - \Pi| < \epsilon$. (Margin note: We can control the first and fourth differences: those are just the finite approximations.) Next we claim that we can choose $M$ large enough that

II: $|S_N - \Pi_{N,M}| < \epsilon$ and
III: $|\Pi_{N,M} - \Pi_N| < \epsilon$.

(Margin note: The third difference is obvious.) III is clear since, using the multiplicativity of the Dirichlet characters,
\[
\lim_{M \to \infty} \Big(1 + \frac{\chi(p)}{p^s} + \cdots + \frac{\chi(p^M)}{p^{Ms}}\Big)
= \lim_{M \to \infty} \Big(1 + \frac{\chi(p)}{p^s} + \cdots + \frac{[\chi(p)]^M}{p^{Ms}}\Big)
= \sum_{n=0}^\infty \frac{\chi(p)^n}{p^{ns}} = \frac{1}{1 - \frac{\chi(p)}{p^s}}
\]
for each of the finitely many primes $p \le N$.

The only non-trivial inequality is II:
\[
|S_N - \Pi_{N,M}| := \Big| \sum_{n \le N} \frac{\chi(n)}{n^s} - \prod_{p \le N} \Big(1 + \frac{\chi(p)}{p^s} + \cdots + \frac{\chi(p^M)}{p^{Ms}}\Big) \Big|.
\]
Of course, for $n \le N$, the prime factorization of $n$ is composed of primes $p \le N$; further, no prime can occur to a power higher than $N$. So take $M > N$. Then, using multiplicativity and expanding the product, we see that the second term contains all the terms in the first. The terms not contained in $\sum_{n \le N} \frac{\chi(n)}{n^s}$ are all of the form $\frac{\chi(n)}{n^s}$ with $n > N$, so their sum has magnitude at most $\sum_{n > N} \frac{1}{n^s}$, which is smaller than $\epsilon$ once $N$ is large enough (enlarging the $N$ chosen above if necessary).

Proposition 1.2 (Corollary of the product formulae). Recall the trivial Dirichlet character modulo $q$:
\[
\chi_0(n) := \begin{cases} 1 & \text{if } n \text{ and } q \text{ are relatively prime,} \\ 0 & \text{otherwise.} \end{cases}
\]
(Margin note: Minor observation along the way: $L(s, \chi_0)$ is basically $\zeta(s)$!) If $q = p_1^{a_1} \cdots p_N^{a_N}$ is the prime factorization of $q$, then
\[
L(s, \chi_0) = (1 - p_1^{-s})(1 - p_2^{-s}) \cdots (1 - p_N^{-s})\, \zeta(s).
\]

Proof. (Margin note: Use the product formulae.) By the product formula just proven,
\[
\sum_{n=1}^\infty \frac{\chi_0(n)}{n^s} = \prod_{p \in \wp} \frac{1}{1 - \frac{\chi_0(p)}{p^s}},
\]
and by Euler's product formula for $\zeta$,
\[
\zeta(s) = \prod_{p \in \wp} \frac{1}{1 - \frac{1}{p^s}}.
\]
Since $\chi_0(p) = 0$ when $p \mid q$, the factors missing from the first product are those which correspond to the primes that divide $q$; i.e., $p_1, \ldots, p_N$.

2. Logarithms of L-functions

(Introduce "second reduction" as "corollary" here.)

(Margin notes: We'd like to be able to take the log of both sides of the above product formula. But we'll need to do a lot of work first.... The technical proposition: allowing us to define the log of $L$. Where do we need $0 < s < 1$? Obtain controls on the growth of $L$ and $\frac{d}{ds}L$. Simple but crucial observation in demonstrating absolute and uniform convergence in $s > 0$.)

Proposition 2.1 (The technical proposition). Let $\chi$ be a non-trivial Dirichlet character. Then
i. $L(s, \chi) := \sum_{n=1}^\infty \frac{\chi(n)}{n^s}$ converges for $s > 0$.
ii. $L(s, \chi)$ is continuously differentiable in $s$.
iii.
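For some constants $c, c' > 0$, we have
\[
L(s, \chi) = 1 + O(e^{-cs}) \ \text{ as } s \to \infty, \qquad
\frac{d}{ds} L(s, \chi) = O(e^{-c's}) \ \text{ as } s \to \infty.
\]

Lemma 2.2 (The key lemma). If $\chi$ is a non-trivial Dirichlet character modulo $q$, then
\[
\Big| \sum_{n=1}^k \chi(n) \Big| \le q \quad \text{for any } k.
\]

Proof. First recall that since $\chi$ is non-trivial,
\[
\sum_{n=1}^q \chi(n) = 0.
\]
(Margin note: Q. Why do they prove this again?) Using the division algorithm, write $k = aq + b$ with $0 \le b < q$; then, since $\chi$ is $q$-periodic,
\[
\sum_{n=1}^k \chi(n) = \sum_{n=1}^{aq} \chi(n) + \sum_{n=aq+1}^{aq+b} \chi(n) = \sum_{n=aq+1}^{aq+b} \chi(n).
\]
(Margin note: Characters map into $S^1$; Dirichlet characters can also equal 0.) By the triangle inequality (since $|\chi(n)| \in \{0, 1\}$ and the sum has $b < q$ terms) the last sum has magnitude less than or equal to $q$.

3. Proof of the technical proposition

Proof of the technical proposition. It is easy to see that the series defining $L(s, \chi)$ converges uniformly and absolutely for $s > 1$, as does the term-by-term derivative $\frac{d}{ds} L(s, \chi)$. (Margin notes: The $L(\cdot, \chi)$ functions make sense for $0 < s \le 1$, as long as $\chi$ is non-trivial. We use some elementary manipulation to rewrite the series as an absolutely (and uniformly) converging series! Note that the book is incorrect to say (p. 263) that one uses summation by parts.)

Proof of (i): To show convergence for $s > 0$, we rewrite the series as follows. Let $S_k$ denote the partial sum $\sum_{n=1}^k \chi(n)$ (and $S_0 := 0$). Then
\[
\sum_{k=1}^N \frac{\chi(k)}{k^s}
= \sum_{k=1}^N \frac{S_k - S_{k-1}}{k^s}
= \sum_{k=1}^N \frac{S_k}{k^s} - \sum_{k=2}^N \frac{S_{k-1}}{k^s} \quad (\text{since } S_0 = 0)
\]
\[
= \sum_{k=1}^N \frac{S_k}{k^s} - \sum_{k=1}^{N-1} \frac{S_k}{(k+1)^s}
= \sum_{k=1}^{N-1} S_k \Big[\frac{1}{k^s} - \frac{1}{(k+1)^s}\Big] + \frac{S_N}{N^s}
=: \sum_{k=1}^{N-1} f_k(s) + \frac{S_N}{N^s}.
\]
Consider the convergence of $\sum f_k(s)$: by the key lemma and the mean value theorem,
\[
|f_k(s)| := \Big| S_k \Big[\frac{1}{k^s} - \frac{1}{(k+1)^s}\Big] \Big|
\le q \max_{x \in [k, k+1]} \Big| \frac{d}{dx} x^{-s} \Big|
= q \max_{x \in [k, k+1]} s x^{-s-1}
= \frac{qs}{k^{s+1}};
\]
thus $\sum_k f_k(s)$ converges absolutely for $s > 0$, and uniformly on $s \ge \delta$ for every $\delta > 0$. Since $|S_N| \le q$, the remaining term $\frac{S_N}{N^s}$ tends to 0 for $s > 0$, so the original series converges as well.

Proof of (ii): Differentiating the series term-by-term, we see by an argument similar to that above that the differentiated series converges uniformly on $s \ge \delta$ for every $\delta > 0$, and thus converges to a continuous function.

Proof of (iii): Consider: for $s > 1 + \epsilon$ (say),
\[
|L(s, \chi) - 1| := \Big| \sum_{n=2}^\infty \frac{\chi(n)}{n^s} \Big|
\le 2^{-s} + \int_2^\infty x^{-s}\, dx
= 2^{-s} \Big(1 + \frac{2}{s - 1}\Big)
\le 2^{-s}\, O(1).
\]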
(Margin notes: Thus the original series also converges. Control of the growth is easy.)

Taking $c = \log 2$ gives the desired result:
\[
L(s, \chi) = 1 + O(e^{-cs}) \ \text{ as } s \to \infty.
\]
The estimate for the derivative in (iii) is proved similarly.

4. What was the point of that? To define the log of $L(s, \chi)$

LECTURE 38: NON-VANISHING OF $L(1, \chi)$ FOR NON-TRIVIAL REAL DIRICHLET CHARACTERS

1. Recall

Our goal is to prove the following.

Theorem 1.1 (Partial theorem, part II). For non-trivial, real Dirichlet characters $\chi$, $L(1, \chi) \neq 0$.

We stated (without proof) the following propositions. (Margin note: Some simple results on "p-series". Of course they diverge, but we will need more precise estimates on the partial sums.)

Proposition 1.2. There exists a number $\gamma$ (the Euler-Mascheroni constant) satisfying
\[
\sum_{n=1}^N \frac{1}{n} - \log N = \gamma + O(1/N).
\]
(See en.wikipedia.org/wiki/Euler-Mascheroni_constant)

Proposition 1.3. For $N \in \mathbb{N}$,
\[
\sum_{n=1}^N \frac{1}{n^{1/2}} - 2N^{1/2} = c + O(1/N^{1/2}).
\]

2. Hyperbolic sums

Let $F : \mathbb{N} \times \mathbb{N} \to \mathbb{C}$, $N \in \mathbb{N}$, and
\[
A_N := \{(m, n) \in \mathbb{N} \times \mathbb{N} : mn \le N\}.
\]
Let $S_N$ denote the finite sum
\[
S_N := \sum_{(m,n) \in A_N} F(m, n).
\]
(Margin notes: Trivial observations on particular finite sums: we can sum them in different ways. Completely trivial, but yet so useful....)

Now observe that we can express $S_N$ in three different ways:
\[
S_N = \sum_{m=1}^N \sum_{1 \le n \le N/m} F(m, n) \quad \text{(vertically)}
\]
\[
\phantom{S_N} = \sum_{n=1}^N \sum_{1 \le m \le N/n} F(m, n) \quad \text{(horizontally)}
\]
\[
\phantom{S_N} = \sum_{k=1}^N \sum_{nm = k} F(m, n) \quad \text{(i.e., along hyperbolae indexed by } k\text{)}.
\]

3. Return to Dirichlet's theorem

To finish the proof of Dirichlet's theorem we will show that $L(1, \chi)$ is well-approximated by ($\frac{1}{2\sqrt{N}}$ times) a particular hyperbolic sum, namely
\[
S_N := \sum_{(m,n) \in A_N} F(m, n), \quad \text{where} \quad F(m, n) := \frac{\chi(n)}{\sqrt{nm}};
\]
we'll also show that the sum grows at least as fast as a constant multiple of $\log N$. (Margin notes: Q. Why is it obvious that this hyperbolic sum approximates $2\sqrt{N}\, L(1, \chi)$? Recall $L(1, \chi) = \sum \frac{\chi(n)}{n}$. Thus if we prove the proposition, the proof of Dirichlet's theorem is over! Analysis of the sum of $\chi$ over divisors of $k$.) Precisely,

Proposition 3.1.
Let $\chi$ be a non-trivial real Dirichlet character. With the above definitions,
i. $\exists\, c > 0$ such that $S_N \ge c \log N$.
ii. $S_N = 2N^{1/2} L(1, \chi) + O(1)$.

Remark. The above proposition finishes the theorem, for if $L(1, \chi) = 0$ then (ii) would imply that $S_N = O(1)$, while (i) states that $S_N \ge c \log N$; a contradiction.

To prove the two parts of the proposition we will need the following two lemmas.

Lemma 3.2. Let $k \in \mathbb{N}$. Then
\[
\sum_{n \mid k} \chi(n) \ge \begin{cases} 0 & \text{for all } k, \\ 1 & \text{if } k = \ell^2 \text{ for some } \ell \in \mathbb{Z}. \end{cases}
\]

Proof of lemma. (Margin note: Simple case first: $k$ a prime power.) Case I: $k = p^\alpha$ for some prime $p$. Then
\[
\sum_{n \mid k} \chi(n) = \chi(1) + \chi(p) + \chi(p^2) + \cdots + \chi(p^\alpha)
= \chi(1) + \chi(p) + \chi(p)^2 + \cdots + \chi(p)^\alpha.
\]
Now, $\chi$ being a real Dirichlet character, $\chi(p) \in \{-1, 0, 1\}$; so
\[
\sum_{n \mid k} \chi(n) = \begin{cases}
\alpha + 1 & \text{if } \chi(p) = 1, \\
1 & \text{if } \chi(p) = -1 \text{ and } \alpha \text{ is even,} \\
0 & \text{if } \chi(p) = -1 \text{ and } \alpha \text{ is odd,} \\
1 & \text{if } \chi(p) = 0, \text{ i.e., } p \mid q;
\end{cases}
\]
i.e., as desired, $\sum_{n \mid k} \chi(n)$ is greater than or equal to 0, and can equal zero only when $\alpha$ is odd (i.e., $k$ is not a square).

Case II: General $k$. If $k = p_1^{\alpha_1} \cdots p_M^{\alpha_M}$, then the divisors of $k$ consist of the set
\[
\{p_1^{\beta_1} \cdots p_M^{\beta_M} : 0 \le \beta_j \le \alpha_j;\ j = 1, \ldots, M\}.
\]
Thus every divisor is a product of $M$ prime powers $p_j^{\beta_j}$ with exponents 0 through $\alpha_j$ (and every such product is a divisor); so, by multiplicativity,
\[
\sum_{n \mid k} \chi(n) = \prod_{j=1}^M \Big( \chi(p_j^0) + \chi(p_j^1) + \chi(p_j^2) + \cdots + \chi(p_j^{\alpha_j}) \Big).
\]
As before, the only possibility of getting a 0 out of this is if one of the $\alpha_j$ is odd, in which case $k$ is not a square. (Margin notes: The general case follows immediately from the first. Bad notation in text: using $N$ for two purposes.)

Lemma 3.3 (Second lemma). Let $a, b \in \mathbb{N}$. If $a < b$ then
i. $\sum_{n=a}^b \frac{\chi(n)}{\sqrt{n}} = O(a^{-1/2})$ and
ii. $\sum_{n=a}^b \frac{\chi(n)}{n} = O(a^{-1})$.

Remark. The proofs of these two facts are relatively straightforward; so we leave them to you.

Now we can prove the proposition, which we restate for convenience.

Proposition 3.4. Let $\chi$ be a non-trivial real Dirichlet character. With the above definitions,
i. $\exists\, c > 0$ such that $S_N \ge c \log N$.
ii.
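$S_N = 2N^{1/2} L(1, \chi) + O(1)$.

Proof. (i) Summing along hyperbolae, we see
\[
S_N := \sum_{(m,n) \in A_N} \frac{\chi(n)}{\sqrt{nm}}
= \sum_{k=1}^N \sum_{nm = k} \frac{\chi(n)}{\sqrt{nm}}
= \sum_{k=1}^N \frac{1}{\sqrt{k}} \sum_{n \mid k} \chi(n).
\]
By the first lemma, then, we can obtain the desired lower bound on $S_N$ as follows:
\[
S_N = \sum_{k=1}^N \frac{1}{\sqrt{k}} \sum_{n \mid k} \chi(n)
\ge \sum_{k = \ell^2,\ \ell \le N^{1/2}} \frac{1}{\sqrt{k}}
= \sum_{\ell \le N^{1/2}} \frac{1}{\ell}
= \log N^{1/2} + O(1).
\]

(ii) We want to get a precise estimate of
\[
S_N := \sum_{(m,n) \in A_N} \frac{\chi(n)}{\sqrt{nm}}.
\]
We separate $S_N$ into the sum $S_I + S_{II} + S_{III}$, where the indices in the respective sums lie in the regions
\[
I := \{(n, m) \in \mathbb{N}^2 : 1 \le m < \sqrt{N},\ \sqrt{N} < n \le \tfrac{N}{m}\},
\]
\[
II := \{(n, m) \in \mathbb{N}^2 : 1 \le m \le \sqrt{N},\ 1 \le n \le \sqrt{N}\},
\]
\[
III := \{(n, m) \in \mathbb{N}^2 : \sqrt{N} < m \le \tfrac{N}{n},\ 1 \le n < \sqrt{N}\}.
\]
Consider $S_I$: summing vertically, we get
\[
S_I := \sum_{1 \le m < \sqrt{N},\ \sqrt{N} < n \le N/m} \frac{\chi(n)}{\sqrt{nm}}
= \sum_{m < \sqrt{N}} \frac{1}{\sqrt{m}} \Big( \sum_{\sqrt{N} < n \le N/m} \frac{\chi(n)}{\sqrt{n}} \Big).
\]
By the second lemma, the term in parentheses is of the order $O((N^{1/2})^{-1/2}) = O(N^{-1/4})$; using the $\frac{1}{2}$-series estimate
\[
\sum_{1 \le n \le M} \frac{1}{n^{1/2}} = 2M^{1/2} + c + O\Big(\frac{1}{\sqrt{M}}\Big)
\]
we get
\[
S_I = \Big[ 2N^{1/4} + c + O\Big(\frac{1}{N^{1/4}}\Big) \Big]\, O(N^{-1/4}) = O(1)
\]
as $N \to \infty$.

For the other two terms, we sum horizontally, again using the $\frac{1}{2}$-series estimate:
\[
S_{II} + S_{III} = \sum_{1 \le n < \sqrt{N}} \frac{\chi(n)}{n^{1/2}} \sum_{1 \le m \le N/n} \frac{1}{m^{1/2}}
= \sum_{1 \le n < \sqrt{N}} \frac{\chi(n)}{n^{1/2}} \Big\{ 2\Big(\frac{N}{n}\Big)^{1/2} + c + O\Big(\Big(\frac{n}{N}\Big)^{1/2}\Big) \Big\}
\]
\[
= 2N^{1/2} \sum_{1 \le n \le N^{1/2}} \frac{\chi(n)}{n} + c \sum_{1 \le n \le N^{1/2}} \frac{\chi(n)}{n^{1/2}} + O\Big( \frac{1}{N^{1/2}} \sum_{1 \le n \le N^{1/2}} 1 \Big),
\]
the last term from
\[
\Big| \sum_{1 \le n < N^{1/2}} \frac{\chi(n)}{n^{1/2}} \cdot \frac{n^{1/2}}{N^{1/2}} \Big| \le \frac{1}{N^{1/2}} \sum_{1 \le n \le N^{1/2}} |\chi(n)| \le \frac{1}{N^{1/2}} \sum_{1 \le n \le N^{1/2}} 1.
\]
Let's call these three terms $A$, $B$, and $C$. Now, since $L(1, \chi) = \sum_{n=1}^\infty \frac{\chi(n)}{n}$, and since
\[
2N^{1/2} \sum_{n = N^{1/2}}^\infty \frac{\chi(n)}{n} = N^{1/2}\, O(N^{-1/2}) = O(1),
\]
we see that $A = 2N^{1/2} L(1, \chi) + O(1)$. Further, part (i) of the lemma implies that
\[
\sum_{1 \le n \le N^{1/2}} \frac{\chi(n)}{n^{1/2}} = O(1);
\]
thus $B = O(1)$. Finally, obviously $C = O(1)$, and so
\[
S_N = 2N^{1/2} L(1, \chi) + O(1),
\]
and the proposition is proved.

LECTURE 37: PRODUCT FORMULA FOR THE L-FUNCTIONS

(Margin note: We'd like to be able to take the log of both sides of the above product formula. But we'll need to do a lot of work first....)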
(Margin notes: The technical proposition: allowing us to define the log of $L$. Where do we need $0 < s < 1$? The $L(\cdot, \chi)$ functions make sense for $0 < s \le 1$, as long as $\chi$ is non-trivial. Obtain controls on the growth of $L$ and $\frac{d}{ds}L$. Notice we don't define $\log_2$ in general. The point is that this integral turns out, for a non-trivial Dirichlet character, to be the primary branch of the log. We need control of the growth to justify this definition.)

1. Recall the technical proposition

Proposition 1.1 (The technical proposition). Let $\chi$ be a non-trivial Dirichlet character. Then
i. $L(s, \chi) := \sum_{n=1}^\infty \frac{\chi(n)}{n^s}$ converges for $s > 0$.
ii. $L(s, \chi)$ is continuously differentiable in $s$.
iii. For some constants $c, c' > 0$, we have
\[
L(s, \chi) = 1 + O(e^{-cs}) \ \text{ as } s \to \infty, \qquad
\frac{d}{ds} L(s, \chi) = O(e^{-c's}) \ \text{ as } s \to \infty.
\]

2. What was the point of that? To define the log of $L(s, \chi)$

Definition 2.1. Let $s > 1$, $\chi$ a non-trivial Dirichlet character. We define the "logarithm of $L(s, \chi)$" by
\[
\log_2 L(s, \chi) := -\int_s^\infty \frac{L'(t, \chi)}{L(t, \chi)}\, dt.
\]
Notice that the integral converges, since $L(t, \chi)$ can be made arbitrarily close to 1 by taking $t$ large, and $L'(t, \chi) = O(e^{-c't})$, which together imply
\[
\frac{L'(t, \chi)}{L(t, \chi)} = O(e^{-c't}) \quad \text{as } t \to \infty.
\]

Proposition 2.2. Let $s > 1$. Then
i. $e^{\log_2 L(s, \chi)} = L(s, \chi)$ and (most importantly for us)
ii. $\log_2 L(s, \chi) = \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big)$.

(Margin note: Getting the logarithmic version of Dirichlet's product formula for the L-functions.)

Proof. (i) We want to show that $e^{-\log_2 L(s, \chi)} L(s, \chi) = 1$. (Margin note: The derivative trick again.) Well, the derivative of the LHS with respect to $s$ is
\[
-\frac{L'(s, \chi)}{L(s, \chi)}\, e^{-\log_2 L(s, \chi)} L(s, \chi) + e^{-\log_2 L(s, \chi)} L'(s, \chi) = 0,
\]
so the LHS is constant with respect to $s$. Now consider
\[
\lim_{s \to \infty} e^{-\log_2 L(s, \chi)} L(s, \chi) := \lim_{s \to \infty} e^{\int_s^\infty \frac{L'(t, \chi)}{L(t, \chi)}\, dt}\, L(s, \chi).
\]
The integral, being convergent, vanishes as $s \to \infty$, so the first factor tends to 1 as $s \to \infty$; since $L(s, \chi) = 1 + O(e^{-cs})$, so does the second. Hence the constant is 1.

(ii) Now let's show
\[
\log_2 L(s, \chi) = \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big).
\]
(Margin notes: The crucial second step in the second reduction. We'll show $e^A = e^B$, which almost implies $A = B$... but not quite.) Well, consider the exponential of both sides. For the LHS, $e^{\log_2 L(s, \chi)} = L(s, \chi)$, as we just saw. As for the RHS,
\[
e^{\sum_p \log_1 \left( \frac{1}{1 - \chi(p)/p^s} \right)}
= \prod_{p \in \wp} e^{\log_1 \left( \frac{1}{1 - \chi(p)/p^s} \right)}
= \prod_{p \in \wp} \frac{1}{1 - \frac{\chi(p)}{p^s}}
= L(s, \chi)
\]
also. We're not done! This only means that
\[
\log_2 L(s, \chi) - \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big) = 2\pi i\, M(s)
\]
for some integer-valued function $M$. It is left as an exercise to show that $M$ is continuous and that $\lim_{s \to \infty} M(s) = 0$, which forces $M \equiv 0$ (since $M(s) \in \mathbb{Z}$).

(Margin notes: What was the point of all that? Following the same lines as the argument of Euler, but to show finiteness rather than infiniteness. Here's why we needed the integral representation of the log. Now comes the deep part.... My God, what a wonderful piece of work.)

3. Second reduction of the problem

By the proposition just proved and subsequently the properties of $\log_1$ proven earlier,
\[
\log_2 L(s, \chi) = \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big)
= \sum_{p \in \wp} \Big[ \frac{\chi(p)}{p^s} + E_1\Big( \frac{\chi(p)}{p^s} \Big) \Big]
= \sum_{p \in \wp} \frac{\chi(p)}{p^s} + O(1),
\]
the error being bounded because $|E_1(\chi(p)/p^s)| \le p^{-2s}$ and $\sum_p p^{-2}$ converges.

Now, if we can show that $L(1, \chi) \neq 0$ for $\chi$ non-trivial, then since $L(s, \chi)$ is continuously differentiable for $0 < s < \infty$, $\frac{L'(s, \chi)}{L(s, \chi)}$ is bounded near $s = 1$, so
\[
\lim_{s \to 1^+} \log_2 L(s, \chi) := -\lim_{s \to 1^+} \int_s^\infty \frac{L'(t, \chi)}{L(t, \chi)}\, dt \quad \text{is bounded.}
\]
In that case, the above equality implies that $\sum_{p \in \wp} \frac{\chi(p)}{p^s}$ is also bounded as $s \downarrow 1$; that is, the reduced problem would be solved. Thus it suffices to prove the following claim:

Theorem 3.1. $L(1, \chi) \neq 0$ for non-trivial $\chi$.

We will do this in two cases, that of complex Dirichlet characters and that of real ones. The former is the easier.

4. Non-vanishing of the L-function

Case I: complex Dirichlet characters

Theorem 4.1 (Partial theorem). For non-trivial, complex Dirichlet characters $\chi$, $L(1, \chi) \neq 0$.

The proof will be by contradiction and rely on the following two lemmas.

Lemma 4.2. Let $s > 1$.
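Then $\prod_\chi L(s, \chi) \ge 1$; in particular, the product is real-valued.

Proof. Using the product formula for the L-function and the definition of $\log_1$, we see
\[
\prod_\chi L(s, \chi)
= \prod_\chi \exp\Big( \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big) \Big)
= \exp\Big( \sum_\chi \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big) \Big)
\]
\[
= \exp\Big( \sum_\chi \sum_{p \in \wp} \sum_{k=1}^\infty \frac{1}{k} \frac{\chi(p^k)}{p^{ks}} \Big)
= \exp\Big( \sum_{p \in \wp} \sum_{k=1}^\infty \frac{1}{k\, p^{ks}} \sum_\chi \chi(p^k) \Big).
\]
(Margin note: The product over all $\chi$ is real-valued.) Now, by the Fourier series expansion of the characteristic functions of points, $\delta_1(p^k) = \frac{1}{\varphi(q)} \sum_\chi \chi(p^k)$, so the above equals
\[
\exp\Big( \sum_{p \in \wp} \sum_{k=1}^\infty \frac{1}{k\, p^{ks}}\, \varphi(q)\, \delta_1(p^k) \Big) \ge 1,
\]
since the term in the parentheses is non-negative. (Margin notes: Another cool application of Fourier series. Error in text: $\delta_1$, not $\delta_0$. Unless by 0 they mean 1.)

Lemma 4.3.
i. If $L(1, \chi) = 0$, then $L(1, \bar{\chi}) = 0$ (obvious).
ii. If $L(1, \chi) = 0$ and $\chi$ is non-trivial, then $|L(s, \chi)| \le C|s - 1|$ for $1 \le s \le 2$.
iii. In the case of $\chi = \chi_0$, we have $|L(s, \chi_0)| \le \frac{C}{|s - 1|}$ for $1 < s \le 2$.

(Margin note: On the behavior of $L(s, \chi)$ as $s \downarrow 1$: how it vanishes; how it blows up.)

Proof. i) $L(1, \bar{\chi}) = \overline{L(1, \chi)}$.

ii) For $\chi$ non-trivial, $L(s, \chi)$ is continuously differentiable for $s > 0$. Since $L'(s, \chi)$ is continuous, it has a bound $C$ on $[1, 2]$. Thus, by the Mean Value Theorem (applied to the real and imaginary parts, adjusting $C$ by a constant factor), for $1 < s \le 2$,
\[
\Big| \frac{L(s, \chi) - L(1, \chi)}{s - 1} \Big| \le C,
\]
and $L(1, \chi) = 0$ gives the claim.

iii) Recall that $\zeta(s) \le 1 + \frac{1}{s - 1} = \frac{s}{s - 1}$. For $1 < s \le 2$, then,
\[
|\zeta(s)| \le \frac{2}{s - 1}.
\]
Since $L(s, \chi_0) = C(s)\, \zeta(s)$, where $C(s) = \prod_{p \mid q} (1 - p^{-s})$ lies between 0 and 1, we are done.

Theorem 4.4 (Partial theorem). For non-trivial, complex Dirichlet characters $\chi$, $L(1, \chi) \neq 0$.

(Margin note: The point: for complex Dirichlet characters, we're guaranteed that they occur in pairs; so if $L(1, \chi) = 0$, then $L(1, \bar{\chi}) = 0$, overpowering the blowing-up $L(s, \chi_0)$.)

Proof of the partial theorem. By contradiction: suppose $L(1, \chi_1) = 0$; then by the previous lemma,
i. $L(1, \bar{\chi}_1) = 0$ and
ii. for $\chi = \chi_1$ and $\chi = \bar{\chi}_1$, $|L(s, \chi)| \le C|s - 1|$ for $1 \le s < 2$.

If we consider the product $\prod_\chi L(s, \chi)$, then, we see that there are at least two Dirichlet characters ($\chi_1$ and $\bar{\chi}_1$, which are distinct because $\chi_1$ is complex) whose L-functions vanish like $|s - 1|$ as $s \downarrow 1$.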
$L(s, \chi)$ being continuously differentiable on $s > 0$ for $\chi$ non-trivial, we see the only term in this product that could (and does) go to infinity as $s \downarrow 1$ is $L(s, \chi_0)$. But its growth is of order $O(\frac{1}{s - 1})$, so the whole product is $O(|s - 1|)$ and
\[
\lim_{s \to 1^+} \prod_\chi L(s, \chi) = 0,
\]
in contradiction with the earlier lemma.

LECTURE 37: NON-VANISHING OF $L(1, \chi)$ FOR COMPLEX DIRICHLET CHARACTERS

(Margin notes: We'd like to be able to take the log of both sides of the above product formula. But we'll need to do a lot of work first.... The technical proposition: allowing us to define the log of $L$. Where do we need $0 < s < 1$? The $L(\cdot, \chi)$ functions make sense for $0 < s \le 1$, as long as $\chi$ is non-trivial. Obtain controls on the growth of $L$ and $\frac{d}{ds}L$. Notice we don't define $\log_2$ in general. The point is that this integral turns out, for a non-trivial Dirichlet character, to be the primary branch of the log. We need control of the growth to justify this definition.)

1. Recall the Technical Proposition

Proposition 1.1 (The technical proposition). Let $\chi$ be a non-trivial Dirichlet character. Then
i. $L(s, \chi) := \sum_{n=1}^\infty \frac{\chi(n)}{n^s}$ converges for $s > 0$.
ii. $L(s, \chi)$ is continuously differentiable in $s$.
iii. For some constants $c, c' > 0$, we have
\[
L(s, \chi) = 1 + O(e^{-cs}) \ \text{ as } s \to \infty, \qquad
\frac{d}{ds} L(s, \chi) = O(e^{-c's}) \ \text{ as } s \to \infty.
\]

2. What was the point of that? To define the log of $L(s, \chi)$

Definition 2.1. Let $s > 1$, $\chi$ a non-trivial Dirichlet character. We define the "logarithm of $L(s, \chi)$" by
\[
\log_2 L(s, \chi) := -\int_s^\infty \frac{L'(t, \chi)}{L(t, \chi)}\, dt.
\]

Remark. Notice that this integral converges since $L(t, \chi)$ can be made arbitrarily close to 1 by taking $t$ large, and $L'(t, \chi) = O(e^{-c't})$, which together imply
\[
\frac{L'(t, \chi)}{L(t, \chi)} = O(e^{-c't}).
\]

Proposition 2.2. Let $s > 1$. Then
i. $e^{\log_2 L(s, \chi)} = L(s, \chi)$ and, most importantly for us,
ii. $\log_2 L(s, \chi) = \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big)$.

(Margin note: Getting the logarithmic version of Dirichlet's product formula for the L-functions.)

Proof. i) We want to show that $e^{-\log_2 L(s, \chi)} L(s, \chi) = 1$.
Well, the derivative of the LHS with respect to $s$ is
\[
-\frac{L'(s, \chi)}{L(s, \chi)}\, e^{-\log_2 L(s, \chi)} L(s, \chi) + e^{-\log_2 L(s, \chi)} L'(s, \chi) = 0,
\]
so the LHS is constant with respect to $s$. Now consider
\[
\lim_{s \to \infty} e^{-\log_2 L(s, \chi)} L(s, \chi) := \lim_{s \to \infty} e^{\int_s^\infty \frac{L'(t, \chi)}{L(t, \chi)}\, dt}\, L(s, \chi).
\]
The integral, being convergent, vanishes as $s \to \infty$, so the first factor tends to 1 as $s \to \infty$; since $L(s, \chi) = 1 + O(e^{-cs})$, so does the second. Hence the constant is 1.

ii) Now let's show
\[
\log_2 L(s, \chi) = \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big).
\]
(Margin notes: The crucial second step in the second reduction. We'll show $e^A = e^B$, which almost implies $A = B$... but not quite.) Well, consider the exponential of both sides. For the LHS, $e^{\log_2 L(s, \chi)} = L(s, \chi)$, as we just saw. As for the RHS,
\[
e^{\sum_p \log_1 \left( \frac{1}{1 - \chi(p)/p^s} \right)}
= \prod_{p \in \wp} e^{\log_1 \left( \frac{1}{1 - \chi(p)/p^s} \right)}
= \prod_{p \in \wp} \frac{1}{1 - \frac{\chi(p)}{p^s}}
= L(s, \chi)
\]
also. We're not done! This only means that
\[
\log_2 L(s, \chi) - \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big) = 2\pi i\, M(s)
\]
for some integer-valued function $M$. It is left as an exercise to show that $M$ is continuous and that $\lim_{s \to \infty} M(s) = 0$, which forces $M \equiv 0$ (since $M(s) \in \mathbb{Z}$).

(Margin notes: What was the point of all that? Following the same lines as the argument of Euler, but to show finiteness rather than infiniteness. Here's why we needed the integral representation of $\log L$. Now comes the deep part.... My God, what a wonderful piece of work.)

3. Second reduction of the problem

By the proposition just proved and subsequently the properties of $\log_1$ proven earlier,
\[
\log_2 L(s, \chi) = \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big)
= \sum_{p \in \wp} \Big[ \frac{\chi(p)}{p^s} + E_1\Big( \frac{\chi(p)}{p^s} \Big) \Big]
= \sum_{p \in \wp} \frac{\chi(p)}{p^s} + O(1),
\]
the error being bounded because $|E_1(\chi(p)/p^s)| \le p^{-2s}$ and $\sum_p p^{-2}$ converges.

Now, if we can show that $L(1, \chi) \neq 0$ for $\chi$ non-trivial, then since $L(s, \chi)$ is continuously differentiable for $0 < s < \infty$, $\frac{L'(s, \chi)}{L(s, \chi)}$ is bounded near $s = 1$, so
\[
\lim_{s \downarrow 1} \log_2 L(s, \chi) := -\lim_{s \to 1^+} \int_s^\infty \frac{L'(t, \chi)}{L(t, \chi)}\, dt \quad \text{is bounded.}
\]
In that case, by the above equality, $\sum_{p \in \wp} \frac{\chi(p)}{p^s}$ is also bounded as $s \downarrow 1$; that is, the reduced problem would be solved. Thus it suffices to show the following claim:

Theorem 3.1.
$L(1, \chi) \neq 0$ for non-trivial $\chi$.

We will do this in two cases, that of complex Dirichlet characters and that of real ones. The former is the easier.

4. Non-vanishing of the L-function

Case I: complex Dirichlet characters

Theorem 4.1 (Partial theorem). For non-trivial, complex Dirichlet characters $\chi$, $L(1, \chi) \neq 0$.

The proof will be by contradiction and rely on the following two lemmas.

Lemma 4.2. Let $s > 1$. Then $\prod_\chi L(s, \chi) \ge 1$; in particular, the product is real-valued.

Proof. Using the product formula for the L-function and the definition of $\log_1$, we see
\[
\prod_\chi L(s, \chi)
= \prod_\chi \exp\Big( \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big) \Big)
= \exp\Big( \sum_\chi \sum_{p \in \wp} \log_1 \Big( \frac{1}{1 - \frac{\chi(p)}{p^s}} \Big) \Big)
\]
\[
= \exp\Big( \sum_\chi \sum_{p \in \wp} \sum_{k=1}^\infty \frac{1}{k} \frac{\chi(p^k)}{p^{ks}} \Big)
= \exp\Big( \sum_{p \in \wp} \sum_{k=1}^\infty \frac{1}{k\, p^{ks}} \sum_\chi \chi(p^k) \Big).
\]
Now, by the Fourier series expansion of the characteristic functions of points, $\delta_1(p^k) = \frac{1}{\varphi(q)} \sum_\chi \chi(p^k)$, so the above equals
\[
\exp\Big( \sum_{p \in \wp} \sum_{k=1}^\infty \frac{1}{k\, p^{ks}}\, \varphi(q)\, \delta_1(p^k) \Big) \ge 1,
\]
since the term in the parentheses is non-negative. (Margin notes: Another cool application of Fourier series. Recall $\delta_\ell(m) = \frac{1}{\varphi(q)} \sum_\chi \overline{\chi(\ell)}\, \chi(m)$. Error in text: $\delta_1$, not $\delta_0$. Unless by 0 they mean 1.)

Lemma 4.3.
i. If $L(1, \chi) = 0$, then $L(1, \bar{\chi}) = 0$ (obvious).
ii. If $L(1, \chi) = 0$ and $\chi$ is non-trivial, then $|L(s, \chi)| \le C|s - 1|$ for $1 \le s \le 2$.
iii. In the case of $\chi = \chi_0$, we have $|L(s, \chi_0)| \le \frac{C}{|s - 1|}$ for $1 < s \le 2$.

(Margin notes: On the behavior of $L(s, \chi)$ as $s \downarrow 1$: how it vanishes (if it vanishes); how it blows up (if $\chi = \chi_0$). Recall $\sum \frac{1}{n^s} \le 1 + \int_1^\infty \frac{dx}{x^s} = 1 + \frac{1}{s - 1}$.)

Proof. i) $L(1, \bar{\chi}) = \overline{L(1, \chi)}$.

ii) For $\chi$ non-trivial, $L(s, \chi)$ is continuously differentiable for $s > 0$. Since $L'(s, \chi)$ is continuous, it has a bound $C$ on $[1, 2]$. Thus, by the Mean Value Theorem (applied to the real and imaginary parts, adjusting $C$ by a constant factor), for $1 < s \le 2$,
\[
\Big| \frac{L(s, \chi) - L(1, \chi)}{s - 1} \Big| \le C, \tag{1}
\]
and $L(1, \chi) = 0$ gives the claim.

iii) Recall that $\zeta(s) \le 1 + \frac{1}{s - 1} = \frac{s}{s - 1}$. For $1 < s \le 2$, then,
\[
|\zeta(s)| \le \frac{2}{s - 1}. \tag{2}
\]
Since $L(s, \chi_0) = c(s)\, \zeta(s)$, where $c(s) = \prod_{p \mid q} (1 - p^{-s})$ lies between 0 and 1, we are done.

Theorem 4.4 (Partial theorem). For non-trivial, complex Dirichlet characters $\chi$, $L(1, \chi) \neq 0$.
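Lemma 4.2 lends itself to a quick numerical sanity check. The following is a minimal sketch, not part of the original notes, under these assumptions: the modulus is $q = 5$, the four characters are built from the generator 2 of $(\mathbb{Z}/5\mathbb{Z})^*$, and each L-function is replaced by a truncated series (the names `chi`, `L_truncated`, `product_over_characters` and the cutoff `terms` are hypothetical choices for illustration).

```python
# Sketch: check numerically that the product of L(s, chi) over all
# Dirichlet characters mod 5 is real and at least 1 for a few s > 1.
# Assumption: each L-series is truncated after `terms` summands.

Q = 5
DLOG = {1: 0, 2: 1, 4: 2, 3: 3}   # discrete logarithm base 2 in (Z/5Z)^*
ROOTS = (1, 1j, -1, -1j)          # the fourth roots of unity i^0, ..., i^3


def chi(j, n):
    """j-th Dirichlet character mod 5 (j = 0 is the trivial one)."""
    r = n % Q
    if r == 0:
        return 0
    return ROOTS[(j * DLOG[r]) % 4]


def L_truncated(s, j, terms=20000):
    """Partial sum of L(s, chi_j) = sum_n chi_j(n) / n^s."""
    return sum(chi(j, n) / n ** s for n in range(1, terms + 1))


def product_over_characters(s, terms=20000):
    """Product of the truncated L(s, chi_j) over all four characters."""
    prod = 1
    for j in range(4):
        prod *= L_truncated(s, j, terms)
    return complex(prod)


if __name__ == "__main__":
    for s in (1.5, 2.0, 3.0):
        p = product_over_characters(s)
        print(s, p.real, p.imag)
```

The imaginary part should be zero up to rounding, since the characters with $j = 1$ and $j = 3$ are complex conjugates of each other, and the real part should sit slightly above 1 and approach 1 as $s$ grows.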
(Margin note: The point: for complex Dirichlet characters, we're guaranteed that they occur in pairs; so if $L(1, \chi) = 0$, then $L(1, \bar{\chi}) = 0$, and the two zeros overpower the blowing-up of $L(s, \chi_0)$ as $s \downarrow 1$.)

Proof of the partial theorem. By contradiction: suppose $L(1, \chi_1) = 0$; then, by the previous lemma, $L(1, \bar{\chi}_1) = 0$ also, and for both $\chi = \chi_1, \bar{\chi}_1$, we have
\[
|L(s, \chi)| \le C|s - 1| \quad \text{for } 1 \le s < 2. \tag{3}
\]
Now consider $\prod_\chi L(s, \chi)$. For non-trivial $\chi$, $L(s, \chi)$ is continuously differentiable on $s > 0$ and thus finite at $s = 1$; among those Dirichlet characters, there are at least two (viz., $\chi_1$ and $\bar{\chi}_1$, which are distinct because $\chi_1$ is complex) whose L-functions vanish like $O(|s - 1|)$ as $s \downarrow 1$. The only term that can (and does) go to infinity as $s \downarrow 1$ is $L(s, \chi_0)$, which grows like $O\big(\frac{1}{s - 1}\big)$. Thus
\[
\lim_{s \to 1^+} \prod_\chi L(s, \chi) = 0, \tag{4}
\]
in contradiction with the fact that $\prod_\chi L(s, \chi) \ge 1$.

LECTURE 38: NON-VANISHING OF $L(1, \chi)$ FOR NON-TRIVIAL REAL DIRICHLET CHARACTERS

1. Recall the Goal; Behavior of p-series

Our goal is to prove the following.

Theorem 1.1 (Partial theorem, part II). For non-trivial, real Dirichlet characters $\chi$, $L(1, \chi) \neq 0$.

We stated (without proof) the following propositions. (Margin note: Precise estimates on the partial sums of certain divergent p-series.)

Proposition 1.2. There exists a number $\gamma$ (the Euler-Mascheroni constant) satisfying
\[
\sum_{n=1}^N \frac{1}{n} - \log N = \gamma + O(1/N).
\]
(See en.wikipedia.org/wiki/Euler-Mascheroni_constant)

Proposition 1.3. For $N \in \mathbb{N}$,
\[
\sum_{n=1}^N \frac{1}{n^{1/2}} - 2N^{1/2} = c + O(1/N^{1/2}).
\]

2. Approximating $L(1, \chi)$ with hyperbolic sums

To finish the proof of Dirichlet's theorem we will show that $L(1, \chi)$ is well-approximated by a particular hyperbolic sum, namely
\[
S_N := \sum_{(m,n) \in A_N} \frac{\chi(n)}{\sqrt{nm}},
\quad \text{where} \quad
A_N := \{(m, n) \in \mathbb{N} \times \mathbb{N} : mn \le N\}.
\]
(Margin notes: The trick: approximating $L(1, \chi)$ with a particular hyperbolic sum. Q. Why is it obvious that this hyperbolic sum approximates $2\sqrt{N}\, L(1, \chi)$? Recall $L(1, \chi) = \sum \frac{\chi(n)}{n}$.) We'll see that
\[
L(1, \chi) := \sum_{n=1}^\infty \frac{\chi(n)}{n} \approx \frac{1}{2\sqrt{N}} \sum_{(m,n) \in A_N} \frac{\chi(n)}{\sqrt{mn}} =: \frac{1}{2\sqrt{N}}\, S_N
\]
and, further, that the sum grows at least as fast as a constant multiple of $\log N$. (Margin notes: Thus if we prove the proposition, the proof of Dirichlet's theorem is over! Analysis of the sum of $\chi$ over divisors of $k$.) Precisely:

Proposition 2.1. Let $\chi$ be a non-trivial real Dirichlet character. With the above definitions,
i. $\exists\, c > 0$ such that $S_N \ge c \log N$.
ii. $S_N = 2N^{1/2} L(1, \chi) + O(1)$.

Remark. The above proposition finishes the theorem, for if $L(1, \chi) = 0$ then (ii) would imply that $S_N = O(1)$, while (i) states that $S_N \ge c \log N$; a contradiction.

To prove the two parts of the proposition we will need the following two lemmas.

Lemma 2.2. Let $k \in \mathbb{N}$. Then
\[
\sum_{n \mid k} \chi(n) \ge \begin{cases} 0 & \text{for all } k, \\ 1 & \text{if } k = \ell^2 \text{ for some } \ell \in \mathbb{Z}. \end{cases}
\]

Proof of lemma. (Margin note: Simple case first: $k$ a prime power.) Case I: $k = p^\alpha$ for some prime $p$. Then
\[
\sum_{n \mid k} \chi(n) = \chi(1) + \chi(p) + \chi(p^2) + \cdots + \chi(p^\alpha)
= \chi(1) + \chi(p) + \chi(p)^2 + \cdots + \chi(p)^\alpha.
\]
Now, $\chi$ being a real Dirichlet character, $\chi(p) \in \{-1, 0, 1\}$; so
\[
\sum_{n \mid k} \chi(n) = \begin{cases}
\alpha + 1 & \text{if } \chi(p) = 1, \\
1 & \text{if } \chi(p) = -1 \text{ and } \alpha \text{ is even,} \\
0 & \text{if } \chi(p) = -1 \text{ and } \alpha \text{ is odd,} \\
1 & \text{if } \chi(p) = 0, \text{ i.e., } p \mid q;
\end{cases}
\]
i.e., as desired, $\sum_{n \mid k} \chi(n)$ is greater than or equal to 0, and can equal zero only when $\alpha$ is odd (i.e., $k$ is not a square).

(Margin notes: The general case follows immediately from the first. Bad notation in text: using $N$ for two purposes.) Case II: General $k$. If $k = p_1^{\alpha_1} \cdots p_M^{\alpha_M}$, then the divisors of $k$ consist of the set
\[
\{p_1^{\beta_1} \cdots p_M^{\beta_M} : 0 \le \beta_j \le \alpha_j;\ j = 1, \ldots, M\}.
\]
Thus every divisor is a product of $M$ prime powers $p_j^{\beta_j}$ with exponents 0 through $\alpha_j$ (and every such product is a divisor); so, by multiplicativity,
\[
\sum_{n \mid k} \chi(n) = \prod_{j=1}^M \Big( \chi(p_j^0) + \chi(p_j^1) + \chi(p_j^2) + \cdots + \chi(p_j^{\alpha_j}) \Big).
\]
As before, the only possibility of getting a 0 out of this is if one of the $\alpha_j$ is odd, in which case $k$ is not a square. (Margin note: The product can equal zero only when one of the sums is 0; by the previous case, that $\alpha_j$ must be odd.)

Lemma 2.3 (Second lemma). Let $a, b \in \mathbb{N}$. If $a < b$ then
i. $\sum_{n=a}^b \frac{\chi(n)}{\sqrt{n}} = O(a^{-1/2})$ and
ii. $\sum_{n=a}^b \frac{\chi(n)}{n} = O(a^{-1})$.

Remark.
The proofs of these two facts are relatively straightforward; we leave them to you.

Now we can prove the proposition, which we restate for convenience.

Proposition 2.4. Let $\chi$ be a non-trivial real Dirichlet character. With the above definitions,
i. $\exists\, c > 0$ such that $S_N \ge c \log N$.
ii. $S_N = 2N^{1/2} L(1, \chi) + O(1)$.

Proof. (i) (Margin note: Summing along hyperbolae gives the lower bound result.) Summing along hyperbolae, we see
\[
S_N := \sum_{(m,n) \in A_N} \frac{\chi(n)}{\sqrt{nm}}
= \sum_{k=1}^N \sum_{nm = k} \frac{\chi(n)}{\sqrt{nm}}
= \sum_{k=1}^N \frac{1}{\sqrt{k}} \Big( \sum_{n \mid k} \chi(n) \Big).
\]
(The pair $(m, n)$ such that $nm = k$ is determined by the first entry.) By the first lemma, then, since the term in parentheses is $\ge 0$ always and $\ge 1$ when $k$ is a square,
\[
S_N \ge \sum_{k = \ell^2,\ \ell \le N^{1/2}} \frac{1}{\sqrt{k}}
= \sum_{\ell \le N^{1/2}} \frac{1}{\ell}
= \log N^{1/2} + O(1),
\]
which is the desired lower bound on $S_N$ (the last equality follows from the estimate of the partial sum of the harmonic series).

(ii) (Margin notes: Summing vertically and horizontally gives $L(1, \chi)$. Break the sum into various regions. Draw the hyperbola correctly!) Consider:
\[
S_N := \sum_{(m,n) \in A_N} \frac{\chi(n)}{\sqrt{nm}}.
\]
We separate $S_N$ into the sum $S_N = S_I + S_{II} + S_{III}$, where the indices for the respective sums lie in the regions
\[
I := \{(n, m) \in \mathbb{N}^2 : 1 \le m < \sqrt{N},\ \sqrt{N} < n \le \tfrac{N}{m}\},
\]
\[
II := \{(n, m) \in \mathbb{N}^2 : 1 \le m \le \sqrt{N},\ 1 \le n \le \sqrt{N}\},
\]
\[
III := \{(n, m) \in \mathbb{N}^2 : \sqrt{N} < m \le \tfrac{N}{n},\ 1 \le n < \sqrt{N}\}.
\]
Consider $S_I$: summing vertically, we get
\[
S_I := \sum_{1 \le m < \sqrt{N},\ \sqrt{N} < n \le N/m} \frac{\chi(n)}{\sqrt{nm}}
= \sum_{m < \sqrt{N}} \frac{1}{\sqrt{m}} \Big( \sum_{\sqrt{N} < n \le N/m} \frac{\chi(n)}{\sqrt{n}} \Big).
\]
(Margin note: Use the bound on $\sum \frac{\chi(n)}{\sqrt{n}}$ and the bound on the $p = \frac{1}{2}$-series.) By the second lemma, the term in parentheses is of the order $O((N^{1/2})^{-1/2}) = O(N^{-1/4})$; using the $\frac{1}{2}$-series estimate we get
\[
S_I = \Big[ 2N^{1/4} + c + O\Big(\frac{1}{N^{1/4}}\Big) \Big]\, O(N^{-1/4}) = O(1)
\]
as $N \to \infty$.

For the other two terms, we sum horizontally. Using the
Using the 1 SII + SIII : use the 2 -series estimate, 1 2 -series bound again X χ(n) X 1 SII + SIII = 1/2 m1/2 √ n N 1≤m≤ 1≤n< N ( n ) 1/2 X χ(n) n 1/2 N = 2 + c + O 1/2 n N √ n 1≤n< N X χ(n) X X χ(n) 1 +c + O 1 , = 2N 1/2 1/2 1/2 n n N 1/2 1/2 1/2 1≤n≤N 1≤n≤N 1≤n≤N the last term from X 1≤n<N χ(n) n1/2 1 = n1/2 N 1/2 N 1/2 1/2 X χ(n). 1≤n≤N 1/2 Let’s call these three terms A, B, and C. We’ll deal with them in reverse order. Obviously C = O(1). Further, part (i) of the lemma implies that X 1≤n≤N χ(n) = O(1); 1/2 n 1/2 thus B = O(1). P χ(n) Now, since L(1, χ) = ∞ n=1 n , and since ∞ X χ(n) −1/2 =O N , n 1/2 n=N C and B are O(1) A: we basically have L(1, χ) 6 PROOF OF DIRICHLET’S THEOREM we see that χ(n) n 1≤n≤N 1/2 h i 1/2 −1/2 = 2N L(1, χ) − O N A := 2N 1/2 X = 2N 1/2 L(1, χ) + O(1). giving the desired estimate for SN .