The Dirac δ-Function and Weierstrass's Theorem

1 Physicist's Approach

There are a number of ways to motivate the introduction of the Dirac delta function, and we will look at two of them. Since you may not be familiar with Fourier transforms, I will begin with a brief derivation that assumes only that you are familiar with Fourier series.

The Fourier series of a continuous and periodic function $f(x)$ is given by

$$f(x) = A_0 + \sum_{n=1}^{\infty}\left[A_n\cos\frac{n\pi x}{a} + B_n\sin\frac{n\pi x}{a}\right]$$

where we have chosen the period to be $p = 2a$. Using the identity $e^{\pm i\theta} = \cos\theta \pm i\sin\theta$ we can write

$$\cos\frac{n\pi x}{a} = \frac{1}{2}\left(e^{in\pi x/a} + e^{-in\pi x/a}\right) \qquad\text{and}\qquad \sin\frac{n\pi x}{a} = \frac{1}{2i}\left(e^{in\pi x/a} - e^{-in\pi x/a}\right)$$

so that

$$f(x) = A_0 + \sum_{n=1}^{\infty}\left[\left(\frac{A_n}{2} + \frac{B_n}{2i}\right)e^{in\pi x/a} + \left(\frac{A_n}{2} - \frac{B_n}{2i}\right)e^{-in\pi x/a}\right] = C_0 + \sum_{n=1}^{\infty}\left[C_n e^{in\pi x/a} + C_{-n}e^{-in\pi x/a}\right]$$

or simply

$$f(x) = \sum_{n=-\infty}^{\infty} C_n e^{in\pi x/a}. \tag{1}$$

To find the coefficient $C_m$ we multiply by $e^{-im\pi x/a}$ and integrate from $-a$ to $a$:

$$\int_{-a}^{a} f(x)\,e^{-im\pi x/a}\,dx = \sum_{n=-\infty}^{\infty} C_n \int_{-a}^{a} e^{i(n-m)\pi x/a}\,dx = \sum_{n=-\infty}^{\infty} C_n\,2a\,\delta_{nm} = 2aC_m$$

so that

$$C_m = \frac{1}{2a}\int_{-a}^{a} f(x)\,e^{-im\pi x/a}\,dx. \tag{2}$$

To handle a non-periodic function, we assume that it is periodic, but with period $p = 2a \to \infty$. Let us define $k_n = n\pi/a$. The difference between two successive values of $k_n$ is $\delta k = \pi/a$, or $a = \pi/\delta k$. Then we can write equations (1) and (2) as

$$f(x) = \sum_{n=-\infty}^{\infty} C_n e^{ik_n x}$$

and

$$C_n = \frac{\delta k}{2\pi}\int_{-a}^{a} f(x')\,e^{-ik_n x'}\,dx'$$

so that

$$f(x) = \frac{1}{2\pi}\sum_{n=-\infty}^{\infty} e^{ik_n x}\,\delta k \int_{-a}^{a} f(x')\,e^{-ik_n x'}\,dx'.$$

As we let $a \to \infty$ we have $\delta k \to 0$, and the sum becomes an integral:

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,dk \int_{-\infty}^{\infty} f(x')\,e^{-ikx'}\,dx'.$$

If we define the Fourier transform of $f(x)$ by

$$\tilde f(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)\,e^{-ikx}\,dx \tag{3a}$$

then we can write

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \tilde f(k)\,e^{ikx}\,dk. \tag{3b}$$

Equations (3) are called a Fourier transform pair. Note that it really doesn't matter which of these has the positive exponent in the integrand. They can also be written in a form where one has the coefficient $1/2\pi$ and the other has no such factor at all.
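As a quick numerical sanity check of equations (1) and (2), the following Python sketch computes the coefficients $C_m$ by a midpoint rule and sums the truncated series. The test function $f(x) = x^2$ on $[-1, 1]$, the function names, the sample count and the truncation order are all assumptions chosen for illustration; they are not part of the derivation above.

```python
import cmath
import math


def fourier_coefficient(f, m, a, samples=2000):
    """Midpoint-rule estimate of C_m = (1/2a) * int_{-a}^{a} f(x) exp(-i m pi x / a) dx, eq. (2)."""
    h = 2.0 * a / samples
    total = 0.0 + 0.0j
    for j in range(samples):
        x = -a + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * m * math.pi * x / a)
    return total * h / (2.0 * a)


def partial_sum(f, x, a, n_max):
    """Truncated version of eq. (1): sum over |n| <= n_max of C_n exp(i n pi x / a)."""
    s = 0.0 + 0.0j
    for n in range(-n_max, n_max + 1):
        s += fourier_coefficient(f, n, a) * cmath.exp(1j * n * math.pi * x / a)
    return s


if __name__ == "__main__":
    a = 1.0

    def f(x):
        # Hypothetical test function; its 2a-periodic extension is continuous.
        return x * x

    c0 = fourier_coefficient(f, 0, a)
    approx = partial_sum(f, 0.3, a, n_max=50)
    print("C_0 (mean value of f over one period):", c0.real)
    print("partial sum at x = 0.3:", approx.real, " exact value:", f(0.3))
```

The partial sum should approach $f(0.3)$ as `n_max` grows, and its imaginary part should be at the level of rounding error because $C_{-n}$ is the complex conjugate of $C_n$ for a real function.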
Furthermore, the generalization to three dimensions is just

$$\tilde f(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}}\int_{-\infty}^{\infty} f(\mathbf{x})\,e^{-i\mathbf{k}\cdot\mathbf{x}}\,d^3x \tag{4a}$$

and

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int_{-\infty}^{\infty} \tilde f(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{x}}\,d^3k. \tag{4b}$$

Let us substitute (3a) into (3b) to write

$$f(x) = \int_{-\infty}^{\infty} f(x')\left[\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-x')}\,dk\right]dx'. \tag{5}$$

Now look at the term in brackets:

$$\begin{aligned}
I &= \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-x')}\,dk = \lim_{a\to\infty}\frac{1}{2\pi}\int_{-a}^{a} e^{ik(x-x')}\,dk \\
&= \lim_{a\to\infty}\frac{1}{2\pi}\left[\frac{1}{i(x-x')}\,e^{ik(x-x')}\right]_{-a}^{a} \\
&= \lim_{a\to\infty}\frac{1}{\pi(x-x')}\,\frac{1}{2i}\left[e^{ia(x-x')} - e^{-ia(x-x')}\right] \\
&= \lim_{a\to\infty}\frac{\sin a(x-x')}{\pi(x-x')}.
\end{aligned}$$

Consider the function

$$g(x,x') = \frac{\sin a(x-x')}{\pi(x-x')} = \frac{a}{\pi}\,\frac{\sin a(x-x')}{a(x-x')}.$$

A plot of this for $a = 10$ is shown in Figure 1 below.

[Figure 1: The function $(a/\pi)[\sin a(x-x')]/[a(x-x')]$ for $a = 10$.]

Since

$$\lim_{\theta\to 0}\frac{\sin\theta}{\theta} = 1$$

we see that $g(x,x')$ has height $a/\pi$, and first crosses the $(x-x')$-axis at $x - x' = \pm\pi/a$. Thus the area of the central peak is approximated by the area of the triangle with height $a/\pi$ and base $2\pi/a$, for an area of $(1/2)(2\pi/a)(a/\pi) = 1$, which is independent of $a$. This underestimates the area under the main peak but also ignores the tails, whose net contribution is negative, and these two errors actually cancel each other out: the integral of $g(x,x')$ over all $x'$ is exactly 1 for every $a > 0$. Therefore, as $a \to \infty$ the height of $g(x,x')$ becomes infinite as its width goes to zero, while the area under the curve remains equal to 1. Writing

$$\delta(x-x') := \lim_{a\to\infty} g(x,x') = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-x')}\,dk \tag{6a}$$

we see from (5) that

$$f(x) = \int_{-\infty}^{\infty} f(x')\,\delta(x-x')\,dx' \tag{6b}$$

and from the above discussion we also have

$$\int_{-\infty}^{\infty} \delta(x-x')\,dx' = 1. \tag{6c}$$

These are the two main defining properties of the Dirac delta function. In three dimensions these become

$$\delta(\mathbf{x}-\mathbf{x}') = \frac{1}{(2\pi)^3}\int_{-\infty}^{\infty} e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}\,d^3k \tag{7a}$$

where

$$f(\mathbf{x}) = \int_{-\infty}^{\infty} f(\mathbf{x}')\,\delta(\mathbf{x}-\mathbf{x}')\,d^3x' \tag{7b}$$

and

$$\int_{-\infty}^{\infty} \delta(\mathbf{x}-\mathbf{x}')\,d^3x' = 1. \tag{7c}$$

There is a second way we can define the delta function that is sometimes very useful. It starts with the Heaviside step function $\theta(x)$ (also sometimes written as $H(x)$) defined by

$$\theta(x) = \begin{cases} 1 & \text{for } x > 0 \\ 0 & \text{for } x \le 0. \end{cases}$$
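(Sometimes this is defined with strict inequalities and $\theta(0) = 1/2$.) From the definition of derivative, it should be clear that

$$\left.\frac{d\theta(x)}{dx}\right|_{x\ne 0} = 0 \qquad\text{while}\qquad \left.\frac{d\theta(x)}{dx}\right|_{x=0} = \infty.$$

However, if $f(x)$ is a smooth function, then for any $\varepsilon > 0$ we have

$$\begin{aligned}
\int_{-\varepsilon}^{\varepsilon} f(x)\,\frac{d\theta(x)}{dx}\,dx &= \int_{-\varepsilon}^{\varepsilon} \frac{d}{dx}\bigl[f(x)\theta(x)\bigr]\,dx - \int_{-\varepsilon}^{\varepsilon} \theta(x)\,\frac{df(x)}{dx}\,dx \\
&= f(\varepsilon) - \int_{0}^{\varepsilon} \frac{df(x)}{dx}\,dx = f(0).
\end{aligned}$$

Thus we define the "function"

$$\delta(x) := \frac{d\theta(x)}{dx}$$

which, for $\varepsilon > 0$, has the property that

$$\int_{-\varepsilon}^{\varepsilon} f(x)\,\delta(x)\,dx = f(0) \qquad\text{and}\qquad \int_{-\varepsilon}^{\varepsilon} \delta(x)\,dx = 1.$$

We can easily generalize this by considering the function

$$\theta(x-a) = \begin{cases} 1 & \text{for } x > a \\ 0 & \text{for } x \le a. \end{cases}$$

Now we define

$$\delta(x-a) = \frac{d\theta(x-a)}{dx}$$

and integrating by parts again we see that

$$\begin{aligned}
\int_{-\infty}^{\infty} f(x)\,\delta(x-a)\,dx &= \int_{-\infty}^{\infty} f(x)\,\frac{d\theta(x-a)}{dx}\,dx \\
&= \Bigl[f(x)\theta(x-a)\Bigr]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} \theta(x-a)\,\frac{df(x)}{dx}\,dx \\
&= f(\infty) - \int_{a}^{\infty} \frac{df(x)}{dx}\,dx = f(a).
\end{aligned}$$

Therefore $\delta(x-a)$ has the property that

$$\int_{-\infty}^{\infty} f(x)\,\delta(x-a)\,dx = f(a) \qquad\text{and}\qquad \int_{-\infty}^{\infty} \delta(x-a)\,dx = 1.$$

If we are in three dimensions, then we define

$$\delta^3(\mathbf{x}-\mathbf{x}_0) = \delta(x-x_0)\,\delta(y-y_0)\,\delta(z-z_0).$$

(It is very common to write simply $\delta(\mathbf{x})$ instead of $\delta^3(\mathbf{x})$.) If the coordinates are not Cartesian, then we have to be careful. For example, suppose we are using spherical coordinates $(r,\theta,\phi)$. Then the volume element is $d^3r = r^2\sin\theta\,dr\,d\theta\,d\phi$ and we must define

$$\delta^3(\mathbf{r}-\mathbf{r}_0) = \frac{\delta(r-r_0)\,\delta(\theta-\theta_0)\,\delta(\phi-\phi_0)}{r^2\sin\theta}$$

so that we still have

$$\int_{0}^{\infty} r^2\,dr \int_{0}^{\pi} \sin\theta\,d\theta \int_{0}^{2\pi} d\phi\;\delta^3(\mathbf{r}-\mathbf{r}_0) = 1.$$

2 More Rigorous Approach

In physics, we may define the δ-function by the following properties: $\delta(x) = 0$ everywhere except at the singular point $x = 0$, and

$$\int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx = f(0) \tag{8}$$

so that taking $f(x) = 1$ yields

$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1. \tag{9}$$

By changing variables to $y = x - x_0$ in the following integral, it is easy to see using (8) that

$$\int_{-\infty}^{\infty} f(x)\,\delta(x-x_0)\,dx = f(x_0). \tag{10}$$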
Since a real-valued function of a real variable is a rule that assigns to each number in a set another number, the δ-function is not a true function (what happens at $x = 0$?), but behaves like a true function "almost everywhere." In addition, since the integral of a function that vanishes everywhere except on a set of measure zero (and in particular, except at a single point) is zero, there is no function $\delta(x)$ that can vanish almost everywhere and still satisfy (8) or (9).

In fact, the δ-function is really a shorthand notation for a limiting process. The idea is that we want a set of functions $\delta_\alpha(x)$, parametrized by the index $\alpha$, that have the properties

$$\lim_{\alpha\to 0} \delta_\alpha(x) = 0 \quad\text{for } x \ne 0 \tag{11}$$

and

$$\lim_{\alpha\to 0} \int_{-\infty}^{\infty} f(x)\,\delta_\alpha(x)\,dx = f(0). \tag{12}$$

If we denote $\lim_{\alpha\to 0}\delta_\alpha(x)$ by $\delta(x)$ in these equations, then our previous equations result (assuming that the limits can be interchanged with the integration, which is not necessarily true in general). Let us now consider some examples of this limiting process.

1. The simplest set of functions with the proper behavior is the set of functions $\delta_c(x)$ defined by

$$\delta_c(x) = \begin{cases} 1/c & \text{for } |x| \le c/2 \\ 0 & \text{for } |x| > c/2. \end{cases}$$

See Figure 2 below.

[Figure 2: The function $\delta_c(x)$ for $c = 1.0$, $0.5$ and $0.1$.]

Clearly we have $\lim_{c\to 0}\delta_c(x) = 0$ for all $x \ne 0$, and $\int_{-\infty}^{\infty}\delta_c(x)\,dx = 1$ independent of $c$. The function $\delta_c(x)$ is defined for all $c > 0$, and we have

$$\lim_{c\to 0} \int_{-\infty}^{\infty} \delta_c(x)\,dx = 1.$$

Furthermore, if $f(x)$ is continuous we have

$$\lim_{c\to 0} \int_{-\infty}^{\infty} f(x)\,\delta_c(x)\,dx = \lim_{c\to 0} \int_{-c/2}^{c/2} f(x)\,\delta_c(x)\,dx = \lim_{c\to 0} \frac{1}{c}\int_{-c/2}^{c/2} f(x)\,dx.$$

But, by the mean value theorem for integrals, there exists $\xi$ with $-1/2 < \xi < 1/2$ such that

$$\int_{-c/2}^{c/2} f(x)\,dx = f(\xi c)\int_{-c/2}^{c/2} dx = c\,f(\xi c).$$

Finally, letting $c \to 0$ we obtain

$$\lim_{c\to 0} \int_{-\infty}^{\infty} f(x)\,\delta_c(x)\,dx = f(0).$$

2. Another representation of the δ-function comes from the sequence of Gaussian functions defined by

$$\delta_a(x) = \frac{1}{a\sqrt{\pi}}\,e^{-x^2/a^2}.$$

See Figure 3 below. Note that $\lim_{a\to 0}\delta_a(x) = 0$ for all $x \ne 0$.
And using the well-known result (a proof is given at the end of this example)

$$\int_{-\infty}^{\infty} e^{-\alpha x^2}\,dx = \sqrt{\frac{\pi}{\alpha}}$$

we see that $\int_{-\infty}^{\infty}\delta_a(x)\,dx = 1$ independent of $a$. Lastly, as $a \to 0$ the entire contribution to the integral comes from a neighborhood of 0, and hence

$$\lim_{a\to 0} \int_{-\infty}^{\infty} f(x)\,\delta_a(x)\,dx = f(0).$$

[Figure 3: The function $\delta_a(x)$ for $a = 1.0$, $0.5$ and $0.1$.]

Thus we may write symbolically

$$\delta(x) = \lim_{a\to 0}\delta_a(x) = \lim_{a\to 0}\frac{1}{a\sqrt{\pi}}\,e^{-x^2/a^2}.$$

Here is the proof of the above quoted result. Let $I = \int_{-\infty}^{\infty} e^{-\alpha x^2}\,dx$. Then

$$\begin{aligned}
I^2 &= \int_{-\infty}^{\infty} e^{-\alpha x^2}\,dx \int_{-\infty}^{\infty} e^{-\alpha y^2}\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\alpha(x^2+y^2)}\,dx\,dy \\
&= \int_{0}^{2\pi}\int_{0}^{\infty} e^{-\alpha r^2}\,r\,dr\,d\theta = 2\pi\left(\frac{-1}{2\alpha}\right)\int_{0}^{\infty} e^{-\alpha r^2}(-2\alpha r)\,dr \\
&= -\frac{\pi}{\alpha}\int_{0}^{-\infty} e^{u}\,du = \frac{\pi}{\alpha}
\end{aligned}$$

and taking the square root proves the result.

3. A third useful representation of the δ-function is

$$\delta(x) = \lim_{\varepsilon\to 0}\delta_\varepsilon(x) := \lim_{\varepsilon\to 0}\frac{1}{\pi}\,\frac{\varepsilon}{x^2+\varepsilon^2}.$$

See Figure 4 below. Using

$$\frac{\varepsilon}{\pi}\int_{-\infty}^{\infty}\frac{dx}{x^2+\varepsilon^2} = \frac{\varepsilon}{\pi}\int_{-\pi/2}^{\pi/2}\frac{d\theta}{\varepsilon} = 1$$

(which follows from the trig substitution $x = \varepsilon\tan\theta$) we can follow the previous approach to verify the desired properties.

[Figure 4: The function $\delta_\varepsilon(x)$ for $\varepsilon = 1.0$, $0.5$ and $0.1$.]

4. Our last representation of the δ-function is slightly different from the preceding ones. It will be integral to the proof of Weierstrass's theorem below. Define the functions $\delta_n(x)$ by

$$\delta_n(x) = \begin{cases} c_n(1-x^2)^n & \text{for } 0 \le |x| \le 1, \\ 0 & \text{for } |x| > 1 \end{cases} \qquad n = 1, 2, 3, \ldots \tag{13}$$

where the normalization constant $c_n$ is defined so that

$$\int_{-1}^{1}\delta_n(x)\,dx = 1. \tag{14}$$

To determine $c_n$, we note that $(1-x^2)^n$ is an even function of $x$, and hence we have

$$\frac{1}{c_n} = \int_{-1}^{1}(1-x^2)^n\,dx = 2\int_{0}^{1}(1-x^2)^n\,dx = 2\int_{0}^{\pi/2}(\cos\theta)^{2n+1}\,d\theta \tag{15}$$

where we made the substitution $x = \sin\theta$, $dx = \cos\theta\,d\theta$. Let us denote the integral by $I_n$.
Then we have (using integration by parts with $u = (\cos\theta)^{2n}$ and $dv = \cos\theta\,d\theta$ to go from the first line to the second)

$$\begin{aligned}
I_n &= \int_{0}^{\pi/2}(\cos\theta)^{2n+1}\,d\theta = \int_{0}^{\pi/2}(\cos\theta)^{2n}\cos\theta\,d\theta \\
&= \Bigl[(\cos\theta)^{2n}\sin\theta\Bigr]_{0}^{\pi/2} + 2n\int_{0}^{\pi/2}(\cos\theta)^{2n-1}\sin^2\theta\,d\theta \\
&= 2n\int_{0}^{\pi/2}(\cos\theta)^{2n-1}(1-\cos^2\theta)\,d\theta \\
&= 2n\int_{0}^{\pi/2}(\cos\theta)^{2n-1}\,d\theta - 2n\int_{0}^{\pi/2}(\cos\theta)^{2n+1}\,d\theta \\
&= 2n\int_{0}^{\pi/2}(\cos\theta)^{2n-1}\,d\theta - 2nI_n
\end{aligned}$$

and therefore

$$I_n = \frac{2n}{2n+1}\int_{0}^{\pi/2}(\cos\theta)^{2n-1}\,d\theta = \frac{2n}{2n+1}\,I_{n-1}.$$

Iterating this we have

$$\begin{aligned}
I_n &= \frac{2n}{2n+1}\,I_{n-1} = \frac{2n}{2n+1}\,\frac{2(n-1)}{2(n-1)+1}\,I_{n-2} \\
&= \frac{2n}{2n+1}\,\frac{2(n-1)}{2(n-1)+1}\,\frac{2(n-2)}{2(n-2)+1}\,I_{n-3} = \cdots \\
&= \frac{(2n)\,2(n-1)\cdots 2(n-(n-1))}{(2n+1)(2(n-1)+1)\cdots(2(n-(n-1))+1)}\,I_0 \\
&= \frac{2^n\,n(n-1)\cdots 2\cdot 1}{(2n+1)(2n-1)\cdots 3}\,I_0 = \frac{2^n\,n!}{(2n+1)!!}\,I_0.
\end{aligned}$$

But $I_0 = \int_{0}^{\pi/2}\cos\theta\,d\theta = 1$ so that (from (15))

$$\frac{1}{c_n} = \frac{2^{n+1}\,n!}{(2n+1)!!}.$$

Now note that

$$\begin{aligned}
(2n+1)!! &= 1\cdot 3\cdot 5\cdot 7\cdot 9\cdots(2n-1)\cdot(2n+1) \\
&= \frac{1\,(2\cdot 1)\,3\,(2\cdot 2)\,5\,(2\cdot 3)\,7\,(2\cdot 4)\,9\cdots(2(n-1))(2n-1)(2n)(2n+1)}{(2\cdot 1)(2\cdot 2)(2\cdot 3)(2\cdot 4)\cdots 2(n-1)\cdot 2n} \\
&= \frac{(2n+1)!}{2^n\,n!}
\end{aligned}$$

and hence we have

$$c_n = \frac{(2n+1)!}{2^{2n+1}(n!)^2}.$$

See Figure 5 below.

[Figure 5: The function $\delta_n(x)$ for $n = 5$, $20$ and $100$.]

We now want to give a somewhat less than rigorous proof to show that

$$\lim_{n\to\infty}\int_{-1}^{1} f(x)\,\delta_n(x)\,dx = f(0) \tag{16}$$

and therefore

$$\lim_{n\to\infty}\delta_n(x) = \delta(x).$$

To do this, we need to know the behavior of $c_n$ as $n \to \infty$. Referring to equation (15), we note that the integrand is $\ge 0$ for $0 \le x \le 1$, and since $1/\sqrt{n} \le 1$ we can write

$$\frac{1}{c_n} = 2\int_{0}^{1}(1-x^2)^n\,dx \ge 2\int_{0}^{1/\sqrt{n}}(1-x^2)^n\,dx. \tag{17}$$

Now, it is also true that $(1-x^2)^n \ge 1 - nx^2$ for $x \in [0,1]$. To see this, define the function $g(x)$ by

$$g(x) = (1-x^2)^n - (1-nx^2).$$

Clearly $g(0) = 0$, and furthermore we note that

$$g'(x) = 2nx\bigl[1 - (1-x^2)^{n-1}\bigr] \ge 0$$

for $x \in (0,1]$, with strict inequality when $n \ge 2$. Thus $g(x)$ is monotonically nondecreasing on $[0,1]$ so that $g(x) \ge 0$. In other words, $(1-x^2)^n \ge (1-nx^2)$ for all $x \in [0,1]$ as claimed.
Using this result in (17) then yields

$$\frac{1}{c_n} \ge 2\int_{0}^{1/\sqrt{n}}(1-nx^2)\,dx = \frac{4}{3\sqrt{n}} > \frac{1}{\sqrt{n}}$$

and hence we see that

$$c_n < \sqrt{n}. \tag{18}$$

Before proceeding, we need a simple result from analysis.

Lemma 1. The nth power of any positive number less than 1 will decrease more rapidly with $n$ than any power of $n$ will increase. In other words, $\lim_{n\to\infty} x^n n^N = 0$ for $0 < x < 1$ and any $N \in \mathbb{Z}^+$.

Proof. For any $x \in (0,1)$ and $N \in \mathbb{Z}^+$ define the function $f(n) = x^n n^N = e^{n\ln x}\,n^N$, where we treat $n$ as a continuous variable. If $f$ has a maximum, it occurs when

$$\frac{df}{dn} = (\ln x)\,e^{n\ln x}\,n^N + e^{n\ln x}\,N n^{N-1} = 0$$

or $n = -N/\ln x$. This shows that $f$ indeed has a maximum. To show $\lim_{n\to\infty} f(n) = 0$ we write

$$f(n) = \frac{n^N}{e^{-n\ln x}}$$

and use l'Hôpital's rule $N$ times. (Note $\ln x < 0$ so both the numerator and denominator blow up.) We have

$$\begin{aligned}
\lim_{n\to\infty} f(n) &= \lim_{n\to\infty}\frac{(d^N/dn^N)\,n^N}{(d^N/dn^N)\,e^{-n\ln x}} = \lim_{n\to\infty}\frac{N!}{(-\ln x)^N\,e^{-n\ln x}} \\
&= \lim_{n\to\infty}\frac{N!}{(-\ln x)^N}\,x^n = \frac{N!}{(-\ln x)^N}\lim_{n\to\infty} x^n = 0
\end{aligned}$$

since the coefficient in front of $x^n$ is just some number and $0 < x < 1$.

Now we can show that as $n \to \infty$ the main contribution to the integral (14) comes from a neighborhood of the origin. To see this, let $0 < \varepsilon < 1$, and since $\delta_n(x)$ is an even function of $x$, we see that letting $x \to -x$ yields

$$\int_{-1}^{-\varepsilon}\delta_n(x)\,dx = \int_{\varepsilon}^{1}\delta_n(x)\,dx. \tag{19}$$

Noting that $\delta_n(x) = c_n(1-x^2)^n$ where $c_n < \sqrt{n}$, and that $(1-x^2)^n$ takes its maximum value for $x \in [\varepsilon,1]$ at $x = \varepsilon$, we see that the integral is bounded by

$$\int_{\varepsilon}^{1}\delta_n(x)\,dx < \sqrt{n}\,(1-\varepsilon^2)^n(1-\varepsilon) < \sqrt{n}\,(1-\varepsilon^2)^n.$$

Because $0 < (1-\varepsilon^2) < 1$ and $\sqrt{n} \le n$, we can apply our lemma (with $N = 1$) to conclude that

$$\lim_{n\to\infty}\int_{\varepsilon}^{1}\delta_n(x)\,dx = 0 \tag{20}$$

and hence both sides of (19) vanish. This shows that the only contribution to (14) comes from a neighborhood of $x = 0$. Finally, since $0 \le \delta_n(x) < \sqrt{n}\,(1-x^2)^n$ for $0 < x \le 1$, the same lemma shows that

$$\lim_{n\to\infty}\delta_n(x) = 0 \quad\text{for } 0 < x \le 1,$$

and by evenness the same holds for $-1 \le x < 0$. Since $\delta_n(x)$ is normalized to 1 (equation (14)), it follows that

$$\lim_{n\to\infty}\int_{-1}^{1} f(x)\,\delta_n(x)\,dx = f(0) \tag{21}$$

as claimed.
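The properties of $\delta_n(x)$ derived in this example can be checked numerically. The Python sketch below evaluates the closed form $c_n = (2n+1)!/[2^{2n+1}(n!)^2]$ through `math.lgamma` (to avoid overflowing factorials), compares it with the bound (18), and estimates the integrals in (14) and (21) with a midpoint rule. The function names, the sample count and the choice $f(x) = \cos x$ are assumptions made for illustration only.

```python
import math


def c_n(n):
    """c_n = (2n+1)! / (2^(2n+1) (n!)^2), computed via log-gamma to avoid overflow."""
    log_c = math.lgamma(2 * n + 2) - (2 * n + 1) * math.log(2.0) - 2.0 * math.lgamma(n + 1)
    return math.exp(log_c)


def smeared_value(f, n, samples=20000):
    """Midpoint-rule estimate of int_{-1}^{1} f(x) delta_n(x) dx with delta_n from eq. (13)."""
    c = c_n(n)
    h = 2.0 / samples
    total = 0.0
    for j in range(samples):
        x = -1.0 + (j + 0.5) * h
        total += f(x) * c * (1.0 - x * x) ** n
    return total * h


if __name__ == "__main__":
    print("n, c_n, sqrt(n), normalization, integral of cos(x) * delta_n(x)")
    for n in (1, 5, 20, 100, 400):
        norm = smeared_value(lambda x: 1.0, n)   # should stay close to 1, eq. (14)
        value = smeared_value(math.cos, n)       # should approach cos(0) = 1, eq. (21)
        print(n, c_n(n), math.sqrt(n), norm, value)
```

In each printed row $c_n$ should sit below $\sqrt{n}$, the normalization should stay at 1 to within the quadrature error, and the last column should creep toward $f(0) = 1$ as $n$ grows.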
3 Weierstrass's Theorem

We now turn to the statement and proof of Weierstrass's theorem.

Theorem 1 (Weierstrass's Theorem). Let $f(x)$ be continuous on the closed interval $[a,b]$. Then there exists a sequence of polynomials $P_n(x)$ that converges uniformly to $f(x)$ on $[a,b]$. That is,

$$\lim_{n\to\infty} P_n(x) = f(x)$$

uniformly for $x \in [a,b]$.

It is worth emphasizing that this theorem proves there exists a sequence of polynomials that converges uniformly to $f(x)$, not that there is a uniformly convergent power series.

Proof. We first show that there is no loss of generality in assuming that $f(x)$ is defined on $[0,1]$. For if $f(x)$ is defined on $[a,b]$, then we consider the function $g$ defined by

$$g\!\left(\frac{x-a}{b-a}\right) := f(x).$$

Note that $f(a) = g(0)$ and $f(b) = g(1)$, and also that any $x \in [a,b]$ will correspond to some $y \in [0,1]$. And since any polynomial in $y = (x-a)/(b-a)$ is also a polynomial in $x$, we can go from a polynomial approximating $g$ to a polynomial approximating $f$.

Furthermore, we can also assume that $g(0) = g(1) = 0$, for if not, then define the function $h(y)$ for $y \in [0,1]$ by

$$h(y) = g(y) - g(0) - y[g(1) - g(0)],$$

which satisfies $h(0) = h(1) = 0$. Again, since $h(y)$ and $g(y)$ only differ by a polynomial, if we can approximate $h(y)$ by a polynomial, then we can approximate $g(y)$ by the same polynomial plus the polynomial $g(0) + y[g(1) - g(0)]$.

In summary, we may assume that our original function $f(x)$ is defined on $[0,1]$ and is such that $f(0) = f(1) = 0$. There is no restriction on $f(x)$ outside $[0,1]$, so we take it to be identically zero outside this interval.

Now define the polynomials $P_n(x)$ for $x \in [0,1]$ by

$$P_n(x) = \int_{-1}^{1} f(x+t)\,\delta_n(t)\,dt, \qquad x \in [0,1] \tag{22}$$

where $\delta_n(t)$ is the function defined by equation (13). Our goal now is to prove that

$$\lim_{n\to\infty} P_n(x) = f(x)$$

and this will then also prove that the sequence of functions defined by (13) does indeed represent a δ-function. This will also provide a rigorous proof of (21).

Recall we are assuming that $f(0) = f(1) = 0$ and $f(x) = 0$ for $x$ outside the interval $[0,1]$.
Then $f(x+t) = 0$ for $x + t \le 0$ or $t \le -x$, and $f(x+t) = 0$ for $x + t \ge 1$ or $t \ge 1 - x$. This means we can write equation (22) as

$$P_n(x) = \int_{-x}^{1-x} f(x+t)\,\delta_n(t)\,dt.$$

Change variables by letting $t' = t + x$. Then $dt = dt'$, the limit $t = -x$ becomes $t' = 0$, the limit $t = 1 - x$ becomes $t' = 1$, and we have (dropping the prime on $t$)

$$P_n(x) = \int_{0}^{1} f(t)\,\delta_n(t-x)\,dt = \int_{0}^{1} f(t)\,c_n\bigl[1-(t-x)^2\bigr]^n\,dt$$

where we used definition (13) for $\delta_n(t-x)$. We see from this equation that $P_n(x)$ is a polynomial in $x$ (of degree $2n$) with coefficients that are definite integrals over $t$. Thus $\{P_n(x)\}$ is a sequence of polynomials, and we must show that this sequence in fact converges uniformly to $f(x)$.

Let us recall from analysis that a continuous function defined on a compact set is in fact uniformly continuous. In particular, $f(x)$ is a continuous function defined on the closed (and hence compact) interval $[0,1]$, and because $f(0) = f(1) = 0$ its extension by zero is uniformly continuous on the whole real line. Uniform continuity means that given $\varepsilon > 0$, there exists $\delta > 0$ such that $|y - x| < \delta$ implies $|f(y) - f(x)| < \varepsilon$. Letting $y = x + t$ with $|t| < \delta$ we write this as

$$|f(x+t) - f(x)| < \varepsilon \quad\text{for } |t| < \delta \text{ and all } x \in [0,1]. \tag{23}$$

Because of (14), we can write

$$f(x) = \int_{-1}^{1} f(x)\,\delta_n(t)\,dt.$$

Using (22) for $P_n(x)$, the fact that $\left|\int f\right| \le \int |f|$, and the fact that $\delta_n(t) \ge 0$ for all $t \in [-1,1]$ we then have

$$|P_n(x) - f(x)| = \left|\int_{-1}^{1}\bigl[f(x+t) - f(x)\bigr]\delta_n(t)\,dt\right| \le \int_{-1}^{1} |f(x+t) - f(x)|\,\delta_n(t)\,dt.$$

We now break up the range of integration as follows:

$$\int_{-1}^{1} = \int_{-1}^{-\delta} + \int_{-\delta}^{\delta} + \int_{\delta}^{1}$$

and proceed to estimate each of these integrals. We will pick a specific $\delta$ below.

From analysis again we have the fact that a continuous function defined on a compact set takes its maximum (and minimum) values on the set. Let us write $\max_{x\in[0,1]}|f(x)| = M$ (and remember that $f(x) = 0$ for $x$ outside the interval $[0,1]$). Using the triangle inequality and the bound that led to (20) we then have

$$\begin{aligned}
\int_{\delta}^{1} |f(x+t) - f(x)|\,\delta_n(t)\,dt &\le \int_{\delta}^{1} |f(x+t)|\,\delta_n(t)\,dt + \int_{\delta}^{1} |f(x)|\,\delta_n(t)\,dt \\
&\le 2M\int_{\delta}^{1}\delta_n(t)\,dt < 2M\,n^{1/2}(1-\delta^2)^n.
\end{aligned}$$
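By (19) this result also applies to $\int_{-1}^{-\delta}$.

Now for $\int_{-\delta}^{\delta}$. Referring to condition (23) for uniform continuity we choose $\delta$ (with $0 < \delta < 1$) so that

$$|f(x+t) - f(x)| < \varepsilon/2 \quad\text{for } |t| < \delta.$$

Because $\delta_n(t) \ge 0$ for $t \in [-1,1]$, we see that $\int_{-\delta}^{\delta}\delta_n(t)\,dt \le \int_{-1}^{1}\delta_n(t)\,dt = 1$. Then

$$\int_{-\delta}^{\delta} |f(x+t) - f(x)|\,\delta_n(t)\,dt < \frac{\varepsilon}{2}\int_{-\delta}^{\delta}\delta_n(t)\,dt \le \frac{\varepsilon}{2}.$$

Combining these results we have

$$|P_n(x) - f(x)| \le 4M\,n^{1/2}(1-\delta^2)^n + \varepsilon/2.$$

Since $0 < \delta < 1$ and $n^{1/2} \le n$, our lemma shows that we can choose $n$ sufficiently large that $n^{1/2}(1-\delta^2)^n < \varepsilon/8M$. Thus for any $\varepsilon > 0$ there exists $N$, which depends on $\varepsilon$ but not on $x$, such that $n > N$ implies

$$|P_n(x) - f(x)| < \varepsilon$$

for all $x \in [0,1]$. In other words, we have shown that

$$\lim_{n\to\infty} |P_n(x) - f(x)| = 0$$

for all $x \in [0,1]$, with a rate that does not depend on $x$, and hence the sequence $P_n(x)$ converges uniformly to $f(x)$.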
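The construction in the proof can be tried out numerically. The Python sketch below evaluates $P_n(x) = \int_0^1 f(t)\,c_n[1-(t-x)^2]^n\,dt$ with a midpoint rule and reports the largest error on a grid of $x$ values. The tent-shaped test function, the function names, the grid and the sample counts are all assumptions chosen for illustration; the integral is approximated by quadrature rather than expanded into explicit polynomial coefficients.

```python
import math


def c_n(n):
    """c_n = (2n+1)! / (2^(2n+1) (n!)^2), computed via log-gamma to avoid overflow."""
    log_c = math.lgamma(2 * n + 2) - (2 * n + 1) * math.log(2.0) - 2.0 * math.lgamma(n + 1)
    return math.exp(log_c)


def tent(t):
    """Hypothetical test function: continuous on [0, 1], zero at both ends and outside."""
    if t < 0.0 or t > 1.0:
        return 0.0
    return 0.5 - abs(t - 0.5)


def p_n(f, x, n, samples=4000):
    """Midpoint-rule estimate of P_n(x) = int_0^1 f(t) c_n [1 - (t - x)^2]^n dt."""
    c = c_n(n)
    h = 1.0 / samples
    total = 0.0
    for j in range(samples):
        t = (j + 0.5) * h
        total += f(t) * (1.0 - (t - x) ** 2) ** n
    return c * total * h


def sup_error(f, n, grid=20):
    """Largest |P_n(x) - f(x)| over an evenly spaced grid of x values in [0, 1]."""
    return max(abs(p_n(f, k / grid, n) - f(k / grid)) for k in range(grid + 1))


if __name__ == "__main__":
    for n in (10, 50, 400):
        print("n =", n, " max grid error =", sup_error(tent, n))
```

For this test function the grid error should shrink as $n$ increases, although only slowly (roughly like $1/\sqrt{n}$ near the kink at $x = 1/2$). That is consistent with the theorem, which guarantees uniform convergence but says nothing about its rate.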