Chapter 9 Mathematical Preliminaries Stirling’s Approximation n ~ n! e n n n 2n log n! log k . Let I log xdx ( x log x x) 1 n log n n 1 n k 1 1 Fig. by trapezoid rule 9.2-1 I 1 1 log 1 log 2 log( n 1) log n 2 2 .... 1 n log n n log n 1 log n! n n e n n e n! 2 take antilogs 1 1 Fig. by midpoint formula I log 2 log 3 log( n 1) log n 9.2-2 8 2 7 1 1 n n n log n n 1 log n log n! n e n e 8 n! 8 2 n! n e n n take antilogs 7 8 n C where 2.4 e C e 2.7 9.2 Binomial Bounds Show the volume of a sphere of radius λn in the n-dimensional unit hypercube is: n n n V (n) 1 2nH 2 ( ) 1 2 n Assuming 0 λ ½ (since the terms are reflected about n/2) the terms grow monotonically, and bounding the last by Stirling gives: n n! n ne n n C n n n n n n n C (n n ) e n n C n (n )!(n n )! (n ) e nn 1 en C 1 n n n n n n n (1 )(1 ) n e ne ( n n ) n (1 )n 1 n 1 1 C 2 nH 2 ( ) C for some C ' (1 )1 (1 )n 2 n log (1 )1 log (1 ) log( 1 ) H ( ). 9.3 n n n 1 termwise by a geometric series bound n n 1 1 n n 1 3 2 1 with ratios (the 1 n n 1 n n 2 k n 2 n 1 n 1 1 actual ratios between te rms). So , 1 2 k 0 1 - 1 1 k n n C C 1 nH 2 ( ) nH 2 ( ) nH 2 ( ) 2 2 2 k 2 n k 0 1 1 k 0 n 1 as n n 2 N.b. n n ~ n 1 n nH 2 1 n 2 2 2 2 k 0 k k 0 k 9.3 The Gamma Function Idea: extend the factorial to non-integral arguments. by convention Let (n) e x x n1dx. 0 For n > 1, integrate by parts: dg = e−xdx f = xn−1 ( n) x n 1 x e (n 1) e x x n 2 dx 1 0 0 (n) (n 1)(n 1) (1) e dx e x x 0 1 0 (2) 1 (3) 2! (4) 3! (n) (n 1)! 9.4 xt 2 1 x e x dx 2 0 1 2 2 e dt e dt t 2 t 2 e t 2 0 1 2tdt t (called the error integral) 0 1 1 x y x2 y2 e dx e dy e dxdy 2 2 2 dr rdθ r dx dy area = dxdy 2 2 r 2 e e rdr d 2 2 00 r 2 0 area = rdrdθ 1 2 9.4 N – Dimensional Euclidean Space Use Pythagorean distance to define spheres: x12 xn2 r Consequently, their volume depends proportionally on rn 4 n Vn (r ) Cn r C1 2 C2 C3 3 n n t 2 1 xi2 e dt e dxi i 1 2 n n 2 e x12 x n2 dx dx e r dVn (r ) dr C e r nr n 1dr 1 n n 2 0 2 dr 0 converting to polar coordinates 9.5 r2 t dr n nC t t 2 1 n n Cn e t t dt e t dt 2 0 0 n n n n Cn Cn 1 1 2 2 2 2 n 1 2 n 2 1 2 1 2 2 Cn Cn 2 n n 1 2 just verify by substitution Cn 2 2.00000 π 3.14159 3 4π/3 4.18879 4 π2/2 4.93420 5 8π2/15 5.26379 6 π3/6 5.16771 7 16π3/105 4.72477 8 π4/24 4.05871 … … 2k πk/k! … →0 From table on page 177 of book. 9.5 Interesting Facts about N-dimensional Euclidean Space Cn → 0 as Vn(r) → 0 as n→∞ n→∞ for a fixed r Volume approaches 0 as the dimension increases! Vn (r ) Cn r Cn (r ) 1 1 1 n Vn (r ) Cn r r as n n n n Almost all the volume is near the surface (as n → ∞) end of 9.5 Angle between the vector (1, 1, …, 1) and each coordinate axis: 1 cos n length of projection along axis As n → ∞: cos θ → 0, θ → π/2. length of entire vector For large n, the diagonal line is almost perpendicular to each axis! x y By cos x y definition: What about the angle between random vectors, x and y, of the form (±1, ±1, …, ±1)? n n uniform for uniformly ( 1 )( 1 ) ( 1 ) random products distributed random vectors k 1 k 1 0 n n n Hence, for large n, there are almost 2n random diagonal lines which are almost perpendicular to each other! end of 9.8 Chebyshev’s Inequality Let X be a discrete or continuous random variable with p(xi) = the probability that X = xi. The mean square is x2 × p(x) ≥ 0 x p( x ) x p( x )dx E X 2 i 2 i 2 i x p ( x ) dx 2 2 x p ( x ) dx P | X | 2 x P | X | E X2 2 Chebyshev’s inequality 9.7 Variance The variance of X is the mean square about the mean value of X, EX x p( x)dx a So variance, (via linearity) is: V X E ( X a) 2 E X 2 2a EX a 2 a 2 2 2 2 2 E X a E X E X . Note: V{1} = 0 → V{c} = 0 & V{cX} = c²V{X} Also: V{X − b} = V{X} 9.7 The Law of Large Numbers Let X and Y be independent random variables, with E{X} = a, E{Y} = b, V{X} = σ2, V{Y} = τ2. because of independence Then E{(X − a) ∙ (Y − b)} = E{X − a} ∙ E{Y − b} = 0 ∙ 0 = 0 And V{X + Y} = V{X − a + Y − b} = E{(X − a + Y − b)2} = E{(X − a)2} + 2E{(X − a)(Y − b)} + E{(Y − b)2} = V{X} + V{Y} = σ2 + τ2 Consider n independent trials for X; called X1, …, Xn. The expectation of their average is (as expected!): X 1 X n EX 1 EX n E EX a n n 9.8 The variance of their average is (using independence): X 1 X n V X 1 V X n n V 2 2 n n n n 2 2 So, what is the probability that their average A is not close to the mean E{X} = a? Use Chebyshev’s inequality: E A a V A P A a 2 Let n → ∞ 2 2 2 n Weak Law of Large Numbers: The average of a large enough number of independent trials comes arbitrarily close to the mean with arbitrarily high probability. 2 9.8