Chapter 4  Probability and Random Variables

Def:
Sample space: the set of all possible outcomes, $S$.
Event: a collection of outcomes; a subset of $S$.
Probability: a "measure" assigned to the events of a sample space, with the following properties:
1. $P(A) \ge 0$ for every event $A$ in $S$
2. $P(S) = 1$
3. If $A$ and $B$ are mutually exclusive, $P(A \cup B) = P(A) + P(B)$

Ex. (relative frequency) $P(A) \approx \dfrac{N_A}{N}$; more precisely, $P(A) = \lim_{N \to \infty} \dfrac{N_A}{N}$.

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$
$P(B \mid A) = \dfrac{P(A \mid B)\,P(B)}{P(A)}$  (Bayes' rule)

$A$ and $B$ are statistically independent iff $P(A \cap B) = P(A)\,P(B)$.

Ex. (binary channel) Find $P(1s \mid 1r)$, where $1s$/$0s$ mean "1/0 sent" and $1r$ means "1 received".
Sol:
$P(1s \mid 1r) = \dfrac{P(1r \mid 1s)\,P(1s)}{P(1r)} = \dfrac{P(1r \mid 1s)\,P(1s)}{P(0s)\,P(1r \mid 0s) + P(1s)\,P(1r \mid 1s)}$

Compound event → joint probability

Ex. Roll two dice; let $A_i$ be the outcome of the first die, $B_j$ that of the second, with joint probability $P(A_i, B_j)$:
$\sum_i \sum_j P(A_i, B_j) = 1$
$P(B_j) = \sum_i P(A_i, B_j)$, $P(A_i) = \sum_j P(A_i, B_j)$  (marginal probabilities)
$P(B_m \mid A_n) = \dfrac{P(A_n, B_m)}{P(A_n)} = \dfrac{P(A_n, B_m)}{\sum_j P(A_n, B_j)}$

§ 4.2 Random Variables
A rule which assigns a numerical value to each possible outcome of a chance experiment.
Ex. Discrete: flip a coin; outcomes $S_1 = H$, $S_2 = T$, with $X(S_1) = 1$, $X(S_2) = -1$.
Continuous: spin a pointer; $0 \le X < 360$.

4.2.2 Cumulative Distribution Function (CDF)
$F_X(x) \triangleq \mathrm{Prob}\{X \le x\}$
Properties:
1. $0 \le F_X(x) \le 1$, with $F_X(\infty) = 1$, $F_X(-\infty) = 0$.
2. $F_X(x)$ is continuous from the right, i.e. $\lim_{x \to x_0^+} F_X(x) = F_X(x_0)$.
3. $F_X(x)$ is a nondecreasing function of $x$.

4.2.3 Probability density function (pdf)
$f_X(x) \triangleq \dfrac{dF_X(x)}{dx}$, equivalently $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
Properties: $f_X(x) \ge 0$, $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
$P(x_1 < X \le x_2) = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x)\,dx$

4.2.4 Joint cdf's and pdf's
$F_{XY}(x, y) \triangleq \mathrm{Prob}\{X \le x,\, Y \le y\}$  (joint CDF)
$f_{XY}(x, y) \triangleq \dfrac{\partial^2 F_{XY}(x, y)}{\partial x\,\partial y}$  (joint pdf)
$F_X(x) = F_{XY}(x, \infty) = \int_{-\infty}^{x}\!\int_{-\infty}^{\infty} f_{XY}(x', y)\,dy\,dx'$
$F_Y(y) = F_{XY}(\infty, y) = \int_{-\infty}^{y}\!\int_{-\infty}^{\infty} f_{XY}(x, y')\,dx\,dy'$
$P(x_1 < X \le x_2,\; y_1 < Y \le y_2) = \int_{y_1}^{y_2}\!\int_{x_1}^{x_2} f_{XY}(x, y)\,dx\,dy$
$f_X(x) = \dfrac{dF_X(x)}{dx} = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy$  (marginal pdf)
$f_Y(y) = \dfrac{dF_Y(y)}{dy} = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx$

$X$, $Y$ independent $\iff P(X \le x,\, Y \le y) = P(X \le x)\,P(Y \le y)$
$\iff F_{XY}(x, y) = F_X(x)\,F_Y(y)$
$\iff f_{XY}(x, y) = f_X(x)\,f_Y(y)$
$\iff f_{X|Y}(x \mid y) = f_X(x)$, or equivalently $f_{Y|X}(y \mid x) = f_Y(y)$

4.2.5 Transformation of Random Variables
1-D: given $f_X(x)$, let $Y \triangleq g(X)$; find $f_Y(y)$.
$f_X(x)\,dx = \mathrm{Prob}\{x < X \le x + dx\}$, $f_Y(y)\,dy = \mathrm{Prob}\{y < Y \le y + dy\}$
If $g$ is one-to-one, $f_X(x)\,dx = f_Y(y)\,dy$, so
$f_Y(y) = f_X(x)\left|\dfrac{dx}{dy}\right|\bigg|_{x = g^{-1}(y)}$
If $g$ is not one-to-one, $f_Y(y)\,dy = \sum_i f_X(x_i)\,dx_i$, so
$f_Y(y) = \sum_i f_X(x_i)\left|\dfrac{dx_i}{dy}\right|\bigg|_{x_i = g_i^{-1}(y)}$

2-D: let $U = g_1(X, Y)$, $V = g_2(X, Y)$.
$f_{XY}(x, y)\,dx\,dy = f_{UV}(u, v)\,du\,dv$, i.e. $f_{XY}(x, y)\,dA_{xy} = f_{UV}(u, v)\,dA_{uv}$
$f_{UV}(u, v) = f_{XY}(x, y)\,\dfrac{dA_{xy}}{dA_{uv}} = f_{XY}(x, y)\left|\dfrac{\partial(x, y)}{\partial(u, v)}\right|\bigg|_{x = g_1^{-1}(u, v),\; y = g_2^{-1}(u, v)}$
Jacobian: $\dfrac{\partial(x, y)}{\partial(u, v)} \triangleq \det\begin{bmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{bmatrix}$

Ex 4.15 Throw a dart at a target centered at the origin, with
$f_{XY}(x, y) = f_X(x)\,f_Y(y) = \dfrac{1}{2\pi\sigma^2}\exp\!\left(-\dfrac{x^2 + y^2}{2\sigma^2}\right)$  (joint pdf)
Let $R \triangleq \sqrt{X^2 + Y^2}$, $\Theta \triangleq \tan^{-1}(Y/X)$, with $0 \le R < \infty$, $0 \le \Theta < 2\pi$. Find $f_{R\Theta}(r, \theta)$.
Sol: $X = R\cos\Theta$, $Y = R\sin\Theta$
$J = \det\begin{bmatrix} \partial x/\partial r & \partial x/\partial\theta \\ \partial y/\partial r & \partial y/\partial\theta \end{bmatrix} = \det\begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix} = r$
$f_{R\Theta}(r, \theta) = f_{XY}(x, y)\,|J|\,\Big|_{x = r\cos\theta,\; y = r\sin\theta} = \dfrac{r}{2\pi\sigma^2}\exp\!\left(-\dfrac{r^2}{2\sigma^2}\right)$, $r \ge 0$, $0 \le \theta < 2\pi$
$f_R(r) = \int_0^{2\pi} f_{R\Theta}(r, \theta)\,d\theta = \dfrac{r}{\sigma^2}\exp\!\left(-\dfrac{r^2}{2\sigma^2}\right)$  (Rayleigh distribution)
$f_\Theta(\theta) = \dfrac{1}{2\pi}$, $0 \le \theta < 2\pi$  (uniform distribution)
Since $f_{R\Theta}(r, \theta) = f_R(r)\,f_\Theta(\theta)$, $R$ and $\Theta$ are independent.
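The change-of-variables result of Ex 4.15 is easy to sanity-check by simulation. The sketch below is an illustration added to these notes (Python with NumPy/SciPy; $\sigma$ and the sample size are arbitrary choices): it draws i.i.d. zero-mean Gaussian pairs $(X, Y)$ and verifies that $R$ follows the derived Rayleigh CDF $1 - e^{-r^2/2\sigma^2}$ while $\Theta$ is uniform on $[0, 2\pi)$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sigma, n = 2.0, 200_000

# (X, Y) i.i.d. zero-mean Gaussian with variance sigma^2, as in Ex 4.15
x = rng.normal(0.0, sigma, n)
y = rng.normal(0.0, sigma, n)

r = np.hypot(x, y)                       # R = sqrt(X^2 + Y^2)
theta = np.arctan2(y, x) % (2 * np.pi)   # Theta folded into [0, 2*pi)

# Empirical CDF of R vs. the Rayleigh CDF 1 - exp(-r^2 / (2 sigma^2))
grid = np.linspace(0.0, 4.0 * sigma, 9)
empirical = np.searchsorted(np.sort(r), grid) / n
analytic = 1.0 - np.exp(-grid**2 / (2.0 * sigma**2))
print(np.max(np.abs(empirical - analytic)))   # O(1/sqrt(n)), i.e. small

# Theta should be uniform on [0, 2*pi): the KS test should not reject it
print(stats.kstest(theta, stats.uniform(loc=0.0, scale=2 * np.pi).cdf).pvalue)
```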
§ 4.3 Statistical Averages
Expectation or mean:
$M_X = E\{X\} = \sum_i x_i p_i$  (discrete)
$M_X = E\{X\} = \int_{-\infty}^{\infty} x f_X(x)\,dx$  (continuous, since $f_X(x_i)\,dx = \mathrm{Prob}\{x_i < X \le x_i + dx\}$)
For $Y = g(X)$: $E\{Y\} = E\{g(X)\} = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx = \int_{-\infty}^{\infty} y f_Y(y)\,dy$
$E\{g(X, Y)\} = \int\!\!\int g(x, y) f_{XY}(x, y)\,dx\,dy$
$E\{X + Y\} = E(X) + E(Y)$
If $X$ and $Y$ are statistically independent, $E\{h(X)\,g(Y)\} = E\{h(X)\}\,E\{g(Y)\}$.

Variance:
$\sigma_X^2 \triangleq E\{[X - E(X)]^2\} = E\{X^2 - 2E(X)X + [E(X)]^2\} = E(X^2) - [E(X)]^2$
$E\left\{\sum_{i=1}^{N} a_i X_i\right\} = \sum_{i=1}^{N} a_i E(X_i)$

$\mathrm{Var}(a_1 X_1 + a_2 X_2) = E\{[a_1 X_1 + a_2 X_2 - E(a_1 X_1 + a_2 X_2)]^2\}$
$= E\{[a_1(X_1 - EX_1) + a_2(X_2 - EX_2)]^2\}$
$= E\{a_1^2(X_1 - EX_1)^2 + a_2^2(X_2 - EX_2)^2 + 2a_1 a_2 (X_1 - EX_1)(X_2 - EX_2)\}$
$= a_1^2\sigma_{X_1}^2 + a_2^2\sigma_{X_2}^2 + 2a_1 a_2\,E\{(X_1 - EX_1)(X_2 - EX_2)\}$
$= a_1^2\sigma_{X_1}^2 + a_2^2\sigma_{X_2}^2 + 2a_1 a_2\,\mu_{X_1 X_2}$
where $\mu_{X_1 X_2}$ is the covariance.
If $X_1$ and $X_2$ are independent, $\mathrm{Var}(a_1 X_1 + a_2 X_2) = a_1^2\sigma_{X_1}^2 + a_2^2\sigma_{X_2}^2$.
In general, if $X_1, X_2, \ldots, X_N$ are mutually independent,
$\mathrm{Var}(a_1 X_1 + a_2 X_2 + \cdots + a_N X_N) = a_1^2\sigma_{X_1}^2 + a_2^2\sigma_{X_2}^2 + \cdots + a_N^2\sigma_{X_N}^2$

Correlation coefficient: $\rho_{XY} = \dfrac{\mu_{XY}}{\sigma_X \sigma_Y}$
Prove: $-1 \le \rho_{XY} \le 1$
Pf: it suffices to show $|E\{(X - EX)(Y - EY)\}| \le \sqrt{E(X - EX)^2\,E(Y - EY)^2}$.
We first prove, for any $X$, $Y$: $[E(XY)]^2 \le E(X^2)\,E(Y^2)$.
Proof: $E[(X - \lambda Y)^2] \ge 0$ for all $\lambda$ and all $X$, $Y$. Define
$f(\lambda) = \lambda^2 E(Y^2) - 2\lambda E(XY) + E(X^2) \ge 0$, a quadratic in $\lambda$.
$f(\lambda)$ attains its minimum at $\lambda = \dfrac{E(XY)}{E(Y^2)}$; substituting,
$f(\lambda) = E(X^2) - \dfrac{[E(XY)]^2}{E(Y^2)} \ge 0$,
which implies $[E(XY)]^2 \le E(X^2)\,E(Y^2)$. Applying this to $X - EX$ and $Y - EY$ gives $|\rho_{XY}| \le 1$.

4.3.8 Characteristic Function:
For a random variable $X$,
$M_X(jv) \triangleq E\{e^{jvX}\} = \int_{-\infty}^{\infty} f_X(x)\,e^{jvx}\,dx$
$f_X(x) = \dfrac{1}{2\pi}\int_{-\infty}^{\infty} M_X(jv)\,e^{-jvx}\,dv$
$\dfrac{\partial M_X(jv)}{\partial v} = \int_{-\infty}^{\infty} jx\,f_X(x)\,e^{jvx}\,dx$
Setting $v = 0$:
$E\{X\} = -j\,\dfrac{\partial M_X(jv)}{\partial v}\bigg|_{v=0}$, and in general $E\{X^n\} = (-j)^n\,\dfrac{\partial^n M_X(jv)}{\partial v^n}\bigg|_{v=0}$

§ 4.4 Some Useful pdf's
4.4.1 Binomial Distribution:
Throw a coin $n$ times and count the number of heads $S_n$, with $P(H) = p$, $P(T) = q = 1 - p$:
$P(S_n = k) = P_n(k) = \binom{n}{k} p^k (1-p)^{n-k} = \dfrac{n!}{k!(n-k)!}\,p^k (1-p)^{n-k}$
$E(S_n) = \sum_{k=0}^{n} k\,P_n(k) = \sum_{k=0}^{n} k\,\dfrac{n!}{k!(n-k)!}\,p^k q^{n-k} = np\sum_{k=1}^{n}\dfrac{(n-1)!}{(k-1)!(n-k)!}\,p^{k-1} q^{n-k}$
Let $m = k - 1$:
$= np\sum_{m=0}^{n-1}\dfrac{(n-1)!}{m!(n-m-1)!}\,p^m q^{n-m-1} = np\,(p+q)^{n-1} = np$
From another point of view, let $S_n = X_1 + X_2 + \cdots + X_n$, where
$X_i = 1$ if the $i$th throw is a head, $X_i = 0$ if it is a tail.
$E(X_i) = p$, $\mathrm{Var}(X_i) = E(X_i^2) - [E(X_i)]^2 = p - p^2 = pq$
$E\{S_n\} = \sum_{i=1}^{n} E(X_i) = np$, $\mathrm{Var}(S_n) = npq$

4.4.3 Poisson Distribution:
$P_T(k) = \dfrac{(\alpha T)^k}{k!}\,e^{-\alpha T}$, $k = 0, 1, 2, \ldots$
is the probability of $k$ calls arriving in $(0, T)$, where $\alpha$ is the call arrival rate.
Mean $= \sum_{k=0}^{\infty} k\,\dfrac{(\alpha T)^k}{k!}\,e^{-\alpha T} = (\alpha T)\,e^{-\alpha T}\sum_{k=1}^{\infty}\dfrac{(\alpha T)^{k-1}}{(k-1)!} = \alpha T$

Let $P_n$ be the probability that a call arrives within a slot $\Delta T = T/n$. From the binomial distribution, with $n P_n = \alpha T$, i.e. $P_n = \alpha T/n$:
$P(S_n = k) = \binom{n}{k} P_n^k (1 - P_n)^{n-k} = \dfrac{n(n-1)\cdots(n-k+1)}{k!}\left(\dfrac{\alpha T}{n}\right)^{\!k}\left(1 - \dfrac{\alpha T}{n}\right)^{\!n}\left(1 - \dfrac{\alpha T}{n}\right)^{\!-k}$
As $n \to \infty$:
$\dfrac{n(n-1)\cdots(n-k+1)}{n^k} \to 1$, $\left(1 - \dfrac{\alpha T}{n}\right)^{\!n} \to e^{-\alpha T}$, $\left(1 - \dfrac{\alpha T}{n}\right)^{\!-k} \to 1$
$\therefore P(S_n = k) \to \dfrac{(\alpha T)^k}{k!}\,e^{-\alpha T}$
Hence the binomial distribution can be approximated by the Poisson distribution when $P_n \to 0$ and $n \to \infty$.

4.4.4 Geometric Distribution:
The first head in a series of coin tosses occurs on the $k$th trial:
$p(k) = q^{k-1}p$, where $p = P(H)$, $q = P(T)$.

4.4.5 Gaussian Distribution (Normal Distribution):
$f_X(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x-\mu)^2/2\sigma^2}$, written $X \sim N(\mu, \sigma^2)$

Central Limit Theorem: let $X_1, X_2, \ldots, X_N$ be mutually independent with means $m_1, m_2, \ldots, m_N$ and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_N^2$, and let $Z = X_1 + X_2 + \cdots + X_N$.
Then, approximately for large $N$, $Z \sim N(m, \sigma^2)$ with
$m = m_1 + m_2 + \cdots + m_N$, $\sigma^2 = \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_N^2$

Joint Gaussian Probability Density Function:
Given $X \sim N(m_X, \sigma_X^2)$, $Y \sim N(m_Y, \sigma_Y^2)$, with correlation coefficient
$\rho = \dfrac{E[(X - m_X)(Y - m_Y)]}{\sigma_X \sigma_Y}$,
$f_{XY}(x, y) = \dfrac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}\exp\!\left\{-\dfrac{1}{2(1-\rho^2)}\left[\dfrac{(x - m_X)^2}{\sigma_X^2} - \dfrac{2\rho(x - m_X)(y - m_Y)}{\sigma_X\sigma_Y} + \dfrac{(y - m_Y)^2}{\sigma_Y^2}\right]\right\}$
Observation: if $\rho_{XY} = 0$ (uncorrelated), then for Gaussian $X$ and Gaussian $Y$, $X$ and $Y$ are independent.

Recall from the binomial distribution section: for $S_n = X_1 + X_2 + \cdots + X_n$, $S_n \sim N(np, npq)$ when $n$ is large (e.g. 4.169).
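Both limiting approximations of the binomial just described can be checked numerically. The following sketch is an added Python illustration (NumPy/SciPy; the parameter values are arbitrary, not from the text): it compares the exact binomial pmf with the Poisson pmf for large $n$ and small $P_n$, and the binomial CDF with the Gaussian $N(np, npq)$ CDF for large $n$.

```python
import numpy as np
from scipy import stats

# Poisson limit: n large, P_n small, with n * P_n = alpha * T held fixed
n, p = 1000, 0.004                      # n * p = 4 plays the role of alpha * T
k = np.arange(15)
binom_pmf = stats.binom.pmf(k, n, p)
poisson_pmf = stats.poisson.pmf(k, n * p)
print(np.max(np.abs(binom_pmf - poisson_pmf)))    # tiny: the pmfs nearly coincide

# Gaussian limit: S_n approximately N(np, npq) for large n
n, p = 1000, 0.4
q = 1.0 - p
k = np.arange(n + 1)
binom_cdf = stats.binom.cdf(k, n, p)
# Continuity correction: evaluate the Gaussian CDF at k + 1/2
gauss_cdf = stats.norm.cdf(k + 0.5, loc=n * p, scale=np.sqrt(n * p * q))
print(np.max(np.abs(binom_cdf - gauss_cdf)))      # small for large n
```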
The sum of two Gaussian random variables (dependent or independent) is still a Gaussian random variable; by induction, the sum of any number of Gaussian random variables is still Gaussian.
Proof for the case where $X_1$ and $X_2$ are independent Gaussian:
$X_1 \sim N(m_1, \sigma_1^2)$, $X_2 \sim N(m_2, \sigma_2^2)$
$M_{X_1}(jv) = \int_{-\infty}^{\infty}\dfrac{1}{\sqrt{2\pi\sigma_1^2}}\,e^{-(x_1 - m_1)^2/2\sigma_1^2}\,e^{jvx_1}\,dx_1 = e^{jm_1 v - \sigma_1^2 v^2/2}$
and similarly $M_{X_2}(jv) = e^{jm_2 v - \sigma_2^2 v^2/2}$.
For $X = X_1 + X_2$:
$M_X(jv) = E\{e^{jvX}\} = E[e^{jv(X_1 + X_2)}] = E[e^{jvX_1}]\,E[e^{jvX_2}] = M_{X_1}(jv)\,M_{X_2}(jv) = e^{j(m_1 + m_2)v - (\sigma_1^2 + \sigma_2^2)v^2/2}$
$\therefore X \sim N(m_1 + m_2,\, \sigma_1^2 + \sigma_2^2)$

4.4.6 Gaussian Q Function:
$X \sim N(m_X, \sigma_X^2) \;\Rightarrow\; \dfrac{X - m_X}{\sigma_X} \sim N(0, 1)$
$P(m_X - a \le X \le m_X + a) = \int_{m_X - a}^{m_X + a}\dfrac{1}{\sqrt{2\pi\sigma_X^2}}\exp\!\left(-\dfrac{(x - m_X)^2}{2\sigma_X^2}\right)dx$
Let $y = (x - m_X)/\sigma_X$ and $u = a/\sigma_X$:
$= \int_{-u}^{u}\dfrac{1}{\sqrt{2\pi}}\,e^{-y^2/2}\,dy = 1 - 2Q(u) = 1 - 2Q\!\left(\dfrac{a}{\sigma_X}\right)$
where the Q function is defined as
$Q(u) \triangleq \int_u^{\infty}\dfrac{1}{\sqrt{2\pi}}\,e^{-y^2/2}\,dy \approx \dfrac{1}{u\sqrt{2\pi}}\,e^{-u^2/2}$ when $u \gg 1$.
In terms of the error function,
$\mathrm{erf}(u) \triangleq \dfrac{2}{\sqrt{\pi}}\int_0^u e^{-y^2}\,dy = 1 - 2Q(\sqrt{2}\,u)$

Markov Inequality: for a random variable $X \ge 0$,
$P(X \ge t) \le \dfrac{E\{X\}}{t}$
Pf: $E\{X\} = \int_0^{\infty} x f_X(x)\,dx \ge \int_t^{\infty} x f_X(x)\,dx \ge \int_t^{\infty} t f_X(x)\,dx = t\,P(X \ge t)$
$\therefore P(X \ge t) \le \dfrac{E\{X\}}{t}$

Chebyshev Inequality: for $X$ with mean $u$ and variance $\sigma^2$,
$P(|X - u| \ge t) \le \dfrac{\sigma^2}{t^2}$
Pf: apply the Markov inequality to $Y = (X - u)^2 \ge 0$ with threshold $t' = t^2$:
$P(|X - u| \ge t) = P\big((X - u)^2 \ge t^2\big) \le \dfrac{E\{(X - u)^2\}}{t^2} = \dfrac{\sigma^2}{t^2}$
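The Chebyshev bound holds for any distribution with finite variance, but for a Gaussian it is very loose. The short sketch below is an added Python illustration (assuming SciPy is available; it implements $Q$ via the identity $Q(u) = \tfrac{1}{2}\,\mathrm{erfc}(u/\sqrt{2})$, which follows from the erf relation above) comparing the exact Gaussian tail $P(|X - u| \ge t\sigma) = 2Q(t)$ with the Chebyshev bound $1/t^2$ and the large-$u$ asymptote of $Q$.

```python
import numpy as np
from scipy.special import erfc

def Q(u):
    """Gaussian tail Q(u) = P(N(0,1) > u), via Q(u) = erfc(u / sqrt(2)) / 2."""
    return 0.5 * erfc(u / np.sqrt(2.0))

for t in (1.0, 2.0, 3.0, 4.0):
    exact = 2.0 * Q(t)                 # P(|X - u| >= t*sigma) for Gaussian X
    chebyshev = 1.0 / t**2             # sigma^2 / (t*sigma)^2: distribution-free bound
    asymptote = np.exp(-t**2 / 2.0) / (t * np.sqrt(2.0 * np.pi))  # Q(t) for t >> 1
    print(f"t = {t}: exact 2Q(t) = {exact:.2e}, "
          f"Chebyshev <= {chebyshev:.2e}, asymptote = {asymptote:.2e}")
```

At $t = 3$, for example, the exact two-sided Gaussian tail is about $2.7 \times 10^{-3}$, while Chebyshev only guarantees $\le 1/9 \approx 0.11$, which shows how conservative the distribution-free bound is.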