Basics on Probability

Jingrui He
09/11/2007
Coin Flips

You flip a coin


Head with probability 0.5
You flip 100 coins

How many heads would you expect?
Coin Flips cont.

You flip a coin




Head with probability p
Binary random variable
Bernoulli trial with success probability p
You flip k coins



How many heads would you expect?
Number of heads X: discrete random variable
Binomial distribution with parameters k and p
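The question above has a standard answer: the mean of a binomial distribution with parameters k and p is k * p, so 100 fair coins give 50 heads on average. A minimal simulation sketch follows; the function name `count_heads`, the seed, and the number of trials are illustrative assumptions, not part of the slides.

```python
import random

def count_heads(k, p, rng):
    """Flip k coins, each landing heads with probability p; return the head count."""
    return sum(1 for _ in range(k) if rng.random() < p)

rng = random.Random(0)            # fixed seed so the run is repeatable
k, p, trials = 100, 0.5, 2000     # illustrative values
average = sum(count_heads(k, p, rng) for _ in range(trials)) / trials

# The empirical average should land close to k * p.
print(f"average heads over {trials} runs: {average:.2f} (k * p = {k * p})")
```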
Discrete Random Variables

Random variables (RVs) which may take on
only a countable number of distinct values


E.g. the total number of heads X you get if you
flip 100 coins
X is a RV with arity k if it can take on exactly
one value out of $\{x_1, \ldots, x_k\}$

E.g. the possible values that X can take on are 0,
1, 2,…, 100
Probability of Discrete RV


Probability mass function (pmf): $P(X = x_i)$
Easy facts about pmf

$\sum_i P(X = x_i) = 1$
$P(X = x_i \cap X = x_j) = 0$ if $i \neq j$
$P(X = x_i \cup X = x_j) = P(X = x_i) + P(X = x_j)$ if $i \neq j$
$P(X = x_1 \cup X = x_2 \cup \cdots \cup X = x_k) = 1$
Common Distributions
Uniform $X \sim U[1, \ldots, N]$

X takes values 1, 2, …, N
$P(X = i) = 1/N$

E.g. picking balls of different colors from a box
Binomial $X \sim \mathrm{Bin}(n, p)$

X takes values 0, 1, …, n
$P(X = i) = \binom{n}{i} p^i (1 - p)^{n - i}$

E.g. coin flips
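The two pmfs on this slide can be written directly from their formulas. The sketch below uses only the standard library; the function names and the values n = 100, p = 0.5 are illustrative assumptions.

```python
from math import comb

def uniform_pmf(i, N):
    """P(X = i) for X uniform on 1, 2, ..., N."""
    return 1 / N if 1 <= i <= N else 0.0

def binomial_pmf(i, n, p):
    """P(X = i) for X ~ Bin(n, p): choose which i of the n trials succeed."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

n, p = 100, 0.5
total = sum(binomial_pmf(i, n, p) for i in range(n + 1))

print("sum of binomial pmf over 0..n:", total)        # should be 1 up to rounding
print("P(exactly 50 heads in 100 flips):", binomial_pmf(50, n, p))
```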
Coin Flips of Two Persons

Your friend and you both flip coins



Head with probability 0.5
You flip 50 times; your friend flips 100 times
How many heads will both of you get?
Joint Distribution

Given two discrete RVs X and Y, their joint
distribution is the distribution of X and Y
together


E.g. P(You get 21 heads AND your friend gets 70
heads)

$\sum_x \sum_y P(X = x \cap Y = y) = 1$
E.g.
$\sum_{i=0}^{50} \sum_{j=0}^{100} P(\text{You get } i \text{ heads AND your friend gets } j \text{ heads}) = 1$
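The double sum can be checked numerically for the two-person coin example. This sketch assumes that your flips and your friend's flips are independent, so the joint pmf is the product of two binomial pmfs; independence is an added assumption here, and the names are illustrative.

```python
from math import comb

def binomial_pmf(i, n, p):
    """P(X = i) for X ~ Bin(n, p)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

def joint_pmf(i, j, p=0.5):
    """P(you get i heads AND your friend gets j heads).

    Assumption: the two sets of flips are independent, so the joint
    probability factorizes into a product.
    """
    return binomial_pmf(i, 50, p) * binomial_pmf(j, 100, p)

# Sum the joint pmf over every (i, j) pair: i in 0..50, j in 0..100.
total = sum(joint_pmf(i, j) for i in range(51) for j in range(101))
print("sum over all (i, j):", total)
print("P(21 heads AND 70 heads):", joint_pmf(21, 70))
```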
Conditional Probability

$P(X = x \mid Y = y)$ is the probability of $X = x$,
given the occurrence of $Y = y$


E.g. you get 0 heads, given that your friend gets
61 heads
$P(X = x \mid Y = y) = \dfrac{P(X = x \cap Y = y)}{P(Y = y)}$
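As a small worked case of this ratio, take two fair coin flips with X the number of heads on the first flip and Y the total number of heads. The setup and names below are illustrative assumptions; exact fractions avoid rounding.

```python
from fractions import Fraction

# Joint pmf P(X = x, Y = y) for two fair flips:
# X = heads on the first flip (0 or 1), Y = total heads (0, 1 or 2).
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (1, 2): Fraction(1, 4),
}

def marginal_y(y):
    """P(Y = y), obtained by summing the joint over x."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def conditional(x, y):
    """P(X = x | Y = y); only defined when P(Y = y) > 0."""
    return joint.get((x, y), Fraction(0)) / marginal_y(y)

print("P(X=1 | Y=1) =", conditional(1, 1))
print("P(X=1 | Y=2) =", conditional(1, 2))
```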
Law of Total Probability

Given two discrete RVs X and Y, which take
values in $\{x_1, \ldots, x_m\}$ and $\{y_1, \ldots, y_n\}$, we have
$P(X = x_i) = \sum_j P(X = x_i \cap Y = y_j) = \sum_j P(X = x_i \mid Y = y_j) P(Y = y_j)$
Marginalization
$P(X = x_i) = \sum_j P(X = x_i \cap Y = y_j) = \sum_j P(X = x_i \mid Y = y_j) P(Y = y_j)$
Marginal probability: $P(X = x_i)$ and $P(Y = y_j)$
Joint probability: $P(X = x_i \cap Y = y_j)$
Conditional probability: $P(X = x_i \mid Y = y_j)$
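The two routes to the marginal (summing the joint, or summing conditional times marginal) can be compared on a small table. The joint table below is made up for illustration and is not from the slides.

```python
from fractions import Fraction

# Hypothetical joint table P(X = x, Y = y); the entries sum to 1.
joint = {
    (0, 0): Fraction(1, 12), (0, 1): Fraction(2, 12), (0, 2): Fraction(3, 12),
    (1, 0): Fraction(3, 12), (1, 1): Fraction(2, 12), (1, 2): Fraction(1, 12),
}
ys = sorted({y for _, y in joint})

def p_y(y):
    """Marginal P(Y = y)."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def p_x_given_y(x, y):
    """Conditional P(X = x | Y = y)."""
    return joint[(x, y)] / p_y(y)

def marginal_from_joint(x):
    """P(X = x) as the sum of joint probabilities over y."""
    return sum(joint[(x, y)] for y in ys)

def marginal_from_conditionals(x):
    """P(X = x) as the sum over y of P(X = x | Y = y) * P(Y = y)."""
    return sum(p_x_given_y(x, y) * p_y(y) for y in ys)

print(marginal_from_joint(0), marginal_from_conditionals(0))
```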
Bayes Rule

X and Y are discrete RVs…
$P(X = x \mid Y = y) = \dfrac{P(X = x \cap Y = y)}{P(Y = y)}$

$P(X = x_i \mid Y = y_j) = \dfrac{P(Y = y_j \mid X = x_i) P(X = x_i)}{\sum_k P(Y = y_j \mid X = x_k) P(X = x_k)}$
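A short sketch of Bayes rule as a function: the denominator is the sum over k of likelihood times prior, exactly as in the formula. The coin-picking scenario (a fair coin and a coin biased towards heads, chosen with equal probability) and all names are illustrative assumptions.

```python
from fractions import Fraction

def posterior(prior, likelihood, y):
    """Return P(X = x | Y = y) for every x, via Bayes rule.

    prior[x]         = P(X = x)
    likelihood[x][y] = P(Y = y | X = x)
    """
    evidence = sum(likelihood[x][y] * prior[x] for x in prior)  # P(Y = y)
    return {x: likelihood[x][y] * prior[x] / evidence for x in prior}

# Hypothetical setup: X is which coin was picked, Y is the flip outcome.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood = {
    "fair": {"H": Fraction(1, 2), "T": Fraction(1, 2)},
    "biased": {"H": Fraction(9, 10), "T": Fraction(1, 10)},
}

post = posterior(prior, likelihood, "H")
print(post)
```

Seeing a head shifts belief towards the biased coin, and the posterior values still sum to 1.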
Independent RVs


Intuition: X and Y are independent means that
$X = x$ makes it neither more nor less probable
that $Y = y$
Definition: X and Y are independent iff
$P(X = x \cap Y = y) = P(X = x) P(Y = y)$
More on Independence

$P(X = x \cap Y = y) = P(X = x) P(Y = y)$
Dividing both sides by $P(Y = y)$ or by $P(X = x)$ (when nonzero) gives
$P(X = x \mid Y = y) = P(X = x)$
$P(Y = y \mid X = x) = P(Y = y)$
E.g. no matter how many heads you get, your
friend will not be affected, and vice versa
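The definition can be tested mechanically on a joint table: compute both marginals and compare every joint entry with their product. The two example tables below are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """True iff P(X = x, Y = y) == P(X = x) * P(Y = y) for every pair (x, y)."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    px = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    py = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(joint.get((x, y), 0) == px[x] * py[y] for x, y in product(xs, ys))

# Two separate fair coins: every outcome pair has probability 1/4.
two_coins = {(x, y): Fraction(1, 4) for x, y in product("HT", repeat=2)}

# One coin reported twice: X and Y always agree, so they are dependent.
same_coin = {("H", "H"): Fraction(1, 2), ("T", "T"): Fraction(1, 2)}

print(is_independent(two_coins), is_independent(same_coin))
```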
Conditionally Independent RVs


Intuition: X and Y are conditionally
independent given Z means that once Z is
known, the value of X does not add any
additional information about Y
Definition: X and Y are conditionally
independent given Z iff
$P(X = x \cap Y = y \mid Z = z) = P(X = x \mid Z = z) P(Y = y \mid Z = z)$
More on Conditional Independence
$P(X = x \cap Y = y \mid Z = z) = P(X = x \mid Z = z) P(Y = y \mid Z = z)$
$P(X = x \mid Y = y, Z = z) = P(X = x \mid Z = z)$
$P(Y = y \mid X = x, Z = z) = P(Y = y \mid Z = z)$
Monty Hall Problem




You're given the choice of three doors: Behind one
door is a car; behind the others, goats.
You pick a door, say No. 1
The host, who knows what's behind the doors, opens
another door, say No. 3, which has a goat.
Do you want to pick door No. 2 instead?
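[Figure: the three cases by initial pick. If you picked the car, the host reveals Goat A or Goat B; if you picked Goat A, the host must reveal Goat B; if you picked Goat B, the host must reveal Goat A.]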
Monty Hall Problem: Bayes Rule

$C_i$: the car is behind door i, i = 1, 2, 3
$P(C_i) = 1/3$

$H_{ij}$: the host opens door j after you pick door i

$P(H_{ij} \mid C_k) = \begin{cases} 0 & i = j \\ 0 & j = k \\ 1/2 & i = k \\ 1 & i \neq k,\ j \neq k \end{cases}$
(the first matching case applies)
Monty Hall Problem: Bayes Rule cont.



WLOG, i=1, j=3
$P(C_1 \mid H_{13}) = \dfrac{P(H_{13} \mid C_1) P(C_1)}{P(H_{13})}$
$P(H_{13} \mid C_1) P(C_1) = \dfrac{1}{2} \cdot \dfrac{1}{3} = \dfrac{1}{6}$
Monty Hall Problem: Bayes Rule cont.

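$P(H_{13}) = P(H_{13}, C_1) + P(H_{13}, C_2) + P(H_{13}, C_3)$
$= P(H_{13} \mid C_1) P(C_1) + P(H_{13} \mid C_2) P(C_2)$ (the $C_3$ term is 0, since the host never opens the door hiding the car)
$= \dfrac{1}{6} + 1 \cdot \dfrac{1}{3} = \dfrac{1}{2}$
$P(C_1 \mid H_{13}) = \dfrac{1/6}{1/2} = \dfrac{1}{3}$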
Monty Hall Problem: Bayes Rule cont.



$P(C_1 \mid H_{13}) = \dfrac{1/6}{1/2} = \dfrac{1}{3}$
$P(C_2 \mid H_{13}) = 1 - \dfrac{1}{3} = \dfrac{2}{3} > P(C_1 \mid H_{13})$
You should switch!
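The Bayes rule derivation can be reproduced with exact fractions. The sketch below encodes the host model $P(H_{ij} \mid C_k)$ from the slides and the uniform prior 1/3; the function names are illustrative assumptions.

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def p_host(i, j, k):
    """P(H_ij | C_k): host opens door j after you pick door i, car behind door k."""
    if j == i or j == k:
        return Fraction(0)       # host never opens your door or the car's door
    if i == k:
        return Fraction(1, 2)    # you picked the car: host chooses a goat door at random
    return Fraction(1)           # otherwise the host's door is forced

def posterior_car(i, j):
    """P(C_k | H_ij) for each door k, with prior P(C_k) = 1/3."""
    prior = Fraction(1, 3)
    joint = {k: p_host(i, j, k) * prior for k in DOORS}
    evidence = sum(joint.values())                  # P(H_ij)
    return {k: joint[k] / evidence for k in DOORS}

post = posterior_car(1, 3)   # you pick door 1, host opens door 3
print("stay with door 1:", post[1], "| switch to door 2:", post[2])
```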
Continuous Random Variables



What if X is continuous?
Probability density function (pdf) instead of
probability mass function (pmf)
A pdf is any function $f(x)$ that describes the
probability density in terms of the input
variable x.
PDF

Properties of pdf




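$f(x) \geq 0, \ \forall x$
$\int_{-\infty}^{\infty} f(x)\,dx = 1$
$f(x) \leq 1$ ??? Not necessarily: a density may exceed 1, as long as it integrates to 1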
Actual probability can be obtained by taking
the integral of the pdf

E.g. the probability of X being between 0 and 1 is
$P(0 \leq X \leq 1) = \int_0^1 f(x)\,dx$
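To make the integral concrete, here is a numerical sketch using the midpoint rule. The density f(x) = 2x on [0, 1] is an illustrative assumption (not from the slides), chosen because it exceeds 1 near x = 1 yet still integrates to 1.

```python
def f(x):
    """Illustrative pdf (an assumption): f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(func, a, b, steps=10_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return sum(func(a + (i + 0.5) * h) for i in range(steps)) * h

print("P(0 <= X <= 1)   ~", integrate(f, 0, 1))
print("P(0 <= X <= 0.5) ~", integrate(f, 0, 0.5))
print("density at 0.9   =", f(0.9))   # a density value above 1 is allowed
```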
Cumulative Distribution Function


$F_X(v) = P(X \leq v)$
Discrete RVs
$F_X(v) = \sum_{v_i \leq v} P(X = v_i)$
Continuous RVs
$F_X(v) = \int_{-\infty}^{v} f(x)\,dx$
$\dfrac{d}{dx} F_X(x) = f(x)$
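For a discrete RV the CDF is a running sum of the pmf. A minimal sketch with a fair six-sided die follows; the die and the names are illustrative assumptions.

```python
from fractions import Fraction

# pmf of a fair six-sided die: P(X = v) = 1/6 for v = 1..6.
pmf = {v: Fraction(1, 6) for v in range(1, 7)}

def cdf(v):
    """F_X(v) = P(X <= v): add up the pmf over all values v_i <= v."""
    return sum((p for vi, p in pmf.items() if vi <= v), Fraction(0))

for v in (0, 2.5, 3, 6):
    print(f"F({v}) = {cdf(v)}")
```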
Common Distributions

Normal $X \sim N(\mu, \sigma^2)$
$f(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\dfrac{(x - \mu)^2}{2\sigma^2}\right), \ x \in \mathbb{R}$

E.g. the height of the entire population
[Figure: plot of f(x) against x, for x from -5 to 5]
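The normal density is easy to evaluate directly from its formula. The sketch below defaults to mu = 0 and sigma = 1 (an assumption for illustration) and checks numerically that the density integrates to about 1.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

# Midpoint-rule approximation of the area under the curve on [-8, 8].
step = 0.001
area = sum(normal_pdf(-8 + (i + 0.5) * step) for i in range(16_000)) * step

print("peak value f(mu):", normal_pdf(0.0))
print("area under the curve on [-8, 8]:", area)
```

With sigma = 1 the peak value is 1 / sqrt(2 * pi), a little under 0.4.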
Common Distributions cont.
Beta $X \sim \mathrm{Beta}(\alpha, \beta)$
$f(x; \alpha, \beta) = \dfrac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}, \ x \in [0, 1]$
$\alpha = \beta = 1$: uniform distribution between 0 and 1

E.g. the conjugate prior for the parameter p in
Binomial distribution
[Figure: plot of f(x) against x, for x from 0 to 1]
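A sketch of the Beta density, using the identity B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b) for the normalizing constant. The function names are illustrative, and the code assumes a, b >= 1 so that the endpoints x = 0 and x = 1 are safe to evaluate.

```python
from math import gamma

def beta_function(a, b):
    """B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return gamma(a) * gamma(b) / gamma(a + b)

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x; assumes a >= 1 and b >= 1."""
    if not 0 <= x <= 1:
        return 0.0
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_function(a, b)

# a = b = 1 gives the uniform density on [0, 1].
print([beta_pdf(x, 1, 1) for x in (0.1, 0.5, 0.9)])
# a = b = 2 is symmetric and peaks at x = 0.5.
print(beta_pdf(0.5, 2, 2))
```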
Joint Distribution


Given two continuous RVs X and Y, the joint
pdf can be written as $f_{X,Y}(x, y)$

$\int_x \int_y f_{X,Y}(x, y)\,dx\,dy = 1$
Multivariate Normal

Generalization to higher dimensions of the
one-dimensional normal
$f_X(x_1, \ldots, x_d) = \dfrac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\dfrac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right)$
$\mu$: mean
$\Sigma$: covariance matrix
Moments

Mean (Expectation): $\mu = E(X)$
Discrete RVs: $E(X) = \sum_{v_i} v_i P(X = v_i)$
Continuous RVs: $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$

Variance: $V(X) = E\left((X - \mu)^2\right)$
Discrete RVs: $V(X) = \sum_{v_i} (v_i - \mu)^2 P(X = v_i)$
Continuous RVs: $V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
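For a discrete RV both moments are plain weighted sums over the pmf. A minimal sketch with a fair six-sided die follows; the die and the function names are illustrative assumptions.

```python
from fractions import Fraction

def mean(pmf):
    """E(X) = sum of v_i * P(X = v_i)."""
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    """V(X) = sum of (v_i - mu)^2 * P(X = v_i)."""
    mu = mean(pmf)
    return sum((v - mu) ** 2 * p for v, p in pmf.items())

die = {v: Fraction(1, 6) for v in range(1, 7)}   # uniform on 1..6
print("mean:", mean(die), "variance:", variance(die))
```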
Properties of Moments

Mean

$E(X + Y) = E(X) + E(Y)$
$E(aX) = aE(X)$
If X and Y are independent, $E(XY) = E(X) E(Y)$
Variance

$V(aX + b) = a^2 V(X)$
If X and Y are independent, $V(X + Y) = V(X) + V(Y)$
Moments of Common Distributions
Uniform $X \sim U[1, \ldots, N]$
Mean $(1 + N)/2$; variance $(N^2 - 1)/12$

Binomial $X \sim \mathrm{Bin}(n, p)$
Mean $np$; variance $np(1 - p)$

Normal $X \sim N(\mu, \sigma^2)$
Mean $\mu$; variance $\sigma^2$

Beta $X \sim \mathrm{Beta}(\alpha, \beta)$
Mean $\dfrac{\alpha}{\alpha + \beta}$; variance $\dfrac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$
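The binomial moments can be checked against the definitions by summing over the pmf: the mean should come out as n * p and the variance as n * p * (1 - p). The values n = 20 and p = 0.3 are arbitrary illustrative choices.

```python
from math import comb

def binomial_pmf(i, n, p):
    """P(X = i) for X ~ Bin(n, p)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

n, p = 20, 0.3   # illustrative parameters

# Moments computed from their definitions as weighted sums over the pmf.
mean = sum(i * binomial_pmf(i, n, p) for i in range(n + 1))
var = sum((i - mean) ** 2 * binomial_pmf(i, n, p) for i in range(n + 1))

print("mean from pmf:", mean, "| n * p =", n * p)
print("variance from pmf:", var, "| n * p * (1 - p) =", n * p * (1 - p))
```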
Probability of Events

X denotes an event that could possibly happen


P(X) denotes the likelihood that X happens,
or X=true


E.g. X=“you will fail in this course”
What’s the probability that you will fail in this
course?
$\Omega$ denotes the entire event set

$\Omega = \{X, \bar{X}\}$
The Axioms of Probabilities


$0 \leq P(X) \leq 1$
$P(\Omega) = 1$
$P(X_1 \cup X_2 \cup \cdots) = \sum_i P(X_i)$, where $X_i$ are disjoint events
Useful rules
$P(X_1 \cup X_2) = P(X_1) + P(X_2) - P(X_1 \cap X_2)$
$P(\bar{X}) = 1 - P(X)$
Interpreting the Axioms

[Figure: diagram of events X1 and X2]
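One way to see the axioms and the two useful rules at work is to model events as sets of equally likely outcomes. The fair-die sample space and the two events below are illustrative assumptions.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))   # sample space: faces of a fair die

def prob(event):
    """P(event) when all outcomes in OMEGA are equally likely."""
    return Fraction(len(event & OMEGA), len(OMEGA))

even = {2, 4, 6}   # X1: the roll is even
high = {4, 5, 6}   # X2: the roll is at least 4

# P(X1 or X2) = P(X1) + P(X2) - P(X1 and X2); the overlap {4, 6} is counted once.
union_direct = prob(even | high)
union_by_rule = prob(even) + prob(high) - prob(even & high)
print(union_direct, union_by_rule)

# Complement rule: P(not X1) = 1 - P(X1).
print(prob(OMEGA - even), 1 - prob(even))
```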