1/16 CSE 5522 Class Notes
Expected Value of a Random Variable (Mean) – E(X) = sum of x * P(X=x) over the range of X
E(X^2) = sum of x^2 * P(X=x) over the range of X
In general, E(f(X)) = sum of f(x) * P(X=x) over the range of X
Ex: Roll 2 dice, expected count (sum)? Probability space (Omega, S, P), Omega = {(1,1),(1,2),…,(6,6)}
S = 2 ^ Omega
P = Equiprobable measure: each of the 36 outcomes has probability 1/36
Let X be a r.v. for count:
X((i,j)) = i + j; range of X: {2,3,…,12}
E(X) = sum of x * P(X=x) over range of X (from 2 to 12), where P(X=x) = P({s in Omega | X(s) = x})
P(X=2) = P({(1,1)}) = 1/36
P(X=3) = P({(1,2),(2,1)}) = 2/36
Ultimately, the expected value is E(X) = 7
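The dice computation above can be verified directly by enumerating all 36 outcomes (a minimal sketch, not part of the original notes):

```python
from itertools import product

# Sample space: all 36 ordered pairs of two fair dice.
omega = list(product(range(1, 7), repeat=2))

# Equiprobable measure: each outcome has probability 1/36.
p = 1 / len(omega)

# X((i, j)) = i + j; E(X) = sum over outcomes of X(s) * P(s).
expected = sum((i + j) * p for i, j in omega)
print(expected)  # 7.0
```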
Properties of Expected Values: 1. Linearity of Expectation: E(a*X + b*Y) = a*E(X) + b*E(Y)
2. If X and Y are independent, E(X * Y) = E(X) * E(Y)
Variance: Var(X) = E((X – E(X))^2) –> expected squared distance from the mean
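Both expectation properties and the variance definition can be checked on the two-dice space from the example above (a sketch of my own; a and b are arbitrary constants):

```python
from itertools import product

# Two-dice sample space with the equiprobable measure.
omega = list(product(range(1, 7), repeat=2))
p = 1 / len(omega)

def E(f):
    """Expected value of f over the two-dice sample space."""
    return sum(f(s) * p for s in omega)

die1 = lambda s: s[0]
die2 = lambda s: s[1]
total = lambda s: s[0] + s[1]

# 1. Linearity: E(a*X + b*Y) = a*E(X) + b*E(Y)
a, b = 2, 3
assert abs(E(lambda s: a * die1(s) + b * die2(s)) - (a * E(die1) + b * E(die2))) < 1e-9

# 2. The two dice are independent, so E(X*Y) = E(X)*E(Y)
assert abs(E(lambda s: die1(s) * die2(s)) - E(die1) * E(die2)) < 1e-9

# Variance of the sum: E((X - E(X))^2)
mu = E(total)
var = E(lambda s: (total(s) - mu) ** 2)
print(var)  # 35/6 ≈ 5.833
```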
Discrete Distributions: Uniform, Bernoulli, Binomial, Multinomial, Poisson
Let X be a r.v. The distribution of X is the function, defined over the range of X, that collects the probability of each value: f(x) = P(X=x) for all x in range of X
Cumulative distribution – F(x) = P(X <= x)
Uniform Distribution: For all x in range of X, where |Range(X)| = n, f(x) = 1/n
Bernoulli Distribution: (Flipping a biased coin) – Binary values
“Success” P(X=1) = Θ
“Failure” P(X=0) = 1 – Θ
Seen as P(x | Θ) or just P(x):
P(x | Θ) = Θ^x * (1 – Θ)^(1 – x)
P(X=1 | Θ) = Θ^1 * (1 – Θ)^0 = Θ
Expected Value of a Bernoulli r.v.: E(X) = sum of x * P(x) for all x over range of X = 1*P(X=1) + 0*P(X=0)
= 1*Θ + 0*(1 – Θ) = Θ
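As a sanity check (my own sketch, not from the notes; Θ = 0.3 is an arbitrary illustrative bias), the analytic expectation E(X) = Θ can be compared against the sample mean of simulated flips:

```python
import random

random.seed(0)
theta = 0.3  # illustrative bias parameter (assumption, not from the notes)

# P(x | theta) = theta^x * (1 - theta)^(1 - x) for x in {0, 1}
def bernoulli_pmf(x, theta):
    return theta ** x * (1 - theta) ** (1 - x)

# Analytic expectation: E(X) = 1*P(X=1) + 0*P(X=0) = theta
analytic = 1 * bernoulli_pmf(1, theta) + 0 * bernoulli_pmf(0, theta)
print(analytic)  # 0.3

# Empirical check with 100,000 simulated flips
flips = [1 if random.random() < theta else 0 for _ in range(100_000)]
print(sum(flips) / len(flips))  # close to 0.3
```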
Maximum Likelihood: Easiest way to infer parameters from data
Flip a coin n times – Flips are IID
IID- Independent, identically distributed
Let x_i refer to the outcome of flip i; build the joint probability:
P(x_1, x_2, …, x_n | Θ) = P(x_1 | Θ) * P(x_2 | Θ) … P(x_n | Θ)
Principle of Maximum Likelihood: Choose value of Θ that maximizes the joint probability
Θ_ML = argmax over Θ of P(x_1, …, x_n | Θ) = argmax over Θ of product from i = 1 to n of Θ^(x_i) * (1 – Θ)^(1 – x_i)
Equivalent to maximizing ln(P(x_1, …, x_n | Θ))
Let m be the number of successes – take the derivative of m * ln(Θ) + (n – m) * ln(1 – Θ):
d/dΘ = m/Θ – (n – m)/(1 – Θ); set to 0 and solve for Θ: Θ_ML = m/n
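The closed form Θ_ML = m/n can be checked numerically; this sketch grid-searches the log-likelihood over a hypothetical flip sequence (the data is my own illustration):

```python
import math

flips = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # hypothetical IID coin flips
n, m = len(flips), sum(flips)

def log_likelihood(theta):
    # m * ln(theta) + (n - m) * ln(1 - theta)
    return m * math.log(theta) + (n - m) * math.log(1 - theta)

# Grid search over theta in (0, 1) confirms the closed form Θ_ML = m/n
grid = [i / 1000 for i in range(1, 1000)]
theta_ml = max(grid, key=log_likelihood)
print(theta_ml, m / n)  # 0.7 0.7
```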
Binomial Distribution: Probability of observing m heads out of n tosses, given Θ?
P(m | n, Θ) = [n! / ((n – m)! m!)] * Θ^m * (1 – Θ)^(n – m)
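A small sketch of the binomial pmf, using Python's math.comb for the coefficient n!/((n – m)! m!); the example numbers are mine:

```python
from math import comb

def binomial_pmf(m, n, theta):
    # P(m heads in n tosses) = C(n, m) * theta^m * (1 - theta)^(n - m)
    return comb(n, m) * theta ** m * (1 - theta) ** (n - m)

# Fair coin: probability of exactly 5 heads in 10 tosses
print(binomial_pmf(5, 10, 0.5))  # 0.24609375

# Sanity check: the pmf sums to 1 over m = 0..n
print(sum(binomial_pmf(m, 10, 0.5) for m in range(11)))
```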
Multinomial Distribution: Generalizes Bernoulli to k outcomes: P(X=1), P(X=2), …, P(X=k)
Encode each outcome as a binary (one-hot) vector
Follow a similar process to find the max likelihood – take the ln and the derivative – each parameter ends up being Θ_k = m_k/n, where m_k is the count of outcome k
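A sketch of the one-hot encoding and the count-based ML estimate (the outcome labels and data below are hypothetical, chosen only for illustration):

```python
from collections import Counter

# Hypothetical observations over k = 3 outcomes
data = ["a", "b", "a", "c", "a", "b", "a", "c", "b", "a"]
n = len(data)

# One-hot (binary vector) encoding of each outcome, as in the notes
outcomes = sorted(set(data))
one_hot = {o: [1 if o == q else 0 for q in outcomes] for o in outcomes}
print(one_hot["b"])  # [0, 1, 0]

# ML estimate: Θ_k = m_k / n, where m_k is the count of outcome k
counts = Counter(data)
theta_ml = {o: counts[o] / n for o in outcomes}
print(theta_ml)  # {'a': 0.5, 'b': 0.3, 'c': 0.2}
```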