# 1/16 CSE 5522 Class Notes

Expected Value of a Random Variable (Mean): E(X) = sum of x * P(X=x) over the range of X

E(X^2) = sum of x^2 * P(X=x) over the range of X

In general, E(f(X)) = sum of f(x) * P(X=x) over the range of X
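A minimal sketch of the general formula, assuming a single fair six-sided die as the example distribution (the die and the helper name `expectation` are illustrative, not from the lecture). Exact fractions avoid rounding noise.

```python
from fractions import Fraction


def expectation(pmf, f=lambda x: x):
    """Return E(f(X)) = sum of f(x) * P(X=x) over the range of X."""
    return sum(f(x) * p for x, p in pmf.items())


# Assumed example: one fair die, P(X=x) = 1/6 for x in 1..6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = expectation(die)                            # E(X)
second_moment = expectation(die, lambda x: x**2)   # E(X^2)
print(mean, second_moment)
```

Passing a different `f` gives the expectation of any function of X without changing the helper.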

Ex: Roll 2 dice. What is the expected count?

Prob. space (Omega, S, P):

Omega = {(1,1), (1,2), …, (6,6)}

S = 2^Omega

P = equiprobable measure: each outcome in Omega has probability 1/36

Let X be a r.v. for count:

X((i,j)) = i + j

Range of X: {2, 3, …, 12}

E(X) = sum of x * P(X=x) over the range of X (from 2 to 12), where P(X=x) = P({s in Omega | X(s) = x})

P(X=2) = P({(1,1)}) = 1/36

P(X=3) = P({(1,2),(2,1)}) = 2/36

Ultimately, summing over all 11 values of x gives the expected value E(X) = 7
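The two-dice example can be checked by brute force. This sketch enumerates Omega, builds P(X=x) by counting the outcomes that map to each sum, and then applies the expected value formula. Variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Omega: all 36 ordered outcomes (i, j) of two dice.
omega = list(product(range(1, 7), repeat=2))
p_outcome = Fraction(1, len(omega))  # equiprobable measure, 1/36 each

# P(X=x) = P({s in Omega | X(s) = x}) with X((i, j)) = i + j.
counts = Counter(i + j for i, j in omega)
pmf = {x: c * p_outcome for x, c in counts.items()}

# E(X) = sum of x * P(X=x) over x = 2..12.
expected_count = sum(x * p for x, p in pmf.items())
print(pmf[2], pmf[3], expected_count)
```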

Properties of Expected Values:

1. Linearity of Expectation: E(a * X + b * Y) = a * E(X) + b * E(Y)

2. If X and Y are independent, E(X * Y) = E(X) * E(Y)
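Both properties can be verified numerically. The sketch below assumes X and Y are two independent fair dice, so every pair (x, y) has joint probability 1/36. The constants a and b are arbitrary choices for illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 36)  # joint probability of each (x, y) for independent fair dice


def expect(g):
    """E(g(X, Y)) under the joint distribution of the two dice."""
    return sum(g(x, y) * p for x, y in product(faces, repeat=2))


a, b = 2, 3  # arbitrary constants
e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)

# Property 1: linearity of expectation.
lhs_linear = expect(lambda x, y: a * x + b * y)
rhs_linear = a * e_x + b * e_y

# Property 2: product rule, which relies on independence.
lhs_product = expect(lambda x, y: x * y)
rhs_product = e_x * e_y

print(lhs_linear == rhs_linear, lhs_product == rhs_product)
```

Linearity holds even without independence; the product rule does not in general.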

Variance: Var(X) = E((X - E(X))^2), the expected value of the squared distance from the mean
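A short sketch of the variance definition, again assuming a fair die as the example. The cross-check uses the standard identity Var(X) = E(X^2) - E(X)^2, which is not stated in these notes but follows from expanding the square.

```python
from fractions import Fraction


def variance(pmf):
    """Var(X) = E((X - E(X))^2) for a discrete pmf given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())


# Assumed example: one fair die.
die = {x: Fraction(1, 6) for x in range(1, 7)}
var_die = variance(die)

# Cross-check with E(X^2) - E(X)^2.
mean = sum(x * p for x, p in die.items())
second_moment = sum(x**2 * p for x, p in die.items())
var_check = second_moment - mean**2
print(var_die, var_check)
```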

Discrete Distributions: Uniform, Bernoulli, Binomial, Multinomial, Poisson. A distribution is a function that assigns a probability to every value of a random variable

Let X be a r.v. The distribution of X is a function defined over the range of X as follows: f(x) = P(X=x) for all x in the range of X

Cumulative distribution: F(x) = P(X<=x)

Uniform Distribution: For all x in the range of X, where |Range(X)| = n, f(x) = 1/n
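To keep the distribution f and the cumulative F apart, here is a small sketch for a uniform r.v. It assumes the range is the six faces of a die. `uniform_pmf` and `cdf` are hypothetical helper names.

```python
from fractions import Fraction


def uniform_pmf(values):
    """f(x) = 1/n for every x in a range of size n."""
    values = list(values)
    n = len(values)
    return {x: Fraction(1, n) for x in values}


def cdf(pmf, x):
    """F(x) = P(X <= x): add up f(v) for every v in the range with v <= x."""
    return sum(p for v, p in pmf.items() if v <= x)


f = uniform_pmf(range(1, 7))
print(f[3], cdf(f, 3), cdf(f, 6))
```

Here f is constant at 1/n, while F grows in steps of 1/n until it reaches 1.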

Bernoulli Distribution: (Flipping a biased coin) – Binary values

“Success” P(X=1) = Θ

“Failure” P(X=0) = 1 – Θ

Seen as P(x| Θ) or P(x)

P(x | Θ) = Θ^x * (1 - Θ)^(1 - x)

P(X=1 | Θ) = Θ^1 * (1 - Θ)^0 = Θ

Expected Value of a Bernoulli R.V.: E(X) = sum of x * P(X=x) over the range of X = 1 * P(X=1) + 0 * P(X=0)

= 1 * Θ + 0 * (1 - Θ) = Θ
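A minimal sketch of the Bernoulli distribution in the compact form Θ^x * (1 - Θ)^(1 - x), with its expected value computed from the definition. The value Θ = 3/10 is an arbitrary choice for illustration.

```python
from fractions import Fraction


def bernoulli_pmf(x, theta):
    """P(x | theta) = theta^x * (1 - theta)^(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return theta**x * (1 - theta) ** (1 - x)


theta = Fraction(3, 10)  # assumed bias of the coin

# E(X) = 1 * P(X=1) + 0 * P(X=0)
expected = sum(x * bernoulli_pmf(x, theta) for x in (0, 1))
print(bernoulli_pmf(1, theta), bernoulli_pmf(0, theta), expected)
```

Only the x = 1 term survives in the sum, so the expected value equals Θ.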

Maximum Likelihood: Easiest way to infer parameters from data

Flip a coin n times; the flips are IID

IID: independent, identically distributed

Let x_i refer to the outcome of flip i, and build the joint probability:

P(x_1, x_2, …, x_n | Θ) = P(x_1 | Θ) * P(x_2 | Θ) * … * P(x_n | Θ)

Principle of Maximum Likelihood: Choose value of Θ that maximizes the joint probability

Θ_ML = argmax over Θ of P(x_1, …, x_n | Θ) = argmax over Θ of the product from i = 1 to n of Θ^(x_i) * (1 - Θ)^(1 - x_i)

Equivalent to maximizing ln(P(x_1, …, x_n | Θ))

Let m be the number of successes. The log-likelihood is m * ln(Θ) + (n - m) * ln(1 - Θ). Take its derivative with respect to Θ:

m/Θ - (n - m)/(1 - Θ)

Set the derivative to 0 and solve for Θ: Θ = m/n
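The closed-form result can be sanity-checked numerically. This sketch uses a made-up sequence of 10 flips (7 successes) and compares m/n against a brute-force search over a grid of Θ values. The data and the grid resolution are assumptions for illustration.

```python
import math


def log_likelihood(theta, flips):
    """ln P(x_1, ..., x_n | theta) = m * ln(theta) + (n - m) * ln(1 - theta)."""
    m = sum(flips)   # number of successes
    n = len(flips)
    return m * math.log(theta) + (n - m) * math.log(1 - theta)


# Made-up data: 1 = success, 0 = failure.
flips = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]

# Closed-form maximum likelihood estimate.
theta_ml = sum(flips) / len(flips)

# Brute-force check: evaluate the log-likelihood on a grid in (0, 1).
grid = [k / 1000 for k in range(1, 1000)]
theta_grid = max(grid, key=lambda t: log_likelihood(t, flips))
print(theta_ml, theta_grid)
```

The grid excludes 0 and 1 because ln(0) is undefined. The two estimates agree up to the grid spacing.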

Binomial Distribution: Probability of observing m heads out of n tosses, given Θ?

P(m | n, Θ) = [n! / ((n - m)! * m!)] * Θ^m * (1 - Θ)^(n - m)
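A small sketch of the binomial probability, assuming the standard form in which the count of orderings n! / ((n - m)! * m!) is multiplied by Θ^m * (1 - Θ)^(n - m). The values n = 10 and Θ = 0.5 are arbitrary.

```python
from math import comb  # Python 3.8+


def binomial_pmf(m, n, theta):
    """Probability of exactly m heads in n IID tosses with P(heads) = theta."""
    return comb(n, m) * theta**m * (1 - theta) ** (n - m)


n, theta = 10, 0.5  # assumed example values
probs = [binomial_pmf(m, n, theta) for m in range(n + 1)]
print(probs[5], sum(probs))
```

The probabilities over m = 0..n sum to 1, which is a quick check that the formula is a valid distribution.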

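Multinomial Distribution: Generalizes Bernoulli to k outcomes, with probabilities P(X=1), P(X=2), …, P(X=k)

Encode each outcome as a binary vector with a single 1 in the position of the observed outcome

Follow a similar process to find the max likelihood: look at ln and take the derivative. The estimate for each outcome ends up being m/n, where m is the number of times that outcome was observed in n trials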
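A minimal sketch of the multinomial case, assuming "m/n" is read per outcome: the estimate for outcome j is the number of times j was observed divided by n. The observations and k = 3 are made up, and `one_hot` is a hypothetical helper name.

```python
def one_hot(outcome, k):
    """Encode an outcome in 1..k as a binary vector with a single 1."""
    return [1 if j == outcome else 0 for j in range(1, k + 1)]


k = 3
observations = [1, 3, 2, 1, 1, 3, 2, 1]  # made-up data
vectors = [one_hot(o, k) for o in observations]
n = len(vectors)

# Summing the binary vectors column by column gives the count of each outcome.
counts = [sum(column) for column in zip(*vectors)]

# Maximum likelihood estimate for each outcome: count / n.
theta_ml = [m / n for m in counts]
print(counts, theta_ml)
```

With the binary encoding, the counts fall out of a plain column sum, which is why the notes suggest it.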