Stat 330 Formula Sheet Final Exam

probability mass function properties: 0 ≤ pX(k) ≤ 1 for all k = 0, 1, 2, . . . and Σ_k pX(k) = 1
probability distribution function properties:
FX(t) non-decreasing in t, lim_{t→−∞} FX(t) = 0, lim_{t→∞} FX(t) = 1; step function, increasing at each possible value k.
expected value & variance properties:
E[aX + bY] = aE[X] + bE[Y], E[X²] = Var[X] + (E[X])², Var[aX + b] = a²Var[X]
Var[X + Y] = Var[X] + Var[Y] if X, Y are independent (Cov(X, Y) = 0 suffices).
E[X · Y] = E[X] · E[Y] if X, Y are independent.
covariance of X and Y: Cov(X, Y) = Σ_{(x,y)} (x − E[X])(y − E[Y]) pX,Y(x, y)
correlation of X and Y: Corr(X, Y) = Cov(X, Y)/sqrt(Var[X]·Var[Y]);
correlation is in [−1, 1], unitless.
discrete distribution(s):
• Binomial B(n, p): X=number of successes in n indep. trials, p=P (success)(Bernoulli trials).
pmf: pX(k) = (n choose k) p^k (1 − p)^(n−k); for FX(k) see Binomial table; E[X] = np, Var[X] = np(1 − p)
• Geometric Geo(p): X = number of Bernoulli trials until 1st success, p = P (success).
pmf: pX(k) = (1 − p)^(k−1) p, FX(k) = 1 − (1 − p)^k, E[X] = 1/p, Var[X] = (1 − p)/p²
• Poisson P oi(λ): X = number of events in some unit of time/space,
λ = rate of events in some unit of time/space.
pmf: pX(k) = e^(−λ) λ^k / k!; for FX(k) see Poisson table; E[X] = λ, Var[X] = λ
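As a sanity check, the three pmfs above can be evaluated directly and their means compared against np, 1/p, and λ. This is a minimal stdlib-only sketch; the parameter values are illustrative, not from the sheet.

```python
import math

def binom_pmf(k, n, p):
    # pX(k) = (n choose k) p^k (1 - p)^(n - k)
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def geom_pmf(k, p):
    # pX(k) = (1 - p)^(k - 1) p, for k = 1, 2, ...
    return (1 - p)**(k - 1) * p

def pois_pmf(k, lam):
    # pX(k) = e^(-lambda) lambda^k / k!
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p, lam = 10, 0.3, 2.5
# Means computed from the pmfs should match the closed-form expectations.
mean_b = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
mean_g = sum(k * geom_pmf(k, p) for k in range(1, 500))   # tail beyond 500 is negligible
mean_p = sum(k * pois_pmf(k, lam) for k in range(100))
```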
continuous random variable X: uncountable set of possible outcomes (= interval),
probability density function: fX(x) = F′X(x),
cumulative distribution function: FX(t) = P(X ≤ t),
expected value E[X] = ∫_{−∞}^{∞} x fX(x) dx, variance Var[X] = ∫_{−∞}^{∞} (x − E[X])² fX(x) dx.
probability density function (pdf) properties: fX(x) ≥ 0 for all x and ∫_{−∞}^{∞} fX(x) dx = 1.
cumulative distribution function (cdf) properties:
FX(t) non-decreasing in t, lim_{t→−∞} FX(t) = 0, lim_{t→∞} FX(t) = 1.
rules for densities & distribution functions:
P(a ≤ X ≤ b) = ∫_a^b fX(x) dx = FX(b) − FX(a), P(X = a) = 0, FX(t) = ∫_{−∞}^t fX(x) dx.
• Uniform U(a, b): X = random value between a and b; subintervals of [a, b] of equal length are equally likely.
pdf: fX(x) = 1/(b − a) for x ∈ [a, b], cdf: FX(x) = (x − a)/(b − a) for x ∈ [a, b],
E[X] = (a + b)/2, Var[X] = (b − a)²/12.
• Exponential Exp(λ), X = time/space until 1st occurrence of event,
λ = rate of events in some unit of time/space.
pdf: f (x) = λe−λx for x ≥ 0, FX (x) = 1 − e−λx for x ≥ 0,
E[X] = 1/λ, V ar[X] = 1/λ2 .
• Exponential distribution is memoryless, i.e. P(X > s + t | X > s) = P(X > t).
• Erlang Distribution Erlang(k, λ): If Y1, . . . , Yk are k independent exponential random variables with parameter λ, their sum X has an Erlang distribution:
X := Σ_{i=1}^k Yi is Erlang(k, λ); k is the stage parameter, λ is the rate parameter.
Erlang density: f(x) = λe^(−λx) · (λx)^(k−1)/(k − 1)! for x ≥ 0; E[X] = k/λ, Var[X] = k/λ².
Erlang cdf is calculated using the Poisson cdf: FX(t) = 1 − FY(k − 1),
where X ∼ Erlang(k, λ) and Y ∼ Poi(λt),
so use the Poisson cdf table with parameter λt.
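The Erlang–Poisson identity above can be checked numerically: integrating the Erlang density from 0 to t should reproduce 1 − FY(k − 1). A stdlib-only sketch with illustrative parameters:

```python
import math

def erlang_pdf(x, k, lam):
    # f(x) = lambda e^(-lambda x) (lambda x)^(k-1) / (k-1)!
    return lam * math.exp(-lam * x) * (lam * x)**(k - 1) / math.factorial(k - 1)

def erlang_cdf(t, k, lam):
    # F_X(t) = 1 - F_Y(k - 1) with Y ~ Poisson(lambda * t)
    return 1 - sum(math.exp(-lam * t) * (lam * t)**j / math.factorial(j)
                   for j in range(k))

k, lam, t = 3, 2.0, 1.5
# Trapezoid-rule integral of the density over [0, t]
n = 20000
h = t / n
integral = sum(0.5 * (erlang_pdf(i * h, k, lam) + erlang_pdf((i + 1) * h, k, lam)) * h
               for i in range(n))
```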
• Normal r.v.: X ∼ N(µ, σ²), Normal density is “bell-shaped”: f(x) = 1/sqrt(2πσ²) · e^(−(x−µ)²/(2σ²)), E[X] = µ, Var[X] = σ².
standardization: FX(x) = Φ((x − µ)/σ); Z ∼ N(0, 1), Φ(z) ≡ FZ(z) and Φ(−z) = 1 − Φ(z).
X ∼ N(µx, σx²), Y ∼ N(µy, σy²), then W := aX + bY has normal distribution W ∼ N(µW, σW²),
where µW = aµx + bµy and σW² = a²σx² + b²σy² + 2ab·Cov(X, Y).
Central Limit Theorem (CLT): If X1, X2, . . . , Xn are i.i.d. r.v.’s with E[Xi] = µ, Var[Xi] = σ²,
then X̄ := (1/n) Σ_{i=1}^n Xi ∼ N(µ, σ²/n) (approx.) and Sn = Σ_i Xi ∼ N(nµ, nσ²) (approx.).
Bin(n, p) ∼ N(np, np(1 − p)) (approx.) for large n (if np > 5),
Poi(λ) ∼ N(λ, λ) (approx.) for large λ.
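The normal approximation to the binomial can be checked against the exact cdf, with Φ built from math.erf. A stdlib-only sketch; n, p, and the cutoff are illustrative, and the 0.5 continuity correction is a common refinement not stated on the sheet:

```python
import math

def phi(z):
    # Standard normal cdf via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def binom_cdf(k, n, p):
    # Exact F_X(k) for X ~ Bin(n, p)
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

n, p = 100, 0.4            # np = 40 > 5, so the approximation applies
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
exact = binom_cdf(45, n, p)
approx = phi((45 + 0.5 - mu) / sigma)   # continuity-corrected normal approximation
```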
Poisson Process with rate λ: X(t) ∈ {0, 1, 2, 3, . . .}, X(t2) − X(t1) ∼ Poi(λ(t2 − t1)) for 0 ≤ t1 < t2;
for any 0 ≤ t1 < t2 ≤ t3 < t4, X(t2) − X(t1) is independent of X(t4) − X(t3).
time of jth occurrence: Oj ∼ Erlang(j, λ)
time between (j − 1)st and jth arrival: Ij ∼ Exp(λ)
X(t) is a Poisson process with rate λ ⇐⇒ Ij ∼ Exp(λ)
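The interarrival characterization above gives a direct way to simulate the process: accumulate Exp(λ) gaps and count arrivals before time t; the mean count should be near λt. A seeded sketch with illustrative rates (the loose tolerance allows for Monte Carlo noise):

```python
import random

random.seed(42)
lam, t, reps = 2.0, 5.0, 20000

def count_events(lam, t):
    # Count arrivals in [0, t] built from Exp(lambda) interarrival times
    clock, n = 0.0, 0
    while True:
        clock += random.expovariate(lam)
        if clock > t:
            return n
        n += 1

counts = [count_events(lam, t) for _ in range(reps)]
mean_count = sum(counts) / reps   # should be close to lam * t = 10
```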
Birth & Death Processes: X(t) ∈ {0, 1, 2, 3, . . .} for all t; visualize with a state diagram.
steady-state probabilities: lim_{t→∞} P(X(t) = k) = pk
From the balance equations, p0 = S^(−1) where S = 1 + Σ_{k=1}^∞ (λ0λ1 · . . . · λ_{k−1})/(µ1µ2 · . . . · µk);
the B&D process is stable only if S exists (the series converges). Then
pk = (λ0λ1 · . . . · λ_{k−1})/(µ1µ2 · . . . · µk) · p0.
Special case (constant birth & death rates): λk = λ, µk = µ for all k, traffic intensity a = λ/µ;
then S = 1 + λ0/µ1 + (λ0λ1)/(µ1µ2) + . . . = 1 + a + a² + a³ + . . . = Σ_{k=0}^∞ a^k = 1/(1 − a) for 0 < a < 1.
Markov Chains: Sequence {X(0), X(1), X(2), . . .} defined over discrete time T = {0, 1, 2, . . .} and discrete state space {1, 2, 3, . . .}. Has the Markov property:
P{X(t + 1) = j | X(t) = i, X(t − 1) = h, X(t − 2) = g, . . .} = P{X(t + 1) = j | X(t) = i}.
1-step transition probability: pij(t) = P{X(t + 1) = j | X(t) = i}.
h-step transition probability: pij^(h)(t) = P(X(t + h) = j | X(t) = i).
Initial distribution P0 is the pmf P0(x) = P(X(0) = x) for x ∈ {1, 2, . . . , n}.
Useful results: P^(2) = P · P = P²; P^(h) = P^h; Ph = P0 P^h.
Steady-state distribution: π = lim_{h→∞} Ph(x), x ∈ X.
Compute π = (π1, π2, . . . , πn) by solving the set of equations πP = π together with Σ_x πx = 1.
Regular Markov Chain: if, for some n, all entries of P^n are positive.
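For a regular chain, the rows of P^h converge to π, so repeated matrix multiplication recovers the steady state without solving πP = π directly. A sketch on a made-up 2-state matrix (whose exact steady state is π = (0.8, 0.2)):

```python
# Illustrative 2-state transition matrix; rows sum to 1.
P = [[0.9, 0.1],
     [0.4, 0.6]]

def mat_mul(A, B):
    # Plain list-of-lists matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

Ph = P
for _ in range(200):       # P^(h) = P^h; every row converges to pi
    Ph = mat_mul(Ph, P)
pi = Ph[0]                 # steady-state distribution
```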
Queues
arrival rate λ, service rate µ,
traffic intensity a = λ/µ, ρ = a/c.
M/M/1 Queue: p0 = 1 − a, pk = a^k(1 − a), L = a/(1 − a), Lq = a²/(1 − a),
W = L/λa = (1/µ) · 1/(1 − a), Ws = 1/µ, Wq = (1/µ) · a/(1 − a),
P(q(t) ≤ x) = 1 − a·e^(−x(µ−λ)), where q(t) is the time spent in the queue.
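The M/M/1 formulas above are tied together by Little's law (L = λW, Lq = λWq), which makes a convenient consistency check. A sketch with illustrative rates:

```python
lam, mu = 3.0, 5.0
a = lam / mu                   # traffic intensity, must be < 1 for stability
p0 = 1 - a                     # P(system empty)
L = a / (1 - a)                # mean number in system
Lq = a**2 / (1 - a)            # mean number waiting in queue
W = (1 / mu) * (1 / (1 - a))   # mean time in system
Wq = (1 / mu) * (a / (1 - a))  # mean waiting time in queue
```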
M/M/1/K Queue: p0 = (1 − a)/(1 − a^(K+1)), pk = a^k p0,
L = a/(1 − a) − (K + 1)a^(K+1)/(1 − a^(K+1)), λa = (1 − pK)λ,
W = L/λa, Ws = 1/µ, Wq = W − Ws, Lq = Wq · λa.
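Because the M/M/1/K state space is finite, the pk must sum to 1 and L must equal Σ k·pk, which checks the closed form for L. A sketch with illustrative values:

```python
lam, mu, K = 3.0, 5.0, 4
a = lam / mu
p0 = (1 - a) / (1 - a**(K + 1))
pk = [a**k * p0 for k in range(K + 1)]          # truncated geometric
L = a / (1 - a) - (K + 1) * a**(K + 1) / (1 - a**(K + 1))
lam_a = (1 - pk[K]) * lam                       # effective arrival rate
W = L / lam_a                                   # mean time in system
```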
M/M/c Queue: p0 = (Σ_{k=0}^{c−1} a^k/k! + (a^c/c!) · 1/(1 − ρ))^(−1),
pk = (a^k/k!) p0 for 0 ≤ k ≤ c − 1, pk = (a^k/(c! c^(k−c))) p0 for k ≥ c,
C(c, a) = (a^c/(c!(1 − ρ))) p0, Lq = (ρ/(1 − ρ)) C(c, a), Wq = (1/(cµ(1 − ρ))) C(c, a),
Ws = 1/µ, W = Wq + Ws, L = a + (ρ/(1 − ρ)) C(c, a).
M/M/c/c Queue: p0 = (Σ_{k=0}^{c} a^k/k!)^(−1), pk = (a^k/k!) p0,
Lq = 0, Wq = 0, Ws = 1/µ, W = Ws, λa = (1 − (a^c/c!) p0)λ, L = W · λa.
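The M/M/c delay probability C(c, a) and the M/M/c/c blocking probability pc follow directly from the formulas above. A stdlib-only sketch with illustrative rates (Little's law Lq = λWq serves as the check):

```python
import math

lam, mu, c = 8.0, 3.0, 4
a = lam / mu                   # offered load
rho = a / c                    # per-server utilization, must be < 1

# M/M/c
p0 = 1 / (sum(a**k / math.factorial(k) for k in range(c))
          + a**c / (math.factorial(c) * (1 - rho)))
C = p0 * a**c / (math.factorial(c) * (1 - rho))   # P(arrival must wait)
Lq = rho / (1 - rho) * C
Wq = C / (c * mu * (1 - rho))

# M/M/c/c (loss system): blocking probability p_c
p0_loss = 1 / sum(a**k / math.factorial(k) for k in range(c + 1))
p_block = a**c / math.factorial(c) * p0_loss
```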
Estimation and Confidence intervals
Parameter µ, estimate x̄; (1 − α)·100% confidence interval:
x̄ ± zα/2 · sqrt(s²/n); for small n use t(n−1),α/2 for zα/2.
Parameter p, estimate p̂:
p̂ ± zα/2 · sqrt(p̂(1 − p̂)/n) (substitution), or p̂ ± zα/2 · 1/(2·sqrt(n)) (conservative).
Parameter µ1 − µ2, estimate x̄1 − x̄2:
x̄1 − x̄2 ± zα/2 · sqrt(s1²/n1 + s2²/n2); for small samples use t(n1+n2−2),α/2 for zα/2 and sp² = ((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2) for s1², s2².
Parameter p1 − p2, estimate p̂1 − p̂2:
p̂1 − p̂2 ± zα/2 · sqrt(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2) (substitution), or p̂1 − p̂2 ± zα/2 · (1/2) · sqrt(1/n1 + 1/n2) (conservative).
where zα/2 = Φ^(−1)(1 − α/2), and
α:     0.1    0.05   0.02   0.01
zα/2:  1.65   1.96   2.33   2.58
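The one-sample intervals above reduce to a few lines given summary statistics. A sketch with made-up sample numbers; note the conservative proportion interval is always at least as wide as the substitution version, since p̂(1 − p̂) ≤ 1/4:

```python
import math

z = 1.96                       # z_{alpha/2} for a 95% interval (from the table)

# CI for mu: x_bar +/- z * sqrt(s^2 / n)
x_bar, s, n = 52.3, 6.1, 64
half = z * math.sqrt(s**2 / n)
ci_mu = (x_bar - half, x_bar + half)

# CI for p, substitution: p_hat +/- z * sqrt(p_hat (1 - p_hat) / m)
p_hat, m = 0.35, 200
half_p = z * math.sqrt(p_hat * (1 - p_hat) / m)
ci_p = (p_hat - half_p, p_hat + half_p)

# CI for p, conservative: p_hat +/- z / (2 sqrt(m))
half_c = z / (2 * math.sqrt(m))
```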
Hypothesis Testing
H0: µ = #; Ha: µ > #, µ < # or µ ≠ #; test statistic z = (x̄ − #)/(s/√n).
H0: p = #; Ha: p > #, p < # or p ≠ #; test statistic z = (p̂ − #)/sqrt(#(1 − #)/n).
H0: µ1 − µ2 = #; Ha: µ1 − µ2 > #, µ1 − µ2 < # or µ1 − µ2 ≠ #; test statistic z = (x̄1 − x̄2 − #)/sqrt(s1²/n1 + s2²/n2).
H0: p1 − p2 = #; Ha: p1 − p2 > #, p1 − p2 < # or p1 − p2 ≠ #; test statistic z = (p̂1 − p̂2 − #)/sqrt(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2). *, **
* If # = 0, can also use z = (p̂1 − p̂2 − #)/sqrt(p̂(1 − p̂)(1/n1 + 1/n2)), where p̂ = (n1p̂1 + n2p̂2)/(n1 + n2).
** For large sample sizes, z = (p̂1 − p̂2 − #)/sqrt(p̂(1 − p̂)(1/n1 + 1/n2)) is equivalent to z = (p̂1 − p̂2 − #)/sqrt(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2).
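A worked one-sample proportion test, with a two-sided p-value computed from Φ via math.erf. The data values are made up for illustration:

```python
import math

def phi(z):
    # Standard normal cdf
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# H0: p = 0.5 vs Ha: p != 0.5, with p_hat = 0.56 from n = 400
p_null, p_hat, n = 0.5, 0.56, 400
z = (p_hat - p_null) / math.sqrt(p_null * (1 - p_null) / n)
p_value = 2 * (1 - phi(abs(z)))        # two-sided
reject_at_5pct = abs(z) > 1.96         # |z| > z_{alpha/2} with alpha = 0.05
```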
Hypothesis Testing for small samples; standard deviation σ is unknown
H0: µ = #; statistic t = (X̄ − #)/(s/√n); reference distribution: t-dist. with n − 1 d.f.
H0: µ1 − µ2 = #; statistic t = (X̄1 − X̄2 − #)/(sp · sqrt(1/n1 + 1/n2)); reference distribution: t-dist. with n1 + n2 − 2 d.f.
sp² is the pooled variance, calculated as sp² = ((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2).
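A worked two-sample pooled t statistic. The sample numbers are made up, and the critical value t(18),0.025 = 2.101 is taken from a standard t table for the two-sided decision:

```python
import math

# Illustrative summary statistics for two independent samples
x1, s1, n1 = 24.5, 3.2, 10
x2, s2, n2 = 21.8, 2.9, 10

# Pooled variance: sp^2 = ((n1-1)s1^2 + (n2-1)s2^2) / (n1 + n2 - 2)
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

# t = (x1 - x2 - #)/(sp sqrt(1/n1 + 1/n2)) with # = 0, on n1 + n2 - 2 = 18 d.f.
t = (x1 - x2 - 0) / (math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2))
reject_two_sided = abs(t) > 2.101      # |t| > t_{(n1+n2-2), alpha/2}
```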
Hypothesis Testing for small samples; Rejection Region
one-sample t-test (one-sided, right tail): H1: µ > #; R.R.: t > t(n−1),α
one-sample t-test (one-sided, left tail): H1: µ < #; R.R.: t < −t(n−1),α
one-sample t-test (two-sided): H1: µ ≠ #; R.R.: |t| > t(n−1),α/2
two-sample t-test (one-sided, right tail): H1: µ1 − µ2 > #; R.R.: t > t(n1+n2−2),α
two-sample t-test (one-sided, left tail): H1: µ1 − µ2 < #; R.R.: t < −t(n1+n2−2),α
two-sample t-test (two-sided): H1: µ1 − µ2 ≠ #; R.R.: |t| > t(n1+n2−2),α/2