LECTURE 21
• Readings: Sections 8.1-8.2

Types of inference models/approaches
• Model building versus inferring unknown variables.
  E.g., assume X = aS + W
  – Model building: know "signal" S, observe X, infer a
  – Estimation in the presence of noise: know a, observe X, estimate S
  [Diagram: Reality → Model (e.g., customer arrivals, Poisson)]
• Hypothesis testing: unknown takes one of few possible values;
  aim at small probability of incorrect decision
  – E.g., Θ ∈ {0, 1}, X = Θ + W, W ∼ fW (w)
• Estimation: aim at a small estimation error

[Diagram: Θ with prior pΘ(θ) → model pX|Θ(x | θ) → observation X → Estimator → estimate Θ̂]

"It is the mark of truly educated people
to be deeply moved by statistics." (Oscar Wilde)

Sample Applications
• Polling
  – Design of experiments/sampling methodologies
  – Lancet study on Iraq death toll
• Medical/pharmaceutical trials
• Design & interpretation of experiments
  – polling, medical/pharmaceutical trials
• Data mining (e.g., Netflix competition)
• Finance
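The hypothesis-testing setup above (Θ ∈ {0, 1}, X = Θ + W) can be simulated. This is a sketch under assumptions not in the notes: an equal prior on the two hypotheses and standard normal noise W.

```python
# Sketch of the hypothesis-testing example: Theta in {0, 1}, X = Theta + W.
# Assumptions for illustration: P(Theta = 0) = P(Theta = 1) = 1/2, W ~ N(0, 1).
import random

random.seed(0)

def map_decision(x):
    # With equal priors and symmetric Gaussian noise, the MAP rule
    # reduces to thresholding the observation at 1/2.
    return 1 if x > 0.5 else 0

n = 100_000
errors = 0
for _ in range(n):
    theta = random.randint(0, 1)      # draw the unknown from its prior
    x = theta + random.gauss(0, 1)    # observation X = Theta + W
    errors += (map_decision(x) != theta)

print(errors / n)  # probability of incorrect decision, about 0.31 here
```

The empirical error rate approaches P(N(0, 1) > 0.5) ≈ 0.31, the best achievable for this noise level.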
Matrix Completion
• Partially observed matrix: goal is to predict the unobserved entries

[Figure: partially observed matrix with entries in {1, . . . , 5} and unobserved entries marked "?"]

Finance
• Graph of S&P 500 index removed due to copyright restrictions.
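A minimal sketch of the matrix-completion task: predict an unobserved entry from its row and column means. The small matrix and the baseline rule are illustrative assumptions, not the slide's data or the Netflix-winning method.

```python
# Baseline sketch for matrix completion: predict each unobserved entry by
# averaging its row mean and column mean. Matrix and method are illustrative
# assumptions only.
R = [
    [5,    4,    None, 1],
    [4,    None, 3,    1],
    [None, 2,    3,    5],
]

def mean(xs):
    xs = [x for x in xs if x is not None]
    return sum(xs) / len(xs)

def predict(i, j):
    row_mean = mean(R[i])
    col_mean = mean([R[k][j] for k in range(len(R))])
    return (row_mean + col_mean) / 2

for i, row in enumerate(R):
    for j, x in enumerate(row):
        if x is None:
            print(f"predict R[{i}][{j}] = {predict(i, j):.2f}")
```

Real systems replace this baseline with low-rank factorizations fit to the observed entries, but the input/output shape of the problem is the same.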
Sample Applications (ctd.)
• Signal processing
  – Tracking, detection, speaker identification, . . .

Classical statistics versus Bayesian
• Classical statistics:
  – θ: unknown parameter (not a r.v.)
    ◦ E.g., θ = mass of electron
  – Model: pX (x; θ)
• Bayesian: Use priors & Bayes rule
  – Prior pΘ(θ); model pX|Θ(x | θ); estimator Θ̂
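To make the contrast concrete for a coin's bias: the classical MLE is k/n, while the Bayesian treatment puts a prior on Θ; a uniform prior plus k heads in n tosses yields a Beta(k+1, n−k+1) posterior with mean (k+1)/(n+2). The data below (7 heads in 10 tosses) is made up for illustration.

```python
# Classical vs Bayesian estimate of a coin's bias (made-up data).
# Classical: theta is a fixed unknown, estimated by the MLE k/n.
# Bayesian: Theta is a r.v.; a uniform prior gives a Beta(k+1, n-k+1)
# posterior, whose mean is (k+1)/(n+2).
n, k = 10, 7

mle = k / n                          # classical point estimate: 0.7
posterior_mean = (k + 1) / (n + 2)   # Bayesian estimate: 8/12

print(mle, posterior_mean)
```

The posterior mean is pulled slightly toward 1/2 by the prior; the two estimates coincide as n grows.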
Bayesian inference: Use Bayes rule
• Hypothesis testing
  – discrete data:
      pΘ|X (θ | x) = pΘ(θ) pX|Θ(x | θ) / pX (x)
  – continuous data:
      pΘ|X (θ | x) = pΘ(θ) fX|Θ(x | θ) / fX (x)
• Estimation; continuous data:
      fΘ|X (θ | x) = fΘ(θ) fX|Θ(x | θ) / fX (x)

Estimation with discrete data
      fΘ|X (θ | x) = fΘ(θ) pX|Θ(x | θ) / pX (x),
      where pX (x) = ∫ fΘ(θ) pX|Θ(x | θ) dθ
• Example:
  – Coin with unknown parameter θ
  – Observe X heads in n tosses
• What is the Bayesian approach?
  – Assume a prior on Θ (e.g., uniform)
  – Want to find fΘ|X (θ | x)

• Example: Zt = Θ0 + tΘ1 + t²Θ2, Xt = Zt + Wt, t = 1, 2, . . . , n
  – Bayes rule gives fΘ0,Θ1,Θ2|X1,...,Xn (θ0, θ1, θ2 | x1, . . . , xn)
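The estimation-with-discrete-data formula can be evaluated numerically for the coin example: uniform prior on Θ, X heads in n tosses. The grid and trapezoid integration are implementation choices, and the data is made up.

```python
# Numerical version of the estimation-with-discrete-data formula:
# f_{Theta|X}(theta|x) = f_Theta(theta) p_{X|Theta}(x|theta) / p_X(x).
from math import comb

n, x = 10, 7                               # observe x heads in n tosses

grid = [i / 1000 for i in range(1001)]     # theta values on [0, 1]
prior = [1.0] * len(grid)                  # uniform prior: f_Theta(theta) = 1
lik = [comb(n, x) * t**x * (1 - t)**(n - x) for t in grid]

# p_X(x) = integral of f_Theta(theta) p_{X|Theta}(x|theta) d theta (trapezoid)
unnorm = [f * l for f, l in zip(prior, lik)]
px = sum((unnorm[i] + unnorm[i + 1]) / 2 * 0.001 for i in range(len(grid) - 1))

posterior = [u / px for u in unnorm]       # f_{Theta|X}(theta | x) on the grid
map_theta = grid[max(range(len(grid)), key=posterior.__getitem__)]
print(map_theta)   # posterior mode lands at x/n = 0.7
```

With a uniform prior the posterior here is Beta(x+1, n−x+1), so p_X(x) = 1/(n+1) = 1/11, which the trapezoid sum reproduces.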
Example: object at unknown location
• Prior: fΘ(θ) = 1/6, for 4 ≤ θ ≤ 10
• Sensor model: pX|Θ(x | θ) = P(sensor i "senses" the object | Θ = θ)
  (X: distance from sensor i)
• Noisy measurement: Y = X + W, W ∼ pW (w)
Output of Bayesian Inference
• Posterior distribution:
  – pmf pΘ|X (· | x) or pdf fΘ|X (· | x)
• If interested in a single answer:
  – Maximum a posteriori probability (MAP):
    ◦ pΘ|X (θ∗ | x) = maxθ pΘ|X (θ | x)
    ◦ fΘ|X (θ∗ | x) = maxθ fΘ|X (θ | x)
    minimizes probability of error;
    often used in hypothesis testing
  – Conditional expectation:
      E[Θ | X = x] = ∫ θ fΘ|X (θ | x) dθ
  – Single answers can be misleading!

Least Mean Squares Estimation
• Estimation in the absence of information:
  find estimate c, to minimize E[(Θ − c)²]
• Optimal estimate: c = E[Θ]
• Optimal mean squared error: E[(Θ − E[Θ])²] = Var(Θ)
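A numerical check of the last two bullets: for a small made-up discrete distribution, a grid search over c lands on E[Θ], and the minimal mean squared error equals Var(Θ).

```python
# Check: E[(Theta - c)^2] is minimized at c = E[Theta], with minimal value
# Var(Theta). The discrete distribution is made up for illustration.
thetas = [0, 1, 2, 4]
probs  = [0.1, 0.4, 0.3, 0.2]

def mse(c):
    return sum(p * (t - c) ** 2 for t, p in zip(thetas, probs))

mean = sum(p * t for t, p in zip(thetas, probs))    # E[Theta] = 1.8
best = min((c / 100 for c in range(401)), key=mse)  # grid search on [0, 4]

print(best, mse(mean))  # best matches E[Theta]; mse(mean) = Var(Theta) = 1.56
```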
LMS Estimation of Θ based on X
• Two r.v.'s Θ, X
• we observe that X = x
  – new universe: condition on X = x
• E[(Θ − c)² | X = x] is minimized by c = E[Θ | X = x]:
    E[(Θ − E[Θ | X = x])² | X = x] ≤ E[(Θ − g(x))² | X = x]
  ◦ E[(Θ − E[Θ | X])² | X] ≤ E[(Θ − g(X))² | X]
  ◦ E[(Θ − E[Θ | X])²] ≤ E[(Θ − g(X))²]
• E[Θ | X] minimizes E[(Θ − g(X))²] over all estimators g(·)

LMS Estimation w. several measurements
• Unknown r.v. Θ
• Observe values of r.v.'s X1, . . . , Xn
• Best estimator: E[Θ | X1, . . . , Xn]
• Can be hard to compute/implement
  – involves multi-dimensional integrals, etc.
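A small check that g∗(x) = E[Θ | X = x] beats an arbitrary competitor g(X) in mean squared error, on a made-up joint pmf of (Θ, X).

```python
# Check that g*(x) = E[Theta | X = x] has the smallest mean squared error,
# here against the competitor g(X) = X, on a made-up joint pmf of (Theta, X).
joint = {  # (theta, x): probability
    (0, 0): 0.3, (0, 1): 0.1,
    (1, 0): 0.1, (1, 1): 0.5,
}

def cond_exp(x):
    # E[Theta | X = x] computed from the joint pmf
    num = sum(p * t for (t, xx), p in joint.items() if xx == x)
    den = sum(p for (t, xx), p in joint.items() if xx == x)
    return num / den

def mse(g):
    # E[(Theta - g(X))^2]
    return sum(p * (t - g(x)) ** 2 for (t, x), p in joint.items())

lms = mse(cond_exp)        # LMS estimator's error
naive = mse(lambda x: x)   # competitor g(X) = X

print(lms, naive)          # lms <= naive, as the optimality result guarantees
```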
MIT OpenCourseWare
http://ocw.mit.edu
6.041SC Probabilistic Systems Analysis and Applied Probability
Fall 2013
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.