Stat 643 Review of Probability Results (Cressie)

Probability Space: $(\Omega, \mathcal{A}, P)$
$\Omega$ is the set of outcomes.
$\mathcal{A}$ is a $\sigma$-algebra of subsets of $\Omega$.
$P$ is a probability measure mapping from $\mathcal{A}$ into $[0,1]$.
Measurable Space: $(\Omega, \mathcal{A})$.
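Random Variable: Suppose $(\Omega, \mathcal{A}, P)$ is a probability space and let $X: \Omega \to \mathbb{R}$ be measurable (i.e., $\{\omega \in \Omega: X(\omega) \le \alpha\} \in \mathcal{A}$ for all $\alpha \in \mathbb{R}$). Then $X$ is said to be a random variable (r.v.).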
Integral of a Measurable Function: Suppose $(\Omega, \mathcal{A}, \mu)$ is a measure space (i.e., $\mu$ maps from $\mathcal{A}$ into $[0,\infty]$) and $f$ is a measurable mapping. Then $\int f \, d\mu$ is defined as a limit of integrals of simple (i.e., step) functions:
Write: $f = f^+ - f^-$, where $f^+, f^- \ge 0$. If $\int f^+ \, d\mu < \infty$ and $\int f^- \, d\mu < \infty$, then the integral is said to be finite. If $\int f^+ \, d\mu = \infty = \int f^- \, d\mu$, then the integral is said not to exist; otherwise it is said to exist. The measurable function $f$ is said to be integrable if $\int f \, d\mu$ exists and is finite.
Notation: If $X$ is a r.v. on $(\Omega, \mathcal{A}, P)$, write $E(X)$ for $\int X \, dP$.
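The definition of the integral as a limit of integrals of simple functions can be illustrated numerically. The sketch below is an assumed example, not taken from the notes: it takes f(x) = x^2 on [0, 1] with Lebesgue measure and integrates the dyadic simple functions s_n = floor(2^n f) / 2^n, using the closed form mu{x : x^2 >= t} = 1 - sqrt(t). The function name `simple_integral` is hypothetical.

```python
import math

def simple_integral(n):
    """Integral over [0, 1] (Lebesgue measure) of the simple function
    s_n = floor(2**n * f) / 2**n, where f(x) = x**2.

    s_n is a sum of 2**n indicator layers, each of height 2**-n:
    s_n = 2**-n * sum_k 1{f >= k / 2**n}, and mu{x : x**2 >= t} = 1 - sqrt(t).
    """
    levels = 2 ** n
    height = 1.0 / levels
    return sum(height * (1.0 - math.sqrt(k / levels)) for k in range(1, levels + 1))

# The simple functions increase pointwise to f, so their integrals
# increase towards the integral of f, which is 1/3 here.
approximations = [simple_integral(n) for n in range(1, 13)]
```

Because s_n increases to f pointwise, the values in `approximations` are nondecreasing and approach 1/3 from below.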
Important Convergence Theorems
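Let $(\Omega, \mathcal{A}, \mu)$ be a measure space and $(\mathbb{R}, \mathcal{B}^1)$ be the measurable space of real numbers with the Borel $\sigma$-algebra $\mathcal{B}^1$. In the following, $g$, $f$, $\{f_n\}_{n \ge 1}$, and $\{g_n\}_{n \ge 1}$ denote measurable functions from $(\Omega, \mathcal{A})$ into $(\mathbb{R}, \mathcal{B}^1)$.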
Fatou's Lemma: If $f_n \ge 0$ a.s. ($\mu$), for all $n \ge 1$, then
$$\int \liminf_{n \to \infty} f_n \, d\mu \le \liminf_{n \to \infty} \int f_n \, d\mu .$$
Monotone Convergence Theorem: Suppose that a.s. ($\mu$), $0 \le f_n \uparrow f$. Then
$$\int f_n \, d\mu \uparrow \int f \, d\mu .$$
Dominated Convergence Theorem: Suppose that a.s. ($\mu$), $f_n \to f$ as $n \to \infty$ and $|f_n| \le g$ for all $n \ge 1$. If $\int g \, d\mu < \infty$, then $f$ is integrable and
$$\lim_{n \to \infty} \int f_n \, d\mu = \int f \, d\mu .$$
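As a numerical illustration of dominated convergence, the sketch below uses an assumed example that is not in the notes: f_n(x) = x^n on [0, 1] with Lebesgue measure, dominated by g = 1. The pointwise limit is 0 except at x = 1, so the integrals 1/(n+1) should shrink to 0. The helper `midpoint_integral` is a hypothetical quadrature routine, used here only as an approximation.

```python
def midpoint_integral(func, a, b, steps=10_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))

def f(n):
    """Return the function x -> x**n, which satisfies |f_n| <= 1 on [0, 1]."""
    return lambda x: x ** n

# Integrals of f_n over [0, 1]; the exact values are 1 / (n + 1).
integrals = {n: midpoint_integral(f(n), 0.0, 1.0) for n in (1, 10, 100, 1000)}
```

The computed integrals decrease towards 0, which is the integral of the a.s. limit, in line with the theorem.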
Extended Dominated Convergence Theorem: Suppose that
(i) $f_n \to f$ a.s. ($\mu$), $g_n \to g$ a.s. ($\mu$).
(ii) $|f_n| \le g_n$ a.s. ($\mu$) and $\int g_n \, d\mu < \infty$, for all $n \ge 1$.
(iii) $\lim_{n \to \infty} \int g_n \, d\mu = \int g \, d\mu < \infty$.
Then,
$$\lim_{n \to \infty} \int f_n \, d\mu = \int f \, d\mu < \infty .$$
Note: Dominated convergence is a special case with $g_n = g$ for all $n \ge 1$.
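Scheffé's Theorem: Suppose $f_n \ge 0$ and $f \ge 0$ a.s. ($\mu$). Let $\nu_n(A) \equiv \int_A f_n \, d\mu$ and $\nu(A) \equiv \int_A f \, d\mu$ be measures on $(\Omega, \mathcal{A})$ with $\nu_n(\Omega) = \nu(\Omega) < \infty$, for all $n \ge 1$. If $f_n \to f$ a.s. ($\mu$), then
(i) $\sup\{|\nu_n(A) - \nu(A)|: A \in \mathcal{A}\} \to 0$, as $n \to \infty$, and
(ii) $\int |f_n - f| \, d\mu \to 0$, as $n \to \infty$.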
Uniform Integrability
The sequence of measurable functions $\{f_n\}_{n \ge 1}$ is called uniformly integrable (w.r.t. $\mu$) if
$$\lim_{\lambda \to \infty} \sup_{n \ge 1} \int_{\{|f_n| > \lambda\}} |f_n| \, d\mu = 0 .$$
Theorem: Suppose that $\mu(\Omega) < \infty$, $f_n \to f$ a.s. ($\mu$), and the sequence $\{f_n\}_{n \ge 1}$ is uniformly integrable. Then $\{f_n: n \ge 1\}$, $f$ are all integrable and $\int f_n \, d\mu \to \int f \, d\mu$.
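The uniform integrability condition can be checked by hand for indicator-type families, and the sketch below does so numerically. Both families are assumed examples on [0, 1] with Lebesgue measure, not taken from the notes: f_n = n 1_(0,1/n) and f_n = sqrt(n) 1_(0,1/n). Truncating the supremum at `n_max` is an assumption of the sketch, and all names are hypothetical.

```python
def tail_integral(height, width, lam):
    """Integral of |f| over {|f| > lam} for f = height * 1_(0, width)."""
    return height * width if height > lam else 0.0

def spikes(n):
    """(height, width) of f_n = n * 1_(0, 1/n)."""
    return float(n), 1.0 / n

def mild(n):
    """(height, width) of f_n = sqrt(n) * 1_(0, 1/n)."""
    return n ** 0.5, 1.0 / n

def sup_tail(family, lam, n_max=10_000):
    """Maximum over n <= n_max of the tail integral (a truncated supremum)."""
    return max(tail_integral(*family(n), lam) for n in range(1, n_max + 1))

spike_tails = [sup_tail(spikes, lam) for lam in (10.0, 50.0)]
mild_tails = [sup_tail(mild, lam) for lam in (10.0, 50.0)]
```

For the first family the tail supremum stays near 1 however large the threshold, so the family is not uniformly integrable; this matches the fact that f_n -> 0 a.s. while each integral equals 1. For the second family the tail supremum is below 1/lambda and shrinks with the threshold.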
Various Forms of Convergence for r.v.'s
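Almost sure convergence: $X_n \xrightarrow{\text{a.s.}} X$ if $P(\lim_{n \to \infty} X_n = X) = 1$.
Convergence in probability: $X_n \xrightarrow{P} X$ if $\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0$, $\forall \varepsilon > 0$.
Convergence in $L^p$: $X_n \xrightarrow{L^p} X$ if, for $\int |X_n|^p \, dP < \infty$, $\int |X|^p \, dP < \infty$,
$$\lim_{n \to \infty} \int |X_n - X|^p \, dP = 0 ;$$
e.g., $p = 1$ corresponds to convergence in the mean, and $p = 2$ corresponds to convergence in mean square.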
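These modes of convergence are genuinely different. A standard illustration, assumed here and not part of the notes, is the "typewriter" sequence on [0, 1) with Lebesgue measure: write n = 2^m + k with 0 <= k < 2^m and let X_n be the indicator of [k/2^m, (k+1)/2^m). The sketch below (all names hypothetical) shows that P(|X_n| > eps) = 2^-m tends to 0, while a fixed outcome is hit once in every block, so X_n(omega) does not converge.

```python
from fractions import Fraction

def block_and_offset(n):
    """Write n = 2**m + k with 0 <= k < 2**m and return (m, k)."""
    m = n.bit_length() - 1
    k = n - (1 << m)
    return m, k

def x_n(n, omega):
    """X_n(omega): indicator of the interval [k / 2**m, (k + 1) / 2**m)."""
    m, k = block_and_offset(n)
    left = Fraction(k, 2 ** m)
    right = Fraction(k + 1, 2 ** m)
    return 1 if left <= omega < right else 0

def prob_exceeds(n):
    """P(|X_n - 0| > eps) for any 0 < eps < 1: the length of the interval."""
    m, _ = block_and_offset(n)
    return Fraction(1, 2 ** m)

omega = Fraction(1, 3)  # an arbitrary fixed outcome
# Number of indices n in the block [2**m, 2**(m+1)) with X_n(omega) = 1.
hits_per_block = [
    sum(x_n(n, omega) for n in range(2 ** m, 2 ** (m + 1))) for m in range(10)
]
```

Every entry of `hits_per_block` equals 1, so X_n(omega) = 1 infinitely often and X_n does not converge a.s. to 0, although it converges to 0 in probability (and in L^p, since E|X_n|^p = 2^-m).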
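Jensen's Inequality: Let $X \equiv (X_1, \ldots, X_d)'$ be a random vector (i.e., a measurable mapping from $(\Omega, \mathcal{A}, P)$ to $(\mathbb{R}^d, \mathcal{B}^d)$, where $\mathcal{B}^d$ is the $\sigma$-algebra of Borel sets in $\mathbb{R}^d$). Suppose $X \in D$ a.s., where $D$ is a convex set in $\mathbb{R}^d$, and $E(|X|) < \infty$. (Recall that $|X| \equiv (X_1^2 + \cdots + X_d^2)^{1/2}$.) Define $E(X) \equiv (E(X_1), \ldots, E(X_d))'$. Let $\varphi: D \to \mathbb{R}$, where $\varphi$ is convex (i.e., $\varphi(\alpha x + (1-\alpha) y) \le \alpha \varphi(x) + (1-\alpha) \varphi(y)$ for all $x, y \in D$ and $0 \le \alpha \le 1$). Then
$$E(\varphi(X)) \ge \varphi(E(X)) .$$
Corollary: If $\psi$ is concave then $E(\psi(X)) \le \psi(E(X))$ ($\psi$ is concave iff $-\psi$ is convex).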
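Jensen's inequality is easy to verify on a small discrete distribution. The sketch below uses an assumed four-point distribution (values and probabilities are illustrative only) and checks the inequality for the convex functions x^2 and exp(x); for x^2 the gap is exactly the variance.

```python
from fractions import Fraction
import math

# Hypothetical discrete r.v.: P(X = values[i]) = probs[i].
values = [Fraction(-1), Fraction(0), Fraction(2), Fraction(5)]
probs = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 3), Fraction(1, 6)]

def expectation(func):
    """E(func(X)) for the discrete distribution above."""
    return sum(p * func(x) for x, p in zip(values, probs))

mean = expectation(lambda x: x)

# Gap E(phi(X)) - phi(E(X)) for phi(x) = x**2; this is Var(X), computed exactly.
square_gap = expectation(lambda x: x * x) - mean * mean

# Same gap for phi(x) = exp(x), computed in floating point.
exp_gap = expectation(lambda x: math.exp(x)) - math.exp(mean)
```

Both gaps are nonnegative, as the inequality requires.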
Radon-Nikodym Theorem
Definition: A signed measure $\mu$ on a measurable space $(\Omega, \mathcal{A})$ is a mapping $\mu: \mathcal{A} \to (-\infty, \infty]$ such that (s.t.) $\mu(\emptyset) = 0$ and
$$\mu\Big(\bigcup_{i \in I} A_i\Big) = \sum_{i \in I} \mu(A_i), \qquad (*)$$
where $I$ is countable and $\{A_i\}$ are disjoint. The equality in (*) is taken to mean that the summation converges absolutely if $\mu(\bigcup_{i \in I} A_i)$ is finite, and diverges otherwise.
Jordan Decomposition: A signed measure $\mu$ can be written as $\mu = \mu^+ - \mu^-$, where $\mu^+$ and $\mu^-$ are measures.
Definition: $|\mu| \equiv \mu^+ + \mu^-$.
Example: Let $X$ be a r.v. s.t. $\int X^- \, dP < \infty$. Then
$$\mu(A) \equiv \int_A X \, dP ; \quad A \in \mathcal{A}$$
is a signed measure.
Definition: A measure $\mu$ is a $\sigma$-finite measure if $\exists$ disjoint $\{A_i\}$ s.t. $\Omega = \bigcup_{i=1}^{\infty} A_i$ and $\mu(A_i) < \infty$; $i = 1, 2, \ldots$.
Definition: A [signed] measure $\mu$ is said to be absolutely continuous (a.c.) with respect to a [signed] measure $\nu$ if $\nu(A) = 0 \Rightarrow \mu(A) = 0$ [$|\nu|(A) = 0 \Rightarrow |\mu|(A) = 0$]. Write $\mu \ll \nu$.
Definition: Two measurable mappings $f$ and $g$ on $(\Omega, \mathcal{A}, \mu)$ are said to be equivalent if $\mu(f \ne g) = 0$.
Radon-Nikodym (R-N) Theorem
(1) Let $(\Omega, \mathcal{A}, P)$ be a probability space and $\mu$ a signed measure s.t. $|\mu| \ll P$. Then $\exists$ r.v. $X$, unique up to equivalence ($P$), s.t.
$$\mu(A) = \int_A X \, dP , \quad \forall A \in \mathcal{A} ,$$
where $\int X^- \, dP < \infty$.
(2) Let $(\Omega, \mathcal{A}, \nu)$ be a measure space, where $\nu$ is $\sigma$-finite, and let $\mu$ be a signed measure s.t. $|\mu| \ll \nu$. Then $\exists$ measurable mapping $f$, unique up to equivalence ($\nu$), s.t.
$$\mu(A) = \int_A f \, d\nu , \quad \forall A \in \mathcal{A} ,$$
where $\int f^- \, d\nu < \infty$.
Notes: (i) When $(\Omega, \mathcal{A}, P)$ is a probability space, $P$ is trivially $\sigma$-finite and so (1) is just a special case of (2).
(ii) When $\mu$ is a measure, $f$ in (2) (and thus $X$ in (1)) is nonnegative.
Notation: $X$ in (1) or $f$ in (2) is called the Radon-Nikodym derivative and is denoted as $d\mu/dP$ in (1) or $d\mu/d\nu$ in (2).
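On a finite space the Radon-Nikodym derivative is simply a ratio of point masses, which makes the defining identity easy to check. The sketch below is an assumed toy example (the four-point space, the weights, and all function names are hypothetical): it builds f = d(mu)/d(nu) pointwise and verifies mu(A) equals the integral of f over A with respect to nu, for every event A.

```python
from fractions import Fraction
from itertools import chain, combinations

omega = ["a", "b", "c", "d"]
# Point masses of two measures; mu << nu because nu({d}) = 0 forces mu({d}) = 0.
nu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4), "d": Fraction(0)}
mu = {"a": Fraction(1, 8), "b": Fraction(3, 8), "c": Fraction(1, 2), "d": Fraction(0)}

def rn_derivative(mu, nu):
    """Pointwise ratio mu/nu; on nu-null points any value works, 0 is chosen."""
    return {w: (mu[w] / nu[w] if nu[w] != 0 else Fraction(0)) for w in nu}

def measure(weights, event):
    """Measure of an event given point masses."""
    return sum(weights[w] for w in event)

def integral(f, weights, event):
    """Integral of f over the event with respect to the measure 'weights'."""
    return sum(f[w] * weights[w] for w in event)

def all_events(points):
    """All subsets of a finite set, as tuples."""
    return chain.from_iterable(combinations(points, r) for r in range(len(points) + 1))

f = rn_derivative(mu, nu)
identity_holds = all(measure(mu, A) == integral(f, nu, A) for A in all_events(omega))
```

The arbitrary choice of f on the nu-null point "d" reflects the "unique up to equivalence" clause of the theorem.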
Conditional Expectations
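Let $(\Omega, \mathcal{A}, P)$ be a probability space and $X$ a r.v. s.t. $X^-$ is integrable. Let $\mathcal{C} \subset \mathcal{A}$ be a sub-$\sigma$-algebra (i.e., $\mathcal{C}$ is a $\sigma$-algebra contained in $\mathcal{A}$). Define a signed measure $\mu$ on $(\Omega, \mathcal{C})$ by:
$$\mu(C) \equiv \int_C X \, dP ; \quad C \in \mathcal{C} .$$
Definition: Let $P_{\mathcal{C}}$ be the probability measure on $\mathcal{C}$ given by
$$P_{\mathcal{C}}(C) = P(C) ; \quad \forall C \in \mathcal{C} .$$
Then, $\mu \ll P_{\mathcal{C}}$ and for any $\mathcal{C}$-measurable r.v. $Y$ (i.e., $Y^{-1}((-\infty, \alpha]) \in \mathcal{C}$, $\forall \alpha \in \mathbb{R}$),
$$\int Y \, dP_{\mathcal{C}} = \int Y \, dP .$$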
Definition: A function $g: \Omega \to \mathbb{R}$ is called (a version of) the conditional expectation of $X$ given $\mathcal{C}$ if
(i) $g$ is $\mathcal{C}$-measurable, and
(ii) $\int_C g \, dP = \int_C X \, dP$, $\forall C \in \mathcal{C}$.
Note: The R-N Theorem guarantees the existence of the conditional expectation because $\mu(C) = \int_C g \, dP_{\mathcal{C}}$, where $g$ is the R-N derivative $d\mu/dP_{\mathcal{C}}$; i.e.,
$$\mu(C) = \int_C g \, dP .$$
The r.v. $g$ on $\mathcal{C}$ is unique up to equivalence.
Notation: $g = E(X|\mathcal{C})$.
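Notes:
(i) If $X$ is $\mathcal{C}$-measurable (i.e., $X^{-1}((-\infty, \alpha]) \in \mathcal{C}$, $\forall \alpha \in \mathbb{R}$) then $E(X|\mathcal{C}) = X$ a.s. ($P$).
(ii) If $\mathcal{C} = \{\emptyset, \Omega\}$, then $E(X|\mathcal{C}) = E(X)$.
(iii) Suppose $Y$ is another r.v. on $(\Omega, \mathcal{A}, P)$ and define $\mathcal{B}(Y) \equiv \{Y^{-1}(B): B \in \mathcal{B}^1\}$ to be the $\sigma$-algebra generated by $Y$. Then write $E(X|\mathcal{B}(Y))$ as $E(X|Y)$.
(iv) The conditional expectation of $X$ "smooths" the r.v. $X$. Suppose $X$ is $\mathcal{A}$-measurable on $(\Omega, \mathcal{A}, P)$. Then:
Sub $\sigma$-algebra:   $\{\emptyset, \Omega\}$   $\subset$   $\mathcal{C}$   $\subset$   $\mathcal{A}$
$E(X|\cdot)$:   $E(X)$   $E(X|\mathcal{C})$   $X$ a.s.
   "smoothest"   "smooth"   "roughest"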
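When the sub-sigma-algebra is generated by a finite partition, the conditional expectation is the probability-weighted average of X on each cell. The sketch below is an assumed toy example (a fair die with X(w) = w^2; all names are hypothetical) that computes this for three nested partitions, showing the smoothing described in note (iv), and checks the defining property (ii) on a cell.

```python
from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]                   # outcomes of a fair die
prob = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w * w) for w in omega}      # X(w) = w**2

def cond_exp(X, prob, partition):
    """E(X | C) for C generated by a partition whose cells have positive mass."""
    g = {}
    for cell in partition:
        mass = sum(prob[w] for w in cell)
        average = sum(X[w] * prob[w] for w in cell) / mass
        for w in cell:
            g[w] = average           # g is constant on each cell
    return g

def integral(h, prob, event):
    """Integral of h over the event with respect to prob."""
    return sum(h[w] * prob[w] for w in event)

trivial = [omega]                            # generates {empty set, Omega}
parity = [[1, 3, 5], [2, 4, 6]]              # odd / even
finest = [[w] for w in omega]                # generates all subsets

g_trivial = cond_exp(X, prob, trivial)       # constant, equal to E(X)
g_parity = cond_exp(X, prob, parity)         # two values, one per cell
g_finest = cond_exp(X, prob, finest)         # recovers X itself
```

The coarser the partition, the fewer distinct values the conditional expectation takes, which is the sense in which it is "smoother".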
Conditional Monotone Convergence Theorem: If $X_n \ge 0$, $X_n \uparrow X$ a.s. ($P$), then $E(X_n|\mathcal{C}) \uparrow E(X|\mathcal{C})$ a.s. ($P$).
Conditional Dominated Convergence Theorem: If $X_n \to X$ a.s. ($P$), $|X_n| \le Y$ a.s. ($P$) and $E(|Y|) < \infty$, then
$$\lim_{n \to \infty} E(X_n|\mathcal{C}) = E(X|\mathcal{C}) \quad \text{a.s. } (P) .$$
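Conditional Jensen's Inequality: Let $\varphi: D \to \mathbb{R}$, where $\varphi$ is convex and $D$ is a convex subset of $\mathbb{R}$. Let $X$ be a r.v. on $(\Omega, \mathcal{A}, P)$ s.t. $X \in D$ a.s. ($P$). Suppose $E(|\varphi(X)|) < \infty$. Then $E(\varphi(X)|\mathcal{C}) \ge \varphi(E(X|\mathcal{C}))$ a.s. ($P$).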
Regular Conditional Probability
Let $A \in \mathcal{A}$ and $I_A$ be the indicator function of $A$. Then $E(I_A|\mathcal{C})$ is a $\mathcal{C}$-measurable r.v. that has some properties of a probability on sets $A \in \mathcal{A}$. Notice that $P(\omega, A) \equiv E(I_A|\mathcal{C})(\omega)$ is a mapping from $\Omega \times \mathcal{A}$ into $[0,1]$.
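Definition: Given a probability space $(\Omega, \mathcal{A}, P)$, $\psi: \Omega \times \mathcal{A} \to [0,1]$ is called a regular conditional probability on $\mathcal{A}$ given $\mathcal{C}$ if
(i) for each fixed $\omega \in \Omega$, $\psi(\omega, \cdot)$ is a probability measure on $(\Omega, \mathcal{A})$,
(ii) for each fixed $A \in \mathcal{A}$, $\psi(\cdot, A)$ is $\mathcal{C}$-measurable, and
(iii) for every $A \in \mathcal{A}$, $\int_C \psi(\omega, A) \, dP(\omega) = P(A \cap C)$, $\forall C \in \mathcal{C}$.
Note: Although $E(I_A|\mathcal{C})$ satisfies (ii) and (iii), it does not necessarily satisfy (i).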
Theorem: If $(\Omega, \mathcal{A}, P) = (\mathbb{R}^d, \mathcal{B}^d, P)$, then a regular conditional probability on $\mathcal{B}^d$ given $\mathcal{C}$ exists and is given by $\psi(\omega, A) = E(I_A|\mathcal{C})(\omega)$.
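Change of Variable Theorem: Let $f: (\Omega, \mathcal{A}, \mu) \to (\Omega^*, \mathcal{A}^*, \nu)$ and $g: (\Omega^*, \mathcal{A}^*, \nu) \to (\mathbb{R}, \mathcal{B}^1)$ be two measurable functions, where
$$\nu(A^*) \equiv \mu(f^{-1}(A^*))$$
is the measure induced by $f$ on $(\Omega^*, \mathcal{A}^*)$. Then
$$\int_{A^*} g \, d\nu = \int_{f^{-1}(A^*)} g \circ f \, d\mu = \int_{f^{-1}(A^*)} g(f(\omega)) \, d\mu , \quad \forall A^* \in \mathcal{A}^* ,$$
in the sense that if one of the integrals exists then so does the other and they are equal.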
Probability versus Statistics
Probability is concerned with r.v.'s $X: (\Omega, \mathcal{A}, P) \to (\mathbb{R}, \mathcal{B}^1)$. Now $X$ induces a probability measure $P^X$ on $(\mathbb{R}, \mathcal{B}^1)$ through
$$P^X(B) \equiv P(X^{-1}(B)) , \quad \forall B \in \mathcal{B}^1 .$$
Statistics focuses on the triple $(\mathbb{R}, \mathcal{B}^1, P^X)$ and essentially forgets about $(\Omega, \mathcal{A}, P)$. More generally, $X$ does not have to be a r.v. mapping into $(\mathbb{R}, \mathcal{B}^1)$ but could be some more general random quantity. Then data $X$ could be thought of simply as a measurable mapping into $(\mathcal{X}, \mathcal{B}, P^X)$.
Example: $\mathcal{X} = \mathbb{R}^d$ and $\mathcal{B} = \mathcal{B}^d$: Then $X$ is a random vector. More complicated $\mathcal{X}$ and $\mathcal{B}$ are needed when $X$ is, say, a random set.
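Definition: A statistic $T$ is a measurable mapping
$$T: (\mathcal{X}, \mathcal{B}, P^X) \to (\mathcal{T}, \mathcal{F}) .$$
Note: $T$ induces a probability measure $P^T$ on $(\mathcal{T}, \mathcal{F})$: $P^T(F) \equiv P^X(T^{-1}(F))$, $\forall F \in \mathcal{F}$.
Notation: $\mathcal{B}_0 \equiv \mathcal{B}(T) \equiv \{T^{-1}(F): F \in \mathcal{F}\}$, the $\sigma$-algebra generated by $T$. Then $\mathcal{B}_0 \subset \mathcal{B}$ is a sub-$\sigma$-algebra.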
Lehmann's Theorem (TSH, p. 42):
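A real-valued $\mathcal{B}$-measurable function $\varphi$ is $\mathcal{B}_0$-measurable iff $\exists$ a real-valued $\mathcal{F}$-measurable function $\psi: (\mathcal{T}, \mathcal{F}) \to (\mathbb{R}, \mathcal{B}^1)$ s.t.
$$\varphi(x) = \psi(T(x)) , \quad \forall x \in \mathcal{X} ,$$
where $T$ is a statistic mapping into $(\mathcal{T}, \mathcal{F})$ and $\mathcal{B}_0 = \mathcal{B}(T)$.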
Partitioning the Sample Space
Suppose $\mathcal{X}$ is the sample space and $T$ is a statistic. Then $T$ defines a partition of $\mathcal{X}$ as follows: For $x, y \in \mathcal{X}$, $x$ and $y$ are in the same member of the partition (write $x \sim y$) iff $T(x) = T(y)$. Notice that two "different" statistics can generate the same partition, e.g.,
$$T_1(x) = \sum_{i=1}^{d} x_i \quad \text{and} \quad T_2(x) = \sum_{i=1}^{d} x_i / d .$$
It is tempting to characterize a statistic's behavior via its partition of $\mathcal{X}$ but, for technical reasons, it does not necessarily generate the $\sigma$-algebra of interest, namely $\mathcal{B}(T)$. In general, if $\mathcal{B}(T_1) = \mathcal{B}(T_2)$ then we say the two statistics $T_1, T_2$ are the same.
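The partition induced by a statistic can be computed directly on a small finite sample space. The sketch below is an assumed example (a sample space of triples with entries in {0, 1, 2}; the names `partition_of`, `t1`, `t2`, `t3` are hypothetical) showing that the sum and the mean generate the same partition, while the maximum generates a different one.

```python
from itertools import product

def partition_of(statistic, sample_space):
    """Group sample points by the value of the statistic: x ~ y iff T(x) == T(y)."""
    cells = {}
    for x in sample_space:
        cells.setdefault(statistic(x), []).append(x)
    return {frozenset(cell) for cell in cells.values()}

# Finite sample space: all vectors (x_1, x_2, x_3) with entries in {0, 1, 2}.
sample_space = list(product([0, 1, 2], repeat=3))

def t1(x):
    """Sum of the coordinates."""
    return sum(x)

def t2(x):
    """Mean of the coordinates."""
    return sum(x) / len(x)

def t3(x):
    """Largest coordinate."""
    return max(x)

same_partition = partition_of(t1, sample_space) == partition_of(t2, sample_space)
different_partition = partition_of(t1, sample_space) != partition_of(t3, sample_space)
```

On a finite space the partition and the generated sigma-algebra carry the same information; the technical caveat in the text concerns general sample spaces.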
Change of Variable Formula Involving a Statistic
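Suppose
$$(\mathcal{X}, \mathcal{B}, P^X) \xrightarrow{\;T\;} (\mathcal{T}, \mathcal{F}, P^T) \xrightarrow{\;g\;} (\mathbb{R}, \mathcal{B}^1) .$$
Then,
$$\int_{T^{-1}(F)} g(T(x)) \, dP^X(x) = \int_F g(t) \, dP^T(t) ; \quad F \in \mathcal{F} ,$$
in the sense that if one integral exists then so does the other and they are equal.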
Proof:
Assume $F = \mathcal{T}$, the whole space. First let $g = I_{F_1}$ for some $F_1 \in \mathcal{F}$. Then
$$g(T(x)) = I_{F_1}(T(x)) = I_{T^{-1}(F_1)}(x) .$$
Thus, since $\mathcal{X} = T^{-1}(\mathcal{T})$,
$$\begin{aligned}
\int_{\mathcal{X}} g(T(x)) \, dP^X(x) &= \int_{\mathcal{X}} I_{T^{-1}(F_1)}(x) \, dP^X(x) \\
&= P^X(T^{-1}(F_1)) \\
&= P^T(F_1) \\
&= \int_{\mathcal{T}} I_{F_1}(t) \, dP^T(t) \\
&= \int_{\mathcal{T}} g(t) \, dP^T(t) .
\end{aligned}$$
The same result is obtained when $F$ is not $\mathcal{T}$. Therefore, the result is true for indicator functions, so true for simple functions, and hence true for limits of simple functions.
Probability Version of Change of Variables Formula
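Suppose
$$(\Omega, \mathcal{A}, P) \xrightarrow{\;X\;} (\mathcal{X}, \mathcal{B}, P^X) \xrightarrow{\;g\;} (\mathbb{R}, \mathcal{B}^1) .$$
Then,
$$\int_{\Omega} g(X(\omega)) \, dP(\omega) = \int_{\mathcal{X}} g(x) \, dP^X(x) ,$$
which is known as the law of the unconscious statistician.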
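The law of the unconscious statistician can be checked exactly on a finite probability space. The sketch below is an assumed example (two fair dice, X the sum, g(x) = (x - 7)^2; all names are illustrative): it computes the left side as a sum over outcomes in Omega and the right side as a sum over values of X weighted by the induced measure P^X.

```python
from collections import defaultdict
from fractions import Fraction

# Underlying probability space: ordered outcomes of two fair dice.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = {w: Fraction(1, 36) for w in omega}

def X(w):
    """The r.v. X: sum of the two dice."""
    return w[0] + w[1]

def g(x):
    """A real-valued function of the data."""
    return (x - 7) ** 2

# Induced measure P^X({x}) = P(X^{-1}({x})).
P_X = defaultdict(Fraction)
for w, p in P.items():
    P_X[X(w)] += p

lhs = sum(g(X(w)) * p for w, p in P.items())   # integral over Omega w.r.t. P
rhs = sum(g(x) * p for x, p in P_X.items())    # integral over the sample space w.r.t. P^X
```

Both sums give the same exact rational number, here the variance of the sum of two dice.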
Change of Variables and the R-N Derivative
Suppose $P^X \ll \mu$, where $\mu$ is some $\sigma$-finite measure on $(\mathcal{X}, \mathcal{B})$.
Define $f(x) \equiv (dP^X/d\mu)(x)$; $x \in \mathcal{X}$, which recall is a $\mathcal{B}$-measurable mapping; i.e.,
$$P^X(B) = \int_B f \, d\mu .$$
Then, for $g$ s.t. $\int |g(x)| \, dP^X(x) < \infty$, the change of variable formula gives
$$E(g(X)) \equiv \int_{\Omega} g(X(\omega)) \, dP(\omega) = \int_{\mathcal{X}} g(x) \, dP^X(x) .$$
Further,
$$\int_{\mathcal{X}} g(x) \, dP^X(x) = \int_{\mathcal{X}} g(x) f(x) \, d\mu(x) .$$
This last equality is true for $g(\cdot) = I_B(\cdot)$, because $\int g \, dP^X = \int I_B \, dP^X = P^X(B) = \int_B f \, d\mu = \int I_B f \, d\mu = \int g f \, d\mu$. Hence it is true for simple functions, and so it is true for limits of simple functions.
Example: $\mu$ is Lebesgue measure. Write "$dx$" as shorthand for $d\mu(x)$. For example, if $\int x^2 \, dP^X(x) < \infty$, then $E(X^2) = \int x^2 f(x) \, dx$, where $f(x)$ is the R-N derivative of $P^X$ w.r.t. $\mu$; $f$ is commonly called the probability density function.
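The density formula for E(X^2) can be checked numerically for a specific distribution. The sketch below assumes, purely for illustration, that X has the standard exponential density f(x) = exp(-x) for x >= 0, for which E(X^2) = 2. The midpoint rule and the truncation of the range at `upper` are approximations chosen for this sketch, and the helper names are hypothetical.

```python
import math

def density(x):
    """Assumed density of P^X w.r.t. Lebesgue measure: standard exponential."""
    return math.exp(-x) if x >= 0 else 0.0

def midpoint_integral(func, a, b, steps=100_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))

upper = 50.0  # truncation point; the mass beyond it is negligible for this density

# The density should integrate to 1, and x**2 * f(x) should integrate to E(X**2).
total_mass = midpoint_integral(density, 0.0, upper)
second_moment = midpoint_integral(lambda x: x * x * density(x), 0.0, upper)
```

The computed `second_moment` is close to the known value 2 for this distribution.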