16FEB2009
HOMEWORK 3
CONDITIONAL INDEPENDENCE
DUE WEDNESDAY 04MAR2009

Recall the notation from the lecture notes: Let $\Omega = \times(\Omega_v \mid v \in V)$, where $V$ is a finite set and each $\Omega_v$ is a measurable space. Let $P$ be a probability measure on $\Omega$, i.e., $P \in \mathcal{P}(\Omega)$, and let $A, C \subseteq V$ be two disjoint subsets of $V$. The defining equations for the family $P_{A|C} := (P_{A|C,\omega_C} \in \mathcal{P}(\Omega_A) \mid \omega_C \in \Omega_C)$ of conditional probabilities of $A$ given $C$ are

$$P_{A \dot\cup C}(M_A \times M_C) = \int_{\Omega_C} P_{A|C,\omega_C}(M_A)\, 1_{M_C}(\omega_C)\, dP_C(\omega_C), \qquad M_A \subseteq \Omega_A,\ M_C \subseteq \Omega_C.$$

A compressed way of writing the above defining equations is the single defining equation

$$P_{A \dot\cup C} = \int_{\Omega_C} P_{A|C,\omega_C} \otimes \epsilon_{\omega_C}\, dP_C(\omega_C),$$

where $\epsilon_{\omega_C} \in \mathcal{P}(\Omega_C)$ is the one point measure at $\omega_C$, i.e., the family $(P_{A|C,\omega_C} \otimes \epsilon_{\omega_C} \in \mathcal{P}(\Omega_{A \dot\cup C}) \mid \omega_C \in \Omega_C)$ is a decomposition of $P_{A \dot\cup C}$ wrt the projection $\mathrm{pr}_{A \dot\cup C, C} : \Omega_{A \dot\cup C} \to \Omega_C$.

We will assume that the measurable spaces $\Omega_v$, $v \in V$, are "so nice" that for all disjoint subsets $A, C \subseteq V$ the family of conditional probabilities of $A$ given $C$ exists and is unique (up to a null set wrt. $P_C$). This is for example ensured if $\Omega_v$ as a measurable space can be considered as a measurable subset of $\mathbb{R}^I$ for some finite set $I$.

In random variable language¹: Let $X \equiv (X_v \mid v \in V)$ be a random variable with values in $\Omega \equiv \times(\Omega_v \mid v \in V)$ and distribution $P \in \mathcal{P}(\Omega)$. Let $A, C \subseteq V$ be two disjoint subsets of $V$. The defining equations for the family $P_{A|C} := (P_{A|C,\omega_C} \in \mathcal{P}(\Omega_A) \mid \omega_C \in \Omega_C)$ of conditional probabilities (distributions) of $X_A \equiv (X_v \mid v \in A)$ given $X_C \equiv (X_v \mid v \in C)$ are

$$\Pr((X_A, X_C) \in M_A \times M_C) = E\big(\Pr(X_A \in M_A \mid X_C)\, 1_{M_C}(X_C)\big), \qquad M_A \subseteq \Omega_A,\ M_C \subseteq \Omega_C,$$

where we have used $E$ for expectation. Note that $(X_A, X_C) \equiv X_{A \dot\cup C}$, note that the right hand side is

$$\int_{\Omega_C} \Pr(X_A \in M_A \mid X_C = \omega_C)\, 1_{M_C}(\omega_C)\, dP_C(\omega_C),$$

and note that we have used the standard notations $\Pr(X_A \in M_A \mid X_C) := P_{A|C}$ and $\Pr(X_A \in M_A \mid X_C = \omega_C) := P_{A|C,\omega_C}$ for the family of conditional probabilities and the entries in the family, respectively.

Suppose, in Problems 1-3, that $P$ has a density $f$ wrt.
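a ($\sigma$-finite) measure $\mu$ on $\Omega$, written $P = f \cdot \mu$, i.e.,

$$P(M) = \int_{\Omega} 1_M(\omega)\, f(\omega)\, d\mu(\omega), \qquad M \subseteq \Omega.$$

Suppose furthermore that $f$ is positive and that $\mu = \otimes(\mu_v \mid v \in V)$ is a product (measure) of ($\sigma$-finite) measures $\mu_v \neq 0$ on $\Omega_v$, $v \in V$. Let $A \subseteq V$ and let $A^c$ denote the complement of $A$ within $V$. The marginal distribution $\mathrm{pr}_A(P) =: P_A \in \mathcal{P}(\Omega_A)$ is then given by $P_A = f_A \cdot \mu_A$, where $\mu_A := \otimes(\mu_v \mid v \in A)$ and

$$f_A(\omega_A) := \int_{\Omega_{A^c}} f(\omega_A, \omega_{A^c})\, d\mu_{A^c}(\omega_{A^c}), \qquad \omega_A \in \Omega_A.$$

¹ It is unnecessary to read this paragraph about random variables. It is mentioned only to enhance your understanding of our notation.

Let $B \subseteq V$ with $A \cap B = \emptyset$. Then $A \perp B$, saying that $A$ and $B$ are (stochastically) independent, i.e., $P_{A \dot\cup B} = P_A \otimes P_B$, if and only if for all $\omega_{A \dot\cup B} \equiv (\omega_A, \omega_B) \in \Omega_A \times \Omega_B \equiv \Omega_{A \dot\cup B}$

$$f_{A \dot\cup B}(\omega_A, \omega_B) = f_A(\omega_A) \cdot f_B(\omega_B). \tag{1}$$

This statement is NOT correct as stated but should only hold for almost all $\omega_{A \dot\cup B}$ wrt $\mu_{A \dot\cup B}$. From now on we (you) should disregard all such qualifiers "almost all ...".

Let $C \subseteq V$ with $A \cap C = \emptyset$. A version of the conditional probability of $A$ given $C$ is then given by $P_{A|C,\omega_C} = f_{A|C,\omega_C} \cdot \mu_A$, $\omega_C \in \Omega_C$, where

$$f_{A|C,\omega_C}(\omega_A) := \frac{f_{A \dot\cup C}(\omega_A, \omega_C)}{f_C(\omega_C)}, \qquad (\omega_A, \omega_C) \equiv \omega_{A \dot\cup C} \in \Omega_{A \dot\cup C} \equiv \Omega_A \times \Omega_C. \tag{2}$$

Suppose that $A$, $B$, and $C$ are pairwise disjoint. Then $A \perp B \mid C$, saying that $A$ and $B$ are independent given $C$, i.e., $P_{A \dot\cup B|C,\omega_C} = P_{A|C,\omega_C} \otimes P_{B|C,\omega_C}$, $\omega_C \in \Omega_C$, if and only if for all $(\omega_A, \omega_B, \omega_C) \in \Omega_{A \dot\cup B \dot\cup C}$

$$f_{A \dot\cup B|C,\omega_C}(\omega_A, \omega_B) = f_{A|C,\omega_C}(\omega_A)\, f_{B|C,\omega_C}(\omega_B),$$

cf. (1), or

$$f_{A \dot\cup B \dot\cup C}(\omega_A, \omega_B, \omega_C) = \frac{f_{A \dot\cup C}(\omega_A, \omega_C)\, f_{B \dot\cup C}(\omega_B, \omega_C)}{f_C(\omega_C)}, \tag{3}$$

cf. (2).

To establish the results in Problems 1 and 2 below you should assume that $P = f \cdot \mu$, use the density formulas above in your arguments, and, as mentioned above, disregard the qualifiers "almost all ...".

Problem 1: Let $P = f \cdot \mu$ be a probability measure on $\Omega$ and let $A, B, C \subseteq V$ be pairwise disjoint.
(i) Suppose $A$ and $B$ are independent, i.e., $A \perp B$.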
Establish that the (constant) family $(P_A \mid \omega_B \in \Omega_B)$ is a family $P_{A|B}$ of conditional distributions of $A$ given $B$.
(ii) Suppose $A$ and $B$ are independent given $C$, i.e., $A \perp B \mid C$. Establish that the family $(P_{A|C,\omega_C} \mid (\omega_B, \omega_C) \in \Omega_{B \dot\cup C})$ is a family $P_{A|B \dot\cup C}$ of conditional probabilities of $A$ given $B \dot\cup C$.

Problem 2: Let $P = f \cdot \mu$ be a probability measure on $\Omega$. Establish (CI1)-(CI5) from the lecture notes, i.e., establish
(CI1) For $A, B, C \subseteq V$ pairwise disjoint: $A \perp B \mid C$ if and only if $B \perp A \mid C$.
(CI2) For $A, B, C \subseteq V$ pairwise disjoint and $A_0 \subseteq A$: $A \perp B \mid C$ implies that $A_0 \perp B \mid C$.
(CI3) For $A, B, C \subseteq V$ pairwise disjoint and $A = A_0 \dot\cup A_1$: $A \perp B \mid C$ implies that $A_0 \perp B \mid C \cup A_1$.
(CI4) For $A, B, C, D \subseteq V$ pairwise disjoint: $A \perp B \mid C$ and $A \perp D \mid B \dot\cup C$ implies that $A \perp B \dot\cup D \mid C$.
(CI5) For $A, B, C \subseteq V$ pairwise disjoint: $A \perp B \mid C$ and $A \perp C \mid B$ implies that $A \perp B \dot\cup C$.

Problem 3: Let $P = f \cdot \mu$ be a probability measure on $\Omega$ and let $A, B, C, D \subseteq V$ be pairwise disjoint. Establish the generalization of (CI5):
(CI5*) If $A \perp B \mid (C \dot\cup D)$ and $A \perp C \mid (B \dot\cup D)$ then $A \perp (B \dot\cup C) \mid D$.

Problem 4: As mentioned in the lecture notes, (CI5) (and (CI5*)) does not hold in general for all probability measures $P \in \mathcal{P}(\Omega)$. Consider the simple case $V = \{a, b, c\}$, $\Omega_v = \{0, 1\}$, $v = a, b, c$, i.e., $\Omega = \{0, 1\}^{\{a,b,c\}} \equiv \{0, 1\}^3$, and $P$ given by $P(\{(1, 1, 1)\}) = P(\{(0, 0, 0)\}) = \frac{1}{2}$, and note that $P$ does not have a positive density wrt a product measure on $\{0, 1\}^{\{a,b,c\}}$. Establish that $P$ does not satisfy (CI5) from Problem 2. Hint: use the three disjoint subsets $A = \{a\}$, $B = \{b\}$, and $C = \{c\}$ of $V = \{a, b, c\}$.

Problem 5: Consider the acyclic mixed graph (AMG) depicted below and known as one of the examples from the lectures:

[Figure: acyclic mixed graph on the vertices a, b, c, d, e, f, g, h, i, j, k; the edges are not recoverable from this copy.]

(i) Find for each vertex $v \in V \equiv \{a, b, c, d, e, f, g, h, i, j, k\}$ the neighbors $\mathrm{nb}(v)$, the parents $\mathrm{pa}(v)$, the ancestors $\mathrm{an}(v)$, the descendants $\mathrm{de}(v)$, and the non-descendants $\mathrm{nd}(v)$ of $v$. (Fill in the table below.)

vertex | a | b | c | d | e | f | g | h | i | j | k
-------+---+---+---+---+---+---+---+---+---+---+---
nb(v)  |   |   |   |   |   |   |   |   |   |   |
pa(v)  |   |   |   |   |   |   |   |   |   |   |
an(v)  |   |   |   |   |   |   |   |   |   |   |
de(v)  |   |   |   |   |   |   |   |   |   |   |
nd(v)  |   |   |   |   |   |   |   |   |   |   |

(ii) Which of the subsets $\mathrm{an}(v)$, $v = a, b, c, d, e, f, g, h, i, j, k$, are ancestral subsets?
(iii) Find for each of the following five subsets of $\{a, b, c, d, e, f, g, h, i, j, k\}$ the smallest ancestral subset containing the subset. The five subsets: $\{a\}$, $\{i\}$, $\{k\}$, $\{i, h\}$, and $\{i, k\}$.

Problem 6: Let $V \equiv (V, E)$ be an acyclic directed graph (ADG).
(i) Establish that all the subsets $\mathrm{an}(v)$, $v \in V$, are ancestral subsets of $V$.
(ii) Suppose that $m \in V$ has no children, i.e., $\mathrm{ch}(m) = \emptyset$. (We say that $m$ is a terminal vertex.) Establish that $V_m := V \setminus \{m\}$ is an ancestral subset of $V$.
(iii) Suppose that $P \in \mathcal{P}(\Omega)$ with $\Omega = \times(\Omega_v \mid v \in V)$ satisfies the Markov conditions wrt. $V$. The subgraph induced by any subset of $V$ is of course again an ADG. Verify that the marginal distribution $P_{V_m}$ of $P$ on $V_m$ satisfies the Markov conditions with respect to $V_m$, the subgraph induced by the subset $V_m$.
(iv) The generalization of (iii). Let $A \subseteq V$ be an ancestral subset. Verify that the marginal distribution $P_A$ of $P$ on $A$ satisfies the Markov conditions with respect to $V_A$, the subgraph induced by the subset $A$.

GRADING: 15 points for each of the six problems for a possible total of 90 points. You can thus earn up to 40 points more than the scheduled 50 points for each homework/homeproject assignment.
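
For readers who want to sanity-check their understanding of the notation on a finite example, the factorization criterion (3) can be tested numerically. The sketch below is not part of the assignment and is only one possible way to do this. It assumes a finite $\Omega$ with counting measure as $\mu$, so that densities are point masses, and it uses the cross-multiplied form $f_{A \dot\cup B \dot\cup C} \cdot f_C = f_{A \dot\cup C} \cdot f_{B \dot\cup C}$ so that no division is needed. The helper names `marginal` and `is_cond_indep`, and the example distribution (a three-step Markov chain with made-up transition probabilities), are hypothetical choices for illustration.

```python
from itertools import product


def marginal(p, idx):
    """Marginal point masses f_A for the coordinates listed in idx.

    p maps outcome tuples omega to probabilities (counting measure assumed).
    """
    out = {}
    for omega, mass in p.items():
        key = tuple(omega[i] for i in idx)
        out[key] = out.get(key, 0.0) + mass
    return out


def is_cond_indep(p, A, B, C, tol=1e-12):
    """Check A _|_ B | C via f_ABC * f_C == f_AC * f_BC at every point.

    A, B, C are disjoint lists of coordinate positions; C may be empty,
    in which case this reduces to the product criterion (1).
    """
    A, B, C = list(A), list(B), list(C)
    f_abc = marginal(p, A + B + C)
    f_ac = marginal(p, A + C)
    f_bc = marginal(p, B + C)
    f_c = marginal(p, C)
    # Points outside the marginal supports give 0 == 0, so they can be skipped.
    for a, b, c in product(marginal(p, A), marginal(p, B), f_c):
        lhs = f_abc.get(a + b + c, 0.0) * f_c[c]
        rhs = f_ac.get(a + c, 0.0) * f_bc.get(b + c, 0.0)
        if abs(lhs - rhs) > tol:
            return False
    return True


# Hypothetical example: a Markov chain X0 -> X1 -> X2 on {0, 1}^3 with a
# positive density; the transition probabilities are made up.
stay01, stay12 = 0.8, 0.7
p = {}
for x0, x1, x2 in product((0, 1), repeat=3):
    p01 = stay01 if x1 == x0 else 1 - stay01
    p12 = stay12 if x2 == x1 else 1 - stay12
    p[(x0, x1, x2)] = 0.5 * p01 * p12

print(is_cond_indep(p, [0], [2], [1]))  # True: X0 and X2 are independent given X1
print(is_cond_indep(p, [0], [2], []))   # False: X0 and X2 are not marginally independent
```

Because the check multiplies through by $f_C$ instead of dividing, the same function also accepts distributions whose point masses vanish on part of $\Omega$.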