Topic Outline
1) Motivation
2) Representing/Modeling Causal Systems
3) Estimation and Updating
4) Model Search
5) Linear Latent Variable Models
6) Case Study: fMRI

Discovering Pure Measurement Models
Richard Scheines, Carnegie Mellon University
Ricardo Silva*, University College London
Clark Glymour and Peter Spirtes, Carnegie Mellon University

Outline
1. Measurement Models & Causal Inference
2. Strategies for Finding a Pure Measurement Model
3. Purify
4. MIMbuild
5. Build Pure Clusters
6. Examples
   a) Religious Coping
   b) Test Anxiety

Goals:
• What Latents are out there?
• Causal Relationships Among Latent Constructs
[Figure: candidate causal structures between Relationship Satisfaction and Depression]

Needed: Ability to detect conditional independence among latent variables

Lead and IQ
[Figure: Parental Resources as a common cause of Lead Exposure and IQ, with error terms on Lead Exposure and IQ]
Lead _||_ IQ | PR
$PR \sim N(\mu = 10, \sigma = 3)$
$Lead = 15 - 0.5 \cdot PR + \varepsilon_2$, with $\varepsilon_2 \sim N(\mu = 0, \sigma = 1.635)$
$IQ = 90 + 1 \cdot PR + \varepsilon_3$, with $\varepsilon_3 \sim N(\mu = 0, \sigma = 15)$

Pseudorandom sample: N = 2,000
Regression of IQ on Lead, PR
Independent Variable   Coefficient Estimate   p-value   Screened-off at .05?
PR                     0.98                   0.000     No
Lead                   -0.088                 0.378     Yes

Measuring the Confounder
[Figure: Parental Resources measured by indicators X1, X2, X3 (each with an error term), with Lead Exposure and IQ as its effects]
$X_1 = \gamma_1 \cdot PR + \varepsilon_1$
$X_2 = \gamma_2 \cdot PR + \varepsilon_2$
$X_3 = \gamma_3 \cdot PR + \varepsilon_3$
PR_Scale = (X1 + X2 + X3) / 3

Scales don't preserve conditional independence
[Figure: Parental Resources with indicators X1, X2, X3, Lead Exposure, and IQ]
PR_Scale = (X1 + X2 + X3) / 3
Independent Variable   Coefficient Estimate   p-value   Screened-off at .05?
PR_Scale               0.290                  0.000     No
Lead                   -0.423                 0.000     No

Indicators Don't Preserve Conditional Independence
[Figure: Parental Resources with indicators X1, X2, X3, Lead Exposure, and IQ]
Regress IQ on: Lead, X1, X2, X3
Independent Variable   Coefficient Estimate   p-value   Screened-off at .05?
X1                     0.22                   0.002     No
X2                     0.45                   0.000     No
X3                     0.18                   0.013     No
Lead                   -0.414                 0.000     No

Structural Equation Models Work
[Figure: structural equation model over Parental Resources (indicators X1, X2, X3), Lead Exposure, and IQ, with coefficient $\beta$]
• $E(\hat{\beta}) = 0$
• $\hat{\beta} = .07$ (p-value = .499)
• Lead and IQ "screened off" by PR

Local Independence / Pure Measurement Models
• For every measured item xi: xi _||_ xj | latent parent of xi
[Figure: latents F1, F2, F3 with items x1-x12]

Local Independence Desirable
[Figure: Truth and Specified Model over three latents with items x1-x4, y1-y4, z1-z4, and coefficient $\beta_{31}$]
$E(\hat{\beta}_{31})$ [relation symbol missing] $0$

Correct Specification Crucial
[Figure: Truth with latents F1, F2, F3, F4 and items x1-x4, y1-y4, z1-z4; Specified Model with three latents and coefficient $\beta_{31}$]
$E(\hat{\beta}_{31})$ [relation symbol missing] $0$

Strategies
• Find a Locally Independent Measurement Model
• Correctly specify the measurement model (MM), including deviations from Local Independence

Correctly Specify Deviations from Local Independence
[Figure: Truth with latents F1, F2, F3, F4 and items x1-x4, y1-y4, z1-z4; Specified Model with three latents, error terms shown on x4 and z4, and coefficient $\beta_{31}$]
$E(\hat{\beta}_{31})$ [relation symbol missing] $0$

Correctly Specifying Deviations from Local Independence is Often Very Hard
[Figure: Truth with latents F1-F6 and items x1-x4, y1-y4, z1-z4]

Finding Pure Measurement Models Much Easier
[Figure: Truth with latents F1-F6 and items x1-x4, y1-y4, z1-z4, shown next to the same Truth restricted to items x1, x2, x3, y1, y2, y3, y4, z3, z4]

Tetrad Constraints
• Fact: given a graph with this structure
[Figure: latent L with measured children W, X, Y, Z and loadings $\lambda_1, \lambda_2, \lambda_3, \lambda_4$]
$W = \lambda_1 L + \varepsilon_1$
$X = \lambda_2 L + \varepsilon_2$
$Y = \lambda_3 L + \varepsilon_3$
$Z = \lambda_4 L + \varepsilon_4$
• it follows that
$\mathrm{Cov}_{WX}\,\mathrm{Cov}_{YZ} = (\lambda_1 \lambda_2 \sigma_L^2)(\lambda_3 \lambda_4 \sigma_L^2) = (\lambda_1 \lambda_3 \sigma_L^2)(\lambda_2 \lambda_4 \sigma_L^2) = \mathrm{Cov}_{WY}\,\mathrm{Cov}_{XZ}$
Tetrad constraints: $\sigma_{WX}\sigma_{YZ} = \sigma_{WY}\sigma_{XZ} = \sigma_{WZ}\sigma_{XY}$

Early Progenitors
Charles Spearman (1904)
Statistical Constraints and Measurement Model Structure
[Figure: latent g with indicators m1, m2, r1, r2]
$r_{m1} \cdot r_{r1} = r_{m2} \cdot r_{r2}$

Impurities/Deviations from Local Independence defeat tetrad constraints selectively
[Figure: two true models over items x1-x4, one with the single latent F1 and one with F1 plus an additional latent F5]
Tetrad constraints listed under each model:
$r_{x1,x2} \cdot r_{x3,x4} = r_{x1,x3} \cdot r_{x2,x4}$
$r_{x1,x2} \cdot r_{x3,x4} = r_{x1,x4} \cdot r_{x2,x3}$
$r_{x1,x3} \cdot r_{x2,x4} = r_{x1,x4} \cdot r_{x2,x3}$

Purify
[Figure: True Model with latents F1, F2, F3 plus an additional latent F over items x1-x4, y1-y4, z1-z4; Initially Specified Measurement Model with latents F1, F2, F3 over the same items]

Purify
Iteratively remove the item whose removal most improves measurement model fit (tetrads or $\chi^2$); stop when confirmatory fit is acceptable.
[Figure: two successive steps, "Remove x4" and then "Remove z2"]

Purify
Detectably Pure Subset of Items
Detectably Pure Measurement Model
[Figure: latents F1, F2, F3, F with items x1, x2, x3, y1, y2, y3, y4, z1, z3, z4]

How a pure measurement model is useful
[Figure: latents F1, F2, F3, F with items x1, x2, x3, y1, y2, y3, y4, z1, z3, z4]
1. Consistently estimate covariances/correlations among latents, then test conditional independence with the estimated latent correlations
2. Test for conditional independence among latents directly

2. Test conditional independence relations among latents directly
Question: L1 _||_ L2 | {Q1, Q2, ..., Qn}
[Figure: structural model with coefficient $\beta_{21}$]
$\beta_{21} = 0$ corresponds to L1 _||_ L2 | {Q1, Q2, ..., Qn}

MIMbuild
Input:
- Purified Measurement Model
- Covariance matrix over set of pure items
MIMbuild: PC algorithm with independence tests performed directly on latent variables
Output: Equivalence class of structural models over the latent variables

Purify & MIMbuild

Goal 2: What Latents are out there?
• How should they be measured?

Latents and the clustering of items they measure imply tetrad constraints differentially
[Figure: Models 1-4, alternative latent structures (latents F1, F2, F3) over items x1-x6]

Build Pure Clusters (BPC)
Input:
- Covariance matrix over set of original items
BPC:
1) Cluster (complicated boolean combinations of tetrads)
2) Purify
Output: Equivalence class of measurement models over a pure subset of original items

Build Pure Clusters
Qualitative Assumptions
1. Two types of nodes: measured (M) and latent (L)
2.
   M does not cause L (measured don't cause latents)
3. Each m ∈ M measures (is a direct effect of) at least one l ∈ L
4. No cycles involving M
Quantitative Assumptions:
1. Each m ∈ M is a linear function of its parents plus noise
2. P(L) has second moments, positive variances, and no deterministic relations

Build Pure Clusters
Output - provably reliable (pointwise consistent): Equivalence class of measurement models over a pure subset of M
For example:
[Figure: True Model with latents L1, L2, L3 and items m1-m11; Output with latents L1, L2, L3 over a pure subset of the items]

Build Pure Clusters
Measurement models in the equivalence class are at most refinements, but never coarsenings or permuted clusterings.
[Figure: an Output model clustering items m1-m9 under L1, L2, L3, alongside alternative clusterings of the same items, one of which adds a latent L4]

Build Pure Clusters
Algorithm Sketch:
1. Use particular rank (tetrad) constraints on the measured correlations to find pairs of items mj, mk that do NOT share a single latent parent
2. Add a latent for each subset S of M such that no pair in S was found NOT to share a latent parent in step 1
3. Purify
4. Remove latents with no children

Build Pure Clusters + MIMbuild

Case Studies
Stress, Depression, and Religion (Lee, 2004)
Test Anxiety (Bartholomew, 2002)

Case Study: Stress, Depression, and Religion
Masters Students (N = 127)
61-item survey (Likert Scale)
• Stress: St1 - St21
• Depression: D1 - D20
• Religious Coping: C1 - C20
Specified Model
[Figure: latents Stress, Depression, and Coping, each measured by its survey items, with edge signs +, -, +]
p = 0.00

Case Study: Stress, Depression, and Religion
Build Pure Clusters
[Figure: Stress measured by St3, St4, St16, St18, St20; Coping measured by C9, C12, C14, C15; Depression measured by Dep9, Dep13, Dep19]

Case Study: Stress, Depression, and Religion
Assume Stress temporally prior; MIMbuild to find Latent Structure:
[Figure: latent structure over Stress, Depression, and Coping with the items above and edge signs +, +]
p = 0.28

Case Study: Test Anxiety
Bartholomew and Knott (1999), Latent variable models and factor analysis
12th Grade Males in British Columbia (N = 335)
20-item survey (Likert Scale items): X1 - X20
Exploratory Factor Analysis:
[Figure: factors Emotionality and Worry over items X2, X3, X4, X5, X6, X7, X8, X9, X10, X14, X15, X16, X17, X18, X20]

Case Study: Test Anxiety
Build Pure Clusters:
[Figure: latents Cares About Achieving, Emotionality, and Self-Defeating over items X2, X3, X5, X6, X7, X8, X9, X10, X11, X14, X16, X18]

Case Study: Test Anxiety
Exploratory Factor Analysis:
[Figure: factors Emotionality and Worry over items X2, X3, X4, X5, X6, X7, X8, X9, X10, X14, X15, X16, X17, X18, X20]
p-value = 0.00
Build Pure Clusters:
[Figure: latents Worries About Achieving, Emotionality, and Self-Defeating over items X2, X3, X5, X6, X7, X8, X9, X10, X11, X14, X16, X18]
p-value = 0.47

Case Study: Test Anxiety
MIMbuild
[Figure: latent structure over Worries About Achieving, Emotionality, and Self-Defeating with items X2, X3, X5, X6, X7, X8, X9, X10, X11, X14, X16, X18]
p = .43
Scales: No Independencies or Conditional Independencies
[Figure: Worries About Achieving-Scale, Emotionality-Scale, Self-Defeating]
Uninformative

Limitations
• In simulation studies, requires large sample sizes to be really reliable (~ 400-500).
• 2 pure indicators must exist for a latent to be discovered and included.
• Moderately computationally intensive ($O(n^6)$).
• No error probabilities.

Open Questions/Projects
• IRT models?
• Bi-factor model extensions?
• Appropriate incorporation of background knowledge

References
• Tetrad: www.phil.cmu.edu/projects/tetrad_download
• Spirtes, P., Glymour, C., Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition, MIT Press.
• Pearl, J. (2000).
  Causality: Models, Reasoning, and Inference, Cambridge University Press.
• Silva, R., Glymour, C., Scheines, R., and Spirtes, P. (2006). "Learning the Structure of Linear Latent Variable Models," Journal of Machine Learning Research, 7, 191-246.
• Silva, R., Scheines, R., Glymour, C., and Spirtes, P. (2003). "Learning Measurement Models for Unobserved Variables," in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence, U. Kjaerulff and C. Meek, eds., Morgan Kaufmann.
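
Worked sketch: tetrad constraints and impurities. The Tetrad Constraints slide states that four indicators of a single latent satisfy three vanishing tetrad differences, and the impurities slide says that deviations from local independence defeat them selectively. The following is a minimal sketch of that claim on population covariances, not the Tetrad implementation. The loadings, the latent variance, the unit error variances, the extra 0.4 covariance between the third and fourth indicators, and the function names `one_factor_cov` and `tetrad_differences` are all hypothetical choices made for illustration.

```python
def one_factor_cov(loadings, latent_var=1.0, error_vars=None):
    """Population covariance implied by X_i = loading_i * L + error_i."""
    n = len(loadings)
    if error_vars is None:
        error_vars = [1.0] * n  # assumed unit error variances
    cov = [[loadings[i] * loadings[j] * latent_var for j in range(n)]
           for i in range(n)]
    for i in range(n):
        cov[i][i] += error_vars[i]  # independent errors only affect the diagonal
    return cov


def tetrad_differences(cov, i, j, k, l):
    """The three tetrad differences for the foursome (i, j, k, l)."""
    t1 = cov[i][j] * cov[k][l] - cov[i][k] * cov[j][l]
    t2 = cov[i][j] * cov[k][l] - cov[i][l] * cov[j][k]
    t3 = cov[i][k] * cov[j][l] - cov[i][l] * cov[j][k]
    return t1, t2, t3


# Hypothetical loadings for W, X, Y, Z on a single latent L.
loadings = [0.9, 0.7, 1.2, 0.5]
pure = one_factor_cov(loadings, latent_var=2.0)
pure_tetrads = tetrad_differences(pure, 0, 1, 2, 3)

# Impurity (assumed): Y and Z share an extra source of covariance,
# for example a second latent that is a parent of both.
impure = [row[:] for row in pure]
impure[2][3] += 0.4
impure[3][2] += 0.4
impure_tetrads = tetrad_differences(impure, 0, 1, 2, 3)

print("pure:  ", pure_tetrads)
print("impure:", impure_tetrads)
```

In the pure model all three differences vanish up to floating-point error. In the impure model the two differences that involve the Y-Z covariance become nonzero, while the third, which does not involve it, still vanishes. This is the sense in which an impurity defeats the constraints selectively.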
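
Worked sketch: why a scale does not screen off Lead from IQ. The Lead and IQ slides report sample regressions; the same point can be seen on population covariances computed from the stated linear model. This is a sketch under assumptions: the slides do not give the indicator loadings or indicator error variances, so the values below (all loadings equal to 1, indicator error standard deviation 3) are hypothetical, as is the helper name `lead_coefficient`. The resulting numbers therefore need not match the estimates in the slide tables.

```python
# Population (co)variances implied by the Lead and IQ slide.
var_pr = 3.0 ** 2                       # PR has standard deviation 3
b_lead, b_iq = -0.5, 1.0                # Lead = 15 - 0.5*PR + e2, IQ = 90 + 1*PR + e3
var_lead = b_lead ** 2 * var_pr + 1.635 ** 2
cov_iq_lead = b_iq * b_lead * var_pr    # Lead and IQ covary only through PR
cov_lead_pr = b_lead * var_pr
cov_iq_pr = b_iq * var_pr

# ASSUMPTION (not in the slides): loadings of 1 and indicator error sd of 3.
gammas = [1.0, 1.0, 1.0]
indicator_error_sd = 3.0
g_bar = sum(gammas) / len(gammas)
# PR_Scale = (X1 + X2 + X3) / 3 = g_bar * PR + mean of three independent errors
var_scale = g_bar ** 2 * var_pr + indicator_error_sd ** 2 / len(gammas)
cov_lead_scale = g_bar * cov_lead_pr
cov_iq_scale = g_bar * cov_iq_pr


def lead_coefficient(var_z, cov_lead_z, cov_iq_z):
    """Population coefficient on Lead when IQ is regressed on Lead and Z."""
    numerator = cov_iq_lead * var_z - cov_iq_z * cov_lead_z
    denominator = var_lead * var_z - cov_lead_z ** 2
    return numerator / denominator


beta_given_pr = lead_coefficient(var_pr, cov_lead_pr, cov_iq_pr)
beta_given_scale = lead_coefficient(var_scale, cov_lead_scale, cov_iq_scale)

print(f"Lead coefficient controlling for PR:       {beta_given_pr:.3f}")
print(f"Lead coefficient controlling for PR_Scale: {beta_given_scale:.3f}")
```

Controlling for the true confounder PR gives a Lead coefficient of zero, while controlling for the noisy scale leaves a negative coefficient. The measurement error in the scale leaves part of the confounding unadjusted, which is why the scale fails to preserve the conditional independence.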