Tetrad: Machine Learning and Graphical Causal Models
Richard Scheines, Joe Ramsey, Peter Spirtes, Clark Glymour
Carnegie Mellon University

Goals
1) Convey rudiments of graphical causal models
2) Basic working knowledge of Tetrad IV

Tetrad IV: Complete Causal Modeling Tool

Tetrad
1) Main website: http://www.phil.cmu.edu/projects/tetrad/
2) Download site: http://www.phil.cmu.edu/projects/tetrad_download/
3) Data files: www.phil.cmu.edu/projects/tetrad_download/download/workshop/Data/

Topic Outline
1) Motivation
2) Representing/Modeling Causal Systems
3) Estimation and Updating
4) Model Search
5) Linear Latent Variable Models
6) Case Study: fMRI

Statistical Causal Models: Goals
1) Policy, Law, and Science: How can we use data to answer
   a) subjunctive questions (effects of future policy interventions),
   b) counterfactual questions (what would have happened had things been done differently; law), or
   c) scientific questions (what mechanisms run the world)?
2) Rumsfeld Problem: Do we know what we do and don’t know? Can we tell when there is or is not enough information in the data to answer causal questions?

Causal Inference Requires More than Probability
Prediction from Observation ≠ Prediction from Intervention
P(Lung Cancer 1960 = y | Tar-stained fingers 1950 = no) ≠ P(Lung Cancer 1960 = y | Tar-stained fingers 1950 set = no)
In general: P(Y=y | X=x, Z=z) ≠ P(Y=y | X_set=x, Z=z)
Causal Prediction vs. Statistical Prediction:
[Diagram. Labels: Non-experimental data (observational study); P(Y,X,Z); Background Knowledge; Causal Structure; P(Y=y | X=x, Z=z); P(Y=y | X_set=x, Z=z)]

Foundations of Causal Epistemology
Some causal structures can parameterize the same set of probability distributions, some cannot.
[Figure: four causal graphs over X, Y, Z, with distribution labels P1(X,Y,Z) and P2(X,Y,Z); the edges are not recoverable here]

Causal Search
Causal Search:
1. Find/compute all the causal models that are indistinguishable given background knowledge and data
2. Represent features common to all such models
Multiple Regression is often the wrong tool for Causal Search:
Example: Foreign Investment & Democracy

Foreign Investment
Does Foreign Investment in 3rd World Countries inhibit Democracy?
Timberlake, M. and Williams, K. (1984). Dependence, political exclusion, and government repression: Some cross-national evidence. American Sociological Review 49, 141-146.
N = 72
PO   degree of political exclusivity
CV   lack of civil liberties
EN   energy consumption per capita (economic development)
FI   level of foreign investment

Foreign Investment
Correlations
        fi       en       cv
po    -.175    -.480    0.868
fi             0.330    -.391
en                      -.430

Case Study 1: Foreign Investment
Regression Results
po = .227*fi - .176*en + .880*cv
Term   Coefficient   SE       t
fi        .227       (.058)    3.941
en       -.176       (.059)   -2.99
cv        .880       (.060)   14.6
Interpretation: foreign investment increases political repression

Case Study 1: Foreign Investment
Alternatives
[Figure: three path diagrams over En, FI, CV, and PO, including one labeled "Regression" and one labeled "Tetrad - FCI". Edge coefficients shown: .217, .88, -.176 in one diagram and .31, -.23, -.48, .86 in another; the edges themselves are not recoverable here.]
Fit: df = 2, χ² = 0.12, p-value = .94
There is no model with testable constraints (df > 0) in which FI has a positive effect on PO that is not rejected by the data.

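Worked check (not part of the original slides): the standardized regression coefficients on the Regression Results slide can be recovered from the correlation table alone by solving the normal equations R_xx b = r_xy. The sketch below assumes the correlation layout read off the Correlations slide (rows po, fi, en; columns fi, en, cv). The helper name solve_linear_system is illustrative and has nothing to do with Tetrad.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Correlations among the predictors, ordered (fi, en, cv).
R_XX = [
    [1.000, 0.330, -0.391],
    [0.330, 1.000, -0.430],
    [-0.391, -0.430, 1.000],
]
# Correlation of each predictor with po, in the same order.
R_XY = [-0.175, -0.480, 0.868]

coefficients = solve_linear_system(R_XX, R_XY)
for name, beta in zip(("fi", "en", "cv"), coefficients):
    print(f"{name}: {beta:+.3f}")
```

The solution agrees with the slide's .227, -.176, and .880 up to rounding of the published correlations. The regression arithmetic is therefore not in question. What the Alternatives slide disputes is the causal reading of the positive fi coefficient.
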
Outline
1) Motivation
2) Representing/Modeling Causal Systems
   1) Causal Graphs
   2) Standard Parametric Models
      1) Bayes Nets
      2) Structural Equation Models
   3) Other Parametric Models
      1) Generalized SEMs
      2) Time Lag models

Causal Graphs
Causal Graph G = {V,E}
Each edge X → Y represents a direct causal claim: X is a direct cause of Y relative to V
[Figure: two example graphs, one over Years of Education and Income, one over Years of Education, Skills and Knowledge, and Income]

Causal Graphs
Not Cause Complete: Omitted Causes [graph over Education, Income, Happiness]
Common Cause Complete: Omitted Common Causes [graph over Education, Income, Happiness]

Modeling Ideal Interventions
Interventions on the Effect
[Figure: panels "Pre-experimental System" and "Post" over Sweaters On and Room Temperature]

Modeling Ideal Interventions
Interventions on the Cause
[Figure: panels "Pre-experimental System" and "Post" over Sweaters On and Room Temperature]

Interventions & Causal Graphs
Model an ideal intervention by adding an “intervention” variable outside the original system as a direct cause of its target.
Pre-intervention graph: [graph over Education, Income, Taxes]
Intervene on Income
“Hard” Intervention: [graph over Education, Income, Taxes, with intervention variable I]
“Soft” Intervention: [graph over Education, Income, Taxes, with intervention variable I]

Tetrad Demo
Build and Save an acyclic causal graph:
1) with 3 measured variables, no latents
2) with at least 3 measured variables, and at least 1 latent

Parametric Models

Causal Bayes Networks
The Joint Distribution Factors According to the Causal Graph:
$P(V) = \prod_{X \in V} P(X \mid \mathrm{Direct\_causes}(X))$
[Graph over Smoking [0,1], Yellow Fingers [0,1], Lung Cancer [0,1]]
$P(S, YF, LC) = P(S)\, P(YF \mid S)\, P(LC \mid S)$

Causal Bayes Networks
$P(S)\, P(YF \mid S)\, P(LC \mid S) = f(\theta)$, with $\theta = \{\theta_1, \theta_2, \theta_3, \theta_4, \theta_5\}$
All variables binary [0,1]:
$P(S = 0) = \theta_1$                 $P(S = 1) = 1 - \theta_1$
$P(YF = 0 \mid S = 0) = \theta_2$     $P(YF = 1 \mid S = 0) = 1 - \theta_2$
$P(YF = 0 \mid S = 1) = \theta_3$     $P(YF = 1 \mid S = 1) = 1 - \theta_3$
$P(LC = 0 \mid S = 0) = \theta_4$     $P(LC = 1 \mid S = 0) = 1 - \theta_4$
$P(LC = 0 \mid S = 1) = \theta_5$     $P(LC = 1 \mid S = 1) = 1 - \theta_5$

Tetrad Demo

Structural Equation Models
Causal Graph: [graph over Education, Income, Longevity]
Structural Equations: For each variable $X \in V$, an assignment equation:
$X := f_X(\text{immediate-causes}(X), \varepsilon_X)$
Exogenous Distribution: Joint distribution over the exogenous variables: $P(\varepsilon)$

Linear Structural Equation Models
Causal Graph: [graph over Education, Income, Longevity]
Path diagram: [Education, Income, Longevity with path coefficients $\beta_1$, $\beta_2$ and error terms $\varepsilon_{Education}$, $\varepsilon_{Income}$, $\varepsilon_{Longevity}$]
Equations:
$Education := \varepsilon_{Education}$
$Income := \beta_1\, Education + \varepsilon_{Income}$
$Longevity := \beta_2\, Education + \varepsilon_{Longevity}$
Exogenous Distribution: $P(\varepsilon_{Education}, \varepsilon_{Income}, \varepsilon_{Longevity})$
- $\forall i \neq j:\ \varepsilon_i \perp \varepsilon_j$ (pairwise independence)
- no variance is zero
E.g. $(\varepsilon_{Education}, \varepsilon_{Income}, \varepsilon_{Longevity}) \sim N(0, \Sigma^2)$
- $\Sigma^2$ diagonal
- no variance is zero
Structural Equation Model: $V = BV + E$

Tetrad Demo
1) Interpret your causal graph with 3 measured variables with at least 2 parametric models:
   a) Bayes Parametric Model
   b) SEM Parametric Model
2) Interpret your other graph with a parametric model of your choice

Instantiated Models

Tetrad Demo
1) Instantiate at least one Bayes PM with a Bayes IM
2) Instantiate at least one SEM PM with a SEM IM
3) Instantiate at least one SEM PM with a Standardized SEM IM
4) Generate two data sets (N = 50, N = 5,000) for each

Generalized SEM
1) The Generalized SEM is a generalization of the linear SEM.
2) Allows for arbitrary connection functions
3) Allows for arbitrary distributions
4) Simulation from cyclic models is supported.

Hands On
1) Create a DAG.
2) Parameterize it as a Generalized SEM.
3) Open the Generalized SEM and select Apply Templates from the Tools menu.
4) Apply the default template to variables, which will make them all linear functions.
5) For errors, select a non-Gaussian distribution, such as U(0, 1).
6) Save.

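Sketch (not part of the original slides): outside the Tetrad interface, the hands-on steps amount to drawing each error term from the chosen distribution and pushing it through each variable's function in causal order. The Python below assumes a hypothetical DAG X → Y → Z with linear functions and U(0, 1) errors. The coefficients 0.5 and -1.5 and all function names are invented for illustration; this is not Tetrad's implementation.

```python
import random


def simulate_generalized_sem(n_samples, seed=0):
    """Draw samples from a hypothetical DAG X -> Y -> Z.

    Each variable is a function of its parents plus an error term.
    The functions here are linear and the errors are U(0, 1),
    i.e. non-Gaussian.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        e_x, e_y, e_z = rng.random(), rng.random(), rng.random()
        x = e_x                  # X has no parents
        y = 0.5 * x + e_y        # illustrative coefficient
        z = -1.5 * y + e_z       # illustrative coefficient
        rows.append((x, y, z))
    return rows


def column_mean(rows, index):
    """Sample mean of one column of the simulated data."""
    return sum(row[index] for row in rows) / len(rows)


data = simulate_generalized_sem(5000)
for index, name in enumerate(("X", "Y", "Z")):
    print(name, "sample mean:", round(column_mean(data, index), 3))
```

Under these assumptions the population means are E[X] = 0.5, E[Y] = 0.5(0.5) + 0.5 = 0.75, and E[Z] = -1.5(0.75) + 0.5 = -0.625, so the sample means at N = 5,000 should land close to those values. Swapping a linear line for any other function of the parents gives the "arbitrary connection functions" case.
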
Time Series Simulation (Hands On)
1) Tetrad includes support for time series simulations.
2) First, create a time series graph.
3) Then parameterize the time series graph as a SEM.
4) Then instantiate the SEM.
5) Then simulate data from the SEM Instantiated Model.

Time Series Simulation
• One can, e.g., calculate a vector auto-regression for the simulated data. (One can do this as well from time series data loaded in.)
   • Attach a data manipulation box to the data.
   • Select vector auto-regression.
• One can create staggered time series data.
   • Attach a data manipulation box.
   • Select create time series data.
   • Should give the time lag graph with some extra edges in the highest lag.

Estimation

Tetrad Demo
1) Estimate one Bayes PM for which you have an IM and data
2) Estimate one SEM PM for which you have an IM and data
3) Import data from charity.txt, then build and estimate two models on those data

[Figure: two candidate models, labeled Hypothesis 1 and Hypothesis 2]

Updating

Tetrad Demo
1) Pick one of your Bayes IMs
2) Find a variable X to update conditional on Y such that: the marginal on X changes when Y is passively observed = y, but does not change when Y is manipulated = y
3) Find a variable Z to update conditional on W such that: the marginal on Z changes when W is passively observed = w, and changes in exactly the same way when W is manipulated = w
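
Sketch (not part of the original slides): both updating exercises can be worked by hand on the Smoking, Yellow Fingers, Lung Cancer network from the Causal Bayes Networks slides. The Python below assumes the graph Smoking → Yellow Fingers, Smoking → Lung Cancer implied by the factorization P(S) P(YF | S) P(LC | S). Every probability value is made up for illustration, and a manipulation is modeled as a hard intervention that replaces the target's own factor with a point mass.

```python
from itertools import product

VARIABLES = ("S", "YF", "LC")

# Hypothetical parameters (not from the slides).
P_S1 = 0.3                            # P(S = 1)
P_YF1_GIVEN_S = {0: 0.1, 1: 0.8}      # P(YF = 1 | S = s)
P_LC1_GIVEN_S = {0: 0.05, 1: 0.20}    # P(LC = 1 | S = s)


def bernoulli(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(world, do):
    """Probability of one full assignment under hard interventions `do`."""
    factors = {
        "S": bernoulli(P_S1, world["S"]),
        "YF": bernoulli(P_YF1_GIVEN_S[world["S"]], world["YF"]),
        "LC": bernoulli(P_LC1_GIVEN_S[world["S"]], world["LC"]),
    }
    p = 1.0
    for name in VARIABLES:
        if name in do:
            # A manipulated variable no longer depends on its causes.
            p *= 1.0 if world[name] == do[name] else 0.0
        else:
            p *= factors[name]
    return p


def probability(target, observe=None, do=None):
    """P(target = 1 | observe) in the graph manipulated by `do`."""
    observe, do = observe or {}, do or {}
    numerator = denominator = 0.0
    for values in product((0, 1), repeat=len(VARIABLES)):
        world = dict(zip(VARIABLES, values))
        if any(world[k] != v for k, v in observe.items()):
            continue
        p = joint(world, do)
        denominator += p
        if world[target] == 1:
            numerator += p
    return numerator / denominator


print("P(LC=1)            =", round(probability("LC"), 4))
print("P(LC=1 | YF=1)     =", round(probability("LC", observe={"YF": 1}), 4))
print("P(LC=1 | YF set=1) =", round(probability("LC", do={"YF": 1}), 4))
print("P(LC=1 | S=1)      =", round(probability("LC", observe={"S": 1}), 4))
print("P(LC=1 | S set=1)  =", round(probability("LC", do={"S": 1}), 4))
```

With these made-up numbers, observing yellow fingers raises the probability of lung cancer from 0.095 to about 0.166, while setting yellow fingers leaves it at 0.095; that is the pattern asked for in exercise 2 (X = Lung Cancer, Y = Yellow Fingers). Observing and setting Smoking both give 0.2, since Smoking has no causes in this graph; that is the pattern asked for in exercise 3 (Z = Lung Cancer, W = Smoking).
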