#### Transcript Z - UCLA

CAUSAL REASONING FOR DECISION AIDING SYSTEMS
COGNITIVE SYSTEMS LABORATORY, UCLA
Judea Pearl, Mark Hopkins, Blai Bonet, Chen Avin, Ilya Shpitser

PRESENTATIONS

- Judea Pearl: Robustness of Causal Claims
- Ilya Shpitser and Chen Avin: Experimental Testability of Counterfactuals
- Blai Bonet: Logic-based Inference on Bayes Networks
- Mark Hopkins: Inference using Instantiations
- Chen Avin: Inference in Sensor Networks
- Blai Bonet: Report from Probabilistic Planning Competition

FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES

Probability and statistics deal with static relations.
[Diagram: Data → Statistics → joint distribution → Probability → inferences from passive observations]

Causal analysis deals with changes (dynamics).
[Diagram: Data, causal assumptions, and experiments → Causal Model → 1. Effects of interventions, 2. Causes of effects, 3. Explanations]

TYPICAL CAUSAL MODEL

[Diagram: causal model over variables X, Y, Z, labeled INPUT and OUTPUT]

TYPICAL CLAIMS

1. Effects of potential interventions
2. Claims about attribution (responsibility)
3. Claims about direct and indirect effects
4. Claims about explanations

ROBUSTNESS: MOTIVATION

[Diagram: x (Smoking) → y (Cancer) with coefficient $\alpha$; u (Genetic Factors, unobserved) affects both x and y]

The effect of smoking on cancer is, in general, non-identifiable (from observational studies).
In linear systems: $y = \alpha x + \varepsilon$; $\alpha$ is non-identifiable.

ROBUSTNESS: MOTIVATION

[Diagram: z (Price of Cigarettes) → x (Smoking) with coefficient $\beta$; x → y (Cancer) with coefficient $\alpha$; u (Genetic Factors, unobserved) affects both x and y]

Z: instrumental variable; $\mathrm{cov}(z, u) = 0$.
$\alpha$ is identifiable: $R_{yz} = \alpha\beta$, $R_{xz} = \beta$, so $\alpha = R_{yz} / R_{xz}$.

ROBUSTNESS: MOTIVATION

[Diagram: z (Price of Cigarettes) → x (Smoking) with coefficient $\beta$; x → y (Cancer) with coefficient $\alpha$; u (Genetic Factors, unobserved) affects both x and y]

Problem with Instrumental Variables: The model may be wrong!
Equations under this model: $R_{yz} = \alpha\beta$, $\alpha = R_{yz} / R_{xz}$.

ROBUSTNESS: MOTIVATION

[Diagram: z1 (Price of Cigarettes) and z2 (Peer Pressure) point into x (Smoking); x → y (Cancer); u (Genetic Factors, unobserved) affects both x and y; edge coefficients $\beta$, $\alpha$, $\gamma$]

Solution: Invoke several instruments.
$\alpha_1 = R_{yz_1} / R_{xz_1}$, $\alpha_2 = R_{yz_2} / R_{xz_2}$
Surprise: $\alpha_1 = \alpha_2$ implies the model is likely correct.

ROBUSTNESS: MOTIVATION

[Diagram: as above, with further instruments z3 (Anti-smoking Legislation), ..., zn pointing into x]

Greater surprise: $\alpha_1 = \alpha_2 = \alpha_3 = \dots = \alpha_n = \theta$
Claim $\alpha = \theta$ is highly likely to be correct.

ROBUSTNESS: MOTIVATION

[Diagram: x (Smoking) → y (Cancer) with coefficient $\alpha$; u (Genetic Factors, unobserved) affects both x and y; y → s (Symptom)]

Symptoms do not act as instruments; $\alpha$ remains non-identifiable.
Why? Taking a noisy measurement (s) of an observed variable (y) cannot add new information.

ROBUSTNESS: MOTIVATION

[Diagram: as above, with symptoms S1, S2, ..., Sn attached to y]

Adding many symptoms does not help. $\alpha$ remains non-identifiable.

ROBUSTNESS: MOTIVATION

[Diagram: edge x → y with coefficient $\alpha$ inside a general graph]

Given a parameter $\alpha$ in a general graph, find if $\alpha$ can evoke an equality surprise $\alpha_1 = \alpha_2 = \dots = \alpha_n$ associated with several independent estimands of $\alpha$.
Formulate: surprise, over-identification, independence.
Robustness: the degree to which $\alpha$ is robust to violations of model assumptions.

ROBUSTNESS: FORMULATION

Bad attempt: Parameter $\alpha$ is robust (over-identified) if $\alpha = f_1(\Sigma)$ and $\alpha = f_2(\Sigma)$, where $f_1$, $f_2$ are two distinct functions ($\Sigma$: the observed covariance matrix).
If the model induces a constraint $g(\Sigma) = 0$, then $\alpha = f(\Sigma) + t_1[g(\Sigma)] = f(\Sigma) + t_2[g(\Sigma)]$, and the functions $f(\Sigma) + t_i[g(\Sigma)]$ are distinct.

ROBUSTNESS: FORMULATION

[Diagram: chain x → y → z with coefficients b (on x → y) and c (on y → z); error terms $e_x$, $e_y$, $e_z$]

Model: $x = e_x$, $y = bx + e_y$, $z = cy + e_z$
Regression coefficients: $R_{yx} = b$, $R_{zx} = bc$, $R_{zy} = c$
(b) $b = R_{yx}$; $b = R_{zx} / R_{zy}$
(c) $c = R_{zy}$; $c = R_{zx} / R_{yx}$
Constraint: $R_{zx} = R_{yx} R_{zy}$
y → z is irrelevant to the derivation of b.

RELEVANCE: FORMULATION

Definition 8. Let A be an assumption embodied in model M, and p a parameter in M. A is said to be relevant to p if and only if there exists a set of assumptions S in M such that S and A sustain the identification of p but S alone does not sustain such identification.
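The several-instruments slides above can be illustrated with a small simulation. This is a sketch, not part of the presentation: the coefficient values, the Gaussian noise model, the sample size, and the names `cov`, `simulate`, and `iv_ratio` are all assumptions made for illustration. It draws data from a linear smoking model with two instruments and an unobserved confounder u, then compares the naive regression slope of y on x with the two ratio estimands $R_{yz_1}/R_{xz_1}$ and $R_{yz_2}/R_{xz_2}$.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


def simulate(n=50_000, alpha=0.5, beta=1.0, gamma=0.8, seed=0):
    """Draw n samples from an assumed linear model:
         x = beta*z1 + gamma*z2 + u + noise
         y = alpha*x + u + noise
    u is the unobserved confounder; z1 and z2 are independent of u.
    All numeric values here are illustrative assumptions."""
    rng = random.Random(seed)
    z1, z2, x, y = [], [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)
        zi1 = rng.gauss(0, 1)
        zi2 = rng.gauss(0, 1)
        xi = beta * zi1 + gamma * zi2 + u + rng.gauss(0, 1)
        yi = alpha * xi + u + rng.gauss(0, 1)
        z1.append(zi1)
        z2.append(zi2)
        x.append(xi)
        y.append(yi)
    return z1, z2, x, y


def iv_ratio(y, x, z):
    """Ratio estimand R_yz / R_xz; the var(z) terms cancel,
    leaving cov(y, z) / cov(x, z)."""
    return cov(y, z) / cov(x, z)


z1, z2, x, y = simulate()
naive = cov(y, x) / cov(x, x)   # regression slope of y on x, biased by u
alpha1 = iv_ratio(y, x, z1)     # estimand based on the first instrument
alpha2 = iv_ratio(y, x, z2)     # estimand based on the second instrument
print(f"naive slope: {naive:.3f}")
print(f"alpha1: {alpha1:.3f}  alpha2: {alpha2:.3f}")
```

With these assumed settings the two ratio estimates should land close to each other and to the true coefficient 0.5, while the naive slope is pulled away from it by the confounder. Agreement of the two ratios is the "surprise" the slides describe; it would generally fail if one of the instruments also affected y directly.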
Theorem 2. An assumption A is relevant to p if and only if A is a member of a minimal set of assumptions sufficient for identifying p.

ROBUSTNESS: FORMULATION

Definition 5 (Degree of over-identification). A parameter p (of model M) is identified to degree k (read: k-identified) if there are k minimal sets of assumptions each yielding a distinct estimand of p.

ROBUSTNESS: FORMULATION

[Diagram: chain x → y → z with coefficients b and c]

Minimal assumption sets for c.
[Diagrams: graphs G1, G2, G3 over x, y, z, with the edge labeled c marked]

Minimal assumption sets for b.
[Diagram: graph over x, y, z, with the edge x → y labeled b]

FROM MINIMAL ASSUMPTION SETS TO MAXIMAL EDGE SUPERGRAPHS
FROM PARAMETERS TO CLAIMS

Definition. A claim C is identified to degree k in model M (graph G), if there are k edge supergraphs of G that permit the identification of C, each yielding a distinct estimand.

e.g., Claim: (Total effect) $TE(x,z) = \theta$

[Diagrams: edge supergraphs over x, y, z, one per estimand]

$TE(x,z) = R_{zx}$
$TE(x,z) = R_{yx} R_{zy \cdot x}$

FROM MINIMAL ASSUMPTION SETS TO MAXIMAL EDGE SUPERGRAPHS
FROM PARAMETERS TO CLAIMS

e.g., Claim (nonparametric): (Total effect) $TE(x,z) = \theta$

[Diagrams: edge supergraphs over x, y, z, one per estimand]

$TE(x,z) = P(z \mid x)$
$TE(x,z) = \sum_{y} P(y \mid x) \sum_{x'} P(z \mid x', y) P(x')$

CONCLUSIONS

1. Formal definition of ROBUSTNESS of causal claims.
2. Graphical criteria and algorithms for computing the degree of robustness of a given causal claim.
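The two nonparametric estimands of the total effect can be compared on a concrete distribution. The sketch below is an illustration only, not taken from the presentation: it assumes a binary chain x → y → z with made-up conditional probabilities, and the names `p_x`, `p_y_given_x`, `p_z_given_y`, `joint`, `prob`, `estimand_direct`, and `estimand_two_step` are hypothetical. Both estimands are evaluated from the joint distribution alone.

```python
from itertools import product

# Assumed (illustrative) parameters of a binary chain x -> y -> z.
p_x = {0: 0.6, 1: 0.4}
p_y_given_x = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}    # indexed [x][y]
p_z_given_y = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.25, 1: 0.75}}  # indexed [y][z]

# Joint distribution P(x, y, z) implied by the chain.
joint = {
    (x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}

NAMES = ("x", "y", "z")


def prob(**fixed):
    """Marginal probability of a partial assignment, e.g. prob(x=1, z=1)."""
    return sum(
        p for values, p in joint.items()
        if all(values[NAMES.index(k)] == v for k, v in fixed.items())
    )


def estimand_direct(z, x):
    """First estimand: P(z | x)."""
    return prob(x=x, z=z) / prob(x=x)


def estimand_two_step(z, x):
    """Second estimand: sum_y P(y | x) * sum_x' P(z | x', y) * P(x')."""
    total = 0.0
    for y in (0, 1):
        p_y_x = prob(x=x, y=y) / prob(x=x)
        inner = sum(
            prob(x=xp, y=y, z=z) / prob(x=xp, y=y) * prob(x=xp)
            for xp in (0, 1)
        )
        total += p_y_x * inner
    return total


for x in (0, 1):
    d = estimand_direct(1, x)
    t = estimand_two_step(1, x)
    print(f"x={x}: P(z=1|x)={d:.4f}  two-step={t:.4f}")
```

Because the assumed data come from a pure chain, both expressions reduce to $\sum_y P(y \mid x) P(z \mid y)$ and agree. On data where x and z share an unobserved cause, or where x affects z other than through y, the two would generally differ, which is what makes their agreement informative.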