CAUSAL INFERENCE AS A MACHINE LEARNING EXERCISE


COMMENTS ON
By Judea Pearl (UCLA)
From Hoover (2004), "Lost Causes"
(Annotated excerpts, one from p. 215. Annotations: "notation"; "1990's"; "Artificial Intelligence"; "Already unified"; "more?"; "additional?"; "(Statistical)"; "Commendable!"; "Already permitted".)
WHITE & CHALAK
PICTURE OF UNIFICATION
1950 - 2005
SEM
Neyman-Rubin
DAGs
2006

Settable System
MY PICTURE OF UNIFICATION
1920 - 1990
Informal SEM
Neyman-Rubin
Informal Diagrams

1990 - 2000
Formal SEM

Graphs

Complete
Neyman-Rubin
DAGs
2006 (W&C)
Multi-agent
extension
CAUSAL ANALYSIS
WITHOUT TEARS
TRADITIONAL STATISTICAL
INFERENCE PARADIGM
P (Joint Distribution) → Data → Inference → Q(P) (Aspects of P)
e.g.,
Infer whether customers who bought product A
would also buy product B.
Q = P(B|A)
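A minimal sketch of the statistical query Q = P(B|A), estimated as a ratio of frequencies. The purchase records and the helper name `conditional_probability` are hypothetical and serve only to illustrate that this quantity is an aspect of the joint distribution alone.

```python
from fractions import Fraction

# Hypothetical purchase records: each set lists the products one customer bought.
baskets = [
    {"A", "B"},
    {"A"},
    {"A", "B"},
    {"B"},
    {"A", "B"},
    set(),
]

def conditional_probability(records, target, given):
    """Estimate P(target | given) as a ratio of empirical frequencies."""
    with_given = [r for r in records if given in r]
    if not with_given:
        raise ValueError("conditioning event never occurs in the data")
    with_both = [r for r in with_given if target in r]
    return Fraction(len(with_both), len(with_given))

# Q = P(B | A): among customers who bought A, the fraction who also bought B.
q = conditional_probability(baskets, target="B", given="A")
print(q)
```

Nothing in this computation refers to how the data were generated, which is what separates it from the causal queries that follow.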
THE CAUSAL INFERENCE
PARADIGM
M (Data Generating Model) → P (Joint Distribution) → Data → Inference → Q(M) (Aspects of M)
Some Q(M) cannot be inferred from P.
e.g.,
Infer whether customers who bought product A
would still buy A if we were to double the price.
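The claim that some Q(M) cannot be inferred from P can be illustrated with a small sketch: two hypothetical models (not from the slides) that induce the same joint distribution over X and Y yet disagree about the effect of setting X. The model encoding (a dict with an evaluation order and structural functions) is an assumption made for this example.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Model A: X := U, Y := X   (X causes Y)
# Model B: Y := U, X := Y   (Y causes X)
# In both models U is a fair coin. Functions are evaluated in the listed order.
model_a = {"order": ["X", "Y"],
           "f": {"X": lambda v, u: u, "Y": lambda v, u: v["X"]}}
model_b = {"order": ["Y", "X"],
           "f": {"Y": lambda v, u: u, "X": lambda v, u: v["Y"]}}

def solve(model, u, do=None):
    """Solve the model for background value u, overriding variables listed in do."""
    do = do or {}
    v = {}
    for name in model["order"]:
        v[name] = do[name] if name in do else model["f"][name](v, u)
    return v

def distribution(model, do=None):
    """Distribution over (X, Y) obtained by summing P(u) over both values of U."""
    dist = {}
    for u in (0, 1):
        v = solve(model, u, do)
        key = (v["X"], v["Y"])
        dist[key] = dist.get(key, 0) + half
    return dist

def prob_y1(model, do=None):
    """P(Y = 1), optionally under an intervention."""
    return sum(p for (x, y), p in distribution(model, do).items() if y == 1)

same_joint = distribution(model_a) == distribution(model_b)
effect_a = prob_y1(model_a, do={"X": 1})
effect_b = prob_y1(model_b, do={"X": 1})
print(same_joint, effect_a, effect_b)
```

Both models agree on every observational query, so no amount of passive data distinguishes them; only the interventional query separates them.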
FROM STATISTICAL TO CAUSAL ANALYSIS:
1. THE DIFFERENCES
Probability and statistics deal with static relations:
Data → Statistics → joint distribution → Probability → inferences from passive observations
Causal analysis deals with changes (dynamics):
Data, causal assumptions, and experiments → Causal Model →
1. Effects of interventions
2. Causes of effects
3. Explanations
FAMILIAR CAUSAL MODEL
ORACLE FOR MANIPULATION
(Diagram: a model with variables X, Y, Z between INPUT and OUTPUT.)
WHY CAUSALITY NEEDS
SPECIAL MATHEMATICS
SEM Equations are Non-algebraic:
Process information: Y = 2X, X = 1
Static information: X = 1, Y = 2
Had X been 3, Y would be 6.
If we raise X to 3, Y would be 6.
Must “wipe out” X = 1.
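A minimal sketch of the point above, assuming the two equations are stored as assignable mechanisms. The names `solve` and `do_x` are hypothetical. Raising X to 3 is modelled by replacing the equation X = 1 rather than by adding X = 3 as a further constraint, which would be contradictory.

```python
def solve(equations):
    """Evaluate X first, then Y from X (the only causal order in this model)."""
    x = equations["X"]()
    y = equations["Y"](x)
    return {"X": x, "Y": y}

model = {
    "X": lambda: 1,        # X = 1
    "Y": lambda x: 2 * x,  # Y = 2X, read as an assignment to Y
}

def do_x(equations, value):
    """Return a copy of the model in which the equation for X is replaced by X = value."""
    mutilated = dict(equations)
    mutilated["X"] = lambda: value  # "wipe out" X = 1
    return mutilated

actual = solve(model)
raised = solve(do_x(model, 3))
print(actual, raised)
```

The original model is left untouched, so the actual situation and the hypothetical one can be compared side by side.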
CAUSAL MODELS AND
CAUSAL DIAGRAMS
Definition: A causal model is a 3-tuple
$M = \langle V, U, F \rangle$
with a mutilation operator $do(x): M \to M_x$, where:
(i) $V = \{V_1, \ldots, V_n\}$ endogenous variables,
(ii) $U = \{U_1, \ldots, U_m\}$ background variables,
(iii) $F$ = set of $n$ functions, $f_i : (V \setminus V_i) \times U \to V_i$,
$v_i = f_i(pa_i, u_i)$, $PA_i \subseteq V \setminus V_i$, $U_i \subseteq U$
Example:
$q = b_1 p + d_1 i + u_1$
$p = b_2 q + d_2 w + u_2$
(Diagram: nodes U1, I, W, Q, P, U2; $PA_Q$ marks the parents of Q.)
CAUSAL MODELS AND
MUTILATION
(iv) Mutilated model: $M_x = \langle U, V, F_x \rangle$, $X \subseteq V$, $x \in X$,
where $F_x = \{f_i : V_i \notin X\} \cup \{X = x\}$
(Replace all functions $f_i$ corresponding to X with the
constant functions X = x)
Example: a model M over endogenous variables (attributes) and background variables,
$q = b_1 p + d_1 i + u_1$
$p = b_2 q + d_2 w + u_2$
(Diagram of M: nodes U1, I, W, Q, P, U2.)
and the mutilated model $M_p$ for $do(P = p_0)$, in which the equation for p is replaced by a constant:
$q = b_1 p + d_1 i + u_1$
$p = p_0$
(Diagram of $M_p$: nodes U1, I, W, Q, P, U2, with P = p0.)
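The two-equation example can be sketched numerically. The function name `solve_market` and all coefficient values below are hypothetical, chosen only so the arithmetic is easy to follow. Without an intervention the two equations are solved simultaneously; under do(P = p0) the equation for p is discarded and q follows from its own equation.

```python
def solve_market(b1, d1, b2, d2, i, w, u1, u2, p0=None):
    """Solve q = b1*p + d1*i + u1 and p = b2*q + d2*w + u2.

    If p0 is given, the equation for p is replaced by p = p0 (the model M_p).
    Returns the pair (q, p).
    """
    a = d1 * i + u1  # terms of the q-equation that do not involve p
    c = d2 * w + u2  # terms of the p-equation that do not involve q
    if p0 is not None:
        p = p0
        q = b1 * p + a
        return q, p
    denom = 1 - b1 * b2
    if denom == 0:
        raise ValueError("no unique solution when b1 * b2 == 1")
    # Substituting p = b2*q + c into the q-equation gives q*(1 - b1*b2) = b1*c + a.
    q = (b1 * c + a) / denom
    p = b2 * q + c
    return q, p

# Hypothetical parameter values.
params = dict(b1=-0.5, d1=1.0, b2=0.5, d2=1.0, i=10.0, w=2.0, u1=0.0, u2=0.0)

q_eq, p_eq = solve_market(**params)          # solution of M
q_do, p_do = solve_market(**params, p0=4.0)  # solution of M_p with p0 = 4
print(q_eq, p_eq, q_do, p_do)
```

In the mutilated model w and u2 no longer influence p, which is the sense in which the intervention overrides the mechanism that normally determines the price.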
PROBABILISTIC
CAUSAL MODELS
Definition (Probabilistic Causal Model):
$\langle M, P(u) \rangle$
$P(u)$ is a probability assignment to the variables in U.
CAUSAL MODELS AND
COUNTERFACTUALS
Definition:
The sentence: “Y would be y (in situation u),
had X been x,” denoted $Y_x(u) = y$, means:
The solution for Y in a mutilated model $M_x$,
(i.e., the equations for X replaced by X = x)
and U=u, is equal to y.
• Joint probabilities of counterfactuals:
$P(Y_x = y, Z_w = z) = \sum_{u:\, Y_x(u) = y,\ Z_w(u) = z} P(u)$
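The definition above can be sketched for a small discrete model by enumerating every situation u, solving the mutilated model in each, and adding up P(u) where the counterfactual events hold. The model (three independent binary background variables, X := u0, Y := (X and u1) or u2) and every name in the code are hypothetical. The joint query evaluated at the end is the special case of the formula with Z = Y, w = 0.

```python
from fractions import Fraction
from itertools import product

# Hypothetical P(U_k = 1) for three independent binary background variables.
p_one = {"u0": Fraction(1, 2), "u1": Fraction(1, 2), "u2": Fraction(1, 4)}

def prob_u(u):
    """P(u) for an assignment u = {'u0': ., 'u1': ., 'u2': .}."""
    result = Fraction(1)
    for name, value in u.items():
        result *= p_one[name] if value == 1 else 1 - p_one[name]
    return result

# Structural equations F, listed in causal order.
equations = {
    "X": lambda v, u: u["u0"],
    "Y": lambda v, u: int((v["X"] and u["u1"]) or u["u2"]),
}

def solve(u, do=None):
    """Solve the mutilated model M_x in situation u (do=None gives M itself)."""
    do = do or {}
    v = {}
    for name, f in equations.items():
        v[name] = do[name] if name in do else f(v, u)
    return v

def joint_counterfactual(events):
    """Probability of a conjunction of counterfactual events.

    events is a list of (do, variable, value) triples; ({'X': 1}, 'Y', 1)
    stands for Y_{X=1} = 1. The result is the total P(u) over all u in which
    every event holds.
    """
    total = Fraction(0)
    for values in product((0, 1), repeat=len(p_one)):
        u = dict(zip(p_one, values))
        if all(solve(u, do)[var] == val for do, var, val in events):
            total += prob_u(u)
    return total

# P(Y_{X=1} = 1, Y_{X=0} = 0): Y would be 1 had X been 1, and 0 had X been 0.
p_both = joint_counterfactual([({"X": 1}, "Y", 1), ({"X": 0}, "Y", 0)])
# P(Y_{X=1} = 1): a single counterfactual event.
p_single = joint_counterfactual([({"X": 1}, "Y", 1)])
print(p_both, p_single)
```

The two events in `p_both` refer to different mutilated models evaluated in the same situation u, which is why the sum runs over u rather than over observed values.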
GRAPHICAL – COUNTERFACTUALS
SYMBIOSIS
Every causal model implies constraints
on counterfactuals,
e.g., $Y_{x,z}(u) = Y_x(u)$
$Y_x \perp\!\!\!\perp Z_y \mid X$
consistent, and readable from the graph.
Every theorem in SEM is a theorem in N-R,
and conversely.
GRAPHICAL TEST OF IDENTIFICATION
The causal effect of X on Y,
$P(y \mid do(x)) \triangleq P(Y_x(u) = y)$,
is identifiable in G if there is a set Z of
variables such that Z d-separates X from Y in $G_{\underline{x}}$
(G with the arrows emanating from X removed).
(Diagrams: G and $G_{\underline{x}}$, each over the nodes Z1, ..., Z6, X, Y.)
Moreover, $P(y \mid do(x)) = \sum_z P(y \mid x, z)\, P(z)$
(“adjusting” for Z)
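The adjustment formula can be checked numerically on a small example. The graph (Z → X, Z → Y, X → Y, with Z observed) and all probabilities below are hypothetical; they are not the Z1, ..., Z6 graph of the slide. The sketch builds the observational joint, applies the formula using only that joint, and compares the result with plain conditioning and with the value implied by the mechanisms when X is held fixed.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical mechanisms for the graph Z -> X, Z -> Y, X -> Y.
p_z1 = F(1, 2)                                        # P(Z = 1)
p_x1_given_z = {0: F(1, 4), 1: F(3, 4)}               # P(X = 1 | z)
p_y1_given_xz = {(0, 0): F(1, 10), (1, 0): F(3, 10),
                 (0, 1): F(6, 10), (1, 1): F(8, 10)}  # P(Y = 1 | x, z)

def bern(p_of_one, value):
    """Probability that a binary variable with P(1) = p_of_one takes `value`."""
    return p_of_one if value == 1 else 1 - p_of_one

# Observational joint P(x, y, z) implied by the mechanisms.
joint = {
    (x, y, z): bern(p_z1, z) * bern(p_x1_given_z[z], x) * bern(p_y1_given_xz[(x, z)], y)
    for x, y, z in product((0, 1), repeat=3)
}

def marginal(**fixed):
    """Sum the joint over all cells that agree with the fixed values of x, y, z."""
    return sum(p for (x, y, z), p in joint.items()
               if all({"x": x, "y": y, "z": z}[k] == v for k, v in fixed.items()))

def adjusted(y, x):
    """Adjustment formula sum_z P(y | x, z) P(z), computed from the joint alone."""
    return sum(marginal(x=x, y=y, z=z) / marginal(x=x, z=z) * marginal(z=z)
               for z in (0, 1))

def conditional(y, x):
    """Plain conditioning P(y | x)."""
    return marginal(x=x, y=y) / marginal(x=x)

# Value under do(X = 1): Z keeps its own distribution while X is held at 1.
truth = sum(bern(p_z1, z) * p_y1_given_xz[(1, z)] for z in (0, 1))

print(adjusted(1, 1), conditional(1, 1), truth)
```

Here Z blocks the only path between X and Y that remains once the arrow X → Y is removed, so adjusting for Z recovers the interventional value while plain conditioning does not.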
RULES OF CAUSAL CALCULUS
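Rule 1: Ignoring observations
$P(y \mid do\{x\}, z, w) = P(y \mid do\{x\}, w)$
if $(Y \perp\!\!\!\perp Z \mid X, W)_{G_{\overline{X}}}$
Rule 2: Action/observation exchange
$P(y \mid do\{x\}, do\{z\}, w) = P(y \mid do\{x\}, z, w)$
if $(Y \perp\!\!\!\perp Z \mid X, W)_{G_{\overline{X}\underline{Z}}}$
Rule 3: Ignoring actions
$P(y \mid do\{x\}, do\{z\}, w) = P(y \mid do\{x\}, w)$
if $(Y \perp\!\!\!\perp Z \mid X, W)_{G_{\overline{X}\,\overline{Z(W)}}}$
(An overline removes the arrows entering a set; an underline removes the arrows leaving it; $Z(W)$ is the set of Z-nodes that are not ancestors of any W-node in $G_{\overline{X}}$.)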
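As a worked illustration of how the rules combine, consider a hypothetical graph (not from the slides) with edges Z → X, Z → Y and X → Y, where Z is observed. Writing $G_{\overline{X}}$ for G with all arrows into X removed and $G_{\underline{X}}$ for G with all arrows out of X removed, the adjustment formula follows in three steps:

```latex
\begin{align*}
P(y \mid do\{x\})
  &= \sum_z P(y \mid do\{x\}, z)\, P(z \mid do\{x\})
     && \text{(law of total probability)} \\
  &= \sum_z P(y \mid x, z)\, P(z \mid do\{x\})
     && \text{(Rule 2, since } (Y \perp\!\!\!\perp X \mid Z)_{G_{\underline{X}}}\text{)} \\
  &= \sum_z P(y \mid x, z)\, P(z)
     && \text{(Rule 3, since } (Z \perp\!\!\!\perp X)_{G_{\overline{X}}}\text{)}
\end{align*}
```

In $G_{\underline{X}}$ the only path between X and Y is X ← Z → Y, which is blocked by conditioning on Z. In $G_{\overline{X}}$ the only path between Z and X is Z → Y ← X, which is blocked at the collider Y.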
RECENT RESULTS ON IDENTIFICATION
Theorem (Tian 2002):
We can identify P(v | do{x}) (x a singleton)
if and only if there is no child Z of X connected
to X by a bi-directed path.
(Diagram: nodes X, Z1, ..., Zk, Z.)
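The condition in the theorem is a reachability test and can be sketched in a few lines. The graph encoding (lists of directed and bi-directed edges) and the function names are assumptions made for this example, as are the two test graphs.

```python
from collections import deque

def bidirected_component(node, bidirected):
    """All nodes reachable from `node` along bi-directed edges (including itself)."""
    neighbours = {}
    for a, b in bidirected:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {node}, deque([node])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def passes_tian_test(x, directed, bidirected):
    """True if no child of x is connected to x by a bi-directed path."""
    children = {b for a, b in directed if a == x}
    return not (children & bidirected_component(x, bidirected))

# Hypothetical graph 1: X -> Z -> Y with X <-> Y (the child Z is not reached).
front_door = passes_tian_test("X", [("X", "Z"), ("Z", "Y")], [("X", "Y")])
# Hypothetical graph 2: X -> Y with X <-> Y (the child Y is reached directly).
bow = passes_tian_test("X", [("X", "Y")], [("X", "Y")])
# Hypothetical graph 3: X -> Z with X <-> Z1 <-> Z (the child Z is reached via Z1).
chain = passes_tian_test("X", [("X", "Z")], [("X", "Z1"), ("Z1", "Z")])
print(front_door, bow, chain)
```

The third graph mirrors the shape of the diagram, where the bi-directed path from X to its child passes through intermediate nodes.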
RECENT RESULTS ON IDENTIFICATION
(Cont.)
• do-calculus is complete
• A complete graphical criterion is available
for identifying causal effects of any set
on any set
References: Shpitser and Pearl 2006 (AAAI, UAI)
CONCLUSIONS
Structural-model semantics enriched with
logic + graphs leads to formal interpretation
and practical assessment of a wide variety
of (if not all) causal and counterfactual
relationships,
e.g., causal effects, responsibility,
direct and indirect effects.
Multi-agent systems?
MULTI-AGENT GRAPHS
(Diagram: Agent 1 with nodes X1, Y1, Z1 and ux1, uy1, uz1; Agent 2 with nodes X2, Y2, Z2 and ux2, uy2, uz2.)