Causal Networks


Causal Networks
Denny Borsboom
Overview
• The causal relation
• Causality and conditional independence
• Causal networks
• Blocking and d-separation
• Exercise
The causal relation
• What constitutes the “secret connexion” of
causality is one of the big questions of philosophy
• Philosophical proposals:
– A causes B means that…
• B invariably follows A (David Hume)
• A is an Insufficient but Nonredundant part of an
Unnecessary but Sufficient condition for B (INUS condition;
John Mackie)
• B counterfactually depends on A: if A had not happened, B
would not have happened (David Lewis)
• …
Drawbacks of philosophical accounts
• Can’t cope well with noisy data (i.e., can’t
cope with data)
• Almost all causal relations are observed
through statistical analysis: probabilities
• Probabilities didn’t sit well with the
philosophical analyses, and neither did data
• For a long time, causal inference was
therefore done in a theoretical vacuum
An alternative
• Recently, Judea Pearl suggested an alternative approach based on the statistical method of structural equations
• He argues that causal relations should be
framed in terms of interventions on a model:
given a causal model, what would happen to B
if we changed A?
• This is a simple idea, but it turned out to be very powerful
Pearl’s approach (I)
• A causal relation is encoded in a structural
equation that says how B would change if A
were changed
• This can be coded with the do operator or the
symbol :=
• So B := 2A means that B would change by 2 units if A were to change by one unit
• Note that this relation is asymmetric: B:=2A
does not imply that A:=B/2
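As a minimal sketch (my own addition, not from the lecture), the asymmetry of a structural equation can be illustrated in Python; the simulate function and its arguments are illustrative assumptions representing Pearl's do-operator:

```python
import random

def simulate(do_a=None, do_b=None):
    """One draw from the structural model B := 2A (plus a little noise).

    do_a / do_b represent interventions (the do-operator): they overwrite
    the variable instead of letting the model generate it.
    """
    a = random.gauss(0, 1) if do_a is None else do_a
    b = 2 * a + random.gauss(0, 0.1) if do_b is None else do_b
    return a, b

# Intervening on A changes B...
print(simulate(do_a=3))   # b ends up close to 6
# ...but intervening on B leaves the mechanism for A untouched.
print(simulate(do_b=6))   # a is still just standard-normal noise
```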
Pearl’s approach (II)
• The structural equations can be represented
in a graph, by drawing a directed arrow from A
to B whenever (in the model structure)
changing A affects B but not vice versa:
A -> B
• Can we relate such a system to data? That is,
under which conditions can we actually
determine the causal relations from the data?
Pearl’s approach (III)
• The classic problem of induction then presents itself as
an identification problem:
• Given only two variables, it is not possible to deduce
from the data whether A->B or B->A (or some other
structure generated the dependence): both are equally
consistent with the data
• If temporal precedence distinguishes A->B from B->A
then the skeptic may argue that this is all there is to
know (really hardcore skeptics generalize to
experiments)
• This is the root of the platitude that “correlation does
not equal causation”
Pearl’s approach (IV)
• However, where there’s correlational smoke,
there is often a causal fire…
• How to identify that fire?
• 20th century statistics struggled with this issue; at
the end of the 20th century many had given up
• Pearl, and Glymour and colleagues, then simultaneously developed the insight that it is not correlations or conditional probabilities but conditional independence relations that are key to the identification of causal structure
Pearl’s approach (V)
• Trick: shift attention from bivariate to
multivariate systems and then ask two new
questions:
• 1) Which conditional independence relations
are implied by a given causal structure?
• 2) Which causal structures are implied by a
given set of conditional independence
relations?
Common Cause: B <- A -> C
Example: Village size (A) causes babies (B) and storks (C)
CI: B and C conditionally independent given A

Chain: A -> B -> C
Example: Smoking (A) causes tar (B) causes cancer (C)
CI: A and C conditionally independent given B

Collider: B -> A <- C
Example: Firing squad (B & C) shoot prisoner (A)
CI: B and C conditionally dependent given A
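As a numerical illustration (my own sketch, not part of the slides), the three structures can be simulated as linear Gaussian systems in Python; the partial_corr helper below is an assumed stand-in for checking conditional (in)dependence:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def partial_corr(x, y, z):
    """Correlation between x and y after regressing out z (linear-Gaussian proxy for CI)."""
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    return np.corrcoef(rx, ry)[0, 1]

# Common cause: B <- A -> C
a = rng.normal(size=n); b = a + rng.normal(size=n); c = a + rng.normal(size=n)
print("common cause:", partial_corr(b, c, a))   # ~0: B independent of C given A

# Chain: A -> B -> C
a = rng.normal(size=n); b = a + rng.normal(size=n); c = b + rng.normal(size=n)
print("chain:       ", partial_corr(a, c, b))   # ~0: A independent of C given B

# Collider: B -> A <- C
b = rng.normal(size=n); c = rng.normal(size=n); a = b + c + rng.normal(size=n)
print("collider:    ", partial_corr(b, c, a))   # clearly nonzero: B and C dependent given A
```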
So…
• If we can cleverly combine these small networks to
build larger networks, then we might have a graphical
criterion to deduce implied CI relations from a causal
graph (i.e., we could look at the graph rather than solve
equations)
• If we have a dataset, we can establish which of a set of
possible causal graphs could have generated the CI
relations observed
• If certain links cannot be deleted from the graph (i.e.,
are necessary to represent the CI relations), then it is in
principle possible to establish causal relations from
non-experimental data
To work!
Conditional independence (CI) (see handout)

S = smoke, ØS = not-smoke; C = cancer, ØC = not-cancer
P(A,B) = 'probability of A and B'; e.g., P(ØS,C) = 'probability of not-smoke and cancer'
P(A|B) = 'probability of A, given B'; e.g., P(ØS|C) = 'probability of not-smoke, given cancer'

Table 1a. Contingency table for smoking and cancer with n=100 persons.

        C     ØC
S       20    40     60
ØS      10    30     40
        30    70    100

Table 1b. Probability distribution associated with Table 1a.

        C      ØC
S       2/10   4/10   6/10
ØS      1/10   3/10   4/10
        3/10   7/10   1

Table 1c. Table legend for the probability distribution in Table 1b.

        C          ØC
S       P(S,C)     P(S,ØC)     P(S)
ØS      P(ØS,C)    P(ØS,ØC)    P(ØS)
        P(C)       P(ØC)       1 = P(S)+P(ØS) = P(C)+P(ØC) = P(S,C)+P(S,ØC)+P(ØS,C)+P(ØS,ØC)

Formula for conditional probabilities: P(A|B) = P(A,B)/P(B)
For instance: P(C|S) = P(S,C)/P(S) = 0.2/0.6 = 1/3
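A small Python sketch (my own addition, not in the handout) that reproduces the computation P(C|S) = P(S,C)/P(S) = 0.2/0.6 = 1/3 from Table 1b:

```python
# Joint probabilities from Table 1b, keyed as (smoke, cancer); True = S / C, False = ØS / ØC.
joint = {
    (True, True): 0.2,  (True, False): 0.4,   # P(S,C),  P(S,ØC)
    (False, True): 0.1, (False, False): 0.3,  # P(ØS,C), P(ØS,ØC)
}

p_s = joint[(True, True)] + joint[(True, False)]   # P(S) = 0.6
p_c_given_s = joint[(True, True)] / p_s             # P(C|S) = P(S,C)/P(S)
print(p_c_given_s)                                   # 0.333... = 1/3
```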
Conditional independence (CI) (see handout)

2. Independence

Table 2a. Contingency table for smoking and cancer with n=100 persons.

        S     ØS
C       20    10     30
ØC      40    30     70
        60    40    100

P(C|S) = 20/60 = 1/3
P(C|ØS) = 10/40 = 1/4
P(C) = 30/100 = 3/10

S carries information about C: learning that S should increase your confidence that C; S and C are not independent in this table.

Table 2b. Contingency table for smoking and cancer with n=100 persons.

        S     ØS
C       18    12     30
ØC      42    28     70
        60    40    100

P(C|S) = 18/60 = 3/10
P(C|ØS) = 12/40 = 3/10
P(C) = 30/100 = 3/10

S carries no information about C: learning that S should not increase your confidence that C; S and C are independent in this table.

Formula for independence: A and B are independent iff P(A|B) = P(A), or, which is the same, iff P(A,B) = P(A)P(B)
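As an illustrative check (my own sketch), the criterion P(A,B) = P(A)P(B) applied to the counts of Tables 2a and 2b; the helper function is an assumption for demonstration:

```python
def independent(table, tol=1e-9):
    """table[(a, b)] holds counts; True iff P(A,B) = P(A)P(B) holds for every cell."""
    n = sum(table.values())
    for (a, b), count in table.items():
        p_ab = count / n
        p_a = sum(v for (x, _), v in table.items() if x == a) / n
        p_b = sum(v for (_, y), v in table.items() if y == b) / n
        if abs(p_ab - p_a * p_b) > tol:
            return False
    return True

table_2a = {("S", "C"): 20, ("S", "ØC"): 40, ("ØS", "C"): 10, ("ØS", "ØC"): 30}
table_2b = {("S", "C"): 18, ("S", "ØC"): 42, ("ØS", "C"): 12, ("ØS", "ØC"): 28}
print(independent(table_2a))  # False: S and C are dependent
print(independent(table_2b))  # True:  S and C are independent
```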
Conditional independence (CI) (see handout)
Table 3c. Contingency table for stained fingers and cancer, given ØS (only non-smokers here).

        F     ØF
C       2     8      10
ØC      18    72     90
        20    80    100

In this table:
P(C|F) = 2/20 = 0.1
P(C|ØF) = 8/80 = 0.1
P(C) = 10/100 = 0.1

Hence, P(C|F) = P(C); therefore C and F are independent in this table.
Conclusion: Conditioning on smoking renders F and C probabilistically independent; we say that 'C and F are independent given S'.
Formula for conditional independence: A and B are conditionally independent given C iff P(A|B,C) = P(A|C).
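A quick Python sketch (my own addition) of the conditional-independence check P(A|B,C) = P(A|C) applied within the non-smoker stratum of Table 3c; the smokers' stratum (the handout's other tables) would be checked in the same way:

```python
# Counts for non-smokers (Table 3c): stained fingers (F) vs. cancer (C).
table_3c = {("F", "C"): 2, ("F", "ØC"): 18, ("ØF", "C"): 8, ("ØF", "ØC"): 72}

n = sum(table_3c.values())
p_c = (table_3c[("F", "C")] + table_3c[("ØF", "C")]) / n                              # P(C | ØS)
p_c_given_f = table_3c[("F", "C")] / (table_3c[("F", "C")] + table_3c[("F", "ØC")])   # P(C | F, ØS)

# Within this stratum P(C|F,ØS) equals P(C|ØS), so F and C are independent given ØS.
print(p_c, p_c_given_f)   # 0.1 0.1
```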
Therefore
• Now suppose we are prepared to make some
causal assumptions, most importantly:
– there are no omitted variables that generate
dependencies, and
– all causal relations are necessary to establish the
pattern of CI
• Then we can deduce causal relations from
correlational data (at least in principle)
• Quite a nice result!
Blocking and d-separation
• It would be nice if we could just look at the graph and
see which CI relations it entails
• This turns out to be possible
• Rule: if you want to know whether in a directed acyclic
graph two variables A and B are independent given C,
see if they are d-separated
• For this you have to (a) check all the paths between A
and B, and (b) see if they are all blocked
• If all paths are blocked by C, then C d-separates A and
B, and you can predict that A is independent of B given
C
Blocking and d-separation
(figure: an example graph showing a path between B and F)
• A path between two variables is formed by a series of edges that you can travel to reach one variable from the other
When is a path blocked?
• A path between A and B is said to be blocked by a variable C if:
– the path contains a chain in which C is the middle node (so here that would be A -> C -> B or A <- C <- B), or
– the path contains a common cause, and C is that common cause (here: A <- C -> B), or
– the path contains a common effect ('collider'), but C is not that common effect, and C is not a descendant (one of the effects) of that common effect.
So…
• If you have a causal network that consists of
variables coupled through (directed) structural
relations…
• …then you can tell which conditional
independence patterns will arise…
• …just by looking at the picture!
So…
• And in the other direction: if you have a set of
conditional independencies, you can search
for the causal network that could have
produced them
• This is material Lourens will cover next week
Recipe: are A and B independent given C?
1. List every path between A and B
2. For every path, check whether C blocks it
3. If C blocks all the paths in step (2), then C d-separates A and B, and A is conditionally independent of B given C
4. If C does not block all the paths in step (2), then C does not d-separate A and B. In this case anything may happen: we don't know.
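For larger graphs this recipe can be automated; a sketch using NetworkX's d-separation routine (called nx.is_d_separator in recent NetworkX versions, nx.d_separated in older ones, so check your installed version). The graph below combines two examples from the slides:

```python
import networkx as nx

# The chain from the slides: Smoking -> Tar -> Cancer,
# plus the collider: B -> Prisoner <- C (two members of the firing squad).
g = nx.DiGraph()
g.add_edges_from([("Smoking", "Tar"), ("Tar", "Cancer"),
                  ("B", "Prisoner"), ("C", "Prisoner")])

# Chain: Smoking and Cancer are d-separated by Tar.
print(nx.is_d_separator(g, {"Smoking"}, {"Cancer"}, {"Tar"}))   # True
# Collider: B and C are d-separated by the empty set...
print(nx.is_d_separator(g, {"B"}, {"C"}, set()))                # True
# ...but not once we condition on the common effect.
print(nx.is_d_separator(g, {"B"}, {"C"}, {"Prisoner"}))         # False
```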
Practice!