Intervention (causal) analysis using graphical models

If I remember correctly, the nicest explanation is in Judea Pearl’s Causality. Which is fitting, since Pearl deserves credit for developing the idea – and was awarded the Turing Award for it and other work.

But as is often the case, the idea is simpler than the explanation. We can pull apart a joint distribution into a chain of conditional distributions:

ρ(A,B,C,D)=ρ(D|A,B,C)ρ(C|A,B)ρ(B|A)ρ(A)

We can simplify this by known conditional independencies, thus creating a network of conditional distributions. So if we know that D is conditionally independent of A and B given C, we get:

ρ(A,B,C,D)=ρ(D|C)ρ(C|A,B)ρ(B|A)ρ(A)
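As a concrete illustration, here is a minimal Python sketch of this factorisation for four binary variables, with each conditional distribution stored as a small lookup table. The numbers are made up purely for illustration.

```python
# A toy sketch (hypothetical numbers) of the factorisation
# rho(A,B,C,D) = rho(D|C) rho(C|A,B) rho(B|A) rho(A)
from itertools import product

p_A = {0: 0.6, 1: 0.4}                       # rho(A)
p_B_given_A = {0: {0: 0.7, 1: 0.3},          # rho(B|A): p_B_given_A[a][b]
               1: {0: 0.2, 1: 0.8}}
p_C_given_AB = {(0, 0): {0: 0.9, 1: 0.1},    # rho(C|A,B): p_C_given_AB[(a, b)][c]
                (0, 1): {0: 0.5, 1: 0.5},
                (1, 0): {0: 0.4, 1: 0.6},
                (1, 1): {0: 0.1, 1: 0.9}}
p_D_given_C = {0: {0: 0.8, 1: 0.2},          # rho(D|C): p_D_given_C[c][d]
               1: {0: 0.3, 1: 0.7}}

def joint(a, b, c, d):
    """rho(A,B,C,D) assembled from the chain of conditional distributions."""
    return p_D_given_C[c][d] * p_C_given_AB[(a, b)][c] * p_B_given_A[a][b] * p_A[a]

# Sanity check: the factorised joint sums to 1 over all assignments.
assert abs(sum(joint(*v) for v in product((0, 1), repeat=4)) - 1.0) < 1e-12
```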

Assume that these conditional distributions represent causal relationships. But do not worry: the order of the factorisation was not important, so as long as we have an ordering on the potential causal relationships (such as a temporal ordering) we can factorise along it and read off the presumed causal relationships.

Now if we wish to examine the effect of forcing a variable to take a given value, we can simply specify that value in place of the variable’s conditional distribution and calculate as before. We shall write this ρ(do X=x): it replaces the conditional distribution of X given its parents with the point mass ρ(X=x)=1. So if we wish to examine the effect of forcing B to take a particular value b, we have:

ρ(A,C,D|do B=b)=ρ(D|C)ρ(C|A,B=b)ρ(do B=b)ρ(A)
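Continuing the toy tables from the sketch above, the intervention amounts to nothing more than swapping ρ(B|A) for a point mass on the forced value before taking the product:

```python
def interventional_joint(a, b, c, d, b_forced):
    """rho(A,B,C,D | do B=b_forced): replace rho(B|A) by the point mass rho(B=b_forced)=1."""
    p_do_B = 1.0 if b == b_forced else 0.0   # the do-operator's degenerate distribution
    return p_D_given_C[c][d] * p_C_given_AB[(a, b)][c] * p_do_B * p_A[a]

# Example: the distribution of D after forcing B=1, obtained by summing out A and C.
p_D_do_B1 = {d: sum(interventional_joint(a, b, c, d, b_forced=1)
                    for a in (0, 1) for b in (0, 1) for c in (0, 1))
             for d in (0, 1)}
```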

Actual implementations make use of probabilistic graphical models, which simplify calculations on the joint distribution by working directly on the chain of conditional distributions.

But the crucial point is that, given a potential causal ordering, we could have learnt both the conditional independencies and the resulting conditional distributions from empirical, observational data. We did not need to perform controlled statistical experiments.
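To illustrate with the toy variables above (and assuming the observations really do come from the presumed causal structure), a conditional table such as ρ(B|A) can be estimated from observational (a, b) pairs by simple frequency counts:

```python
from collections import Counter

def estimate_B_given_A(samples):
    """Estimate rho(B|A) from observational (a, b) pairs by frequency counts.

    Assumes both values of A appear at least once in the samples.
    """
    counts = Counter(samples)                     # counts of each (a, b) pair
    totals = Counter(a for a, _ in samples)       # counts of each value of A
    return {a: {b: counts[(a, b)] / totals[a] for b in (0, 1)} for a in (0, 1)}

# e.g. estimate_B_given_A([(0, 0), (0, 1), (0, 0), (1, 1), (1, 1), (1, 0)])
#      -> {0: {0: 2/3, 1: 1/3}, 1: {0: 1/3, 1: 2/3}}
```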

Note that this will not distort causally upstream distributions: since the intervened variable’s dependence on its parents is dropped, ρ(A) is left unchanged, whereas ordinary conditioning on B=b would also update it.
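Continuing the earlier sketches, this is easy to check numerically: the marginal of A under do B=1 coincides with ρ(A), whereas conditioning on B=1 reweights A by Bayes’ rule.

```python
# Upstream marginal of A under intervention vs. ordinary conditioning (toy tables above).
p_A_do_B1 = {a: sum(interventional_joint(a, b, c, d, b_forced=1)
                    for b in (0, 1) for c in (0, 1) for d in (0, 1))
             for a in (0, 1)}
# Intervening leaves the upstream distribution untouched: p_A_do_B1 == p_A.

p_AB1 = {a: sum(joint(a, 1, c, d) for c in (0, 1) for d in (0, 1)) for a in (0, 1)}
p_A_given_B1 = {a: p_AB1[a] / sum(p_AB1.values()) for a in (0, 1)}
# Conditioning on B=1 reweights A towards values that make B=1 likely (here A=1).
```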