How can we discover whether X causes Y, or vice versa,
when we are given only a sample from their joint distribution? How can we do so when X and Y may be univariate,
multivariate, or of different cardinalities? And how can we do so regardless of whether X and Y are of the same or of different data types, be it
discrete, numeric, or mixed? These are exactly the questions we answer.
We take an information-theoretic approach based on the Minimum Description Length principle, from which it follows that first describing the
data over the cause and then the data over the effect given the cause is shorter than
describing them in the reverse direction. Simply put, if Y can be explained more succinctly by
a set of classification or regression trees conditioned on X than X can be explained by
trees conditioned on Y, we conclude that X causes Y. Empirical evaluation
on a wide range of data shows that our method, Crack, infers the correct causal direction reliably and with high accuracy across diverse
settings, outperforming the state of the art by a wide margin.
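
The decision rule described in words above can be written compactly. This is a notational sketch only: the symbols L(·), L(· | ·), and the Δ scores are assumed here for illustration and do not appear in the abstract, which does not specify how the code lengths are computed.

```latex
% Assumed notation: L(.) is a description length in bits under the MDL principle.
% L(Y | X) is the cost of describing Y with trees conditioned on X.
\Delta_{X \to Y} = L(X) + L(Y \mid X), \qquad
\Delta_{Y \to X} = L(Y) + L(X \mid Y)

% Decision rule, following the abstract's verbal statement:
\Delta_{X \to Y} < \Delta_{Y \to X} \;\Longrightarrow\; X \text{ causes } Y, \qquad
\Delta_{Y \to X} < \Delta_{X \to Y} \;\Longrightarrow\; Y \text{ causes } X
```

Under this reading, each conditional term would cover both the trees themselves and the data encoded with them; how the case of equal scores is handled is not stated in the abstract.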
History
Preferred Citation
Alexander Marx and Jilles Vreeken. Causal Inference on Multivariate and Mixed Type Data. In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD). 2018.
Primary Research Area
Empirical and Behavioral Security
Name of Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)
Legacy Posted Date
2019-06-07
Open Access Type
Unknown
BibTeX
@inproceedings{cispa_all_2910,
title = "Causal Inference on Multivariate and Mixed Type Data",
author = "Marx, Alexander and Vreeken, Jilles",
booktitle = "{European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)}",
year = "2018",
}