posted on 2023-11-29, 18:18, authored by Alexander Marx, Lincen Yang, Matthijs van Leeuwen
Estimating conditional mutual information (CMI) is an
essential yet challenging step in many machine learning
and data mining tasks. Estimating CMI from data that
contains both discrete and continuous variables, or even
discrete-continuous mixture variables, is a particularly hard
problem. In this paper, we show that CMI for such mixture
variables, defined based on the Radon-Nikodym derivative,
can be written as a sum of entropies, just like CMI for purely
discrete or continuous data. Further, we show that CMI can
be consistently estimated for discrete-continuous mixture
variables by learning an adaptive histogram model. In
practice, we estimate such a model by iteratively discretizing
the continuous data points in the mixture variables. To
evaluate the performance of our estimator, we benchmark it
against state-of-the-art CMI estimators and additionally
evaluate it in a causal discovery setting.
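
To make the "sum of entropies" form concrete, the sketch below uses the standard identity I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) with plug-in entropies computed on a histogram. It is only an illustration under simplifying assumptions: it uses a fixed equal-width grid per variable rather than the adaptive, iteratively refined histogram described in the abstract, and the function names (`joint_entropy`, `discretize`, `cmi_histogram`) and the `n_bins` parameter are hypothetical, not taken from the paper or its code.

```python
import numpy as np


def joint_entropy(columns):
    """Plug-in Shannon entropy (in nats) of the joint distribution of discrete columns."""
    data = np.column_stack(columns)
    # Count how often each distinct joint value (row) occurs.
    _, counts = np.unique(data, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))


def discretize(values, n_bins):
    """Map values to bin indices on an equal-width grid.

    Assumption: a fixed grid, NOT the adaptive histogram of the paper.
    """
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Only interior edges are needed to assign bin indices.
    return np.digitize(values, edges[1:-1])


def cmi_histogram(x, y, z, n_bins=8):
    """Estimate I(X;Y|Z) as H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) on binned data."""
    xb, yb, zb = (discretize(np.asarray(v, dtype=float), n_bins) for v in (x, y, z))
    return (joint_entropy([xb, zb]) + joint_entropy([yb, zb])
            - joint_entropy([xb, yb, zb]) - joint_entropy([zb]))
```

A point mass in a mixture variable (for example, many exact zeros) simply lands in a single bin here, so discrete and continuous parts are handled by the same counting step. With a fixed grid the estimate depends strongly on `n_bins`, which is the kind of sensitivity a data-driven adaptive histogram is meant to reduce.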
Preferred Citation
Alexander Marx, Lincen Yang and Matthijs van Leeuwen. Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms. In: SIAM International Conference on Data Mining (SDM). 2021.
Primary Research Area
Empirical and Behavioral Security
Name of Conference
SIAM International Conference on Data Mining (SDM)
Legacy Posted Date
2022-03-28
Open Access Type
Unknown
BibTeX
@inproceedings{cispa_all_3594,
  title = "Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms",
  author = "Marx, Alexander and Yang, Lincen and van Leeuwen, Matthijs",
  booktitle = "{SIAM International Conference on Data Mining (SDM)}",
  year = "2021",
}