Differentiable Pattern Set Mining

Fischer, Jonas; Vreeken, Jilles

doi:10.60882/cispa.24613842.v1

cispa_all_3546.pdf (1.36 MB)

Differentiable Pattern Set Mining

conference contribution

posted on 2023-11-29, 18:18 authored by Jonas Fischer, Jilles VreekenJilles Vreeken

Pattern set mining has been successful in discovering small sets of highly informative and useful patterns from data. To find good models, existing methods heuristically explore the twice-exponential search space over all possible pattern sets in a combinatorial way, by which they are limited to data over at most hundreds of features, as well as likely to get stuck in local minima. Here, we propose a gradient based optimization approach that allows us to efficiently discover high-quality pattern sets from data of millions of rows and hundreds of thousands of features. In particular, we propose a novel type of neural autoencoder called BinaPs, using binary activations and binarizing weights in each forward pass, which are directly interpretable as conjunctive patterns. For training, optimizing a data-sparsity aware reconstruction loss, continuous versions of the weights are learned in small, noisy steps. Hence, this formulation of our problem provides a link between the discrete search space and continuous optimization, thus allowing for a gradient based strategy to discover sets of high-quality and noise-robust patterns. Through extensive experiments on both synthetic and real world data, we show that BinaPs discovers high quality and noise robust patterns, and unique among all competitors, easily scales to real supermarket transaction and biological variant call data.

History

Preferred Citation

Jonas Fischer and Jilles Vreeken. Differentiable Pattern Set Mining. In: ACM International Conference on Knowledge Discovery and Data Mining (KDD). 2021.

Primary Research Area

Algorithmic Foundations and Cryptography

Name of Conference

ACM International Conference on Knowledge Discovery and Data Mining (KDD)

Legacy Posted Date

2021-12-16

Open Access Type

Green

BibTeX

@inproceedings{cispa_all_3546, title = "Differentiable Pattern Set Mining", author = "Fischer, Jonas and Vreeken, Jilles", booktitle="{ACM International Conference on Knowledge Discovery and Data Mining (KDD)}", year="2021", }

Usage metrics

Keywords

Algorithmic Foundations and Cryptography

Licence

Exports

RefWorks

BibTeX

Ref. manager

Endnote

DataCite

NLM

DC

Differentiable Pattern Set Mining

History

Preferred Citation

Primary Research Area

Name of Conference

Legacy Posted Date

Open Access Type

BibTeX

Usage metrics

Categories

Keywords

Licence

Exports