Discovering Significant Patterns under Sequential False Discovery Control

conference contribution
posted on 2023-11-29, 18:21 authored by Sebastian Dalleiger, Jilles Vreeken
We are interested in discovering those patterns from data with an empirical frequency that is significantly different than expected. To avoid spurious results, yet achieve high statistical power, we propose to sequentially control for false discoveries during the search. To avoid redundancy, we propose to update our expectations whenever we discover a significant pattern. To efficiently consider the exponentially sized search space, we employ an easy-to-compute upper bound on significance, and propose an effective search strategy for sets of significant patterns. Through an extensive set of experiments on synthetic data, we show that our method, Spass, recovers the ground truth reliably, does so efficiently, and without redundancy. On real-world data we show it works well on both single and multiple classes, on low and high dimensional data, and through case studies that it discovers meaningful results.
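
The abstract does not say which sequential procedure Spass uses, so the following is only a minimal sketch of the general idea of controlling false discoveries sequentially, not the authors' method. It assumes the LOND rule (levels based on number of discoveries) from the online-testing literature, in which the i-th test is run at level beta_i * (D + 1), where D is the number of discoveries made so far and the beta_i sum to the target level alpha. The function name `lond_decisions`, the choice beta_i = alpha * 6 / (pi^2 * i^2), and the example p-values are all illustrative assumptions.

```python
import math


def lond_decisions(p_values, alpha=0.05):
    """Decide hypotheses one at a time with a LOND-style rule.

    Illustrative sketch only; this is NOT the Spass procedure.
    Test i is run at level beta_i * (discoveries so far + 1),
    with beta_i = alpha * 6 / (pi^2 * i^2), so that the beta_i
    sum to alpha over i = 1, 2, 3, ...
    """
    decisions = []
    discoveries = 0
    for i, p in enumerate(p_values, start=1):
        beta_i = alpha * 6.0 / (math.pi ** 2 * i ** 2)
        # Earlier discoveries raise the level available to later tests.
        level_i = beta_i * (discoveries + 1)
        reject = p <= level_i
        decisions.append(reject)
        discoveries += int(reject)
    return decisions


# Hypothetical p-values, e.g. one per candidate pattern in search order.
example_p_values = [1e-6, 0.5, 0.004, 0.03]
example_decisions = lond_decisions(example_p_values, alpha=0.05)
print(example_decisions)
```

In this toy run the third p-value (0.004) is rejected only because an earlier discovery doubled its test level; with a fixed per-test budget it would have been missed. This is the sense in which a sequential scheme can retain power while still guarding against spurious results.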

Preferred Citation

Sebastian Dalleiger and Jilles Vreeken. Discovering Significant Patterns under Sequential False Discovery Control. In: ACM International Conference on Knowledge Discovery and Data Mining (KDD). 2022.

Primary Research Area

  • Trustworthy Information Processing

Name of Conference

ACM International Conference on Knowledge Discovery and Data Mining (KDD)

Legacy Posted Date

2022-07-15

Open Access Type

  • Unknown

BibTeX

@inproceedings{cispa_all_3726,
  title = "Discovering Significant Patterns under Sequential False Discovery Control",
  author = "Dalleiger, Sebastian and Vreeken, Jilles",
  booktitle = "{ACM International Conference on Knowledge Discovery and Data Mining (KDD)}",
  year = "2022",
}
