File(s) not publicly available

What's in the Box? Exploring the Inner Life of Neural Networks with Robust Rules

conference contribution
posted on 2023-11-29, 18:18 authored by Jonas Fischer, Anna Olah, Jilles Vreeken
We propose a novel method for exploring how neurons within neural networks interact. In particular, we consider the activation values of a network for given data, and propose to mine noise-robust rules of the form $X \rightarrow Y$, where $X$ and $Y$ are sets of neurons in different layers. Using the Minimum Description Length principle, we identify the best set of rules as the one whose rules together are most descriptive of the activation data. To learn good rule sets in practice, we propose the unsupervised ExplaiNN algorithm. Extensive evaluation shows that the patterns it discovers give clear insight into how networks perceive the world: they identify traits shared across classes as well as class-specific traits, compositionality within the network, and locality in convolutional layers. Moreover, these patterns are not only easily interpretable, but also supercharge prototyping, as they identify which groups of neurons to consider in unison.
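
To make the rule form concrete, the following is a minimal sketch of how a single candidate rule $X \rightarrow Y$ could be checked against activation data. It is not the ExplaiNN algorithm and involves no MDL scoring: the function name `rule_stats`, the binarization threshold of 0.5, the use of support and confidence as measures, and the toy activation matrices are all assumptions made for illustration.

```python
import numpy as np

def rule_stats(acts_x, acts_y, head, tail, threshold=0.5):
    """Support and confidence of a candidate rule head -> tail.

    acts_x, acts_y: (samples, neurons) activations of two layers.
    head, tail: neuron indices forming X and Y (hypothetical names).
    threshold: assumed cut-off for calling a neuron "active".
    """
    bx = acts_x > threshold                     # binarize layer holding X
    by = acts_y > threshold                     # binarize layer holding Y
    head_fires = bx[:, head].all(axis=1)        # samples where all of X is active
    tail_fires = by[:, tail].all(axis=1)        # samples where all of Y is active
    support = int(head_fires.sum())
    both = int((head_fires & tail_fires).sum())
    confidence = both / support if support else 0.0
    return support, confidence

# Toy activations: 4 samples, 3 neurons in one layer, 2 in the next.
acts_layer1 = np.array([[0.9, 0.8, 0.1],
                        [0.7, 0.9, 0.0],
                        [0.2, 0.9, 0.6],
                        [0.8, 0.6, 0.3]])
acts_layer2 = np.array([[0.9, 0.1],
                        [0.8, 0.2],
                        [0.1, 0.9],
                        [0.3, 0.4]])

support, confidence = rule_stats(acts_layer1, acts_layer2, head=[0, 1], tail=[0])
print(support, round(confidence, 3))
```

In this toy data the head fires on three samples and the tail co-fires on two of them, so the rule holds only approximately. A noise-robust rule in the sense of the abstract would tolerate such exceptions rather than require the tail to fire every time.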

Preferred Citation

Jonas Fischer, Anna Olah and Jilles Vreeken. What's in the Box? Exploring the Inner Life of Neural Networks with Robust Rules. In: International Conference on Machine Learning (ICML). 2021.

Primary Research Area

  • Trustworthy Information Processing

Secondary Research Area

  • Empirical and Behavioral Security

Name of Conference

International Conference on Machine Learning (ICML)

Legacy Posted Date

2021-12-17

Open Access Type

  • Green

BibTeX

@inproceedings{cispa_all_3552,
  title = "What's in the Box? Exploring the Inner Life of Neural Networks with Robust Rules",
  author = "Fischer, Jonas and Olah, Anna and Vreeken, Jilles",
  booktitle = "{International Conference on Machine Learning (ICML)}",
  year = "2021",
}
