Localizing Memorization in SSL Vision Encoders

Recent work on memorization in self-supervised learning (SSL) suggests that, even though SSL encoders are trained on millions of images, they still memorize individual data points. While effort has gone into characterizing the memorized data and linking encoder memorization to downstream utility, little is known about where the memorization happens inside SSL encoders. To close this gap, we propose two metrics for localizing memorization in SSL encoders on a per-layer (LayerMem) and per-unit (UnitMem) basis. Our localization methods are independent of the downstream task, do not require any label information, and can be performed in a forward pass. By localizing memorization in various encoder architectures (convolutional and transformer-based) trained on diverse datasets with contrastive and non-contrastive SSL frameworks, we find that (1) while SSL memorization increases with layer depth, highly memorizing units are distributed across the entire encoder, (2) a significant fraction of units in SSL encoders experiences surprisingly high memorization of individual data points, in contrast to models trained under supervision, (3) atypical (or outlier) data points cause much higher layer and unit memorization than standard data points, and (4) in vision transformers, most memorization happens in the fully-connected layers. Finally, we show that localizing memorization in SSL has the potential to improve fine-tuning and to inform pruning strategies.

Primary Research Area

  • Trustworthy Information Processing

Name of Conference

Conference on Neural Information Processing Systems (NeurIPS)

CISPA Affiliation

  • Yes

BibTeX

@conference{Wang:Dziedzic:Backes:Boenisch:2024,
  title = "Localizing Memorization in SSL Vision Encoders",
  author = "Wang, Wenhao and Dziedzic, Adam and Backes, Michael and Boenisch, Franziska",
  year = 2024,
  month = 1
}
