
On Adversarial Training without Perturbing all Examples

conference contribution
Posted on 2024-04-03, 11:21. Authored by Max Losch, Mohamed Omran, David Stutz, Mario Fritz and Bernt Schiele.
Adversarial Training (AT) is the de-facto standard for improving robustness against adversarial examples. This usually involves applying a multi-step adversarial attack to each example during training. In this paper, we explore constructing Adversarial Examples (AEs) on only a subset of the training examples. That is, we split the training set into two subsets A and B, train models on both (A ∪ B), but construct AEs only for examples in A. Starting with A containing only a single class, we systematically increase the size of A, considering splits by class and by example. We observe that: (i) adversarial robustness transfers by difficulty and to classes in B that have never been adversarially attacked during training; (ii) hard examples tend to provide better robustness transfer than easy examples, although this tendency diminishes with increasing dataset complexity; (iii) generating AEs on only 50% of the training data is sufficient to recover most of the baseline AT performance, even on ImageNet. We observe similar transfer properties across tasks, where generating AEs on only 30% of the data can recover baseline robustness on the target task. We evaluate our subset analysis on a wide variety of image datasets, including CIFAR-10, CIFAR-100 and ImageNet-200, and show transfer to SVHN, Oxford-Flowers-102 and Caltech-256. In contrast to conventional practice, our experiments indicate that the utility of computing AEs varies by class and by example, and that weighting examples from A higher than those from B yields high transfer performance. Code is available at
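The core recipe in the abstract — split the training set into A and B, attack only A, and weight A's loss higher — can be sketched in a few lines. The following is a minimal, hypothetical illustration on a toy logistic-regression problem with a single-step FGSM-style attack, not the paper's actual multi-step setup; the 50% split and the weighting values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data; a hypothetical stand-in for image data.
X = rng.normal(size=(200, 10))
w_true = rng.normal(size=10)
y = (X @ w_true > 0).astype(float)

# Split into subset A (adversarially perturbed) and subset B (kept clean).
perm = rng.permutation(len(X))
in_A = np.zeros(len(X), dtype=bool)
in_A[perm[: len(X) // 2]] = True  # 50% split, one of the splits explored

w = np.zeros(10)
eps, lr = 0.1, 0.1
weight_A, weight_B = 2.0, 1.0  # weight A's examples higher (illustrative)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    Xp = X.copy()
    # Single-step FGSM-style attack, applied only to examples in A.
    p = sigmoid(Xp @ w)
    grad_x = (p - y)[:, None] * w[None, :]   # dLoss/dx for logistic loss
    Xp[in_A] += eps * np.sign(grad_x[in_A])
    # Weighted gradient step on the mixed batch (perturbed A + clean B).
    p = sigmoid(Xp @ w)
    sample_w = np.where(in_A, weight_A, weight_B)
    grad_w = (sample_w * (p - y)) @ Xp / len(X)
    w -= lr * grad_w

clean_acc = ((sigmoid(X @ w) > 0.5) == y.astype(bool)).mean()
```

In the paper's setting the attack would be a multi-step one (e.g. PGD) on a deep network, but the structure — mask the attack to A, merge with clean B, reweight the loss — is the same.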


Primary Research Area

  • Trustworthy Information Processing

Name of Conference

The Twelfth International Conference on Learning Representations (ICLR)



Open Access Type

  • Gold


@conference{Losch:Omran:Stutz:Fritz:Schiele:2024,
  title     = "On Adversarial Training without Perturbing all Examples",
  author    = "Losch, Max and Omran, Mohamed and Stutz, David and Fritz, Mario and Schiele, Bernt",
  year      = 2024,
  month     = 1,
  booktitle = "The Twelfth International Conference on Learning Representations (ICLR)"
}
