
On Adversarial Training without Perturbing all Examples

conference contribution
Posted on 2024-04-03, 11:21. Authored by Max Losch, Mohamed Omran, David Stutz, Mario Fritz and Bernt Schiele.
Adversarial Training (AT) is the de-facto standard for improving robustness against adversarial examples. This usually involves applying a multi-step adversarial attack to each example during training. In this paper, we explore constructing Adversarial Examples (AEs) on only a subset of the training examples. That is, we split the training set into two subsets A and B, train models on both (A ∪ B), but construct AEs only for examples in A. Starting with A containing only a single class, we systematically increase the size of A, considering splits by class and by example. We observe that: (i) adversarial robustness transfers by difficulty and to classes in B that have never been adversarially attacked during training; (ii) hard examples tend to provide better robustness transfer than easy examples, although this tendency diminishes with increasing dataset complexity; (iii) generating AEs on only 50% of the training data is sufficient to recover most of the baseline AT performance, even on ImageNet. We observe similar transfer properties across tasks, where generating AEs on only 30% of the data can recover baseline robustness on the target task. We evaluate our subset analysis on a wide variety of image datasets, including CIFAR-10, CIFAR-100 and ImageNet-200, and show transfer to SVHN, Oxford-Flowers-102 and Caltech-256. In contrast to conventional practice, our experiments indicate that the utility of computing AEs varies by class and by example, and that weighting examples from A higher than those from B yields high transfer performance. Code is available at
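The core recipe in the abstract — split the training set into A and B, attack only A, and weight A's loss higher — can be sketched in a few lines. The following is a minimal, hypothetical illustration on a toy logistic-regression problem with a single-step FGSM-style attack, not the paper's actual multi-step setup; the 50% split and the weighting values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data; a hypothetical stand-in for image data.
X = rng.normal(size=(200, 10))
w_true = rng.normal(size=10)
y = (X @ w_true > 0).astype(float)

# Split into subset A (adversarially perturbed) and subset B (kept clean).
perm = rng.permutation(len(X))
in_A = np.zeros(len(X), dtype=bool)
in_A[perm[: len(X) // 2]] = True  # 50% split, one of the splits explored

w = np.zeros(10)
eps, lr = 0.1, 0.1
weight_A, weight_B = 2.0, 1.0  # weight A's examples higher (illustrative)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    Xp = X.copy()
    # Single-step FGSM-style attack, applied only to examples in A.
    p = sigmoid(Xp @ w)
    grad_x = (p - y)[:, None] * w[None, :]   # dLoss/dx for logistic loss
    Xp[in_A] += eps * np.sign(grad_x[in_A])
    # Weighted gradient step on the mixed batch (perturbed A + clean B).
    p = sigmoid(Xp @ w)
    sample_w = np.where(in_A, weight_A, weight_B)
    grad_w = (sample_w * (p - y)) @ Xp / len(X)
    w -= lr * grad_w

clean_acc = ((sigmoid(X @ w) > 0.5) == y.astype(bool)).mean()
```

In the paper's setting the attack would be a multi-step one (e.g. PGD) on a deep network, but the structure — mask the attack to A, merge with clean B, reweight the loss — is the same.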


Primary Research Area

  • Trustworthy Information Processing

Name of Conference

The Twelfth International Conference on Learning Representations (ICLR)



Open Access Type

  • Gold


@conference{Losch:Omran:Stutz:Fritz:Schiele:2024,
  title     = "On Adversarial Training without Perturbing all Examples",
  author    = "Losch, Max and Omran, Mohamed and Stutz, David and Fritz, Mario and Schiele, Bernt",
  year      = 2024,
  month     = 1,
  booktitle = "The Twelfth International Conference on Learning Representations (ICLR)"
}
