Two-in-One: A Model Hijacking Attack Against Text Generation Models

conference contribution
posted on 2024-02-09, 09:21, authored by Wai Man Si, Michael Backes, Yang Zhang, Ahmed Salem
Machine learning has progressed significantly in various applications ranging from face recognition to text generation. However, its success has been accompanied by different attacks. Recently, a new attack has been proposed that raises both accountability and parasitic computing risks, namely the model hijacking attack. So far, however, this attack has focused only on image classification tasks. In this work, we broaden the scope of this attack to include text generation and classification models, thereby showing its broader applicability. More concretely, we propose a new model hijacking attack, Ditto, that can hijack models for multiple text generation tasks, e.g., language translation, text summarization, and language modeling, to carry out different text classification tasks. We use a range of text benchmark datasets such as SST-2, TweetEval, AGnews, QNLI, and IMDB to evaluate the performance of our attacks. Our results show that by using Ditto, an adversary can successfully hijack text generation models without jeopardizing their utility.

Primary Research Area

  • Trustworthy Information Processing

Name of Conference

USENIX Security Symposium (USENIX Security)

Page Range

2223-2240

Publisher

USENIX

BibTeX

@conference{Si:Backes:Zhang:Salem:2023,
  title     = "Two-in-One: A Model Hijacking Attack Against Text Generation Models",
  author    = "Si, W and Backes, M and Zhang, Y and Salem, Ahmed",
  year      = 2023,
  month     = 5,
  booktitle = "USENIX Security Symposium (USENIX Security)",
  pages     = "2223--2240",
  publisher = "USENIX"
}
