CISPA

Sentence Embedding Encoders are Easy to Steal but Hard to Defend

conference contribution
posted on 2024-02-26, 11:07 authored by Adam Dziedzic, Franziska Boenisch, Mingjian Jiang, Haonan Duan, Nicolas Papernot
Self-supervised learning (SSL) has become the predominant approach to training on large amounts of data when no labels are available. Because the corresponding model architectures are usually large, training is costly and relies on dedicated, expensive hardware. As a consequence, not every party can train such models from scratch. Instead, new APIs offer paid access to pre-trained SSL models. We consider transformer-based SSL sentence encoders and show that they can be efficiently extracted (stolen) from behind these APIs through black-box query access. Our attack requires up to 40x fewer queries than the victim has training data points, and much less computation. This large gap between low attack costs and high victim training costs strongly incentivizes attackers to steal encoders. To protect transformer-based sentence encoders against stealing, we propose embedding secret downstream tasks into their training, which serve as watermarks. In general, our work highlights that sentence embedding encoders are easily stolen but hard to defend.
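
To make the idea of extraction through black-box query access concrete, the following is a minimal toy sketch, not the authors' method. It assumes a hypothetical linear "victim" encoder hidden behind a function `victim_api`, random feature vectors standing in for encoded sentences, and a least-squares fit as the attacker's training step. All names, sizes, and the choice of loss are illustrative assumptions. The abstract does not specify these details, and real sentence encoders are transformers rather than linear maps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: input feature dimension and embedding dimension.
DIM_IN, DIM_OUT = 16, 8

# Stand-in for the victim encoder's private weights (assumption: linear model).
_W_victim = rng.normal(size=(DIM_IN, DIM_OUT))


def victim_api(x):
    """Black-box access: the caller sees embeddings only, never the weights."""
    return x @ _W_victim


def steal_encoder(num_queries):
    """Fit a student encoder to the victim's outputs on attacker-chosen queries."""
    # Attacker-chosen inputs (random vectors standing in for sentences).
    queries = rng.normal(size=(num_queries, DIM_IN))
    # One API call per query; these responses are the only training signal.
    embeddings = victim_api(queries)
    # Least squares minimizes the squared distance between
    # student embeddings and victim embeddings on the queried inputs.
    W_student, *_ = np.linalg.lstsq(queries, embeddings, rcond=None)
    return W_student


W_student = steal_encoder(num_queries=64)

# Compare student and victim on inputs that were never queried.
held_out = rng.normal(size=(100, DIM_IN))
error = np.mean((held_out @ W_student - victim_api(held_out)) ** 2)
print(f"mean squared error on held-out inputs: {error:.2e}")
```

In this toy setting the student reproduces the victim's embeddings on unseen inputs using only query responses, which is the cost asymmetry the abstract points to: the attacker needs API outputs, not the victim's training data or training compute.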

Primary Research Area

  • Trustworthy Information Processing

Name of Conference

International Conference on Learning Representations (ICLR)

Journal

ICLR 2023 Workshop on Trustworthy ML

BibTeX

@conference{Dziedzic:Boenisch:Jiang:Duan:Papernot:2023,
  title     = "Sentence Embedding Encoders are Easy to Steal but Hard to Defend",
  author    = "Dziedzic, Adam and Boenisch, Franziska and Jiang, Mingjian and Duan, Haonan and Papernot, Nicolas",
  year      = 2023,
  month     = 5,
  booktitle = "ICLR 2023 Workshop on Trustworthy ML"
}
