CISPA

CampER: An Effective Framework for Privacy-Aware Deep Entity Resolution

conference contribution
posted on 2024-04-05, 10:44 authored by Yuxiang Guo, Lu Chen, Zhengjie Zhou, Baihua Zheng, Ziquan Fang, Zhikun Zhang, Yuren Mao, Yunjun Gao
Entity Resolution (ER) is a fundamental problem in data preparation. Standard deep ER methods have achieved state-of-the-art effectiveness, but they assume that relations from different organizations are centrally stored. In practice, privacy concerns can make it difficult to centralize data, rendering standard deep ER solutions inapplicable. Rule-based privacy-preserving ER methods have been developed, but they often neglect subtle matching mechanisms and have poor effectiveness as a result. To bridge effectiveness and privacy, in this paper, we propose CampER, an effective framework for privacy-aware deep entity resolution. Specifically, we first design a training pair self-generation strategy to overcome the absence of manually labeled data in privacy-aware scenarios. Based on the self-constructed training pairs, we present a collaborative fine-tuning approach to learn match-aware and uni-space individual tuple embeddings for accurate matching decisions. For the matching decision-making process, we first introduce a cryptographically secure approach to determine matches. Furthermore, we propose an order-preserving perturbation strategy that significantly accelerates the matching computation while guaranteeing the consistency of ER results. Extensive experiments on eight widely used benchmark datasets demonstrate that CampER is not only comparable to state-of-the-art standard deep ER solutions in effectiveness, but also preserves privacy.

Primary Research Area

  • Algorithmic Foundations and Cryptography

Name of Conference

ACM International Conference on Knowledge Discovery and Data Mining (KDD)

Journal

Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Page Range

626-637

Publisher

Association for Computing Machinery (ACM)

Open Access Type

  • Green

BibTeX

@conference{Guo:Chen:Zhou:Zheng:Fang:Zhang:Mao:Gao:2023,
  title     = "CampER: An Effective Framework for Privacy-Aware Deep Entity Resolution",
  author    = "Guo, Yuxiang and Chen, Lu and Zhou, Zhengjie and Zheng, Baihua and Fang, Ziquan and Zhang, Zhikun and Mao, Yuren and Gao, Yunjun",
  year      = 2023,
  month     = 8,
  booktitle = "Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining",
  pages     = "626--637",
  publisher = "Association for Computing Machinery (ACM)",
  doi       = "10.1145/3580305.3599266"
}