TY - GEN
T1 - TextHide: Tackling Data Privacy in Language Understanding Tasks
T2 - Findings of the Association for Computational Linguistics: EMNLP 2020
AU - Huang, Yangsibo
AU - Song, Zhao
AU - Chen, Danqi
AU - Li, Kai
AU - Arora, Sanjeev
N1 - Publisher Copyright:
©2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
N2 - An unsolved challenge in distributed or federated learning is to effectively mitigate privacy risks without slowing down training or reducing accuracy. In this paper, we propose TextHide, which aims to address this challenge for natural language understanding tasks. It requires all participants to add a simple encryption step to prevent an eavesdropping attacker from recovering private text data. This encryption step is efficient and affects task performance only slightly. In addition, TextHide fits well with the popular framework of fine-tuning pre-trained language models (e.g., BERT) for any sentence or sentence-pair task. We evaluate TextHide on the GLUE benchmark, and our experiments show that it can effectively defend against attacks on shared gradients or representations, with an average accuracy reduction of only 1.9%. We also present an analysis of the security of TextHide using a conjecture about the computational intractability of a mathematical problem.
UR - http://www.scopus.com/inward/record.url?scp=85098261093&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098261093&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85098261093
T3 - Findings of the Association for Computational Linguistics: EMNLP 2020
SP - 1368
EP - 1382
BT - Findings of the Association for Computational Linguistics: EMNLP 2020
PB - Association for Computational Linguistics (ACL)
Y2 - 16 November 2020 through 20 November 2020
ER -