TY - GEN
T1 - Self-supervised euphemism detection and identification for content moderation
AU - Zhu, Wanzheng
AU - Gong, Hongyu
AU - Bansal, Rohan
AU - Weinberg, Zachary
AU - Christin, Nicolas
AU - Fanti, Giulia
AU - Bhat, Suma
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/5
Y1 - 2021/5
N2 - Fringe groups and organizations have a long history of using euphemisms - ordinary-sounding words with a secret meaning - to conceal what they are discussing. Nowadays, one common use of euphemisms is to evade content moderation policies enforced by social media platforms. Existing tools for enforcing policy automatically rely on keyword searches for words on a "ban list", but these are notoriously imprecise: even when limited to swearwords, they can still cause embarrassing false positives [1]. When a commonly used ordinary word acquires a euphemistic meaning, adding it to a keyword-based ban list is hopeless: consider "pot"(storage container or marijuana?) or "heater"(household appliance or firearm?) The current generation of social media companies instead hire staff to check posts manually, but this is expensive, inhumane, and not much more effective. It is usually apparent to a human moderator that a word is being used euphemistically, but they may not know what the secret meaning is, and therefore whether the message violates policy. Also, when a euphemism is banned, the group that used it need only invent another one, leaving moderators one step behind.This paper will demonstrate unsupervised algorithms that, by analyzing words in their sentence-level context, can both detect words being used euphemistically, and identify the secret meaning of each word. Compared to the existing state of the art, which uses context-free word embeddings, our algorithm for detecting euphemisms achieves 30-400% higher detection accuracies of unlabeled euphemisms in a text corpus. Our algorithm for revealing euphemistic meanings of words is the first of its kind, as far as we are aware. In the arms race between content moderators and policy evaders, our algorithms may help shift the balance in the direction of the moderators.
AB - Fringe groups and organizations have a long history of using euphemisms - ordinary-sounding words with a secret meaning - to conceal what they are discussing. Nowadays, one common use of euphemisms is to evade content moderation policies enforced by social media platforms. Existing tools for enforcing policy automatically rely on keyword searches for words on a "ban list", but these are notoriously imprecise: even when limited to swearwords, they can still cause embarrassing false positives [1]. When a commonly used ordinary word acquires a euphemistic meaning, adding it to a keyword-based ban list is hopeless: consider "pot"(storage container or marijuana?) or "heater"(household appliance or firearm?) The current generation of social media companies instead hire staff to check posts manually, but this is expensive, inhumane, and not much more effective. It is usually apparent to a human moderator that a word is being used euphemistically, but they may not know what the secret meaning is, and therefore whether the message violates policy. Also, when a euphemism is banned, the group that used it need only invent another one, leaving moderators one step behind.This paper will demonstrate unsupervised algorithms that, by analyzing words in their sentence-level context, can both detect words being used euphemistically, and identify the secret meaning of each word. Compared to the existing state of the art, which uses context-free word embeddings, our algorithm for detecting euphemisms achieves 30-400% higher detection accuracies of unlabeled euphemisms in a text corpus. Our algorithm for revealing euphemistic meanings of words is the first of its kind, as far as we are aware. In the arms race between content moderators and policy evaders, our algorithms may help shift the balance in the direction of the moderators.
KW - Coarse-to-fine-grained classification
KW - Euphemism detection
KW - Euphemism identification
KW - Masked Language Model (MLM)
KW - Self-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85115055034&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115055034&partnerID=8YFLogxK
U2 - 10.1109/SP40001.2021.00075
DO - 10.1109/SP40001.2021.00075
M3 - Conference contribution
AN - SCOPUS:85115055034
T3 - Proceedings - IEEE Symposium on Security and Privacy
SP - 229
EP - 246
BT - Proceedings - 2021 IEEE Symposium on Security and Privacy, SP 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 42nd IEEE Symposium on Security and Privacy, SP 2021
Y2 - 24 May 2021 through 27 May 2021
ER -