TY - GEN
T1 - Cloak and swagger
T2 - 35th IEEE Symposium on Security and Privacy, SP 2014
AU - Peddinti, Sai Teja
AU - Korolova, Aleksandra
AU - Bursztein, Elie
AU - Sampemane, Geetanjali
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/11/13
Y1 - 2014/11/13
N2 - Most of what we understand about data sensitivity comes from user self-report (e.g., surveys). This paper is the first to use behavioral data to determine content sensitivity, drawing on the clues users give about what information they consider private or sensitive through their use of privacy-enhancing product features. We perform a large-scale analysis of user anonymity choices during their activity on Quora, a popular question-and-answer site. We identify categories of questions for which users are more likely to exercise anonymity and explore several machine learning approaches to predicting whether a particular answer will be written anonymously. Our findings validate the viability of the proposed approach to automatic assessment of data sensitivity, show that data sensitivity is a nuanced measure that should be viewed on a continuum rather than as a binary concept, and advance the idea that machine learning over behavioral data can be used effectively to develop product features that help keep users safe.
AB - Most of what we understand about data sensitivity comes from user self-report (e.g., surveys). This paper is the first to use behavioral data to determine content sensitivity, drawing on the clues users give about what information they consider private or sensitive through their use of privacy-enhancing product features. We perform a large-scale analysis of user anonymity choices during their activity on Quora, a popular question-and-answer site. We identify categories of questions for which users are more likely to exercise anonymity and explore several machine learning approaches to predicting whether a particular answer will be written anonymously. Our findings validate the viability of the proposed approach to automatic assessment of data sensitivity, show that data sensitivity is a nuanced measure that should be viewed on a continuum rather than as a binary concept, and advance the idea that machine learning over behavioral data can be used effectively to develop product features that help keep users safe.
UR - http://www.scopus.com/inward/record.url?scp=84914169536&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84914169536&partnerID=8YFLogxK
U2 - 10.1109/SP.2014.38
DO - 10.1109/SP.2014.38
M3 - Conference contribution
AN - SCOPUS:84914169536
T3 - Proceedings - IEEE Symposium on Security and Privacy
SP - 493
EP - 508
BT - Proceedings - IEEE Symposium on Security and Privacy
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 May 2014 through 21 May 2014
ER -