TY - GEN
T1 - How to quantify graph De-anonymization risks
AU - Lee, Wei Han
AU - Liu, Changchang
AU - Ji, Shouling
AU - Mittal, Prateek
AU - Lee, Ruby B.
N1 - Publisher Copyright: © Springer International Publishing AG, part of Springer Nature 2018.
PY - 2018
Y1 - 2018
AB - An increasing amount of data is becoming publicly available over the Internet, typically released after anonymization techniques have been applied. Recently, researchers have paid significant attention to analyzing the risks of publishing privacy-sensitive data. Even when anonymization techniques are applied to protect such data, several de-anonymization attacks have been proposed to break their privacy. However, no theoretical quantification exists that relates the vulnerability of data to de-anonymization attacks with the data utility preserved by the anonymization techniques. In this paper, we address several fundamental open problems in structure-based de-anonymization research by establishing a formal model for privacy breaches on anonymized data and quantifying the conditions for successful de-anonymization under a general graph model. To the best of our knowledge, this is the first work to quantify the relationship between anonymized utility and de-anonymization capability. Our quantification holds under very general assumptions about the distribution from which the data are drawn, thus providing a theoretical guide for practical de-anonymization and anonymization techniques. Furthermore, we use multiple real-world datasets, including a Facebook dataset, a Collaboration dataset, and two Twitter datasets, to show the limitations of state-of-the-art de-anonymization attacks. By comparing our theoretical de-anonymization capability with the experimental results of state-of-the-art de-anonymization methods, we demonstrate the ineffectiveness of previous attacks and the potential for more powerful de-anonymization attacks in the future.
KW - Anonymization utility
KW - De-anonymization capability
KW - Structure-based de-anonymization attacks
KW - Theoretical bounds
UR - http://www.scopus.com/inward/record.url?scp=85049098005&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85049098005&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-93354-2_5
DO - 10.1007/978-3-319-93354-2_5
M3 - Conference contribution
AN - SCOPUS:85049098005
SN - 9783319933535
T3 - Communications in Computer and Information Science
SP - 84
EP - 104
BT - Information Systems Security and Privacy - 3rd International Conference, ICISSP 2017, Revised Selected Papers
A2 - Mori, Paolo
A2 - Camp, Olivier
A2 - Furnell, Steven
PB - Springer Verlag
T2 - 3rd International Conference on Information Systems Security and Privacy, ICISSP 2017
Y2 - 19 February 2017 through 21 February 2017
ER -