TY - GEN
T1 - Link prediction by de-anonymization
T2 - 2011 International Joint Conference on Neural Network, IJCNN 2011
AU - Narayanan, Arvind
AU - Shi, Elaine
AU - Rubinstein, Benjamin I.P.
PY - 2011
Y1 - 2011
N2 - This paper describes the winning entry to the IJCNN 2011 Social Network Challenge run by Kaggle.com. The goal of the contest was to promote research on real-world link prediction, and the dataset was a graph obtained by crawling the popular Flickr social photo sharing website, with user identities scrubbed. By de-anonymizing much of the competition test set using our own Flickr crawl, we were able to effectively game the competition. Our attack represents a new application of de-anonymization to gaming machine learning contests, suggesting changes in how future competitions should be run. We introduce a new simulated annealing-based weighted graph matching algorithm for the seeding step of de-anonymization. We also show how to combine de-anonymization with link predictionthe latter is required to achieve good performance on the portion of the test set not de-anonymizedfor example by training the predictor on the de-anonymized portion of the test set, and combining probabilistic predictions from de-anonymization and link prediction.
AB - This paper describes the winning entry to the IJCNN 2011 Social Network Challenge run by Kaggle.com. The goal of the contest was to promote research on real-world link prediction, and the dataset was a graph obtained by crawling the popular Flickr social photo sharing website, with user identities scrubbed. By de-anonymizing much of the competition test set using our own Flickr crawl, we were able to effectively game the competition. Our attack represents a new application of de-anonymization to gaming machine learning contests, suggesting changes in how future competitions should be run. We introduce a new simulated annealing-based weighted graph matching algorithm for the seeding step of de-anonymization. We also show how to combine de-anonymization with link predictionthe latter is required to achieve good performance on the portion of the test set not de-anonymizedfor example by training the predictor on the de-anonymized portion of the test set, and combining probabilistic predictions from de-anonymization and link prediction.
UR - http://www.scopus.com/inward/record.url?scp=80054758441&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80054758441&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2011.6033446
DO - 10.1109/IJCNN.2011.6033446
M3 - Conference contribution
AN - SCOPUS:80054758441
SN - 9781457710865
T3 - Proceedings of the International Joint Conference on Neural Networks
SP - 1825
EP - 1834
BT - 2011 International Joint Conference on Neural Networks, IJCNN 2011 - Final Program
Y2 - 31 July 2011 through 5 August 2011
ER -