Link prediction by de-anonymization: How We Won the Kaggle Social Network Challenge

Arvind Narayanan, Elaine Shi, Benjamin I.P. Rubinstein

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

113 Scopus citations

Abstract

This paper describes the winning entry to the IJCNN 2011 Social Network Challenge run by Kaggle.com. The goal of the contest was to promote research on real-world link prediction, and the dataset was a graph obtained by crawling the popular Flickr social photo sharing website, with user identities scrubbed. By de-anonymizing much of the competition test set using our own Flickr crawl, we were able to effectively game the competition. Our attack represents a new application of de-anonymization to gaming machine learning contests, suggesting changes in how future competitions should be run. We introduce a new simulated annealing-based weighted graph matching algorithm for the seeding step of de-anonymization. We also show how to combine de-anonymization with link prediction (the latter is required to achieve good performance on the portion of the test set not de-anonymized), for example by training the predictor on the de-anonymized portion of the test set, and combining probabilistic predictions from de-anonymization and link prediction.
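
The abstract names simulated annealing-based weighted graph matching but gives no details, so the following is only a generic sketch of that idea and not the authors' algorithm. It assumes two small symmetric weight matrices over the same number of nodes, a cost equal to the summed absolute weight difference under a candidate node mapping, pairwise swaps as moves, and a geometric cooling schedule. The names `mismatch` and `anneal_match`, the toy matrices, and all parameter values are hypothetical.

```python
import math
import random


def mismatch(a, b, perm):
    """Summed absolute weight difference when node i of `a` maps to perm[i] of `b`."""
    n = len(a)
    return sum(abs(a[i][j] - b[perm[i]][perm[j]])
               for i in range(n) for j in range(i + 1, n))


def anneal_match(a, b, steps=5000, t_start=1.0, t_end=0.01, seed=0):
    """Search for a low-mismatch node mapping by simulated annealing (illustrative)."""
    rng = random.Random(seed)
    n = len(a)
    perm = list(range(n))            # start from the identity mapping
    cost = mismatch(a, b, perm)
    best, best_cost = perm[:], cost  # remember the best mapping seen so far
    for step in range(steps):
        # Geometric cooling from t_start towards t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]          # propose a swap
        new_cost = mismatch(a, b, perm)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cost = new_cost                          # accept the move
            if cost < best_cost:
                best, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]      # reject: undo the swap
    return best, best_cost


# Toy data: `b` is `a` with its nodes relabelled by a hidden permutation.
a = [[0.0, 0.9, 0.1, 0.4],
     [0.9, 0.0, 0.5, 0.2],
     [0.1, 0.5, 0.0, 0.7],
     [0.4, 0.2, 0.7, 0.0]]
hidden = [2, 0, 3, 1]
b = [[0.0] * 4 for _ in range(4)]
for i in range(4):
    for j in range(4):
        b[hidden[i]][hidden[j]] = a[i][j]

best, best_cost = anneal_match(a, b)
print(best, best_cost)
```

Because the best mapping seen so far is tracked separately from the current state, the returned cost is never worse than that of the starting mapping, even though annealing occasionally accepts uphill moves. A matching of this kind among a handful of nodes would serve only as a seed; the abstract indicates that the full attack goes on to de-anonymize much more of the test set and to combine the result with a link predictor.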

Original language: English (US)
Title of host publication: 2011 International Joint Conference on Neural Networks, IJCNN 2011 - Final Program
Pages: 1825-1834
Number of pages: 10
State: Published - 2011
Externally published: Yes
Event: 2011 International Joint Conference on Neural Networks, IJCNN 2011 - San Jose, CA, United States
Duration: Jul 31 2011 to Aug 5 2011

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks

Other

Other: 2011 International Joint Conference on Neural Networks, IJCNN 2011
Country/Territory: United States
City: San Jose, CA
Period: 7/31/11 to 8/5/11

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
