TY - GEN
T1 - De-anonymizing web browsing data with social networks
AU - Su, Jessica
AU - Shukla, Ansh
AU - Goel, Sharad
AU - Narayanan, Arvind
N1 - Publisher Copyright: © 2017 International World Wide Web Conference Committee (IW3C2).
PY - 2017
Y1 - 2017
AB - Can online trackers and network adversaries de-anonymize web browsing data readily available to them? We show—theoretically, via simulation, and through experiments on real user data—that de-identified web browsing histories can be linked to social media profiles using only publicly available data. Our approach is based on a simple observation: each person has a distinctive social network, and thus the set of links appearing in one's feed is unique. Assuming users visit links in their feed with higher probability than a random user, browsing histories contain tell-tale marks of identity. We formalize this intuition by specifying a model of web browsing behavior and then deriving the maximum likelihood estimate of a user's social profile. We evaluate this strategy on simulated browsing histories, and show that given a history with 30 links originating from Twitter, we can deduce the corresponding Twitter profile more than 50% of the time. To gauge the real-world effectiveness of this approach, we recruited nearly 400 people to donate their web browsing histories, and we were able to correctly identify more than 70% of them. We further show that several online trackers are embedded on sufficiently many websites to carry out this attack with high accuracy. Our theoretical contribution applies to any type of transactional data and is robust to noisy observations, generalizing a wide range of previous de-anonymization attacks. Finally, since our attack attempts to find the correct Twitter profile out of over 300 million candidates, it is—to our knowledge—the largest-scale demonstrated de-anonymization to date.
UR - http://www.scopus.com/inward/record.url?scp=85051483706&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051483706&partnerID=8YFLogxK
U2 - 10.1145/3038912.3052714
DO - 10.1145/3038912.3052714
M3 - Conference contribution
AN - SCOPUS:85051483706
SN - 9781450349130
T3 - 26th International World Wide Web Conference, WWW 2017
SP - 1261
EP - 1269
BT - 26th International World Wide Web Conference, WWW 2017
PB - International World Wide Web Conferences Steering Committee
T2 - 26th International World Wide Web Conference, WWW 2017
Y2 - 3 April 2017 through 7 April 2017
ER -