Membership inference attacks against adversarially robust deep learning models

Liwei Song, Reza Shokri, Prateek Mittal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

51 Scopus citations


In recent years, the research community has increasingly focused on understanding the security and privacy challenges posed by deep learning models. However, the security domain and the privacy domain have typically been considered separately. It is thus unclear whether the defense methods in one domain will have any unexpected impact on the other domain. In this paper, we take a step towards enhancing our understanding of deep learning models when the two domains are combined. We do this by measuring the success of membership inference attacks against two state-of-the-art adversarial defense methods that mitigate evasion attacks: adversarial training and provable defense. On the one hand, membership inference attacks aim to infer an individual's participation in the target model's training dataset and are known to be correlated with the target model's overfitting. On the other hand, adversarial defense methods aim to enhance the robustness of target models by ensuring that model predictions are unchanged for a small area around each sample in the training dataset. Intuitively, adversarial defenses may rely more on the training dataset and be more vulnerable to membership inference attacks. By performing empirical membership inference attacks on both adversarially robust models and corresponding undefended models, we find that the adversarial training method is indeed more susceptible to membership inference attacks, and the privacy leakage is directly correlated with model robustness. We also find that the provable defense approach does not lead to enhanced success of membership inference attacks. However, this is achieved by significantly sacrificing the accuracy of the model on benign data points, indicating that privacy, security, and prediction accuracy are not jointly achieved in these two approaches.
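To make the link between membership inference and overfitting concrete, here is a minimal Python sketch of a confidence-threshold attack. The abstract does not say which attack the authors use, so the thresholding rule, the function names, and the confidence values below are all assumptions chosen for illustration, not the paper's method or results. The idea is that an overfitted model tends to be more confident on its training points than on unseen points, and an attacker can exploit that gap.

```python
from typing import List, Sequence


def infer_membership(confidences: Sequence[float], threshold: float) -> List[bool]:
    """Guess 'member' when the model's confidence on the true label
    is at least the threshold (hypothetical attack rule)."""
    return [c >= threshold for c in confidences]


def attack_accuracy(member_conf: Sequence[float],
                    nonmember_conf: Sequence[float],
                    threshold: float) -> float:
    """Balanced accuracy of the threshold attack on points whose
    true membership status is known."""
    # Fraction of true members that the attack labels as members.
    tpr = sum(infer_membership(member_conf, threshold)) / len(member_conf)
    # Fraction of true non-members that the attack labels as non-members.
    guesses = infer_membership(nonmember_conf, threshold)
    tnr = sum(not g for g in guesses) / len(nonmember_conf)
    return 0.5 * (tpr + tnr)


# Hypothetical confidences, not measurements from the paper.
train_conf = [0.99, 0.97, 0.95, 0.90]  # training points (members)
test_conf = [0.92, 0.70, 0.55, 0.40]   # held-out points (non-members)

print(attack_accuracy(train_conf, test_conf, threshold=0.9))  # prints 0.875
```

An accuracy of 0.5 would mean the attacker does no better than random guessing; the wider the confidence gap between training and held-out points, the further above 0.5 this kind of attack can get.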

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE Symposium on Security and Privacy Workshops, SPW 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 7
ISBN (Electronic): 9781728135083
State: Published - May 2019
Event: 2019 IEEE Symposium on Security and Privacy Workshops, SPW 2019 - San Francisco, United States
Duration: May 23 2019 → …

Publication series

Name: Proceedings - 2019 IEEE Symposium on Security and Privacy Workshops, SPW 2019


Conference: 2019 IEEE Symposium on Security and Privacy Workshops, SPW 2019
Country/Territory: United States
City: San Francisco
Period: 5/23/19 → …

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Safety, Risk, Reliability and Quality


Keywords

  • Adversarial defense
  • Membership inference attack
  • Security and privacy

