Learning Informative and Private Representations via Generative Adversarial Networks

Tsung Yen Yang, Christopher Brinton, Prateek Mittal, Mung Chiang, Andrew Lan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

It is of crucial importance to simultaneously protect against sensitive attributes in data while building predictive models. In this paper, we tackle the problem of learning representations from raw data that are i) informative and predictive of desirable variables, and ii) private and protect against adversaries that attempt to recover sensitive variables. We cast this problem under the generative adversarial network (GAN) framework and design three components: an encoder, an ally that predicts the desired variables, and an adversary that predicts the sensitive ones. As a use case, we apply our approach to learn representations of raw student clickstream event data captured as they watch lecture videos in massive open online courses (MOOCs). Through experiments on a real-world dataset collected from a MOOC, we demonstrate that our method can learn a low-dimensional representation of each user that i) excels at classifying whether a user will answer a quiz question correctly, and ii) prevents an adversary from recovering each user's identity. Our results indicate that our approach is effective in learning representations that are both informative and private.
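The three components named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's loss functions, network sizes, or training schedule, so everything below is an assumption made for illustration: single-layer linear models, cross-entropy losses, a hypothetical trade-off weight `lam`, and the helper names `linear`, `cross_entropy`, `losses`, and `encoder_objective`. It shows only how the encoder's objective might combine the ally's loss (to be kept low) with the adversary's loss (to be kept high).

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def linear(weights, x):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def cross_entropy(logits, label):
    """Negative log-likelihood of the true class `label`."""
    return -math.log(softmax(logits)[label])


def init(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def losses(params, x, y_desired, y_sensitive):
    """Encode x once, then score the representation with both predictors."""
    z = [math.tanh(v) for v in linear(params["encoder"], x)]
    ally_loss = cross_entropy(linear(params["ally"], z), y_desired)
    adversary_loss = cross_entropy(linear(params["adversary"], z), y_sensitive)
    return ally_loss, adversary_loss


def encoder_objective(ally_loss, adversary_loss, lam=1.0):
    """Assumed encoder objective: help the ally, hinder the adversary."""
    return ally_loss - lam * adversary_loss


# Toy dimensions (assumptions): 8 input features, 3-d representation,
# 2 quiz outcomes (desired variable), 5 user identities (sensitive variable).
rng = random.Random(0)
params = {
    "encoder": init(3, 8, rng),
    "ally": init(2, 3, rng),
    "adversary": init(5, 3, rng),
}
x = [rng.random() for _ in range(8)]
ally_loss, adversary_loss = losses(params, x, y_desired=1, y_sensitive=3)
objective = encoder_objective(ally_loss, adversary_loss, lam=0.5)
```

In an adversarial training loop of this kind, the ally and the adversary would each be updated to reduce their own loss, while the encoder would be updated to reduce `objective`; the sketch stops short of gradient updates because the abstract does not describe them.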

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018
Editors: Yang Song, Bing Liu, Kisung Lee, Naoki Abe, Calton Pu, Mu Qiao, Nesreen Ahmed, Donald Kossmann, Jeffrey Saltz, Jiliang Tang, Jingrui He, Huan Liu, Xiaohua Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1534-1543
Number of pages: 10
ISBN (Electronic): 9781538650356
State: Published - Jan 22 2019
Event: 2018 IEEE International Conference on Big Data, Big Data 2018 - Seattle, United States
Duration: Dec 10 2018 → Dec 13 2018

Publication series

Name: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018

Conference

Conference: 2018 IEEE International Conference on Big Data, Big Data 2018
Country/Territory: United States
City: Seattle
Period: 12/10/18 → 12/13/18

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Information Systems

Keywords

  • Generative adversarial networks
  • Massive open online courses
  • Predictive models
  • Privacy
