Active Learning for Discrete Latent Variable Models

Aditi Jha, Zoe C. Ashwood, Jonathan W. Pillow

Research output: Contribution to journalLetterpeer-review


Active learning seeks to reduce the amount of data required to fit the pa-rameters of a model, thus forming an important class of techniques in modern machine learning. However, past work on active learning has largely overlooked latent variable models, which play a vital role in neu-roscience, psychology, and a variety of other engineering and scientific disciplines. Here we address this gap by proposing a novel framework for maximum-mutual-information input selection for discrete latent variable regression models. We first apply our method to a class of models known as mixtures of linear regressions (MLR). While it is well known that active learning confers no advantage for linear-gaussian regression models, we use Fisher information to show analytically that active learning can nevertheless achieve large gains for mixtures of such models, and we val-idate this improvement using both simulations and real-world data. We then consider a powerful class of temporally structured latent variable models given by a hidden Markov model (HMM) with generalized linear model (GLM) observations, which has recently been used to identify discrete states from animal decision-making data. We show that our method substantially reduces the amount of data needed to fit GLM-HMMs and outperforms a variety of approximate methods based on variational and amortized inference. Infomax learning for latent variable models thus offers a powerful approach for characterizing temporally structured latent states, with a wide variety of applications in neuroscience and beyond.

Original languageEnglish (US)
Pages (from-to)437-474
Number of pages38
JournalNeural computation
Issue number3
StatePublished - Mar 2024

All Science Journal Classification (ASJC) codes

  • Arts and Humanities (miscellaneous)
  • Cognitive Neuroscience


Dive into the research topics of 'Active Learning for Discrete Latent Variable Models'. Together they form a unique fingerprint.

Cite this