Competing Bandits in Matching Markets

Lydia T. Liu, Horia Mania, Michael I. Jordan

Research output: Contribution to journal › Conference article › peer-review

37 Scopus citations


Stable matching, a classical model for two-sided markets, has long been studied under the assumption of known preferences. In reality, agents often have to learn their preferences through exploration. With the advent of massive online markets powered by data-driven matching platforms, it has become necessary to better understand the interplay between learning and market objectives. We propose a statistical learning model in which one side of the market does not have a priori knowledge of its preferences over the other side and must learn them from stochastic rewards. Our model extends the standard multi-armed bandit framework to multiple players, with the added feature that arms have preferences over players. We study both centralized and decentralized approaches to this problem and show surprising exploration-exploitation trade-offs compared to the single-player multi-armed bandit setting.
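The centralized approach described in the abstract can be illustrated with a short sketch: players maintain UCB indices for each arm, and a platform computes a player-optimal stable matching (player-proposing deferred acceptance) treating the UCB indices as players' preferences, while arms' preferences over players are fixed and known. All function names, parameters, and the noise model below are illustrative assumptions, not the paper's exact algorithm or notation.

```python
import math
import random

def gale_shapley(player_prefs, arm_prefs):
    """Player-proposing deferred acceptance.
    player_prefs[p]: list of arms in descending order of player p's preference.
    arm_prefs[a][p]: arm a's rank of player p (lower = more preferred)."""
    n = len(player_prefs)
    next_choice = [0] * n      # index of the next arm each player will propose to
    match_of_arm = {}          # arm -> player currently tentatively matched
    free = list(range(n))
    while free:
        p = free.pop()
        a = player_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if a not in match_of_arm:
            match_of_arm[a] = p
        elif arm_prefs[a][p] < arm_prefs[a][match_of_arm[a]]:
            free.append(match_of_arm[a])   # arm a trades up, rejecting its current player
            match_of_arm[a] = p
        else:
            free.append(p)                 # arm a rejects the proposal
    return {p: a for a, p in match_of_arm.items()}

def centralized_ucb(means, arm_prefs, horizon, seed=0):
    """Hypothetical centralized UCB loop for a matching market.
    means[p][a]: true (unknown) mean reward of arm a for player p.
    Each round, the platform matches players to arms by running
    deferred acceptance on the players' current UCB indices."""
    rng = random.Random(seed)
    n, k = len(means), len(means[0])
    counts = [[0] * k for _ in range(n)]
    sums = [[0.0] * k for _ in range(n)]
    for t in range(1, horizon + 1):
        # optimistic index: empirical mean plus an exploration bonus
        ucb = [[float('inf') if counts[p][a] == 0 else
                sums[p][a] / counts[p][a] + math.sqrt(2 * math.log(t) / counts[p][a])
                for a in range(k)] for p in range(n)]
        player_prefs = [sorted(range(k), key=lambda a, p=p: -ucb[p][a])
                        for p in range(n)]
        match = gale_shapley(player_prefs, arm_prefs)
        for p, a in match.items():
            reward = means[p][a] + rng.gauss(0, 0.1)   # illustrative Gaussian noise
            counts[p][a] += 1
            sums[p][a] += reward
    return counts
```

With well-separated means, the pull counts concentrate on each player's stable-matched arm, reflecting the exploration-exploitation trade-off the paper analyzes in the multi-player setting.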

Original language: English (US)
Pages (from-to): 1618-1628
Number of pages: 11
Journal: Proceedings of Machine Learning Research
State: Published - 2020
Externally published: Yes
Event: 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020 - Virtual, Online
Duration: Aug 26 2020 to Aug 28 2020

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability


