CAN WE FIND NASH EQUILIBRIA AT A LINEAR RATE IN MARKOV GAMES?

Zhuoqing Song, Jason D. Lee, Zhuoran Yang

Research output: Contribution to conferencePaperpeer-review

3 Scopus citations

Abstract

We study decentralized learning in two-player zero-sum discounted Markov games where the goal is to design a policy optimization algorithm for either agent satisfying two properties. First, the player does not need to know the policy of the opponent to update its policy. Second, when both players adopt the algorithm, their joint policy converges to a Nash equilibrium of the game. To this end, we construct a meta algorithm, dubbed as Homotopy-PO, which provably finds a Nash equilibrium at a global linear rate. In particular, Homotopy-PO interweaves two base algorithms Local-Fast and Global-Slow via homotopy continuation. Local-Fast is an algorithm that enjoys local linear convergence while Global-Slow is an algorithm that converges globally but at a slower sublinear rate. By switching between these two base algorithms, Global-Slow essentially serves as a “guide” which identifies a benign neighborhood where Local-Fast enjoys fast convergence. However, since the exact size of such a neighborhood is unknown, we apply a doubling trick to switch between these two base algorithms. The switching scheme is delicately designed so that the aggregated performance of the algorithm is driven by Local-Fast. Furthermore, we prove that Local-Fast and Global-Slow can both be instantiated by variants of optimistic gradient descent/ascent (OGDA) method, which is of independent interest.

Original languageEnglish (US)
StatePublished - 2023
Event11th International Conference on Learning Representations, ICLR 2023 - Kigali, Rwanda
Duration: May 1 2023May 5 2023

Conference

Conference11th International Conference on Learning Representations, ICLR 2023
Country/TerritoryRwanda
CityKigali
Period5/1/235/5/23

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Computer Science Applications
  • Education
  • Linguistics and Language

Fingerprint

Dive into the research topics of 'CAN WE FIND NASH EQUILIBRIA AT A LINEAR RATE IN MARKOV GAMES?'. Together they form a unique fingerprint.

Cite this