Voting-Based Multiagent Reinforcement Learning for Intelligent IoT

Yue Xu, Zengde Deng, Mengdi Wang, Wenjun Xu, Anthony Man Cho So, Shuguang Cui

Research output: Contribution to journal › Article › peer-review

9 Scopus citations


The recent success of single-agent reinforcement learning (RL) in Internet of Things (IoT) systems motivates the study of multiagent RL (MARL), which is more challenging but more useful in large-scale IoT. In this article, we consider a voting-based MARL problem, in which the agents vote to make group decisions and the goal is to maximize the globally averaged returns. To this end, we formulate the MARL problem based on the linear programming form of the policy optimization problem and propose a primal-dual algorithm to obtain the optimal solution. We also propose a voting mechanism through which the distributed learning achieves the same sublinear convergence rate as centralized learning. In other words, the distributed decision making does not slow down the process of achieving global consensus on optimality. Finally, we verify the convergence of our proposed algorithm with numerical simulations and conduct case studies in practical multiagent IoT systems.
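
As a rough illustration of the linear programming view of policy optimization mentioned above, the block below writes out the textbook LP for a discounted Markov decision process together with its saddle-point (primal-dual) form. It is a generic single-agent sketch, not the formulation from the article. The symbols (state set $\mathcal{S}$, action set $\mathcal{A}$, transition kernel $P$, reward $r$, discount factor $\gamma$, initial distribution $\xi$, agent count $N$, per-agent rewards $r_i$) are assumptions introduced here, and the article's exact objective and constraints may differ.

```latex
% Assumed setting: discounted MDP (S, A, P, r, gamma) with initial
% distribution xi. This is the standard textbook LP, shown for illustration.

% Primal LP over the value function v:
\begin{align*}
\min_{v}\quad & (1-\gamma)\sum_{s\in\mathcal{S}} \xi(s)\, v(s) \\
\text{s.t.}\quad & v(s) \;\ge\; r(s,a) + \gamma \sum_{s'\in\mathcal{S}} P(s'\mid s,a)\, v(s')
  \qquad \forall\, s\in\mathcal{S},\ a\in\mathcal{A}.
\end{align*}

% Dual LP over the state-action occupancy measure mu:
\begin{align*}
\max_{\mu \ge 0}\quad & \sum_{s,a} \mu(s,a)\, r(s,a) \\
\text{s.t.}\quad & \sum_{a} \mu(s',a) - \gamma \sum_{s,a} P(s'\mid s,a)\, \mu(s,a)
  \;=\; (1-\gamma)\,\xi(s') \qquad \forall\, s'\in\mathcal{S}.
\end{align*}

% Saddle-point form that a primal-dual method would work on:
\begin{equation*}
\min_{v}\ \max_{\mu \ge 0}\quad
(1-\gamma)\sum_{s} \xi(s)\, v(s)
+ \sum_{s,a} \mu(s,a)\Big[ r(s,a) + \gamma \sum_{s'} P(s'\mid s,a)\, v(s') - v(s) \Big].
\end{equation*}

% A policy is recovered from an optimal occupancy measure mu^*:
\begin{equation*}
\pi^{*}(a\mid s) \;=\; \frac{\mu^{*}(s,a)}{\sum_{a'} \mu^{*}(s,a')}.
\end{equation*}

% Hypothetical multiagent reading (an assumption, not taken from the article):
% with N agents and per-agent rewards r_i, a globally averaged objective
% would use the averaged reward in place of r.
\begin{equation*}
\bar r(s,a) \;=\; \frac{1}{N}\sum_{i=1}^{N} r_i(s,a).
\end{equation*}
```

Setting the derivative of the saddle-point objective with respect to $v(s')$ to zero gives back the dual constraint, which is a quick check that the three displays agree.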

Original language: English (US)
Article number: 9184075
Pages (from-to): 2681-2693
Number of pages: 13
Journal: IEEE Internet of Things Journal
Issue number: 4
State: Published - Feb 15 2021

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Information Systems
  • Hardware and Architecture
  • Computer Science Applications
  • Computer Networks and Communications


Keywords

  • Multiagent reinforcement learning (MARL)
  • Primal-dual algorithm
  • Voting mechanism

