Multiagent reinforcement learning based spectrum sensing policies for cognitive radio networks

Jarmo Lunden, Sanjeev R. Kulkarni, Visa Koivunen, H. Vincent Poor

Research output: Contribution to journal › Article › peer-review

50 Scopus citations


This paper proposes distributed multiuser multiband spectrum sensing policies for cognitive radio networks based on multiagent reinforcement learning. The spectrum sensing problem is formulated as a partially observable stochastic game, and multiagent reinforcement learning is employed to find a solution. In the proposed reinforcement-learning-based sensing policies, the secondary users (SUs) collaborate to improve the sensing reliability and to distribute the sensing tasks among the network nodes. The SU collaboration is carried out through local interactions in which the SUs share with their neighbors their local test statistics or decisions, as well as information on the frequency bands they have sensed. As a result, a map of spectrum occupancy in a local neighborhood is created. The goal of the proposed sensing policies is to maximize the amount of free spectrum found, subject to a constraint on the probability of missed detection. This is addressed by balancing the amount of spectrum sensed against the reliability of the sensing results. Simulation results show that the proposed sensing policies provide an efficient way to find available spectrum in multiuser multiband cognitive radio scenarios.
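The collaborative idea in the abstract can be illustrated with a small toy simulation. The sketch below is not the paper's algorithm: it assumes independent epsilon-greedy value learners (one per SU), OR-rule fusion of hard decisions among SUs that sensed the same band, and a reward that is split among those SUs to encourage spreading the sensing load. The function name `run_sensing_sketch` and all parameters (band availability, detection and false-alarm probabilities, learning rate) are hypothetical choices made for illustration only.

```python
import random


def run_sensing_sketch(n_users=3, free_prob=(0.9, 0.2, 0.7, 0.1), steps=2000,
                       epsilon=0.1, alpha=0.1, p_detect=0.9,
                       p_false_alarm=0.05, seed=0):
    """Toy multi-SU band selection; every modeling choice here is an assumption."""
    rng = random.Random(seed)
    n_bands = len(free_prob)
    # q[u][b]: SU u's running estimate of the reward for sensing band b.
    q = [[0.0] * n_bands for _ in range(n_users)]
    total_free_found = 0

    for _ in range(steps):
        # Draw the true (hidden) occupancy of each band for this time slot.
        occupied = [rng.random() >= p for p in free_prob]

        # Each SU picks a band: explore with probability epsilon, else exploit.
        choices = []
        for u in range(n_users):
            if rng.random() < epsilon:
                band = rng.randrange(n_bands)
            else:
                band = max(range(n_bands), key=lambda b: q[u][b])
            choices.append(band)

        # Local hard decisions: True means "primary signal detected".
        detections = {}
        for band in choices:
            p = p_detect if occupied[band] else p_false_alarm
            detections.setdefault(band, []).append(rng.random() < p)

        # Shared local map: OR-rule fusion, which lowers missed detections
        # when several SUs sense the same band.
        occupancy_map = {band: any(d) for band, d in detections.items()}
        total_free_found += sum(
            1 for band, busy in occupancy_map.items()
            if not busy and not occupied[band]
        )

        # Reward a SU only if its band is declared free, split among the SUs
        # that sensed it (assumed incentive to distribute sensing tasks).
        for u, band in enumerate(choices):
            reward = 0.0 if occupancy_map[band] else 1.0 / len(detections[band])
            q[u][band] += alpha * (reward - q[u][band])

    return q, total_free_found


if __name__ == "__main__":
    values, found = run_sensing_sketch()
    print("correctly identified free bands:", found)
    for u, row in enumerate(values):
        print(f"SU {u}:", [round(v, 2) for v in row])
```

In this toy setting, sending several SUs to the same band makes the fused decision more reliable but covers less spectrum, which is the trade-off the abstract describes. The paper's actual formulation as a partially observable stochastic game is considerably richer than this sketch.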

Original language: English (US)
Article number: 6507570
Pages (from-to): 858-868
Number of pages: 11
Journal: IEEE Journal on Selected Topics in Signal Processing
Issue number: 5
State: Published - 2013
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Electrical and Electronic Engineering

Keywords

  • Cognitive radio networks
  • collaborative spectrum sensing
  • multiagent reinforcement learning
  • multiuser multiband spectrum sensing policy
  • partially observable stochastic game

