TY - JOUR

T1 - A learning framework for cognitive interference networks with partial and noisy observations

AU - Levorato, Marco

AU - Firouzabadi, Sina

AU - Goldsmith, Andrea

N1 - Funding Information:
Part of this work was presented at the 47th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, and at IEEE Globecom 2011. This work was partially supported by AFOSR grant FA9550-08-1-0480 and by the DARPA ITMANET program under Grant 1105741-1-TFIND.

PY - 2012

Y1 - 2012

AB - An algorithm for the optimization of secondary user's transmission strategies in cognitive networks with imperfect network state observations is proposed. The secondary user minimizes the time average of a cost function while generating a bounded performance loss to the primary users' network. The state of the primary users' network, defined as a collection of variables describing features of the network (e.g., buffer state, ARQ state) evolves over time according to a homogeneous Markov process. The statistics of the Markov process is dependent on the strategy of the secondary user and, thus, the instantaneous idleness/transmission action of the secondary user has a long-term impact on the temporal evolution of the network. The Markov process generates a sequence of states in the state space of the network that projects onto a sequence of observations in the observation space, that is, the collection of all the observations of the secondary user. Based on the sequence of observations, the proposed algorithm iteratively optimizes the strategy of the secondary users with no a priori knowledge of the statistics of the Markov process and of the state-observation probability map.

KW - Cognitive networks

KW - Markov decision process

KW - imperfect observations

KW - online learning

UR - http://www.scopus.com/inward/record.url?scp=84866735372&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84866735372&partnerID=8YFLogxK

U2 - 10.1109/TWC.2012.062012.111342

DO - 10.1109/TWC.2012.062012.111342

M3 - Article

AN - SCOPUS:84866735372

VL - 11

SP - 3101

EP - 3111

JO - IEEE Transactions on Wireless Communications

JF - IEEE Transactions on Wireless Communications

SN - 1536-1276

IS - 9

M1 - 6226310

ER -