TY - JOUR
T1 - A learning framework for cognitive interference networks with partial and noisy observations
AU - Levorato, Marco
AU - Firouzabadi, Sina
AU - Goldsmith, Andrea
N1 - Funding Information:
Part of this work was presented at the 47th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, and at IEEE Globecom 2011. This work was partially supported by AFOSR grant FA9550-08-1-0480 and by the DARPA ITMANET program under Grant 1105741-1-TFIND.
PY - 2012
Y1 - 2012
N2 - An algorithm for optimizing the secondary user's transmission strategy in cognitive networks with imperfect network state observations is proposed. The secondary user minimizes the time average of a cost function while causing a bounded performance loss to the primary users' network. The state of the primary users' network, defined as a collection of variables describing features of the network (e.g., buffer state, ARQ state), evolves over time according to a homogeneous Markov process. The statistics of the Markov process depend on the strategy of the secondary user and, thus, the instantaneous idleness/transmission action of the secondary user has a long-term impact on the temporal evolution of the network. The Markov process generates a sequence of states in the state space of the network that projects onto a sequence of observations in the observation space, that is, the collection of all the observations of the secondary user. Based on the sequence of observations, the proposed algorithm iteratively optimizes the strategy of the secondary user with no a priori knowledge of the statistics of the Markov process or of the state-observation probability map.
AB - An algorithm for optimizing the secondary user's transmission strategy in cognitive networks with imperfect network state observations is proposed. The secondary user minimizes the time average of a cost function while causing a bounded performance loss to the primary users' network. The state of the primary users' network, defined as a collection of variables describing features of the network (e.g., buffer state, ARQ state), evolves over time according to a homogeneous Markov process. The statistics of the Markov process depend on the strategy of the secondary user and, thus, the instantaneous idleness/transmission action of the secondary user has a long-term impact on the temporal evolution of the network. The Markov process generates a sequence of states in the state space of the network that projects onto a sequence of observations in the observation space, that is, the collection of all the observations of the secondary user. Based on the sequence of observations, the proposed algorithm iteratively optimizes the strategy of the secondary user with no a priori knowledge of the statistics of the Markov process or of the state-observation probability map.
KW - Cognitive networks
KW - Markov decision process
KW - imperfect observations
KW - online learning
UR - http://www.scopus.com/inward/record.url?scp=84866735372&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84866735372&partnerID=8YFLogxK
U2 - 10.1109/TWC.2012.062012.111342
DO - 10.1109/TWC.2012.062012.111342
M3 - Article
AN - SCOPUS:84866735372
SN - 1536-1276
VL - 11
SP - 3101
EP - 3111
JO - IEEE Transactions on Wireless Communications
JF - IEEE Transactions on Wireless Communications
IS - 9
M1 - 6226310
ER -