On Myopic Sensing for Multi-Channel Opportunistic Access: Structure, Optimality, and Performance


📝 Original Info

  • Title: On Myopic Sensing for Multi-Channel Opportunistic Access: Structure, Optimality, and Performance
  • ArXiv ID: 0712.0035
  • Date: 2016-11-17
  • Authors: Researchers from original ArXiv paper

📝 Abstract

We consider a multi-channel opportunistic communication system where the states of these channels evolve as independent and statistically identical Markov chains (the Gilbert-Elliot channel model). A user chooses one channel to sense and access in each slot and collects a reward determined by the state of the chosen channel. The problem is to design a sensing policy for channel selection to maximize the average reward, which can be formulated as a multi-arm restless bandit process. In this paper, we study the structure, optimality, and performance of the myopic sensing policy. We show that the myopic sensing policy has a simple robust structure that reduces channel selection to a round-robin procedure and obviates the need for knowing the channel transition probabilities. The optimality of this simple policy is established for the two-channel case and conjectured for the general case based on numerical results. The performance of the myopic sensing policy is analyzed, which, based on the optimality of myopic sensing, characterizes the maximum throughput of a multi-channel opportunistic communication system and its scaling behavior with respect to the number of channels. These results apply to cognitive radio networks, opportunistic transmission in fading environments, and resource-constrained jamming and anti-jamming.

📄 Full Content

The fundamental idea of opportunistic access is to adapt the transmission parameters (such as data rate and transmission power) according to the state of the communication environment including, for example, fading conditions, interference level, and buffer state. Since the seminal work by Knopp and Humblet in 1995 [1], the concept of opportunistic access has found applications beyond transmission and scheduling over fading channels. An emerging application is cognitive radio for opportunistic spectrum access, where secondary users search in the spectrum for idle channels temporarily unused by primary users [2]. Another application is resource-constrained jamming and anti-jamming, where a jammer seeks channels occupied by users or a user tries to avoid jammers.

We consider a general opportunistic communication system where a user has access to N parallel channels and chooses one channel to sense and access in each slot, aiming to maximize its expected long-term reward (i.e., throughput). This user can be a base station, and each channel is associated with a downlink receiver. In this case, channel selection is equivalent to receiver selection, and the general problem considered here also applies to downlink scheduling in a centralized network.

These N channels are modelled as independent and stochastically identical Gilbert-Elliot channels [3], a model commonly used to abstract physical channels with memory (see, for example, [4], [5]). As illustrated in Fig. 1, the state of a channel (good or bad) indicates the desirability of accessing this channel and determines the resulting reward. For example, in cognitive radio networks, the good state represents a channel unused by primary users, while the bad state represents an occupied channel. The transitions between these two states follow a Markov chain with transition probabilities {p_ij}, i, j = 0, 1.

A sensing policy that governs the channel selection in each slot is crucial to the efficiency of multi-channel opportunistic access. The design of the optimal sensing policy can be formulated as a partially observable Markov decision process (POMDP) for generally correlated channels, or as a restless multi-armed bandit process for independent channels. Unfortunately, obtaining the optimal policy for a general POMDP or restless bandit process is often intractable due to its exponential computational complexity.
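To make the channel model concrete, the following is a minimal Python sketch of a single two-state channel, not code from the paper. It assumes that p_ij denotes the probability of moving from state i to state j in one slot, with state 1 meaning good and state 0 meaning bad. The function names (`step_channel`, `update_belief`, `stationary_good`) are hypothetical and chosen only for illustration. The belief update is the standard one-step prediction for a two-state Markov chain.

```python
import random


def step_channel(state, p01, p11, rng):
    """Advance one two-state channel by one slot (1 = good, 0 = bad).

    Assumption: p01 = P(bad -> good) and p11 = P(good -> good).
    """
    p_good = p11 if state == 1 else p01
    return 1 if rng.random() < p_good else 0


def update_belief(belief, p01, p11, observed=None):
    """Return the probability that the channel is good in the next slot.

    belief   -- current probability that the channel is good
    observed -- 1 or 0 if the channel was sensed in this slot, else None
    """
    if observed is None:
        # Unobserved channel: one-step prediction through the Markov chain.
        return belief * p11 + (1.0 - belief) * p01
    # Sensed channel: the observation reveals the state in this slot.
    return p11 if observed == 1 else p01


def stationary_good(p01, p11):
    """Stationary probability of the good state (requires p01 + p10 > 0)."""
    p10 = 1.0 - p11
    return p01 / (p01 + p10)


if __name__ == "__main__":
    rng = random.Random(0)
    state, belief = 1, stationary_good(0.2, 0.8)
    for _ in range(5):
        state = step_channel(state, 0.2, 0.8, rng)
        belief = update_belief(belief, 0.2, 0.8, observed=state)
        print(state, belief)
```

A user who can sense only one of the N channels per slot would apply `update_belief` with `observed=None` to every channel it did not sense, which is what makes the problem partially observable.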

A common way to trade performance for tractability is to consider myopic policies. A myopic policy aims solely at maximizing the immediate reward, ignoring the impact of the current action on the future reward. Obtaining a myopic policy is thus a static optimization problem instead of a sequential decision-making problem. As a consequence, the complexity is significantly reduced, often at the price of considerable performance loss.
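The static optimization behind a myopic policy can be written out as follows. This is a sketch under assumptions that the text above does not state explicitly: the reward is taken to be 1 when the sensed channel is good and 0 when it is bad, S_a(t) denotes the state of channel a in slot t, and ω_a(t) (the belief) denotes the conditional probability that channel a is good in slot t given all past decisions and observations. These symbols are introduced here for illustration only.

```latex
\begin{align*}
\omega_a(t) &= \Pr\bigl[S_a(t) = 1 \mid \text{past decisions and observations}\bigr], \\
\hat{a}(t)  &= \arg\max_{a \in \{1,\dots,N\}} \mathbb{E}\bigl[\text{reward in slot } t \mid a\bigr]
             = \arg\max_{a \in \{1,\dots,N\}} \omega_a(t).
\end{align*}
```

Under these assumptions, the myopic action in each slot is simply the channel with the largest current belief, so no look-ahead over future slots is involved.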

In this paper, we show that for designing sensing strategies for multi-channel opportunistic access, low complexity does not necessarily imply suboptimal performance. The myopic sensing policy with a simple and robust structure achieves the optimal performance under the i.i.d. Gilbert-Elliot channel model.

Under the i.i.d. Gilbert-Elliot channel model, we establish the structure and optimality of the myopic sensing policy and analyze its performance.

  1. Structure of Myopic Sensing: The first contribution of this paper is the establishment of a simple and robust structure of the myopic sensing policy. Besides its significant implications for practical implementation, this result serves as the key to the optimality proof and the performance analysis.

We show that the basic structure of the myopic policy is a round-robin scheme based on a circular ordering of the channels. For the case of p_11 ≥ p_01, the circular order is constant and determined by the initial information (if any) on the state of each channel. The myopic action is to stay in the same channel when it is good (state 1) and switch to the next channel in the circular order when it is bad. In the case of p_11 < p_01, the circular order is reversed in every slot, with the initial order determined by the initial information on channel states. The myopic policy stays in the same channel when it is bad; otherwise, it switches to the next channel in the current circular order.
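The round-robin structure described above can be sketched in a few lines of Python. This is an illustrative reading of the two rules, not the authors' implementation: the class name `RoundRobinSensing`, its methods, and the convention that the initial circular order is passed in as a list (for example, channels sorted by decreasing initial belief) are all assumptions made for this example.

```python
class RoundRobinSensing:
    """Round-robin form of myopic sensing over a circular channel order.

    Assumptions for this sketch: `order` is the initial circular order of
    channel indices, and the user starts on the first channel in that order.
    """

    def __init__(self, order, positively_correlated):
        self.order = list(order)
        self.pos = 0  # index into self.order of the channel being sensed
        self.positively_correlated = positively_correlated  # True if p_11 >= p_01

    @property
    def current(self):
        return self.order[self.pos]

    def update(self, observed_good):
        """Take this slot's observation and return the channel for the next slot."""
        if self.positively_correlated:
            # p_11 >= p_01: fixed order; stay when good, switch when bad.
            switch = not observed_good
        else:
            # p_11 < p_01: reverse the circular order in every slot,
            # then stay when bad and switch when good.
            channel = self.current
            self.order.reverse()
            self.pos = self.order.index(channel)
            switch = observed_good
        if switch:
            self.pos = (self.pos + 1) % len(self.order)
        return self.current


if __name__ == "__main__":
    policy = RoundRobinSensing([0, 1, 2], positively_correlated=True)
    for obs in [True, False, False, False]:
        print(policy.update(obs))
```

Note that the sketch never touches the values of p_11 and p_01; it only needs to know which of the two is larger, which is passed in as a single flag.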

The significance of this result for the practical implementation of myopic sensing is twofold. First, it demonstrates the simplicity of myopic sensing: channel selection is reduced to a simple round-robin procedure. The myopic sensing policy requires no computation and little memory. Second, it shows that myopic sensing is robust to model mismatch. Specifically, the myopic sensing policy has a semi-universal structure; it can be implemented without knowing the channel transition probabilities. The only required information about the channel model is the order of p_11 and p_01. As a result, the myopic sensing policy automatically tracks variations in the c

…(Full text truncated)…


Reference

This content is AI-processed based on ArXiv data.
