Reinforcement Learning for Routing and Spectrum Management in Cognitive Wireless Mesh Network

Ayoub Alsarhan (Department of Computer Information System, The Hashemite University, Zarqa, Jordan)
DOI: 10.4018/IJWNBT.2016010104

Cognitive radio networks (CRNs) can provide a means of offering the end-to-end Quality of Service (QoS) required by unlicensed users (secondary users, SUs). The authors consider the approach where licensed users (primary users, PUs) play the role of routers and lease spectrum with QoS guarantees to the SUs. Available spectrum is managed by the PU admission and routing policy. The main concern of the proposed policy is to provide end-to-end QoS connections to the SUs, while the key objective for the PU is to maximize its gain. In this paper, the authors propose a novel resource management scheme driven by reinforcement learning (RL). The derived scheme helps PUs adapt to changes in network conditions such as traffic load, spectrum cost, and service reward, so that the PU's gain can be optimized continuously. The approach integrates spectrum adaptation with connection admission control and routing policies. Numerical analysis results show the ability of the proposed approach to attain the optimal gain under different conditions and constraints.

1. Introduction

The spectrum scarcity problem will worsen due to the unexpected explosion in the number of emerging web-based services. Users want to access the Internet anywhere, at any time. As a result, the frequency spectrum, especially the ISM band, becomes congested while supporting these web-based applications. To use the available spectrum efficiently, CRNs have been proposed to enable SUs to access under-utilized portions of the spectrum (Vizziello, Akyildiz, Agustí, Favalli, & Savazzi, 2013). SUs can access unused spectrum using underlay, overlay, or spectrum trading approaches (Alsarhan & Agarwal, 2011; Alsarhan & Agarwal, 2009; Pefkianakis, Wong, & Lu, 2008). In the overlay and underlay approaches, SUs access the licensed spectrum without paying any usage charge to PUs. Their access is allowed as long as their usage does not harm the PUs. For example, in IEEE 802.22, SUs can access TV bands.

In the underlay approach, the transmission power of the SUs must stay below a predefined interference threshold to avoid interfering with PUs (Phunchongharn & Ekram, 2012). However, this constraint forces SUs to reduce their transmission power and, consequently, their coverage area. It is therefore very likely that an SU needs intermediate nodes to relay traffic to the destination node. SUs may spread the transmitted power over a wide frequency band to achieve a high data rate over short distances.

In the overlay approach, SUs can access the unused portion of the spectrum. As a result, interference to the PUs is avoided. SUs have to identify and exploit the unused spectrum. They should periodically monitor PU activity and vacate the spectrum as soon as their signals interfere with a PU signal (Alsarhan & Agarwal, 2011; Alsarhan & Agarwal, 2009; Phunchongharn & Ekram, 2012; Gür, Bayhan, & Alagoz, 2010; Tachwali, Lo, Akyildiz, & Agustí, 2013). Although these approaches help to address the spectrum scarcity problem, they are unlikely to be accepted in the current market because PUs receive no financial incentive from SU usage of the spectrum. Routing in multi-hop CRNs is also a challenging task because the spectrum available at each PU is uncertain owing to the changing traffic load.

The considered system consists of PUs that rent unused spectrum to SUs. To ensure that the required spectrum is available, each PU monitors its spectrum and leases free spectrum with QoS guarantees to the SUs. With the available spectrum, end-to-end QoS connections for users are realized through the admission control and routing policy. In this paper, we propose a request admission and routing policy that selectively admits spectrum requests with the aim of optimizing the PU’s profit. To achieve this goal, a PU resource management scheme based on an RL model is proposed. The scheme integrates optimal connection admission control and routing policy with adaptation of the PU resources to varying network traffic and profit conditions.

The basic concept of the RL model is a state-dependent service reward, which is the dynamic reward for accepting a request of a given class at the PU. The goal of routing is then to select the path with the maximum sum of rewards, provided that this sum is also larger than the service cost. The major contributions of this paper are as follows:

  • A model for routing in CRNs is proposed, and RL is used for path selection. Our model takes into account the economic factors of the routing problem, including the profit and the cost of renting PUs’ channels.

  • We describe how the RL methodology can be used to obtain a computationally feasible solution to the considered routing problem.

  • The performance of the RL model is evaluated under different system parameters.
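
The path selection rule stated above (choose the path with the maximum sum of rewards, and accept it only if that sum is larger than the service cost) can be illustrated with a minimal sketch. This is not the authors' algorithm: the topology, the fixed per-PU reward values, the function names `simple_paths` and `select_path`, and the brute-force enumeration of paths are all assumptions made for illustration. In the proposed scheme the rewards are state dependent and would be learned by the RL model rather than supplied as constants.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Graph = Dict[str, List[str]]


def simple_paths(graph: Graph, src: str, dst: str,
                 prefix: Optional[List[str]] = None) -> Iterator[List[str]]:
    """Yield every loop-free path from src to dst (depth-first search)."""
    path = (prefix or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph.get(src, []):
        if nxt not in path:  # never revisit a PU on the same path
            yield from simple_paths(graph, nxt, dst, path)


def select_path(graph: Graph, reward: Dict[str, float], src: str, dst: str,
                service_cost: float) -> Optional[Tuple[List[str], float]]:
    """Return (path, total reward) for the best path, or None to reject.

    The request is rejected when no path exists or when the best total
    reward does not exceed the service cost.
    """
    best_path: Optional[List[str]] = None
    best_total = float("-inf")
    for path in simple_paths(graph, src, dst):
        total = sum(reward[pu] for pu in path)
        if total > best_total:
            best_path, best_total = path, total
    if best_path is None or best_total <= service_cost:
        return None
    return best_path, best_total


# Hypothetical four-PU mesh and hypothetical per-PU rewards.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
reward = {"A": 1.0, "B": 2.0, "C": 0.5, "D": 1.0}

accepted = select_path(graph, reward, "A", "D", service_cost=3.0)
rejected = select_path(graph, reward, "A", "D", service_cost=5.0)
```

In this toy setting the path A, B, D has a total reward of 4.0 and A, C, D has 2.5, so the request is routed over A, B, D when the service cost is 3.0 and rejected when the cost is 5.0. Exhaustive enumeration is only practical for very small meshes, which is one reason a learned, computationally feasible policy is attractive.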
