Distributed Learning Algorithm Applications to the Scheduling of Wireless Sensor Networks

Fatemeh Daneshfar (University of Kurdistan, Iran) and Vafa Maihami (University of Kurdistan, Iran)
DOI: 10.4018/978-1-4666-4450-2.ch028


A Wireless Sensor Network (WSN) is a network of devices, called nodes, that sense the environment and communicate the gathered data over a wireless medium to a sink node. Such networks offer low power consumption, small size, and reasonable cost, and they have a variety of applications in monitoring and tracking. However, a WSN is energy-constrained because its nodes are battery-powered and recharging is difficult in most applications. Moreover, reducing energy consumption often introduces additional latency in data delivery. To address this trade-off, many scheduling approaches have been proposed. In this chapter, the authors discuss the applicability of Reinforcement Learning (RL) to multiple access design in order to reduce energy consumption and achieve low latency in WSNs. In this learning strategy, an agent learns to choose actions by interacting with the environment. Guided by the rewards it receives in response to its actions, the agent asymptotically approaches the optimal policy, which maximizes the agent's long-term expected return.
Chapter Preview


Recent advances in electronics and telecommunications have made possible networks of small sensors (nodes) called Wireless Sensor Networks (WSNs). A WSN is a network of devices, denoted as nodes, that sense the environment, perform preliminary processing on the sensed data, and send it to a central or sink node. It is a distributed, self-organizing network with low power consumption, small size, and reasonable cost, and it has a variety of monitoring and tracking applications in areas such as the military, health care, and industry. Figure 1 presents a communication architecture for a WSN (Akyildiz & Sankarasubramaniam, 2002; Yick et al., 2008). Each of the scattered sensor nodes can collect data (for example, temperature or humidity) and transmit it back to the sink node, which then forwards it via the web or satellite to the end users. The nodes of a WSN can also be grouped into clusters and communicate through those clusters.

Figure 1. Communication architecture of a wireless sensor network


Sensor nodes, such as those shown in Figure 2, are powered by batteries (Yu et al., 2006). Because replacing or recharging these batteries is expensive, approaches that decrease the nodes' energy consumption are especially important. One of the main sources of energy consumption in a WSN node is its radio. The radio has two distinct modes: active (on) and sleep (off). A node can receive and transmit data only in the active mode. However, a significant amount of a node's energy is wasted by its radio components during idle listening, when the radio is active but there is no communication activity (Akyildiz & Vuran, 2010).
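
The effect of the active/sleep split on battery life can be illustrated with a small calculation. The sketch below is not taken from the chapter: it assumes a simple two-state radio model, and the power and battery figures are hypothetical values chosen only to show the shape of the trade-off.

```python
def average_power_mw(duty_cycle, active_mw, sleep_mw):
    """Average radio power for a node that is active a fraction
    `duty_cycle` of the time and asleep for the rest (two-state model)."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in [0, 1]")
    return duty_cycle * active_mw + (1.0 - duty_cycle) * sleep_mw


def lifetime_hours(battery_mwh, duty_cycle, active_mw, sleep_mw):
    """Idealized lifetime: battery capacity divided by average radio power.
    Ignores sensing, processing, and battery self-discharge."""
    return battery_mwh / average_power_mw(duty_cycle, active_mw, sleep_mw)


# Hypothetical figures, for illustration only (not from the chapter).
ACTIVE_MW = 60.0      # radio listening, receiving, or transmitting
SLEEP_MW = 0.03       # radio switched off
BATTERY_MWH = 6000.0  # usable battery capacity

for duty_cycle in (1.0, 0.1, 0.01):
    hours = lifetime_hours(BATTERY_MWH, duty_cycle, ACTIVE_MW, SLEEP_MW)
    print(f"duty cycle {duty_cycle:>4}: about {hours:.0f} hours")
```

Under this model, lifetime grows roughly in inverse proportion to the duty cycle until the sleep-mode power starts to dominate. A lower duty cycle also means a sender may have to wait longer for the receiver to wake up, which adds latency.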

Figure 2. Samples of sensor nodes


Many protocols have been proposed for WSNs to control when nodes switch between the sleep and active modes. Although wireless sensor networks resemble Mobile Ad-hoc NETworks (MANETs) in many respects, the protocols used for MANETs are not appropriate for WSNs. An ad-hoc network is a local network that consists of connected autonomous devices. Instead of relying on a central station (a hub or switch) as typical networks do, ad-hoc networks are self-configuring and can relay data between nodes cooperatively over several steps. Because they lack central control, they require minimal configuration and management cost. Figure 3 shows an example of a wireless ad-hoc network (Abolhasan et al., 2004). The two kinds of network share several properties. Since the nodes of both wireless sensor networks and ad-hoc networks run on batteries, their lifetime is limited by battery life. Both communicate over wireless channels, which makes communication unreliable. In both, human intervention is minimized and configuration is automatic (Perillo & Heinzelman, 2004). They also differ in important ways. A WSN contains far more nodes than an ad-hoc network: it can reach thousands of nodes, whereas an ad-hoc network has approximately 10. Sensor networks are often deployed in challenging environments, where node locations are fixed and node failures are common. Sensor nodes are small and therefore have smaller batteries, shorter lifetime, less memory, and less computing power than ad-hoc network nodes. Finally, because sensors do not have identification codes, data in sensor networks is sent by broadcast, whereas communication in ad-hoc networks is point to point.

Figure 3. Wireless ad-hoc network


Key Terms in this Chapter

Q-learning: A reinforcement learning technique that works by learning an action-value function.

Ad-Hoc Network: A local network formed by connecting autonomous devices directly to one another. Ad-hoc networks are self-configuring; they send and receive data between all nodes in a few steps.

Idle Listening: Duration of time during which the transceiver is active but no data is transmitted or received by the sensor.

Transaction: The exchange of related, consecutive frames between two peer medium access control (MAC) entities, required for a successful transmission of a MAC command or data frame.

Transaction Queue: A list of the pending transactions, to be sent using indirect transmission, that are initiated by the medium access control (MAC) sublayer of a given coordinator. The transaction queue is maintained by that coordinator while the transactions are in progress, and its length is implementation-dependent but must be at least one.

Packet: The formatted, aggregated bits that are transmitted together in time across the physical medium.

Sleep Time (Off): Duration of time during which no transceiver activity is scheduled to take place.

S-MAC Protocol: The first important MAC protocol for wireless sensor networks.

Self-Organizing: The ability of network nodes to detect the presence of other nodes and to organize into a structured, functioning network without human intervention.

Wake up Time (On or Active): Duration of time during which the transceiver is active and the sensor can transmit or receive data.

Frame: The format of aggregated bits from a medium access control (MAC) sublayer entity that are transmitted together in time.

Soft Computing: A field of computer science characterized by the use of inexact solutions to computationally hard tasks, such as NP-complete problems.

MANET: Mobile ad-hoc network; an ad-hoc network in which the nodes can be mobile.

Intelligent Agent: An autonomous entity that observes an environment through sensors and acts upon it using actuators. The term also covers software that assists people and acts on their behalf. Intelligent agents allow people to delegate work that they could have done themselves to the agent software. Agents can perform repetitive tasks and intelligently summarize complex data.

Reinforcement Signal: A signal that reflects the success or failure of the entire system after it has performed some sequence of actions.

Policy: Defines the learning agent's way of behaving at a given time. A policy is a mapping from perceived states of the environment to actions to be taken when in those states.

RL: Reinforcement learning; a machine learning approach based on interaction with the environment.

Value Function: Value function approaches attempt to find a policy that maximizes the return by maintaining a set of estimates of expected returns for some policy.

MAC Protocol: Used to provide the data link layer of the Ethernet LAN system. The MAC protocol encapsulates an SDU (payload data) by adding a 14-byte header (Protocol Control Information (PCI)) before the data and appending an integrity checksum. The checksum is a 4-byte (32-bit) Cyclic Redundancy Check (CRC) after the data. The entire frame is preceded by a small idle period (the minimum inter-frame gap, 9.6 microseconds (µs)) and an 8-byte preamble (including the start of frame delimiter).

R-MAC Protocol: An extension of S-MAC that uses a dynamic duty cycle to mitigate S-MAC's shortcomings.
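
To make the Q-learning, Policy, and Reinforcement Signal terms above concrete, the following is a minimal single-node sketch of tabular Q-learning applied to a sleep/wake decision. It is not the authors' algorithm. The two states, the reward values, the traffic probability, and all function names are hypothetical, chosen only so that idle listening and missed traffic are both penalized.

```python
import random
from collections import defaultdict

STATES = ("idle", "traffic")   # hypothetical: is a packet pending this slot?
ACTIONS = ("sleep", "wake")

# Hypothetical reinforcement signal: reward useful wake-ups and energy-saving
# sleep, penalize idle listening and sleeping through pending traffic.
REWARDS = {
    ("idle", "sleep"): 0.5,
    ("idle", "wake"): -0.5,
    ("traffic", "sleep"): -1.0,
    ("traffic", "wake"): 1.0,
}


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(steps=5000, traffic_prob=0.3, epsilon=0.1, seed=0):
    """Run one node against a toy traffic model and return its Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    state = "idle"
    for _ in range(steps):
        action = choose_action(q, state, epsilon, rng)
        reward = REWARDS[(state, action)]
        # Toy environment: traffic arrives independently in each slot.
        next_state = "traffic" if rng.random() < traffic_prob else "idle"
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q


def greedy_policy(q):
    """The learned policy: a mapping from each state to its best action."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


print(greedy_policy(train()))
```

In a distributed setting, each node would keep its own table and learn from locally observed rewards. A realistic state would also need to capture more than a single pending-packet flag, for example queue length or neighbors' schedules.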
