Why General Outlier Detection Techniques Do Not Suffice for Wireless Sensor Networks

Yang Zhang, Nirvana Meratnia, Paul Havinga
DOI: 10.4018/978-1-60566-328-9.ch007

Abstract

Raw data collected in wireless sensor networks are often unreliable and inaccurate due to noise, faulty sensors and harsh environmental effects. Sensor data that deviate significantly from the normal pattern of sensed data are often called outliers. Outlier detection in wireless sensor networks aims at identifying such readings, which represent either measurement errors or interesting events. Because of numerous shortcomings, commonly used outlier detection techniques for general data do not seem to be directly applicable to outlier detection in wireless sensor networks. In this chapter, the authors report on the current state of the art in outlier detection techniques for general data, provide a comprehensive technique-based taxonomy for these techniques, and highlight their characteristics in a comparative view. Furthermore, the authors address the challenges of outlier detection in wireless sensor networks, provide a guideline on the requirements that suitable outlier detection techniques for wireless sensor networks should meet, and explain why general outlier detection techniques do not suffice.

Introduction

Advances in electronic processor technologies and wireless communications have enabled the production of small, low-cost sensor nodes with sensing, computation and short-range wireless communication capabilities. Each sensor node is usually equipped with a wireless transceiver, a small microcontroller, a power source and sensors of several types, such as temperature, humidity, light, heat, pressure, sound, vibration and motion sensors. A wireless sensor network (WSN) typically consists of a large number of such low-power sensor nodes distributed over a wide geographical area, with one or more possibly powerful sink nodes gathering information from the others. These sensor nodes measure and collect data from the target area, perform some data processing, and transmit and forward information to the sink node via multi-hop routing. The sink node can also instruct nodes to collect data by broadcasting a query to the entire network or to a specific region of it.

These small, low-quality sensor nodes have severe limitations, such as limited energy and memory resources, communication bandwidth and computational processing capabilities. These constraints make sensor nodes more prone to generating erroneous data. In particular, as battery power is exhausted, the probability of generating abnormally high or low sensor values grows rapidly. In addition, the operation of sensor nodes is frequently susceptible to environmental effects. The vision of large-scale, high-density wireless sensor networks is to randomly deploy a large number of sensor nodes (hundreds or even thousands) in harsh and unattended environments. Under such conditions, it is inevitable that some sensor nodes will malfunction, which may result in erroneous readings. In addition to noise and sensor faults, abnormal readings may also be caused by actual events (e.g., once a fire occurs, the readings of the temperature sensors around the region will rise sharply). Noise, sensor faults and actual events are therefore all potential sources of abnormal readings in WSNs, which are often called outliers.

Among the various definitions of an outlier, none seems to be universally accepted. The notion of an outlier may even differ from one outlier detection technique to another (Zhang et al., 2007). Two classical definitions of an outlier are those of Hawkins (1980) and Barnett & Lewis (1994). According to the former, “an outlier is an observation, which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism”, whereas the latter states that “an outlier is an observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data”. The term “outlier” can generally be defined as an observation that is significantly different from the other values in a data set.
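
To make the general definition concrete, the following minimal sketch flags readings that differ markedly from the rest of a small data set. It is only an illustration and is not a technique taken from this chapter: it assumes the modified z-score rule of thumb based on the median and the median absolute deviation (MAD), with the commonly quoted constant 0.6745 and cut-off 3.5. The function name flag_outliers, the threshold parameter and the temperature values are hypothetical.

```python
from statistics import median


def flag_outliers(readings, threshold=3.5):
    """Return the indices of readings that deviate strongly from the rest.

    Illustrative assumption: a reading counts as an outlier when its
    modified z-score, 0.6745 * |x - median| / MAD, exceeds `threshold`.
    """
    med = median(readings)
    abs_dev = [abs(x - med) for x in readings]
    mad = median(abs_dev)  # median absolute deviation
    if mad == 0:
        # The score is undefined when the spread is zero; this sketch
        # simply flags nothing in that case.
        return []
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Hypothetical temperature readings (degrees Celsius) from one sensor node.
temps = [21.0, 21.3, 20.9, 21.1, 58.4, 21.2, 21.0]
print(flag_outliers(temps))  # prints [4]
```

Note that such a rule only marks the reading of 58.4 as different from the others. It cannot tell whether that reading stems from a faulty sensor or from an actual event such as a fire, and it assumes that all readings are available in one place.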
