A Self-Organizing Neural Network to Approach Novelty Detection

Marcelo Keese Albertini (University of São Paulo, Brazil) and Rodrigo Fernandes de Mello (University of São Paulo, Brazil)
DOI: 10.4018/978-1-60566-798-0.ch003

Abstract

Machine learning is a field of artificial intelligence which aims at developing techniques to automatically transfer human knowledge into analytical models. Recently, those techniques have been applied to time series with unknown dynamics and fluctuations in the established behavior patterns, such as human-computer interaction, inspection robotics and climate change. In order to detect novelties in those time series, techniques are required to learn and update knowledge structures, adapting themselves to data tendencies. The learning and updating process should integrate and accommodate novelty events into the normal behavior model, possibly incurring the revaluation of long-term memories. This sort of application has been addressed by the proposal of incremental techniques based on unsupervised neural networks and regression techniques. Such proposals have introduced two new concepts in time-series novelty detection. The first defines the temporal novelty, which indicates the occurrence of unexpected series of events. The second measures how novel a single event is, based on the historical knowledge. However, current studies do not fully consider both concepts of detecting and quantifying temporal novelties. This motivated the proposal of the self-organizing novelty detection neural network architecture (SONDE) which incrementally learns patterns in order to represent unknown dynamics and fluctuation of established behavior. The knowledge accumulated by SONDE is employed to estimate Markov chains which model causal relationships. This architecture is applied to detect and measure temporal and nontemporal novelties. The evaluation of the proposed technique is carried out through simulations and experiments, which have presented promising results.

Introduction

Machine learning is a field of artificial intelligence which aims at developing techniques for automatically transferring human knowledge into analytical models (Kecman, 2001). Such techniques support activities in several areas, such as natural language processing (Jelinek, 1997), pattern recognition (Bishop, 2006), search engines (Zhang and Dong, 2000), medical diagnosis (Cox et al., 1982) and fraud detection (Chan and Stolfo, 1998).

With the development of new approaches and learning techniques, researchers identified the need to model datasets presenting noise (Barnett and Lewis, 1994), inconsistencies derived from anomalies (Singh, 2002), scarce patterns (Rosen et al., 1996) and information tendency modifications (Spinosa et al., 2007).

The modeling of those datasets motivated additional studies, which gave rise to research on novelty detection. At first, such studies aimed at identifying rare and unknown information. Such information is observed in, for instance, samples from defective equipment (Tarassenko, 1995; Ypma and Duin, 1997) and exams of patients with rare diseases (Cox et al., 1982). In such cases, data are temporally independent of each other. One example of this situation is the study of breast cancer diagnosis using x-ray image analysis conducted by Tarassenko (1995). That work considers a dataset containing one x-ray exam and one diagnosis per patient, which indicates suspected areas. In this kind of application, there is no causal relationship between the exams of different patients; consequently, previous exams cannot indicate the disease for a new one. Thus, the sequence in which the exams are analyzed is irrelevant.

Another kind of application, named temporal, considers the causal relationship among data sequences (Box and Jenkins, 1976). An example of such an application is the analysis of customers' behavior in using credit cards. In such a situation, the debits compose a time series in which data are causally dependent. If a customer's debit behavior varies in an unexpected manner (depicting novelty), the credit company could, for example, block the card to prevent fraud.

There are different types of temporal applications. Some present well-behaved series, where previously obtained data can be used to model the required knowledge and, consequently, describe expected behavior. Using such a model, the novelty detection process comprises labeling patterns which are not consistent with the expectations. An example of such an approach is proposed by Ko et al. (1992), who model the current knowledge by storing input pattern characteristics. New patterns are evaluated by assessing their distances to those already in the model. A pattern is labeled as a novelty when the distance is above an ad hoc similarity threshold. Other works that present similar approaches include Ypma and Duin (1997), who propose clustering indices based on the unsupervised artificial neural network Self-Organizing Map, and Hayton et al. (2000), who apply a binary classifier based on Support Vector Machines.
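
The distance-threshold idea described above can be illustrated with a minimal Python sketch. It is not the implementation of Ko et al. (1992): the Euclidean distance, the nearest-stored-pattern rule, the function names and the threshold value are all assumptions chosen for illustration.

```python
import math

def nearest_distance(pattern, stored):
    """Euclidean distance from `pattern` to its closest stored pattern."""
    return min(math.dist(pattern, s) for s in stored)

def is_novel(pattern, stored, threshold):
    """Label `pattern` as a novelty when it lies farther than `threshold`
    from every stored pattern (assumed rule, for illustration only)."""
    return nearest_distance(pattern, stored) > threshold

# Hypothetical model of "normal" behavior: a few stored 2-D patterns.
stored_patterns = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
threshold = 0.5  # ad hoc similarity threshold, chosen arbitrarily here

near_result = is_novel((0.1, 0.1), stored_patterns, threshold)  # close to (0, 0)
far_result = is_novel((3.0, 3.0), stored_patterns, threshold)   # far from all
print(near_result, far_result)
```

In this sketch the model is fixed once built, which matches the well-behaved series described above: a pattern is only compared against stored knowledge and never changes it.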

Other temporal applications are characterized by series with unknown dynamics and fluctuation in the established behavior of patterns, such as human interaction with computers (Lane, 1999), inspection robotics (Marsland, 2002) and climate change (Lau and Weng, 1995). In order to detect novelties in those series, techniques are required to learn and update knowledge structures, adapting them to data tendencies. The learning and updating process should integrate and accommodate novelty events into the normal behavior model, possibly incurring the revaluation of long-term memories. This sort of application has been addressed by the proposal of incremental techniques such as the ones by Marsland (2002), Ma and Perkins (2003) and Itti and Baldi (2005).
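
As a rough illustration of what integrating novelty events into the normal behavior model can mean, the following Python sketch keeps a set of prototypes that is updated with every observation. It is a generic incremental scheme, not SONDE and not any of the cited techniques; the class name, the threshold and the learning rate are hypothetical.

```python
import math

class IncrementalModel:
    """Toy incremental model: a growing list of prototype vectors."""

    def __init__(self, threshold, learning_rate):
        self.threshold = threshold          # assumed novelty radius
        self.learning_rate = learning_rate  # assumed adaptation speed in (0, 1]
        self.prototypes = []

    def observe(self, pattern):
        """Return True when `pattern` is novel; update the model either way."""
        pattern = list(pattern)
        if self.prototypes:
            nearest = min(self.prototypes, key=lambda p: math.dist(p, pattern))
            if math.dist(nearest, pattern) <= self.threshold:
                # Known behavior: move the nearest prototype toward the input,
                # so the model follows gradual changes in the data.
                for i, x in enumerate(pattern):
                    nearest[i] += self.learning_rate * (x - nearest[i])
                return False
        # Novel behavior: store it, so similar future events count as normal.
        self.prototypes.append(pattern)
        return True

model = IncrementalModel(threshold=1.0, learning_rate=0.5)
stream = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0)]
flags = [model.observe(p) for p in stream]
print(flags)
```

The first observation is always flagged because the model starts empty. Once a novel pattern has been stored, a similar pattern arriving later is treated as normal, which is the accommodation behavior the paragraph above describes.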
