Data Stream Mining of Event and Complex Event Streams: A Survey of Existing and Future Technologies and Applications in Big Data


Chris Wrench (University of Reading, UK), Frederic Stahl (University of Reading, UK), Giuseppe Di Fatta (University of Reading, UK), Vidhyalakshmi Karthikeyan (BT, UK) and Detlef D. Nauck (BT, UK)
Copyright: © 2016 |Pages: 24
DOI: 10.4018/978-1-5225-0293-7.ch003

Abstract

Complex Event Processing (CEP) has been a growing field for the last ten years. It has seen the development of a number of methods and tools to aid in the processing of event streams and clouds, though it has also been troubled by the lack of a cohesive definition. This paper aims to lay out the technologies surrounding CEP and to distinguish it from the closely related field of Event Stream Processing. It also aims to explore the work done to apply Data Mining techniques to both of these fields. An outline of stream processing technologies is given, including the Data Stream Mining techniques that have been adapted for CEP.

Introduction

Event Stream Processing (ESP) and Complex Event Processing (CEP) are increasingly wide and valued fields of study in Big Data Analytics. As the Internet of Things becomes more prominent, so do events and the need for new and interesting ways of interpreting them. The purpose of this chapter is to clarify the positions of ESP and CEP within the field of Big Data Analytics and to outline the range of Data Mining opportunities within ESP and CEP. This is done by identifying the challenges in the field and describing a range of complementary and contrasting approaches to overcome them. Though there are numerous papers on the subject, a survey collecting work on this specific application was needed.

There is a useful body of knowledge on this subject, but it is spread across a wide area rife with aliases and synonyms, which makes it difficult to see how the landscape is laid out. Both ESP and CEP evolved out of necessity, and independently, from multiple problem domains, each with its own bespoke vocabulary. The result is a lack of consensus as to the proper title of the field and its components, a phenomenon labelled “Tower of Babel Syndrome” (Cugola & Margara, 2012).

Events and Event Streams are the focus of much of this chapter. An event can be defined in many different ways, but for now it is enough to say, simply, that an event is a thing that happens. An Event Stream is an ordered series of events which, like all Data Streams, is potentially unbounded (Owens, 2007; Yu, Li, Gu, & Hong, 2011). Event Streams are a frequent part of our daily lives and, if monitored and processed intuitively, can be an extremely valuable commodity (Eckert, Oriented, Soa, & Eda, 2009). An Event Stream is effectively a specialised Data Stream and, as Big Data teaches us, where there is data there is often information and knowledge to be found (Bramer, 2013).

CEP is the means by which meaningful repeated patterns can be discovered amongst a dynamic collection of low-level events. Event Stream Processing is the range of technologies used to process the stream and perform Big Data Analytics. It can be argued either that ESP is a specialised form of CEP or that the two are different approaches to a similar problem; this, again, is a debate present throughout the literature.

Event Streams are generated and used in many applications. Those generated by the Stock Market are popular subjects for predictive analytics. The transaction history of users on a website can be used to optimise said website and predict user behaviour, presenting opportunities for profit from advertisement. Radio Frequency IDentification (RFID) tags have become cheaper, smaller and commonplace in high street shops. Sensors positioned around a shop register these tags and the resulting Event Stream can be used to prevent shoplifting (Li, 2010). A further example is that of intrusion detection, in which a system administrator employs CEP to identify an intrusion on a network amongst the legitimate traffic in the stream (Axelsson, 2000). Many more examples can be found with even the briefest research into the topic.
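
The RFID example can be made concrete with a small sketch of how a complex event might be derived from low-level readings. The chapter does not specify any of this: the event schema, the location names ("shelf", "checkout", "exit") and the rule itself (a tag read at the exit with no earlier checkout reading) are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    """A low-level RFID reading (hypothetical schema, not from the chapter)."""
    timestamp: float  # seconds since the stream started
    tag_id: str       # identifier of the RFID tag attached to an item
    location: str     # assumed sensor locations: "shelf", "checkout", "exit"


@dataclass(frozen=True)
class Alert:
    """A complex event derived from several low-level readings."""
    timestamp: float
    tag_id: str
    reason: str


def detect_unpaid_exits(stream: Iterable[Event]) -> Iterator[Alert]:
    """Yield an alert when a tag is read at the exit without a prior checkout.

    The stream is consumed one event at a time, so it may be unbounded.
    Only the set of paid-for tags still inside the shop is kept as state.
    """
    paid = set()
    for event in stream:
        if event.location == "checkout":
            paid.add(event.tag_id)
        elif event.location == "exit":
            if event.tag_id in paid:
                paid.discard(event.tag_id)  # item has left; release its state
            else:
                yield Alert(event.timestamp, event.tag_id, "unpaid_exit")


# Usage with a short, finite stand-in for a live stream.
readings = [
    Event(1.0, "A", "shelf"),
    Event(2.0, "B", "shelf"),
    Event(3.0, "A", "checkout"),
    Event(4.0, "A", "exit"),
    Event(5.0, "B", "exit"),
]
for alert in detect_unpaid_exits(readings):
    print(alert)
```

Here tag "A" passes the checkout before the exit and raises nothing, while tag "B" goes straight from shelf to exit and produces an alert. A real CEP engine would express such a rule declaratively and would typically bound the state with time windows; the generator above only illustrates the idea of matching a pattern over an ordered stream.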

Event Stream Processing is a subtopic of Data Stream Mining, a field with very similar goals that is far more clearly understood and better defined. Data Streams present their own unique challenges (i.e. those associated with Velocity, Volume and Variety; Ebbers, Abdel-Gayed, Budhi, & Dolot, 2013), which have been the subject of a great deal of research. These same problems apply to Event Streams, so it makes sense to first look at the techniques used in Data Stream Mining.

Studying a stream in real time enables a system or user to react to events as they occur, which is of paramount importance for some applications. It also places special requirements on any stream processing technology. The standard database systems used in the majority of Big Data Analytics are not able to meet these requirements. To address this, the database has been adapted or superseded by the Active Database or the Data Stream Management System (DSMS), along with bespoke stream processing query languages and, finally, CEP systems. Many of these technologies will be looked at later in this chapter. The chapter will then detail several applications of Big Data Analytics and Machine Learning to ESP and CEP.
