Deterministic Concept Drift Detection in Ensemble Classifier Based Data Stream Classification Process

Mohammed Ahmed Ali Abdualrhman (Department of Computer Science and Engineering, PES College of Engineering, Mandya, India) and M. C. Padma (Department of Computer Science and Engineering, PES College of Engineering, Mandya, India)
Copyright: © 2019 | Pages: 20
DOI: 10.4018/IJGHPC.2019010103

Abstract

Data in streaming environments tends to be non-stationary. Frequent and irregular changes therefore occur in the data, which in data stream classification are referred to as concept drift. Detecting concept drift in traditional data stream mining requires labelled samples; however, assigning labels to streaming transactions is infeasible in terms of processing time and resource utilization. This article proposes deterministic concept drift detection (DCDD) for ensemble classifier-based data stream classification, which can detect concept drift regardless of the labels assigned to samples. The proposed DCDD model is evaluated experimentally on the poker-hand dataset. The experimental results show that the proposed model is accurate and scalable, detecting concept drift with a high drift detection rate and minimal false alarm and miss rates compared to other contemporary models.

Introduction

Batch mode is the key operational approach for many machine learning algorithms. A conventional machine learning solution analyses an existing set of historical data and builds a model based on the given inputs. However, with rapid developments and the intrinsic need for updates, machine learning models require recurrent learning, which often fails to deliver the desired outcome because it is unclear at which data change point the training process should be repeated. To ensure that models are robust and suited to rapidly emerging conditions, there is an intrinsic need to detect concept drift, i.e. to detect data change points. Various research contributions have sought to identify the different kinds of drift that arise in machine learning solutions (Žliobaitė, 2016; Gama, 2014). The majority of these contributions are either qualitative in nature or offer only informal definitions, and hardly any attempt at standardizing the terminology has yielded the desired result (Abdualrhman & Padma, 2015).

Data sources such as data streams, in which the quantum of data generated is massive, are particularly prone to concept drift. Expecting data distributions to remain stable as trends emerge is unrealistic. Concept drift has recently become a noteworthy issue in pattern and information mining, machine learning techniques, data extraction procedures, and related areas. One key scenario is learning against adversaries, as in spam filters and intrusion detection. Here, a predictive model aims to identify patterns characteristic of an adversary, while the adversary is aware of the adaptive learning and changes its behaviour to stay unnoticed. Another scenario is learning in the presence of hidden parameters. Consumer modelling is a prominent learning task of this kind, wherein the system builds a model of consumer information that is not observable and could change over time. The scope for drift is also high in tracking functions and predictive maintenance, where the model learns the system's behaviour while the mechanical object deteriorates over the learning period. The computational problems resulting from such changes over time are termed concept drift. The changes can take infinitely many forms and vary widely with changing trends, so the system might need to adopt a range of different techniques. Under such conditions, a “one-size-fits-all” model may not be effective for managing concept drift. On the other hand, practical requirements, though they vary, may share common characteristics. To transfer adaptive techniques from one application to another, application tasks need to be characterized in a structured way.
