A Hybrid Case Based Reasoning Model for Classification in Internet of Things (IoT) Environment

Saroj Kr Biswas (NIT Silchar, India), Debashree Devi (NIT Silchar, India) and Manomita Chakraborty (NIT Silchar, India)
Copyright: © 2018 | Pages: 19
DOI: 10.4018/JOEUC.2018100107

Abstract

This article describes how the enormous volume of data in IoT calls for an efficient data mining model for information extraction, classification and the mining of hidden patterns. Case-based reasoning (CBR) is a learning, mining and problem-solving approach that solves a new problem by relating it to similar problems solved in the past. One issue with CBR is the choice of feature weights used to measure the similarity among cases when retrieving similar past cases. Neural network (NN) pruning is a popular method that extracts feature weights from a trained neural network without losing much generality of the training set, using four mechanisms: sensitivity, activity, saliency and relevance. However, training an NN on imbalanced data biases the classifier towards the majority class. Therefore, this article proposes a hybrid CBR model with random under-sampling (RUS) and a cost-sensitive back-propagation neural network in an IoT environment to deal with the feature weighting problem in imbalanced data. The proposed model is validated on six real-life datasets. The experimental results show that the proposed model performs better than other feature weighting methods.

1. Introduction

IoT can be defined as the inter-networking of physical devices, where the devices are equipped with network connectivity and a computational model to create, transfer and process data among themselves with minimal human intervention (Rose, 2015). IoT creates a global platform for integrating the physical world into the digital world with the least human intervention, thereby improving efficiency and productivity and supporting the economic growth of society (Alam et al., 2016). The data gathered in the IoT network are used to analyze and monitor the complex environment around us, and to achieve greater optimization, higher efficiency and intelligent decision making. The devices connected through IoT share information with other devices in the network and thus create a huge database. Advanced technologies such as data mining and machine learning are combined with conventional techniques to make IoT operations smoother.

One of the main objectives of IoT is to create an efficient framework that supports smart decision making, which in turn creates the need for a data mining and learning framework. The basic motivation for using data mining in the IoT domain is the need to convert IoT-generated data into knowledge for decision making, which leads to the use of Knowledge Discovery in Databases (KDD) in IoT (Tsai et al., 2014). KDD extracts hidden, useful information from a collection of data through the following steps: selection, pre-processing, transformation, data mining and interpretation/evaluation (Tsai et al., 2014). When applied to the IoT domain, KDD converts IoT data into important facts and then into valuable knowledge.

CBR is a decision-making and problem-solving model that can be used efficiently in the IoT domain for smart decision making. CBR solves problems based on past experiences stored in a case base, and it also captures new knowledge and experiences, making them immediately available for solving new problems. Aamodt et al. (1994) and Biswas et al. (2014) describe CBR as a cyclical process comprising the four REs: retrieve, reuse, revise and retain. The core of the CBR methodology is the retrieval of similar cases stored in the case base, so a similarity measure is required to calculate how similar two cases are. The similarity measure thus becomes the key element in obtaining a reliable solution for new situations (Nunez et al., 2004; Buta, 1994), and defining similarity measures for real-world problems is one of the greatest challenges in CBR. The k-nearest neighbor (k-NN) method is one of the most popular approaches to similarity assessment; it uses a distance function to find the similarity between cases. However, the biggest problem with k-NN is determining the feature weights, because several studies have shown that k-NN's performance is highly sensitive to the definition of its distance function (Watson et al., 1994; Wettschereck et al., 1997). Researchers have introduced many k-NN variants that reduce this sensitivity by parameterizing the distance function with feature weights (Wettschereck et al., 1997).
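To make the role of feature weights in retrieval concrete, the following is a minimal sketch of k-NN case retrieval with a weighted distance. It assumes a weighted Euclidean distance, which is one common way to parameterize the distance function and is not necessarily the form used in this article. The function names, the toy case base, and the weight values are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def weighted_distance(query, case, weights):
    """Weighted Euclidean distance (an assumed form, not the article's equation).

    A larger weight makes differences in that feature count more;
    a weight of 0 removes the feature from the comparison.
    """
    if not (len(query) == len(case) == len(weights)):
        raise ValueError("query, case and weights must have the same length")
    return math.sqrt(sum(w * (q - c) ** 2 for q, c, w in zip(query, case, weights)))

def retrieve(query, case_base, weights, k=3):
    """Return the k stored (features, label) cases nearest to the query."""
    ranked = sorted(case_base, key=lambda case: weighted_distance(query, case[0], weights))
    return ranked[:k]

def classify(query, case_base, weights, k=3):
    """Reuse step: majority vote over the labels of the retrieved cases."""
    labels = [label for _, label in retrieve(query, case_base, weights, k)]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical case base: feature 1 separates the classes, feature 2 is noise.
case_base = [
    ((1.0, 9.0), "normal"),
    ((1.2, 1.0), "normal"),
    ((5.0, 5.0), "fault"),
    ((5.2, 4.0), "fault"),
]
query = (1.1, 5.0)

# With equal weights the noisy feature dominates and the nearest case is a "fault".
print(classify(query, case_base, weights=(1.0, 1.0), k=1))  # fault
# Down-weighting the noisy feature retrieves the intuitively similar "normal" case.
print(classify(query, case_base, weights=(1.0, 0.0), k=1))  # normal
```

The same query is classified differently under the two weight vectors, which is the sensitivity to the distance function that the paragraph above describes.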

The similarity of a query, q, with each stored case in the case base can be calculated by Equation (1), in which a parameterized weight value is assigned to each feature.

[Equation (1) and the remainder of its definition are not recoverable from the source.]
