I. Introduction
The Internet of Things (IoT) can be defined as the interconnection and coordination of diverse entities, where an entity may be an object, a human, or a machine that requests or provides a service (J. Lin, Wei Yu, Nan Zhang, Xinyu Yang, 2017). In industrial settings, it is known as the Industrial Internet of Things (IIoT). IIoT deals with the interconnection of machines, actuators, and controllers, and it increases productivity and automation in industrial areas such as transportation, manufacturing, and processing (A. Hassanzadeh, S. Modi & S. Mulchandani, 2015). Although IIoT continues to influence present-day practice and aims to create new future perspectives, it poses significant administrative and design problems (L. Da Xu, W. He & S. Li, 2014).
An anomaly is a data point that behaves in an unintended way and deviates from the rest of the data. Anomaly detection, also known as deviation detection, novelty discovery, or outlier recognition, is the task of identifying patterns in the data that do not match the anticipated behaviour. Although anomalies are unusual, they are significant, and the research community has therefore paid considerable attention to anomaly detection (X. Liu & P. S. Nielsen, 2016) (M. Schreyer, T. Sattarov, D. Borth, A. Dengel, & B. Reimer, 2018). Anomalies can significantly affect the precision of data-based classification, prediction, and other operations; it is therefore critical to identify outliers both rapidly and effectively in order to increase the accuracy of subsequent data-based operations (Saihua Cai, Li Li, Sicong Li, Ruizhi Sun, Gang Yuan, 2020).
Anomaly detection has a broad range of applications, including credit card fraud detection, industrial damage detection, healthcare, image processing, computer intrusion detection, failure detection, and more (C. Chahla, H. Snoussi, L. Merghem & M. Esseghir, 2019).
Anomaly identification strategies in machine learning are classified into three categories depending on the labels available in the dataset:
Supervised Methods
In supervised learning techniques, ML models are trained on both abnormal and normal data, and an unseen data instance is labelled as anomalous or normal according to the class it is assessed to belong to (W. Cui & H. Wang, 2017).
Semi-Supervised Methods
In semi-supervised techniques, machine learning models are trained only on normal data; an unseen data instance is labelled as normal if it is consistent with the model, and otherwise it is labelled as anomalous (W. Cui & H. Wang, 2017).
Unsupervised Methods
In unsupervised models, no labelled training data is required, mainly because the anomalies in a given data set are assumed to be much rarer than normal data (W. Cui & H. Wang, 2017).
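To make the semi-supervised category concrete, the following minimal sketch builds a profile from normal data only and labels an unseen reading as anomalous when it deviates too far from that profile. It is an illustration only, not the model developed in this work: the z-score rule, the threshold of 3, the function names, and the sensor readings are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def fit_normal_profile(normal_readings):
    """Learn a simple profile (mean, standard deviation) from normal data only."""
    return mean(normal_readings), stdev(normal_readings)


def label_reading(value, profile, threshold=3.0):
    """Label a reading as anomalous if it lies more than `threshold`
    standard deviations away from the mean of the normal profile."""
    mu, sigma = profile
    z_score = abs(value - mu) / sigma
    return "anomalous" if z_score > threshold else "normal"


# Hypothetical sensor readings assumed to contain normal behaviour only.
normal_readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 20.1, 19.7]
profile = fit_normal_profile(normal_readings)

# Unseen readings: the first two are close to the profile, the last is not.
labels = [label_reading(v, profile) for v in [20.0, 19.9, 27.5]]
print(labels)
```

A supervised variant would instead need labelled examples of both classes, while an unsupervised variant would have to estimate the profile from unlabelled data that may already contain anomalies.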
For this research, we developed a model for anomaly detection using multiple classification algorithms such as Logistic Regression (LR), K-Nearest Neighbor (KNN), Random Forest (RF), Decision Tree (DT), and Light Gradient Boosting Machine (LightGBM), and compared the results based on various performance metrics.
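As a rough illustration of how classifiers can be compared on performance metrics, the sketch below computes accuracy, precision, recall, and F1 score from true and predicted labels. The choice of these four metrics is an assumption, since the metrics are not named at this point in the paper, and the labels and the model names `model_a` and `model_b` are hypothetical placeholders rather than results from this study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with 1 = anomalous, 0 = normal."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for one set of predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions from two placeholder models.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
predictions = {
    "model_a": [0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
    "model_b": [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
}

results = {name: evaluate(y_true, y_pred) for name, y_pred in predictions.items()}
for name, scores in results.items():
    print(name, scores)
```

In this toy example the two models have similar accuracy but very different recall, which is why accuracy alone is usually a poor basis for comparison when anomalies are rare.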
This paper is organized as follows: Section II discusses the literature survey, Section III presents the methodologies, Section IV describes the experimental evaluation, and Section V discusses the conclusion and future scope.
II. Literature Survey
This section examines relevant work on several aspects, including IIoT and various machine learning algorithms for anomaly detection.
Emiliano Sisinni et al. (Emiliano Sisinni, Abusayeed Saifullah, Song Han, Ulf Jennehag, Mikael Gidlund, 2018) presented a detailed study of IoT, IIoT, and Industry 4.0, including the opportunities and challenges of this paradigm. They also discussed recent research that addresses these challenges, which involve the need for energy efficiency, interoperability, security, real-time performance, and privacy in the IIoT field.