An Architecture of the Semantic Meta Mining Assistant for Adaptive Domain-Oriented Data Processing

Yang Jiafeng, Nataly Zhukova, Sergey Lebedev, Man Tianxing
DOI: 10.4018/IJERTCS.302111

Abstract

Data mining (DM) is applied in various domains to extract knowledge from domain data. How efficiently DM algorithms perform in practice depends on the context, including data characteristics, task requirements, and available resources. Semantic meta mining is the technique of building DM workflows through algorithm/model selection, using a description framework that clarifies the complex relationships between tasks, data, and algorithms at different stages of the DM process. In this article, an architecture of a semantic meta mining assistant for domain-oriented data processing is proposed. A case study that applies the proposed architecture to time series classification tasks is discussed.

1. Introduction

Nowadays huge amounts of data are generated in various domains. Especially with the construction of the Internet of Things in embedded and real-time communication systems (Shukla, A. K., et al. (2018), Kumar, H., & Tyagi, I. (2019)), there is a significant need to extract knowledge from these data through data processing and analysis. Data mining (DM) is the practice of analyzing large existing datasets to generate new information, also known as the process of knowledge discovery from databases (Joseph, S. I. T., & Thanakumar, I. (2019)). By now a considerable number of DM algorithms have been developed (Meigal, A. Y., et al. (2019)). How efficiently these algorithms perform in practice depends on the context (including data characteristics, task requirements, and available resources). Different contexts call for different algorithms. Selecting DM algorithms for data processing and analysis requires the knowledge of DM experts. This leads to unjustified consumption of considerable human resources and to time delays.

To formalize data processing and analysis using DM algorithms, a number of standards for constructing data mining processes have been developed. Today, three main standards exist for building DM processes: CRISP-DM (Chapman, P., et al. (2000)), SEMMA (Matignon, Randall. (2007)), and KDD (Fayyad et al. (1996)). According to these standards, DM processes consist of several stages and hundreds of activities. The stages include data preparation, modeling, and evaluation. Each of them requires a choice of operators/algorithms, so obtaining an effective solution takes considerable effort and time.
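To give a sense of why choosing operators at every stage is costly, the following minimal Python sketch enumerates candidate workflows for the three stages named above. The operator names and the `enumerate_workflows` helper are hypothetical and chosen only for illustration; they are not taken from CRISP-DM, SEMMA, or KDD.

```python
from itertools import product

# Hypothetical operators per stage (illustrative assumption, not from any standard).
STAGE_OPERATORS = {
    "data_preparation": ["normalize", "impute_missing", "discretize"],
    "modeling": ["decision_tree", "knn", "svm", "naive_bayes"],
    "evaluation": ["holdout", "cross_validation"],
}


def enumerate_workflows(stage_operators):
    """Return every workflow that picks exactly one operator per stage."""
    stages = list(stage_operators)
    choices = [stage_operators[stage] for stage in stages]
    return [dict(zip(stages, combo)) for combo in product(*choices)]


workflows = enumerate_workflows(STAGE_OPERATORS)
print(len(workflows))  # 3 * 4 * 2 = 24 candidate workflows
```

Even with this toy catalogue the number of candidates is the product of the per-stage choices, so it grows quickly as stages and operators are added.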

To address this issue, a number of systems that support DM processes and can be used for DM workflow generation have been proposed:

  • RapidMiner (Hofmann, M., & Klinkenberg, R. (Eds.). (2016))

  • OpenML (Vanschoren, J., et al. (2014))

  • Google: Cloud AutoML (Bisong, E. (2019)), Google’s Prediction API (Ujwal, U. J., et al. (2018))

  • Microsoft: Custom Vision (Salvaris, M., et al. (2018))

  • Amazon: Amazon Machine Learning (Herbrich, R. (2017))

  • Others: BigML.com, Wise.io, SkyTree.com, Dato.com, Prediction.io, DataRobot.com

These systems are implemented based on the following techniques:

  • Meta learning (Vilalta, Ricardo, et al. (2004)), that is, learning to learn, is defined as the application of machine learning (ML) techniques to meta-data about past machine learning experiments, with the goal of modifying some aspects of the learning process in order to improve the performance of the resulting model.

  • AutoML (He, Xin, et al. (2021)) is the process of automating the tasks of applying ML to real-world problems. AutoML consists of meta learning and hyperparameter optimization.
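The meta learning idea described above can be illustrated with a minimal Python sketch: meta-features of past datasets are stored together with the algorithm that performed best on them, and a nearest-neighbour vote over this meta-data recommends an algorithm for a new dataset. The meta-feature names, the stored records, and the function names are all hypothetical assumptions; this is one simple way to use meta-data, not the method of any system listed above.

```python
import math
from collections import Counter

# Hypothetical meta-data about past experiments:
# (meta-features of a dataset, algorithm that performed best on it).
PAST_EXPERIMENTS = [
    ({"log_n_instances": 2.0, "log_n_features": 0.7, "class_entropy": 1.0}, "knn"),
    ({"log_n_instances": 2.3, "log_n_features": 1.0, "class_entropy": 0.9}, "knn"),
    ({"log_n_instances": 5.0, "log_n_features": 2.0, "class_entropy": 0.5}, "random_forest"),
    ({"log_n_instances": 4.7, "log_n_features": 2.3, "class_entropy": 0.6}, "random_forest"),
]


def meta_distance(a, b):
    """Euclidean distance between two meta-feature dictionaries with equal keys."""
    return math.sqrt(sum((a[key] - b[key]) ** 2 for key in a))


def recommend_algorithm(meta_features, k=3):
    """Majority vote among the k most similar past datasets."""
    ranked = sorted(
        PAST_EXPERIMENTS,
        key=lambda record: meta_distance(meta_features, record[0]),
    )
    votes = Counter(algorithm for _, algorithm in ranked[:k])
    return votes.most_common(1)[0][0]


new_dataset = {"log_n_instances": 2.1, "log_n_features": 0.8, "class_entropy": 0.95}
print(recommend_algorithm(new_dataset))  # knn
```

In practice the meta-features would be computed from the data and the records collected from real experiment logs; both are hard-coded here to keep the sketch self-contained.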

Architectures based on meta learning and AutoML focus on the data preparation and modeling stages of DM processes, rather than on building a workflow for the entire DM process.
