Entropy and Algorithm of Obtaining Decision Trees in a Way Approximated to the Natural Intelligence

Olga Popova (Kuban State Technological University, Krasnodar, Russia), Boris Popov (Kuban State Technological University, Krasnodar, Russia), Vladimir Karandey (Kuban State Technological University, Krasnodar, Russia) and Alexander Gerashchenko (Kuban State Technological University, Krasnodar, Russia)
DOI: 10.4018/IJCINI.2019070104

Abstract

Classifying the knowledge of a specified subject area is a pressing task. The well-known entropy-based methods of obtaining decision trees are not suitable for classifying subject area knowledge. The article therefore proposes a new algorithm for obtaining decision trees in a way that approximates natural intelligence. Here, the knowledge of a subject area is represented as a set of answers to questions that help to find the solution to the current task. The article also analyzes the connection between entropy and the appearance of knowledge, the classification of previous knowledge, and the definitions used in decision trees. The last of these is needed to compare the suggested algorithm with the traditional method on a small example. Finally, the article analyzes the solution of a classification task for one such subject area, optimization methods.

Introduction

Nowadays numerous classification tasks (Bahnsen, Aouada & Ottersten, 2015; Carlos & Abellán, 2014) are being solved in various areas of science. They have become relevant owing to the rapid development of information technology. Progress in the methods of data collection, storage and processing has made it possible to accumulate huge volumes of data, so methods of automated analysis are needed to process them.

Currently, a number of popular problems can be solved with machine learning: classification, regression, clustering, anomaly detection and others. In classification problems, the category to which an object belongs is determined from its attributes. In regression problems, the value of one attribute of an object is forecast on the basis of its other attributes. Clustering breaks a set of objects down into groups according to their attributes, so that objects within a group are similar to each other and less similar to objects in other groups. In anomaly detection problems, a search is performed for objects that are “highly dissimilar” to all other objects in the sample or to a group of objects. Other problems are more specific.

There are two types of machine learning algorithms: supervised (Deng, Wang, Li, Horng & Zhu, 2019; Cao, Qian, Wu & Wong, 2019) and unsupervised (Aissaoui, Madani, Oughdir & Allioui, 2019; Mak, Lee, & Park, 2019). Unsupervised learning problems use samples consisting of objects that are described by a set of attributes. Supervised learning problems additionally have a training sample in which the target attribute is known for each object. At present, two types of supervised learning problems are relevant: classification and regression. The simplest and most popular method for solving these problems is the decision tree, which is used in daily life in the most diverse areas of human activity, at times in ones far removed from machine learning. Visual directions on what to do in which situation can also be called a decision tree. The major advantage of decision trees is that they are easy to interpret and resemble the model of human decision-making, which is why they have won immense popularity. Thus, the C4.5 classification method, which uses decision trees, is listed first among the top 10 data mining algorithms (Wu, Kumar, Quinlan, Ghosh, Yang, Motoda, McLachlan, Ng, Liu, Yu, Zhou, Steinbach, Hand, Steinberg, 2008).
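
The traditional entropy-based way of choosing a split in a decision tree can be illustrated with a minimal Python sketch. It computes the Shannon entropy of a set of class labels and the information gain of splitting on one attribute, the criterion on which ID3-style methods are built (C4.5 refines it into a gain ratio). This is not the algorithm proposed in the article; the function names and the toy sample are assumptions made purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the sample on one attribute."""
    # Group the class labels by the value the attribute takes in each row.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy sample: attribute names, values and labels are invented.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

# The attribute with the largest information gain becomes the root test.
best = max(rows[0], key=lambda a: information_gain(rows, labels, a))
print(best)
```

In this toy sample the "outlook" attribute separates the two classes completely, so it yields the larger gain and would be chosen as the first question of the tree, while "windy" carries no information about the class.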
