Genetic Programming as Supervised Machine Learning Algorithm

DOI: 10.4018/978-1-5225-6005-0.ch002

Abstract

This chapter presents the theory and procedures behind supervised machine learning and shows how genetic programming can be applied as an effective machine learning algorithm. Because computer programs are a simple yet powerful representation, genetic programming can solve many supervised machine learning problems, especially regression and classification. The chapter starts with the theory of supervised machine learning by describing the three main groups of modelling: regression, binary classification, and multiclass classification. Through these kinds of modelling, the most important performance parameters and skill scores are introduced. The chapter also describes the procedures for model evaluation and the construction of the confusion matrix for binary and multiclass classification. The second part describes in detail how to use genetic programming to build high-performance GP models for regression and classification. It also describes the procedure for generating computer programs for binary and multiclass classification problems by introducing the concept of a predefined root node.

Machine Learning

Machine Learning (ML) is a branch of artificial intelligence (AI) that gives computers the ability to learn without being explicitly programmed. The ML process consists of searching the data to recognize patterns in it; this recognition process can be regarded as the computer's learning. Once the patterns are recognized, the computer can use the retained knowledge to make predictions for new or unseen data, with more or less accuracy.

ML can be categorized according to the task to be solved:

  • Supervised,

  • Unsupervised and

  • Reinforcement learning.

In supervised ML, the learning process consists of finding the rule that maps inputs (features) to outputs (labels). For the learning process, the available data is usually divided into two sets. The training set is used for training, that is, for extracting knowledge from the data. The second set, called the validation or testing set, is held back and used to check the learned model for overfitting.
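
The split into a training and a testing set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the synthetic data, the 25% test fraction, the helper names (`train_test_split`, `fit_line`, `mean_squared_error`), and the straight-line model are all made up for the example and do not come from the chapter.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into a training and a testing set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(rows):
    """Least-squares fit of y = a*x + b on (feature, label) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def mean_squared_error(rows, a, b):
    """Average squared difference between labels and predictions."""
    return sum((y - (a * x + b)) ** 2 for x, y in rows) / len(rows)

# Hypothetical data: label = 2*feature + 1 plus a little noise.
noise = random.Random(1)
data = [(float(x), 2.0 * x + 1.0 + noise.uniform(-0.1, 0.1)) for x in range(20)]

train, test = train_test_split(data)         # 15 training rows, 5 testing rows
a, b = fit_line(train)                       # learn the rule from the training set only
train_error = mean_squared_error(train, a, b)
test_error = mean_squared_error(test, a, b)  # error on data not seen during training
print(f"a={a:.3f} b={b:.3f} train MSE={train_error:.4f} test MSE={test_error:.4f}")
```

A testing error much larger than the training error would suggest that the model has overfitted the training set; here the two are expected to be of similar size because the model matches the process that generated the data.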

Unsupervised learning is the process of discovering patterns in data that has no defined output. With unsupervised learning, the correct result cannot be determined because no output variable is defined. Algorithms are left to discover as much knowledge as possible from the data on their own. Because there is no output variable, there is no need to split the available data set into training and testing parts; all data is used for training and for extracting patterns and knowledge. This kind of learning can be applied in image and signal processing, computer vision, etc.
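
As a small illustration of pattern discovery without an output variable, the sketch below groups one-dimensional points into two clusters. The choice of a basic k-means loop, the function name `two_means`, and the sample points are assumptions made for this example; the chapter does not name a specific unsupervised algorithm.

```python
def two_means(points, iterations=10):
    """Group 1-D points into two clusters with a basic k-means loop."""
    centers = [min(points), max(points)]  # simple deterministic starting centers
    clusters = [[], []]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest center.
        clusters = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabeled data: no output variable, all points are used.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, clusters = two_means(points)
print(centers, clusters)
```

Note that the whole data set is passed to the algorithm; no portion is held back, which matches the absence of a training and testing split described above.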

Reinforcement ML is the process in which the computer interacts with a dynamic system in which it must achieve a goal (such as driving a vehicle or playing a game). The system provides feedback on how the last action was rated, that is, whether it was a success or a failure. Based on this feedback, the computer can learn and make further decisions.
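
The feedback loop can be sketched with a toy problem in which the computer repeatedly chooses between two actions and is told only whether the last action succeeded or failed. The epsilon-greedy strategy, the function name `run_bandit`, and the success probabilities are assumptions chosen for this illustration, not a method taken from the chapter.

```python
import random

def run_bandit(success_probs, steps=2000, epsilon=0.1, seed=0):
    """Learn which action succeeds most often from success/failure feedback."""
    rng = random.Random(seed)
    counts = [0] * len(success_probs)    # how often each action was tried
    values = [0.0] * len(success_probs)  # estimated success rate per action
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(success_probs))
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(values)), key=values.__getitem__)
        # Feedback from the (simulated) system: 1.0 for success, 0.0 for failure.
        reward = 1.0 if rng.random() < success_probs[action] else 0.0
        counts[action] += 1
        # Incremental update of the running average for the chosen action.
        values[action] += (reward - values[action]) / counts[action]
    return counts, values

# Hypothetical system: action 0 succeeds 20% of the time, action 1 succeeds 80%.
counts, values = run_bandit([0.2, 0.8])
print(counts, values)
```

After enough interactions the estimates steer the choices toward the action that succeeds more often, which is the sense in which the computer learns from feedback and uses it for further decisions.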
