Machine learning techniques have been successfully applied to real-world problems in areas as diverse as image analysis, the Semantic Web, bioinformatics, text processing, natural language processing, telecommunications, finance, and medical diagnosis. One application where machine learning plays a key role is data mining, where machine learning techniques have been used extensively to extract association, clustering, prediction, diagnosis, and regression models. This text presents our personal view of the main aspects, major tasks, frequently used algorithms, current research, and future directions of machine learning research. It is organized as follows. Background information concerning machine learning is presented in the second section. The third section discusses different definitions of machine learning. Common tasks faced by machine learning systems are described in the fourth section. Popular machine learning algorithms and the importance of the loss function are discussed in the fifth section. The sixth and seventh sections present current trends and future research directions, respectively.
What Is Machine Learning?
Informally speaking, the main goal of machine learning is to build a computational model from past experience of what has been observed. To this end, machine learning studies the automated acquisition of domain knowledge, aiming to improve a system's performance as a result of experience.
In the early 1980s, Michalski, Carbonell, and Mitchell (1983) presented one of the first definitions of machine learning: “Self-constructing or self-modifying representations of what is being experienced for possible future use” (p. 10).
The focus of this definition is on programs that modify themselves in response to feedback from their environment. This definition reflects the main research lines at that time: expert systems (Weiss & Kulikowski, 1991), automatic programming, and reinforcement learning (Sutton, 1998).
A more recent definition appears in Hand, Mannila, and Smyth (2001): “Analysis of observational data to find unsuspected relationships and to summarize the data in novel ways that is both understandable and useful for the data owner” (p. 1).
An even more recent definition is due to Alpaydin (2004), who defines machine learning as “Programming computers to optimize a performance criterion using example data or past experience” (p. 3).
Clearly, the task in these more recent definitions is much closer to data analysis, which enlarges the range of practical applications, mainly industrial and commercial, where machine learning is frequently employed. In any case, we can define machine learning as the acquisition of a useful (understandable) representation of a data set from its extensional representation.
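The idea of optimizing a performance criterion using example data can be made concrete with a small sketch. The Python code below is only an illustration under our own assumptions, not a method taken from any of the cited works: the performance criterion is taken to be the mean squared error, the model is a straight line, and the names `mean_squared_error`, `fit_line`, `learning_rate`, and `steps`, as well as the toy data, are hypothetical choices made for this example.

```python
def mean_squared_error(w, b, xs, ys):
    """Performance criterion (assumed here): average squared prediction error."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Move the parameters a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy example data (hypothetical), generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

error_before = mean_squared_error(0.0, 0.0, xs, ys)
w, b = fit_line(xs, ys)
error_after = mean_squared_error(w, b, xs, ys)
print(f"w={w:.3f}, b={b:.3f}, error before={error_before:.3f}, after={error_after:.6f}")
```

The program is not told the rule that generated the data. It only adjusts its two parameters so that the criterion improves on the examples, which is the sense in which performance is improved as a result of experience.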
Key Terms in this Chapter
Decision Tree: A symbolic learning classifier that represents a discrete function by a decision tree.
Lazy Learning: A learning approach in which the training instances are memorized and generalization is deferred until a query is made.
Data Mining: The process of extracting useful information from large databases.
Machine Learning: The programming of computers to optimize a performance criterion using example data or past experience.
Bayesian Networks: Probabilistic models for knowledge representation under uncertainty.
Neural Networks: Learning models based on the structure and processing of the nervous system.
Artificial Intelligence: The area of information technology concerned with the automation of reasoning, learning, and perception.
Inductive Learning: A learning approach based on the induction of a general concept from a limited set of observations.
Support Vector Machines: Large margin models based on statistical learning theory.
Hidden Markov Models: Stochastic models capable of performing a sequence of decisions.
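Of the terms above, lazy learning is perhaps the easiest to illustrate in a few lines. The sketch below assumes a one-nearest-neighbor rule with Euclidean distance, which is one common instance of the approach rather than its only form. The class name `NearestNeighbor`, its methods, and the toy data are hypothetical and chosen only for this example.

```python
import math


class NearestNeighbor:
    """One-nearest-neighbor classifier: a minimal lazy learner."""

    def __init__(self):
        self.instances = []

    def fit(self, points, labels):
        # "Training" only memorizes the labeled instances.
        self.instances = list(zip(points, labels))
        return self

    def predict(self, query):
        # All the work happens at query time: return the label of the
        # stored instance closest to the query in Euclidean distance.
        _, label = min(self.instances, key=lambda inst: math.dist(inst[0], query))
        return label


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "b", "b"]

model = NearestNeighbor().fit(points, labels)
print(model.predict((0.5, 0.2)))
print(model.predict((5.5, 5.0)))
```

No explicit model is built in advance here, in contrast with inductive approaches that derive a general concept from the observations before any query arrives.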