A Taxonomy of Data Mining Problems

Nayem Rahman (Portland State University, Portland, USA)
DOI: 10.4018/978-1-7998-2460-2.ch026


Much of the research in data mining and knowledge discovery has focused on the development of efficient data mining algorithms. Researchers and practitioners have developed data mining techniques to solve diverse real-world data mining problems. However, there is no single source that identifies which techniques solve which problems and how, along with their advantages, limitations, and real-life use cases. Lately, identifying data mining techniques and the corresponding problems that they solve has drawn significant attention. In this article, the author describes the progress made in developing data mining techniques and then classifies them under a taxonomy of data mining problems, to help practitioners choose appropriate techniques for solving business problems. This will also allow researchers to expand the body of knowledge in this discipline. The proposed taxonomy of data mining problems is based on the data mining techniques in use. Prominent data mining problems include classification, optimization, prediction, partitioning, relationship, pattern matching, recommendation, ranking, sequential patterns, and anomaly detection. The data mining techniques used to solve these problems generally fall among the top 10 data mining algorithms.

Literature Review

Most business organizations hold huge volumes of data. Knowledge workers cannot read and analyze this data without special tools. Making informed business decisions requires processing the data with ETL tools, business intelligence (Schlesinger & Rahman, 2015; Rahman & Iverson, 2015), and data mining tools (MacLennan et al., 2009). ETL tools, database SQL, and reporting tools can do this job to some extent, but they are not enough to analyze data and find interesting patterns. For instance, the data mining finding that customers who buy diapers also tend to purchase beer links two seemingly unrelated items and is a complex pattern to identify (Padmanabhan and Tuzhilin, 2000). Conventional SQL and reporting tools cannot handle this; it requires running complex algorithms. A large number of data mining algorithms (Wu et al., 2008) and tools are on the market to process data, find relationships and patterns in raw data, and deliver results that can be utilized in decision support systems (DSS). Reporting tools are also available for knowledge workers to generate reports that help make business decisions.
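
To make the diapers-and-beer example concrete, the following is a minimal sketch of how such a relationship can be scored with the two standard association rule measures, support (the share of baskets containing both items) and confidence (the share of baskets with the first item that also contain the second). It is an illustration only, not the method used in the cited work: the function name `pair_rules`, the thresholds, and the five baskets are all hypothetical, and the brute-force pair counting shown here would not scale the way dedicated algorithms such as Apriori are designed to.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    Hypothetical helper for illustration; thresholds are arbitrary defaults.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # de-duplicate and fix pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of baskets containing both items
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    # Strongest rules first; names break ties so the order is deterministic.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


# Made-up baskets, not real sales data.
baskets = [
    ["diapers", "beer", "milk"],
    ["diapers", "beer"],
    ["diapers", "bread"],
    ["beer", "chips"],
    ["diapers", "beer", "chips"],
]

rules = pair_rules(baskets)
for x, y, support, confidence in rules:
    print(f"{x} -> {y}: support={support:.2f}, confidence={confidence:.2f}")
```

On these toy baskets the rule from diapers to beer passes both thresholds, because the two items appear together in three of the five baskets and in three of the four baskets that contain diapers.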
