 # Predictive Analytics

Sema A. Kalaian (Eastern Michigan University, USA) and Rafa M. Kasim (Indiana Tech University, USA)
DOI: 10.4018/978-1-4666-7272-7.ch002

## Abstract

Predictive analytics and modeling are analytical tools for knowledge discovery that examine and capture the complex relationships and patterns among variables in existing data in order to predict future organizational performance. Their use has become more commonplace, largely because of the collection of massive amounts of data, referred to as “big data,” and the increased need to transform large amounts of data into intelligent information (knowledge) such as trends, patterns, and relationships. This intelligent information can then be used to make smart, informed, data-based decisions and predictions using various methods of predictive analytics. The main purpose of this chapter is to present a conceptual and practical overview of some of the basic and advanced analytical tools of predictive analytics. The chapter provides detailed coverage of predictive analytics tools such as Simple and Multiple Regression, Polynomial Regression, Logistic Regression, Discriminant Analysis, and Multilevel Modeling.

## Introduction

Predictive analytics, also referred to as predictive modeling, is used in a variety of disciplines and fields of study such as business, management, engineering, marketing, technology, actuarial science, information systems, health informatics, and education. Predictive modeling methods are quantitative statistical techniques that are most often used to make future business predictions and decisions based on historical data (Evans & Lindner, 2012). Kuhn and Johnson (2013) define predictive modeling as “the process of developing a mathematical tool or model that generates an accurate prediction” (p. 2). The use of these methods has become more commonplace largely because of:

1. The collection of massive amounts of data, referred to as “big data”; and
2. The increasingly complex nature of related predictive research and problems.

The main objective of predictive modeling is to predict an unknown value of a dependent (outcome) variable from known values of a set of explanatory independent (predictor) variables by analyzing and capturing the relationships between the dependent and independent variables in a given research problem (Maisel & Cokins, 2014; Siegel, 2014). The results of these analyses are then used to predict specific future trends, risks, and behavior patterns. Consumer purchasing patterns, consumer loyalty, credit risks, credit limits, tax fraud, unemployment rates, and consumer attrition are examples of the trends and behavior patterns that businesses and organizations often need to predict.

Generally, predictive modeling is a complex data analytic process, of which model building is only one part. The predictive analytics process includes:

1. Understanding the predictive research problem and the massive data to be analyzed.
2. Managing the data and preparing it for analysis.
3. Analyzing the data and building the analytic models.
4. Evaluating the results and accuracy of the predictive models.
5. Deploying and tailoring the final models to directly address the original predictive research problem.
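
The five steps above can be sketched end to end in a few lines of Python. This is a minimal illustration and not the chapter's own method: the data are synthetic, and the variable names (`ad_spend`, `sales`), the 150/50 train/holdout split, and the choice of a one-predictor least-squares model are all assumptions made for the example.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(xs, ys, intercept, slope):
    """Root mean squared prediction error on the given data."""
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


# Steps 1-2: understand and prepare the data (synthetic and hypothetical here).
rng = random.Random(0)
ad_spend = [rng.uniform(0, 100) for _ in range(200)]
sales = [50 + 2.0 * x + rng.gauss(0, 5) for x in ad_spend]
train_x, test_x = ad_spend[:150], ad_spend[150:]
train_y, test_y = sales[:150], sales[150:]

# Step 3: build the model on the training rows only.
intercept, slope = fit_line(train_x, train_y)

# Step 4: evaluate accuracy on rows the model has not seen.
holdout_rmse = rmse(test_x, test_y, intercept, slope)


# Step 5: deploy the fitted model as a prediction function.
def predict(x):
    return intercept + slope * x


print(f"slope={slope:.3f}, holdout RMSE={holdout_rmse:.3f}")
```

Evaluating on held-out rows (step 4) rather than on the rows used for fitting gives a more honest estimate of how the model would perform on new data.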

One of the most difficult tasks for predictive analysts, researchers, and students conducting predictive modeling is identifying the most appropriate predictive analytic technique for answering a particular research-oriented business question. The type and number of the dependent variables are two primary factors in determining which predictive modeling technique is appropriate for a particular business problem. Provost and Fawcett (2013) stated that success in today’s data-driven businesses requires being able to think about how to correctly apply the principles, concepts, and techniques of predictive modeling to particular predictive business problems.

With the rise of the internet and electronic devices (e.g., smartphones) as means of collecting massive amounts of data, together with technological advances in computer processing power and data storage capabilities, the demand for effective and sophisticated knowledge discovery and predictive modeling techniques has grown sharply over the last decade. These techniques help business executives and policy makers make informed decisions to solve complex organizational and business problems. Indeed, the survival of businesses and organizations in a knowledge- and data-driven economy depends on the ability to transform large quantities of data and information into knowledge (Maisel & Cokins, 2014; Siegel, 2014). This knowledge can be used to make smart and informed data-based decisions and predictions using predictive analytics techniques. A decade ago, most such data were either not collected or entirely overlooked as a key resource for business and organizational success because of a lack of knowledge and understanding of the value of such information (Hair, 2007).

## Key Terms in this Chapter

Multilevel Modeling: Multilevel modeling refers to advanced analytical methods that describe, explain, and capture the hierarchical relationships between variables at one level (micro-level) of the existing data that are affected by variables at a higher level of the hierarchy (macro-level) in efforts to predict future performances, risks, trends, and behaviors.

Simple Linear Regression: Simple linear regression analysis methods are analytic techniques for explaining, analyzing and modeling the linear relationships between a continuous dependent variable and an independent variable in the recent and past existing data in efforts to build predictive models for making future business performance and risk predictions.

Discriminant Analysis: Discriminant analysis is a predictive analytic technique that uses the information from a set of independent variables to predict the value of a discrete (categorical) dependent variable, which represents the mutually exclusive groups in the predictive model.

Predictive Analytics: Predictive analytics and modeling are statistical and analytical tools that examine and capture the complex relationships and underlying patterns among variables in the existing data in efforts to predict the future organizational performances, risks, trends, and behavior patterns.

Multiple Linear Regression: Multiple linear regression analysis methods are analytic techniques for explaining, analyzing and modeling the linear relationships between a continuous dependent variable and two or more independent variables in the recent and past existing data in efforts to build predictive models for making future business performance and risk predictions.

Polynomial Regression: Polynomial regression is a predictive analytics method that is used instead of the linear regression analysis (simple or multiple linear regression analyses) for describing and explaining the nonlinear relationships between the dependent and independent variables in the recent and past existing data in efforts to predict future performances, risks, and behaviors.
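
As a reference point, a polynomial regression of degree $k$ in a single predictor is commonly written as follows. The symbols ($y_i$, $x_i$, $\beta_j$, $\varepsilon_i$) are standard notation assumed here for illustration; they are not taken from the chapter preview.

$$
y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_k x_i^k + \varepsilon_i
$$

Setting $k = 1$ recovers simple linear regression. The model is nonlinear in $x$ but still linear in the coefficients, so it can be estimated with the same least-squares approach as multiple linear regression by treating $x, x^2, \ldots, x^k$ as separate predictors.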

Two-Level Multilevel Model: Refers to analytic methods for data with two levels where individuals (micro-level) are nested within organizational groups (macro-level) and there are independent variables (predictors) characterizing each of the two levels of the multilevel model.
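
One common way to write such a model uses one equation for individuals and a set of equations for groups. The notation below follows a widely used multilevel convention and is assumed for illustration; the symbols do not appear in the chapter preview.

$$
\begin{aligned}
\text{Level 1 (individual } i \text{ in group } j\text{):}\quad & Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
\text{Level 2 (group } j\text{):}\quad & \beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j} \\
& \beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}
\end{aligned}
$$

Here $X_{ij}$ is a micro-level predictor, $W_j$ is a macro-level predictor, $r_{ij}$ is the individual-level residual, and $u_{0j}$ and $u_{1j}$ are group-level random effects that let the intercept and slope vary across groups.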

Logistic Regression: Logistic regression is a predictive analytic method for describing and explaining the relationships between a categorical dependent variable and one or more continuous or categorical independent variables in the recent and past existing data in efforts to build predictive models for predicting the membership of individuals or products in one of two groups or categories.
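
The two-group case can be illustrated with a small sketch. The data below are hypothetical (months since a customer's last purchase, and whether the customer churned), and plain gradient ascent on the log-likelihood is used only because it is short to write; statistical packages typically rely on faster Newton-type procedures.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.05, epochs=5000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by gradient ascent
    on the mean log-likelihood. Returns (b0, b1)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed minus predicted
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Hypothetical data: months since last purchase -> churned (1) or stayed (0).
months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
churned = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(months, churned)

# Predicted churn probabilities for a recent and a long-lapsed customer.
p_recent = sigmoid(b0 + b1 * 1)
p_lapsed = sigmoid(b0 + b1 * 10)

# Classify into one of the two groups with a 0.5 probability cutoff.
predicted_group = [1 if sigmoid(b0 + b1 * m) >= 0.5 else 0 for m in months]
print(f"b0={b0:.3f}, b1={b1:.3f}, p_recent={p_recent:.3f}, p_lapsed={p_lapsed:.3f}")
```

The 0.5 cutoff is a conventional default, not a requirement; in practice the threshold is often chosen to reflect the relative costs of the two kinds of misclassification.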
