Visualization of Feature Engineering Strategies for Predictive Analytics

Saggurthi Kishor Babu (Andhra Loyola Institute of Engineering and Technology, Vijayawada, India) and S. Vasavi (VR Siddhartha Engineering College, Vijayawada, India)
Copyright: © 2018 | Pages: 25
DOI: 10.4018/IJNCR.2018100102

Abstract

Predictive analytics can forecast trends, determine statistical probabilities, and act upon fraud and security threats in big data applications. A predictive analytics as a service (PAaaS) framework, based upon an ensemble model that uses Gaussian processes with varying hyperparameters, Artificial Neural Networks, and an Auto Regression algorithm, is discussed in the authors' earlier works. Such a framework can derive in-depth statistical insights from data that help in the decision-making process. This article reports on the presentation layer of PAaaS for real-time visualization and analytical reporting of these statistical insights. Results from various feature engineering strategies for predictive analytics are visualized using Tableau, with the visualization technique chosen according to the type of feature engineering strategy.

1. Introduction

As explained by Buytendijk and Trepanier (2010), predictive models can find the relationship between an outcome and the variables it depends on. The predictive analytics process has six phases. In the initial phase, the project is defined in terms of its outcomes, objectives, scope, and deliverables. In the next phase, data is collected from various sources and analyzed. This analysis requires preprocessing strategies such as data cleaning, transformation, and data modeling, so that useful data is extracted for further processing. Subsequently, the initial hypothesis is validated using statistical models. The next phase is predictive modeling for forecasting the future. After implementation, the results can be deployed for use in day-to-day decision making. The last phase is monitoring the model to ensure that it is providing the expected results.

The performance of the computing layer of our framework is described in Babu, Vasavi, and Nagarjuna (2017) and Babu and Vasavi (2018). This layer determines which of the algorithms, such as Artificial Neural Networks (ANN), the Auto Regression algorithm (ARX), and Gaussian process (GP), performs better on an income tax dataset for identifying fraud in the projected tax values.

Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data. The performance of machine learning methods depends heavily on the choice of data representation to which they are applied. Hence, much of the actual effort in deploying machine learning algorithms goes into the design of preprocessing pipelines and data transformations that result in a representation of the data that can support effective machine learning. Feature selection is the process of finding suitable features to be used in model building. Feature selection techniques are used for four reasons:

  1. Simplification of models so that scientists and clients can interpret them;

  2. Shorter training time;

  3. Avoidance of the curse of dimensionality;

  4. Enhanced generalization by reducing overfitting.
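
As an illustration of feature selection, the following minimal sketch ranks candidate features by the absolute value of their Pearson correlation with the target and keeps the top k. This simple filter approach is an assumption chosen for brevity and is not necessarily the strategy used in the authors' framework. The column names (declared_income, num_dependents, record_id), the tax_due target, and the helper functions pearson and select_top_k are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_top_k(columns, target, k):
    """Return the names of the k features with the largest |correlation| to target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical toy data: three candidate features and one target.
columns = {
    "declared_income": [10.0, 20.0, 30.0, 40.0, 50.0],
    "num_dependents": [2.0, 0.0, 3.0, 1.0, 2.0],
    "record_id": [7.0, 7.0, 7.0, 7.0, 7.0],  # constant, so uninformative
}
tax_due = [1.1, 2.0, 3.2, 3.9, 5.1]

selected = select_top_k(columns, tax_due, k=2)
print(selected)
```

Dropping the constant record_id column gives a smaller model that is easier to interpret and faster to train, which corresponds to the first two reasons listed above. A correlation filter only captures linear, one-feature-at-a-time relationships, so it should be read as a starting point rather than a complete selection method.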
