A New Back-Propagation Neural Network Algorithm for a Big Data Environment Based on Punishing Characterized Active Learning Strategy

Qiuhong Zhao (School of Economics and Management, Beihang University, Beijing, China), Feng Ye (School of Economics and Management, Beihang University, Beijing, China) and Shouyang Wang (Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China)
Copyright: © 2013 | Pages: 14
DOI: 10.4018/ijkss.2013100103


This paper introduces an active learning strategy into the classical back-propagation (BP) neural network algorithm and proposes the punishing-characterized active learning BP algorithm (PCAL-BP) to adapt to big data conditions. The PCAL-BP algorithm selects samples and punishments based on the absolute value of the prediction error to improve the efficiency of learning complex data. This approach reduces learning time while providing high precision. Numerical analysis shows that the PCAL-BP algorithm is superior to the classical BP neural network algorithm in both learning efficiency and precision, and that this advantage is more prominent with extensive sample data. In addition, the PCAL-BP algorithm is compared with 16 classical classification algorithms and performs better than 14 of them in the classification experiment used here. The experimental results also indicate that the prediction accuracy of the PCAL-BP algorithm can continue to increase as the sample size increases.
Article Preview

1. Introduction

With the fast-paced development of networks and the rapid popularization of information technology, the volume of massive, multidimensional, dynamic, real-time data is increasing rapidly. In the era of big data, data analysis is becoming increasingly important. Effective learning algorithms that rely on intelligent analysis to mine useful information from huge amounts of data not only have the potential to discover the laws that govern how things develop and change, but can also improve the efficiency of management and decision-making.

The back-propagation (BP) neural network algorithm is an intelligent learning algorithm that exhibits superior performance and has been widely used in many fields, such as prediction, identification, and system evaluation. Undeniably, however, the BP algorithm still has three drawbacks: overfitting, local minima, and slow learning speed. After years of research, great progress has been made on BP neural networks.

For example, some scholars have proposed a multi-network cooperation model (Ji & Ma, 1997) to handle the overfitting problem. Other scholars have put forward hybrid algorithms to overcome the local minimum problem by combining BP with heuristic algorithms such as PSO (Particle Swarm Optimization) (Guo, Qiao, & Hou, 2006) and genetic algorithms (Ming, Yan-chun, Xin-min, & Xiao-gang, 2010). In addition, to address slow learning speed, scholars treat BP as a nonlinear optimization model that minimizes the output error, and use the quasi-Newton method (Huang & Lin, 2009) or the LM (Levenberg-Marquardt) algorithm (Li & Liu, 2008) to speed up learning. Nevertheless, the learning speed of a traditional neural network remains slow, and the problem becomes more serious when the algorithm faces a large number of samples. Thus, designing an effective learning strategy to improve learning efficiency is a practical way to extend the application of back-propagation neural networks to the big data field.

The core concept of an active learning strategy is that choosing learning samples reasonably can yield better results than sampling at random. Active learning is suitable for complex data structures and for samples with high computational cost (Lewi, Butera, & Paninski, 2009). Active learning algorithms can be divided into two categories: algorithms with explicit objective functions (Birlutiu, Groot, & Heskes, 2013; Dror & Steinberg, 2008; Lewi et al., 2009) and those with implicit objective functions (Freund, Seung, Shamir, & Tishby, 1997; Lewis & Gale, 1994). This study explores algorithms with explicit objective functions.

In recent years, some scholars have considered adding active learning strategies to the BP neural network algorithm; for example, Seliya and Khoshgoftaar integrated active learning strategies into a BP neural network to detect computer network attacks (Seliya & Khoshgoftaar, 2010). Engelbrecht classifies active learning into incremental learning strategies and selective learning strategies (A. P. Engelbrecht, 2001). The main idea of an incremental learning strategy is that the training set is initialized with a subset of the candidate dataset. Subsequently, in each iteration, further subsets are chosen from the candidate dataset according to some criteria and added to the existing training set, and the selected subsets are removed from the candidate dataset. The selective learning strategy differs from the incremental learning strategy in that the selected subset is not deleted from the candidate dataset. Therefore, in each iteration, all candidate data have a chance of being selected, according to certain criteria, as the new training set.
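The difference between the two strategies can be illustrated with a minimal Python sketch. It is not the PCAL-BP algorithm itself: the function names, the 1-D `(input, target)` sample layout, and the choice of absolute prediction error as the selection criterion are assumptions made for illustration, since the text above only says that subsets are chosen "based on some criteria".

```python
from typing import Callable, List, Tuple

# Hypothetical sample layout: (input, target) pairs with scalar values.
Sample = Tuple[float, float]
Predictor = Callable[[float], float]


def abs_error(predict: Predictor, sample: Sample) -> float:
    """Absolute prediction error, used here as an assumed selection criterion."""
    x, y = sample
    return abs(predict(x) - y)


def incremental_step(
    train: List[Sample], candidates: List[Sample], predict: Predictor, k: int
) -> Tuple[List[Sample], List[Sample]]:
    """Incremental strategy: the k highest-error candidates are added to the
    training set and removed from the candidate dataset."""
    ranked = sorted(candidates, key=lambda s: abs_error(predict, s), reverse=True)
    return train + ranked[:k], ranked[k:]


def selective_step(
    candidates: List[Sample], predict: Predictor, k: int
) -> Tuple[List[Sample], List[Sample]]:
    """Selective strategy: the k highest-error candidates form the new training
    set, and the candidate dataset is left unchanged."""
    ranked = sorted(candidates, key=lambda s: abs_error(predict, s), reverse=True)
    return ranked[:k], list(candidates)


# Toy usage with a stand-in model that always predicts 0.0.
predict_zero: Predictor = lambda x: 0.0
pool: List[Sample] = [(1.0, 1.0), (2.0, 5.0), (3.0, 0.5), (4.0, 3.0)]

inc_train, inc_pool = incremental_step([], pool, predict_zero, k=2)
sel_train, sel_pool = selective_step(pool, predict_zero, k=2)
```

In this toy run both strategies pick the same two samples, but the incremental step shrinks the candidate pool while the selective step keeps every sample eligible for the next iteration.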
