Accelerating Training Process in Logistic Regression Model using OpenCL Framework

Hamada M. Zahera (Menoufia University, Al Minufya, Egypt) and Ashraf Bahgat El-Sisi (Menoufia University, Al Minufya, Egypt)
Copyright: © 2017 | Pages: 12
DOI: 10.4018/IJGHPC.2017070103

In this paper, the authors propose a new parallel approach, implemented on Graphics Processing Units (GPUs), for training logistic regression models. Logistic regression has been applied in many machine learning applications to build predictive models. However, training typically takes a long time to produce an accurate prediction model. Researchers have worked to reduce training time using different technologies such as multi-threading, multi-core CPUs, and the Message Passing Interface (MPI). In their study, the authors exploit the high computational capability of GPUs and the ease of development with the Open Computing Language (OpenCL) framework to execute the logistic training process. GPUs and OpenCL are presented as a low-cost, high-performance choice for scaling logistic regression to large datasets. The proposed approach was implemented in OpenCL C/C++ and tested with datasets of different sizes on two GPU platforms. The experimental results showed a significant improvement in execution time on large datasets, with execution time decreasing as the number of available GPU compute units increases.

1. Introduction

In recent decades, many machine learning techniques have been applied to complex problems such as prediction, recommendation, and classification. Logistic regression is a popular machine learning algorithm that has been used in various prediction tasks such as financial analysis and medical diagnosis; it predicts an output value from a set of attributes, or input variables. First, a logistic model must be built by training on previous cases. The datasets should contain a variety of training examples and cover many cases. The training complexity of logistic regression depends on the characteristics of the problem and the volume of the dataset. Such training can take many hours, or even days, to reach the desired accuracy (Van Heeswijk, Miche, Oja, & Lendasse, 2011). In addition, finding the best-fit settings for a logistic model requires a number of cross-validation experiments, which can also be very time-consuming (Diamantidis, Karlis, & Giakoumakis, 2000).
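
As an illustration of the training step described above, the following is a minimal sketch of logistic regression fitted by batch gradient descent, written in plain Python. It is not the authors' OpenCL implementation: the function names (`sigmoid`, `predict`, `train`), the learning rate, the epoch count, and the toy dataset are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, bias, x):
    """Estimated probability that example x belongs to class 1."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train(examples, labels, learning_rate=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(examples)
    n_features = len(examples[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        # Every epoch visits every training example once, so the cost of
        # this loop grows with the size of the dataset.
        for x, y in zip(examples, labels):
            error = predict(weights, bias, x) - y
            grad_b += error
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
        bias -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    return weights, bias

# Made-up toy data: one attribute, label 1 when the attribute is large.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = train(X, y)
```

In this sketch the inner loop over training examples dominates the running time, which is why training cost rises with dataset volume.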

Many techniques have been proposed to speed up the training process and achieve high-performance computation, for example multithreading, multi-core CPUs, the Message Passing Interface (MPI), and, more recently, the Open Computing Language (OpenCL) (Zouaneb, Belarbi, & Chouarfia, 2016). Each technology has its own deployment characteristics and execution cost. Nowadays, many researchers focus on parallelizing a variety of computationally complex algorithms (Lotrič & Dobnikar, 2009) using the capabilities of new hardware architectures. OpenCL is a high-performance computing and development framework, developed by the Khronos Group, that extends GPU computation beyond graphics rendering to general-purpose computation (often called general-purpose GPU computing, or GPGPU) (Harris, 2005) (Witten, Frank, Hall, & Pal, 2016). OpenCL provides an industry standard for parallel programming of heterogeneous computing platforms (Zouaneb, Belarbi, & Chouarfia, 2016). Unlike the Compute Unified Device Architecture (CUDA), which is restricted to NVIDIA GPUs, it is not tied to a specific GPU vendor. Recent GPUs have shown superb computational performance compared with current multi-core CPUs. Although GPUs were designed specifically to accelerate graphics processing, they have also been used as general-purpose computing devices. Certain high-performance GPUs are now designed to execute general-purpose workloads rather than graphics rendering, which was previously the only use of GPUs (Munshi, 2009). Our implementation takes the specifics of the OpenCL architecture into account so that the parallelization is devised carefully.

We present a GPU implementation of the logistic training process that achieves significant performance gains on large datasets: training takes a few milliseconds, compared with seconds for an implementation using standard APIs. We evaluated the proposed method with six datasets of different sizes on two platforms. The results showed that our parallel implementation has a distinct advantage over the original algorithm, and that this advantage can be obtained on a wide range of available GPU hardware. We encourage researchers in other fields to speed up large-scale data processing by parallelizing algorithms with GPUs and the OpenCL framework. This solution requires no extra resources and makes considerable use of the machine's existing resources: the CPU and the GPU. The main contributions of this paper are as follows:

  • An accelerated, parallel approach to the training process in logistic regression. The proposed method can help accelerate processing in many complex applications (e.g., face detection, speech recognition).

  • A platform-independent implementation. The proposed approach can run on any device with an OpenCL-capable GPU.

  • Maximized resource utilization, engaging the available computing resources, such as the CPU and all available GPU devices, in processing.
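
The preview does not show how the work is divided among GPU compute units, so the following is only a sketch of one common data-parallel decomposition, not the authors' kernel. It assumes the training set is split into chunks, each chunk's gradient contribution is computed independently, and the partial results are then summed. The helper names (`partial_gradient`, `split`, `reduced_gradient`) and the data are hypothetical. The chunks are processed one after another here; on a GPU each could be assigned to its own group of work-items.

```python
import math

def sigmoid(z):
    # Adequate for the small |z| values used in this example.
    return 1.0 / (1.0 + math.exp(-z))

def partial_gradient(weights, bias, chunk):
    """Gradient contribution of one chunk of (features, label) pairs."""
    grad_w, grad_b = [0.0] * len(weights), 0.0
    for x, y in chunk:
        error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
        grad_b += error
        for j, xj in enumerate(x):
            grad_w[j] += error * xj
    return grad_w, grad_b

def split(data, n_chunks):
    """Deal examples round-robin into n_chunks (a hypothetical work split)."""
    return [data[i::n_chunks] for i in range(n_chunks)]

def reduced_gradient(weights, bias, data, n_chunks):
    """Compute per-chunk partial gradients, then sum them (the reduction)."""
    # No chunk depends on another, so these calls could run concurrently.
    partials = [partial_gradient(weights, bias, c) for c in split(data, n_chunks)]
    grad_w = [sum(p[0][j] for p in partials) for j in range(len(weights))]
    grad_b = sum(p[1] for p in partials)
    return grad_w, grad_b

# Made-up examples and parameters, for illustration only.
data = [([0.5, 1.0], 0), ([1.5, -0.5], 1), ([2.0, 0.3], 1),
        ([-1.0, 0.8], 0), ([0.1, 0.1], 1)]
weights, bias = [0.2, -0.1], 0.05

serial = reduced_gradient(weights, bias, data, 1)   # one chunk: plain loop
chunked = reduced_gradient(weights, bias, data, 3)  # three independent chunks
```

Up to floating-point rounding, `serial` and `chunked` hold the same gradient, which is what allows the per-example work to be spread over more compute units without changing the training result.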
