Mobile GPU Computing Based Filter Bank Convolution for Three-Dimensional Wavelet Transform

Di Zhao (Chinese Academy of Sciences, China)
Copyright: © 2017 |Pages: 17
DOI: 10.4018/978-1-5225-0983-7.ch031


Mobile GPU computing, or System on Chip with embedded GPU (SoC GPU), has recently come into great demand. Since these SoCs are designed for mobile devices running real-time applications such as image processing and video processing, highly efficient implementations of the wavelet transform are essential for these chips. In this paper, the author develops two SoC GPU based DWT implementations: signal based parallelization for discrete wavelet transform (sDWT) and coefficient based parallelization for discrete wavelet transform (cDWT), and evaluates the performance of the three-dimensional wavelet transform on the Tegra K1 SoC GPU. Computational results show that SoC GPU based DWT is significantly faster than SoC CPU based DWT. They also show that sDWT can generally satisfy the requirement of real-time processing (30 frames per second) at image sizes of 352×288, 480×320, 720×480 and 1280×720, while cDWT achieves real-time processing only at the small image sizes of 352×288 and 480×320.
Chapter Preview

1. Introduction

A System on Chip (SoC) is a tiny but complete computer that contains almost all components of a computer, such as CPU, GPU and memory. Because of their low power consumption, SoCs are widely embedded in mobile systems. Recent mainstream SoCs include Atom from Intel, Tegra from Nvidia, Snapdragon from Qualcomm, Ax from Apple, MTx from MediaTek, etc., where x stands for one or more digits. SoCs run multiple operating systems such as Android (Alejandro Acosta & Francisc Almeida, 2014; Alejandro Acosta & Francisco Almeida, 2014a), Linux and Windows.

Existing research topics on SoCs include performance analysis (A. Acosta & F. Almeida, 2014; Alejandro Acosta & Francisco Almeida, 2014b; Papadopoulos et al., 2014), power consumption (Grasso, Radojkovic, Rajovic, Gelado, & Ramirez, 2014; Papadopoulos et al., 2014; Zhan, Lung, & Srivastava, 2014), etc. Similar to a regular GPU in a desktop or notebook, the SoC embedded GPU (SoC GPU) is responsible for graphics processing on the SoC (Giles & Reguly, 2014). Companies develop different architectures for SoC GPUs, for example PowerVR in Apple’s Ax and Kepler in Tegra K1 (Singh & Jain, 2014).

Tegra K1 is one of Nvidia’s latest SoCs and comes in a 32-bit version and a 64-bit version. The 32-bit version was released in 2014, and the 64-bit version is under development. The 32-bit Tegra K1 is fabricated on a 28 nm HPM process. Tegra is developed for applications such as rendering (Mobeen & Lin, 2012; Rodríguez & Alcocer, 2012; Q. Wang, Yu, Rasmussen, & Yu, 2014), ray tracing (Lee et al., 2013), optical flow (Plyer, Le Besnerais, & Champagnat, 2014), face recognition (Kwang-Ting & Yi-Chu, 2011; Y.-C. Wang, Donyanavard, & Cheng, 2012), object tracking (Růžička & Mašek, 2014), computational photography (Pulli & Troccoli, 2014) and SIFT detection (Rister, Guohui, Wu, & Cavallaro, 2013).

A wavelet is a mathematical function for decomposing a given function into different scale components, and wavelets have been applied to digital signal processing for decades. The wavelet transform generally falls into two categories: the continuous wavelet transform and the discrete wavelet transform. The Discrete Wavelet Transform (DWT) is a wavelet transform with discrete wavelet coefficients (Press, 2007), and there are two mainstream DWT implementation algorithms: filter bank convolution and the lifting scheme (Jung, Park, & Kim, 2005).
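
To make the filter bank idea concrete, the following is a minimal CPU-side sketch of one analysis level in one dimension, written in plain Python. It is not the author's sDWT or cDWT implementation. The Haar filter pair, the periodic boundary extension, and the names `LOW`, `HIGH`, `filter_and_downsample` and `dwt_1d` are all assumptions chosen for illustration; this passage does not state which wavelet or boundary handling the chapter uses.

```python
import math

# Haar analysis filters. This choice is an assumption for illustration only;
# the passage above does not name a specific wavelet.
LOW = [1 / math.sqrt(2), 1 / math.sqrt(2)]     # low-pass (approximation)
HIGH = [1 / math.sqrt(2), -1 / math.sqrt(2)]   # high-pass (detail)


def filter_and_downsample(signal, taps):
    """Take the inner product of `taps` with a sliding window of `signal`,
    stepping by 2 so that filtering and downsampling happen in one pass.

    Indices wrap around (periodic extension, also an assumption).
    """
    n = len(signal)
    out = []
    for start in range(0, n, 2):
        acc = 0.0
        for k, tap in enumerate(taps):
            acc += tap * signal[(start + k) % n]
        out.append(acc)
    return out


def dwt_1d(signal):
    """One level of filter bank analysis: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx = filter_and_downsample(signal, LOW)
    detail = filter_and_downsample(signal, HIGH)
    return approx, detail


if __name__ == "__main__":
    approx, detail = dwt_1d([4.0, 2.0, 6.0, 6.0])
    print("approximation:", approx)
    print("detail:", detail)
```

Each output coefficient depends only on a short window of the input, so the coefficients can be computed independently of one another, which is what makes this formulation a natural candidate for GPU parallelization. A separable multi-dimensional transform would typically apply such a one-dimensional step along each axis in turn.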
