Residual Reconstruction Algorithm Based on Half-Pixel Multi-Hypothesis Prediction for Distributed Compressive Video Sensing

Ying Tong (PLA University of Science and Technology, Nanjing, China), Rui Chen (Nanjing Institute of Technology, Nanjing, China), Jie Yang (Nanjing Institute of Technology, Nanjing, China) and Minghu Wu (Hubei Collaborative Innovation Center for High-efficiency Utilization of Solar Energy, Hubei University of Technology, Wuhan, China)
DOI: 10.4018/IJMCMC.2018100102

Abstract

Compressed sensing (CS) provides a method to sample and reconstruct sparse signals at rates far below the Nyquist sampling rate, which gives it great potential in image/video acquisition and processing. To fully exploit the spatial and temporal characteristics of video frames and the coherence between successive frames, we propose a residual reconstruction method based on half-pixel interpolation for distributed compressive video sensing (DCVS). At the decoder, half-pixel interpolation and bi-directional motion estimation help refine the side information used for joint decoding of the non-key-frames. We then apply a multi-hypothesis-based residual reconstruction algorithm to reconstruct the non-key-frames. Performance analysis and simulation experiments show that, compared with prior work on DCVS, the proposed algorithm improves the quality of the side information by about 1.5 dB and the video reconstruction quality by 0.3 to 2 dB in PSNR.

Introduction

Distributed Compressive Video Sensing (DCVS) is a popular framework for video compression. It introduces compressed sensing (CS) (Candes et al., 2008) into Distributed Video Coding (DVC) (Ji et al., 2012), resulting in low-complexity, low-cost video encoding. It is suitable for applications in which video is acquired by resource-limited devices (limited in memory, power, computational capacity, etc.), such as mobile cameras and wireless sensor nodes. Unlike traditional video coding standards, a DVC system encodes the video sequence separately but decodes it jointly with the help of side information. The encoder can therefore be very simple and meet the requirements of resource-limited video terminals. Moreover, CS provides a new way to sample, compress, and recover signals. CS theory breaks through the limitations of the conventional Nyquist sampling theorem and has been applied to capture compressed image/video signals efficiently.
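As a minimal sketch of the CS sampling idea mentioned above, the snippet below takes M linear measurements y = Φx of a length-N sparse signal with M much smaller than N. The Gaussian measurement matrix, the measurement rate of 0.25, and the function names `gaussian_measurement_matrix` and `measure` are illustrative assumptions and are not taken from the paper; reconstruction from y is omitted.

```python
import random

def gaussian_measurement_matrix(m, n, seed=0):
    """Return an m x n matrix with i.i.d. N(0, 1/m) entries (an assumed, common CS choice)."""
    rng = random.Random(seed)
    scale = (1.0 / m) ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(m)]

def measure(phi, x):
    """Compute y = phi @ x: one inner product per row of phi."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

# Hypothetical example: a length-64 signal with only 3 nonzero entries.
n = 64
x = [0.0] * n
x[5], x[20], x[41] = 1.0, -2.0, 0.5

measurement_rate = 0.25          # assumed value of M / N
m = int(measurement_rate * n)    # 16 measurements instead of 64 samples
phi = gaussian_measurement_matrix(m, n)
y = measure(phi, x)              # what a CS encoder would transmit
```

The encoder side is only a matrix-vector product, which is why CS-based encoding suits resource-limited devices; the computational burden moves to the decoder, which must recover the sparse signal from y.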

Many DCVS frameworks have been proposed. Typical examples include Do's DISCOS (Distributed Compressed Video Sensing) (Do et al., 2009) and its improvement aDISCOS (adaptive DISCOS) (Zhang et al., 2014), Prades's framework (Prades et al., 2009), and Kang's DCVS (Kang et al., 2009). In these frameworks, the encoder usually divides the source video sequence into several GOPs (Groups of Pictures). Each GOP contains a key-frame followed by several CS-frames. In the DISCOS scheme (Do et al., 2009), key-frames are intra-coded using a traditional video coding standard such as MPEG or H.26x, while CS-frames are sensed by compressive sampling. DISCOS employs a fixed measurement rate for every CS-frame and a sparse dictionary of constant size. This scheme ignores the varied content of different blocks within a frame as well as the temporal variations among frames. The aDISCOS scheme therefore adjusts the measurement rate according to the spatial and temporal sparsity, and also adapts the sparse dictionary size to improve coding performance. It requires feedback to obtain this information from the decoder. To keep the encoder simple, video frames are compressed independently by a number of random sampling operations. Motion estimation and other analyses are conducted at the decoder, which makes decoding joint and more complicated but yields higher recovery quality. Although the DISCOS and aDISCOS frameworks reduce the complexity of the encoder, their coding complexity remains high because the key-frames are still coded with a traditional video coding standard.
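The GOP structure described above (one key-frame followed by several CS-frames) can be sketched as a simple grouping of frame indices. The function name `split_into_gops`, the dictionary layout, and the GOP size of 3 are hypothetical choices for illustration only.

```python
def split_into_gops(num_frames, gop_size):
    """Group frame indices into GOPs; the first index of each GOP is the key-frame."""
    if gop_size < 1:
        raise ValueError("gop_size must be at least 1")
    gops = []
    for start in range(0, num_frames, gop_size):
        indices = list(range(start, min(start + gop_size, num_frames)))
        # The remaining frames of the GOP are the CS-frames (possibly none).
        gops.append({"key": indices[0], "cs": indices[1:]})
    return gops

# Hypothetical example: 7 frames, GOP size 3.
gops = split_into_gops(num_frames=7, gop_size=3)
```

In this sketch the last GOP may be shorter than the others when the sequence length is not a multiple of the GOP size.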

Kang's framework is very similar to DISCOS; the difference is that all frames undergo regular CS sampling, with different measurement rates. Since key-frames and CS-frames are both compressively measured, we refer to them as key-frames and non-key-frames from here on. Generally, the measurement rate (i.e., sampling rate) of the key-frames is higher than that of the non-key-frames. At the decoder, the key-frames are reconstructed independently. Reconstruction of the non-key-frames is much more complicated, because both the preceding and succeeding decoded key-frames are used to derive good side information through motion-guided interpolation. The quality of the side information essentially determines the recovery quality of the non-key-frames.
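To illustrate why side information quality matters, the sketch below forms side information from the two neighbouring key-frames and computes the residual measurements that a residual-based decoder would work with. Plain pixel-wise averaging is used here as a simplified stand-in for motion-guided interpolation; it is not the paper's half-pixel multi-hypothesis method. The toy frames, the 2 x 4 matrix `phi`, and all function names are assumptions made for the example.

```python
def average_side_information(prev_key, next_key):
    """Pixel-wise average of two decoded key-frames (frames flattened to lists)."""
    return [(a + b) / 2.0 for a, b in zip(prev_key, next_key)]

def measure(phi, x):
    """Compute phi @ x: one inner product per row of phi."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

def residual_measurements(y, phi, side_info):
    """Return y - phi @ side_info, which equals phi @ (x - side_info) by linearity."""
    projected = measure(phi, side_info)
    return [a - b for a, b in zip(y, projected)]

# Toy 4-pixel frames (hypothetical values).
prev_key = [10.0, 20.0, 30.0, 40.0]    # decoded preceding key-frame
next_key = [14.0, 24.0, 34.0, 44.0]    # decoded succeeding key-frame
true_frame = [12.0, 23.0, 32.0, 42.0]  # non-key-frame, unknown to the decoder

phi = [[1.0, 0.0, 1.0, 0.0],
       [0.0, 1.0, 0.0, -1.0]]          # assumed measurement matrix
y = measure(phi, true_frame)           # measurements received by the decoder

side_info = average_side_information(prev_key, next_key)
y_r = residual_measurements(y, phi, side_info)
```

The decoder would reconstruct the residual from `y_r` and add it back to `side_info`. The closer the side information is to the true frame, the sparser the residual, which generally makes it easier to recover from few measurements.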

Building on the above frameworks, much subsequent work has been carried out from different aspects:
