Introduction
Wireless embedded camera sensors have become ubiquitous components in various imaging applications, such as public safety and security systems, smart building operations, intelligent transportation, and remote health care. Rather than merely presenting the raw data collected by camera sensors to the user, an application usually aims to automatically discover and extract meaningful information from the camera data and to achieve as much autonomy as possible in the physical system. Automatic video data analysis tools, which can detect, recognize, and track objects of interest and understand their behaviors, have become indispensable components in today’s imaging applications.
The performance of automatic analysis methods depends on the quality of the images they process. It is therefore essential to introduce objective metrics for predicting image quality as judged by automatic analysis algorithms. In the field of image quality assessment (IQA), a diverse range of image quality models, from full-reference to reduced-reference and no-reference ones, have been designed for predicting perceptual quality as evaluated by human subjects (Ma et al., 2018, pp. 1202-1213; Wang et al., 2018, pp. 1-14; Wang, Bovik, Sheikh, & Simoncelli, 2004, pp. 600-612).
The quality of a video sequence as judged by an automatic analysis algorithm, however, is not necessarily sensitive to the same factors that drive human perception. Perceptual image quality assessment methods usually try to emulate known characteristics of the human visual system (HVS), such as contrast sensitivity and visual attention. Contrast sensitivity means that the HVS responds to the relative luminance change rather than the absolute luminance change (Wang et al., 2004, pp. 600-612). Visual attention means that, at typical viewing distances, a human observer can perceive only a local area of the image at high resolution at any one instant, due to the foveation feature of the HVS (Yang et al., 2016, pp. 3475-3488). Automatic analysis methods run by machines, on the other hand, can “perceive” the absolute luminance change precisely and have a better global “view”. For example, Irvine and Wood (2013, p. 87130Z) studied the problem of evaluating motion imagery quality for tracking in airborne reconnaissance systems. They found that automated target detection algorithms are less sensitive to spatial resolution than humans, but that factors such as temporal jitter, texture complexity, edge sharpness, and noise level have a strong effect on target detection performance. In our recent work (Kong, Dai, & Zhang, 2016, pp. 3797-3801), we found that, unlike human beings, who can easily extract and focus on a moving object against a blurred background, object detection algorithms can be affected by the quality of the background. These results suggest that new models are needed for evaluating the quality of images from the perspective of automatic analysis algorithms.
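The distinction between relative and absolute luminance change drawn above can be made concrete with a small numerical sketch. The function names and luminance values below are illustrative assumptions, not part of any cited model; the relative measure is simply the change divided by the starting luminance, in the spirit of Weber's law.

```python
def absolute_change(l_before, l_after):
    """Absolute luminance change, as a machine could measure it directly."""
    return l_after - l_before


def relative_change(l_before, l_after):
    """Luminance change relative to the starting level (Weber-style ratio)."""
    if l_before <= 0:
        raise ValueError("starting luminance must be positive")
    return (l_after - l_before) / l_before


# Two hypothetical luminance steps of identical absolute size.
dark_step = (10.0, 20.0)
bright_step = (200.0, 210.0)

for name, (before, after) in (("dark", dark_step), ("bright", bright_step)):
    print(name, absolute_change(before, after), relative_change(before, after))
```

Both steps have the same absolute change, yet the relative change in the dark region is twenty times larger than in the bright region. Under the contrast sensitivity assumption, a human observer would tend to find the dark-region step far more noticeable, whereas an algorithm operating on raw pixel values sees two equal differences.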
In a wireless imaging system, automatic analysis can be deployed using one of two strategies: at the central server on compressed videos, or at the local cameras on uncompressed videos as a preprocessing step. The impact of video compression on the accuracy of analysis algorithms has been studied in recent work (Tahboub, Reibman, & Delp, 2017, pp. 4192-4196; Zhong & Reibman, 2018, pp. 1-6), which aims at finding the optimal compression rates under a quality requirement. Apart from the distortion introduced by compression, the quality of an image or a video can be degraded during the data acquisition or sensing process, e.g., by noise or motion blur, or by reduced image resolution due to storage or bandwidth constraints on embedded cameras. These factors should also be taken into consideration when evaluating the quality of an image.
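As a rough illustration of the acquisition-stage degradations listed above, the following sketch applies additive Gaussian noise, a horizontal box-filter motion blur, and resolution reduction to a small grayscale image stored as a list of rows. It is a minimal sketch under stated assumptions: the function names, the 8-bit pixel range, and the simple box-filter blur model are all hypothetical choices for illustration, not the distortion models used in the cited studies.

```python
import random


def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise and clip to the assumed 8-bit range [0, 255]."""
    rng = random.Random(seed)
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]


def horizontal_motion_blur(image, length):
    """Average each pixel with the (length - 1) pixels to its left.

    Positions that would fall outside the image reuse the first pixel of the row.
    """
    blurred = []
    for row in image:
        out = []
        for x in range(len(row)):
            window = [row[max(0, x - k)] for k in range(length)]
            out.append(sum(window) / length)
        blurred.append(out)
    return blurred


def downsample(image, factor):
    """Reduce resolution by keeping every factor-th row and column."""
    return [row[::factor] for row in image[::factor]]


# Hypothetical 4x4 test image containing a sharp vertical edge.
image = [[0.0, 0.0, 255.0, 255.0] for _ in range(4)]

noisy = add_gaussian_noise(image, sigma=10.0)
blurred = horizontal_motion_blur(image, length=2)
small = downsample(image, factor=2)
```

Applying such degradations at controlled strengths to a set of test images, and then measuring how the output of an analysis algorithm changes, is one plausible way to probe which factors matter most to that algorithm.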