Introduction
Influenced by classical steganalysis (Farid, 2002; Avcibas, 2003), the use of statistical image features has become common in source imaging device (e.g., camera, scanner) identification. Source imaging device identification can be viewed as a form of steganalysis if the device noise in an image is regarded as a disturbance comparable to that caused by an externally embedded message. On this view, the statistics of images captured by different cameras are expected to differ.
A variety of image features have been proposed and studied in prior work on steganalysis. Farid and Lyu (2002) found that strong higher-order statistical regularities exist in the wavelet-like decomposition of a natural image, and that embedding a message significantly alters these statistics and thus becomes detectable. Two sets of image features were studied. The mean, variance, skewness and kurtosis of the subband coefficients form the first feature set, while the second feature set is based on the errors of an optimal linear predictor of coefficient magnitude. A total of 216 features were extracted from the wavelet-decomposed image to form the feature vector. Support vector machines (SVM) were employed to detect statistical deviations. Avcibas et al. (2003) showed that steganographic schemes leave statistical evidence that can be exploited for detection with the aid of image quality features and multivariate regression analysis. To detect the difference between cover and stego images, 19 image quality metrics (IQMs) were proposed as steganalysis tools.
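As a rough illustration of the first feature set, the sketch below computes the mean, variance, skewness and kurtosis of the detail subbands of one image channel. It assumes a single-level Haar decomposition written directly in NumPy, which is a simplification and not the multi-scale decomposition used by Farid and Lyu (2002); the function names `haar_subbands`, `moment_features` and `subband_statistics` are hypothetical, and the sketch yields 12 values per channel rather than the 216 features reported in that work.

```python
import numpy as np


def haar_subbands(channel):
    """One-level 2D Haar decomposition (an assumed, simplified wavelet).

    Returns the three detail subbands as a tuple.
    """
    x = np.asarray(channel, dtype=np.float64)
    h, w = x.shape
    x = x[: h - h % 2, : w - w % 2]  # crop to even dimensions
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    d_rows = (a + b - c - d) / 2.0  # difference between adjacent rows
    d_cols = (a - b + c - d) / 2.0  # difference between adjacent columns
    d_diag = (a - b - c + d) / 2.0  # diagonal difference
    return d_rows, d_cols, d_diag


def moment_features(coeffs):
    """Mean, variance, skewness and (non-excess) kurtosis of the coefficients."""
    c = np.ravel(np.asarray(coeffs, dtype=np.float64))
    mean = c.mean()
    var = c.var()
    if var == 0.0:
        # Skewness and kurtosis are undefined for constant data; report 0.
        return [mean, var, 0.0, 0.0]
    z = (c - mean) / np.sqrt(var)
    return [mean, var, np.mean(z ** 3), np.mean(z ** 4)]


def subband_statistics(channel):
    """Concatenate the four moments of each detail subband (3 x 4 = 12 values)."""
    feats = []
    for band in haar_subbands(channel):
        feats.extend(moment_features(band))
    return np.array(feats)
```

Applying `subband_statistics` to each color channel and to several decomposition levels would lengthen the vector accordingly; the second feature set (linear predictor errors) is not covered by this sketch.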
Statistical image features were introduced for forensic image investigation as soon as this research field emerged. In one early camera identification scheme, Kharrazi et al. (2004) studied a set of features that characterize a specific digital camera in order to classify test images by the camera that produced them. The CFA (color filter array) configuration, the demosaicing algorithm and the color processing/transformation were believed to have a great impact on the output image of a camera. Thus, the color features consisted of three average values in the RGB channels of an image, three correlations between different color bands, three neighbor distribution centers of mass in the RGB channels, and three energy ratios between different color bands. Moreover, a wavelet decomposition was applied to each color band of the image, and the mean of each subband was calculated, just as in Farid and Lyu (2002). In addition to the color features, 13 IQMs were borrowed from Avcibas et al. (2003) to describe the characteristics of image quality. The average identification accuracy of their SVM classifier was 88.02%. This scheme was re-implemented on different camera brands and models in (Tsai et al., 2006).
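Three of the color feature groups described above (channel averages, inter-band correlations and inter-band energy ratios) can be sketched as follows. This is a minimal illustration, not the implementation of Kharrazi et al. (2004): the function name `color_features` is hypothetical, the pairing order (R,G), (R,B), (G,B) and the definition of energy as the sum of squared pixel values are assumptions, and the neighbor distribution centers of mass are omitted because the text does not define them.

```python
from itertools import combinations

import numpy as np


def color_features(rgb):
    """Return 9 values: 3 channel means, 3 band correlations, 3 energy ratios.

    `rgb` is an H x W x 3 array. Constant or all-zero channels are not
    handled (correlation and energy ratio would be undefined).
    """
    img = np.asarray(rgb, dtype=np.float64)
    if img.ndim != 3 or img.shape[2] != 3:
        raise ValueError("expected an H x W x 3 image")

    channels = [img[..., k].ravel() for k in range(3)]
    pairs = list(combinations(range(3), 2))  # (R,G), (R,B), (G,B) by assumption

    means = [ch.mean() for ch in channels]
    # Pearson correlation between each pair of color bands.
    corrs = [np.corrcoef(channels[i], channels[j])[0, 1] for i, j in pairs]
    # Energy of a band is taken here as the sum of squared pixel values.
    energies = [np.sum(ch ** 2) for ch in channels]
    ratios = [energies[i] / energies[j] for i, j in pairs]

    return np.array(means + corrs + ratios)
```

In a full pipeline these values would be concatenated with the wavelet and IQM features before being passed to a classifier.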
In one early scanner identification scheme, Gou et al. (2009) proposed a total of 30+18+12=60 statistical noise features to reflect the characteristics of the scanner imaging pipeline and motion system. The mean and STD (standard deviation) features were extracted using five filters (an averaging filter, a Gaussian filter, a median filter, and Wiener adaptive filters with 3×3 and 5×5 neighborhoods) in each of the three color bands to form the first 2×5×3=30 features. The STD and the goodness of Gaussian fitting were extracted from the wavelet-decomposed image of each color band in 3 orientations to form another 2×3×3=18 wavelet features. Two neighborhood prediction errors were calculated from each color band at two brightness levels to form the last 2×3×2=12 features. Their SVM classifier achieved an identification accuracy of over 95%.
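The layout of the first feature group (2 statistics × number of filters × 3 color bands) can be sketched as below. The sketch assumes that the noise estimate is the difference between a channel and its denoised version, which the text does not state explicitly, and it supplies only two simple NumPy filters (`box_filter` and `median_filter`, both hypothetical names) rather than the five filters used by Gou et al. (2009); with five denoisers the same function would return 2×5×3=30 values.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(channel, size):
    """All size x size neighborhoods of a 2D array, using reflect padding."""
    pad = size // 2
    padded = np.pad(channel, pad, mode="reflect")
    return sliding_window_view(padded, (size, size))


def box_filter(channel, size=3):
    """Averaging filter over a size x size neighborhood."""
    return _windows(channel, size).mean(axis=(-2, -1))


def median_filter(channel, size=3):
    """Median filter over a size x size neighborhood."""
    return np.median(_windows(channel, size), axis=(-2, -1))


def residual_features(rgb, denoisers):
    """Mean and STD of the assumed noise residual, per denoiser and color band.

    Returns 2 * len(denoisers) * 3 values for an H x W x 3 image.
    """
    img = np.asarray(rgb, dtype=np.float64)
    feats = []
    for denoise in denoisers:
        for k in range(3):
            channel = img[..., k]
            residual = channel - denoise(channel)  # assumed noise estimate
            feats.extend([residual.mean(), residual.std()])
    return np.array(feats)
```

For example, `residual_features(image, [box_filter, median_filter])` yields 12 values; Gaussian and Wiener filters from an image processing library could be passed in the same list to reach the 30 features described above.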