Introduction
Digital signals have replaced traditional analog signals as the most popular information carrier in communication. However, the growing abundance of multimedia editing tools makes digital information easy to edit, attack, and forge.
In real life, digital speech signals are likely to attract attackers' interest and be maliciously attacked. The meaning expressed by an attacked signal differs from that of the original. If the recipient regards the attacked signal as authentic and acts on it, serious consequences may follow (Zhang et al., 2015). Fortunately, forensic technology based on digital watermarking (Akhaee et al., 2010; Pun & Yuan, 2013; Peng et al., 2013; Lei et al., 2013) provides a way to verify the authenticity of a speech signal.
Digital audio watermarking schemes are usually used to protect audio copyright (Xiang et al., 2006; Wang et al., 2011; Yamamoto & Iwakini, 2009; Salma et al., 2010; Wang, Healy & Timoney, 2011) and have made outstanding progress in recent years. In (Wang, Shi, Wang & Yang, 2016), the authors proposed a robust audio watermarking method based on invariant exponent moments and a synchronization code technique. A watermark generated from a binary image is embedded into the host audio. Experimental results demonstrated that the scheme resists most attacks, and that the binary image extracted from the watermarked signal after an attack is similar to the original one. In Bai et al. (2011), the authors presented an audio watermarking scheme based on SVD–DCT with a synchronization code technique. A binary image serving as the watermark is blindly embedded into the high-frequency band of the SVD–DCT block. The scheme is robust against various common signal processing attacks. Consequently, if the schemes of (Wang, Shi, Wang & Yang, 2016; Bai et al., 2011) are used for authentication, most attacks will not be detected, because the embedded watermark survives them.
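The point that a robust watermark makes a poor tamper detector can be illustrated with a toy example. The sketch below is not the method of any cited paper; it assumes a simple quantization-based embedding (one bit per sample, carried by the parity of the quantization index), and the names `qim_embed`, `qim_extract`, the step size, and the sample values are all hypothetical.

```python
STEP = 64  # assumed quantization step; a larger step tolerates larger changes


def qim_embed(samples, bits, step=STEP):
    """Embed one bit per sample as the parity of its quantization index."""
    marked = []
    for s, b in zip(samples, bits):
        q = round(s / step)
        if q % 2 != b:
            q += 1  # move up one step so the index parity equals the bit
        marked.append(q * step)
    return marked


def qim_extract(samples, step=STEP):
    """Recover the bits from the parity of the nearest quantization index."""
    return [round(s / step) % 2 for s in samples]


# Hypothetical host samples and watermark bits.
host = [1000, -2300, 512, 40, -77, 1900, 3, -1500]
bits = [1, 0, 1, 1, 0, 0, 1, 0]

marked = qim_embed(host, bits)

# An "attack" that alters every sample by less than half a step.
deltas = [10, -12, 7, -5, 15, -9, 11, -14]
attacked = [s + d for s, d in zip(marked, deltas)]

signal_changed = attacked != marked
watermark_intact = qim_extract(attacked) == bits
print(signal_changed, watermark_intact)
```

In this toy setting every sample has been modified, yet the extracted bits still match the embedded ones, so a verifier that only compares watermarks would accept the altered signal as authentic.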
As a carrier of information, a digital speech signal should convey its meaning intact and authentically. If listeners and users take an attacked signal for the original and act on its instructions, serious consequences may follow. A method for speech forensics is therefore indispensable for digital speech signals, and it can be realized with digital watermarking (Liu et al., 2016). Although audio watermarking has made outstanding progress recently, the existing schemes are unsuitable for speech authentication (Liu, Huang, Sun, & Qi, 2016).
Korycki (2014) proposed an authentication scheme for compressed audio recordings based on detecting multiple compression and identifying the encoder. The compressed recordings are authenticated by evaluating statistical features extracted from MDCT coefficients and other parameters obtained from the compressed audio files; these features are used to train selected machine learning algorithms. Although the scheme improves robustness and effectiveness, it requires a large amount of training data.