Introduction
For digital speech signals, a large body of research exists on speech enhancement (Shima, Ahmad, & Babak, 2015; Ji & Danny, 2014; Mohamed & Pascal, 2014; Belinda & Kuldip, 2014; Seon & Hong, 2014) and speaker recognition (Shiha, Linb, Wanga, & Lina, 2011; Srikanth, 2014, pp. 137-145; Kawthar & Abderrahmane, 2014; Khan, Baig, & Youssef, 2010). However, speech content authentication schemes are rare. In daily life, digital speech signals are likely to attract attackers' interest and be maliciously tampered with. If the tampering is not detected, the authentication client will accept the attacked speech as genuine, which can have serious consequences. Research on speech content authentication therefore has both realistic significance and practical applications.
Digital audio watermarking schemes can be divided into two kinds according to how the watermark is generated. In the first kind, the watermark is unrelated to the audio content; for example, a binary image is used as the watermark (Xiang, Huang, & Yang, 2006; Wang, Ma, & Niu, 2011; Vivekananda, Indranil, & Abhijit, 2011). In the second kind, the watermark is generated from audio features or audio content (Wang & Fan, 2010; Fan & Wang, 2008). For the first kind, the binary image used to generate the watermark must be transmitted to the authentication client, which increases both the transmission bandwidth and the probability of being attacked. If the binary image is tampered with during transmission, the authentication client will conclude that the audio content has been tampered with. For the second kind, the watermark and the watermarked audio are transmitted to the authentication client together, which reduces the transmission bandwidth and increases the security of the watermark transmission. If the watermark is attacked, the audio content is attacked as well, so the probability that the authentication client detects the attack increases greatly. Given this analysis, content-based speech content authentication algorithms have the broader application prospects.
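The second kind of scheme can be illustrated with a minimal Python sketch. It is not the algorithm of any cited work: the feature (comparing the energy of the two halves of each frame), the function names, and the toy signal are all hypothetical and chosen only for illustration. The embedding step is also omitted, so the reference bits are simply passed alongside the signal, whereas a real scheme would embed them in the audio itself.

```python
def half_energy_bit(frame):
    """Hypothetical feature: 1 if the first half of the frame carries
    at least as much energy as the second half, else 0."""
    half = len(frame) // 2
    first = sum(x * x for x in frame[:half])
    second = sum(x * x for x in frame[half:])
    return 1 if first >= second else 0


def content_watermark(samples, frame_len):
    """Derive one watermark bit per non-overlapping frame from the content."""
    n_frames = len(samples) // frame_len
    return [
        half_energy_bit(samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]


def tampered_frames(received, reference_bits, frame_len):
    """Recompute the bits from the received signal and list the frames
    whose bit no longer matches the reference watermark."""
    recomputed = content_watermark(received, frame_len)
    return [i for i, (a, b) in enumerate(zip(recomputed, reference_bits)) if a != b]


# Toy signal (an assumption, not real speech): four frames that
# alternate between a rising ramp and a falling ramp.
FRAME_LEN = 8
ramp = [k / FRAME_LEN for k in range(FRAME_LEN)]
signal = []
for f in range(4):
    signal.extend(ramp if f % 2 == 0 else ramp[::-1])

reference = content_watermark(signal, FRAME_LEN)

# Simulated attack: time-reverse the content of frame 2.
attacked = list(signal)
attacked[2 * FRAME_LEN:3 * FRAME_LEN] = attacked[2 * FRAME_LEN:3 * FRAME_LEN][::-1]

print(tampered_frames(signal, reference, FRAME_LEN))    # intact signal
print(tampered_frames(attacked, reference, FRAME_LEN))  # attacked signal
```

Because the watermark is recomputed from the received audio, no separate binary image has to be transmitted, and a change to the content shows up as a mismatch in the affected frame. The feature used here is public and unkeyed, so the sketch says nothing about resistance to an attacker who knows how the bits are derived.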
Unfortunately, existing watermarking schemes that are content-based or robust against desynchronization attacks have some shortcomings: 1) The watermark is generated and embedded based on public features (Wang & Fan, 2010), which makes the scheme vulnerable to the feature-analysed substitution attack (Liu & Wang, 2014). 2) In the schemes robust against desynchronization attacks (Wang, Ma, & Niu, 2011; Vivekananda, Indranil, & Abhijit, 2011; Bai, Ing, & Zhen, 2011), the synchronization codes themselves face the threat of substitution attack (Liu & Wang, 2014). In addition, the signal between two neighbouring synchronization codes is regarded as the watermarked signal, but these schemes do not verify the authenticity of the watermarked signal that is detected.