1. Introduction
Protecting the authorship of audio content and resolving ownership disputes is a challenging research problem. Two technologies have been used to protect audio signals: encryption and digital watermarking. Encryption acts as a protective shell that prevents unauthorized access, but once the content is decrypted the protection is gone. A robust digital watermark, by contrast, should be infeasible to remove without destroying the host signal. Proposed and deployed watermarking applications include broadcast monitoring, owner identification, transaction tracking, and content authentication. Embedding information into audio is more difficult than embedding it into images because the human auditory system (HAS) is more sensitive than the human visual system (HVS) (Cox et al., 1997). This is why comparatively few audio watermarking techniques have been reported in the literature. We address this challenge using empirical mode decomposition (EMD) and the Hilbert-Huang transform (HHT).
The International Federation of the Phonographic Industry (IFPI) and STEP 2001 (hhttp://www.trl.ibm.com/projects/RightsManagement/datahiding/dhstep_e.htmi, March 30 2012) require audio watermarking to satisfy two main constraints: (1) imperceptibility, measured in terms of the signal-to-noise ratio (SNR), which should be higher than 20 dB; and (2) robustness, which requires that the watermark survive time-scale modification (TSM) of ±10% as well as other signal processing attacks such as MP3 compression, additive white noise (-40 dB), downsampling, and the like. In this paper, we present a new method that embeds the watermark into the intrinsic mode functions (IMFs) that contain the highest and lowest energy in the HHT domain. Because the watermark is added to the significant IMFs, our method is more resistant to common signal processing manipulations than time-domain and other frequency-domain methods. This is because EMD provides better time-frequency localization than the discrete wavelet transform (DWT) and discrete cosine transform (DCT) domains and leads to implicit audio-visual masking. The decomposition is adaptive and highly efficient for perceptual transparency (inaudibility) and robustness. Because the watermark can be embedded into the whole transform domain, the information hiding capacity is larger than that of other existing methods, and the embedded information can be completely restored even under high distortion. Moreover, we can control the information hiding capacity by varying the frame size.
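As an illustration of the imperceptibility requirement stated above, the following minimal sketch computes the SNR between a host signal and its watermarked version and compares it with the 20 dB threshold. It assumes the conventional definition SNR = 10 log10(Σx² / Σ(y − x)²), where x is the host and y the watermarked signal; the function names `snr_db` and `meets_imperceptibility` are hypothetical and are not taken from the paper.

```python
import math


def snr_db(host, watermarked):
    """Return the SNR in dB, treating (watermarked - host) as the noise.

    Assumes the conventional definition 10*log10(signal power / noise power);
    the paper may use a variant (for example, a segmental SNR).
    """
    if len(host) != len(watermarked):
        raise ValueError("signals must have equal length")
    signal_power = sum(x * x for x in host)
    noise_power = sum((y - x) ** 2 for x, y in zip(host, watermarked))
    if noise_power == 0:
        return math.inf   # identical signals: no embedding distortion
    if signal_power == 0:
        return -math.inf  # silent host with a non-zero watermark
    return 10.0 * math.log10(signal_power / noise_power)


def meets_imperceptibility(host, watermarked, threshold_db=20.0):
    """Check the 'SNR higher than 20 dB' constraint quoted in the text."""
    return snr_db(host, watermarked) > threshold_db


if __name__ == "__main__":
    # Toy host: one second of a 440 Hz tone sampled at 8 kHz.
    host = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
    # Toy "watermark": a small alternating perturbation (illustration only).
    marked = [x + 0.01 * (1 if n % 2 else -1) for n, x in enumerate(host)]
    print(f"SNR = {snr_db(host, marked):.2f} dB")
    print("imperceptible:", meets_imperceptibility(host, marked))
```

The perturbation in the usage example is only a stand-in for an embedded watermark; it does not implement the EMD-based embedding proposed in this paper.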
The rest of this paper is organized as follows. Section 2 reviews TSM and MP3 attacks on audio watermarking methods and related work. Section 3 briefly describes empirical mode decomposition and IMFs. Section 4 introduces the proposed scheme. Section 5 presents the simulation results, followed by conclusions in Section 6.