Information Hiding Using Interpolation for Audio and Speech Signals

Mamoru Iwaki (Niigata University, Japan)
DOI: 10.4018/978-1-4666-2217-3.ch004


This chapter proposes a time-domain, high-bit-rate information hiding method based on interpolation techniques. The embedded data can be extracted in both informed (non-blind) and non-informed (blind) ways. Three interpolation techniques are introduced for the method: spline interpolation, Fourier-series interpolation, and linear-prediction interpolation. In the performance evaluation, spline interpolation was mainly examined as an example implementation. In simulations of information hiding in music signals, the spline interpolation-based method achieved a bit rate of about 2.9 kbps for CD-audio signals, and about 1.1 kbps under MP3 compression (160 kbps). The objective sound quality, measured by the Perceptual Evaluation of Audio Quality (PEAQ), was maintained when the length of the interpolation data was increased. The objective sound quality was also evaluated for the Fourier series-based and linear prediction-based implementations. Fourier-series interpolation achieved the same sound quality as spline interpolation, whereas linear-prediction interpolation required longer interpolation signals to reach good sound quality.
Chapter Preview


Information hiding has many application fields. For example, digital watermarking is applied to copyright protection, fingerprinting, authentication, copy control, owner identification, broadcast monitoring, security control, and tamper proofing (Cox, 2002; Seadle, 2002; Voloshynovskiy, 2001; Wu, 2004; Cano, 2002; Ruiz, 2000; Anand, 1998; Miaou, 2000; Kalker, 1999). In the human auditory system, louder sounds tend to mask weaker sounds in both the time and frequency domains. Audio-watermarking methods are classified by their hiding domain as either time-domain or frequency-domain (transform-domain) methods. On the premise that the human auditory system cannot detect small changes in certain temporal portions (Bassia, 1998; Bender, 1996; Deller, 2000; Lee, 2005), time-domain methods embed watermarks by modifying the amplitudes of audio-signal samples (Van Schyndel, 1994; Foote, 2003; Ozer, 2005). Compared to frequency-domain methods, time-domain methods generally have the advantage of a high bit rate. However, time-domain methods are usually not robust against modification of stego signals (Johnson, 2000), because their detection process assumes that the stego signal is received cleanly, without distortion such as noise in the transmission channel (Linnartz, 1998; Miller, 1999). In contrast, frequency-domain (transform-domain) methods embed watermarks in the frequency domain; for example, spread-spectrum watermarking hides a narrow-band watermark signal in a wide-band host signal (Cox, 1997; Ruiz, 2000). Frequency-domain methods are robust against modification of the stego signal, but their maximum bit rate is limited. Some watermarking techniques exploit both temporal and frequency information in the embedding process (Boney, 1996; Petitcolas, 1998).
Other techniques exploit the insensitivity of human hearing to acoustical echoes or phase differences (Voloshynovskiy, 2001; Chen, 2001; Cheng, 2001; Gurijala, 2003; Deller, 2000; Celik, 2005). A suitable information-hiding method should provide a high bit rate and high robustness without producing noticeable auditory distortion. However, the bit rates of most digital-watermarking methods for audio signals are only about 10 to 100 bps, so they cannot support high-bit-rate hiding.

Although time-domain methods generally provide good sound quality and a high bit rate, they are weak in terms of robustness against signal modification. When a signal value is modified, the embedded information is destroyed along with it, because the hidden information is coded directly in the signal values of the original audio signal. To address this issue, we introduce the idea that a set of signal values should stand for one bit of embedded information. This condition improves the robustness of time-domain methods against signal modification, because most signal values in the set still convey the embedded information correctly even if some of them have been altered. The idea can be understood as an improvement on the LSB modification method (Van Schyndel, 1994) and on quantization index modulation (Chen, 2001; Liu, 2003, 2004).
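
The principle that a set of signal values stands for one bit can be illustrated with a minimal Python sketch. It is not the interpolation-based method of this chapter: it uses plain LSB repetition with majority voting as a stand-in, and the function names `embed_bits` and `extract_bits` and the group size of 15 samples are assumptions made only for illustration.

```python
import numpy as np

def embed_bits(samples, bits, group=15):
    """Write each payload bit into the LSB of `group` consecutive samples.

    Illustrative stand-in (LSB repetition), not the chapter's method.
    `samples` is a 1-D integer array such as 16-bit PCM audio.
    """
    if len(bits) * group > len(samples):
        raise ValueError("host signal too short for the payload")
    out = samples.copy()
    for i, bit in enumerate(bits):
        seg = slice(i * group, (i + 1) * group)
        # Clear the least significant bit, then set it to the payload bit.
        out[seg] = (out[seg] & ~1) | bit
    return out

def extract_bits(samples, n_bits, group=15):
    """Recover each bit by majority vote over the LSBs of its group (blind)."""
    lsb = samples[: n_bits * group] & 1
    votes = lsb.reshape(n_bits, group).sum(axis=1)
    # A bit is read as 1 when more than half of its group carries a 1.
    return (votes * 2 > group).astype(int).tolist()
```

With an odd group size, a bit is still recovered as long as fewer than half of the samples in its group have had their LSB flipped, whereas single-sample LSB coding loses the bit as soon as that one sample changes.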

In this chapter, a new information-hiding method for audio and speech signals based on interpolation techniques is proposed, and some examples are given. The method can detect and extract hidden information from a stego signal regardless of whether the original host signal is available (i.e., the information is blindly detectable). It also provides a high bit rate. Using spline interpolation of degrees two and three, it achieved a bit rate of about 2.9 kbps per channel for CD-audio music signals, and about 1.2 kbps even when the watermarked signals were degraded by 128-kbps MP3 compression, while maintaining high sound quality.
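
To put these bit rates in perspective, the short calculation below converts them into an average number of audio samples per embedded bit. It assumes the standard CD-audio sampling rate of 44.1 kHz per channel; the helper name `samples_per_bit` is hypothetical, and the result is only an average, not a statement about how the method actually groups samples.

```python
CD_SAMPLE_RATE_HZ = 44_100  # standard CD-audio sampling rate per channel

def samples_per_bit(bit_rate_bps, sample_rate_hz=CD_SAMPLE_RATE_HZ):
    """Average number of host samples available per embedded bit."""
    return sample_rate_hz / bit_rate_bps

clean = samples_per_bit(2_900)     # bit rate reported without compression
after_mp3 = samples_per_bit(1_200)  # bit rate reported after 128-kbps MP3
print(f"{clean:.1f} samples per bit, {after_mp3:.2f} after MP3")
```

On these assumptions, each bit is spread over roughly 15 samples on average in the uncompressed case and about 37 samples after MP3 compression, which is consistent with the idea that a set of signal values carries one bit.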
