Signal Processing, Perceptual Coding and Watermarking of Digital Audio: Advanced Technologies and Models


Xing He (BrainMedia LLC, USA)
Indexed In: SCOPUS
Release Date: July 2011 | Copyright: © 2012 | Pages: 200
ISBN13: 9781615209255 | ISBN10: 1615209255 | EISBN13: 9781615209262 | DOI: 10.4018/978-1-61520-925-5

Description

The availability of increased computational power and the proliferation of the Internet have facilitated the production and distribution of unauthorized copies of multimedia information. As a result, the problem of copyright protection has attracted the interest of worldwide scientific and business communities.

Signal Processing, Perceptual Coding and Watermarking of Digital Audio: Advanced Technologies and Models focuses on watermarking, in which data is marked with hidden ownership information, as a promising solution to copyright protection issues. Compared to embedding watermarks into still images, hiding data in audio is much more challenging due to the extreme sensitivity of the human auditory system to changes in the audio signal. This book focuses on understanding human perception processes and including them in effective psychoacoustic models, as well as synchronization, which is an important component of a successful watermarking system.

Topics Covered

The many academic areas covered in this publication include, but are not limited to:

  • A Fast and Precise Synchronization Method for Digital Audio Watermarking
  • A High Quality Audio Coder Using Proposed Psychoacoustic Model
  • Discrete Wavelet Packet Transform
  • Human Auditory System and Psychoacoustics
  • Novel Applications of Digital Watermarking
  • Principles of Spread Spectrum
  • Survey of Spread Spectrum based Audio Watermarking Schemes
  • Techniques for Improved Spread Spectrum Detection
  • Watermarking Schemes


Preface

The availability of increased computational power and the proliferation of the Internet have facilitated the production and distribution of unauthorized copies of multimedia information. As a result, the problem of copyright protection has attracted the interest of the worldwide scientific and business communities. The most promising solution seems to be watermarking, in which the original data is marked with ownership information hidden imperceptibly in the original signal. Compared to embedding watermarks into still images, hiding data in audio is much more challenging due to the extreme sensitivity of the human auditory system to changes in the audio signal. Understanding human perception processes and incorporating them into effective psychoacoustic models is the key to successful watermarking. Aside from psychoacoustic modeling, synchronization is also an important component of a successful watermarking system: to recover the embedded watermark from the watermarked signal, the detector must first know where the embedded watermark begins.

In this book, we focus on those two issues. We propose a psychoacoustic model based on the discrete wavelet packet transform (DWPT). This model takes advantage of the flexibility of DWPT decomposition to closely approximate the critical bands and provides precise masking thresholds, resulting in a greater extent of inaudible spectrum and a reduced sum to signal masking ratio (SSMR) compared with existing competing techniques. The proposed psychoacoustic model has direct applications to digital perceptual audio coding as well as digital audio watermarking.
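
As a rough illustration of the decomposition this model builds on, the following is a minimal sketch of a wavelet packet transform. It assumes the Haar filter pair and a full, uniform tree purely for brevity; the preface does not state which wavelet or tree structure the book uses, and a model that approximates the critical bands would split some branches more deeply than others. The function names are hypothetical.

```python
import math


def haar_split(x):
    """One Haar analysis step: return (low-pass, high-pass) halves of x."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high


def wavelet_packet(x, depth):
    """Full wavelet packet tree: every band is split again at every level.

    Returns 2**depth subbands in tree order (not strict frequency order).
    Assumption: Haar filters and a uniform tree, chosen only for illustration.
    """
    if len(x) % (2 ** depth) != 0:
        raise ValueError("len(x) must be divisible by 2**depth")
    bands = [list(x)]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            low, high = haar_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands


def band_energies(bands):
    """Energy (sum of squared coefficients) of each subband."""
    return [sum(c * c for c in band) for band in bands]
```

Because the Haar pair is orthonormal, the subband energies add up to the energy of the input frame, which is what lets per-band energies be compared against per-band masking thresholds.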

For digital perceptual audio coding, the greater extent of inaudible spectrum provided by the psychoacoustic model results in more audio samples being quantized to zero, leading to a decreased compression bit rate. The reduction of SSMR, on the other hand, allows a coarser quantization step, which further cuts the bits needed to represent the audio in the audible spectrum areas. In other words, audio compressed with the proposed digital perceptual codec achieves better subjective quality than an existing coding standard operating at the same information rate, as confirmed by the subjective listening test.
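
The two effects described above can be sketched in a few lines. This is a simplified illustration, not the book's codec: it assumes a single masking threshold per band applied to coefficient magnitude and a uniform quantizer, and the names `masking_threshold` and `step` are hypothetical parameters.

```python
def quantize_band(coeffs, masking_threshold, step):
    """Quantize one subband.

    Coefficients whose magnitude falls below the (assumed) masking threshold
    are treated as inaudible and set to zero; the rest are uniformly
    quantized with the given step. A larger step means fewer bits per
    coefficient but more quantization noise.
    """
    indices = []
    for c in coeffs:
        if abs(c) < masking_threshold:
            indices.append(0)
        else:
            indices.append(int(round(c / step)))
    return indices


def dequantize_band(indices, step):
    """Reconstruct approximate coefficient values from quantizer indices."""
    return [q * step for q in indices]
```

A higher threshold zeroes more coefficients, and a coarser step shrinks the range of indices that must be coded; both reduce the bit rate.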

Digital audio watermarking applications will benefit from the proposed psychoacoustic model in two ways: a) it can embed more watermarks in the inaudible spectrum, which increases the watermark payload, and b) it makes it possible to hide higher-energy watermarks in the audible spectrum areas, which leads to improved robustness and greater resilience to attacks and signal transformations than existing techniques, as shown by the experimental results.

Finally, we introduce a fast and robust synchronization algorithm for watermarking. It exploits the consistency of the signal energy distribution under varying transformation conditions and uses a matched filter approach in a fast search to determine the precise watermark location. The proposed synchronization method achieves error-free sample-to-sample synchronization under different attacks and signal transformations and shows very high robustness to severe malicious time scaling manipulation.
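
The matched filter part of this idea can be illustrated with a small sketch. It covers only a brute-force sliding correlation against a known synchronization pattern; the energy-distribution step that the book uses to narrow the search is not reproduced here, and the ±1 pseudo-random pattern and all function names are assumptions made for illustration.

```python
import random


def make_pn_sequence(length, seed):
    """Pseudo-random +/-1 pattern shared by embedder and detector (assumed)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def matched_filter_locate(signal, template):
    """Return the sample offset where correlation with the template peaks.

    Brute-force search over every offset; a practical detector would first
    restrict the candidate region instead of scanning the whole signal.
    """
    n, m = len(signal), len(template)
    if m == 0 or m > n:
        raise ValueError("template must be non-empty and no longer than signal")
    best_offset, best_score = 0, float("-inf")
    for offset in range(n - m + 1):
        score = sum(signal[offset + k] * template[k] for k in range(m))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset
```

In use, the detector regenerates the same pattern from the shared seed and calls `matched_filter_locate(received, pattern)`; the returned offset marks where the synchronization pattern, and hence the watermark, begins.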

Author(s)/Editor(s) Biography

Xing He is a senior audio research engineer in the research group at SRS Labs, Inc. located in Santa Ana, California. From January 2006 to August 2008, he was a principal systems engineer in the research group at BrainMedia, LLC located in New York City. Prior to this appointment, he was a research engineer at the Panasonic (China) Research and Development Center conducting research on Automatic Speech Recognition (ASR). He holds a PhD from the Department of Electrical and Computer Engineering at the University of Miami, in addition to his master’s and bachelor’s degrees from the Department of Electrical Engineering at Beijing Jiaotong University, Beijing, China. Dr. He’s research focuses on digital signal processing, with emphasis on speech signal enhancement, perceptual audio coding and compression, psychoacoustic modeling, and digital audio watermarking.
