Towards Layer Adaptation for Audio Transmission

Jan Holub (Czech Technical University in Prague, Prague, Czech Republic), Oldřich Slavata (Czech Technical University in Prague, Prague, Czech Republic), Pavel Souček (Czech Technical University in Prague, Prague, Czech Republic), Odysseas Zisimopoulos (Wireless Telecommunications Laboratory, Department of Electrical and Computer Engineering, University of Patras, Patras, Greece), Dimitris Toumpakaris (Wireless Telecommunications Laboratory, Department of Electrical and Computer Engineering, University of Patras, Patras, Greece) and Stavros Kotsopoulos (Wireless Telecommunications Laboratory, Department of Electrical and Computer Engineering, University of Patras, Patras, Greece)
DOI: 10.4018/IJITN.2014100104

Abstract

When audio is transmitted over a wireless channel, its quality depends on the signal-to-noise ratio (SNR). The purpose of this paper is to investigate whether rate adaptation can be avoided, so that a system relies instead on the audio encoder and decoder to alleviate the effect of channel errors. To this end, the paper reports on a set of experiments covering various combinations of channel conditions, constellation sizes and audio encodings, and on the final audio quality achieved in each case. The Mean Opinion Score (MOS) is used for performance evaluation. The MOS values are generated using the ITU-T P.862 (PESQ) and P.863 (POLQA) algorithms, and also using tests by experts. The results support the common practice of adapting the physical layer parameters to changing channel conditions. However, in some cases it is possible to maintain a constant rate without significantly impacting the quality of the audio. In those cases, the complexity associated with physical layer and audio rate adaptation can be avoided, leading to simpler and more robust designs.

1. Introduction

Traditionally, communication systems have been designed using a layered architecture. Each layer accomplishes a well-defined set of tasks and communicates with the layer immediately above and the layer immediately below through well-defined interfaces. The layered architecture reduces the complexity of systems and simplifies the design because it allows the designers to focus on optimizing each layer.

However, this architecture is not always optimal. In some scenarios, optimizing a given layer without taking into consideration other layers may result in loss of efficiency. One example is the behavior of congestion-avoidance protocols in networks where the physical medium is a wireless channel with fading. Another example is resource allocation in MIMO systems, where it is no longer possible for the MAC and the PHY layer to coordinate through a few simple metrics and the allocation needs to be performed in a joint fashion.

The transmission of audio through wireless networks is a case where cross-layer approaches may prove advantageous over a purely layered approach (Chen, Liu, Wu, and Chen, 2011). Currently, state-of-the-art wireless systems use Adaptive Modulation and Coding (AMC) in the physical layer in order to transport bit streams with a Bit-Error Rate (BER) not exceeding a value specified by the upper layers. In order to maintain the target BER, the rate of the raw bit stream is adapted as the quality of the channel changes. When the channel quality is low because of fading and/or interference, or when more users enter a system with limited resources, the data rate is reduced. Conversely, more data can be transmitted in a given time slot when the SNR is high or when more bandwidth can be allocated to a given user.
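As a rough illustration of the AMC principle described above, the following Python sketch picks the largest square QAM constellation whose estimated BER stays below a target. It is not taken from the paper: the function names are hypothetical, the channel is assumed to be AWGN with uncoded M-QAM, and the BER is estimated with the widely used approximation BER ≈ 0.2·exp(−1.5·SNR/(M−1)).

```python
import math

def approx_ber_mqam(snr_db, m):
    """Approximate BER of uncoded M-QAM over AWGN (assumed model, not from the paper)."""
    snr_linear = 10 ** (snr_db / 10)
    return 0.2 * math.exp(-1.5 * snr_linear / (m - 1))

def select_constellation(snr_db, target_ber, sizes=(4, 16, 64, 256)):
    """Return the largest constellation size meeting target_ber, or None if none does."""
    best = None
    for m in sorted(sizes):
        # The approximate BER grows with m, so the last size that passes is the largest.
        if approx_ber_mqam(snr_db, m) <= target_ber:
            best = m
    return best

if __name__ == "__main__":
    for snr_db in (0, 10, 20, 30):
        print(snr_db, "dB ->", select_constellation(snr_db, target_ber=1e-3))
```

Under these assumptions, a drop in SNR forces a smaller constellation and therefore a lower raw bit rate, which is the behavior whose necessity the paper questions for audio.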

In live audio transmission, if it is important that the audio quality remain the same throughout transmission, the user will need to boost its power if the SNR decreases, resulting in higher power consumption. Moreover, the system may be forced to deny service to users requesting transmission in order to maintain the data rates of existing users. An alternative approach is to drop the quality of the transmitted audio signal. In some cases, such as audio broadcasting in home networks, this may be highly undesirable.

In other applications, it might be preferable to maintain a given stream rate, even if this results in some degradation of the audio or some periods of silence (e.g. secured low bit-rate voice or video transmissions). Moreover, in applications where some delay can be tolerated, and, therefore, interleaving can be used either in the audio encoder or by physical layer algorithms, it may not be necessary to change the data rate.

When Adaptive Modulation and Coding is used, the adaptation is based on appropriate metrics. Typically, the metric used is the Bit-Error Rate (BER) that the channel can offer at a given instant. The BER is a function not only of the channel condition, but also of the transmission rate. However, when the physical medium is used to transport audio, it might be possible to base the adaptation decision on other metrics. For example, if some amount of audio buffering is used, it might be possible to delay switching to a lower rate, as the channel may improve before the buffer empties. For this decision, metrics such as the size of the buffer and the statistics of the channel could be considered. Even if a larger BER cannot be avoided, it might be better to maintain the same transmission rate at the physical layer if the audio decoder can conceal the additional errors. Bandwidth saving, latency and packet loss for different options are studied in (Saldana, Fernandez-Navajas, Ruiz-Mas, Murillo, Viruete-Navarro, and Aznar, 2012). In this paper, the Mean Opinion Score (MOS) is used as the metric. The MOS quantifies the quality of the audio as perceived by a listener. It can be obtained via listening tests by experts, or by objective testing using speech transmission quality algorithms (ITU-T Rec. P.862, 2001), (ITU-T Rec. P.862.1, 2003), (ITU-T Rec. P.863, 2011).
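The buffer-based decision mentioned above can be sketched as follows. This is only an illustrative heuristic, not the method evaluated in the paper: the function names, the use of the mean of recent fade durations as the channel statistic, and the safety margin are all assumptions.

```python
from statistics import mean

def expected_fade_duration(recent_fades_s):
    """Estimate the next fade duration (seconds) as the mean of recent fades.

    With no history, return infinity so that the caller does not hold the rate.
    """
    return mean(recent_fades_s) if recent_fades_s else float("inf")

def should_hold_rate(buffered_audio_s, recent_fades_s, safety_margin_s=0.05):
    """Hold the current rate only if the buffer is expected to outlast the fade."""
    needed_s = expected_fade_duration(recent_fades_s) + safety_margin_s
    return buffered_audio_s >= needed_s

if __name__ == "__main__":
    fades = [0.1, 0.2, 0.3]  # hypothetical measured fade durations, in seconds
    print(should_hold_rate(0.5, fades))  # buffer comfortably covers a typical fade
    print(should_hold_rate(0.1, fades))  # buffer too small, switch to a lower rate
```

A real system would likely use a richer channel model, but the sketch shows how buffer occupancy and channel statistics can replace the instantaneous BER as the trigger for adaptation.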

Our aim is to investigate whether the common practice of adapting the physical layer parameters is beneficial when the goal is to maintain a given MOS rather than a given BER of the encoded audio bit stream. This is motivated by the observation (Slavata & Holub, 2014) that, although larger BER values introduce more bit errors into the encoded audio stream, keeping the audio rate constant instead of reducing it may allow the audio decoder to better conceal the effect of the additional errors on the quality of the audio.
