Quality Analysis of VoIP in Real-Time Interactive Systems over Lossy Networks

Maha Z. Mouasher (The University of Jordan, Jordan) and Ala' F. Khalifeh (German-Jordanian University, Jordan)
DOI: 10.4018/978-1-4666-7377-9.ch016


Voice over Internet Protocol (VoIP) systems have spread widely in recent years. However, the technology still faces many challenges, among them the lossy behavior and uncontrolled network impairments of the Internet. In this chapter, the authors design and implement a VoIP test-bed based on the Adobe Real-Time Media Flow Protocol (RTMFP) that can be used for many interactive voice applications. The test-bed was used to study the effect of changing voice parameters, mainly the encoding rate and the number of frames per packet, as a function of network packet loss. Several experiments were conducted on several voice files at different packet loss rates, yielding the best combination of parameters under low, moderate, and high packet loss conditions, with speech quality measured by Perceptual Evaluation of Speech Quality (PESQ) values.
Chapter Preview


The market penetration of VoIP has massively influenced traditional telephony, as witnessed by the popularity of many VoIP services and applications (e.g. Skype, Cisco TelePresence, Ekiga and ooVoo (Vaughan, 2012)). VoIP has become one of the most popular Internet Protocol (IP) based real-time communication services in recent years, and there is no doubt that VoIP networks are more cost-effective than conventional telephone networks built on Time Division Multiplexing (TDM) and leased lines. In real-time interactive systems, VoIP plays an important role in enabling voice-based real-time services, such as voice conferencing, that are used in various sectors and fields every day. According to Zhao, Yagi, Nakajima, & Juzoji (2002), VoIP is taking a firm hold in the telecommunication market for a wide range of interactive applications, including telemedicine and e-health. Telecommunications is the key to any successful telemedicine activity and is considered a tool of vital importance to the future of healthcare. For many years, telemedicine was supported by dedicated telecommunication assets such as Plain Old Telephone Service (POTS) and satellite links. Today, with the wide spread of the Internet, VoIP-based services are becoming more attractive (Latifi, 2008). However, VoIP still faces many challenges, mainly transmission over lossy connections and its consequences for Quality of Service (QoS). In this chapter, experimental tests are conducted under lossy network impairments, and a test-bed is proposed as an efficient tool for testing Internet-based interactive systems. The designed VoIP test-bed uses an open-source, license-free voice codec along with a free, publicly available development version of Adobe Flash Media Server (FMS).
These features have drawn the interest of some well-known commercial providers of online speech therapy telepractice, which use a design similar to the one proposed in their interactive e-health solutions. Such solutions provide speech-language pathologists to school districts through telepractice, a delivery model commonly referred to as online speech therapy, in which an interactive real-time system connects therapists to children in need of speech therapy, with the connection carried through FMS. Investing in such a system reduced the large financial commitments otherwise required. However, there is still room for further improving the quality of the delivered VoIP services.

In fact, many such companies can serve as a model for how VoIP can be used in interactive systems and applications in different fields, such as e-health services. However, despite the spread of VoIP across many fields and applications, many corporations are still reluctant to introduce VoIP on a larger scale because of QoS concerns (Mase, Toyama, Bilhaj, & Suda, 2001; Sabrina & Valin, 2009). This stresses the demand for more research into mechanisms and methods to improve speech quality, which according to ITU-T E.800 is defined as “the quality of spoken language as perceived when acoustically displayed. Result of a perception and assessment process, in which the assessing subject establishes a relationship between the perceived characteristics” (ITU-T E.800, 2008). To that end, and in order to support VoIP applications over the Internet, two conflicting requirements need to be met. On the one hand, shared resources have to be controlled so that resource usage is optimized. On the other hand, VoIP applications are among those that demand the strictest QoS levels and are most sensitive to network impairments such as packet loss, available bandwidth, delay, and jitter (Rejaie, Handley, & Estrin, 2000; Sabrina & Valin, 2009). Birke, Mellia, Petracca, & Rossi (2007) presented a large dataset of measurements collected from the backbone of FastWeb, one of the first telecom operators worldwide to offer VoIP and high-speed data access to end users. They found that, among the impairments listed above, packet loss is the major source of impairment affecting VoIP call quality, with an immense influence on delivered speech. It is therefore necessary to minimize the effect of packet loss and other impairments on speech quality during any interactive transmission to guarantee the best available quality.
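
To make the packet loss impairment concrete, the following minimal Python sketch simulates random packet loss on a voice stream and reports how many speech frames are lost and how long the resulting gaps are. It is an illustration only, not the chapter's test-bed: the independent (Bernoulli) loss model, the function name `simulate_loss`, and its parameters are assumptions introduced here.

```python
import random


def simulate_loss(num_packets, frames_per_packet, loss_rate, seed=0):
    """Drop packets independently with probability loss_rate (assumed model).

    Returns (lost_frames, bursts): the total number of lost speech frames and
    the lengths, in frames, of each run of consecutive lost frames.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result

    # Decide the fate of each packet, then expand it to its speech frames:
    # every frame carried by a dropped packet is lost together.
    frame_lost = []
    for _ in range(num_packets):
        dropped = rng.random() < loss_rate
        frame_lost.extend([dropped] * frames_per_packet)

    # Collect the length of each run of consecutive lost frames.
    bursts = []
    run = 0
    for lost in frame_lost:
        if lost:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)

    return sum(frame_lost), bursts


if __name__ == "__main__":
    for fpp in (1, 2, 4):
        lost, bursts = simulate_loss(1000, fpp, loss_rate=0.05)
        longest = max(bursts) if bursts else 0
        print(f"frames/packet={fpp} lost_frames={lost} longest_gap={longest}")
```

Under this assumed loss model, every gap in the received speech is a whole multiple of the number of frames per packet, which is one reason packing more frames into each packet can hurt perceived quality as loss increases.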

Key Terms in this Chapter

FMS: Is an acronym for Flash Media Server, a media server from Adobe Systems. It works with the Flash Player runtime to create media-driven, multi-user Rich Internet Applications (RIAs), allowing multiple Flash Player clients to connect and exchange multimedia content through the server.

FFP: The number of audio frames per packet used in the packetization process.
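
The trade-off behind this parameter can be sketched with simple arithmetic. The short Python example below is a rough illustration under stated assumptions, not the chapter's method: it assumes 20 ms frames (the frame duration used by Speex) and counts only IPv4 and UDP headers (28 bytes) as per-packet overhead, ignoring RTMFP's own header. The names `PacketProfile` and `packet_profile` are hypothetical.

```python
from dataclasses import dataclass

FRAME_MS = 20      # Speex frame duration in milliseconds
HEADER_BYTES = 28  # assumption: IPv4 (20) + UDP (8); RTMFP overhead is ignored


@dataclass
class PacketProfile:
    payload_bytes: float        # encoded speech bytes carried by one packet
    packets_per_second: float   # packet rate needed to keep up with the speech
    wire_kbps: float            # bitrate including the assumed header overhead
    speech_ms_per_packet: int   # speech lost when a single packet is dropped


def packet_profile(codec_kbps: float, frames_per_packet: int) -> PacketProfile:
    """Estimate packet size, rate, and on-the-wire bitrate for one setting."""
    speech_ms = FRAME_MS * frames_per_packet
    # kbps -> bytes per second, scaled to the speech duration in one packet.
    payload_bytes = codec_kbps * 1000 / 8 * speech_ms / 1000
    packets_per_second = 1000 / speech_ms
    wire_kbps = (payload_bytes + HEADER_BYTES) * 8 * packets_per_second / 1000
    return PacketProfile(payload_bytes, packets_per_second, wire_kbps, speech_ms)


if __name__ == "__main__":
    for fpp in (1, 2, 4):
        p = packet_profile(codec_kbps=8, frames_per_packet=fpp)
        print(f"frames/packet={fpp} wire_kbps={p.wire_kbps:.1f} "
              f"lost_per_drop={p.speech_ms_per_packet} ms")
```

Under these assumptions, an 8 kbps encoding rate costs about 19.2 kbps on the wire with one frame per packet and about 10.8 kbps with four, but each dropped packet then removes 80 ms of speech instead of 20 ms.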

QoS: Is an acronym for Quality of Service, defined in ITU-T Recommendation E.800. It is the measure of the service performance effects that determine the degree of satisfaction of a user of the service.

PESQ: Is an acronym for Perceptual Evaluation of Speech Quality. PESQ is defined in ITU-T Recommendation P.862 and is one of the objective quality assessment methods used in telecommunications and IP networks to approximate human subjective tests. It assesses speech quality by comparing the original speech sample with its degraded version.

RTMFP: Is an acronym for Real-Time Media Flow Protocol. RTMFP is based on UDP and was developed by Adobe Systems for video, audio, and data transmission, supporting the client-server model as well as the Peer-to-Peer (P2P) model.

Codec: Is software that is used to encode and compress, or decode and decompress, a digital media file such as audio, image, or video.

Speex: Is a non-commercial, open-source speech codec that is free from patent royalties. It is designed for packet networks and VoIP applications, as well as file-based compression. The Speex algorithm supports a wide variety of variable bit-rate modes to conserve bandwidth and maintain a pre-specified level of speech quality.

VoIP: Is an acronym for Voice over Internet Protocol. VoIP is an alternative technology to the Public Switched Telephone Network (PSTN) and circuit switching. VoIP technology digitizes and compresses voice conversations into voice packets, which are carried over data-centric packet-switched networks such as Internet Protocol (IP) networks. VoIP thus makes new and innovative services possible, with greater potential for reducing the cost of phone calls and infrastructure thanks to the widespread availability of IP networks.
