IP Telephony provides a way for an enterprise to extend consistent communication services to all employees, whether they are at main campus locations, at branch offices, or working remotely, including from a mobile phone. IP Telephony transmits voice communications over a network using open, standards-based Internet protocols. This is both the strength and the weakness of IP Telephony, because the basic transport protocols involved (RTP, UDP, and IP) cannot natively guarantee the quality of service (QoS) that the application requires. From the point of view of an IP Telephony Service Provider, this can mean losing customers and revenue. Specifically, the problem arises at two different levels: i) in some countries, where long-distance and particularly international call tariffs are high, perhaps due to a lack of competition or to cross-subsidies to other services, the major opportunity for IP Telephony Service Providers is price arbitrage. This means working on the diffusion of an acceptable service, although not at high quality levels; ii) in other countries, where several IP Telephony Service Providers already exist, the problem is competing to offer the best possible quality. The main idea behind this chapter is to analyze state-of-the-art playout control strategies with the following aims: i) present the reader with state-of-the-art playout control management and planning strategies (an overview of basic KPIs for IP Telephony); ii) compare the strategies an IP Telephony Service Provider can choose in order to save money and offer a better quality of service; iii) introduce the state-of-the-art quality index for IP Telephony, that is, a set of algorithms that take into account as many factors as possible when evaluating service quality; iv) provide the reader with examples of some economic scenarios of IP Telephony.
IP Telephony is the combination of IP and a telephone service. It implements services such as sending, receiving, and managing voice and data-voice between two or more users in real time over an already existing IP channel. The basic frameworks for implementing an IP Telephony solution are ITU-T H.323 (Rec. H.323, 2006) and SIP (RFC 3261, 2002). ITU-T H.323 is a recommendation that defines the protocols for providing audio-visual communication sessions on any packet network, covering both call management and speech packet transport, including speech coding (see Figure 1).
The H.323 standard specifies four kinds of components, providing a point-to-point and point-to-multipoint IP Telephony service (see also Figure 2 from packetizer.com):
The ITU-T H.323 components
Terminals: A personal computer (PC) or a stand-alone device running an H.323 stack and multimedia applications
Gateways: Provide connectivity between an H.323 network and a non–H.323 network
Gatekeepers: The focal point for all calls within the H.323 network
Multipoint control units (MCUs): Support for conferences of three or more H.323 terminals
All these components and the related operations are detailed in the ITU-T documents. The ITU-T H.323 is an umbrella recommendation that comprises directly, or through a reference, all the standards needed for coding, transporting and managing IP Telephony sessions. As an example, coding operations are referenced in ITU-T H.323 documents but described in detail in other recommendations such as ITU-T G.723 (Rec. G.723.1, 2006) and G.729 (Rec. G.729, 2007).
Furthermore, H.323 is part of the H.32X family, which comprises different recommendations (see Table 1).
Table 1.
|Recommendation|Network|
|H.322|LANs that provide guaranteed QoS|
|H.320|integrated services digital networks (ISDN)|
|H.321 and H.310|broadband integrated services digital networks (B–ISDN)|
In contrast, the Session Initiation Protocol (SIP) (RFC 3261, 2002) is only an application-layer control (signaling) protocol for managing sessions with one or more participants. It can be used to create two-party, multi-party, or multicast sessions that include Internet telephone calls, multimedia distribution, and multimedia conferences. SIP clients use TCP or UDP to connect to SIP servers and other SIP endpoints. SIP is primarily used to set up voice and video calls, but it can be used in any application that requires session initiation.
Key Terms in this Chapter
Real-time Transport Protocol (RTP): The Real-time Transport Protocol (RTP) defines a standardized packet format for delivering audio and video over the Internet. It was developed by the Audio-Video Transport Working Group of the IETF and first published in 1996 as RFC 1889, which was made obsolete in 2003 by RFC 3550. RTP can also be used in conjunction with the RSVP protocol, which enhances the field of multimedia applications (from wikipedia.org).
ITU-T H.323: A protocol suite defined by ITU-T for voice transmission over the Internet (Voice over IP, or VoIP). In addition to voice applications, H.323 provides mechanisms for video communication and data collaboration, in combination with the ITU-T T.120 series standards. H.323 is one of the major VoIP standards, alongside Megaco and SIP. H.323 is an umbrella specification because it includes various other ITU standards (from cisco.com).
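As an illustration of the "standardized packet format" mentioned above, the following minimal sketch packs and parses the 12-byte fixed RTP header described in RFC 3550. The function names are hypothetical, and the sketch assumes no padding, no header extension, and no CSRC entries.

```python
import struct

RTP_VERSION = 2  # version field value defined by RFC 3550


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (assumes P=0, X=0, CC=0)."""
    byte0 = RTP_VERSION << 6                      # V=2 in the top two bits
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data):
    """Parse the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Example values are illustrative: payload type 0 is PCMU (G.711 mu-law).
header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x1234)
fields = unpack_rtp_header(header)
```

The sequence number and timestamp carried in this header are what a receiver would typically use to detect loss and to schedule playout.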
Quality of Service (QoS): Refers to the capability of a network to provide better service to selected network traffic over various technologies, including Frame Relay, Asynchronous Transfer Mode (ATM), Ethernet and 802.1 networks, SONET, and IP-routed networks that may use any or all of these underlying technologies. The primary goal of QoS is to provide priority, including dedicated bandwidth, controlled jitter and latency (required by some real-time and interactive traffic), and improved loss characteristics. It is also important to make sure that providing priority for one or more flows does not make other flows fail. QoS technologies provide the elemental building blocks that will be used for future business applications in campus, WAN, and service provider networks (from www.cisco.com documentation).
ITU-T G.723.1: A dual-rate codec that compresses the speech signal to 6.3 or 5.3 kbps while maintaining toll quality and relatively low delay (from ieeexplore).
Talkspurt: A continuous period of speech between two periods of silence.
Internet Protocol (IP): The Internet Protocol (IP) is a network-layer (Layer 3) protocol that contains addressing information and some control information that enables packets to be routed. IP is documented in RFC 791 and is the primary network-layer protocol in the Internet protocol suite. Along with the Transmission Control Protocol (TCP), IP represents the heart of the Internet protocols. IP has two primary responsibilities: providing connectionless, best-effort delivery of datagrams through an internetwork; and providing fragmentation and reassembly of datagrams to support data links with different maximum-transmission unit (MTU) sizes (from www.cisco.com documentation).
Mean Opinion Score (MOS): A common measure of subjective speech quality is the Mean Opinion Score (MOS) scale, from 1 (bad) to 5 (excellent), defined in the ITU-T standard P.800. In a MOS test, the test persons listen to short speech samples, where every speech sample consists of two to five sentences. The total MOS score is then the mean of all individual results. Due to the absolute nature of the grading, this kind of test is also called an Absolute Category Rating (ACR) test (from ericsson.com).
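The MOS computation described above reduces to an arithmetic mean over the individual ratings. A minimal sketch, assuming each listener gives an integer grade from 1 to 5 (the function name and the sample ratings are illustrative, not measured data):

```python
def mean_opinion_score(ratings):
    """Return the mean of individual ratings on the 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for r in ratings:
        if not 1 <= r <= 5:
            raise ValueError(f"rating {r} is outside the 1..5 scale")
    return sum(ratings) / len(ratings)


# Hypothetical grades from four listeners for one speech sample.
score = mean_opinion_score([4, 4, 3, 5])
```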
Session Initiation Protocol (SIP): The Session Initiation Protocol (SIP) is an application-layer control (signaling) protocol for creating, modifying, and terminating sessions with one or more participants. It can be used to create two-party, multiparty, or multicast sessions that include Internet telephone calls, multimedia distribution, and multimedia conferences. (cit. RFC 3261). SIP is designed to be independent of the underlying transport layer; it can run on TCP, UDP, or SCTP. It was originally designed by Henning Schulzrinne (Columbia University) and Mark Handley (UCL) starting in 1996. The latest version of the specification is RFC 3261 from the IETF SIP Working Group. In November 2000, SIP was accepted as a 3GPP signaling protocol and permanent element of the IMS architecture (from wikipedia.org).
International Telecommunication Union (ITU): ITU is the leading United Nations agency for information and communication technologies. As the global focal point for governments and the private sector, ITU’s role in helping the world communicate spans three core sectors: radiocommunication, standardization, and development. ITU also organizes TELECOM events and was the lead organizing agency of the World Summit on the Information Society. ITU is based in Geneva, Switzerland, and its membership includes 191 Member States and more than 700 Sector Members and Associates (from www.itu.int).
Key performance indicator (KPI): Key Performance Indicators (KPI) are financial and non-financial metrics used to help an organization define and measure progress toward organizational goals. KPIs are used in Business Intelligence to assess the present state of the business and to prescribe a course of action. The act of monitoring KPIs in real-time is known as business activity monitoring. KPIs are frequently used to “value” difficult to measure activities such as the benefits of leadership development, engagement, service, and satisfaction (from wikipedia.org).
User Datagram Protocol (UDP): User Datagram Protocol (UDP) is one of the core protocols of the Internet protocol suite. Using UDP, programs on networked computers can send short messages sometimes known as datagrams (using Datagram Sockets) to one another. UDP is sometimes called the Universal Datagram Protocol. It was designed by David P. Reed in 1980. UDP does not guarantee reliability or ordering in the way that TCP does (from wikipedia.org).
Public Switched Telephone Network (PSTN): The network of the world’s public circuit-switched telephone networks, in much the same way that the Internet is the network of the world’s public IP-based packet-switched networks. Originally a network of fixed-line analog telephone systems, the PSTN is now almost entirely digital and includes mobile as well as fixed telephones. The PSTN is largely governed by technical standards created by the ITU-T, and uses E.163/E.164 addresses (known more commonly as telephone numbers) for addressing (from wikipedia.org).
ITU-T G.711: An ITU-T standard for audio companding, primarily used in telephony. The standard was released for usage in 1972. G.711 represents logarithmic pulse-code modulation (PCM) samples for signals of voice frequencies, sampled at the rate of 8000 samples/second (from wikipedia.org).
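To give a feel for what "logarithmic" companding means, the sketch below implements the continuous μ-law curve with μ = 255, which is one of the two companding laws in G.711 (the other is A-law). This is only the idealized formula, not a bit-exact G.711 encoder: the standard uses a piecewise-linear approximation with 8-bit code words, which at 8000 samples/second yields 64 kbit/s. Function names are illustrative.

```python
import math

MU = 255  # mu-law parameter used with 8-bit code words


def mu_law_compress(x):
    """Map a normalized sample x in [-1, 1] onto the logarithmic scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress: recover the linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


# Quiet samples are boosted before quantization: 0.01 maps to roughly 0.23.
quiet = mu_law_compress(0.01)
restored = mu_law_expand(quiet)
```

The point of the curve is that small amplitudes, which dominate speech, receive proportionally more of the available quantization levels than loud ones.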
European Telecommunications Standards Institute (ETSI): The European Telecommunications Standards Institute (ETSI) produces globally-applicable standards for Information and Communications Technologies (ICT), including fixed, mobile, radio, converged, broadcast and internet technologies. ETSI is a not-for-profit organization with almost 700 ETSI member organizations drawn from 60 countries world-wide (from www.etsi.org).
Jitter: Jitter is an unwanted variation of one or more signal characteristics in electronics and telecommunications. Jitter may be seen in characteristics such as the interval between successive pulses, or the amplitude, frequency, or phase of successive cycles. Jitter is a significant factor in the design of almost all communications links (from wikipedia.org).
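In the IP Telephony setting of this chapter, the characteristic of interest is usually the variation in packet delay. One common way to estimate it is the interarrival jitter defined for RTP in RFC 3550, which smooths the absolute difference in transit time between consecutive packets with a gain of 1/16. The sketch below assumes send and receive times are given in the same unit (for example milliseconds); the function name and sample values are illustrative.

```python
def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate J += (|D| - J) / 16, as in RFC 3550."""
    if len(send_times) != len(recv_times):
        raise ValueError("send_times and recv_times must have equal length")
    jitter = 0.0
    for i in range(1, len(send_times)):
        # D: change in transit time between packet i-1 and packet i.
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter


# Packets sent every 20 ms; the second one arrives 16 ms later than expected.
estimate = interarrival_jitter([0, 20], [50, 86])
```

A playout buffer would typically be sized from an estimate of this kind, trading extra delay against the number of packets that arrive too late to be played.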
ITU-T G.729: An audio data compression algorithm for voice that compresses voice audio in chunks of 10 milliseconds. Music or tones such as DTMF or fax tones cannot be transported reliably with this codec, so G.711 or out-of-band methods are used to transport these signals. G.729 is mostly used in Voice over IP (VoIP) applications for its low bandwidth requirement. Standard G.729 operates at 8 kbit/s, but there are extensions that also provide 6.4 kbit/s and 11.8 kbit/s rates for marginally worse and better speech quality, respectively. Also very common is G.729a, which is compatible with G.729 but requires less computation. This lower complexity is not free, since speech quality is marginally worsened (from wikipedia.org).
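The "low bandwidth requirement" is easiest to see with a little arithmetic. The sketch below computes the codec frame size and the resulting bandwidth at the IP layer, assuming IPv4 (20 bytes), UDP (8 bytes), and RTP (12 bytes) headers with no header compression and a 20 ms packetization interval. The interval, the function names, and the omission of link-layer overhead are assumptions made for illustration.

```python
IPV4_HEADER_BYTES = 20
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12


def frame_bytes(codec_rate_kbps, frame_ms):
    """Payload bytes produced by the codec in one frame (kbit/s * ms = bits)."""
    return codec_rate_kbps * frame_ms / 8


def ip_bandwidth_kbps(codec_rate_kbps, packet_ms):
    """Bandwidth at the IP layer when one packet carries packet_ms of speech."""
    payload_bits = codec_rate_kbps * packet_ms
    header_bits = (IPV4_HEADER_BYTES + UDP_HEADER_BYTES + RTP_HEADER_BYTES) * 8
    return (payload_bits + header_bits) / packet_ms


g729_frame = frame_bytes(8, 10)        # one 10 ms G.729 frame: 10 bytes
g729_ip = ip_bandwidth_kbps(8, 20)     # two frames per packet: 24 kbit/s
g711_ip = ip_bandwidth_kbps(64, 20)    # G.711 for comparison: 80 kbit/s
```

Under these assumptions the 40 bytes of headers outweigh the 20 bytes of G.729 payload, which is why the packetization interval matters as much as the codec rate.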