The Speech-Enabled Web

L. E. Moser (University of California, Santa Barbara, USA) and P. M. Melliar-Smith (University of California, Santa Barbara, USA)
Copyright: © 2008 | Pages: 10
DOI: 10.4018/978-1-59140-993-9.ch079


Speech recognition and synthesis technology has advanced to the point where voice input and output are now feasible for Web-based applications over the Internet. This article describes applications, standards, and architectures for a speech-enabled Web, or SpeechWeb. The ready availability of mobile devices, such as cell phones and PDAs, that have wireless access to the Internet but lack a conventional desktop keyboard, mouse, and large display makes voice input and output very compelling. For devices with small screens and keyboards, and for hands-free or eyes-free situations, voice input and output are essential to enable the user’s interaction with the device and to make it more user-friendly.

Key Terms in this Chapter

Speech Recognition: The process of interpreting human speech for transcription or as a method of interacting with a computer, using a computer equipped with a source of speech input, such as a microphone.

SpeechWeb: A collection of hyper-linked applications that are distributed over the Internet and are accessible by spoken commands and queries that are input through remote end-user devices.

Speech Synthesis Markup Language (SSML): A standard that specifies the rendering of synthesized speech to the user.

Voice Extensible Markup Language (VoiceXML): A standard that is used for defining dialogs and for specifying the exchange of information between a user and a speech application.

Speech Recognition Grammar Specification (SRGS): A standard that allows applications to specify the words and phrases that users are prompted to speak.

Speech Synthesis: The artificial production of human speech. Speech synthesis systems are also called text-to-speech systems in reference to their ability to convert text into speech.

Speech Application Language Tags (SALT): A standard that enables multi-modal and telephony access to the Web by providing access to information, applications, and Web services from PCs, telephones, and PDAs.
