Speechfind: Advances in Rich Content Based Spoken Document Retrieval

Wooil Kim (Center for Robust Speech Systems (CRSS) and Erik Jonsson School of Engineering and Computer Science at the University of Texas at Dallas, USA) and John H.L. Hansen (University of Texas at Dallas, USA)
DOI: 10.4018/978-1-59904-879-6.ch017

Abstract

This chapter addresses a number of advances in formulating spoken document retrieval for the National Gallery of the Spoken Word (NGSW) and the U.S.-based Collaborative Digitization Program (CDP). After an overview of the audio stream content of the NGSW and CDP audio corpus, an overall system diagram is presented, together with a discussion of the critical tasks associated with effective audio information retrieval: advanced audio segmentation, speech recognition model adaptation for acoustic background noise and speaker variability, and information retrieval using natural language processing for text query requests, including document and query expansion. Our experimental online system, “SpeechFind,” which allows audio retrieval from the NGSW and CDP corpus, is then presented. Finally, a number of research challenges and new directions are discussed in order to address the overall task of robust phrase searching in unrestricted audio corpora.

Introduction

The focus of this chapter is to provide an overview of the SpeechFind online spoken document retrieval system, including its subtasks, corpus enrollment, and online search and retrieval engines (Hansen, Huang, Zhou, Seadle, Deller, Gurijala, et al., 2005, http://www.ngsw.org) and the Collaborative Digitization Program (CDP, http://cdpheritage.org). The field of spoken document retrieval requires an interdisciplinary effort, bringing together researchers from electrical engineering (speech recognition) and computer science (natural language processing) with historians, library archivists, and others. As such, we provide a summary of acronyms and definitions of terms at the end of this chapter to assist those interested in spoken document retrieval for audio archives.

Reliable speech recognition for spoken document/information retrieval is challenging when data are recorded across different media, equipment, and time periods. NGSW is the first large-scale repository of its kind, consisting of speeches, news broadcasts, and recordings of significant historical content. The U.S. National Science Foundation recently established an initiative to provide better transition of library services to digital format. As part of this Phase-II Digital Libraries Initiative, researchers from Michigan State University (MSU) and the University of Texas at Dallas (UTD, formerly at the University of Colorado at Boulder) have teamed to establish a fully searchable, online WWW database of spoken word collections that span the 20th century. The database draws primarily from the holdings of MSU’s Vincent Voice Library (VVL), which include more than 60,000 hours of recordings.

In the field of robust speech recognition, a variety of challenging problems persist, such as reliable speech recognition across wireless communications channels, recognition of speech across changing speaker conditions (e.g., emotion and stress [Bou-Ghazale & Hansen, 2000; Hansen, 1996; Sarikaya & Hansen, 2000] and accent [Angkititrakul & Hansen, 2006; Arslan & Hansen, 1997]), and recognition of speech from unknown or changing acoustic environments. The ability to achieve effective performance in changing speaker conditions for large vocabulary continuous speech recognition (LVCSR) remains a challenge, as demonstrated in recent DARPA evaluations focused on broadcast news (BN) vs. previous results from the Wall Street Journal (WSJ) corpus.

One natural solution to audio stream search is to perform forced transcription for the entire dataset and simply search the synchronized text stream. While this may be a manageable task for BN (consisting of about 100 hours), the initial offering for NGSW will be 5000 hours (with a potential of more than 60,000 total hours), and it will simply not be possible to achieve accurate forced transcription, since text data will generally not be available. Other studies have also considered Web-based spoken document retrieval (SDR) (Fujii & Itou, 2003; Hansen, Zhou, Akbacak, Sarikaya, & Pellom, 2000; Zhou & Hansen, 2002). Transcript generation of broadcast news can also be conducted in an effort to obtain near real-time closed captioning (Saraclar, Riley, Bocchieri, & Goffin, 2002). Instead of generating exact transcripts, some studies have considered summarization and topic indexing (Hori & Furui, 2000; Maskey & Hirschberg, 2003; Neukirchen, Willett, & Rigoll, 1999), or more specifically, topic detection and tracking (Walls, Jin, Sista, & Schwartz, 1999), and others have considered lattice-based search (Saraclar & Sproat, 2004). Some of these ideas are related to speaker clustering (Moh, Nguyen, & Junqua, 2003; Mori & Nakagawa, 2001), which is needed to improve acoustic model adaptation for BN transcription generation. Language model adaptation (Langzhou, Gauvain, Lamel, & Adda, 2003) and multiple/alternative language modeling (Kurimo, Zhou, Huang, & Hansen, 2004) have also been considered for SDR. Finally, cross-lingual and multilingual studies have also been performed for SDR (Akbacak & Hansen, 2006; Navratil, 2001; Wang, Meng, Schone, Chen, & Lo, 2001).

Key Terms in this Chapter

LVCSR: Large Vocabulary Continuous Speech Recognition

Word Error Rate (WER): A performance measure for speech recognition that includes substitution errors (i.e., misrecognition of one word for another), deletion errors (i.e., words missed by the recognition system), and insertion errors (i.e., words introduced into the text output by the recognition system).

Mel Frequency Cepstral Coefficients (MFCC): A standard set of features used to parameterize speech for acoustic models in speech recognition.

NGSW: The National Gallery of the Spoken Word, a consortium of universities supported by the U.S. National Science Foundation (NSF) Digital Libraries Initiative to establish the first nationally recognized, fully searchable online audio archive.

Broadcast News (BN): An audio corpus consisting of recordings from TV and radio broadcasts, used for the development and performance assessment of speech recognition systems.

Out-of-Vocabulary (OOV): In speech recognition, the available vocabulary must first be defined. OOV refers to words contained in the input audio signal that are not part of the available vocabulary lexicon and therefore will always be misrecognized by automatic speech recognition.

Managing Gigabytes (MG): One of the two general-purpose systems available for text search and indexing. See the textbook by Witten, Moffat, and Bell (1999) for extended discussion.

SDR: Spoken Document Retrieval

Collaborative Digitization Program (CDP): A consortium of libraries, universities, and archives working together to establish best practices for transitioning materials (e.g., audio and images) to digital format.

ASR: Automatic Speech Recognition
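
The Word Error Rate and Out-of-Vocabulary definitions above can be made concrete with a short sketch. It is an illustration only, not code from SpeechFind: the function names `word_error_rate` and `oov_rate` are hypothetical, words are split naively on whitespace, and the error count is normalized by the number of reference words, which is the conventional WER definition but is an assumption here because the glossary does not state the normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference length.

    Assumption: whitespace tokenization and the conventional normalization
    by the number of reference words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # dist[i][j] = minimum number of word edits turning ref[:i] into hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert all j hypothesis words

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dist[i - 1][j] + 1
            insertion = dist[i][j - 1] + 1
            dist[i][j] = min(substitution, deletion, insertion)

    return dist[len(ref)][len(hyp)] / len(ref)


def oov_rate(transcript: str, lexicon: set) -> float:
    """Return the fraction of transcript words absent from the lexicon."""
    words = transcript.split()
    if not words:
        raise ValueError("transcript must contain at least one word")
    missing = [w for w in words if w not in lexicon]
    return len(missing) / len(words)


if __name__ == "__main__":
    # Toy strings, not data from the NGSW or CDP corpora.
    print(word_error_rate("a b c d", "a x c"))
    print(oov_rate("speech recognition rocks", {"speech", "recognition"}))
```

Because insertions are counted against the reference length, a WER computed this way can exceed 1.0 when the recognizer outputs many spurious words.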
