Towards Multimedia Digital Libraries

Cláudio de Souza Baptista (University of Campina Grande, Brazil) and Ulrich Schiel (University of Campina Grande, Brazil)
DOI: 10.4018/978-1-59904-879-6.ch037
A multimedia digital library copes with the storage and retrieval of resources of different media such as video, audio, maps, images, and text documents. The main improvement with regard to textual digital libraries is the possibility of retrieving documents in different media by combining metadata and content analysis. Content-based indexing and retrieval is a complex and ongoing research field with specific problem statements for each medium. A prototype of a multimedia digital library is presented.
Digital libraries are a combination of available resources, coupled with services which provide access to them. Although most of the resources are in digital form, and can therefore be retrieved from a client machine, there are those which may be available only in hard copy. In such cases, indexing and searching services are provided, enabling end users to discover which resources are available and where they can be located. However, recent advances in multimedia technology have radically changed information systems. Multimedia involves not only the manipulation of alphanumeric data, but also new data types such as audio, video, images, maps, and text. These new data types are known as multimedia data, and the development of information systems that cope with them has become a highly attractive research area. Multimedia digital libraries are one such class of information system.

A multimedia digital library copes with the storage and retrieval of resources of different media such as video, audio, maps, images, and text documents. Previously, searching and indexing procedures were restricted to alphanumeric data types. In the context of textual resources, this is acceptable and efficient, but it does not hold for multimedia data types, where interpretation of their semantics is required for effective indexing and searching. Furthermore, there are specific domains, such as spatial and temporal applications, which require tailored searching, browsing, and indexing mechanisms.

Multimedia digital libraries have characteristics that distinguish them from other digital libraries. The main ones are presented below:

  • Data model: Due to the high complexity of multimedia data, it is imperative to provide a model with a high level of abstraction that can use a hierarchical approach in order to represent content, relationships, structure, behavior, and dynamics of objects. Furthermore, each medium needs a specific data model.

  • Large objects: Multimedia digital libraries need to cope with data objects that are sometimes very large. Instead of a few kilobytes to store a record in a conventional system, megabytes or even gigabytes of storage are required for multimedia objects.

  • Indexing: Multimedia digital libraries must provide new indexing techniques such as content-based information retrieval, which enables not only exact-match queries but also similarity queries. In the latter, fuzzy operators may be needed, and a ranked list of approximate matches is given as a result. Due to the large size of objects and some special features such as continuous playing, new indexing and buffering techniques that satisfy real-time and synchronization constraints are necessary.

  • Interface: Multimodal interfaces are required with some facilities such as visual query, browsing, audio-visual interface, and virtual reality.

  • Preprocessing: Some treatment must be given to multimedia data before using them; such procedures include compression techniques, data quality enhancement, and addition of metadata in order to deliver more semantic information to raw data.
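
The difference between exact-match and similarity queries mentioned under Indexing can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter's prototype: the catalogue entries, the four-element feature vectors (standing in for content features such as a coarse color histogram), the function names, and the choice of cosine similarity as the similarity measure.

```python
import math

# Hypothetical catalogue: each resource carries metadata ("media") and a small
# feature vector standing in for extracted content features.
CATALOGUE = [
    {"id": "img-01", "media": "image", "features": [0.70, 0.20, 0.05, 0.05]},
    {"id": "img-02", "media": "image", "features": [0.10, 0.10, 0.40, 0.40]},
    {"id": "vid-01", "media": "video", "features": [0.60, 0.30, 0.05, 0.05]},
]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_match(catalogue, field, value):
    """Metadata query: a resource either matches or it does not."""
    return [r["id"] for r in catalogue if r.get(field) == value]


def similarity_query(catalogue, query_features, top_k=3):
    """Content query: every resource gets a score; the best top_k are returned."""
    scored = [
        (cosine_similarity(query_features, r["features"]), r["id"])
        for r in catalogue
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    print(exact_match(CATALOGUE, "media", "image"))
    for score, resource_id in similarity_query(CATALOGUE, [0.70, 0.20, 0.05, 0.05]):
        print(f"{resource_id}: {score:.3f}")
```

The exact-match query returns an unordered set of hits, whereas the similarity query returns a ranked list in which even weak matches appear with a low score. A real system would replace the linear scan with an index structure suited to high-dimensional feature vectors.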

This paper describes the main issues in designing a multimedia digital library. We discuss background issues on digital libraries, and highlight the relevance of multimedia metadata.

The next section focuses on a new query paradigm based on content-based retrieval for images, video, and audio. Finally, future trends and a conclusion are addressed.

Key Terms in this Chapter

Multimedia Data: Sources in the form of text, video, images, maps, or sound.

Digital Library: An environment for retrieval of digital documents. In contrast to a conventional library, documents of interest are not taken away from the library, but can be downloaded or displayed on the user's host.

Document Indexing: The process of extracting the information contained in a document and creating an index from it. While the index is always a set of terms, the source document may be in the form of text or other media, such as video, images, or sound.

Content-Based Extraction: Process of extracting information from nontext multimedia data.

Multimedia Document: A document that contains more than one medium. Whereas hard-copy documents can combine only text, images, and maps, electronic documents can contain any combination of multimedia data.

Information Retrieval: The process of finding information in a set of documents by use of a computer.

