Exploiting Captions for Multimedia Data Mining

Neil C. Rowe

doi:10.4018/978-1-60566-014-1.ch073

Neil C. Rowe (U.S. Naval Postgraduate School, USA)

Source Title: Encyclopedia of Multimedia Technology and Networking, Second Edition

Abstract

Captions are text that describes some other information; they are especially useful for describing nontext media objects (images, audio, video, and software). Captions are valuable metadata for managing multimedia, since they help users better understand and remember (McAninch, Austin, & Derks, 1992-1993) and permit better indexing of media. Captions are essential for effective data mining of multimedia data, since only a small amount of the text in typical documents with multimedia, 1.2% in a survey of random World Wide Web pages (Rowe, 2002), describes the media objects. Thus standard Web browsers do poorly at finding media without knowledge of captions. Multimedia information is increasingly common in documents as computer technology improves in speed and ability to handle it, and people need multimedia for a variety of purposes like illustrating educational materials and preparing news stories. Captions are also valuable because nontext media rarely specify internally the creator, date, or spatial and temporal context, and cannot convey linguistic features like negation, tense, and indirect reference. Furthermore, experiments with users of multimedia retrieval systems show a wide range of needs (Sutcliffe, Hare, Doubleday, & Ryan, 1997), but a focus on media meaning rather than appearance (Armitage & Enser, 1997). This suggests that content analysis of media is unnecessary for many retrieval situations, which is fortunate, because it is often considerably slower and less reliable than caption analysis. But using captions requires finding them and understanding them. Many captions are not clearly identified, and the mapping from captions to media objects is rarely easy. Nonetheless, the restricted semantics of media and captions can be exploited.

Chapter Preview

Finding, Rating, and Indexing Captions

Background

Much text in a document near a media object is unrelated to that object, and even text explicitly associated with an object may often not describe it (like “JPEG picture here” or “Photo39573”). Thus, we need clues to distinguish and rate the various caption candidates and the words within them, allowing that there may be more than one caption for an object or more than one object for a caption. Free commercial media search engines (like images.google.com, multimedia.lycos.com, and www.altavista.com/image) use a few simple clues to index media, but their accuracy is significantly lower than that for indexing text. For instance, Rowe (2005) reported that none of five major image search engines could find pictures for “President greeting dignitaries” in 18 tries. So research is exploring a broader range of caption clues and types (Mukherjea & Cho, 1999; Sclaroff, La Cascia, Sethi, & Taycher, 1999).
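As a rough illustration of what one such clue might look like, the sketch below scores a caption candidate by the fraction of its words that are not placeholder terms or bare numbers. The stoplist, the function names, and the scoring rule are all assumptions made for this example; they are not the clues used by the chapter or by any search engine mentioned above.

```python
import re

# Hypothetical stoplist: words that refer to the media file itself
# rather than to what the media object depicts.
PLACEHOLDER_WORDS = {
    "jpeg", "jpg", "gif", "png", "picture", "photo", "image",
    "here", "click", "figure", "file",
}

def tokenize(text):
    """Split text into lowercase runs of letters and runs of digits."""
    return re.findall(r"[a-z]+|\d+", text.lower())

def caption_score(text):
    """Return the fraction of tokens that look descriptive (0.0 if none)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    descriptive = [
        t for t in tokens
        if not t.isdigit() and t not in PLACEHOLDER_WORDS
    ]
    return len(descriptive) / len(tokens)

if __name__ == "__main__":
    for candidate in ("JPEG picture here", "Photo39573",
                      "President greeting dignitaries"):
        print(candidate, caption_score(candidate))
```

A real system would presumably combine many clues of this kind and weight them; a single lexical test like this one is only meant to show the shape of the idea.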

Sources of Captions

Some captions are explicitly attached to media objects when the objects are added to a digital library or database. On Web pages, the HTML “alt” attribute and “caption” tag also explicitly associate text with media objects. Clickable text links to media files are another good source of captions since the text must explain the link. The name of a media file itself can be a short caption (like “socket_wrench.gif”). Less-explicit captions are marked by conventions like centering or font changes. Titles and headings preceding a media object can sometimes serve as captions, as they generalize over a block of information. Paragraphs above, below, or next to media can also be captions, especially short paragraphs.
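The explicit Web-page sources listed above can be gathered with a short parser. The following is a minimal sketch using Python's standard `html.parser` module; the class name, the list of media file extensions, and the output format are assumptions made for illustration, and the less-explicit sources (centering, font changes, headings, nearby paragraphs) are not handled.

```python
from html.parser import HTMLParser
import os
import re

# Assumed set of media file extensions; a real system would need a fuller list.
MEDIA_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".wav", ".mp3", ".mpg", ".avi")

def filename_caption(path):
    """Turn a media file name such as 'socket_wrench.gif' into plain words."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return " ".join(w for w in re.split(r"[\W_]+", stem) if w)

class CaptionCandidateParser(HTMLParser):
    """Collects (media_path, source, text) caption candidates from HTML."""

    def __init__(self):
        super().__init__()
        self.candidates = []
        self._link_target = None   # href of the media link being read, if any
        self._link_text = []       # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            src = attrs["src"]
            if attrs.get("alt"):
                self.candidates.append((src, "alt", attrs["alt"].strip()))
            self.candidates.append((src, "filename", filename_caption(src)))
        elif tag == "a":
            href = attrs.get("href") or ""
            if href.lower().endswith(MEDIA_EXTENSIONS):
                self._link_target = href
                self._link_text = []

    def handle_data(self, data):
        if self._link_target is not None:
            self._link_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._link_target is not None:
            # Collapse runs of whitespace in the clickable text.
            text = " ".join("".join(self._link_text).split())
            if text:
                self.candidates.append((self._link_target, "link text", text))
            self._link_target = None

def find_caption_candidates(html):
    parser = CaptionCandidateParser()
    parser.feed(html)
    parser.close()
    return parser.candidates

if __name__ == "__main__":
    page = ('<p><img src="tools/socket_wrench.gif" alt="A socket wrench"> '
            '<a href="clips/demo.mpg">Video of the wrench in use</a></p>')
    for candidate in find_caption_candidates(page):
        print(candidate)
```

Each candidate keeps a label for its source so that a later rating step can treat, say, “alt” text and file names differently.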

Other captions are embedded directly into the media, like characters drawn on an image (Lienhart & Wernicke, 2002) or explanatory words at the beginning of audio. These require specialized processing like optical character recognition to extract. Captions can be attached through a separate channel of video or audio, as with the “closed captions” associated with television broadcasts that aid hearing-impaired viewers and students learning languages. “Annotations” can function like captions though they tend to emphasize analysis or background knowledge.

Key Terms in this Chapter

Controlled vocabulary: A limited menu of words from which metadata like captions must be constructed.

Metadata: Information describing another data object such as its size, format, or description.

HTML: Hypertext Markup Language, the base language of pages on the World Wide Web.

Caption: Text describing a media object.

Deixis: A linguistic expression whose understanding requires understanding something besides itself, as with a caption.

Media Search Engine: A Web search engine designed to find media (usually images) on the Web.

Web Search Engine: A Web site that finds other Web sites whose contents match a set of keywords, using a large index to Web pages.

“Alt” String: The text value of the HTML “alt” attribute, used for attaching text to a media object.

Data Mining: Searching for insights in large quantities of data.
