Exploiting Captions for Web Data Mining

Neil C. Rowe

Source Title: Data Warehousing and Mining: Concepts, Methodologies, Tools, and Applications

ISBN13: 9781599049519 | ISBN10: 1599049511 | EISBN13: 9781599049526

DOI: 10.4018/978-1-59904-951-9.ch084

MLA

Rowe, Neil C. "Exploiting Captions for Web Data Mining." Data Warehousing and Mining: Concepts, Methodologies, Tools, and Applications, edited by John Wang, IGI Global, 2008, pp. 1461-1485. https://doi.org/10.4018/978-1-59904-951-9.ch084

APA

Rowe, N. C. (2008). Exploiting Captions for Web Data Mining. In J. Wang (Ed.), Data Warehousing and Mining: Concepts, Methodologies, Tools, and Applications (pp. 1461-1485). IGI Global. https://doi.org/10.4018/978-1-59904-951-9.ch084

Chicago

Rowe, Neil C. "Exploiting Captions for Web Data Mining." In Data Warehousing and Mining: Concepts, Methodologies, Tools, and Applications, edited by John Wang, 1461-1485. Hershey, PA: IGI Global, 2008. https://doi.org/10.4018/978-1-59904-951-9.ch084

Abstract

We survey research on using captions in data mining from the Web. Captions are text that describes some other information (typically multimedia). Since text is considerably easier to analyze than non-text, a good way to support access to non-text is to index the words of its captions. However, captions vary considerably in form and content on the Web. We discuss the range of syntactic clues (such as HTML tags) and semantic clues (such as particular words). We discuss how to quantify clue strength and combine clues for a consensus. We then discuss the problem of mapping information in captions to information in media objects. Although this mapping is hard, classes of mapping schemes are distinguishable, and a segmentation of the media can be matched to a parse of the caption.
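The abstract does not say how clue strengths are quantified or combined, so the following is only a minimal sketch of one common approach: treating each clue as independent evidence and multiplying odds ratios in the naive Bayes style. The clue names, the strength values, and the prior are all hypothetical and chosen purely for illustration; they are not taken from the chapter.

```python
from math import prod

# Hypothetical clue strengths: P(text is a caption | clue present).
# All numbers here are illustrative assumptions, not values from the chapter.
CLUE_STRENGTH = {
    "img_alt_attribute": 0.80,   # syntactic clue: text in an <img alt="..."> attribute
    "figcaption_tag": 0.95,      # syntactic clue: text inside a <figcaption> element
    "starts_with_figure": 0.70,  # semantic clue: text begins with a word like "Figure"
    "very_long_text": 0.05,      # negative clue: long passages are rarely captions
}

PRIOR = 0.20  # assumed base rate of captions among candidate text spans


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def caption_probability(clues, prior=PRIOR, strengths=CLUE_STRENGTH):
    """Combine clues by a naive Bayes odds product (assumes independent clues)."""
    prior_odds = odds(prior)
    # Each clue scales the odds by the ratio of its own odds to the prior odds.
    ratios = [odds(strengths[c]) / prior_odds for c in clues]
    combined = prior_odds * prod(ratios)
    return combined / (1.0 + combined)


if __name__ == "__main__":
    for clues in (
        [],
        ["img_alt_attribute"],
        ["img_alt_attribute", "starts_with_figure"],
        ["img_alt_attribute", "very_long_text"],
    ):
        print(clues, round(caption_probability(clues), 3))
```

With no clues the function returns the prior, and with one clue it returns that clue's own strength. Agreeing clues raise the estimate above either one alone, and a negative clue pulls it back down. The independence assumption is a simplification, since real clues on Web pages are often correlated.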
