Captions are text that describes other information; they are especially useful for describing non-text media objects (images, audio, video, and software). Captions are valuable metadata for managing multimedia because they help users better understand and remember media (McAninch, Austin, & Derks, 1992-1993) and permit better indexing of it. Captions are essential for effective data mining of multimedia data because only a small fraction of the text in typical documents with multimedia describes the media objects: 1.2% in a survey of random World Wide Web pages (Rowe, 2002). Thus, standard Web browsers do poorly at finding media without knowledge of captions. Multimedia information is increasingly common in documents as computer technology improves in speed and in its ability to handle such data, and as people need multimedia for purposes such as illustrating educational materials and preparing news stories.