Making Image Retrieval and Classification More Accurate Using Time Series and Learned Constraints

Chotirat “Ann” Ratanamahatana (Chulalongkorn University, Thailand), Eamonn Keogh (University of California, Riverside, USA) and Vit Niennattrakul (Chulalongkorn University, Thailand)
DOI: 10.4018/978-1-60566-174-2.ch007

Abstract

As multimedia data have become digital, interest in their storage, retrieval, and processing has grown dramatically in the database and data mining community. Such data include videos, images, and handwriting, and expectations for exploiting them have risen accordingly. We argue, however, that much of this work’s narrow focus on efficiency and scalability has come at the cost of usability and effectiveness. Typical manipulations involve some form of video or image processing, which requires fairly large amounts of storage and is computationally intensive. In this work, we demonstrate how these multimedia data can be reduced to a more compact form, namely a time series representation, while preserving the features of interest, and how that representation can then be exploited efficiently in Content-Based Image Retrieval. We also introduce a general framework that learns a distance measure with arbitrary constraints on the warping path of the Dynamic Time Warping calculation. We demonstrate the utility of our approach on both classification and query retrieval tasks for time series and other types of multimedia data, including images, video frames, and handwriting archives. In addition, we show that incorporating this framework into a relevance feedback system allows query refinement to further improve precision/recall by a wide margin.
Chapter Preview

1 Introduction

Much of the world’s data is in the form of time series, and many other types of data, such as video, image, and handwriting, can also be trivially transformed into time series. Generally, we can use various image processing techniques (Deselaers, Keysers, & Ney, 2003; Käster, Wendt, & Sagerer, 2003; Krishnamachari & Abdel-Mottaleb, 1999; Wang, Yang, & Acharya, 1997; Yeung & Liu, 1995) to complete multimedia data mining tasks by measuring similarities among the raw images using features such as color, texture, or shape. However, a time series representation of these data can significantly speed up the process, since time series can be compared much more easily and quickly using distance measures. This fact has fueled enormous interest in time series retrieval in the database and data mining community. We argue, however, that much of this work’s narrow focus on efficiency and scalability has come at the cost of usability and effectiveness. For example, the lion’s share of previous work has utilized the Euclidean distance metric, presumably because it is very amenable to indexing (Agrawal, Lin, Sawhney, & Shim, 1995; Chan, Fu, & Yu, 2003; Faloutsos, Ranganathan, & Manolopoulos, 1994). However, there is increasing evidence that the Euclidean metric’s sensitivity to small differences in the time axis makes it unsuitable for most real-world problems (Aach & Church, 2001; Bar-Joseph, Gerber, Gifford, Jaakkola, & Simon, 2002; Diez & González, 2000; Kadous, 1999; Schmill, Oates, & Cohen, 1999).

It has long been known that Dynamic Time Warping (DTW) is superior to Euclidean distance for classification and clustering of time series. However, until lately, most research has utilized Euclidean distance because it is more efficiently calculated. A recently introduced technique that greatly mitigates DTW’s demanding CPU time has sparked a flurry of research activity. However, the technique and its many extensions still only allow DTW to be applied to moderately large datasets. In addition, almost all of the research on DTW has focused exclusively on speeding up its calculation; there has been relatively little work done on improving its accuracy.
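The contrast between the two measures can be illustrated with a minimal sketch. It assumes the textbook dynamic-programming formulation of DTW with squared point-wise costs; the function names and the two toy sequences are illustrative and are not taken from the chapter.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length sequences")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained DTW distance (textbook dynamic programming)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]'s neighbour
                                 cost[i][j - 1],      # b[j-1] repeats a[i-1]'s neighbour
                                 cost[i - 1][j - 1])  # one-to-one step
    return math.sqrt(cost[n][m])


# Toy data (assumption): the same bump, shifted by one time step.
a = [0, 0, 1, 2, 1, 0, 0, 0]
b = [0, 0, 0, 1, 2, 1, 0, 0]

print(euclidean(a, b))  # penalises the shift
print(dtw(a, b))        # warps the time axis to absorb the shift
```

On this pair the Euclidean distance is nonzero even though the shapes are identical, while DTW aligns them exactly. The price is the quadratic number of cells filled in the cost table.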

In this work, we target the accuracy of classification and introduce a new framework that learns arbitrary constraints on the warping path of the DTW calculation. Apart from improving the accuracy of content-based image retrieval, our technique also speeds up DTW by many orders of magnitude as a side effect. We show the utility of our approach on datasets from diverse domains and demonstrate significant gains in accuracy and efficiency. Moreover, additional training or human intervention can be incorporated through the classic information retrieval technique of relevance feedback to achieve even better results.
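To make the idea of a constraint on the warping path concrete, the following is a minimal sketch of DTW restricted to a band around the diagonal. The per-row width list `band` is a hypothetical stand-in for a constraint of arbitrary shape; the sketch does not show how such a constraint is learned, and it is not the chapter's own algorithm.

```python
import math


def dtw_banded(a, b, band):
    """DTW restricted to cells with |i - j| <= band[i] for each row i.

    `band` is a hypothetical per-row half-width (one integer per element
    of `a`). Returns (distance, number_of_cells_evaluated).
    """
    n, m = len(a), len(b)
    if len(band) != n:
        raise ValueError("band needs one width per element of a")
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    cells = 0
    for i in range(1, n + 1):
        r = band[i - 1]
        lo, hi = max(1, i - r), min(m, i + r)  # columns allowed in this row
        for j in range(lo, hi + 1):
            best = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            if best == inf:
                continue  # no allowed path reaches this cell
            cost[i][j] = (a[i - 1] - b[j - 1]) ** 2 + best
            cells += 1
    return math.sqrt(cost[n][m]), cells


# Toy data (assumption): the same bump, shifted by one time step.
a = [0, 0, 1, 2, 1, 0, 0, 0]
b = [0, 0, 0, 1, 2, 1, 0, 0]

print(dtw_banded(a, b, [0] * len(a)))       # zero width: no warping allowed
print(dtw_banded(a, b, [1] * len(a)))       # narrow band: small shifts allowed
print(dtw_banded(a, b, [len(b)] * len(a)))  # wide band: unconstrained DTW
```

With a width of zero everywhere and equal-length inputs, the computation reduces to Euclidean distance; with a band at least as wide as the sequences, it is unconstrained DTW. Intermediate shapes limit how far the path may stray from the diagonal at each position, and narrower bands evaluate fewer cells.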
