As image databases continue to grow, effective methods for visualising and navigating them are much sought after. These methods should provide an “overview” of a complete database together with the possibility to zoom into certain areas during a specific search. It is crucial that the user can interact with such a system intuitively in order to arrive at images of interest effectively. In this chapter, we look at several techniques from the literature that allow browsing and navigation of large image databases.
Principal Component Analysis (PCA)
Principal component analysis transforms a number of high-dimensional correlated variables into a smaller number of uncorrelated variables called principal components, making it possible to reduce the dimensionality whilst preserving the “essence” of the data. High-dimensional data is typically vast and cannot be grasped directly by the human mind, so some form of low-dimensional representation is necessary.
In order to calculate the principal components, the mean vector of the data is first calculated and subtracted from the samples, resulting in a distribution centred around the origin (the mean thus defines the origin of the new coordinate system rather than a principal component in its own right). Singular value decomposition (SVD) of the centred data then yields a diagonal matrix of singular values in descending order. Each singular value is proportional to the square root of the variance along its direction (equivalently, its square is proportional to an eigenvalue of the covariance matrix), and the corresponding singular vectors in feature space, which are the eigenvectors of the covariance matrix, are the principal components. Once these have been calculated, all samples (i.e., images) in the database can be projected onto the principal components and the projection weights used as coordinates for displaying each image thumbnail (e.g., for display in a two-dimensional space such as a monitor, the first two principal components would be used).
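The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used by any particular browsing system: it assumes that each image has already been reduced to a fixed-length feature vector (one row per image), and the function name `pca_layout` and the random stand-in features are hypothetical.

```python
import numpy as np

def pca_layout(features, n_components=2):
    """Project feature vectors (one row per image) onto the leading principal components."""
    X = np.asarray(features, dtype=float)

    # Centre the data by subtracting the mean vector.
    mean = X.mean(axis=0)
    Xc = X - mean

    # Thin SVD: Xc = U @ diag(s) @ Vt, with singular values s in descending order.
    # The rows of Vt are the principal components (directions in feature space).
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]

    # Projection weights of every sample; these serve as display coordinates.
    coords = Xc @ components.T

    # Variance along each component: squared singular value over (n - 1).
    variances = s**2 / (X.shape[0] - 1)
    return coords, components, variances

# Hypothetical stand-in data: 50 images, each described by a 16-dimensional feature vector.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 16))
coords, components, variances = pca_layout(features)
```

Each row of `coords` then gives the (x, y) position at which the corresponding thumbnail could be drawn; in practice the coordinates would still need to be scaled to the size of the display area.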
Key Terms in this Chapter
Multidimensional Scaling (MDS): A dimensionality reduction technique used for projecting high-dimensional data into a low-dimensional space such that the distances between samples are distorted as little as possible.
Query-by-Example Retrieval: Retrieval paradigm in which the user provides an example (e.g., an image) as the query and the system retrieves instances similar to it.
Principal Component Analysis (PCA): An orthogonal linear transform used to transform data into a new coordinate system which maximises the variance captured by the first few base vectors (the principal components).
Image Database Navigation: The browsing of a complete image collection based, for example, on CBIR concepts.
Content-Based Image Retrieval (CBIR): Retrieval of images based not on keywords or annotations but on features extracted directly from the image data.