PCA as Dimensionality Reduction for Large-Scale Image Retrieval Systems

Mohammed Amin Belarbi (Abdelhamid Ibn Badiss University, Faculty of Exact Science and Computer Science, Mostaganem, Algeria), Saïd Mahmoudi (University of Mons, Faculty of Engineering, Mons, Belgium) and Ghalem Belalem (Ahmed Ben Bella University, Faculty of Exact and Applied Science, Oran, Algeria)
Copyright: © 2017 |Pages: 14
DOI: 10.4018/IJACI.2017100104


Dimensionality reduction plays an important role in the performance of large-scale image retrieval across different applications. In this paper, we explore Principal Component Analysis (PCA) as a dimensionality reduction method. First, Scale Invariant Feature Transform (SIFT) features and Speeded Up Robust Features (SURF) are extracted as image features. Second, PCA is applied to reduce the dimensions of the SIFT and SURF feature descriptors. By comparing multiple sets of experimental data on different image databases, we conclude that PCA, applied within a certain reduction range, can effectively reduce the computational cost of image features while maintaining high retrieval performance.
Article Preview


Nowadays, with the development of cloud computing and of multimedia content creation and storage, many applications that exploit image and video content are used daily (Cheng, Zhuo, & Zhang, 2013). It is not uncommon to find multimedia databases containing thousands or even tens of thousands of images, videos and sounds, whether targeted at a professional field (medical, security, journalism, tourism, education, museums, etc.) or kept by individuals who accumulate personal data such as memories, travels, family, events, movie collections, etc. (Murray, Qiao, Lee, Fallon, & Karunakar, 2011). These applications generate a huge volume of multimedia data.

To let users of these huge databases reach the desired images quickly, efficient use of and access to multimedia content has become a crucial task in Big Data fields (Patra et al., 2016).

Large-scale content-based multimedia retrieval is one of the most important technological fields using Big Data (Jain & Bhatnagar, 2016). Content-Based Image Retrieval (CBIR) is now the most widely used approach; it detects the visual characteristics of images by using image processing techniques.

Each retrieval system generally computes visual features from a given query and compares them to a set of image features stored in the database. As a result, a list of similar images is shown to the user.
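
The query-versus-database comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `rank_by_similarity`, the Euclidean distance, and the toy 3-dimensional vectors are hypothetical choices, not the system evaluated in the paper.

```python
import numpy as np

def rank_by_similarity(query, database, top_k=3):
    """Return indices of the top_k database vectors closest to the query.

    Assumption: similarity is measured with the Euclidean distance.
    """
    distances = np.linalg.norm(database - query, axis=1)
    return np.argsort(distances)[:top_k]

# Toy database: four images, each described by a 3-dimensional feature vector.
database = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.9, 1.1, 1.0],
    [5.0, 5.0, 5.0],
])
query = np.array([1.0, 1.0, 0.9])

# Indices of the two most similar images, best match first.
ranking = rank_by_similarity(query, database, top_k=2)
print(ranking)
```

In a real system the rows of `database` would be descriptors extracted from the stored images, and the returned indices would be mapped back to the images shown to the user.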

CBIR methods mainly include two phases: feature extraction and similarity measurement. When the number of images in the database increases, the number of image features becomes very large, and these features are expressed in a high-dimensional space. If we process the data directly in this case, we face the “Curse of Dimensionality” phenomenon, which degrades the performance of search algorithms (Belarbi, Mahmoudi, & Belalem, 2016).
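
One commonly cited symptom of the “Curse of Dimensionality” is that the nearest and the farthest neighbours of a query end up at almost the same distance, so distance-based ranking becomes less discriminative. The sketch below illustrates this on uniformly random points; the function `relative_contrast`, the sample size, and the dimensions 2 and 128 are assumptions made for the illustration and are not taken from the paper.

```python
import numpy as np

def relative_contrast(dim, n_points=500, seed=0):
    """(farthest - nearest) / nearest distance from a random query.

    Points are drawn uniformly from the unit hypercube (an assumption).
    """
    rng = np.random.default_rng(seed)
    data = rng.random((n_points, dim))
    query = rng.random(dim)
    distances = np.linalg.norm(data - query, axis=1)
    return (distances.max() - distances.min()) / distances.min()

low_dim_contrast = relative_contrast(2)     # well-separated neighbours
high_dim_contrast = relative_contrast(128)  # distances bunch together

print(low_dim_contrast, high_dim_contrast)
```

The contrast in 128 dimensions is much smaller than in 2 dimensions, which is one reason to reduce dimensionality before searching.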

One powerful way to address these problems is dimensionality reduction. The idea behind this approach is to pre-process the image features by mapping them to a lower-dimensional space. This can play a significant role in overcoming the “Curse of Dimensionality”.

Dimensionality reduction can be applied using two kinds of approaches: supervised and unsupervised. Unsupervised methods are used to reduce the loss of information in the data, whereas supervised methods are used when inter-class information can be maximized. The unsupervised methods generally used are PCA (Hotelling, 1933) (Kriti, Virmani, Dey, & Kumar, 2016), Multidimensional Scaling (MDS) (Brandes & Pich, 2007) and kernel PCA (KPCA) (Schölkopf, Smola, & Müller, 1998). The objective of PCA is to find the optimal projection matrix. The goal of MDS methods is to preserve the Euclidean distances between the original data points after dimensionality reduction. KPCA is an improved, kernel-based variant of PCA. On the other hand, Fisher Linear Discriminant Analysis (FLDA) (Belhumeur, Hespanha, & Kriegman, 1997) and Local Fisher Discriminant Analysis (LFDA) (Rahulamathavan, Phan, Chambers, Parish, & others, 2013) are supervised methods.
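
To make the notion of a PCA projection matrix concrete, the following is a minimal sketch that computes it with a singular value decomposition of the centred data. The function name `pca_projection_matrix` and the synthetic 5-dimensional data are assumptions for illustration; the paper does not state which PCA implementation was used.

```python
import numpy as np

def pca_projection_matrix(X, n_components):
    """Return the data mean and a (d, k) matrix of principal directions.

    X has one sample per row. The directions are the right singular
    vectors of the centred data, ordered by decreasing variance.
    """
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W = Vt[:n_components].T
    return mean, W

# Synthetic data whose variance is concentrated in the first coordinates.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])

mean, W = pca_projection_matrix(X, n_components=2)
Z = (X - mean) @ W  # reduced representation, shape (200, 2)
print(W.shape, Z.shape)
```

The columns of `W` are orthonormal, so projecting with `W` keeps the directions of largest variance and discards the rest.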

In this paper, we study PCA as a dimensionality reduction method as the data volume increases. SIFT and SURF features are first computed, and PCA is then applied to reduce the dimension of these image features. We applied PCA to SIFT and SURF features with a dimension reduction in the range of 10% to 90%, and the performance of the reduced descriptors was comparable to that of the original descriptors. The performance at different compression ratios was analyzed using recall-precision curves and computing time.
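
The sweep over reduction ratios can be sketched as follows. Several points are assumptions rather than details from the paper: a reduction of r% is read as removing r% of the original dimensions, the descriptors are synthetic stand-ins with the standard 128-dimensional SIFT length, and the helper names `reduced_dim` and `pca_reduce` are hypothetical. The sketch reports the fraction of variance retained, not the recall-precision curves used in the study.

```python
import numpy as np

def reduced_dim(original_dim, reduction):
    """Dimensions kept after removing a fraction `reduction` (assumed meaning)."""
    return max(1, round(original_dim * (1.0 - reduction)))

def pca_reduce(X, k):
    """Project X onto its first k principal directions.

    Returns the reduced data and the fraction of total variance retained.
    """
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    retained = float((s[:k] ** 2).sum() / (s ** 2).sum())
    return (X - mean) @ Vt[:k].T, retained

# Synthetic stand-in for 128-dimensional SIFT descriptors (not real image data).
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(300, 128)) * np.linspace(3.0, 0.1, 128)

results = {}
for pct in range(10, 100, 10):  # reductions of 10%, 20%, ..., 90%
    k = reduced_dim(128, pct / 100)
    Z, retained = pca_reduce(descriptors, k)
    results[pct] = (Z.shape[1], retained)

for pct, (k, retained) in results.items():
    print(f"reduction {pct}%: {k} dimensions, variance retained {retained:.3f}")
```

With real descriptors, each reduced set `Z` would then be fed to the retrieval step and scored, so that retrieval quality and computing time can be compared across ratios.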
