Multimedia Image Retrieval System by Combining CNN With Handcraft Features in Three Different Similarity Measures

Maher Alrahhal (CSE Department, JNTUH College of Engineering, Hyderabad, India) and Supreethi K.P. (CSE Department, JNTUH College of Engineering, Hyderabad, India)
Copyright: © 2020 |Pages: 23
DOI: 10.4018/IJCVIP.2020010101

The authors propose WNAHVF, a weighted and normalized combination of AlexNet with handcrafted visual features, which extracts feature vectors from images and uses those vectors for image retrieval and classification. The authors test the WNAHVF method on two general-purpose datasets, Corel-1k and Corel-10k, and on one medical dataset. The results show that combining Bag of Features and Local Neighbor Patterns with AlexNet improves accuracy on both general and medical image datasets, in retrieval as well as classification tasks. The authors report results superior to those of existing methods.

1. Introduction

Multimedia content plays a significant role in a wide range of areas such as social networking, investigation, and medical care. This creates a pressing need for capable retrieval systems that meet human needs. Multimedia data contains a large amount of information of various kinds, including text, sound, images, and video. Image retrieval (IR) is the field of study concerned with searching for and retrieving digital images from a large database. Image retrieval systems can be categorized as text-based image retrieval (TBIR) or content-based image retrieval (CBIR). TBIR is the process of manually adding annotations or descriptions to the images in the database to describe their content and, at times, other image metadata. This approach has several disadvantages: the error rate is high, and it requires a great deal of labor and material resources (He et al., 2018). In addition, the description attached to an image depends on the annotator's perspective, that is, on how the annotator understands the image.

CBIR has been one of the most challenging research areas of the past decade because of the visual complexity of images and the vast size of image databases. It is regarded as an active and fast-advancing research area within the image retrieval domain. CBIR retrieves images from a collection according to their similarity to a query, and the retrieval depends on features extracted automatically from the images themselves. CBIR has been used in several fields, for example satellite imagery, remote sensing, medical imaging, fingerprint verification, and biodiversity information systems. In satellite imagery, CBIR techniques are used to discover minerals, to conduct aerial surveys, to monitor agriculture, to produce weather forecasts, and to track surface objects.
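The idea of retrieving images by the similarity of automatically extracted features can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy three-dimensional feature vectors, and the choice of Euclidean distance are all assumptions made for illustration, since this preview does not state which similarity measures the paper uses.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query: Sequence[float],
             database: Dict[str, Sequence[float]],
             top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k database images closest to the query, nearest first."""
    scored = [(name, euclidean_distance(query, vec))
              for name, vec in database.items()]
    scored.sort(key=lambda item: item[1])  # smaller distance = more similar
    return scored[:top_k]


# Hypothetical feature vectors; a real system would extract these from images.
toy_database = {
    "beach_01": [0.9, 0.1, 0.2],
    "beach_02": [0.8, 0.2, 0.1],
    "bus_01": [0.1, 0.9, 0.7],
    "horse_01": [0.2, 0.3, 0.9],
}

results = retrieve([0.9, 0.1, 0.1], toy_database, top_k=2)
for name, distance in results:
    print(f"{name}: {distance:.4f}")
```

In this sketch the database is a plain dictionary that is scanned in full for every query; a large image collection would normally need an index structure, which is outside the scope of the illustration.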

Medical imaging is one of the prominent application areas of CBIR, where it can be used to monitor patient health reports and to assist diagnosis by identifying similar past cases. Given a query fingerprint image, a CBIR system can retrieve similar fingerprint images, which allows a person's identity to be verified. Fingerprints are used in the banking sector, universities, corporate organizations, and forensic laboratories.

A significant number of CBIR systems based on feature descriptors have been designed and developed. A feature captures a specific visual property of an image. A descriptor encodes an image in a way that allows it to be compared and matched with other images. A human can understand and interpret image content, whereas a machine cannot. This large gap between human perception and machine description is referred to as the Semantic Gap (SG). Various methods have been developed to reduce the SG between high-level human perception and low-level machine representation. Researchers have studied this problem extensively; however, the results are not yet satisfactory. The most important stage is feature extraction, which significantly affects the system and its accuracy (Shah et al., 2017). Image feature descriptors can be either low level or high level, depending on how the content of the image is extracted. Global feature descriptors describe the visual content of the complete image, whereas local feature descriptors describe a patch inside the image (i.e., a small group of pixels) (Alkhawlani et al., 2015). For global low-level features we use the local neighbor pattern, which gave very good results compared with LTrP and other methods, especially on texture databases. For local low-level features we use Bag of Features with SURF. To make our proposed method work with high-level features, we use a deep neural network (AlexNet) and some supervised machine learning techniques to reduce the semantic gap between humans and machines in the image retrieval domain.
