Design of a MobilNetV2-Based Retrieval System for Traditional Cultural Artworks

Zhenjiang Cao, Zhenhai Cao
Copyright: © 2024 | Pages: 17
DOI: 10.4018/IJGCMS.334700

Abstract

To address the difficulty art teachers face in attending to every student during art appreciation classes in colleges and universities, this paper proposes a retrieval system for traditional cultural works of art. Dense connections replace the residual connections between bottlenecks in the MobileNetV2 network and aid gradient transmission in the network. A dilution factor controls the size of the network to counter the rapid growth in the number of network channels. In addition, a non-local attention mechanism is combined with the improved MobileNetV2 structure, which improves the classification accuracy of the network. Compared with VGG16, ResNet18, and ResNet34, classification accuracy increases by 21.3%, 9.2%, and 3%, respectively. The method achieves good results in the classification of artworks. Given an image of the artwork to be appreciated, the system helps students learn the relevant cultural knowledge independently and reduces the burden on teachers.

1. Introduction

New media technologies, represented by computer vision, increasingly shape people's lives and work. As one of these technologies, artificial intelligence has the same disruptive potential in education and teaching. In current Chinese university education, art education is responsible for developing students' aesthetic skills and passing on traditional culture, an area in which science and technology students lag behind those majoring in art (Feng et al., 2021). Art majors receive systematic professional instruction, usually in small classes, where teachers can generally attend to every student. For science and technology students, whose training requirements differ, art education is usually offered as elective courses, mainly in the form of art appreciation, and the large number of elective courses makes it difficult to ensure that each student gains a detailed understanding of the works discussed in class. Using computer vision to classify artworks that feature traditional culture is therefore of considerable value for conveying the associated cultural knowledge in college art education. In such artworks, however, the traditional cultural elements often occupy only a small part of the image, which makes recognition by a network more difficult.

Neural networks perform well on nonlinear problems such as image recognition. Designing them, however, is difficult: the architecture must be determined, the training parameters selected, and the network weights carefully tuned, all of which require substantial human involvement. In neural network structure search, the parameters that make up the search space define the structure of the network (Chen et al., 2023). These include the number of network layers, the dimensions of the convolutional kernels, and the choice of operator for each layer. The search for an optimal network structure has become a focus of deep learning research in recent years. Efficient, low-cost search algorithms can automatically find a network structure that generalizes well and is hardware-friendly. This approach removes the need for manual network design and outperforms earlier, labor-intensive design paradigms across a range of application scenarios (Zhong et al., 2023).
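
To make the notion of a search space concrete, the following is a minimal sketch of one, assuming only the three dimensions named above (layer count, kernel size, operator per layer). The option values and the names SEARCH_SPACE, sample_architecture, and search_space_size are hypothetical and chosen for illustration; they are not taken from the paper or from the cited work.

```python
import random

# Hypothetical search space: the option values are illustrative only.
SEARCH_SPACE = {
    "num_layers": [4, 8, 12],
    "kernel_size": [3, 5, 7],
    "operator": ["conv", "depthwise_conv", "max_pool"],
}


def sample_architecture(rng):
    """Draw one candidate: a layer count, then a kernel size and operator per layer."""
    num_layers = rng.choice(SEARCH_SPACE["num_layers"])
    return [
        {
            "kernel_size": rng.choice(SEARCH_SPACE["kernel_size"]),
            "operator": rng.choice(SEARCH_SPACE["operator"]),
        }
        for _ in range(num_layers)
    ]


def search_space_size():
    """Count every distinct architecture this space can express."""
    per_layer = len(SEARCH_SPACE["kernel_size"]) * len(SEARCH_SPACE["operator"])
    return sum(per_layer ** n for n in SEARCH_SPACE["num_layers"])


# A seeded generator keeps the sampled candidate reproducible.
candidate = sample_architecture(random.Random(0))
total_candidates = search_space_size()
```

Even this toy space contains hundreds of billions of candidates, which suggests why exhaustive evaluation is impractical and efficient search algorithms are needed.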

Feature extraction is a pivotal step in traditional image classification. Researchers typically have to design feature extraction methods tailored to the characteristics of the images in a specific classification task. One example is the Scale-Invariant Feature Transform (SIFT) (Kasiselvanathan, 2020), which extracts object features for object recognition. SIFT relies on salient points that can be found in an image whether the object is distant or nearby, blurred or sharp. Similarly, the Local Binary Pattern (LBP) operator (Cheng et al., 2022) is widely used to extract edge information from facial images, primarily for face recognition. LBP encodes the relationship between each pixel and its immediate neighbors to capture useful facial features.
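
The neighbor comparison behind LBP can be sketched in a few lines. This is a minimal illustration of the basic 3x3 operator, assuming a grayscale image stored as a list of rows; the function names, the clockwise bit ordering, and the sample patch are assumptions made for the example, not details from the cited work.

```python
# Clockwise neighbour offsets (row, col), starting at the top-left pixel.
# The starting point and direction are a convention chosen for this sketch.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram (256 bins) of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Illustrative 3x3 grayscale patch with a single interior pixel.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
center_code = lbp_code(patch, 1, 1)  # bits 0, 4, 5, 6, 7 are set: 241
```

In a classification pipeline, the histogram of codes over a region, rather than the individual codes, usually serves as the feature vector.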

For pedestrian detection, the Histogram of Oriented Gradients (HOG) feature (Yang & Wei, 2022) is the method of choice. HOG captures the changes in gradient direction at the boundary between the target object and its background. This extracts the edge gradient features of the human body in images, which makes HOG well suited to pedestrian detection tasks.
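
The core of HOG, a histogram of gradient directions weighted by gradient magnitude, can be sketched as follows. This is a simplified illustration for a single cell, assuming central-difference gradients, unsigned orientations in the range 0 to 180 degrees, and nine bins; it omits the bin interpolation and block normalization used in full HOG pipelines. The function name and the test cell are hypothetical.

```python
import math


def hog_cell_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    rows, cols = len(cell), len(cell[0])
    # Border pixels are skipped because central differences need both neighbours.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]  # horizontal intensity change
            gy = cell[r + 1][c] - cell[r - 1][c]  # vertical intensity change
            magnitude = math.hypot(gx, gy)
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# Illustrative cell with a vertical edge: dark left half, bright right half.
edge_cell = [[0, 0, 10, 10] for _ in range(4)]
edge_hist = hog_cell_histogram(edge_cell)
```

For this cell, all of the gradient energy falls into the first bin (horizontal gradient direction), which is how a vertical boundary between object and background shows up in the descriptor.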
