1 Introduction
With the development of related disciplines such as virtual reality and machine learning, human-computer interaction is moving in a more natural and pervasive direction. There is an urgent need to control systems or interact with digital content in virtual environments through natural actions, rather than through traditional dedicated input devices. Human-computer interaction is shifting from being computer-centered to being user-centered. Among all body parts, the human hand, as a dexterous and effective executive organ, plays a particularly important role in interaction: in daily life, people constantly use their hands to manipulate objects or to communicate with others. The aim of hand pose estimation is to recover the complete motion posture of the hand in a computer system, so that the computer or other devices can sense the spatial posture of the hand and act according to the user's instructions. Accurate hand pose estimation can not only drive realistic virtual hand movements but also enhance the user experience in human-computer interaction (Chakraborty et al., 2018). This helps computers better understand human behavior, which in turn makes interaction between humans and intelligent systems more intelligent.
As an important interaction modality in computer graphics, virtual reality, and human-computer interaction, gesture interaction provides a convenient, intuitive, and simple user experience. Gesture interaction and recognition are of great significance to virtual reality (De Smedt et al., 2016), 3D motion-sensing games (Duan et al., 2021), assisted medical surgery (Gao et al., 202), and other applications. However, because different gestures have a high degree of freedom (Hussain et al., 2017), acquired gesture image data are typically characterized by low resolution, cluttered backgrounds, occluded hands, fingers of differing shape and size, and individual differences. These factors make it difficult to represent different gesture features accurately, posing difficulties and challenges for gesture recognition (Hou et al., 2018).
Traditional gesture recognition is usually based on camera images, recognizing and classifying 2D gestures. Jiang et al. (2019) analyze the images of an image pyramid successively from low to high resolution, according to the changes in the geometric dimensions of the different hand parts obtained by segmentation. Li et al. (2019) and Moin et al. (2021) proposed an effective distance measure based on gesture shapes, the Finger-Earth Mover's Distance (FEMD), which compares the shape differences between gestures. Nasri et al. (2020) proposed a gesture recognition method based on the main direction of the gesture and Hausdorff-like distance template matching. This method places a strong constraint on the main gesture direction: the extracted main direction must be consistent with that of similar gestures in the training library, which limits the method's applicability.
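As background for the template-matching approach above, the symmetric Hausdorff distance between two contour point sets can be sketched as follows. This is a generic illustration, not the cited method; the toy contours and function names are hypothetical.

```python
import numpy as np

def directed_hausdorff(a, b):
    """Directed Hausdorff distance: the worst-case distance from a
    point in set a to its nearest neighbour in set b."""
    # pairwise Euclidean distances between (N, 2) and (M, 2) point sets
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).max()

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two contour point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# toy gesture contours (hypothetical): a stored template vs. an observation
template = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
observed = np.array([[0.1, 0.0], [1.0, 0.1], [0.9, 1.0]])
print(hausdorff(template, observed))  # small value: contours nearly match
```

In a template-matching recognizer, the observed contour would be compared against every template in the library and assigned the label of the nearest one.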
In recent years, many researchers have combined data modeling with graph structures suited to the characteristics of gesture sequence data, proposing graph convolutional networks (GCNs) (Rudi and Yuniarno) for action prediction. Because a GCN can fully exploit the spatial relationships within a gesture, this approach greatly improves performance. However, a fixed topology is not the best choice for describing diverse action samples, since it limits the scope of message passing between nodes. A graph structure that can be adjusted dynamically according to the data samples is therefore better suited to modeling diverse gestures. In addition, previous GCNs ignored the differing importance of channels: the features produced by some channels are crucial for motion recognition, while those in other channels play only a minor role. During feature extraction, more attention should be paid to the important channel features, and the unimportant channel information should be down-weighted. To adjust the graph structure dynamically according to the data samples, this paper proposes a gesture recognition algorithm based on a spatio-temporal graph convolutional network.
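The two ideas above, a sample-dependent adjacency added to the fixed skeleton topology and channel-wise attention over the resulting features, can be illustrated with a minimal NumPy sketch. All names and the specific formulas here are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_graph_conv(X, A_fixed, W, alpha):
    """One graph-convolution layer with a data-dependent adjacency term.

    X: node features, shape (N, C_in) -- e.g. N hand joints.
    A_fixed: normalized fixed skeleton adjacency, shape (N, N).
    W: projection weights, shape (C_in, C_out).
    alpha: mixing weight for the sample-dependent graph (hypothetical).
    """
    # sample-dependent affinity: pairwise similarity of joint features,
    # row-normalized so it behaves like an adjacency matrix
    A_dyn = softmax(X @ X.T / np.sqrt(X.shape[1]))
    A = A_fixed + alpha * A_dyn          # fixed topology + dynamic refinement
    H = A @ X @ W                        # aggregate neighbours, then project
    # channel attention: squeeze over nodes, then gate each channel
    gate = 1.0 / (1.0 + np.exp(-H.mean(axis=0)))   # sigmoid per channel
    return np.maximum(H * gate, 0.0)     # ReLU activation

# toy usage: 21 hand joints, 8 input channels, 16 output channels
rng = np.random.default_rng(0)
X = rng.normal(size=(21, 8))
A_fixed = np.eye(21)                     # placeholder skeleton adjacency
W = rng.normal(size=(8, 16))
out = dynamic_graph_conv(X, A_fixed, W, alpha=0.5)
print(out.shape)                         # (21, 16)
```

In a full spatio-temporal model, a layer like this would be applied per frame and interleaved with temporal convolutions over the frame axis; here only the spatial step is shown.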