1. Introduction
Human Activity Recognition (HAR) is a supervised learning task in which a model trained on snapshots of sensor data predicts human body activities. Human activity and action recognition is challenging because of the variety of activities, the speed of operation, dynamic capturing, and the wide range of application areas (Kulsoom, Narejo, Mehmood, Chaudhry & Bashir, 2022). Human activity analysis, recognition, and understanding are critical fields of research in computer vision and the Internet of Things (IoT). HAR methods fall into two categories: vision-based (video or camera-based) and wearable or IoT sensor-based, that is, wearable devices with gyroscopes, accelerometers, and other sensors (Ahad, 2020). HAR systems are used extensively in real-time applications, including surveillance, virtual reality interaction, entertainment, rehabilitation systems, and autonomous driving. HAR is a challenging time series classification problem in which the goal is to identify a person's precise action or movement from remotely recorded sensor data.
Generating accurate features from raw data to fit a machine learning or deep learning model generally requires deep domain understanding and methods from signal processing. The movements are commonly everyday indoor activities such as walking, standing, and sitting, or task-specific activities such as exercising, cooking, or working on a factory floor. Activity recognition is the task of predicting a person's movement, frequently indoors, from sensor data recorded by devices such as a smartphone's accelerometer. Data are collected by having subjects carry specialised hardware or smartphones with accelerometers and gyroscopes, which record the subject's motion directly. Formerly, gathering sensor data for activity detection required specialised hardware, which was complex to work with. Smartphones and other portable trackers for fitness and health monitoring are now affordable and widely available. As a result, sensor data from these devices are used extensively for the general activity recognition problem, because they are less expensive to acquire, more prevalent, and easier to use. Long short-term memory (LSTM) networks are well suited to sequential data, while convolutional neural networks (CNNs) work well on image data and image classification problems. Combining the two makes it possible to address problems such as human action recognition by learning features directly from raw sensor data and predicting the associated movements.
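To make the feature generation step concrete, the following is a minimal sketch of how a raw tri-axial accelerometer stream might be cut into fixed-length windows and summarised with simple hand-crafted statistics before classification. It is an illustration under stated assumptions, not the pipeline used in this article: the function names `sliding_windows` and `window_features`, the window size and step, and the synthetic readings are all hypothetical. A CNN-LSTM model would typically consume the raw windows directly and learn such features itself.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def sliding_windows(samples, labels, size, step):
    """Yield (window, majority_label) pairs from a labelled sample stream."""
    for start in range(0, len(samples) - size + 1, step):
        window = samples[start:start + size]
        # Label each window with the most frequent activity inside it.
        majority = Counter(labels[start:start + size]).most_common(1)[0][0]
        yield window, majority


def window_features(window):
    """Summarise one window of (x, y, z) readings as an 8-value feature vector."""
    features = []
    for axis in range(3):
        values = [sample[axis] for sample in window]
        features += [mean(values), pstdev(values)]  # per-axis mean and spread
    # Acceleration magnitude does not depend on device orientation.
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in window]
    features += [mean(magnitudes), pstdev(magnitudes)]
    return features


# Synthetic stream (illustrative values, not real measurements):
# 8 "standing" samples with gravity on z, then 8 "walking" samples
# with an oscillating x component.
standing = [(0.0, 0.0, 9.8)] * 8
walking = [(1.0 if i % 2 else -1.0, 0.0, 9.8) for i in range(8)]
samples = standing + walking
labels = ["standing"] * 8 + ["walking"] * 8

# Assumed window of 8 samples with 50% overlap (step of 4).
dataset = [
    (window_features(window), label)
    for window, label in sliding_windows(samples, labels, size=8, step=4)
]

for features, label in dataset:
    print(label, [round(value, 3) for value in features])
```

Each resulting pair of feature vector and label could then be passed to any conventional classifier. The overlap between consecutive windows is a common design choice in sensor-based HAR, but the appropriate window length depends on the sampling rate and the activities of interest.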
Figure 1 illustrates the various categories of human activity, which range from basic gestures to collaborative activities. Sensor-based HAR systems use electromagnetic sensors, electrocardiographs, and inertial sensors to acquire data for activity recognition. HAR systems have a wide range of applications, including surveillance, ATM security, gaming, virtual reality interaction, online advertising, entertainment, medical patient and behaviour monitoring, rehabilitation, autonomous driving, assessing how productively employees work in offices, and monitoring students in schools (Li et al., 2019; Liu et al., 2021; Chen et al., 2012; Lara & Labrador, 2012). Human activity comprises atomic activities, human-object and human-human interactions, behaviours, and group actions (Lee et al., 2022).
Figure 1. Types of Human Activities (Vrigkas, Nikou & Kakadiaris, 2015)
A HAR system must possess three functionalities (Alzughaibi, Hakami & Chaczko, 2015):