Introduction
Human Activity Recognition (HAR) is one of the most widely studied topics in computer vision because of its broad range of applications, including video surveillance, human-computer entertainment, health care analysis, and social welfare. The main purpose of HAR is to have machines examine and identify human activities. Although considerable effort has been made over the past several years, recognizing human actions precisely remains a challenging problem.
Human activity recognition using smartphones benefits several applications, such as healthcare systems, fitness tracking, and smart homes. Instead of dedicated sensing devices, which are invasive and incur additional cost, smartphones with several types of built-in sensors have been found to be widely useful. In earlier work, Chen et al. (2017) investigated Coordinate Transformation and Principal Component Analysis (CT-PCA) for human activity recognition that is robust to direction and position.
The CT-PCA scheme addressed one of the most significant issues in smartphone-based HAR: the effect of orientation variations. To further improve recognition performance, an efficient Online-independent Support Vector Machine (OSVM) algorithm was also designed. Despite the improvement in recognition performance, representative features are not properly selected for complex tasks. This results in errors and therefore compromises recognition accuracy. A potential solution is to introduce machine learning algorithms that extract representative features even for complex tasks, thereby reducing the error rate.
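To illustrate why a PCA step can help with orientation variations, the following minimal sketch shows that the variance along the first principal component of an accelerometer window is unchanged when the phone is rotated. It is not the CT-PCA method of Chen et al. (2017): the coordinate transformation is not reproduced, and the sample window, the rotation angle, and all function names are assumptions made for illustration only.

```python
import math

def covariance_matrix(samples):
    """Return the 3x3 sample covariance matrix of a list of (x, y, z) readings."""
    n = len(samples)
    mean = [sum(s[i] for s in samples) / n for i in range(3)]
    centered = [[s[i] - mean[i] for i in range(3)] for s in samples]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(3)]
            for i in range(3)]

def top_variance(samples, iterations=200):
    """Estimate the variance along the first principal component.

    Uses power iteration on the covariance matrix, so no external
    linear-algebra library is needed for this small example.
    """
    cov = covariance_matrix(samples)
    v = [1.0, 1.0, 1.0]  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the largest eigenvalue estimate.
    cv = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
    return sum(v[i] * cv[i] for i in range(3))

def rotate_about_z(sample, angle):
    """Rotate one (x, y, z) reading about the z-axis (simulated phone rotation)."""
    x, y, z = sample
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

# Hypothetical accelerometer window; the values are synthetic, not measured.
window = [(math.sin(0.3 * t), 0.2 * math.cos(0.3 * t), 0.1 * math.sin(0.7 * t))
          for t in range(50)]
rotated = [rotate_about_z(s, 1.0) for s in window]

original_variance = top_variance(window)
rotated_variance = top_variance(rotated)
print(abs(original_variance - rotated_variance) < 1e-9)
```

Because a rotation only changes the axes in which the readings are expressed, the spread of the data along its principal directions stays the same, which is the property an orientation-robust feature needs.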
In the Posture Tendency Descriptor (PTD) approach of Yao et al. (2018), an intuitive dividing algorithm was first designed to divide the action sequence efficiently into action snippets. The dividing algorithm preserved the geometric structure within the activity, which improved recognition accuracy. From the resulting action snippets, an interpretable and discriminative PTD was built to represent each action snippet.
Finally, multiple PTDs were integrated in a vertical and time-related order to form the human activity representation, thereby reducing the computational complexity of human activity recognition. Although recognition accuracy improved, the focus on the global dynamic tendency, which is measured through the covariance of all body joint locations, could increase the computational complexity or computational time.
A potential solution is to classify only the key frames by applying Nearest Neighbor with the confidence of the nearest frames, thereby minimizing the computational time of human action recognition. Wang et al. (2016) carried out a comparative study on human activity recognition, with a novel feature selection approach that yielded an improvement in time.
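The key-frame idea mentioned above can be sketched as follows. This is only one possible reading: the text does not define how confidence is computed, so the ratio-based confidence score, the threshold of 0.5, the confidence-weighted vote, and all names and data below are assumptions made for illustration.

```python
import math
from collections import defaultdict

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_with_confidence(frame, labelled_frames):
    """Return (label, confidence) for one frame using 1-nearest-neighbour.

    Assumed confidence: 1 - d_nearest / d_other, where d_other is the
    distance to the closest labelled frame carrying a different label.
    """
    best_label, d_best = None, math.inf
    for features, label in labelled_frames:
        d = euclidean(frame, features)
        if d < d_best:
            best_label, d_best = label, d
    d_other = min(
        (euclidean(frame, f) for f, lab in labelled_frames if lab != best_label),
        default=math.inf,
    )
    if d_other == 0:
        return best_label, 0.0  # frame coincides with two different classes
    return best_label, 1.0 - d_best / d_other

def classify_sequence(frames, labelled_frames, threshold=0.5):
    """Classify a sequence using only frames whose confidence passes the threshold."""
    votes = defaultdict(float)
    for frame in frames:
        label, confidence = nearest_with_confidence(frame, labelled_frames)
        if confidence >= threshold:  # treat confident frames as key frames
            votes[label] += confidence
    return max(votes, key=votes.get) if votes else None

# Hypothetical 2-D frame features with activity labels.
labelled = [((0.0, 0.0), "walk"), ((1.0, 0.0), "walk"),
            ((5.0, 5.0), "run"), ((6.0, 5.0), "run")]
sequence = [(0.2, 0.1), (5.5, 5.0), (0.9, 0.1), (3.0, 2.5)]

predicted = classify_sequence(sequence, labelled)
print(predicted)
```

Frames that lie roughly midway between two classes receive a low confidence and are skipped, so only the frames that contribute a clear vote are counted toward the final label.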
Most prevailing techniques make the unrealistic assumption that persons walk along a stable orientation or a presumed path. Lu et al. (2014) investigated a sparse reconstruction based metric learning method that reduced the reconstruction error. Moussa et al. (2015) used the Scale Invariant Feature Transform (SIFT) to detect interest points for HAR. Haiam et al. (2015) designed a trajectory-based representation to extract effective trajectories in realistic conditions, which had a positive impact on recognition accuracy. However, anticipation with respect to trajectory remained unaddressed. To solve this issue, Hema et al. (2016) applied an Anticipatory Temporal Conditional Random Field (ATCRF), which improved anticipation time.