Spatio-Temporal Analysis for Human Action Detection and Recognition in Uncontrolled Environments

Dianting Liu, Yilin Yan, Mei-Ling Shyu, Guiru Zhao, Min Chen
Copyright: © 2015 | Volume: 6 | Issue: 1 | Pages: 18
ISSN: 1947-8534 | EISSN: 1947-8542 | EISBN13: 9781466677135 | DOI: 10.4018/ijmdem.2015010101
Cite Article

MLA

Liu, Dianting, et al. "Spatio-Temporal Analysis for Human Action Detection and Recognition in Uncontrolled Environments." IJMDEM, vol. 6, no. 1, 2015, pp. 1-18. http://doi.org/10.4018/ijmdem.2015010101

APA

Liu, D., Yan, Y., Shyu, M., Zhao, G., & Chen, M. (2015). Spatio-Temporal Analysis for Human Action Detection and Recognition in Uncontrolled Environments. International Journal of Multimedia Data Engineering and Management (IJMDEM), 6(1), 1-18. http://doi.org/10.4018/ijmdem.2015010101

Chicago

Liu, Dianting, et al. "Spatio-Temporal Analysis for Human Action Detection and Recognition in Uncontrolled Environments." International Journal of Multimedia Data Engineering and Management (IJMDEM) 6, no. 1 (2015): 1-18. http://doi.org/10.4018/ijmdem.2015010101

Abstract

Understanding the semantic meaning of human actions captured in unconstrained environments has broad applications in fields ranging from patient monitoring and human-computer interaction to surveillance systems. However, while great progress has been made on automatic human action detection and recognition in videos captured in controlled/constrained environments, most existing approaches perform unsatisfactorily on videos captured under uncontrolled/unconstrained conditions (e.g., significant camera motion, background clutter, scaling, and varying lighting conditions). To address this issue, the authors propose a robust human action detection and recognition framework that works effectively on videos taken in either controlled or uncontrolled environments. Specifically, the authors integrate the optical flow field and the Harris3D corner detector to generate a new spatio-temporal information representation for each video sequence, from which a general Gaussian mixture model (GMM) is learned. The mean vectors of all Gaussian components in the learned GMM are concatenated to create the GMM supervector used for video action recognition. To improve recognition accuracy, the authors build a boosting classifier from a set of sparse representation classifiers and Hamming distance classifiers. Experimental results on two widely used public data sets, KTH and UCF YouTube Action, show that the proposed framework outperforms other state-of-the-art approaches on both action detection and recognition.
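The supervector step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic 2-D points stand in for the paper's optical-flow and Harris3D descriptors, and the diagonal-covariance EM routine, the evenly spaced initialization, the lexicographic ordering of components, and the names `fit_diag_gmm` and `gmm_supervector` are all assumptions made for illustration.

```python
import math
import random


def fit_diag_gmm(points, k, iters=30):
    """Fit a k-component diagonal-covariance GMM with plain EM (illustrative)."""
    n, dim = len(points), len(points[0])
    # Assumption: deterministic init from k evenly spaced samples.
    means = [list(points[(i * n) // k]) for i in range(k)]
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in points:
            logp = []
            for j in range(k):
                lp = math.log(weights[j])
                for d in range(dim):
                    v = variances[j][d]
                    lp -= 0.5 * (math.log(2 * math.pi * v)
                                 + (x[d] - means[j][d]) ** 2 / v)
                logp.append(lp)
            top = max(logp)
            unnorm = [math.exp(l - top) for l in logp]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-9
            weights[j] = nj / n
            for d in range(dim):
                mu = sum(r[j] * x[d] for r, x in zip(resp, points)) / nj
                var = sum(r[j] * (x[d] - mu) ** 2
                          for r, x in zip(resp, points)) / nj
                means[j][d] = mu
                variances[j][d] = max(var, 1e-6)
    return weights, means, variances


def gmm_supervector(means):
    """Concatenate all component mean vectors into one flat vector."""
    # Assumption: sort components so the layout is comparable across videos.
    ordered = sorted(means)
    return [value for mean in ordered for value in mean]


# Synthetic stand-in for per-video descriptors: two well-separated clusters.
rng = random.Random(0)
descriptors = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(50)]
descriptors += [[rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)] for _ in range(50)]

_, means, _ = fit_diag_gmm(descriptors, k=2)
supervector = gmm_supervector(means)
print(len(supervector))  # k * dim values
```

A supervector built this way has a fixed length of k times the descriptor dimension regardless of how many descriptors a video yields, which is what lets videos of different lengths be compared by a downstream classifier. The component ordering must be consistent across videos for that comparison to be meaningful; the sort used here is only one simple way to get that in a toy setting.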
