Static Shot based Keyframe Extraction for Multimedia Event Detection

S. Kaavya (VIT University, Vellore, India) and G. G. Lakshmi Priya (VIT University, Vellore, India)
Copyright: © 2016 | Pages: 13
DOI: 10.4018/IJCVIP.2016010103


Processing multimedia information is computationally expensive because of its large size, especially in the case of video. Video summarization is adopted to reduce the size of a video and to spare users the time needed to watch it in full. Summarization can be performed by extracting keyframes from the video. For this task, a new, simple keyframe extraction method based on a divide-and-conquer strategy is proposed: a feature representation vector based on the Scale Invariant Feature Transform (SIFT) is extracted, and the whole video is categorized into static and dynamic shots. Each dynamic shot is processed further until it becomes static. A representative frame is extracted from every static shot, and redundant keyframes are removed using a keyframe similarity matching measure. The proposed work is evaluated experimentally and compared with related existing work. The authors' method outperforms existing methods in terms of Precision (P), Recall (R), and F-Score (F). The Fidelity measure computed for the proposed work also gives better results.
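
The divide-and-conquer procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how a shot is judged static, which frame represents a shot, or how keyframe similarity is measured. The sketch therefore assumes that each frame is already summarized by a numeric feature vector (standing in for the SIFT-based representation), that a segment is static when every frame lies within a Euclidean distance threshold of the segment's first frame, that the middle frame represents a static shot, and that a keyframe is redundant when it lies within a second threshold of an already selected keyframe. All function names and both thresholds are hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def is_static(features, start, end, threshold):
    """Assumed criterion: all frames in [start, end) stay close to the first frame."""
    ref = features[start]
    return all(dist(ref, features[i]) <= threshold for i in range(start + 1, end))


def split_into_static_shots(features, start, end, threshold):
    """Divide and conquer: halve a dynamic segment until every piece is static."""
    if end - start <= 1 or is_static(features, start, end, threshold):
        return [(start, end)]
    mid = (start + end) // 2
    return (split_into_static_shots(features, start, mid, threshold)
            + split_into_static_shots(features, mid, end, threshold))


def extract_keyframes(features, static_threshold, duplicate_threshold):
    """Return indices of representative frames, with near-duplicates removed."""
    if not features:
        return []
    shots = split_into_static_shots(features, 0, len(features), static_threshold)
    keyframes = []
    for start, end in shots:
        candidate = (start + end) // 2  # assumed representative: the middle frame
        # Assumed redundancy test: skip candidates too close to a kept keyframe.
        if all(dist(features[candidate], features[k]) > duplicate_threshold
               for k in keyframes):
            keyframes.append(candidate)
    return keyframes


# Toy input: two visually constant segments of four frames each.
toy_features = [[0.0, 0.0]] * 4 + [[10.0, 0.0]] * 4
print(extract_keyframes(toy_features, static_threshold=1.0, duplicate_threshold=1.0))
```

On the toy input, the whole sequence is dynamic, so it is halved once into two static shots, and one frame is kept from each. In practice the feature vectors would come from a SIFT pipeline, and the thresholds would need tuning on real video.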

Many researchers focus on keyframe extraction, which can be used for applications such as video summarization, video retrieval, and video event detection. Because keyframe extraction has become a basic step in applications across various domains, keyframes can be extracted and used efficiently to detect events in video sequences. Keyframes can be extracted using low-level and high-level features. Low-level features include color, edge, texture, and motion features, whereas high-level features include semantic information. Various high-level features are discussed in (Chen et al., 2011), in which keyframes are extracted from domain-specific (basketball) videos according to camera movement selection and automatic scene analysis. In (Mezaris et al., 2013), high-level features are traced using low-level interest point detection and description, which is suitable for generating a Bag of Spatiotemporal Words (BOSW). In (Younessian et al., 2012), the authors address semantic relatedness based on textual, auditory, and visual cues. The major disadvantage of high-level features is that, because of their complexity, they are not applicable to all domains.
