Content Coverage and Redundancy Removal in Video Summarization

Hrishikesh Bhaumik (RCC Institute of Information Technology, India), Siddhartha Bhattacharyya (RCC Institute of Information Technology, India) and Susanta Chakraborty (Indian Institute of Engineering Science and Technology, India)
Copyright: © 2017 | Pages: 23
DOI: 10.4018/978-1-5225-0498-6.ch013

Abstract

Over the past decade, research on Content-Based Video Retrieval Systems (CBVRS) has attracted much attention, as it encompasses the processing of all the other media types, i.e. text, image and audio. Video summarization is one of its most important applications, as it enables faster and more efficient browsing of large video collections. A concise version of a video is often required because of constraints on viewing time, storage, communication bandwidth and power. The task of video summarization is therefore to extract the most important portions of the video without sacrificing its semantic information. The results of video summarization can be used in many CBVRS applications, such as semantic indexing, video surveillance and copied video detection. However, the quality of a summary depends on two basic aspects: content coverage and redundancy removal. Both aspects are important, yet they conflict with each other. This chapter aims to provide insight into the state-of-the-art approaches used in this booming field of research.

1. Introduction

With the ever-decreasing cost of digital storage devices and the advancement of technology in recent years, video recorders have gained immense popularity. Videos are produced and uploaded at an ever-increasing rate. Digital libraries and video repositories such as YouTube, DailyMotion and MyVideo enable users to upload, retrieve, view and share videos over the internet. This has been facilitated by a manifold increase in the communication bandwidth provided by Internet Service Providers (ISPs). By its inherent structure, a video encompasses the other three media types as well, i.e. text, image and audio, combining them into a single data stream. As a consequence, the analysis and retrieval of videos has attracted much attention from researchers and application developers. The research issues include feature extraction, similarity/dissimilarity measures, segmentation (temporal and semantic), key-frame extraction, indexing, annotation, classification and retrieval of videos. Video summarization is an application that lies at the crossroads of these research issues. Its objective is to represent a video to the user in a concise manner such that its overall semantic meaning is preserved. Such summarization largely offsets the time and bandwidth constraints faced by the user. Real-time video summarization helps in the immediate indexing of videos, which facilitates retrieval from the repository. It also helps the user assess whether watching the entire video would be useful. Video summarization and representation through key-frames have frequently been adopted as an efficient way to convey the overall content of a video in a minimum amount of time. The types of video summarization are static (also known as storyboard representation) and dynamic (also called video skimming).
A storyboard, or static representation, is produced by selecting key-frames from the set of frames obtained after decomposing a video. Audio clips have temporal characteristics and are therefore not an integral part of static summaries. However, keywords in the form of metadata may be tagged to the static video frames to assist in keyword-based indexing and retrieval. The set of key-frames serves as the features of the video and helps in the indexing task. During retrieval, the key-frames corresponding to a stored video may be matched against the key-frames extracted from the query video. The task of video mining is thereby facilitated.
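
As a rough illustration of how key-frames might be chosen for a static summary, the sketch below keeps a frame only when its intensity histogram differs sufficiently from every key-frame already kept. This is not the method of the chapter; it is a minimal greedy example under stated assumptions. The frame representation (a flat list of 8-bit gray values), the function names `histogram`, `distance` and `select_key_frames`, and the `threshold` parameter are all hypothetical.

```python
from typing import List, Sequence

# Assumption: a frame is a flat sequence of 8-bit gray pixel values (0..255).
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Normalized intensity histogram of one frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame) or 1  # avoid division by zero for an empty frame
    return [c / total for c in counts]


def distance(h1: List[float], h2: List[float]) -> float:
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def select_key_frames(frames: List[Frame], threshold: float = 0.3) -> List[int]:
    """Greedy selection (hypothetical): keep a frame only if its histogram
    is farther than `threshold` from every key-frame kept so far."""
    key_indices: List[int] = []
    key_hists: List[List[float]] = []
    for i, frame in enumerate(frames):
        h = histogram(frame)
        if all(distance(h, k) > threshold for k in key_hists):
            key_indices.append(i)
            key_hists.append(h)
    return key_indices


# Toy video: two dark frames, two bright frames, then a dark frame again.
dark, bright = [10] * 16, [240] * 16
video = [dark, dark, bright, bright, dark]
summary = select_key_frames(video)
print(summary)  # [0, 2]
```

In this sketch the threshold is where the two quality aspects meet: lowering it admits more frames, which improves content coverage at the cost of redundancy, while raising it removes more near-duplicates but risks dropping distinct content.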
