A Comparative Study of Unsupervised Video Shot Boundary Detection Techniques Using Probabilistic Fuzzy Entropy Measures

Biswanath Chakraborty (RCC Institute of Information Technology, India), Siddhartha Bhattacharyya (RCC Institute of Information Technology, India) and Susanta Chakraborty (Bengal Engineering and Science University, India)
DOI: 10.4018/978-1-4666-2518-1.ch009

Abstract

The performance of video shot boundary detection techniques on unsupervised video sequences can be improved by the use of different probabilistic fuzzy entropies. In this chapter, the authors present a new technique for identifying whether there are any appreciable changes from one video context to another in a sequence of image frames extracted from a mixture of a number of video files. They then compare their technique with an existing technique and find that video shot boundary detection performs better when probabilistic fuzzy entropies are used.

Introduction

A video consists of several images (frames) which are played consecutively at a constant speed of around 20 to 30 frames per second for smooth visualization (Thakar et al.). Video shot boundary detection means segmenting a video by detecting the boundaries between consecutive camera shots; it is the first and most important step in content-based video retrieval. A video can be divided into a hierarchy (Figure 1) as follows: video → scenes → shots → frames → pixels (Das et al. 2008). A shot can thus be defined as a run of consecutive frames recorded from a single camera operation, representing a continuous action in time and space; related shots are typically grouped into a higher-level unit called a scene. Shot boundary detection means finding the locations of shot transitions, which are mainly of two types: cuts and gradual transitions. A cut is an abrupt change from one shot to the next between two consecutive frames, whereas a gradual shot transition occurs over a few video frames. Fades (in and out) and dissolves are the two most common gradual shot transitions. A fade is a gradual transition between a scene and a constant image (fade out) or between a constant image and a scene (fade in). A dissolve is a gradual transition from one scene to another, in which two consecutive shots are temporally superimposed.
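To make the notion of a cut concrete, the following is a minimal sketch of a generic histogram-difference baseline. It is not the probabilistic fuzzy entropy technique presented in this chapter. It assumes that frames are already available as grids of gray levels in the range 0 to 255; the function names, the bin count and the threshold are illustrative assumptions rather than values taken from the chapter.

```python
def gray_histogram(frame, bins=16):
    """Normalized gray-level histogram of a non-empty frame (rows of 0-255 ints)."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for value in row:
            counts[value * bins // 256] += 1
            total += 1
    return [c / total for c in counts]


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 = identical, 2 = disjoint)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_cuts(frames, threshold=1.0, bins=16):
    """Indices i at which a cut is declared between frames[i - 1] and frames[i].

    The threshold is an assumed, hand-picked value, not one from the chapter.
    """
    cuts = []
    previous = None
    for index, frame in enumerate(frames):
        current = gray_histogram(frame, bins)
        if previous is not None and histogram_distance(previous, current) > threshold:
            cuts.append(index)
        previous = current
    return cuts


def flat_frame(level, width=4, height=4):
    """Hypothetical helper: a synthetic frame filled with a single gray level."""
    return [[level] * width for _ in range(height)]


# Three dark frames followed by two bright ones: one abrupt change at index 3.
frames = [flat_frame(20), flat_frame(22), flat_frame(21),
          flat_frame(200), flat_frame(205)]
print(detect_cuts(frames))  # [3]
```

A single pairwise threshold of this kind is aimed at abrupt cuts. Fades and dissolves spread the change over several frames, so each consecutive pair may differ only slightly and stay below the threshold, which is one reason gradual transitions are usually treated separately.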

Figure 1. Overview of shot boundary detection

Video content classification is an important task in the computer vision community as far as intelligent analysis of video content is concerned. Typical applications include video content mining, video surveillance and defense applications, to name a few. A substantial body of literature exists in this regard, providing detailed comparative studies of the different techniques in vogue (Boreczky et al. 1996). Video content classification primarily entails the detection of cuts in video sequences through classical as well as non-classical techniques. The non-classical techniques resort to computational intelligence to ascertain the video shot boundaries, using the inherent information distribution of the video content. Hence, larger video frames and longer sequences incur considerable computational overhead in the shot detection task. Since video data are essentially continuous in nature, dynamic estimation of the data distribution is a prerequisite in the commonly used shot boundary detection and video analysis techniques. The commonly used estimation techniques include:

Key Terms in this Chapter

RCI: The acronym stands for Relative Change Index.

AVI: The acronym stands for Audio Video Interleave; it is a common video format.

BMP: The acronym stands for Bitmap Image File. It contains bitmapped image information in little-endian format.

API: The acronym stands for Application Program Interface. It is a set of functions and procedures provided by an operating system or environment, which may be called from any method to gain extra functionality without writing extra code.

UVSBD: The acronym stands for Unsupervised Video Shot Boundary Detection. It is a method of detecting video shot boundaries without having a priori knowledge regarding the video context.

RGB: The acronym stands for Red, Green and Blue. The RGB color model is a system for representing the colors to be used on a visual display unit. Red, green and blue can be combined in various proportions to obtain a broad range of colors. The name of the model comes from the initials of these three additive primary colors.

FH: The acronym stands for Fuzzy Histogram. Fuzzy histograms are a fuzzy generalization of ordinary crisp histograms. Given fuzzy data, the concept of a histogram has to be extended to the so-called fuzzy histogram in order to take the fuzziness into account. For these fuzzy histograms, the height over a class is itself fuzzy.
