A Novel Solution for Scaling Video Shot Boundary Detection Based on Hadoop

Ahmed Dib (Badji Mokhtar, Annaba University, Annaba, Algeria) and Mokhtar Sellami (Badji Mokhtar, Annaba University, Annaba, Algeria)
Copyright: © 2018 | Pages: 14
DOI: 10.4018/IJDST.2018070103

Abstract

Shot Boundary Detection (SBD) is an important step required by CBVIR systems. To perform scalable SBD, a MapReduce-based solution is proposed: instead of handling consecutive frames sequentially, they can be processed in a fully parallel way. In the sequential case, descriptors of consecutive frames are compared and shot boundaries are detected when significant variations occur. This seems simple, but it can take a prohibitively long time to process immense multimedia datasets. In the proposed solution, based on the transitivity of the similarity relation, resemblance between distant frames is measured and shot boundaries are extracted, by the Mapper and Reducer routines respectively. The experimental results show that the proposed solution outperforms traditional sequential methods and can be applied to large-scale multimedia datasets.
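The distant-frame comparison enabled by transitivity can be sketched as follows. This is a toy Python illustration with assumed function names, descriptors, and thresholds, not the paper's implementation: if frame i and frame i+k are similar, transitivity of the similarity relation implies no cut lies between them, so only dissimilar spans need a fine-grained check, and every span test is independent of the others.

```python
def similar(h1, h2, threshold=0.5):
    """Two frames are 'similar' if the L1 distance between their
    (toy) normalized histogram descriptors is below a threshold."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) <= threshold

def suspect_spans(descriptors, k=5, threshold=0.5):
    """Compare distant frames i and i+k. A similar pair implies no cut
    in between (by transitivity), so only dissimilar spans are returned
    for fine-grained inspection. Each span test is independent, so in a
    distributed setting all of them could run in parallel."""
    return [(i, i + k) for i in range(0, len(descriptors) - k, k)
            if not similar(descriptors[i], descriptors[i + k], threshold)]

# Toy usage: frames 0-6 belong to one shot, frames 7-14 to another,
# so only the span covering the transition is flagged.
descriptors = [[1.0, 0.0]] * 7 + [[0.0, 1.0]] * 8
print(suspect_spans(descriptors))  # → [(5, 10)]
```

Because each span comparison touches only two descriptors, the work distributes naturally across Mapper instances, which is the intuition behind the fully parallel scheme described above.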
Article Preview

Introduction

The rapid advancement of technology has created new challenges for multimedia data processing systems. Modern technology allows flexible manipulation of multimedia data, which has led to the generation of a huge number of images and videos in web-scale datasets. Thus, several methods and techniques have been developed to provide efficient means for manipulating this amount of multimedia data. A video is a recording of pictures and sound, usually stored as a digital file. Among the many possible video representations, the shot is the most fundamental semantic element. It consists of a series of interrelated, consecutive frames taken continuously by a single camera to represent a continuous action in time and space. According to the manner of transition between adjacent shots, shot boundaries are categorized as abrupt (hard cuts) or gradual transitions (fades, wipes, and dissolves). Video SBD (VSBD) is an important early step for video browsing, retrieval, compression, and many other applications. For practical use of multimedia applications on large datasets, many studies have been carried out to speed up the VSBD process, most of them focusing on the methods and algorithms used. In the compressed video domain, Sugano, Nakajima, Yanagihara, and Yoneyama (1998) proposed a technique to accelerate VSBD by spatiotemporally sub-sampling frames using MPEG coding parameters obtained in the Variable Length Decoding (VLD) stage. Brandt, Trotzky, and Wolf (2008) proposed another technique that reuses pre-existing measures, namely the Discrete Cosine Transform (DCT) values and motion information of frames. These measures are directly accessible in the compressed domain, so shot boundary computing time is improved. In the uncompressed (pixel) domain, by contrast, VSBD is done by inspecting variations of a measurement between consecutive frames, a technique better suited to real-time video processing applications.
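The consecutive-frame comparison just described can be sketched in a few lines of Python. This is a minimal illustration with hypothetical names and a toy threshold, assuming per-frame histogram descriptors have already been computed; it is not any cited author's implementation.

```python
def l1_distance(h1, h2):
    """L1 distance between two normalized histogram descriptors."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_cuts(descriptors, threshold=0.5):
    """Sequential SBD sketch: flag a hard cut between frame i and i+1
    whenever the descriptor dissimilarity exceeds the threshold."""
    return [i for i in range(len(descriptors) - 1)
            if l1_distance(descriptors[i], descriptors[i + 1]) > threshold]

# Toy example: frames 0-2 are similar, frame 3 starts a new shot,
# so the cut is reported between indices 2 and 3.
frames = [[0.9, 0.1], [0.85, 0.15], [0.9, 0.1], [0.1, 0.9]]
print(detect_cuts(frames))  # → [2]
```

Note that each iteration depends only on adjacent frames, yet the loop is still executed one pair at a time; this strictly sequential scan is exactly what the MapReduce formulation later parallelizes.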
Many such measurements have been used: pixel differences (Luo, DeMenthon, & Doermann, 2004), color histogram differences (Mas & Fernandez, 2003), eigenvalues (Benni, Dinesh, Punitha, & Rao, 2015), and others. In Lu and Shi (2013), Singular Value Decomposition (SVD) is applied to the matrix of features extracted from the color histograms of video frames; the feature vector is thereby reduced and the computation cost improved. In Birinci and Kiranyaz (2014), a novel VSBD method based on human perceptual rules and the well-known “Information Seeking Mantra” is proposed, which avoids unnecessary processing and provides a significant computing boost. Other approaches, such as clustering-based pre-processing (Li, Lu, & Niu, 2009) or frame-skipping techniques (Gao, Yong, & Cheng, 2011; De Bruyne et al., 2008), are used to select candidate frames where a shot boundary may be present; the frame count is thus reduced and computing time improved. However, sequential solutions to the multimedia processing problem are not well adapted to web-scale video datasets. New challenges have emerged to meet high storage and processing demands, and the implementation of efficient algorithms must be accompanied by scalable solutions based on powerful computing technologies. Indeed, aggregating resources has become possible: two or more processors can be combined into a multi-core processor or a computer cluster, allowing more effective simultaneous processing. In an earlier study, to parallelize the shot boundary detection process, Li et al. (2006) proposed a hybrid parallelization approach operating at both the task level and the data-slicing level. Similar studies on media mining applications based on multi-core processors have been conducted (Yu & Wei, 2007; Li, Tong, Wang, Zhang, & Chen, 2009), but such schemes must be augmented by a data management system for large-scale data replication and secure transfer.
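To make the histogram-based measures above concrete, here is a toy color-histogram descriptor in Python. The bin count, pixel representation, and function name are illustrative assumptions, not the descriptors used in the cited works; real systems typically build such histograms with an image library rather than a per-pixel loop.

```python
def color_histogram(frame, bins=4):
    """Toy color descriptor: a normalized, concatenated per-channel
    histogram. `frame` is a list of (r, g, b) pixels with 0-255 values;
    the result has 3 * bins entries, one channel's bins after another."""
    hist = [0.0] * (3 * bins)
    step = 256 // bins  # width of one intensity bin
    for r, g, b in frame:
        hist[r // step] += 1                # red channel bins
        hist[bins + g // step] += 1         # green channel bins
        hist[2 * bins + b // step] += 1     # blue channel bins
    n = len(frame)
    return [v / n for v in hist]

# A two-pixel frame: one blue pixel and one red pixel.
frame = [(0, 0, 255), (255, 0, 0)]
descriptor = color_histogram(frame)
```

Descriptors like this one are exactly what the difference measures compare; stacking them row by row also yields the feature matrix to which SVD can be applied for dimensionality reduction, as in Lu and Shi (2013).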
Furthermore, over the past decade, a new commercially supported data center model, cloud computing, has emerged, and clouds appear well suited to the big data paradigm. Many frameworks based on MapReduce (Dean & Ghemawat, 2008) have been developed to provide better control of computing and storage resources on cloud infrastructure. Various studies have applied the MapReduce programming model to enhance the performance of media mining on large-scale datasets. To perform YouTube-scale video annotation, Morsillo, Mann, and Pal (2010) proposed a MapReduce-based algorithm to construct a distributed nearest-neighbour vector tree. Cheng, Chen, Wang, and Chen (2013) proposed a MapReduce-based algorithm for parallel processing on multiple servers, combining weak classifiers to accelerate an image background subtraction algorithm. In another MapReduce-based video information retrieval application, Jing-zhai, Xiang-Dong, and Peng-zhou (2013) proposed using Locality-Sensitive Hashing (LSH) to calculate the similarity between video clips. Many other works use MapReduce for video processing, but no study has addressed performing VSBD with the MapReduce programming model, even though it could further scale up and enhance the SBD process.
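How SBD might be cast in the MapReduce model can be sketched as follows. This is a minimal, single-process Python simulation under assumed names and parameters, not the authors' Hadoop code: each mapper handles one chunk of frame descriptors and emits the candidate boundaries it finds, and the reducer merges the emitted indices into one sorted list. Chunks overlap by one frame so that a cut falling between two chunks is not missed.

```python
def l1(h1, h2):
    """L1 distance between two normalized histogram descriptors."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def mapper(chunk_start, descriptors, threshold=0.5):
    """Emit (global_boundary_index, 1) pairs for cuts inside one chunk."""
    for i in range(len(descriptors) - 1):
        if l1(descriptors[i], descriptors[i + 1]) > threshold:
            yield (chunk_start + i, 1)

def reducer(emitted):
    """Merge mapper outputs into a sorted, de-duplicated boundary list."""
    return sorted({idx for idx, _ in emitted})

def mapreduce_sbd(descriptors, chunk_size=4):
    """Single-process simulation of the map -> shuffle -> reduce flow."""
    emitted = []
    for start in range(0, len(descriptors) - 1, chunk_size):
        # Overlap adjacent chunks by one frame to cover cross-chunk cuts.
        chunk = descriptors[start:start + chunk_size + 1]
        emitted.extend(mapper(start, chunk))
    return reducer(emitted)

# Toy usage: a single cut between frame 5 and frame 6.
descriptors = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 3
print(mapreduce_sbd(descriptors))  # → [5]
```

On an actual Hadoop cluster the chunking and the shuffle between `mapper` and `reducer` would be handled by the framework (e.g. via Hadoop Streaming for Python code), with each mapper running on a different node; the per-chunk independence shown here is what makes the approach scale.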
