Introduction
Over the last decade, the volume of video data has increased dramatically, especially with the development of smartphones and the evolution of social media. For example, statistics published by Facebook for the fourth quarter of 2018 indicate that there are over 2.32 billion monthly active users (Facebook, 2018), watching 100 million hours of video every day (Smith, 2019). The same video content is sometimes posted or shared several times, and since storage capacity is limited, it is worthwhile to search for similar content in order to delete it, apply a quota, or classify it. Since such a large volume of data makes retrieval difficult, the choice of approach and infrastructure can help to curb some of these difficulties. Various Content-Based Video Retrieval (CBVR) systems, which efficiently retrieve videos similar to a query video from a database, have been developed. Quellec et al. (2011) proposed a CBVR system for real-time retrieval of similar videos, with application to computer-aided retinal surgery. They use the Dynamic Time Warping (DTW) technique to measure the distance between video subsequences. Loukas (2018) reviewed content-based video analysis of surgical operations, covering recent developments and analyzing future directions in the field. Ishtiaq et al. (2018) proposed a method that receives video content and the metadata associated with it. The method then extracts visual, audio, and textual features of the video content based on the metadata. A set of video segments is identified based on the composite features of the video content, and segments are thereafter identified based on a user query. El Ouadrhiri et al. (2017) presented a CBVR system called Bounded Coordinate of Motion Histogram (BCMH), which has offline and online processing parts.
To characterize videos, they used motion vectors and residual data, and to calculate similarity, they used the Bounded Coordinate System (BCS). The similarity measurement accuracy of the BCMH system is interesting, but the offline processing step takes a long time to complete the learning phase on large videos. Likewise, the online processing part cannot measure similarity in real time. Optimizing the BCMH computation time is therefore the current challenge. In this paper, the authors propose batch-oriented computing based on Apache Hadoop to reduce the processing time of the offline step of the BCMH system. Batch processing is, in general, efficient for large volumes of data, and Apache Hadoop is the most commonly used framework for it. In addition, real-time oriented computing based on Apache Storm topologies is proposed to achieve a real-time response for the online step.
The outline of this paper is as follows. In Section II, the authors introduce the basic concepts and related work on CBVR systems using the Hadoop distributed platform and on real-time video processing using Apache Storm. Section III presents the materials and methods. Section IV defines the approach. Section V describes the architecture and implementation of the system. Section VI presents the experimental results and analysis. Finally, Section VII concludes the paper.