With advances in electronic equipment and in computer technology for capturing motion pictures and processing large volumes of data, a growing number of people own and use camcorders to make home videos that capture their experiences and document their lives. Home video has no time limit and no restriction on content (Lienhart, 2000), so these recordings easily add up to many hours of material. However, organizing and editing the large amount of information contained in home videos is technically challenging because efficient tools are lacking. Although a number of prototype systems for content-based video analysis and retrieval have been built (for example, Wactlar, 1996; Chang, 1998), the development of tools and systems specialized for home video, that is, for extracting, representing, organizing, browsing, querying, and retrieving it, is still at a preliminary stage (Huang, 2005; Wu, 2005). Several issues must be addressed to make the organization of home video feasible. First, home video has particular characteristics, so its organization should be based on an understanding of video structure and should take advantage of that structure. Second, home videos are recorded and stored directly in the compressed domain. To save both time and space, techniques that manipulate home videos directly in the compressed domain should be considered (Wang, 2003); some typical compressed-domain techniques can be found in (Taskiran, 2004). Third, home videos are recorded shot after shot without a storyline, and these shots may or may not be directly related to one another. To group shots, visual features should be extracted from every shot (Gatica-Perez, 2003). To address these tasks and difficulties, a novel technique is described in this article.
It is based on an analysis of the characteristics of home video, on the detection of motion attention regions in the compressed domain, on time weighting based on camera motion, and on a novel two-layer shot clustering approach and organization strategy. Experiments on two home videos from the MPEG-7 data set give encouraging results.
Video analysis is an important branch of content-based video retrieval (CBVR). Compared with other types of video programs, home video has particular characteristics relating to the people involved in the shooting and the subjects being filmed (Lienhart, 1997). The study of home video analysis can benefit from these unique characteristics.
In general, a typical home video has a characteristic structure: it contains a set of scenes, each composed of ordered and temporally adjacent shots that can be organized into clusters conveying semantic meaning. Home video recording itself imposes temporal continuity. Unlike other video programs, home video simply records life rather than composing a story, so every shot (a clip of video captured in one place without interruption) may be equally important. In addition, home video is rarely filmed with a temporal back-and-forth structure. For example, on a vacation trip, people do not usually visit the same site twice. In other words, the content tends to be localized in time. Consequently, discovering the scene structure above the shot level plays a key role in home video analysis. Organizing video content by shot clustering provides an efficient way to access video semantically and to edit it quickly.
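The idea that scenes consist of temporally adjacent shots, and that content is localized in time, can be illustrated with a minimal sketch. It is not the two-layer clustering approach of this chapter. It is a simple sequential grouping under stated assumptions: each shot is already summarized by a feature vector (here a hypothetical three-bin color histogram), similarity is measured by L1 distance, and the function name `group_adjacent_shots` and the threshold value are illustrative choices only.

```python
from typing import List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def group_adjacent_shots(
    features: List[Sequence[float]], threshold: float
) -> List[List[int]]:
    """Group time-ordered shots into scenes of consecutive shot indices.

    A shot joins the current scene when its feature vector lies within
    `threshold` (L1 distance) of the scene's running mean; otherwise it
    starts a new scene. Earlier scenes are never reopened, which mirrors
    the assumption that home video content is localized in time.
    """
    scenes: List[List[int]] = []
    mean: List[float] = []  # running mean feature of the current scene
    for idx, feat in enumerate(features):
        if scenes and l1_distance(feat, mean) <= threshold:
            scene = scenes[-1]
            n = len(scene)
            # Update the running mean before adding the new shot.
            mean = [(m * n + f) / (n + 1) for m, f in zip(mean, feat)]
            scene.append(idx)
        else:
            scenes.append([idx])
            mean = list(feat)
    return scenes


# Hypothetical per-shot features (assumed, not taken from the chapter).
shots = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.1, 0.8],
]
print(group_adjacent_shots(shots, threshold=0.4))
```

Because a shot can only be merged into the most recent scene, the result is always a partition of the shot sequence into contiguous runs. A video with a back-and-forth structure would need a different strategy, but the text above notes that such structure is rare in home video.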
Key Terms in this Chapter
Scene: In video analysis, a scene is a series of consecutive shots that are coherent from the narrative point of view.
Tilt: A type of camera motion that occurs while capturing the scene image. It consists of the movement of the camera around the horizontal axis in the imaging plane.
Image Engineering: An integrated discipline/subject comprising the study of all the different branches of image and video techniques.
Content-Based Video Retrieval (CBVR): A process framework for efficiently retrieving the required clips from video. The retrieval relies on the organization of video and on non-linear search techniques.
Pan: A type of camera motion that occurs while capturing the scene image. It consists of the movement of the camera around the vertical axis in the imaging plane.
MPEG-7: An international standard named “Multimedia content description interface” (ISO/IEC 15938). It provides a set of audiovisual description tools, descriptors and description schemes for effective and efficient access (search, filtering and browsing) to multimedia content.
Content-Based Visual Information Retrieval (CBVIR): A combination of CBIR and CBVR.
Object Segmentation: A process of image analysis. Its purpose is to extract the regions of interest from an image (corresponding to the objects of interest in the scene).
Content-Based Image Retrieval (CBIR): A process framework for efficiently retrieving images from a collection by similarity. The retrieval relies on extracting the appropriate characteristic quantities describing the desired contents of images. In addition, suitable querying, matching, indexing and searching techniques are required.