Visual Information Analysis for Interactive TV Applications
Evlampios Apostolidis (Information Technologies Institute, Centre for Research and Technology Hellas, Greece), Panagiotis Sidiropoulos (Information Technologies Institute, Centre for Research and Technology Hellas, Greece), Vasileios Mezaris (Information Technologies Institute, Centre for Research and Technology Hellas, Greece) and Ioannis Kompatsiaris (Information Technologies Institute, Centre for Research and Technology Hellas, Greece)
Copyright: © 2015
Since its introduction, television has offered millions of users a non-interactive experience, in which viewers participate only as passive consumers of audiovisual content. Recently, the proliferation and success of the Internet, and the widespread appreciation of the interaction possibilities it offers, gave rise to the idea of Interactive TV: a television broadcast in which users do not merely consume content passively but, as on the Web, can navigate across multiple pieces of content by following links that are similar in nature to the hypertext links between textual documents.
In this article, we discuss the visual information analysis technologies and tools needed to interlink visual content so that users can navigate between fragments of that content. The analysis technologies covered range from the fragmentation of video content into temporal units (shots, scenes) to the labeling of visual content via concept-based annotation and the re-detection of objects of interest in the video. Such technologies are necessary for enabling video hyperlinking, so that, for example, an object of interest in one video segment can be linked to other relevant segments of the same video, or to entirely different videos that relate to it. For these key enabling analysis technologies, we review the state of the art, and we elaborate on and provide indicative results for specific techniques that are particularly relevant to TV content.
Key Terms in this Chapter
Object Re-Detection: Special case of image matching, aiming at finding occurrences of specific objects in a video or a collection of videos.
Visual Content Labeling: Associating visual content with descriptive labels for supporting similarity estimation and indexing/retrieval tasks.
Interactive TV: A new TV content presentation and consumption paradigm that will allow users to freely navigate between interlinked pieces of multimedia content.
Temporal Video Segmentation: Partitioning a video into elementary temporal units such as shots and scenes.
Video Scene: The basic story-telling unit that consists of shots with semantic similarity and temporal continuity.
Concept Detection: Automatic extraction of high-level concepts by using low-level audiovisual features and machine learning.
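To make the notion of Temporal Video Segmentation above more concrete, the following is a minimal sketch of one common baseline for detecting abrupt shot changes: comparing intensity histograms of consecutive frames and declaring a boundary when they differ strongly. It is an illustration only, not the technique evaluated in this chapter. The function names, the 16-bin histogram, and the 0.5 threshold are all assumptions chosen for the example, and frames are assumed to be non-empty, flattened lists of grayscale values in the range 0 to 255.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[int]  # assumed: flattened grayscale intensities in 0..255


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized intensity histogram of one non-empty frame."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms (0 to 1)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_shot_boundaries(
    frames: Sequence[Frame], threshold: float = 0.5, bins: int = 16
) -> List[int]:
    """Return every index i at which frame i starts a new shot.

    The threshold is a hypothetical value; real content needs tuning,
    and gradual transitions (fades, dissolves) are not handled here.
    """
    boundaries: List[int] = []
    previous = None
    for index, frame in enumerate(frames):
        current = histogram(frame, bins)
        if previous is not None and histogram_distance(previous, current) > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


def split_into_shots(num_frames: int, boundaries: Sequence[int]) -> List[Tuple[int, int]]:
    """Convert boundary indices into inclusive (start, end) frame ranges."""
    if num_frames == 0:
        return []
    starts = [0] + list(boundaries)
    ends = [b - 1 for b in boundaries] + [num_frames - 1]
    return list(zip(starts, ends))


if __name__ == "__main__":
    dark, bright = [10] * 64, [240] * 64
    frames = [dark] * 3 + [bright] * 2 + [dark] * 4
    cuts = detect_shot_boundaries(frames)
    print(cuts)
    print(split_into_shots(len(frames), cuts))
```

In this sketch a shot is simply a maximal run of frames between two detected cuts. Grouping shots into scenes, as defined under Video Scene, would additionally require a measure of semantic similarity between shots, which is outside the scope of this example.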