Video Representation and Processing for Multimedia Data Mining

Amr Ahmed (University of Lincoln, UK)
Copyright: © 2009 | Pages: 31
DOI: 10.4018/978-1-60566-188-9.ch001

Video processing and segmentation are important stages in multimedia data mining, especially given the growth and diversity of the video data now available. This chapter introduces researchers, especially those new to the field, to video representation, processing, and segmentation techniques. It opens with an accessible introduction, then covers the principles of video structure and representation, and then surveys state-of-the-art segmentation techniques, focusing on shot detection. Performance evaluation and common issues are discussed before the chapter concludes.
Chapter Preview


With the rapid advances in digital video technologies and the wide availability of more efficient computing resources, we seem to be living in an era of explosion in digital video. Video data are now widely available and easily generated in large volumes, and not only at the professional level. They can be found everywhere: on the internet, especially on video-uploading sites; on personal digital cameras and camcorders; and on camera phones, which have become almost the norm.

Accessing these data, however, remains difficult, because the available techniques and tools for accessing, searching, and retrieving video data are not on the same level as those for more traditional data, such as text. Advances in video access, search, and retrieval techniques have not kept pace with digital video technologies and the volume of data they generate. This can be attributed, at least partly, to the nature and richness of video data compared with text. But it can also be attributed to our growing demands. In text, we are no longer satisfied with searching for an exact match of a sequence of characters or strings; we need to find similar meanings and other higher-level matches. We look forward to doing the same with video data, but the nature of video data is different.

Video data are more complex and naturally larger in volume than traditional text data. They usually combine visual and audio data, as well as textual data. These data need to be appropriately annotated and indexed in an accessible form for search and retrieval techniques to deal with them. This can be based on textual information, on visual and/or audio features, or, more importantly, on semantic information.

The text-based approach is theoretically the simplest. Video data are annotated with textual descriptions, such as keywords or short sentences describing the contents. This converts the search task into the well-known problem of searching text data, where relatively advanced existing tools and techniques can be used. The main bottleneck is the huge time and effort needed to accomplish the annotation, let alone any accuracy issues.

The feature-based approach, whether visual and/or audio, annotates the video data with combinations of extracted low-level features such as intensity, color, texture, shape, motion, and audio features. This is very useful for query-by-example tasks, but still not very useful for finding specific events or more semantic attributes.

The semantic-based approach is, in one sense, similar to the text-based approach. Video data are annotated, but in this case with high-level information that represents the semantic meaning of the contents rather than just describing them. The difficulty is that the semantic meaning of the same video data varies widely among people, cultures, and ages, to name just a few. It depends on many factors, including the purpose of the annotation, the domain and application, and cultural and personal views, and can even be subject to the mood and personality of the annotator. Hence, automating this task in general is highly challenging.

For specific domains, carefully selected combinations of visual and/or audio features correlate with useful semantic information. Hence, the efficient extraction of those features is crucial to the high-level analysis and mining of video data.
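
As an illustration of the feature-based approach and the query-by-example task, the following is a minimal Python sketch, not a method taken from the chapter. It assumes frames are given as small grayscale images (rows of 0-255 intensities), uses an intensity histogram as the only low-level feature, and ranks stored frames by histogram intersection. The names `intensity_histogram`, `histogram_intersection`, and `query_by_example`, and the choice of 8 bins, are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Sequence, Tuple

# Assumption: a frame is a non-empty grayscale image, rows of 0-255 intensities.
Frame = Sequence[Sequence[int]]


def intensity_histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Return a normalized intensity histogram (one simple low-level feature)."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for pixel in row:
            # Map an intensity in 0..255 to a bin index in 0..bins-1.
            counts[min(pixel * bins // 256, bins - 1)] += 1
            total += 1
    return [count / total for count in counts]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1.0 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def query_by_example(query: Frame, database: Dict[str, Frame]) -> List[Tuple[str, float]]:
    """Rank stored frames by feature similarity to the query, best match first."""
    query_features = intensity_histogram(query)
    scored = [
        (name, histogram_intersection(query_features, intensity_histogram(frame)))
        for name, frame in database.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Usage: a dark query frame should rank the dark stored frame first.
dark = [[10] * 4 for _ in range(4)]
bright = [[240] * 4 for _ in range(4)]
query = [[20] * 4 for _ in range(4)]
ranking = query_by_example(query, {"bright": bright, "dark": dark})
```

The sketch also shows the limitation noted above: two frames with similar intensity distributions score as similar whatever they actually depict, so a feature of this kind says little about specific events or semantic attributes.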

In this chapter, we focus on the core techniques that facilitate the high-level analysis and mining of video data. One of the important initial steps in the segmentation and analysis of video data is shot-boundary detection. This is the first step in decomposing the video sequence into its logical structure and components, in preparation for the analysis of each component. The subject is enormous, and this chapter is meant to be more of an introduction, especially for new researchers. We also focus only on the visual modality of the video; the audio and textual modalities are not covered.
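
To make the idea of shot-boundary detection concrete, here is a minimal Python sketch of one common baseline: comparing intensity histograms of consecutive frames and declaring a hard cut when the difference exceeds a threshold. This preview does not specify the chapter's own techniques, so the approach, the frame format (rows of 0-255 grayscale intensities), the function names, the 8 bins, and the threshold of 0.5 are all assumptions made for illustration.

```python
from typing import List, Sequence

# Assumption: a frame is a non-empty grayscale image, rows of 0-255 intensities.
Frame = Sequence[Sequence[int]]


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Return a normalized intensity histogram of one frame."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for pixel in row:
            counts[min(pixel * bins // 256, bins - 1)] += 1
            total += 1
    return [count / total for count in counts]


def detect_shot_boundaries(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return each index i where a cut is declared between frame i-1 and frame i."""
    boundaries: List[int] = []
    previous = None
    for index, frame in enumerate(frames):
        current = histogram(frame)
        if previous is not None:
            # Half the L1 distance between normalized histograms lies in [0, 1].
            difference = 0.5 * sum(abs(a - b) for a, b in zip(previous, current))
            if difference > threshold:
                boundaries.append(index)
        previous = current
    return boundaries


# Usage: three dark frames followed by two bright frames form two shots.
dark = [[10] * 4 for _ in range(4)]
bright = [[240] * 4 for _ in range(4)]
cuts = detect_shot_boundaries([dark, dark, dark, bright, bright])
```

A fixed threshold of this kind only targets abrupt cuts; gradual transitions such as fades and dissolves change the histogram slowly and would generally need a different test.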

Complete Chapter List

Table of Contents
Dacheng Tao, Dong Xu, Xuelong Li
Chapter 1
Amr Ahmed
Video processing and segmentation are important stages for multimedia data mining, especially with the advance and diversity of video data...
Video Representation and Processing for Multimedia Data Mining
Chapter 2
Sébastien Lefèvre
Image Features from Morphological Scale-Spaces
Chapter 3
Huiyu Zhou, Yuan Yuan, Chunmei Shi
The authors present a face recognition scheme based on semantic features’ extraction from faces and tensor subspace analysis. These semantic...
Face Recognition and Semantic Features
Chapter 4
Haibin Ling, David W. Jacobs
Computer-aided foliage image retrieval systems have the potential to dramatically speed up the process of plant species identification. Despite...
Shape Matching for Foliage Database Retrieval
Chapter 5
Shaohua Kevin Zhou, Jie Shao, Bogdan Georgescu, Dorin Comaniciu
Motion estimation necessitates an appropriate choice of similarity function. Because generic similarity functions derived from simple assumptions...
Similarity Learning for Motion Estimation
Chapter 6
Jian Cheng, Kongqiao Wang, Hanqing Lu
Relevance feedback is an effective approach to boost the performance of image retrieval. Labeling data is indispensable for relevance feedback, but...
Active Learning for Relevance Feedback in Image Retrieval
Chapter 7
Juliusz L. Kulikowski
Visual data mining is a procedure aimed at selecting, from a document repository, subsets of documents presenting certain classes of objects; the...
Visual Data Mining Based on Partial Similarity Concepts
Chapter 8
Jinhui Tang, Xian-Sheng Hua, Meng Wang
The insufficiency of labeled training samples is a major obstacle in automatic semantic analysis of large scale image/video database....
Image/Video Semantic Analysis by Semi-Supervised Learning
Chapter 9
Shuqiang Jiang, Yonghong Tian, Qingming Huang, Tiejun Huang, Wen Gao
With the explosive growth in the amount of video data and rapid advance in computing power, extensive research efforts have been devoted to...
Content-Based Video Semantic Analysis
Chapter 10
Hossam A. Gabbar, Naila Mahmut
Semantic mining is an essential part in knowledgebase and decision support systems where it enables the extraction of useful knowledge from...
Applications of Semantic Mining on Biological Process Engineering
Chapter 11
Gerald Schaefer, Simon Ruszala
Efficient and effective techniques for managing and browsing large image databases are increasingly sought after. This chapter presents a simple yet...
Intuitive Image Database Navigation by Hue-Sphere Browsing
Chapter 12
Rong Yan, Apostol Natsev, Murray Campbell
Although important in practice, manual image annotation and retrieval has rarely been studied by means of formal modeling methods. In this chapter...
Formal Models and Hybrid Approaches for Efficient Manual Image Annotation and Retrieval
Chapter 13
Meng Wang, Xian-Sheng Hua, Jinhui Tang, Guo-Jun Qi
This chapter introduces the application of active learning in video annotation. The insufficiency of training data is a major obstacle in...
Active Video Annotation: To Minimize Human Effort
Chapter 14
Xin-Jing Wang, Lei Zhang, Xirong Li, Wei-Ying Ma
Although it has been studied for years by computer vision and machine learning communities, image annotation is still far from practical. In this...
Annotating Images by Mining Image Search
Chapter 15
Yonghong Tian, Shuqiang Jiang, Tiejun Huang, Wen Gao
With the rapid growth of image collections, content-based image retrieval (CBIR) has been an active area of research with notable recent progress....
Semantic Classification and Annotation of Images
Chapter 16
Arun Kulkarni, Leonard Brown
With advances in computer technology and the World Wide Web there has been an explosion in the amount and complexity of multimedia data that are...
Association-Based Image Retrieval
Chapter 17
Gerald Schaefer
Image retrieval and image compression have typically been pursued separately. Little research has been done on a synthesis of the two by...
Compressed-Domain Image Retrieval Based on Colour Visual Patterns
Chapter 18
M. Singh, X. Cheng, X. He
Discovery of multimedia resources on the network is the focus of much research in the post-semantic web. The task of resource discovery can be...
Resource Discovery Using Mobile Agents
Chapter 19
Zhu Li, Yun Fu, Junsong Yuan, Ying Wu, Aggelos Katsaggelos, Thomas S. Huang
The rapid advances in multimedia capture, storage and communication technologies and capabilities have ushered in an era of unprecedented growth of...
Multimedia Data Indexing
About the Contributors