Semantic Mining Technologies for Multimedia Databases

Dacheng Tao (Hong Kong Polytechnic University, Hong Kong), Dong Xu (Columbia University, USA) and Xuelong Li (University of London, UK)
Release Date: April 2009 | Copyright: © 2009 | Pages: 550
ISBN13: 9781605661889 | ISBN10: 1605661880 | EISBN13: 9781605661896 | DOI: 10.4018/978-1-60566-188-9


Multimedia searching and management have become popular due to demanding applications and competition among companies. Despite the increase in interest, there is no existing book covering basic knowledge on state-of-the-art techniques within the field.

Semantic Mining Technologies for Multimedia Databases provides an introduction to the most recent techniques in multimedia semantic mining necessary to researchers new to the field. This book serves as an important reference in multimedia for academicians, multimedia technologists and researchers, and academic libraries.

Topics Covered

The many academic areas covered in this publication include, but are not limited to:

  • Active video annotation
  • Association-based image retrieval
  • Content-based video semantic analysis
  • Face recognition and semantic features
  • Intuitive image database navigation
  • Multimedia data indexing
  • Multimedia information representation
  • Multimedia resource annotation
  • Resource discovery using mobile agents
  • Semantic classification of images
  • Visual data mining

Reviews and Testimonials

This publication details how current semantic mining tasks play an important role in many fields, including random sampling techniques and support vector machines for human-computer interaction, manifold learning and subspace methods for data visualization, discriminant analysis for feature selection, and classification trees for data indexing.

– Dacheng Tao, Hong Kong Polytechnic University, Hong Kong

Table of Contents and List of Contributors


With the explosive growth of multimedia databases in terms of both size and variety, effective and efficient indexing and searching techniques for large-scale multimedia databases have become an urgent research topic in recent years.

For data organization, the conventional approach is based on keywords or a text description of a multimedia datum. However, annotating all data with text is tedious, and it is almost impossible for people to capture the content completely. Moreover, a text description is not enough to describe a multimedia datum precisely. For example, it is unrealistic to describe a music clip in words; an image says more than a thousand words; and a keyword-based video shot description cannot characterize the contents for a specific user. Therefore, it is important to utilize content-based approaches (CbA) to mine the semantic information of a multimedia datum.

In the last ten years, we have witnessed very significant contributions of CbA to semantics targeting for multimedia data organization. CbA means that data organization, including retrieval and indexing, utilizes the contents of the data themselves rather than keywords provided by humans. The contents of a datum can therefore be obtained with techniques from statistics, computer vision, and signal processing. For example, Markov random fields can be applied to image modeling; spatial-temporal analysis is important for video representation; and Mel-frequency cepstral coefficients have been shown to be the most effective features for audio signal classification.

Apart from the conventional approaches mentioned above, machine learning also plays an indispensable role in current semantic mining tasks, e.g., random sampling techniques and support vector machines for human-computer interaction, manifold learning and subspace methods for data visualization, discriminant analysis for feature selection, and classification trees for data indexing.

The goal of this IGI book is to provide new researchers with an introduction to the most recent research and techniques in multimedia semantic mining, so that they can enter this field step by step. As a result, they can choose the right path for their specific applications. The book is also an important reference for researchers in multimedia, a handbook for research students, and a repository for multimedia technologists.

The major contributions of this book are threefold: 1) collecting the most recent and important research results in semantic mining for multimedia data organization, 2) guiding new researchers with a comprehensive review of the state-of-the-art techniques for different tasks in multimedia database management, and 3) providing technologists and programmers with important algorithms for multimedia system construction.

This edited book attracted submissions from eight countries: Canada, China, France, Japan, Poland, Singapore, the UK, and the USA. Among these submissions, 19 were accepted. We strongly believe that it is now an ideal time to publish this edited book with the 19 selected chapters. The contents of this edited book will provide readers with cutting-edge and topical information for their related research.

The accepted chapters address a wide range of topics in semantic mining from multimedia databases. An overview of the included chapters is given below.

This book starts with three chapters on new multimedia information representations (Video Representation and Processing for Multimedia Data Mining; Image Features from Morphological Scale-spaces; Face Recognition and Semantic Features). Learning in multimedia information organization, an important topic in semantic mining, is then studied in four chapters (Shape Matching for Foliage Database Retrieval; Similarity Learning For Motion Estimation; Active Learning for Relevance Feedback in Image Retrieval; Visual Data Mining Based on Partial Similarity Concepts). Thereafter, four schemes for semantic analysis are presented in four chapters (Image/Video Semantic Analysis by Semi-Supervised Learning; Content-Based Video Semantic Analysis; Semantic Mining for Green Production Systems; Intuitive Image Database Navigation by Hue-sphere Browsing). Multimedia resource annotation is also essential for a retrieval system, and four chapters provide interesting ideas (Hybrid Tagging and Browsing Approaches for Efficient Manual Image Annotation; Active Video Annotation: To Minimize Human Effort; Image Auto-Annotation by Search; Semantic Classification and Annotation of Images). The last part of this book presents other related topics for semantic mining (Association-Based Image Retrieval; Compressed-domain Image Retrieval based on Colour Visual Patterns; Multimedia Resource Discovery using Mobile Agent; Multimedia Data Indexing).

Author(s)/Editor(s) Biography

Dacheng Tao received the B.Eng. degree from the University of Science and Technology of China (USTC), the MPhil degree from the Chinese University of Hong Kong (CUHK), and the PhD degree from the University of London. Currently, he is an assistant professor with the Department of Computing at the Hong Kong Polytechnic University, a visiting professor at Xi'Dian University, and holds a visiting position at Birkbeck, University of London. His research interests include artificial intelligence, computer vision, data mining, geoinformatics, machine learning, multimedia, remote sensing, statistics, and visual surveillance. He has published more than 80 scientific articles in venues including IEEE TPAMI, TIP, TKDE, TMM, TCSVT, TSMC, CVPR, and ICDM, and ACM TKDD, Multimedia, and KDD, with best paper award and nominations. Previously he gained several Meritorious Awards from the International Interdisciplinary Contest in Modeling, which is the highest level mathematical modeling contest in the world, organized by COMAP. He is an associate editor of Neurocomputing (Elsevier) and of Computational Statistics & Data Analysis (Elsevier), the official journal of the International Association for Statistical Computing. He has authored or edited five books and seven special issues, including issues of CVIU, PR, PRL, SP, and Neurocomputing. He has (co-)chaired special sessions, invited sessions, workshops, and conferences. He has served around 60 major international conferences, including CVPR, ICCV, ECCV, ICDM, KDD, and Multimedia, and around 20 top international journals, including TPAMI, TOIS, TIP, TCSVT, TMM, TIFS, TSMC-B, Computer Vision and Image Understanding (CVIU), and Information Science.
Dong Xu is currently an assistant professor at Nanyang Technological University in Singapore. He received the B.Eng. and PhD degrees from the Electronic Engineering and Information Science Department, University of Science and Technology of China, in 2001 and 2005, respectively. During his PhD study, he worked at Microsoft Research Asia and The Chinese University of Hong Kong for more than two years. He also worked at Columbia University for one year as a postdoctoral research scientist. His research interests include computer vision, pattern recognition, statistical learning, and multimedia content analysis. He has published more than 20 papers in top venues including T-PAMI, T-IP, T-CSVT, T-SMC-B, and CVPR. He is an associate editor of Neurocomputing (Elsevier). He is a guest editor of three special issues on video and event analysis in IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), Computer Vision and Image Understanding (CVIU), and Pattern Recognition Letters (PRL), and a coauthor of a forthcoming book entitled "Semantic Mining Technologies for Multimedia Databases". He was awarded a Microsoft Fellowship in 2004.
Xuelong Li holds a permanent post at Birkbeck College, University of London, and visiting/guest professorships at Tianjin University and the University of Science and Technology of China. His research focuses on cognitive computing, image/video processing, pattern recognition, and multimedia. His research activities are partly sponsored by EPSRC, the British Council, the Royal Society, and the Chinese Academy of Sciences. He has over a hundred scientific publications with several Best Paper Awards and finalists. He is an author/editor of four books and an associate editor of IEEE Trans. on Image Processing, IEEE Trans. on Circuits and Systems for Video Technology, IEEE Trans. on Systems, Man and Cybernetics Part B, and IEEE Trans. on Systems, Man and Cybernetics Part C. He is also an associate editor (editorial board member) of ten other international journals and a guest co-editor of eight special issues. He has served as a chair of around twenty conferences and a program committee member for more than eighty conferences. He has been a reviewer for over a hundred journals and conferences, including eleven IEEE transactions. He is an academic committee member of the China Society of Image and Graphics, a senior member of the IEEE, the chair of the IEEE Systems, Man and Cybernetics Society Technical Committee on Cognitive Computing, and a member of several other technical committees of the IEEE Systems, Man and Cybernetics Society and of the IEEE Signal Processing Society Technical Committee on Machine Learning for Signal Processing (MLSP). He is a Chapters Coordinator of the IEEE Systems, Man and Cybernetics Society.