Multimodal Semantics and Affective Computing from Multimedia Content


Rajiv Ratn Shah, Debanjan Mahata, Vishal Choudhary, Rajiv Bajpai
Copyright: © 2018 | Pages: 24
DOI: 10.4018/978-1-5225-5246-8.ch014


Advances in technology and the growing popularity of social media websites have enabled people to view, create, and share user-generated content (UGC) on the web, resulting in a huge amount of UGC (e.g., photos, videos, and texts). Since such content reflects the ideas, opinions, and interests of users, it must be analyzed efficiently to provide personalized services, which in turn requires determining semantics and sentiment information from UGC. Such information helps in decision making, learning, and recommendation. This chapter builds on the intuition that semantics and sentiment information are exhibited by different representations of data, and it therefore demonstrates the effectiveness of multimodal techniques in semantics and affective computing. It describes several significant multimedia analytics problems, such as multimedia summarization, tag-relevance computation, multimedia recommendation, and facilitating e-learning, together with their solutions.
Chapter Preview


The advent of social media websites, advances in smartphones, and affordable network infrastructure have enabled anyone with an Internet connection and a smartphone to easily share their ideas, opinions, and content (e.g., photos, videos, and texts) with millions of other people around the world. Thus, the amount of user-generated content (UGC) on websites has increased rapidly in recent years. Emotions and sentiments play a crucial role in our everyday lives. They aid decision-making, learning, communication, and situation awareness in human-centric environments. Over the past two decades, researchers in artificial intelligence have been attempting to endow machines with cognitive capabilities to recognize, infer, interpret, and express emotions and sentiments. All such efforts can be attributed to affective computing, an interdisciplinary field spanning computer science, psychology, social sciences, and cognitive science. Sentiment analysis and emotion recognition have also become a new trend in social media, helping users understand the opinions being expressed on different platforms. Moreover, an interesting recent trend is that most social media websites, such as Flickr, YouTube, and Twitter, create opportunities for users to generate content instead of creating content themselves. Since many users spend a significant amount of time on such websites, companies are interested in sensing users' behaviors to provide personalized services and recommendations. Semantics and affective information computed from UGC are also very useful for efficient search, retrieval, and recommendation. For instance, they are useful in several significant social media analytics problems, such as tag recommendation and ranking for photos, music recommendation for photos and videos, and recommending items to users based on their behaviors on social media websites.
Thus, so that users and companies can benefit from an automatic semantic and affective understanding of UGC, this chapter focuses on developing efficient algorithms for semantics and affective computing.

Although knowledge structures derived from the semantics and sentiment computing of user-generated content benefit both users and companies through efficient search, retrieval, and recommendation, it is difficult to obtain correct semantics and sentiment information. This is because real-world UGC is complex, and extracting semantics and affective information from content alone is very difficult. Since suitable concepts for semantics and sentiment analysis are exhibited by different modalities, it is important to exploit the multimodal information of UGC (Shah, 2016c; Shah, 2016e; Shah and Zimmermann, 2017). For efficient semantics computing, these works leveraged both the content and the contextual information of user-generated content. Owing to the increasing popularity of social media websites and advances in technology, it is now possible to collect a significant amount of important contextual information (e.g., spatial, temporal, preference, and opinion information). Similarly, for efficient affective computing, they combined the textual modality with information from other modalities such as audio and visual. Earlier contributions to semantics and affective computing either work in a unimodal setting or leverage only limited information from other modalities. For instance, most work in semantics and sentiment computing does not leverage contextual, audio, visual, gaze, and other information together. This chapter describes many multimodal techniques that augment knowledge structures for efficient semantics and sentiment understanding.
