Argo: Intelligent Advertising Made Possible from Images

Xin-Jing Wang (Microsoft Research Asia, China), Mo Yu (Harbin Institute of Technology, China), Lei Zhang (Microsoft Research Asia, China) and Wei-Ying Ma (Microsoft Research Asia, China)
Copyright: © 2011 | Pages: 17
DOI: 10.4018/978-1-60960-189-8.ch005


In this chapter, we introduce the Argo system, which delivers intelligent advertising based on user-generated photos. Building on the intuition that user-generated photos imply user interests, which are the key to profitable targeted ads, Argo attempts to learn a user’s profile from his or her shared photos and suggests relevant ads accordingly. To learn a user’s interests, an offline step constructs a hierarchical and efficient topic space based on the ODP ontology; this space is later used to bridge the vocabulary gap between ads and photos and to reduce the effect of noisy photo tags. The online stage of Argo consists of three steps: 1) understanding the content and semantics of a user’s photos and auto-tagging each photo to supplement user-submitted tags (which may not be available); 2) learning the user’s interests from a set of photos based on the learnt hierarchical topic space; and 3) representing ads in the topic space and matching their topic distributions against the target user’s interests; the top-ranked ads are output as the suggested ads. Two key challenges are tackled in the process: 1) the semantic gap between low-level visual image features and high-level user semantics; and 2) the vocabulary impedance between photos and ads. We conducted a series of experiments with real Flickr users and products (as candidate ads), which show the effectiveness of the proposed approach.
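
Steps 2 and 3 of the online stage can be illustrated with a minimal sketch. It is not the Argo implementation: the three-topic space, the tag-to-topic weights, the averaging of per-photo vectors and the cosine similarity measure are all assumptions made for illustration, and the names `TOPICS`, `TAG_TOPICS`, `topic_vector`, `user_interest` and `rank_ads` are hypothetical. The sketch only shows how photo tags and ad words that share no vocabulary can still be matched once both are projected into a common topic space.

```python
from math import sqrt

# Hypothetical toy topic space. The chapter derives its hierarchy from the
# ODP ontology; these three topics are illustrative stand-ins.
TOPICS = ["Recreation/Outdoors", "Recreation/Pets", "Shopping/Photography"]

# Hypothetical word-to-topic weights (assumed, not taken from the chapter).
TAG_TOPICS = {
    "hiking": {"Recreation/Outdoors": 1.0},
    "mountain": {"Recreation/Outdoors": 1.0},
    "tent": {"Recreation/Outdoors": 1.0},
    "puppy": {"Recreation/Pets": 1.0},
    "leash": {"Recreation/Pets": 1.0},
    "tripod": {"Shopping/Photography": 1.0},
}


def topic_vector(words):
    """Map a bag of words to a normalised distribution over TOPICS.

    Unknown words (typos, personal vocabulary) contribute nothing.
    """
    totals = {topic: 0.0 for topic in TOPICS}
    for word in words:
        for topic, weight in TAG_TOPICS.get(word, {}).items():
            totals[topic] += weight
    mass = sum(totals.values())
    return [totals[t] / mass if mass else 0.0 for t in TOPICS]


def user_interest(photos):
    """Average the topic vectors of a user's photos (each a list of tags)."""
    vectors = [topic_vector(tags) for tags in photos]
    count = len(vectors) or 1
    return [sum(v[i] for v in vectors) / count for i in range(len(TOPICS))]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_ads(photos, ads, top_k=2):
    """Return the names of the top_k ads closest to the user's interest."""
    interest = user_interest(photos)
    scored = [(cosine(interest, topic_vector(words)), name)
              for name, words in ads.items()]
    # Highest score first; ties are broken alphabetically by ad name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]
```

In this toy setting, a user whose photos are tagged "hiking" and "mountain" is matched to an ad described by the word "tent", even though the tag and ad vocabularies do not overlap, because both map to the same assumed topic.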
1. Introduction

Due to the popularity of digital cameras and mobile phones, billions of user-generated images have been, and continue to be, uploaded to the Web. Photo sharing has become an everyday activity. Flickr, for example, claimed to host more than 3 billion images as of November 2008. The availability of this huge amount of user-generated content (UGC) has motivated much interesting data mining research and many applications (Crandall, Backstrom, et al., 2009; Plangprasopchok & Lerman, 2009).

Although photos are a valuable medium for connecting people, how to monetize user-generated photos is still a seldom-addressed question, which leaves photos and photo-sharing websites an untapped gold mine. Li et al. (2008) proposed a method to deliver online ads along with Web images by utilizing the download time of those images, particularly when an image has a large file size and network bandwidth is limited. The focus of that work is an innovative way to embed ads into images non-intrusively and in a visually pleasing manner, but it does not address the relevance problem of advertising. Hua et al. (2008), on the other hand, proposed a so-called image Adsense approach. However, their main focus is the intelligent placement of ads: identifying an unimportant region of an image and displaying an ad photo or logo in that region. In their approach, relevant ads are identified simply by matching the ad descriptions against the text surrounding the host image.

Motivated by the huge potential of monetizing user-generated photos on the Web, we attempt to develop an effective approach to intelligent advertising. The contribution of this chapter is twofold. Firstly, it is the first work that takes the visual content of images into account and attempts to suggest relevant ads based on both image content analysis and tag mining. Many Web photos are known to have no user-submitted tags or surrounding text, so text-based ad selection (Hua, Mei, & Li, 2008) can handle only a very small portion of Web photos. Although image auto-annotation has been studied for decades, image understanding for advertising poses many new challenges (Li, Zhang, & Ma, 2008). For example, even if users have tagged their photos, many tags are not ready for use: not only are there many typos and unparsed phrases, which are typical of UGC data, but users also prefer personal vocabulary to describe the same main concepts, which makes the tags sparse. Moreover, the vocabulary of user-submitted tags differs from the vocabulary of ads, a problem typically called vocabulary impedance (Ribeiro-Neto, Cristo, et al., 2005), so the recall of ad matching tends to be very low. Secondly, it suggests targeted ads based on user interest modeling from image content features, rather than directly on content or context analysis (Hua, Mei, & Li, 2008; Murdock, Ciaramita, & Plachouras, 2007; Lacerda, Cristo, et al., 2006).

In fact, user-generated photos reveal user interests either explicitly or implicitly. People are generally unwilling to share their interests publicly by typing keywords or filling in forms; however, the photos they take reveal those interests, because people tend to capture objects or scenes that attract them. Figure 1 shows screenshots of parts of the photo galleries of three Flickr users, which illustrate that different users have very diverse interests.

Figure 1. Photos shared by three Flickr users, showing that different users have different interests
