Recommending Related Microblogs

Lin Li, Huifan Xiao, Guandong Xu
DOI: 10.4018/978-1-4666-2806-9.ch013

Abstract

Computing similarity between short microblogs is an important step in microblog recommendation. In this chapter, the authors use three approaches (a traditional term-based approach, a WordNet-based semantic approach, and a topic-based approach) to compute similarities between microblogs and recommend the top related ones to users. They conduct an experimental study of the effectiveness of the three approaches in terms of precision. On a dataset of 548 tweets, the WordNet-based semantic similarity approach achieves relatively higher precision than the traditional term-based approach, and the topic-based approach performs worst. In addition, the authors calculate the Kendall tau distance between the two lists generated by each pair of approaches (WordNet, term, and topic). Averaged over all 548 pairs of lists, this distance shows that the WordNet-based and term-based approaches generally agree closely in their ranking of related tweets, while the topic-based approach disagrees relatively strongly with the WordNet-based approach.
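
To make the ranking comparison concrete, the following is a minimal Python sketch of the Kendall tau distance between two ranked lists. It assumes both lists rank exactly the same set of tweets; the abstract does not say how the authors handle lists with differing items, so this is an illustration under that assumption rather than their implementation. The function names and tweet identifiers are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count the item pairs that the two rankings place in opposite order.

    Assumption: both rankings contain exactly the same items, each once.
    """
    if set(ranking_a) != set(ranking_b) or len(ranking_a) != len(ranking_b):
        raise ValueError("both rankings must contain the same items")
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    discordant = 0
    for x, y in combinations(ranking_a, 2):
        # A pair is discordant when the two rankings disagree on its order.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            discordant += 1
    return discordant


def normalized_kendall_tau_distance(ranking_a, ranking_b):
    """Scale the distance to [0, 1] by dividing by the number of item pairs."""
    n = len(ranking_a)
    total_pairs = n * (n - 1) // 2
    if total_pairs == 0:
        return 0.0
    return kendall_tau_distance(ranking_a, ranking_b) / total_pairs
```

A distance of 0 means the two approaches rank the related tweets identically, and a normalized distance of 1 means one ranking is the exact reverse of the other. Averaging the value over all query tweets gives one agreement score per pair of approaches.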

Background

Many Twitter-related problems have been investigated in the literature. Kwak et al. (2010) made the first quantitative study of the entire Twittersphere and of information diffusion on it. They studied the topological characteristics of Twitter and its power as a new medium of information sharing, and found a non-power-law follower distribution, a short effective diameter, and low reciprocity, all of which mark a deviation from known characteristics of human social networks. Yin et al. (2011) analyzed link formation in microblogs. They found that 90 percent of new links are to people just two hops away and that the dynamics of new link creation are affected by the user's account age. Their experimental results showed that users add many friends at the very beginning (within 100 days), that the friend lists of older users (100-400 days) appear more stable, and that for much older users (more than 500 days) the number of new friends keeps growing. The results also showed that the older the user, the larger the increase in followers.

On the other hand, the computation of short text similarity has also been studied extensively from various points of view. Many techniques have been proposed to overcome the vocabulary mismatch problem, including stemming (Krovetz, 1993; Porter, 1980), LSI (Deerwester, Dumais, Landauer, Furnas, & Harshman, 1990), translation models (Berger & Lafferty, 1999), and query expansion (Zhai & Lafferty, 2001; Lavrenko & Croft, 2001). Query expansion is a common technique used to convert an initial, typically short, query into a richer representation of the information need (Zhai & Lafferty, 2001; Lavrenko & Croft, 2001; Rocchio, 1971). This is accomplished by adding terms that are likely to appear in relevant or pseudo-relevant documents to the original query representation. Sahami and Heilman proposed a method of enriching short text representations that can be construed as a form of query expansion (Sahami & Heilman, 2006). Their method expands short segments of text using Web search results. The similarity between two short segments of text can then be computed in the expanded representation space.
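
The idea of comparing short texts in an expanded representation space can be sketched in Python as follows. This is a simplified illustration, not the method of Sahami and Heilman: it uses raw term counts and cosine similarity, and a small in-memory dictionary (`TOY_SNIPPETS`, a hypothetical stand-in) replaces real Web search results. All names in the block are assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for Web search results: short text -> result snippets.
TOY_SNIPPETS = {
    "ai": [
        "artificial intelligence research",
        "machine learning and artificial intelligence",
    ],
    "artificial intelligence": [
        "artificial intelligence research and machine learning",
    ],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand(text, snippets=TOY_SNIPPETS):
    """Build a term-count vector from the text plus its retrieved snippets."""
    counts = Counter(tokenize(text))
    for snippet in snippets.get(text.lower(), []):
        counts.update(tokenize(snippet))
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def expanded_similarity(text_a, text_b, snippets=TOY_SNIPPETS):
    """Compare two short texts in the expanded representation space."""
    return cosine(expand(text_a, snippets), expand(text_b, snippets))
```

With the toy data, "AI" and "artificial intelligence" share no terms, so their plain term-based cosine similarity is zero, yet their expanded vectors overlap heavily. This is the vocabulary mismatch problem that expansion is meant to address.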
