Multi-View Meets Average Linkage: Exploring the Role of Metadata in Document Clustering

Divya Teja Ravoori, Zhengxin Chen
Copyright: © 2015 | Volume: 5 | Issue: 2 | Pages: 17
ISSN: 2155-6377 | EISSN: 2155-6385 | EISBN13: 9781466679221 | DOI: 10.4018/IJIRR.2015040102
MLA

Ravoori, Divya Teja, and Zhengxin Chen. "Multi-View Meets Average Linkage: Exploring the Role of Metadata in Document Clustering." IJIRR vol.5, no.2 2015: pp.26-42. http://doi.org/10.4018/IJIRR.2015040102

APA

Ravoori, D. T. & Chen, Z. (2015). Multi-View Meets Average Linkage: Exploring the Role of Metadata in Document Clustering. International Journal of Information Retrieval Research (IJIRR), 5(2), 26-42. http://doi.org/10.4018/IJIRR.2015040102

Chicago

Ravoori, Divya Teja, and Zhengxin Chen. "Multi-View Meets Average Linkage: Exploring the Role of Metadata in Document Clustering," International Journal of Information Retrieval Research (IJIRR) 5, no.2: 26-42. http://doi.org/10.4018/IJIRR.2015040102

Abstract

Inspired by the success of the recently developed MVSC-IR algorithm, the authors embed the Multi-Viewpoint Based Similarity Measure for Clustering (MVSC) into a hierarchical method, average linkage clustering, to overcome the problem of initialization with random seeds. The result is a new algorithm, referred to as MVSC-HAC. The improved performance of this new algorithm encouraged the authors to further explore the impact of metadata in document clustering. In this paper, after reviewing two existing algorithms, the authors describe their new algorithm and present experimental results on data sets of various sizes at two levels: one using the entire context of the documents and one using the documents' existing meta tags. The results show that MVSC-HAC excels at both levels. The authors analyze the results and provide a discussion, based on other observations, of the role of metadata in document clustering.
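
The general idea of pairing a multi-viewpoint similarity with average linkage can be illustrated with a minimal sketch. This is not the authors' MVSC-HAC implementation, which the abstract does not specify. It assumes one simplified formulation of multi-viewpoint similarity, in which the similarity of two documents is the average dot product of their difference vectors taken from every other document as a viewpoint. The function names `mvs` and `average_linkage` and the toy vectors are hypothetical. Because agglomerative merging starts from singleton clusters, no random seeds are involved.

```python
from itertools import combinations


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mvs(docs, i, j):
    """Simplified multi-viewpoint similarity (an assumption, not the paper's exact measure).

    Averages (d_i - d_h) . (d_j - d_h) over every other document d_h.
    """
    viewpoints = [h for h in range(len(docs)) if h != i and h != j]
    if not viewpoints:
        # With only two documents there is no third viewpoint; fall back to the dot product.
        return dot(docs[i], docs[j])
    total = 0.0
    for h in viewpoints:
        diff_i = [a - b for a, b in zip(docs[i], docs[h])]
        diff_j = [a - b for a, b in zip(docs[j], docs[h])]
        total += dot(diff_i, diff_j)
    return total / len(viewpoints)


def average_linkage(docs, k):
    """Agglomerative clustering into k clusters using average pairwise similarity."""
    n = len(docs)
    # Pairwise similarities are computed once, up front, for this sketch.
    sim = {(i, j): mvs(docs, i, j) for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]  # deterministic start: one cluster per document

    def link(a, b):
        # Average similarity over all cross-cluster document pairs.
        total = sum(sim[(min(i, j), max(i, j))] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Merge the pair of clusters with the highest average similarity.
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: link(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return [sorted(c) for c in clusters]


# Toy term-weight vectors: documents 0 and 1 share one topic, 2 and 3 another.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
result = average_linkage(docs, 2)
```

On this toy input the two topical pairs are the most similar under the simplified measure, so they are merged first and the procedure stops at two clusters. Running it twice gives the same grouping, since nothing depends on a random starting point.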
