Constructing Temporal Equivalence Partitionings for Keyword Sets

Parvathi Chundi (Department of Computer Science, Peter Kiewit Institute, University of Nebraska-Omaha, Omaha, NE, USA) and Mahadevan Subramaniam (Department of Computer Science, Peter Kiewit Institute, University of Nebraska-Omaha, Omaha, NE, USA)
Copyright: © 2015 | Pages: 18
DOI: 10.4018/ijkbo.2015070101

Abstract

Identifying keyword associations from text and search sources facilitates many tasks, such as understanding relationships among concepts, extracting relevant documents, matching advertisements to web pages, and expanding user queries. However, these keyword associations change continually with time. In this paper, the authors define an equivalence relationship among keywords and develop methods to construct a temporal view of that relationship by constructing optimal temporal equivalence partitionings for keyword sets. They describe efficient algorithms to construct an optimal temporal equivalence partitioning for a keyword pair. They then use the fact that the equivalence relationship is transitive to extend these algorithms to obtain an optimal temporal equivalence partitioning for a larger set of keywords. The authors show the effectiveness of the approach by constructing the temporal equivalence partitionings of several sets of keywords from the Multi-Domain Sentiment data set and the ICWS2009 Spinn3r data set.
Article Preview

Introduction

Keyword search is by far the most prevalent method for finding relevant information on the Web today. Users express their information need as a simple list of keywords. Documents containing one or more of the search keywords or related keywords are then presented as the result of the search. This simple and popular method is adequate in most cases. However, the information sources on the Web are constantly evolving: new information is continually added and old information is updated. This flux makes it challenging for users to choose appropriate keywords when searching for relevant documents.

Almost all of the content available today is time-stamped. These time stamps provide a way to temporally relate information extracted from the content. In this paper, we describe a novel approach to extract evolving temporal relationships among keywords from a time-stamped document set. We focus on one particular relation among keywords, which we call equivalence. Intuitively, the keywords in a set are equivalent if they appear in the same context in the document set. We then capture the temporal changes in the equivalence relationship of a set of keywords by partitioning the time period of the given document set into a sequence of maximum-length intervals such that, in each interval, the equivalence relationship is preserved.

Our approach analyzes a set of documents D published over a time period T to extract equivalence among a given set of keywords. We assume that T is represented as a list of time points of some base granularity (day, month, etc.). Each document in D is assigned to one of the time points in T based on its time stamp. Each document is represented as a set of keywords. We use the frequent item set computation (Agrawal and Srikant, 1994) to compute the sets of keywords that are frequent in a document set. The context of a keyword a in a document set is the set of all frequent keyword sets that contain the keyword a. We first define the equivalence of two keywords a and b as follows: a and b are equivalent in a document set D if the contexts of a and b in D are the same when a is substituted by b or vice versa. In that case, we refer to the contexts of a and b as synonymous contexts. Keywords a and b are temporally equivalent in a time point (or an interval) if a and b are equivalent in the document set of that time point (or interval). The notion of equivalence extends naturally to an arbitrary set of keywords G (|G| > 2) as follows: the keywords in G are equivalent in D if every pair of keywords in G is equivalent in D.
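The definitions of context and pairwise equivalence above can be illustrated with a small Python sketch. It is not the authors' implementation. It rests on several assumptions that the preview does not confirm: frequent keyword sets are found by brute-force enumeration rather than the Apriori procedure, "frequent" means appearing in at least `min_support` documents, substitution is read as swapping a and b inside each set, and two keywords with empty contexts are not treated as equivalent. All function names and the sample documents are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(docs, min_support, max_size=3):
    """Brute-force stand-in for frequent item set mining (assumption:
    a set is frequent if at least min_support documents contain it)."""
    vocab = sorted(set().union(*docs))
    result = set()
    for k in range(1, max_size + 1):
        for cand in combinations(vocab, k):
            s = frozenset(cand)
            if sum(1 for d in docs if s <= d) >= min_support:
                result.add(s)
    return result

def context(keyword, itemsets):
    """Context of a keyword: all frequent keyword sets containing it."""
    return {s for s in itemsets if keyword in s}

def swap(itemset, a, b):
    """Replace a by b and b by a inside one keyword set (assumed reading
    of 'a is substituted by b or vice versa')."""
    return frozenset(b if k == a else a if k == b else k for k in itemset)

def equivalent(a, b, docs, min_support, max_size=3):
    """True if the context of a, after swapping a and b, equals the
    context of b. Empty contexts are treated as not equivalent (assumption)."""
    fs = frequent_itemsets(docs, min_support, max_size)
    ctx_a, ctx_b = context(a, fs), context(b, fs)
    return bool(ctx_a) and {swap(s, a, b) for s in ctx_a} == ctx_b

# Hypothetical document set: each document is a set of keywords.
docs = [
    {"laptop", "battery", "price"},
    {"notebook", "battery", "price"},
    {"laptop", "battery"},
    {"notebook", "battery"},
    {"laptop", "price"},
    {"notebook", "price"},
]

print(equivalent("laptop", "notebook", docs, min_support=2))  # True
print(equivalent("laptop", "battery", docs, min_support=2))   # False
```

In this toy data, "laptop" and "notebook" each co-occur frequently with "battery" and with "price", so their contexts match after the swap. "battery" also co-occurs frequently with "notebook", which has no counterpart in the swapped context of "laptop", so that pair is not equivalent.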

To identify temporal changes in the equivalence of keywords, we introduce the notions of equivalence preserving and non-equivalence preserving intervals. Informally, an interval Ti is equivalence preserving for a keyword pair a and b if a and b are temporally equivalent everywhere in Ti with the same synonymous contexts. An equivalence preserving interval Ti is maximal if no other equivalence preserving interval Tj properly contains Ti. An interval Tk is non-equivalence preserving for a and b if the contexts of a and b are unchanged in Tk and a and b are not equivalent anywhere in Tk. A maximal non-equivalence preserving interval can be defined similarly to a maximal equivalence preserving interval. Equivalence and non-equivalence preserving intervals for the set of keywords G can be defined similarly.
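As a rough illustration of maximality, the sketch below groups consecutive time points that carry the same label into the longest possible runs. It is a simplification and not the paper's optimal partitioning algorithm, which the preview does not show. It assumes that each time point has already been assigned a hashable label summarizing whether a and b are equivalent there and with which contexts, and it checks only individual time points, not every sub-interval. The function name and the labels are hypothetical.

```python
from itertools import groupby

def maximal_intervals(labels):
    """Group consecutive time points with identical labels into maximal runs.

    labels: one hashable label per time point, in temporal order
            (assumed to encode equivalence status plus contexts).
    Returns a list of (start_index, end_index, label), indices inclusive.
    """
    intervals = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        intervals.append((start, start + length - 1, label))
        start += length
    return intervals

# Hypothetical labels for six time points.
labels = ["eq:A", "eq:A", "neq:B", "neq:B", "neq:B", "eq:A"]
print(maximal_intervals(labels))
# [(0, 1, 'eq:A'), (2, 4, 'neq:B'), (5, 5, 'eq:A')]
```

Each run cannot be extended on either side without changing its label, which mirrors the requirement that no larger preserving interval properly contains a maximal one.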
