Top-k Relevant Term Suggestion Approach for Relational Keyword Search

Xiangfu Meng (Liaoning Technical University, China), Xiaoyan Zhang (Liaoning Technical University, China) and Chongchun Bi (Liaoning Technical University, China)
DOI: 10.4018/978-1-4666-8767-7.ch001


This chapter proposes a novel approach that, by analyzing the correlations between query keywords and database terms, provides a list of keywords semantically related to both the application domain and the given keywords. Each database term is first modeled as an attribute-value pair, such as <Author, Jeffrey>, and each query keyword is assumed to map to a database term. A coupling relationship measuring method is then proposed to measure both the intra- and inter-couplings of terms, which reflect the explicit and implicit relationships between terms in the database. Based on these coupling relationships, for a given keyword query, an ordering of all database terms is created for each query keyword, and the threshold algorithm (TA) is then used to generate the top-k semantically related terms efficiently. The experiments demonstrate that the proposed term coupling relationship measuring method can efficiently capture the semantic correlations between query keywords and database terms.
Chapter Preview

1. Introduction

Keyword query is becoming a very popular way to obtain information from relational databases, along with its widespread use on the Web. In real applications, however, most common Web database users have insufficient knowledge of the database content and schema, and they also lack keywords related to the search domain. Thus, it is not easy for them to find appropriate keywords to express their query intentions. To explore the database, a user may first issue a query with a few general keywords and then gradually refine the query by observing the query results. In each such iteration, the user needs to check every result to identify whether it is related to his or her interest, which is time-consuming and tedious work.

Consider a DBLP database consisting of 3 relations connected through primary-foreign-key relationships shown in Figure 1.

Figure 1. An example of DBLP database


Suppose a master's student who is an XML beginner knows only a few keywords about the XML research field and wants to find chapters about XML search techniques on the DBLP website. Based on the DBLP database, he/she would issue a query Q containing the keywords “XML, search”. On receiving the query Q, the traditional keyword search approach returns a set of minimal total joining networks of tuples (MTJNTs), each of which

  1. is obtained from a single relation or by joining several relations, and

  2. contains all the query keywords.
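
The second condition above can be illustrated with a minimal sketch. It assumes that a joined network is represented as a list of flattened tuple texts and that a keyword matches by case-insensitive substring search; both choices, the function name `covers_all_keywords`, and the example tuple texts are hypothetical and are not taken from the chapter or from Figure 1. The sketch checks only that all keywords are covered and does not test minimality.

```python
def covers_all_keywords(network, keywords):
    """Return True if every keyword occurs in at least one tuple of the network.

    `network` is a list of strings, one per joined tuple (hypothetical
    flattened tuple text); matching is case-insensitive substring search,
    which is an assumption made for this sketch.
    """
    texts = [t.lower() for t in network]
    return all(any(kw.lower() in t for t in texts) for kw in keywords)


# Made-up tuple texts for an author-paper join; not taken from Figure 1.
candidate = ["Jeffrey", "XML keyword search"]
partial = ["Jeffrey", "XML twig pattern matching"]

query = ["XML", "search"]
print(covers_all_keywords(candidate, query))  # True
print(covers_all_keywords(partial, query))    # False
```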

Since too many chapters in the DBLP dataset contain the keywords “XML” and “search”, the query results contain too many MTJNTs. In such a case, the user would like the system to suggest a list of keywords that are semantically related to Q in order to narrow the search scope. From Figure 1, it is clear that the author “Jeffrey” and the keywords “XPath”, “XQuery”, and “twig pattern” are very relevant to Q, which means that these terms can refine Q into a more selective query. For example, the user could execute a query Q’=[Jeffrey, XML, search] to retrieve only the chapters by author Jeffrey on XML searching; the query results are “a1 ⋈ w1 ⋈ p1” and “a1 ⋈ w2 ⋈ p4”. Additionally, the tuples p2 and p3, which contain “full-text”, “semi-structured data”, and “twig pattern”, are also related to the query Q. However, these tuples would not be returned by the system because the terms they contain are not specified explicitly in the user query. If the user is also interested in these topics, he/she can choose the keywords “full-text”, “semi-structured data”, and/or “twig pattern” to explore the database. Hence, it is necessary to provide a list of terms semantically related to the given query, so that the user can refine or reformulate his/her query according to the terms in the list.
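
One way to produce such a list, following the abstract's description of per-keyword term orderings combined by the threshold algorithm (TA), is sketched below. This is a generic Fagin-style TA with sum aggregation, not the chapter's own implementation. The function name `threshold_topk`, the choice of sum as the aggregation function, and all coupling scores are assumptions made purely for illustration.

```python
import heapq


def threshold_topk(sorted_lists, k):
    """Generic threshold algorithm (TA) sketch with sum aggregation.

    sorted_lists: one list per query keyword; each is a list of
    (term, score) pairs sorted by score in descending order.
    Returns the k terms with the highest summed score.
    """
    # Random-access tables: term -> score in each list (0.0 if absent).
    lookup = [dict(lst) for lst in sorted_lists]
    seen = {}  # term -> aggregated score over all lists
    depth = max(len(lst) for lst in sorted_lists)

    for i in range(depth):
        last_scores = []  # score read at depth i in each list
        for lst in sorted_lists:
            if i < len(lst):
                term, score = lst[i]
                last_scores.append(score)
                if term not in seen:
                    seen[term] = sum(t.get(term, 0.0) for t in lookup)
            else:
                # Exhausted list: unseen terms score 0.0 here.
                last_scores.append(0.0)

        # No unseen term can score higher than this bound.
        threshold = sum(last_scores)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top  # early termination

    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])


# Hypothetical coupling scores between each query keyword and database terms.
xml_list = [("XPath", 0.9), ("XQuery", 0.8), ("twig pattern", 0.6),
            ("Jeffrey", 0.5), ("full-text", 0.2)]
search_list = [("twig pattern", 0.8), ("full-text", 0.6), ("Jeffrey", 0.5),
               ("XPath", 0.4), ("XQuery", 0.3)]

suggested = [term for term, _ in threshold_topk([xml_list, search_list], k=2)]
print(suggested)  # ['twig pattern', 'XPath']
```

With these made-up scores the loop stops after reading three entries from each list, because the second-best aggregated score already meets the threshold. This early termination is what lets TA avoid scanning every term.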

Key Terms in this Chapter

Coupling Relationship: There exists a coupling relationship between objects a and b if object a (or b) has an influence on object b (or a), or if they interact with each other.

Web Database: An online database that can be accessed only through a web form-based interface.

Term: A combination of an attribute and its corresponding value contained in the database; for example, <Author, Jeffrey> is a term extracted from the database.

Query Keyword: A term or phrase contained in a user query.

Top-k Ranking: A set of k answer items with the highest ranking scores according to the given scoring function.

Precision: A measure of how good an answer set is, in terms of effectiveness and completeness, for a given keyword query.
