An Approach to Trie Based Keyword Search for Search Engines

An Approach to Trie Based Keyword Search for Search Engines

Pranav Murali (Vivekanand Education Society's Institute of Technology (VESIT), Maharashtra, India)
Copyright: © 2017 |Pages: 16
DOI: 10.4018/IJLIS.2017010101

Abstract

Search Engines use indexing techniques to minimize the time taken to find the relevant information to a search query. They maintain a keywords list that may reside either in the memory or in the external storage, like a hard disk. While a pure binary search can be used for this purpose, it suffers from performance issue when keywords are stored in the external storage. Some implementations of search engines use a B-tree and sparse indexes to reduce access time. This paper aims at reducing the keyword access time further. It presents a keyword search technique that utilizes a combination of trie data structure and a new keyword prefixing method. Experimental results show good improvement in performance over pure binary search. The merits of incorporating trie based approach into contemporary indexing methods is also discussed. Keyword prefixing method is described and some salient steps in the process of keyword generation are outlined.
Article Preview

Scope Of The Work

This paper presents a technique for building a search engine using a combination of trie data structure and a unique ‘keyword prefixing’ method. The primary aim of the paper is to reduce keyword access times. It is shown that a trie based approach in conjunction with binary search has a better performance over pure binary search on a lexicographically sorted list of keywords. The paper also discusses the merits of using trie based approach in conjunction with B-Tree based indexing methods that are commonly used in today’s search engines. The method of keyword prefixing is described and the process of keyword list generation is outlined.

Before presenting the technique, some important characteristics of user search and the search keywords are described under this section, as they are central to the search technique developed here.

Search Characteristic

Although we as users use general purpose search engines for day to day information needs, most of our queries are confined to a specific domain of knowledge. When a user sends a query to the search engine, it is almost always about a particular subject. For example, it could be about medicine or about computers or a recent event or something to do with finance or politics. It rarely cuts across several domains. In other words, a single query is domain-specific most of the time.

A new prefixing method that will be described shortly makes use of this property to achieve better search performance.

Complete Article List

Search this Journal:
Reset
Open Access Articles: Forthcoming
Volume 8: 2 Issues (2019): Forthcoming, Available for Pre-Order
Volume 7: 2 Issues (2018): 1 Released, 1 Forthcoming
Volume 6: 2 Issues (2017)
View Complete Journal Contents Listing