Enhancement of TOPSIS for Evaluating the Web-Sources to Select as External Source for Web-Warehousing

Hariom Sharan Sinha (Jawaharlal Nehru University, New Delhi, India)
Copyright: © 2018 | Pages: 14
DOI: 10.4018/IJRSDA.2018010108

Abstract

This paper is concerned with evaluating web-sources in order to select external sources for web-warehousing. Candidate web-sources are evaluated on the basis of their multiple features, for which a Multi-Criteria Decision Making (MCDM) approach is used. Among the MCDM approaches, the paper focuses on the “Technique for Order Preference by Similarity to Ideal Solution” (TOPSIS) and proposes an enhancement to this method. The traditional TOPSIS approach uses Euclidean distance to measure similarity. Here, Jeffrey divergence is proposed in place of Euclidean distance as a similarity measure that includes asymmetric and symmetric distances in its computation. Both variants of the TOPSIS approach were analyzed experimentally, and the results show an improvement in the selection of web-sources.

Introduction

A web-warehouse combines two technologies, viz. web technology and data warehouse technology (Tan et al., 2003; Ng et al., 1998). More comprehensively, web-warehousing is an approach to developing a system whose primary objective is to identify, catalog, retrieve, store and analyze data available as text, graphics, images, sound, video and other multimedia forms, with the help of web technologies, so that users can find and analyze information effectively (Martinez, 2008; Tan et al., 2003). Although the Internet is a prominent source of information and an open platform for sharing and retrieving data, the data available on the web is not properly structured, so a traditional data warehouse cannot accept it. Data analytics, however, requires more data for decision support systems, which compels the traditional data warehouse to evolve into a web-warehouse (Zhu & Buchmann, 2002; Ng et al., 1998).

To design a web-warehouse, the architect has to tackle many challenges that arise from the strict nature of the warehouse (Inmon, 2009; Ponniah, 2001) and the open nature of the web. Web data is dynamic and complex, and millions of web-sources are available. Finding relevant and consistent data on the web is therefore like searching for a needle in a haystack. Thus, the very first task in web-warehousing is to ascertain which web-sources are relevant as external data sources for the warehouse. To ascertain relevance, the web-sources are evaluated on the basis of various features, which have been classified into three categories, viz. web source stability, web data quality and contextual issues of web data (Zhu & Buchmann, 2002). MCDM (Velasquez & Hester, 2013; Triantaphyllou et al., 1998) is an approach for finding the best among a set of alternatives using multiple features. Before MCDM is explained in detail, the rest of this section describes the three feature categories further.

The first category of features reflects the fact that, in addition to the sheer number of web-sources available, web data changes frequently and a large number of new web-sources are continually added to the web. Thus, a web-source that is available at present may change or disappear (Zhu & Buchmann, 2002).

The second category of features concerns the quality of web data. Because the web is an open and independent platform, a large amount of the data on it is not properly checked before being made available. Consequently, inconsistent, ill-structured, incomplete and wrong data is often found on the web (Zhu & Buchmann, 2002).

The third category of features concerns the context of the data, since data available on the web is browsing-centric rather than analytics-centric. The context of the data poses challenges not only in terms of the relevance of the data for warehousing, but also in terms of the ease of extracting the data and its metadata, such as data definitions and data derivations (Zhu & Buchmann, 2002).

All these features are taken into account when evaluating web-sources in the design of an efficient web-warehouse. Zhu and Buchmann proposed these features as evaluation parameters and used an MCDM approach to evaluate web-sources (Zhu & Buchmann, 2002). Among the MCDM approaches, this paper focuses on TOPSIS, and in particular on enhancing TOPSIS (Velasquez & Hester, 2013; Triantaphyllou et al., 1998; Zionts, 1979) by replacing the Euclidean distance measure with the Jeffrey divergence measure (Ullah, 1996; Johnson & Sinanovic, 2001; Cha, 2007; Bayarri & García‐Donato, 2008).
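
To make the idea of swapping the distance measure concrete, the following is a minimal sketch of TOPSIS with a pluggable distance function. It is not the paper's implementation, which is not shown in this preview. It assumes the common textbook formulation of TOPSIS (vector normalization, weighting, ideal and anti-ideal points, relative closeness) and the form of Jeffrey divergence listed in Cha (2007), sum over i of (p_i - q_i) ln(p_i / q_i). The function names, the `eps` guard, the weights and the three example web-sources are all hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance, as used in traditional TOPSIS."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def jeffrey(p, q, eps=1e-12):
    """Jeffrey divergence in the form sum((p_i - q_i) * ln(p_i / q_i)).

    Assumption: this is the form listed in Cha (2007). eps is a
    hypothetical guard against log(0); inputs must be non-negative.
    """
    return sum((a - b) * math.log((a + eps) / (b + eps)) for a, b in zip(p, q))


def topsis(matrix, weights, benefit, distance=euclidean):
    """Return a closeness score in [0, 1] for each alternative (row).

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True where larger is better, False where smaller is better
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_crit = len(weights)
    # Step 1: vector normalization of each column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    # Step 2: weighted normalized decision matrix.
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Step 3: ideal and anti-ideal points per criterion.
    cols = list(zip(*v))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    # Steps 4 and 5: distances to both points, then relative closeness.
    scores = []
    for row in v:
        d_best = distance(row, best)
        d_worst = distance(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical ratings of three web-sources on stability, data quality
# and contextual fit (all treated as benefit criteria).
sources = [[7.0, 9.0, 6.0], [8.0, 6.0, 8.0], [5.0, 7.0, 9.0]]
weights = [0.4, 0.35, 0.25]
benefit = [True, True, True]

print("Euclidean:", topsis(sources, weights, benefit, euclidean))
print("Jeffrey:  ", topsis(sources, weights, benefit, jeffrey))
```

Passing the distance as a parameter keeps the other TOPSIS steps identical, so any difference in the resulting ranking can be attributed to the choice of measure alone.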
