Hyperlink Structure of Electronic Commerce Websites

Rathimala Kannan (Multimedia University, Malaysia) and Kannan Ramakrishnan (Multimedia University, Malaysia)
DOI: 10.4018/978-1-4666-9787-4.ch056
The World Wide Web (WWW) is a complex and growing structure that exhibits strong patterns despite its dynamic nature and diversity. Systematic studies of the WWW give a macro-level picture of the Web. To understand the Web at the micro level, studies are needed at the level of individual Websites. Understanding Website structure is regarded as a reverse engineering process that attempts to discover automatically the layout and hyperlink pattern of a Website. This knowledge helps in improving the architecture of the Website layout and the organisation patterns of its Web pages. Web structure mining research has recently gained increasing attention in light of the ongoing growth of the WWW, particularly in the commerce domain. This chapter reviews existing studies on the applications of Web structure mining (WSM) and critically evaluates the methods and techniques applied to understand the hyperlink structure of e-commerce Websites.

Web Structure Mining (WSM) Research

WSM is the mining of hyperlinks between and within Websites to understand their overall structure and to extract patterns in the information architecture. Exploiting the hyperlink structure of the Web allows the identification of interesting information, such as the patterns that link Web pages and Websites, and the ranking and classification of Web pages based on hyperlinks (Lappas, 2007). Hyperlink structure analysis of the Web is used as a basis for obtaining search results (Espadas, Calero, & Piattini, 2008). Search engines use hyperlinks to identify more relevant Web pages by ranking them according to the number of links pointing to a Web page (Kosala & Blockeel, 2000). A well-known example is the PageRank algorithm of the commercial search engine Google, which uses hyperlinks to rank the Web pages returned for a specific keyword search (Brin & Page, 1998). A Web page that receives relatively more incoming links is ranked higher than other similar Web pages.
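The link-based ranking idea described above can be illustrated with a minimal sketch. It is not the implementation used by any search engine: it is a simplified textbook power iteration in the spirit of PageRank, alongside a plain inlink count. The graph representation (a dictionary from page to linked pages), the function names, the damping factor of 0.85, and the iteration count are all assumptions made for illustration.

```python
def inlink_counts(links):
    """Count incoming links per page, ignoring self-links and duplicates.

    `links` maps each page to the list of pages it links to
    (a hypothetical toy representation, not a crawler output format).
    """
    counts = {}
    for source, targets in links.items():
        counts.setdefault(source, 0)
        for target in set(targets):
            if target != source:
                counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration over a small link graph."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy graph with made-up page names: three pages link to "C".
toy_links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_links)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the page with the most incoming links also receives the highest score, which mirrors the statement above. On real Web graphs the two measures can diverge, because the iterative score also weighs where the links come from.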

Hyperlink structure has been studied and used in information retrieval algorithms to rank search results on the Web (Vaughan & You, 2005; L. Yan, Wei, Gui, & Chen, 2011). Becchetti, Castillo, Donato, Baeza-Yates, and Leonardi (2008) conducted a statistical analysis of Web page links and proposed spam detection techniques based only on the link structure of the Web page, irrespective of its contents. Some studies have examined the hyperlinks of commerce domain Websites and found that the number of links pointing to a commercial company's Website is correlated with the company's business information, such as revenue, profit and research expenses (Romero-Frias & Vaughan, 2010; Vaughan & Wu, 2004; Vaughan, 2004a). The findings of these studies suggest that hyperlinks to commercial Websites could be used as a business performance indicator. This knowledge is useful for competitive business intelligence and Web data mining.
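As a rough illustration of how such a relationship between inlink counts and a business measure might be tested, the sketch below computes a Spearman rank correlation in plain Python. The text above does not state which coefficient the cited studies used, so the choice of Spearman is an assumption, and the function names and the numbers in the usage lines are made-up placeholders, not data from any study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Placeholder values for four hypothetical companies (illustrative only).
external_inlinks = [120, 45, 300, 80]
revenue = [5.0, 1.2, 9.8, 2.0]
rho = spearman(external_inlinks, revenue)
```

A rank correlation is used in this sketch because link counts on the Web tend to be heavily skewed, which makes a rank-based measure less sensitive to a few very large sites. Any real analysis would also need a significance test and a far larger sample.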

Key Terms in this Chapter

External Outlink: A hyperlink that points from a Website to other Websites.

Webometric Study: The study of Web-based content with primarily quantitative methods for social science research goals using techniques that are not specific to one field of study.

Web Structure Mining (WSM): The mining of hyperlinks between and within Websites to understand their overall structure and to extract patterns in the information architecture.

Total External Inlinks: The total number of links pointing to a Website from other Websites, excluding the Website's self-links.

Total Number of Pages: The number of pages in a Website indexed by a popular search engine.

External Hyperlink Structure: The structure formed by links that point to the target Website from other Websites, or from the target Website to other Websites.

Total Inlinks: The number of hyperlinks pointing to a Website, including the Website's self (internal) links.

External Inlink: Hyperlink that points to a Website from other Websites.

Website Visibility: Visibility is defined as the extent to which a user is likely to come across a reference to a company’s Website in his or her online or offline environment.
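The link-count terms above can be made concrete with a small sketch that derives them from a list of (source URL, destination URL) pairs. It assumes that a "Website" can be identified by its hostname with a leading "www." removed, which is a simplification, and that the link pairs have already been collected. The function names and the example.* URLs are hypothetical.

```python
from urllib.parse import urlsplit


def site_of(url):
    """Return the hostname of a URL, without a leading 'www.' (an assumption)."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def link_metrics(edges, target_site):
    """Count total inlinks, external inlinks and external outlinks for one site.

    `edges` is an iterable of (source_url, destination_url) pairs.
    """
    total_inlinks = 0
    external_inlinks = 0
    external_outlinks = 0
    for source_url, destination_url in edges:
        source = site_of(source_url)
        destination = site_of(destination_url)
        if destination == target_site:
            # Total inlinks include the site's own internal (self) links.
            total_inlinks += 1
            if source != target_site:
                external_inlinks += 1
        elif source == target_site:
            # A link from the target site to some other site.
            external_outlinks += 1
    return {
        "total_inlinks": total_inlinks,
        "external_inlinks": external_inlinks,
        "external_outlinks": external_outlinks,
    }


# Made-up link pairs for a hypothetical shop site.
example_edges = [
    ("http://shop.example/a", "http://shop.example/b"),
    ("http://blog.example/post", "http://shop.example/"),
    ("http://www.shop.example/b", "http://pay.example/checkout"),
    ("http://news.example/x", "http://www.shop.example/a"),
    ("http://blog.example/post", "http://news.example/x"),
]
metrics = link_metrics(example_edges, "shop.example")
```

The difference between the total inlink count and the external inlink count is the number of internal links, which is why the two terms are defined separately.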

