Object Grouping and Replication on a Distributed Web Server System

Amjad Mahmood (University of Bahrain, Kingdom of Bahrain) and Taher S.K. Homeed (University of Bahrain, Kingdom of Bahrain)
DOI: 10.4018/978-1-60566-418-7.ch011

Abstract

Object replication is a well-known technique for improving the performance of a distributed Web server system. This paper first presents an algorithm to group correlated Web objects that are most likely to be requested by a given client in a single session so that they can be replicated together, preferably on the same server. A centralized object replication algorithm is then proposed to replicate the object groups across a cluster of Web servers in order to minimize the user-perceived latency subject to certain constraints. Because of the dynamic nature of Web content and users’ access patterns, a distributed object replication algorithm is also proposed, in which each site locally replicates the object groups based on its local access patterns. The performance of the proposed algorithms is compared with that of three well-known algorithms, and the results demonstrate the superiority of the proposed algorithms.

Introduction

The phenomenal growth of the World Wide Web (Web) has brought about a huge increase in traffic to popular Web sites. This traffic occasionally reaches the limits of the sites’ capacity, causing servers to be overloaded (Chen, Mohapatra, & Chen, 2001). As a result, end users experience either poor response times or denial of service (time-out errors) while accessing these sites. Since these sites have a competitive motivation to offer better service to their clients, system administrators are constantly faced with the need to scale up site capacity. There are generally two approaches to achieving this (Zhuo, Wang, & Lau, 2003). The first approach, generally referred to as hardware scale-up, is the use of powerful servers with advanced hardware support and optimized server software. While hardware scale-up relieves short-term pressure, it is neither cost-effective nor a long-term solution, considering the steep growth in client demand. The issues of scalability and performance may therefore persist as user demand keeps increasing.

The second approach, which is more flexible and sustainable, is to use a distributed Web-server system (DWS). A DWS is not only cost-effective and more robust against hardware failure, but it is also easily scalable: increased traffic can be met by adding servers when required. In such systems, an object (a Web page, a file, etc.) is requested by various geographically distributed clients. Because a DWS is spread over a MAN or WAN, moving documents between server nodes is an expensive operation (Zhuo, Wang, & Lau, 2003). Maintaining multiple copies of objects at various locations in a DWS is one approach to improving system performance measures such as latency, throughput, availability, hop count, link cost, and delay (Kalpakis, Dasgupta, & Wolfson, 2001; Zhuo, Wang, & Lau, 2003).

There are two techniques for maintaining multiple copies of an object: caching and replication. In Web caching, a copy of an object is temporarily stored at a site that accesses the object. Intermediate sites and proxies may also cache an object as it passes through them en route to its destination site. The objective of Web caching is to reduce network latency and traffic by storing commonly requested documents as close to the clients as possible. Since Web caching is not based on users’ access patterns, the maximum cache hit ratio achievable by any caching algorithm is bounded at about 40-50% (Abrams, Standridge, Abdulla, Williams, & Fox, 1995). In addition, cached data have a time to live (TTL), after which requests must again be served from the original site. Object replication, on the other hand, stores copies of an object at predetermined locations to achieve a defined performance level. The number of replicas to be created and their locations are determined by users’ access patterns. Therefore, the number of replicas and their locations may change in a well-controlled fashion in response to changes in the access patterns.
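The TTL behaviour described above can be illustrated with a minimal sketch. It is not taken from the chapter: the class name `TTLCache`, the `fetch_from_origin` callback, and the injectable `clock` are all hypothetical names chosen for illustration, and real proxies apply considerably more elaborate freshness rules.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Toy Web cache: an entry is served locally until its TTL expires."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl                      # seconds an entry stays fresh
        self.clock = clock                  # injectable so expiry can be tested
        self._store: Dict[str, Tuple[float, bytes]] = {}  # url -> (expiry, body)
        self.hits = 0
        self.misses = 0

    def get(self, url: str, fetch_from_origin: Callable[[str], bytes]) -> bytes:
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and now < entry[0]:
            # Fresh copy available: no trip to the original site.
            self.hits += 1
            return entry[1]
        # Missing or expired: the request goes back to the original site.
        self.misses += 1
        body = fetch_from_origin(url)
        self._store[url] = (now + self.ttl, body)
        return body
```

Note that the cache is populated only as a side effect of requests passing through it, whereas a replication scheme decides in advance, from observed access patterns, which objects to place where.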

In most existing DWSs, each server keeps the entire set of Web documents/objects managed by the system. Incoming requests are distributed to the Web server nodes via DNS servers or request dispatchers (Cardellini, Colajanni, & Yu, 1999; Colajanni & Yu, 1988; Kwan, Mcgrath, & Reed, 1995; Baker & Moon, 1999). Although such systems are simple to implement, they can easily produce uneven load among the server nodes because IP addresses are cached on the client side. To achieve better load balancing and to avoid wasting disk space, one can replicate only part of the document set on multiple server nodes and distribute requests accordingly (Li & Moon, 2001; Karlsson & Karamanolis, 2004; Riska, Sun, Smirni, & Ciardo, 2002). Choosing the right number of replicas and their locations is a nontrivial and nonintuitive exercise. It has been shown that deciding how many replicas to create and where to place them to meet a performance goal is an NP-hard problem (Karlsson & Karamanolis, 2004; Tenzakhti, Day, & Ould-Khaoua, 2004). Therefore, all the replica placement approaches proposed in the literature are heuristics designed for particular systems and workloads.
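To give a feel for what such a heuristic looks like, the following is a minimal sketch of a generic greedy placement; it is not the algorithm proposed in this chapter. It assumes a deliberately simplified model in which every remote access costs the same, each server has a fixed storage capacity, and each object has a primary copy elsewhere. The function name `greedy_placement` and its parameters are hypothetical.

```python
from typing import Dict, Set


def greedy_placement(
    requests: Dict[str, Dict[str, int]],  # server -> object -> local request count
    size: Dict[str, int],                 # object -> storage size (must be > 0)
    capacity: Dict[str, int],             # server -> available storage
) -> Dict[str, Set[str]]:
    """For each server, replicate the objects with the most local requests
    per unit of storage until the server's capacity is exhausted."""
    placement: Dict[str, Set[str]] = {}
    for server, counts in requests.items():
        free = capacity[server]
        chosen: Set[str] = set()
        # Rank by request density; break ties by name for determinism.
        ranked = sorted(counts, key=lambda o: (-counts[o] / size[o], o))
        for obj in ranked:
            if counts[obj] > 0 and size[obj] <= free:
                chosen.add(obj)
                free -= size[obj]
        placement[server] = chosen
    return placement


if __name__ == "__main__":
    demo = greedy_placement(
        requests={"s1": {"a": 100, "b": 10, "c": 60},
                  "s2": {"a": 5, "b": 80, "c": 0}},
        size={"a": 4, "b": 2, "c": 3},
        capacity={"s1": 5, "s2": 2},
    )
    print(demo)
```

Even in this reduced form, each server faces a knapsack-style choice, which hints at why the general problem is hard; the greedy ranking is fast but carries no optimality guarantee.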
