Speeding up the Internet: Exploiting Historical User Request Patterns for Web Caching

Chetan Kumar (California State University San Marcos, USA)
DOI: 10.4018/978-1-61520-611-7.ch096

Abstract

The Internet has witnessed tremendous growth in the amount of available information, and this trend of increasing traffic is likely to continue. According to a Cisco Systems forecast report (2008), growth in Internet traffic will be driven by Web 2.0 technologies such as video, social networking, and collaboration.

Introduction

The Internet has witnessed tremendous growth in the amount of available information, and this trend of increasing traffic is likely to continue. According to a Cisco Systems forecast report (2008), growth in Internet traffic will be driven by Web 2.0 technologies such as video, social networking, and collaboration. Some excerpts from the Cisco forecast report (2008) follow.

  • “Global Internet Protocol (IP) traffic will increase by a factor of six from 2007 to 2012, reaching 44 exabytes per month in 2012, compared to fewer than 7 exabytes per month in 2007.

  • Total IP traffic for 2012 will amount to more than half a zettabyte (or 522 exabytes). A zettabyte is a trillion gigabytes.

  • Monthly global IP traffic in December 2012 will be 11 exabytes higher than in December 2011, a single-year increase that will exceed the amount by which traffic increased in the eight years since 2000” (Cisco forecast report 2008).

Despite technological advances, this traffic increase can lead to significant user delays in web access (Datta et al. 2003, Mookherjee and Tan 2002, Watson et al. 1999). Web caching is one approach to reducing such delays. Caching involves temporary storage of web object copies at locations that are relatively close to the end user. As a result, user requests can be served faster than if they were served directly from the origin web server (Hosanagar and Tan 2004, Davison 2007).

Caching can be performed at different levels in a computer network. Proxy caches are situated at computer network access points for web users (Davison 2007). Other locations where caching may be performed include the browser and web-server levels (Davison 2001, Kumar and Norris 2008). Proxy caches can store copies of web objects and directly serve requests for them in the network, thereby avoiding repeated requests to origin web servers. This reduces network traffic, the load on web servers, and the average delay experienced by web users (Cao and Irani 1997, Datta et al. 2003). Kumar (2009) illustrates the benefit of a network of proxy caches using the example of the IRCache network (www.ircache.net).

The following two illustrations, adapted from Davison (2007), show how firms may benefit from caching in practice. In one case, a company such as Intel may employ a proxy cache near its network gateway to serve its many users (e.g., clients within Intel) with cached objects from many servers. As a result, Intel reduces the bandwidth required over expensive dedicated Internet connections. In another scenario, a content provider such as Yahoo can place a proxy cache directly in front of a particular server to reduce the number of requests that the server must handle. This service, which speeds up content delivery, is also called reverse caching because the proxy node caches objects for many clients but usually from only one server; it is provided commercially by CDN firms such as Akamai. In both scenarios access delays are reduced, which benefits all Internet users (Davison 2007). Of course, in choosing a caching solution, as in any IT investment decision, firms have to weigh the costs of an implementation against its benefits before deciding on the appropriate caching service. In this article we discuss some proxy caching approaches that exploit historical user request patterns to reduce user request delays (Kumar and Norris 2008, Zeng et al. 2004).
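The proxy behavior described above can be sketched in a few lines of Python. This is only an illustrative toy, not the chapter's method: the class name `ProxyCache`, the `fetch_from_origin` callable, and the fixed object-count capacity are all assumptions made for the example. When the cache is full it evicts the least recently requested object, which is the LRU policy.

```python
from collections import OrderedDict
from typing import Callable


class ProxyCache:
    """Toy proxy cache with least-recently-used (LRU) eviction (hypothetical names)."""

    def __init__(self, capacity: int, fetch_from_origin: Callable[[str], bytes]):
        self.capacity = capacity                    # assumed limit: number of objects
        self.fetch_from_origin = fetch_from_origin  # stand-in for the origin web server
        self.store = OrderedDict()                  # url -> body, oldest use first
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self.store:
            # Cache hit: serve locally and mark the object as most recently used.
            self.hits += 1
            self.store.move_to_end(url)
            return self.store[url]
        # Cache miss: the request has to travel to the origin server.
        self.misses += 1
        body = self.fetch_from_origin(url)
        self.store[url] = body
        if len(self.store) > self.capacity:
            # Evict the least recently requested object to make space.
            self.store.popitem(last=False)
        return body


# Small demonstration with a fake origin server that records each request.
origin_requests = []


def fake_origin(url: str) -> bytes:
    origin_requests.append(url)
    return f"content of {url}".encode()


cache = ProxyCache(capacity=2, fetch_from_origin=fake_origin)
for requested in ["/a", "/b", "/a", "/c", "/b"]:
    cache.get(requested)

print(cache.hits, cache.misses)   # 1 4
print(origin_requests)            # ['/a', '/b', '/c', '/b']
```

In this trace the second request for `/a` is served from the cache, while `/b` is evicted when `/c` arrives and must be fetched from the origin again. A real proxy would also have to handle object sizes, expiry, and dynamic documents, which this sketch ignores.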

Key Terms in this Chapter

Origin Web Server: The server where web content originates. User requests that are satisfied by the origin server typically have the longest waiting times.

Least Recently Used (LRU) Caching Policy: LRU is a popular cache replacement strategy where the least recently requested object is evicted from the cache to make space for a new one.

Static Documents: Web documents whose content and size do not change.

Proxy Caches: These caches are located at computer network access points for web users. Proxy caches can store copies of web objects and directly serve requests for them in the network. Therefore they reduce user delays by avoiding repeated requests to origin web servers.

Web 2.0 Technologies: Web technologies whose traffic primarily consists of user-generated content such as video, social networking, and collaboration.

Historical User Request Patterns: User web object request patterns that have previously been observed. For example, at the proxy level users typically re-access documents on a daily basis, so demand for a document spikes at multiples of 24 hours.

Dynamic Documents: Web documents, such as website front pages, whose content changes often.

Web Caching: This involves temporary storage of web object copies at locations that are relatively close to the end user. Consequently user requests can be served faster than if they were served directly from the origin web server.
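As a rough illustration of the daily re-access pattern mentioned under Historical User Request Patterns, the following Python sketch measures how many gaps between consecutive requests for one document fall near a multiple of 24 hours. The function names, the one-hour tolerance, and the sample timestamps are assumptions made for this example; they are not taken from the chapter or from any real trace.

```python
from collections import Counter


def reaccess_gap_histogram(timestamps_hours):
    """Count gaps between consecutive requests, rounded to whole hours."""
    ordered = sorted(timestamps_hours)
    gaps = (round(later - earlier) for earlier, later in zip(ordered, ordered[1:]))
    return Counter(gaps)


def share_near_daily_multiple(histogram, tolerance_hours=1):
    """Fraction of gaps within `tolerance_hours` of a positive multiple of 24."""
    total = sum(histogram.values())
    if total == 0:
        return 0.0
    near = 0
    for gap, count in histogram.items():
        remainder = gap % 24
        distance = min(remainder, 24 - remainder)
        # Require the gap to be close to 24, 48, 72, ... and not close to 0.
        if gap >= 24 - tolerance_hours and distance <= tolerance_hours:
            near += count
    return near / total


# Made-up request times (in hours) for a single document at a proxy.
request_times = [0, 24, 48, 50, 72, 96]
histogram = reaccess_gap_histogram(request_times)

print(dict(histogram))                        # {24: 3, 2: 1, 22: 1}
print(share_near_daily_multiple(histogram))   # 0.6
```

A high share for a document would suggest that requests tend to recur about a day apart. A history-aware cache could use such a signal to favor keeping that document, whereas plain LRU only considers how recently it was last requested.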
