A Survey of Efficient Resource Discovery Techniques on DHTs

A Survey of Efficient Resource Discovery Techniques on DHTs

Carlos Abalde (University of A Coruña, Spain), Víctor M. Gulías (University of A Coruña, Spain) and Javier París (University of A Coruña, Spain)
DOI: 10.4018/978-1-61520-686-5.ch004
OnDemand PDF Download:
$30.00
List Price: $37.50

Abstract

Much recent work on building scalable peer-to-peer (P2P) systems has focused on distributed hash tables (DHTs), which have become in a powerful building block when designing distributed decentralized behaviors. However, the lookup in the DHT structure itself, rather than the key-based lookup in the data stored in the DHT introduces a series of new challenges. The purpose of this chapter is to review the most outstanding research efforts, many of them from the Grid community, addressing the problem of resource discovery on DHTs. First, the authors present a simple taxonomy and the key internal design and behavior of each contribution, identifying their main strengths and weaknesses. Later, some discussion will be presented and finally, the authors’ personal approach to the problem will be described and compared with previous ones.
Chapter Preview
Top

Background

It is widely accepted that DHTs are the building block for next-generation large scale decentralized systems. However, its lack of flexibility on efficient non identifier-based lookups is a well-known problem. Moreover, the problem becomes even more complex when lookups are demanded in the DHT structure itself, rather than just in the data stored in the DHT. This is precisely the problem addressed by resource discovery techniques on DHTs.

Discoverable resources (i.e. peers interconnected in the DHT overlay network) are characterized by a set of physical and logical attributes. However, the values of the attributes describing the state of a peer cannot be moved to or replicated into other peers on-demand, unlike, for example, the data stored in the peers of a DHT-based file-sharing system. Their locations are fixed, and even worse, unlike files, the values of the attributes are usually constantly changing. Some examples of attributes are the storage capacity of a peer, its network bandwidth, CPU speed, free memory, installed software, available services, etc. Therefore, since the values of the attributes describing a peer are strictly attached to the described peer, from now on, we will use the terms attribute, resource and intra-resource interchangeably to refer to those values and also to the peer where they are located.

Key Terms in this Chapter

Cache: Storage area where frequently accessed pieces of information are temporary stored in order to avoid fetching them from storage devices, or to avoid re-computing their values, both more costly processes than reading them from the cache.

Grid: Pool of networked, heterogeneous, non-dedicated, geographically dispersed and loosely coupled computers available on-demand for solving problems too intensive for any stand-alone standard machine. Grids are used as a replacement of expensive supercomputers or as a tool for maximizing the computing resources available in an organization.

Distributed Hash Table (DHT): General term used to refer to a family of P2P algorithms which mimic the features of traditional hash tables when its data buckets are distributed among participating peers. DHTs are typically designed to scale to a large number of peers which are organized in a structured and algorithm-dependent overlay network topology. DHTs perform all overlay maintenance (peer arrivals, departures and failures) and data management (insertion and key-based lookup of data items) operations in an efficient and fully decentralized fashion.

Resource Discovery: The process of finding a suitable resource in a network of computing elements to perform a task.

Overlay Network: Logical network of peers built on top of one or several existing physical networks with the purpose of deploying a service that is not available in the underlying structure.

Quality of Service (QoS): Computing term referring to the ability to guarantee some measurable and pre-established factors (usually, network or disk throughput) associated with a task during the whole execution time of such task. Service factors are usually ensured, pre-scheduling necessary resources to handle tasks and/or establishing priorities among tasks.

Complete Chapter List

Search this Book:
Reset