Querical Data Networks

Querical Data Networks

Cyrus Shahabi (University of Southern California, USA) and Farnoush Banaei-Kashani (University of Southern California, USA)
DOI: 10.4018/978-1-60566-242-8.ch083
OnDemand PDF Download:
$30.00
List Price: $37.50

Abstract

Recently, a family of massive self-organizing data networks has emerged. These networks mainly serve as large-scale distributed query processing systems. We term these networks Querical Data Networks (QDN). A QDN is a federation of a dynamic set of peer, autonomous nodes communicating through a transient-form interconnection. Data is naturally distributed among the QDN nodes in extra-fine grain, where a few data items are dynamically created, collected, and/or stored at each node. Therefore, the network scales linearly to the size of the dataset. With a dynamic dataset, a dynamic and large set of nodes, and a transient-form communication infrastructure, QDNs should be considered as the new generation of distributed database systems with significantly less constraining assumptions as compared to their ancestors. Peer-to-peer networks (Daswani, 2003) and sensor networks (Estrin, 1999, Akyildiz, 2002) are well-known examples of QDN.
Chapter Preview
Top

Introduction

Recently, a family of massive self-organizing data networks has emerged. These networks mainly serve as large-scale distributed query processing systems. We term these networks Querical Data Networks (QDN). A QDN is a federation of a dynamic set of peer, autonomous nodes communicating through a transient-form interconnection. Data is naturally distributed among the QDN nodes in extra-fine grain, where a few data items are dynamically created, collected, and/or stored at each node. Therefore, the network scales linearly to the size of the dataset. With a dynamic dataset, a dynamic and large set of nodes, and a transient-form communication infrastructure, QDNs should be considered as the new generation of distributed database systems with significantly less constraining assumptions as compared to their ancestors. Peer-to-peer networks (Daswani, 2003) and sensor networks (Estrin, 1999, Akyildiz, 2002) are well-known examples of QDN.

QDNs can be categorized as instances of “complex systems” (Bar-Yam, 1997) and studied using the complex system theory. Complex systems are (mostly natural) systems hard (or complex) to describe information-theoretically, and hard to analyze computationally. QDNs share the same characteristics with complex systems, and particularly, bear a significant similarity to a dominating subset of complex systems most properly modeled as large-scale interconnection of functionally similar (or peer) entities. The links in the model represent some kind of system-specific entity-to-entity interaction. Social networks, a network of interacting people, and cellular networks, a network of interacting cells, are two instances of such complex systems. With these systems, complex global system behavior (e.g., a social revolution in a society, or food digestion in stomach!) is an emergent phenomenon, emerging from simple local interactions. Various fields of study, such as sociology, physics, biology, chemistry, etc., were founded to study different types of initially simple systems and have been gradually matured to analyze and describe instances of incrementally more complex systems. An interdisciplinary field of study, the complex system theorya, is recently founded based on the observation that analytical and experimental concepts, tools, techniques, and models developed to study an instance of complex system at one field can be adopted, often almost unchanged, to study other complex systems in other fields of study. More importantly, the complex system theory can be considered as a unifying meta-theory that explains common characteristics of complex systems. One can extend application of the complex system theory to QDNs by:

  • 1.

    Adopting models and techniques from a number of impressively similar complex systems to design and analyze QDNs, as an instance of engineered complex systems; and

  • 2.

    Exporting the findings from the study of QDNs (which are engineered, hence, more controllable) to other complex system studies.

This article is organized in two parts. In the first part, we provide an overview, where we 1) define and characterize QDNs as a new family of data networks with common characteristics and applications, and 2) review possible database-like architectures for QDNs as query processing systems and enumerate the most important QDN design principles. In the second part of the article, as the first step toward realizing the vision of QDNs as complex distributed query-processing systems, we focus on a specific problem, namely the problem of effective data location (or search) for efficient query processing in QDNs. We briefly explain two parallel approaches, both based on techniques/models borrowed from the complex system theory, to address this problem.

Top

Background

Here, we enumerate the main componental characteristics and application features of a QDN.

Key Terms in this Chapter

Percolation Theory: Assume a grid of nodes where each node is occupied with probability p and empty with probability (1-p). Percolation theory is a quantitative (statistical-theoretical) and conceptual model for understanding and analyzing the statistical properties (e.g., size, diameter, shape, etc.) of the clusters of occupied nodes as the value of p changes. Many concepts associated with complex systems such as clustering, fractals, diffusion, and particularly phase transitions are modeled as percolation problem. The significance of the percolation model is that many different problems can be mapped to the percolation problem; e.g., forest-fire spread, oil field density estimation, diffusion in disordered media, etc.

Distributed Hash Tables (DHTs): A distributed index structure with hash table-like functionality for information location in the Internet-scale distributed computing environments. Given a key from a pre-specified flat identifier space, DHT computes (in a distributed fashion) and returns the location of the node that stores the key.

Small World Models: It is believed that almost any pair of people in the world can be connected to one another by a short chain of intermediate acquaintances, of typical length about six. This phenomenon is colloquially referred to as the “six degrees of separation,” or equivalently, the “small-world” effect. Sociologists propose a number of topological network models, the small-world models, for the social network to explain this phenomenon.

Complex Systems: Complex Systems is a new field of science studying how parts of a complex system give rise to the collective behaviors of the system. Complexity (information-theoretical and computational) and emergence of collective behavior are the two main characteristics of such complex systems. Social systems formed (in part) out of people, the brain formed out of neurons, molecules formed out of atoms, the weather formed out of air flows are all examples of complex systems. The field of Complex Systems cuts across all traditional disciplines of science, as well as engineering, management, and medicine.

Sensor Networks: A sensor network is a network of low-power, small form-factor sensing devices that are embedded in a physical environment and coordinate amongst themselves to achieve a larger sensing task.

Content-Centric Networks: A content-centric network is a network where various functionalities such as naming, addressing, routing, storage, etc., are designed based on the content. This is in contrast with classical networks that are node-centric.

Peer-to-Peer (P2P) Networks: A peer-to-peer network is a distributed, self-organized federation of peer entities, where the system entities collaborate by sharing resources and performing cooperative tasks for mutual benefit. It is often assumed that such a federation lives, changes, and expands independent of any distinct service facility with global authority.

Complete Chapter List

Search this Book:
Reset