Design of an Integrated Digital Library System Based on Peer-to-Peer Data Mining

Mohammed Ammari (Mohammadia School of Engineers, University Mohammed V - Agdal, Rabat, Morocco) and Dalila Chiadmi (Mohammadia School of Engineers, University Mohammed V - Agdal, Rabat, Morocco)
Copyright: © 2012 | Pages: 14
DOI: 10.4018/ijcee.2012070101

Abstract

Traditional libraries will evolve into digital libraries, which are clearly superior in dissemination, sharing, linking, storage, and variety of information. Electronic libraries therefore have specific needs in terms of content, services, and long-term preservation. At the same time, digital libraries suffer from several inherent constraints: storage limitations, performance, relevancy, decentralization, lack of semantics, fault tolerance, and scalability. The main aim of this paper is to present the design of an integrated digital library system based on peer-to-peer data mining. The article also aims to show that peer-to-peer mining, an emerging branch of distributed data mining, is an active research area well suited to overcoming the intrinsic problems of digital libraries.

1. Introduction

Data mining (Han & Kamber, 2006) has attracted a great deal of attention in the information industry and in society as a whole in recent years, due to the wide availability of huge amounts of data and the imminent need for turning such data into useful information and knowledge. The information and knowledge gained can be used for applications ranging from market analysis, fraud detection, and customer retention, to production control and science exploration.

Distributed data mining (DDM) consists of mining data using distributed resources, and it has emerged as an active research area. DDM pays careful attention to using distributed resources (data, computing, communication, storage, and human factors) in a near-optimal fashion.

Distributed Data Mining can be applied in several domains (Kargupta, 2007):

  • Mining large databases from distributed sites: grid data mining in earth science, astronomy, counter-terrorism, and bioinformatics

  • Monitoring multiple time-critical data streams: monitoring vehicle data streams in real time, monitoring physiological data streams

  • Analyzing data in lightweight sensor networks and mobile devices: limited network bandwidth, limited power supply

  • Preserving privacy: security- and safety-related applications

  • Peer-to-peer data mining: large, decentralized, asynchronous environments

DDM is gaining attention in peer-to-peer (P2P) systems, which are emerging as a new distributed computing paradigm for many novel applications that connect nodes through largely ad hoc links without a central coordinator. Examples include grid computing, massive data storage, instant messaging, videoconferencing, voice over IP, multi-player gaming, P2P search engines, social networks, and digital libraries. P2P networks can be classified as structured or unstructured, depending on how the nodes are connected and how the data they hold is arranged. In a structured network, the connections between nodes follow a regular structure, which allows deterministic, efficient lookups (typically O(log N) hops). In an unstructured network, by contrast, nodes share neither a regular structure nor a unified identifier space, so lookups are normally performed by flooding, supported by replication of data in the network.
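The difference between the two lookup styles can be illustrated with a small sketch. It is not taken from the paper: the function names, the toy topology, and the simplified ring (every identifier in a 2^bits space is occupied by a node, and a key is stored on the node with the same identifier) are assumptions made purely for illustration. The unstructured lookup floods a query with a time-to-live (TTL), while the structured lookup follows power-of-two shortcuts, in the spirit of ring-based overlays, and needs at most `bits` hops.

```python
from collections import deque


def flood_lookup(neighbors, start, key, holdings, ttl):
    """Unstructured lookup: breadth-first flooding limited by a TTL.

    neighbors: dict mapping node -> list of directly connected nodes
    holdings:  dict mapping node -> set of keys stored on that node
    Returns (node_holding_key_or_None, number_of_messages_sent).
    """
    visited = {start}
    frontier = deque([(start, ttl)])
    messages = 0
    while frontier:
        node, remaining = frontier.popleft()
        if key in holdings.get(node, ()):
            return node, messages
        if remaining == 0:
            continue  # TTL exhausted: this node does not forward the query
        for peer in neighbors[node]:
            messages += 1  # every forward costs a message, even a duplicate
            if peer not in visited:
                visited.add(peer)
                frontier.append((peer, remaining - 1))
    return None, messages


def ring_lookup(start, key, bits):
    """Structured lookup on a fully populated ring of 2**bits nodes.

    Assumption: node n can reach (n + 2**i) mod 2**bits for every i < bits,
    and key k lives on node k. Each hop takes the largest shortcut that
    does not overshoot the key, so at most `bits` hops are needed.
    Returns (node_holding_key, number_of_hops).
    """
    size = 1 << bits
    node, hops = start, 0
    while node != key:
        distance = (key - node) % size
        step = 1 << (distance.bit_length() - 1)  # largest power of two <= distance
        node = (node + step) % size
        hops += 1
    return node, hops


if __name__ == "__main__":
    # Hypothetical five-peer unstructured overlay; peer 4 holds the document.
    overlay = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
    stored = {4: {"doc"}}
    print(flood_lookup(overlay, 0, "doc", stored, ttl=3))
    print(ring_lookup(start=0, key=13, bits=4))
```

In the flooding case the message count grows with the number of links explored and the query can fail if the TTL is too small, whereas the ring lookup is deterministic and bounded by the logarithm of the identifier space.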

The main aim of this paper is to show that P2P data mining applications may play a key role in the next generation of digital libraries. P2P data mining is, in fact, well suited to overcoming the intrinsic problems of digital libraries: storage limitations, performance, search relevance, decentralization, lack of semantics, and scalability. The following section discusses related work. The third section presents the motivation for applying P2P data mining to digital libraries, before closing with an illustrative example.

Our work falls within the field of peer-to-peer data mining:

Datta et al. (2006) offer an overview of DDM applications and algorithms for P2P environments, focusing particularly on local algorithms that perform data analysis using computing primitives with limited communication overhead. The authors describe both exact and approximate local P2P data mining algorithms that work in a decentralized and communication-efficient manner.
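To give a feel for what a decentralized, communication-efficient primitive can look like, the following sketch uses pairwise gossip averaging, a well-known approximate technique. It is offered only as an illustration and is not claimed to be one of the algorithms surveyed by Datta et al.; the function name, the four-peer ring, and the per-peer values (imagined here as local download counts in a digital library) are all hypothetical. Each peer talks only to a direct neighbor, yet every local value drifts toward the global mean.

```python
import random


def gossip_average(values, neighbors, rounds, seed=0):
    """Approximate the global mean using only neighbor-to-neighbor exchanges.

    values:    dict mapping peer -> local numeric value
    neighbors: dict mapping peer -> list of directly connected peers
    In every round each peer picks one random neighbor and both replace
    their values with the pair's average. The sum of all values is
    preserved, so on a connected overlay the values approach the mean.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    state = dict(values)       # leave the caller's data untouched
    for _ in range(rounds):
        for node in list(state):
            peer = rng.choice(neighbors[node])
            mean = (state[node] + state[peer]) / 2.0
            state[node] = state[peer] = mean
    return state


if __name__ == "__main__":
    # Hypothetical local statistics held by four peers arranged in a ring.
    local = {0: 10.0, 1: 20.0, 2: 30.0, 3: 40.0}
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(gossip_average(local, ring, rounds=50))
```

No peer ever sees the whole data set or contacts a coordinator, and each exchange involves a single number, which is the kind of limited communication overhead that local P2P algorithms aim for.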

Sunny T et al. (2012) provide an overview of DDM and P2P data mining. Their paper discusses the need for DDM, a taxonomy of DDM architectures, various DDM approaches, DDM-related work in P2P systems, and the issues and challenges in P2P data mining.
