Predictive File Replication on the Data Grids

ChenHan Liao (Cranfield University, UK), Na Helian (University of Hertfordshire, UK), Sining Wu (Cranfield University, UK) and Mamunur M. Rashid (Cranfield University, UK)
Copyright: © 2010 |Pages: 18
DOI: 10.4018/jghpc.2010092805

Abstract

Most replication methods either monitor the popularity of files or use complicated cost functions to decide whether a replication or deletion decision should be issued. However, by the time the decision is issued, the popularity of the files has changed and may already have affected access latency and resource usage. This article proposes a decision-tree-based predictive file replication strategy that forecasts files' future popularity from their characteristics on the Grids. The proposed strategy shows superior performance in terms of mean job time and effective network usage compared with two other replication strategies, LRU and Economic, in the OptorSim simulation environment.
Article Preview

File Replication Strategies And Simulation Tools

Various replication strategies have been proposed by the research community so far. Least Frequently Used (LFU) and Least Recently Used (LRU) are two popular methods that maintain an access history on each Grid site to monitor the popularity of files. Both strategies always replicate files to the local site whose computing elements request them for job execution. By browsing the replica catalogue, where detailed information on all replicas is stored, they choose the replica that can be accessed in the shortest time. LFU and LRU differ only in their deletion policy when local storage is full: LFU deletes the least frequently accessed file, while LRU deletes the least recently accessed file, in order to create space for new replicas. The simplicity of LRU and LFU has made them successful and popular in file replication strategy design; however, the "always-replicate" policy leads to an imbalance between job time and resource usage.
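The "always-replicate" behaviour with LRU eviction described above can be sketched as follows (a minimal illustration; the class and method names are my own, not from the article, and real Grid replicas would carry sizes and transfer costs):

```python
from collections import OrderedDict

class LRUReplicaStore:
    """Sketch of an LRU-based replica store: every requested file is
    replicated locally, and when capacity is reached the least recently
    accessed replica is evicted to make room."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # file name -> size, ordered by recency

    def access(self, name, size=1):
        if name in self.files:
            # Local replica exists: mark it as most recently used.
            self.files.move_to_end(name)
            return "hit"
        # "Always-replicate" policy: fetch the file and store it locally,
        # evicting least-recently-used replicas until it fits.
        while len(self.files) >= self.capacity:
            self.files.popitem(last=False)  # drop the LRU replica
        self.files[name] = size
        return "replicated"
```

An LFU variant would differ only in the eviction line, deleting the file with the smallest access count instead of the oldest access time.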

The Economic model (Cameron, Carvajal-Schiaffino, Millar, Nicholson, & Stockinger, 2003; Carman, Zini, Serafini, & Stockinger, 2002) estimates file values, which are used to predict files' future popularity, based on a binomial distribution. The model assumes that the popularity of files obeys a certain distribution and then locates a file's value on that distribution. The Economic model also employs a reverse-auction strategy in which Grid sites exchange messages to obtain the "cheapest" replicas. It models local and remote sites as replica "buyers" and "sellers," where replicas are valued by their historic "purchase" prices. Under the Economic model, a typical replication is treated as an investment in a market: if a candidate replica has greater value than the least-valued file in local storage, the least-valued file is replaced by the new replica. As with LRU and LFU, if local storage is not full, a replication decision is always issued.
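The Economic model's replacement rule can be sketched as a simple value comparison (an illustrative function of my own devising, assuming file values have already been estimated; it is not the cited papers' implementation, which also covers the auction protocol and value estimation):

```python
def economic_replication_decision(candidate_value, local_files, capacity):
    """Decide whether to replicate a candidate file under the Economic
    model's replacement rule.

    candidate_value -- estimated value of the candidate replica
    local_files     -- dict mapping local file name -> estimated value
    capacity        -- maximum number of files the local storage holds

    Returns ("replicate", evicted_file_or_None) or ("skip", None).
    """
    if len(local_files) < capacity:
        # Storage not full: a replication decision is always issued.
        return ("replicate", None)
    # Storage full: replicate only if the candidate is a better
    # "investment" than the least-valued local file.
    least_file = min(local_files, key=local_files.get)
    if candidate_value > local_files[least_file]:
        return ("replicate", least_file)  # evict the least-valued file
    return ("skip", None)
```

The key contrast with LRU/LFU is that eviction is driven by estimated economic value rather than by access recency or frequency, so a rarely-but-recently accessed file can still outlive a frequently accessed one of low predicted worth.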
