Desktop Grids: From Volunteer Distributed Computing to High Throughput Computing Production Platforms


Franck Cappello (INRIA & UIUC, France), Gilles Fedak (LIP/INRIA, France), Derrick Kondo (ENSIMAG - antenne de Montbonnot, France), Paul Malecot (Universite Paris-Sud, France) and Ala Rezmerita (Universite Paris-Sud, France)
Copyright: © 2010 |Pages: 31
DOI: 10.4018/978-1-60566-661-7.ch003

Abstract

Desktop Grids, literally Grids made of Desktop Computers, are very popular in the context of “Volunteer Computing” for large scale “Distributed Computing” projects like SETI@home and Folding@home. They are very appealing as “Internet Computing” platforms for scientific projects seeking a huge amount of computational resources for massive high throughput computing, like the EGEE project in Europe. Companies are also interested in using cheap computing solutions that do not add extra hardware or cost of ownership. A very recent argument for Desktop Grids is their ecological impact: by scavenging unused CPU cycles without excessively increasing power consumption, they reduce the waste of electricity. This book chapter presents the background of Desktop Grids, their principles and essential mechanisms, the evolution of their architectures, their applications and the research tools associated with this technology.

Origins And Principles

Nowadays, Desktop Grids are very popular and are among the largest distributed systems in the world: the BOINC platform is used to run over 60 Internet Computing projects and scales up to 4 million participants. To arrive at this outstanding result, theoretical and experimental research projects have investigated how to take advantage of idle CPUs and derived the principles of Desktop Grids.

Origins of Desktop Grids

The very first paper discussing a Desktop Grid-like system (Shoch & Hupp, 1982) presented the Worm programs and several key ideas that are currently investigated in autonomous computing (self replication, migration, distributed coordination, etc.). Several projects preceded the very popular SETI@home. One of the first applications of Desktop Grids was cracking RSA keys. Another early system, in 1997, gave the name “distributed computing”, sometimes used for Desktop Grids: distributed.net. The aim of this project was finding prime numbers using the Mersenne algorithm. The Folding@home project was, with SETI@home, one of the first projects to gather thousands of participants in the early 2000s. At that time Folding@home used the COSM technology. The growing popularity of Desktop Grids has raised significant interest in industry. Companies like Entropia (Chien, Calder, Elbert, Bhatia, 2003), United Devices, Platform, Mesh Technologies and Data Synapse have proposed Desktop Grid middleware. Users with demanding performance needs are interested in these platforms because of their cost-performance ratio, which is even lower than that of clusters. As a mark of success, several Desktop Grid platforms are used daily in production by large companies in domains such as pharmacology, petroleum and aerospace.

The origin of Desktop Grids came from the association of several key concepts: 1) cycle stealing, 2) computing over several administration domains and 3) the Master-Worker computing paradigm.

Desktop Grids inherit the principle of aggregating inexpensive, often already in place, resources from past research in cycle stealing. Roughly speaking, cycle stealing consists of using the CPU cycles of other computers. This concept is particularly relevant when the target computers are idle. Mutka and Livny demonstrated in 1987 that the CPUs of workstations are mostly unused (Mutka & Livny, 1987), opening the opportunity for demanding users to scavenge these cycles for their applications. Due to its high attractiveness, cycle stealing has been studied in many research projects like Condor (Litzkow, Livny, Mutka, 1988), Glunix (Ghormley, Petrou, Rodrigues, Vahdat, Anderson, 1998) and Mosix (Barak, Guday, 1993), to cite a few. In addition to the development of these computing environments, a lot of research has focused on theoretical aspects of cycle stealing (Bhatt, Chung, Leighton, Rosenberg, 1997).

Early cycle stealing systems were confined to a single administration domain. To harness more resources, techniques were proposed to cross the boundaries of administration domains. A first approach was proposed by Web Computing projects such as Jet (Pedroso, Silva, Silva, 1997), Charlotte (Baratloo, Karaul, Kedem, Wyckoff, 1996), Javelin (P. Cappello et al., 1997), Bayanihan (Sarmenta & Hirano, 1999), SuperWeb (Alexandrov, Ibel, Schauser, Scheiman, 1997), ParaWeb (Brecht, Sandhu, Shan, Talbot, 1996) and PopCorn (Camiel, London, Nisan, Regev, 1997). These projects emerged with Java, taking advantage of the virtual machine properties: high portability across heterogeneous hardware and operating systems, wide availability of the virtual machine in Web browsers and a strong security model associated with bytecode execution. Performance and functionality limitations are some of the fundamental motivations of the second generation of Global Computing systems like COSM, BOINC (Anderson, 2004) and XtremWeb (Fedak, Germain, Néri, Cappello, 2001). These systems use firewall and NAT traversal protocols to transport the required communications.

The Master-Worker paradigm is the third enabling concept of Desktop Grids. The concept of Master-Worker programming is quite old (Mattson, Sanders, Massingill, 2004), but its application to large scale computing over many distributed resources emerged a few years before 2000 (Sarmenta & Hirano, 1999). The Master-Worker programming approach essentially allows the implementation of non-trivial (bag of tasks) parallel applications on loosely coupled computing resources. Because it can be combined with simple fault detection and tolerance mechanisms, it fits extremely well with Desktop Grid platforms, which are very dynamic by nature.
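The combination of Master-Worker dispatch with simple fault tolerance can be illustrated with a minimal, single-process Python sketch. It is not taken from BOINC, XtremWeb or any other system cited here: the names `run_master`, `FlakyWorker` and `reliable_worker`, the round-robin dispatch and the retry limit are all assumptions made for illustration, and workers are modelled as plain callables rather than remote hosts.

```python
from collections import deque


def run_master(tasks, workers, max_attempts=3):
    """Hand out independent tasks to workers and re-queue a task whose worker fails.

    tasks: dict mapping a task id to its input (a "bag of tasks").
    workers: list of callables standing in for remote hosts (an assumption).
    """
    pending = deque((task_id, payload, 0) for task_id, payload in tasks.items())
    results = {}
    turn = 0
    while pending:
        task_id, payload, attempts = pending.popleft()
        worker = workers[turn % len(workers)]  # simple round-robin dispatch
        turn += 1
        try:
            results[task_id] = worker(payload)
        except Exception:
            # Fault detection: the failure is noticed by the master.
            # Fault tolerance: the task goes back into the bag for another worker.
            if attempts + 1 >= max_attempts:
                raise RuntimeError(f"task {task_id} failed {max_attempts} times")
            pending.append((task_id, payload, attempts + 1))
    return results


def reliable_worker(n):
    """Hypothetical worker that always returns its result."""
    return n * n


class FlakyWorker:
    """Hypothetical worker that fails on its first call, imitating a host that leaves."""

    def __init__(self):
        self.calls = 0

    def __call__(self, n):
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("host left the grid")
        return n * n


squares = run_master({i: i for i in range(5)}, [FlakyWorker(), reliable_worker])
```

Because the tasks are independent, the master only needs to remember which ones are still pending; losing a worker costs one re-execution and nothing else.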

Key Terms in this Chapter

Master-Worker Paradigm: Consists of two kinds of entities: a master and several workers. The master decomposes the problem into smaller tasks and distributes them among the workers. Each worker receives a task from the master, executes it and sends the result back to the master.

Cycle Stealing: Consists of using the unused cycles of desktop workstations. Participating workstations also donate some supporting amount of disk storage space, RAM, and network bandwidth, in addition to raw CPU power. Volunteers must get back full use of their resources with no delay when they request them.
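The requirement that the owner regains the machine without delay can be expressed as a small policy function. The following Python sketch is only an illustration under stated assumptions: the function name `may_compute`, its parameters and the 300-second idle threshold are hypothetical and do not come from Condor, BOINC or any other system mentioned in this chapter.

```python
def may_compute(owner_active, idle_seconds, idle_threshold=300):
    """Decide whether the guest computation may run on a volunteer host.

    owner_active: True as soon as the owner uses the machine again.
    idle_seconds: how long the machine has been idle (assumed to be measured elsewhere).
    idle_threshold: assumed waiting time before idle cycles are scavenged.
    """
    if owner_active:
        return False  # the owner has priority: suspend immediately
    return idle_seconds >= idle_threshold
```

A real client would call such a check frequently and suspend or checkpoint the running task whenever it returns False.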

Desktop Grid: A computing environment making use of Desktop computers connected via non-dedicated network connections such as the Internet. Desktop Grids are used not only for volunteer computing projects, but also for enterprise Grids.

Volunteer Computing: An arrangement in which computer owners provide their computing resources to one or more projects that use them to do distributed computing. Such Desktop Grids are made of many tiny and uncontrollable administrative domains.

Result Certification: In distributed computing, result certification is a mechanism that aims to validate the results computed by volatile and possibly malicious hosts. The most common mechanisms for result validation are majority voting, spot-checking and credibility-based techniques.
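Majority voting, the first of these mechanisms, can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the validator of any particular platform: the function name `majority_vote` and the `quorum` parameter are hypothetical, and results are assumed to be directly comparable values returned by hosts that each computed the same task.

```python
from collections import Counter


def majority_vote(results, quorum):
    """Return the value reported by at least `quorum` replicas, or None.

    results: values returned by different hosts for the same task.
    quorum: assumed minimum number of agreeing replicas needed to accept a value.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None
```

For example, `majority_vote(["a", "a", "b"], quorum=2)` accepts `"a"`, while three mutually different answers yield `None`, in which case the task would have to be sent to further hosts.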
