Database Integration in the Grid Infrastructure

Emmanuel Udoh (Sullivan University, USA)
DOI: 10.4018/978-1-60566-026-4.ch152


Introduction

The Web's ability to link information over the Internet has popularized computer science with the public. It is the grid, however, that will let the public exploit data storage and computing power over the Internet in the same way it draws on the electric power utility, a ubiquitous commodity. The grid is considered the fifth-generation computing architecture, following client-server and multitier (Kusnetzky & Olofson, 2004), and it integrates resources (computers, networks, data archives, instruments, etc.) in an interoperable virtual environment (Berman, Fox, & Hey, 2003). In this vein, grid computing is a new IT infrastructure in which modular hardware and software can be deployed collectively to solve a problem, or regrouped on demand to meet a user's changing needs.

Grid computing originated in the academic and research communities (e.g., SETI@home), where it was used successfully to share resources, store data on the petabyte and exabyte scale, and ultimately lower costs. It is now becoming popular in the enterprise world, and there are several reasons why enterprises have embraced it. In the nineties, the IT world was confronted with the high cost of maintaining smaller, cheaper, dedicated servers running UNIX or Windows. According to Oracle (2005), the problems were application silos that led to underutilized hardware resources; monolithic and unwieldy systems that were expensive to maintain and change; and fragmented, poorly integrated information that could not be fully exploited by the enterprise as a whole. Various surveys put the average utilization of servers in a typical enterprise at often much less than 20% (Goyal & Lawande, 2006; Murch, 2004). As cheaper and faster hardware such as server blades, and operating systems such as the open source Linux, became increasingly available, the IT world embraced grid computing to save money on hardware and software. Given the growing importance of grid computing, it is not surprising that many new terms have been coined for it. In the literature and in industry, terms used interchangeably with grid computing include utility computing, computing on demand, N1, hosted computing, adaptive computing, organic computing and ubiquitous computing (Goyal & Lawande, 2006; Murch, 2004; Oracle, 2005).

The grid is an all-encompassing, 21st-century computing infrastructure (Foster, 2003; Joseph & Fellenstein, 2004) that integrates several areas of computer science and engineering. The database is an important component of the application stack in industry and is increasingly being embedded in the grid infrastructure. This article focuses on the integration of database grids, or grid-accessible databases, in industry, using Oracle products as examples. Vendors such as Oracle and IBM are providing grid-enabled databases that are intended to make enterprise systems unbreakable and highly available. Oracle has been at the forefront of this endeavor with its database products. In recognition of the significant new capabilities required to power grid computing, Oracle has named its new technology products Oracle 10g (g for grid). Oracle provides seamless availability through its database products with features such as Streams, transportable tablespaces, data hubs, Ultra Search and Real Application Clusters. Although companies will not want to distribute resources randomly across the Internet, they will embrace enterprise database grids, just as they embraced the Internet in the form of intranets. For the business world, database grids will help achieve high hardware utilization and resource sharing, high availability, flexibility, incrementally scalable low-cost components and reduced administrative overhead (Kumar & Burleson, 2005; Kusnetzky & Olofson, 2004).

Key Terms in this Chapter

Grid Computing: A style of computing that dynamically pools IT resources together for use based on resource need. It allows organizations to provision and scale resources as needs arise, thereby preventing the underutilization of resources (computers, networks, data archives, instruments).

Semantic Web: An information-processing model in which computers, using the Resource Description Framework (RDF) and other technologies, can explicitly associate meanings with data or parse relationships between data without human intervention.

Virtualization: A form of abstraction that gives the consumer location- and technology-transparent access to resources. It decouples the tight connections between providers and consumers of resources, thus allowing multiple users to share the same resources as needs arise.

Information Grid: This grid shares information across multiple consumers and applications. It unlocks fragmented data from proprietary applications by treating information as a resource to be shared across the grid.

Infrastructure Grid: This grid pools, shares and reuses infrastructure resources such as hardware, software, storage and networks across multiple applications.

Provisioning: The allocation of resources to consumers on demand. A system determines the specific need of the consumer and provides the resources as requested.

Silos/Islands of Applications/Computing: A condition in which servers or computing resources sit idle most of the time, whenever peak load is not reached. Such IT systems are not designed to share resources with each other, thus creating islands of information and computing infrastructure within a single enterprise.

Service-Oriented Architecture (SOA): A form of software design that allows different applications to interact in business processes regardless of specific technologies such as programming languages and operating systems.

Applications Grid: This grid shares and reuses application code, using software technologies such as service-oriented architectures to facilitate the sharing of business logic among multiple applications.
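
To make the pooling and provisioning terms above more concrete, the following is a minimal Python sketch of a shared resource pool. It is an illustration only and does not represent Oracle's or any other vendor's implementation: the class name `ResourcePool`, its methods, the node counts and the application names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Hypothetical pool of interchangeable nodes shared by several applications."""
    total_nodes: int
    allocations: dict = field(default_factory=dict)  # consumer name -> nodes held

    def free_nodes(self) -> int:
        # Nodes not currently held by any consumer.
        return self.total_nodes - sum(self.allocations.values())

    def provision(self, consumer: str, nodes: int) -> bool:
        """Grant nodes on demand if the pool can cover the whole request."""
        if nodes <= 0 or nodes > self.free_nodes():
            return False
        self.allocations[consumer] = self.allocations.get(consumer, 0) + nodes
        return True

    def release(self, consumer: str) -> None:
        # Return all of a consumer's nodes to the pool for reuse by others.
        self.allocations.pop(consumer, None)

    def utilization(self) -> float:
        # Fraction of the pool currently in use.
        return sum(self.allocations.values()) / self.total_nodes


# Illustrative usage with made-up application names and sizes.
pool = ResourcePool(total_nodes=10)
pool.provision("payroll", 3)
pool.provision("reporting", 5)
denied = pool.provision("analytics", 4)   # only 2 nodes are free, so this is refused
pool.release("reporting")                 # reporting finishes and frees its 5 nodes
granted = pool.provision("analytics", 4)  # the freed nodes can now be reused
print(denied, granted, pool.utilization())
```

In a silo arrangement each application would own its nodes permanently, so the nodes freed by one application could not serve another. In this sketch the same ten nodes serve three applications in turn, which is the kind of sharing the definitions of grid computing and provisioning describe.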
