Implementing Distributed, Self-Managing Computing Services Infrastructure using a Scalable, Parallel and Network-Centric Computing Model

Rao Mikkilineni, Giovanni Morana, Ian Seyler
DOI: 10.4018/978-1-4666-1631-8.ch004

Abstract

This chapter introduces a new network-centric computing model based on the Distributed Intelligent Managed Element (DIME) network architecture (DNA). A parallel signaling network, overlaid on a network of self-managed von Neumann computing nodes, implements dynamic fault, configuration, accounting, performance, and security management of both the nodes and the network, based on business priorities, workload variations, and latency constraints. Two implementations are described that demonstrate the feasibility of the model: one provides service virtualization at the Linux process level, and the other virtualizes a core in a many-core processor. Both point to an alternative way to assure end-to-end transaction reliability, availability, performance, and security in distributed Cloud computing, reducing the current complexity of configuring and managing virtual machines and simplifying the implementation of a Federation of Clouds.

Introduction

A federation is a group of parties that collaborate to achieve a common goal. Enterprises use federated systems to collaborate and execute common business processes spanning distributed resources that belong to different owners. A federated business model mandates a foundation of trust among the participants. The trust required for the service collaboration that a business process demands is often negotiated as service level agreements with specific requirements for service availability, performance, security, and cost. Federated systems, in essence, are distributed computing networks of stored program control (SPC) computing elements whose resources are shared to execute business processes toward a common goal. Sharing of resources and collaboration provide leverage and synergy, but they also pose problems such as contention for the same resources, issues of trust, and managing the impact of communication latency among the participants. These problems are well articulated in the literature, and the discipline of distributed computing (Tanenbaum and van Steen, 2002) is devoted to addressing them. Four major problems are often cited as key issues in designing distributed systems:

  1. Connection Management: Collaboration over distributed shared resources is possible only with a controlled way to assure the connection for the duration of the collaboration. In addition, the reliability, availability, utilization accounting, performance, and security of the resources have to be assured so that users can depend on the service levels. This is known as FCAPS (Fault, Configuration, Accounting, Performance and Security) management. Connection management allows proper allocation of resources to appropriate users, consistent with business priorities, workload requirements, and latency constraints. It also assures that the connection maintains the service levels negotiated between the consumers and the suppliers of the resources.

  2. Transparency: The shared resources in a distributed system may be physically deployed in different containers, and the components may be geographically separated. Any distributed system design must provide users and resources with access transparency, location transparency, and physical container transparency. Users must be able to specify service levels in terms of agreed-upon parameters such as business priorities, workload requirements, and latency constraints.

  3. Openness: In an ideal environment, resources are offered as services. Consumers either choose the services that meet their requirements, or specify their requirements so that service providers can tailor services to meet them. The specification and execution of services must support an open process in which services can be discovered and service levels matched to consumer requirements without depending on the underlying mechanisms by which the services are implemented. In addition, service composition mechanisms must be available so that consumers can dynamically create new value-added services.

  4. Scalability: As requirements in the form of business priorities, workload variations, or latency constraints change, the distributed system must be designed to scale accordingly. Scaling may involve dialing resources up or down, migrating them geographically, and extending their administrative reach based on policies that support centralized management, locally autonomous management, or a hybrid of the two with coordinated orchestration.
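
The matching of service levels to consumer requirements described under Openness can be illustrated with a minimal sketch. It is not taken from the chapter and does not represent the DIME network architecture; the names ServiceOffer, Requirement, and matching_offers, as well as the three service-level parameters (availability, latency, cost) and the sample values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceOffer:
    """A provider's advertised service levels (hypothetical fields)."""
    name: str
    availability: float    # fraction of time the service is available
    latency_ms: float      # typical response latency in milliseconds
    cost_per_hour: float   # price in arbitrary currency units


@dataclass(frozen=True)
class Requirement:
    """A consumer's required service levels (hypothetical fields)."""
    min_availability: float
    max_latency_ms: float
    max_cost_per_hour: float


def matching_offers(offers, req):
    """Return the offers that satisfy every requirement, cheapest first.

    The match depends only on advertised service levels, not on how
    each service is implemented.
    """
    acceptable = [
        o for o in offers
        if o.availability >= req.min_availability
        and o.latency_ms <= req.max_latency_ms
        and o.cost_per_hour <= req.max_cost_per_hour
    ]
    return sorted(acceptable, key=lambda o: o.cost_per_hour)


# Illustrative data only.
offers = [
    ServiceOffer("storage-a", 0.999, 20.0, 0.50),
    ServiceOffer("storage-b", 0.9999, 45.0, 0.80),
    ServiceOffer("storage-c", 0.99, 10.0, 0.20),
]
req = Requirement(min_availability=0.999, max_latency_ms=50.0,
                  max_cost_per_hour=1.00)
selected = [o.name for o in matching_offers(offers, req)]
```

In this sketch, storage-c is rejected because its availability falls below the requested level, even though it is the cheapest and fastest offer. A real federation would also need discovery, negotiation, and ongoing monitoring of the agreed levels, none of which is modeled here.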

Current-generation server, networking, and storage equipment and their management systems have evolved from server-centric, bandwidth-limited network architectures to today's Cloud computing architecture with virtual servers and broadband networks. During the last six decades, many layers of computing abstractions have been introduced to map the execution of complex computational workflows to a sequence of 1s and 0s that are eventually stored in memory and operated upon by the CPU to achieve the desired result. These include process definition languages, programming languages, file systems, databases, and operating systems.
