InfoScipedia
A Free Service of IGI Global Publishing House
Below please find a list of definitions for the term that
you selected from multiple scholarly research resources.

What is Fault Tolerance?

Handbook of Research on P2P and Grid Systems for Service-Oriented Computing: Models, Methodologies and Applications
The ability of a system or an application (in software engineering) to keep operating properly in the event of a failure, or at least to continue operating with minimal impact.
Published in Chapter:
A Fault Tolerant Decentralized Scheduling in Large Scale Distributed Systems
Florin Pop (University “Politehnica” of Bucharest, ROMANIA)
DOI: 10.4018/978-1-61520-686-5.ch024
Abstract
This chapter presents a fault-tolerant framework for application scheduling in large-scale distributed systems (LSDS). Due to the specific characteristics and requirements of distributed systems, a good scheduling model should be dynamic. More specifically, it should adapt scheduling decisions to changes in resource state, which are commonly captured through monitoring. The scheduler and the monitor are two important middleware pieces that correlate their actions to ensure the high-performance execution of distributed applications. The chapter presents and analyses an agent-based architecture for scheduling in large-scale distributed systems. User and resource management are then presented. The optimization schemes for scheduling consider a near-optimal algorithm for distributed scheduling, and the chapter presents a solution for scheduling optimization. Finally, the chapter covers and explains the fault tolerance cases for Grid environments and describes two possible scenarios for the scheduling system.
More Results
Fault Tolerant Topology Design for Ad Hoc and Sensor Networks
A network is fault tolerant if it can survive a single node/link failure, and k-fault tolerant if it can survive k simultaneous node/link failures.
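The definition above can be made concrete with a small brute-force check. This is a minimal sketch, not a method taken from the cited chapter: it assumes that "survive" means the remaining nodes stay connected, it considers node failures only, and the names `is_connected` and `survives_k_node_failures` are hypothetical.

```python
from itertools import combinations


def is_connected(adj, removed):
    """Return True if the nodes not in `removed` form one connected component."""
    alive = [n for n in adj if n not in removed]
    if not alive:
        return True  # an empty network is treated as trivially connected
    seen = {alive[0]}
    stack = [alive[0]]
    while stack:
        node = stack.pop()
        for neighbour in adj[node]:
            if neighbour not in removed and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(alive)


def survives_k_node_failures(adj, k):
    """Check every set of up to k failed nodes (exponential; small graphs only)."""
    nodes = list(adj)
    for size in range(k + 1):
        for failed in combinations(nodes, size):
            if not is_connected(adj, set(failed)):
                return False
    return True


# A 4-node ring, given as an undirected adjacency list.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(survives_k_node_failures(ring, 1))
print(survives_k_node_failures(ring, 2))
```

The ring tolerates any single node failure, because the survivors form a path, but not two simultaneous failures: removing two opposite nodes isolates the other two. Exhaustive enumeration is only practical for small networks.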
Robust Unknown Input Observer-Based Fast Adaptive Fault Estimation: Application to Mobile Robot
A means of avoiding service failures in the presence of faults. Repair and fault tolerance are related concepts (Avizienis et al., 2004); the distinction between fault tolerance and maintenance is that maintenance involves the participation of an external agent, e.g., a repairman, test equipment, or remote reloading of software. Furthermore, repair is part of fault removal (during the use phase), and fault forecasting usually considers repair situations. In fact, repair can be seen as a fault tolerance activity within a larger system that includes the system being repaired and the people and other systems that perform such repairs.
Requirements to Products and Processes for Software of Safety Important NPP I&C Systems
The ability of software to retain a certain level of functioning when software malfunctions occur.
Scalable Fault Tolerance for Large-Scale Parallel and Distributed Computing
Fault tolerance is the property of a system that enables it to continue operating properly after a failure has occurred in the system.
Distributed Computing for Internet of Things (IoT)
An attribute of a system that enables it to carry on operating despite the failure of some of its components.
Methods for Dependability and Security Analysis of Large Networks
This is the ability of a system or component to continue normal operation despite the presence of (unexpected) hardware or software faults. There are many levels of fault tolerance, the lowest being the ability to continue operation in the event of a power failure. Many fault-tolerant computer systems mirror all operations, that is, every operation is performed on two or more duplicate systems, so if one fails the other can take over.
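The mirroring idea described above can be sketched as a tiny key-value store that applies every write to all replicas and reads from the first replica that still responds. This is an illustrative sketch under stated assumptions, not the design of any particular product: the `Replica` and `MirroredStore` classes are hypothetical, and a failure is simulated by flipping an `alive` flag.

```python
class Replica:
    """One duplicate system holding a full copy of the data (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self.alive = True

    def apply(self, key, value):
        if not self.alive:
            raise RuntimeError(f"{self.name} is down")
        self.state[key] = value

    def read(self, key):
        if not self.alive:
            raise RuntimeError(f"{self.name} is down")
        return self.state[key]


class MirroredStore:
    """Performs every operation on all replicas so any one can take over."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def write(self, key, value):
        succeeded = 0
        for replica in self.replicas:
            try:
                replica.apply(key, value)
                succeeded += 1
            except RuntimeError:
                continue  # skip the failed replica, keep mirroring to the rest
        if succeeded == 0:
            raise RuntimeError("all replicas failed")

    def read(self, key):
        for replica in self.replicas:
            try:
                return replica.read(key)
            except RuntimeError:
                continue  # fail over to the next replica
        raise RuntimeError("all replicas failed")


primary, mirror = Replica("primary"), Replica("mirror")
store = MirroredStore([primary, mirror])
store.write("x", 1)
primary.alive = False   # simulate a failure of the first system
print(store.read("x"))  # the mirror takes over
```

The sketch deliberately omits resynchronisation: a replica that misses writes while it is down would hold stale data when it comes back, so a real system needs a recovery step before the replica rejoins.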
Reliability, Fault Tolerance, and Quality-of-Service in Cloud Computing: Analysing Characteristics
It is the property that enables a system to continue operating properly in the event of failure of some of its components.
Critical Nodes Detection in IoT-Based Cyber-Physical Systems: Applications, Methods, and Challenges
Advances in Fault-Tolerant Multi-Agent Systems
A system is fault tolerant if its behavior is consistent with its specifications regardless of whether any component fails.
Grid Computing in 3D Electron Microscopy Reconstruction
In computing, the ability of code to manage and recover from a situation which, if left untreated, would cause the application to fail and/or produce unexpected results.
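At the code level, one common way to manage and recover from such a situation is to retry the failing operation and, if it keeps failing, switch to a degraded fallback. The sketch below is only an illustration of that pattern, not the approach of the cited chapter; `run_with_recovery` and its parameters are hypothetical names.

```python
import time


def run_with_recovery(task, retries=3, delay=0.0, fallback=None):
    """Call `task` up to `retries` times; use `fallback` if every attempt fails."""
    last_error = None
    for _ in range(retries):
        try:
            return task()
        except Exception as exc:  # broad on purpose: any failure triggers recovery
            last_error = exc
            time.sleep(delay)
    if fallback is not None:
        return fallback()  # degraded but defined result
    raise last_error


# Hypothetical flaky task: fails twice, then succeeds.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise IOError("transient failure")
    return "ok"


print(run_with_recovery(flaky))
```

Retrying only helps with transient faults. For permanent faults the fallback path, or an explicit error, is what keeps the application from producing unexpected results.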
Simultaneous MultiThreading Microarchitecture
The property that enables a system (often computer-based) to continue operating properly in the event of the failure of (or one or more faults within) some of its components.
Optimising P2P Networks by Adaptive Overlay Construction
The ability of a system to respond gracefully to an unexpected hardware or software failure.