Networking technologies are advancing rapidly in pursuit of optimum performance, which has increased the importance of non-conventional computing methods. In the modern world of pervasive business systems, time is money. The better a system fulfills the needs of the requesting user, the more revenue the business will generate. The modern world is service-oriented, and therefore providing customers with reliable and fast service delivery is of paramount importance. In this article we present a scheme to increase the reliability of business systems. The arrival of ubiquitous computing has intensified this need even further, and people hold high expectations of this technology. In Morikawa (2004), the authors divide the vision of ubiquitous computing into two categories: “3C everywhere and physical interaction.” 3C consists of “computing everywhere,” “content everywhere,” and “connectivity everywhere.” “Physical interaction” connects the hidden world of ubiquitous sensors with the real world. This wide area of coverage and high scalability make a ubiquitous system quite fragile, not only toward external threats but toward internal malfunctioning too. Given the high probability of “abnormal behavior,” it becomes all the more important to have knowledge of faults and their root causes. As described in Yau, Wang, and Karim (2002), application failures are like diseases, and there can be many types of faults with matching symptoms; thus fault localization and categorization are very important. Unlike in Hung et al. (2005) and Steglich and Arbanowski (2004), we cannot categorize all abnormal functionalities into fault tolerance or (re)configuration domains, simply because faults do not have any predefined pattern; rather, we have to find those patterns. Moreover, as in Steglich and Arbanowski (2004), the “without foresight” type of repair is desirable in ubiquitous systems.
The conventional FCAPS (Fault, Configuration, Accounting, Performance, Security) network management model places all management functions in one group, but we argue that separating management functions into different segments is mandatory in self management paradigms. This is because, in highly dynamic and always available very wide area networks, a fault can be atomic (caused by one atomic reason) or it can be a set of many faults (caused by many atomic or related reasons). It is often good practice to break a problem into smaller atomic problems and then solve them (Chaudhry, Park, & Hong, 2006). If we classify all the different types of faults (atomic, related, and composite) into one fault management category, the results would not be satisfactory, nor would the system be able to recover well from the “abnormal state.” Since the side effects of system stability and self healing actions are not yet known (Yau et al., 2002), we cannot afford to assume that running self management modules alongside the functional modules of the core system will have no negative effect on system performance. For example, if the system is working properly, there is no need for fault management modules to be active. Lastly, instead of a fault-centric approach, we should take a recovery-centric approach, because our objective is to increase system availability. In this article we present the autonomic self healing engine (ASHE) architecture for ubiquitous smart systems. We identify the problem context through artificial immune system techniques and vaccinate (deploy a solution to) the system through dynamically composed applications. The services involved in the service composition process may or may not be related, but once they are composed into an application they behave as specified in their composition scheme.
After the system recovers, the vaccines are dissolved to free system resources, because they use the system’s own resources to perform the recovery. When the system is running in a normal state, all self management modules are turned off except context awareness and self optimization. These two are always on, to monitor and optimize the system, respectively.
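The activation policy and vaccine lifecycle described above can be illustrated with a minimal Python sketch. It is not the ASHE implementation. The names `SystemState`, `active_modules`, and `Vaccine`, the on-demand module names `self_healing` and `self_configuration`, and the integer resource counter are all assumptions introduced here for illustration. Only the two always-on modules come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class SystemState(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"


# The two modules the text says are always on.
ALWAYS_ON = frozenset({"context_awareness", "self_optimization"})
# Hypothetical names for the modules that are switched on only when needed.
ON_DEMAND = frozenset({"self_healing", "self_configuration"})


def active_modules(state):
    """Return the self management modules that should run in the given state."""
    if state is SystemState.NORMAL:
        return set(ALWAYS_ON)
    return set(ALWAYS_ON | ON_DEMAND)


@dataclass
class Vaccine:
    """A dynamically composed application (assumed shape, for illustration)."""
    services: list          # services bound together by the composition scheme
    reserved_units: int     # resources borrowed from the system being healed
    dissolved: bool = False

    def dissolve(self):
        """Release the borrowed resources after recovery; return units freed."""
        freed = 0 if self.dissolved else self.reserved_units
        self.reserved_units = 0
        self.dissolved = True
        return freed


if __name__ == "__main__":
    print(sorted(active_modules(SystemState.NORMAL)))
    vaccine = Vaccine(services=["diagnose", "restart"], reserved_units=3)
    print(vaccine.dissolve())
```

Keeping the always-on set separate from the on-demand set mirrors the argument that fault management should not consume resources while the system is healthy.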
In this section we compare our work with the RoSES project and the Semantic Anomaly Detection Research Project.
Key Terms in this Chapter
Self Management: The ability of entities to manage their resources by themselves.
Mesh Networks: Mesh networking is a way to route data, voice, and instructions between nodes. It allows for continuous connections and reconfiguration around blocked paths by “hopping” from node to node until a connection can be established.
MANET: Abbreviation of Mobile Ad hoc NETwork. It identifies a particular kind of ad hoc network, namely a mobile network in which stations can move around and change the network topology.
Self Healing: Healing is the restoration of an entity’s normal functionality. Self healing is the ability of an entity to perform that restoration by itself.
Autonomic Computing: Computer systems and networks that reconfigure themselves in response to changing conditions and are self healing in the event of failure. “Autonomic” means “automatic responses” to unpredictable events. At the very least, “autonomic” implies that less human intervention is required for continued operation under such conditions.
Ubiquitous Computing: The branch of computing that deals with always connected, off-the-desktop mobile components that may or may not be subject to certain constraints, for example battery life, computing power, mobility management, and so forth.
Service Composition: The process of developing an application by binding various services together.