1. Introduction
There are many examples of similar system failures recurring and of negative side effects created by quick fixes. Introducing redundant safety mechanisms does little to reduce human error. As Perrow (1999, p. 260) pointed out, the more redundancy is used to promote safety, the greater the chance of spurious actuation; “redundancy is not always the correct design option to use.” While instrumentation is being improved to enable operators to run their operations more efficiently and certainly with greater ease, the risk appears to remain about the same.
Weick and Sutcliffe (2001, p. 81) explained why traditional total quality management (TQM) has failed. “We interpret efforts by organizations to embrace the quality movement as the beginning of a broader interest in reliability and mindfulness. But some research shows that quality programs have led to only modest gains...this might be the result of incomplete adoption. But we would go even further, and argue that the reason for incomplete adoption is the necessary infrastructure for reliable practice…is not in place even where TQM success stories are the rule. The conclusion is consistent with W.E. Deming’s insistence that quality comes from broad-based organizational vigilance for problems other than those found through standard statistical control methods.”
Turner and Pidgeon (1997, p. 88) identified six stages through which catastrophic disasters unfold, from initial beliefs to cultural readjustment: Stage I, initial beliefs and norms; Stage II, incubation period; Stage III, precipitating event; Stage IV, onset; Stage V, rescue and salvage; and Stage VI, full cultural readjustment. The second stage, the incubation period, is hard to identify because of the various side effects of quick fixes (Turner & Pidgeon, 1997); it therefore plays the crucial role in leading to catastrophic disaster. Many side effects of quick fixes to information and communication technology (ICT) systems have been identified (Nakamura & Kijima, 2009a). Two factors in particular make it difficult to prevent ICT system failures: the lack of a common language for understanding system failures and the lack of a methodology for preventing future system failures. These shortcomings result in local optimization and the introduction of quick fixes as countermeasures. Habermas (1970, 1975, 1984) argued that there are two fundamental conditions underpinning the sociological life of human beings: ‘work: technical interest’ and ‘interaction: practical interest’. Disagreements between individuals and groups are just as much a threat to the socio-cultural form of life as a failure to predict and control. The core idea of intervention methodologies is to accommodate multiple stakeholders and to identify the best methodology for restoring a failed system.
We propose the “system of system failures” (SOSF) meta-methodology as a common language for understanding system failures among the various stakeholders, and “total system intervention for system failure” (TSI for SF) as a methodology for preventing future system failures of the same type. The SOSF meta-methodology and a stakeholder matrix are used within the TSI for SF methodology. Application examples from ICT systems demonstrate that the TSI for SF methodology is effective.