This chapter is management oriented. It first proposes a general theoretical context for IT disasters within the wider class of all types of disasters to which a business is subject, whether caused by natural or human action. After this theoretical discussion, it suggests numerous practical, proactive measures that can be applied both before and after an IT disaster. Implementation of these measures should contribute greatly to reducing both the occurrence of disasters and the damage that might be wrought by most adverse events not under our control.
Adversity planning has come to the forefront of the public’s concerns because of both the scope and frequency of newsmaking natural disasters. “Ordinary” computer and networking failures, no matter how far-reaching the consequences or how important the entities affected, are reported with such regularity that they hardly cause a stir in the public arena, except among those affected, for example, the investment community (Campbell, Gordon, Loeb, & Zhou, 2003; Cavusoglu, Mishra, & Raghunathan, 2004). Both these realizations point to the need to reassess the nature of past and probable future problems and to institute effective preventive and recovery measures. A theoretical context is provided for the subsequent discussion of disasters.
This chapter considers topics related to disasters in a natural progression:
What we think about disasters in general and why we need to revise the conventional “wisdom” about disasters
Why disaster matters so much—its consequences
Legal requirements and methods for disaster planning
How to be successful in minimizing loss from a disaster and instituting controls
Ensuring physical security
Post-disaster IT continuity and recovery
Revising Common Assumptions About Disaster
First we need to revise our assumptions about disasters:
Catastrophes occur independently of one another. Comment: Any given disaster may be due to a cause that does not disappear after the first strike. Malware creators propagate their exploits in waves.
Disasters tend to repeat themselves with only minor differences.
In the short term, the chances of experiencing a calamity are low.
It is unlikely that a calamity like the last one will occur any time soon.
The number of calamitous events in any narrow timeframe or given place will be constant over time.
With knowledge gained through bitter and repeated experience, organizations would do well to call into question the guiding assumptions mentioned above and consider the following points (which counter the above assumptions):
Weather, seismic, and especially socially caused traumatic events seem to feed off one another.
Greater communication and transportation resources are now available to ill-intentioned people. Comment: Even without any apparent communication, a relatively recently noted phenomenon is the simultaneous discovery and creation of new behaviors, both good and ill, in geographically separated locales. Ideas that crop up in one place may arise in another at approximately the same time. This situation is evidently part of nature. For instance, Japanese primatologists observed a single macaque monkey that learned how to wash sweet potatoes before eating them, and this learned procedure spread to the entire troop (Narby, 2005). Other instances of primate learning have spread to other locations without any apparent communication.
Increased extremism and climatic changes (brought about by greater industrial activity, perhaps) trigger more human-made disasters.
Ideologically motivated damage and employee sabotage are increasingly the norm.
One calamitous event can be seen to stimulate a recurrence rather quickly. An instability here and now augments instability later and nearby.
The pace of adverse happenings is quickening.
New kinds of threats that cause disaster are emerging, for example, denial-of-service attacks and rootkits.
Key Terms in this Chapter
Failback: Resumption of operations at the restored site of a disaster.
Rootkit: Software that introduces running programs, either legitimate or illegitimate, and hides them from the operating system; it may take control of a computer. A rootkit is notoriously difficult to detect and remove (Poulsen, 2003).
Security of a System: “An objective measure of the number of its vulnerabilities and their severity;” also the ability to detect, anticipate, and avoid attack or calamity.
Exposure: Amount of possible financial loss in a disaster.
Champion: A person of prestige in an organization who can ensure that a project will progress as expected.
Hot Site: Backup site usually operated by a service company enabling a company to resume its IT processing. Data are transmitted to the site in real time, so the site has immediately current data. The hardware is already in place to continue processing.
Warm Site: Backup location where data are transmitted only periodically instead of continuously. The hardware is already in place to continue processing from the last transmission of data.
Risk: Probability of a loss; sometimes, the probability of a loss multiplied by the exposure.
IT: Abbreviation of information technology; the functional unit in an organization that processes data to yield information.
Open Source: The actual coding statements in a program are made publicly available, without charge; responsible programmers are free to suggest changes or, in some cases, to modify or add to the program.
Failover: Ability of a disrupted system to switch to a working system with seamless continuity.
Halon Gas: An agent for extinguishing fires without causing the equipment damage that water might bring about.
Closed Source: A program with a restrictive license designed to maintain a degree of secrecy about the code; only the execution modules are distributed.
Cold Site: A relatively inexpensive alternative to other backup sites, since there is no hardware or transmitted data already in place to resume operations.
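The definitions of Risk and Exposure above can be combined into a simple calculation. The following is a minimal sketch, assuming the second reading of Risk (probability of a loss multiplied by the exposure); the function name `expected_loss`, the scenario names, and all figures are hypothetical and chosen only for illustration.

```python
def expected_loss(probability: float, exposure: float) -> float:
    """Risk taken as probability of a loss multiplied by the exposure."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if exposure < 0:
        raise ValueError("exposure must be non-negative")
    return probability * exposure


# Hypothetical scenarios: (annual probability of loss, exposure in dollars).
scenarios = {
    "server room fire": (0.01, 2_000_000),
    "rootkit infection": (0.10, 250_000),
}

# Rank scenarios from highest to lowest expected loss.
ranked = sorted(
    scenarios,
    key=lambda name: expected_loss(*scenarios[name]),
    reverse=True,
)

for name in ranked:
    probability, exposure = scenarios[name]
    print(f"{name}: {expected_loss(probability, exposure):,.0f}")
```

Under these assumed figures, a frequent event with modest exposure can outrank a rare event with large exposure, which is one reason the two quantities are worth estimating separately.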