Outage at UAA: A Week Without Critical Information Systems

Bogdan Hoanca, David Fitzgerald
Copyright: © 2013 | Pages: 8
DOI: 10.4018/jcit.2013040102

Abstract

With redundant hardware, it is rare that a disk failure results in downtime at the system level. System failures do sometimes occur, typically as a sequence of very rare events that leads to a catastrophic failure. This case describes how a combination of hardware and firmware failures, along with human error, led to the failure of a redundant disk storage unit, which in turn affected several enterprise systems at a major public university. Subsequently, a small number of conservative and seemingly “good” decisions in the process of restoring the system from backups led to negative outcomes, primarily additional downtime over the course of several days. The case illustrates how even well-considered and conservative decisions may seem flawed in hindsight. An important lesson from the case is that it is difficult to justify to management the provision of sufficient backup resources to prevent very low-probability failure events.

Setting The Stage

Given the importance of technology in delivering both distance and blended courses, the Information Technology Services (ITS) group at UAA is committed to delivering high availability and reliability across the variety of technology platforms it manages on the main campus and the community campuses. The head of ITS, CIO Rich Whitney, arrived at UAA in 2000, and over the years has seen his budget shrink even as his menu of service offerings has expanded considerably. At the same time, the university community has grown to rely more and more on ITS and to expect high availability from an increasingly complex system of interoperating applications. Some of the enterprise infrastructure in the University of Alaska system is funded and hosted centrally at the University of Alaska Statewide offices in Fairbanks, but much more is hosted and operated locally at each of the three campuses in the UA system, or even hosted and operated by one campus on behalf of all three. For example, UAA has its own instance of Blackboard, and it hosts eLive! for all three campuses.

To deliver quality and reliability with a shrinking budget, without adding staff positions, and with an ever-expanding menu of services, CIO Whitney flattened his organization, diligently eliminating management levels (see Figure 1). He also spent time and effort training and grooming staff, because the Anchorage community has rather limited availability of key technical personnel. Out-of-state recruiting is both expensive and problematic, with some newly hired staff members moving back to the Lower 48 as soon as they experience their first Alaskan winter. CIO Whitney also implemented the ITIL (Information Technology Infrastructure Library) framework, first in the Call Center and then across the entire ITS organization (Hoanca & Whitney, 2010), and has seen the typical increases in efficiency as a result (Feldman, 2006).

Figure 1. UAA IT services organizational chart
