Katana: Towards Patching as a Runtime Part of the Compiler-Linker-Loader Toolchain

Sergey Bratus (Dartmouth College, USA), James Oakley (Dartmouth College, USA), Ashwin Ramaswamy (Dartmouth College, USA), Sean W. Smith (Dartmouth College, USA) and Michael E. Locasto (George Mason University, USA)
DOI: 10.4018/978-1-4666-1580-9.ch012

Abstract

The mechanics of hot patching (the process of upgrading a program while it executes) remain understudied, even though hot patching offers practical benefits for both consumer and mission-critical systems. A reliable hot patching procedure would serve particularly well by reducing the downtime necessary for critical functionality or security upgrades. However, hot patching also carries the risk (real or perceived) of leaving the system in an inconsistent state, which leads many owners to forgo it as too risky; for systems where availability is critical, this decision may result in leaving systems unpatched and vulnerable. In this paper, the authors present a novel method for hot patching ELF binaries that supports synchronized updates of global data and code, as well as reasoning about the results of applying the hot patch. To this end, the Patch Object format was developed to encode patches as a special type of ELF relocatable object file. The authors then built a tool, Katana, which automatically creates these Patch Objects as a by-product of the standard source build process. Katana also allows an end-user to apply the Patch Objects to a running process.

1. Introduction

It is somewhat ironic that users and organizations hesitate to apply patches, whose stated purpose is to support availability or reliability, precisely because the process of doing so can lead to downtime (both from the patching process itself and from unanticipated issues with the patch). Periodic reboots in desktop systems, irrespective of the vendor, are at best annoying. Reboots in enterprise environments (e.g., trading, e-commerce, core network systems), even for a few minutes, imply large revenue loss, or else require an extensive backup and failover infrastructure with rolling updates to mitigate such loss.

We question whether this de facto acceptance of significant downtime and redundant infrastructure should not be abandoned in favor of a reliable hot patching process.

Software, the product of an inherently human process, remains a flawed and incomplete artifact. This reality leads to the uncomfortable inevitability of future fixes, upgrades, and enhancements. Given the way such fixes are currently applied (i.e., patch and reboot), developers accept downtime as a foregone conclusion even as the software is released — and deployers who resist downtime resist the patches.

While patches themselves are a necessity, we believe that the process of applying them remains rather crude. First, the target process is terminated; the new binary and corresponding libraries (if any) are then written over the older versions; the system is restarted if necessary; and finally the upgraded application begins execution. Besides the appreciable loss in uptime, all context held by the application is also lost, unless the application has saved its state to persistent storage (Candea & Fox, 2003; Brown & Patterson, 2002) and later restores it (which is expensive to design for, implement, and execute). In the case of mission-critical services, even after a major flaw is unveiled and a patch subsequently created, administrators must choose between security (applying the patch) and availability. This conundrum motivates our work on hot patching: applying a patch without restarting the program and thereby losing state and time. We focus on systems, such as those found in the cyber infrastructure for the power grid, which require high availability and which store significant state (that would be lost on a restart).
