Byzantine Fault Tolerant Monitoring and Control for Electric Power Grid

Wenbing Zhao
DOI: 10.4018/978-1-4666-5888-2.ch261


The data communication infrastructure for the electric power grid urgently needs to be transformed to use modern computer networking technologies, for a number of reasons:

  • Recent deregulation allows many independent parties to enter the utility industry by offering alternative channels for electric power generation, distribution, and trade. This inevitably demands timely, reliable, and secure information exchange among these parties (Bose, 2005).

  • The current data communication infrastructure lacks support for large-scale real-time coordination among different electric power grid health monitoring and control systems. Such coordination could have prevented the massive 2003 blackout in North America (Birman et al., 2005).

  • The use of modern computer networking technology could also revolutionize everyday electric power grid operations, as shown by the substantial benefits of substation automation and of using Phasor Measurement Units (PMUs) for electric power grid health monitoring (Melliopoulos, 2007).

However, the openness and the ease of information sharing and cooperation brought by this transformation of the data communication infrastructure also increase the likelihood of cyber attacks on the electric power grid, as demonstrated by an experiment conducted by the US Department of Energy’s Idaho Lab (CNN, 2007). To address this vulnerability, intrusion detection and intrusion tolerance techniques must be used to strengthen the current and future data communication infrastructure for the electric power grid. Byzantine fault tolerance is a fundamental technique for achieving this objective (Castro & Liskov, 2002; Lamport, Shostak, & Pease, 1982).

In this article, we focus on the security and reliability of electric power grid health monitoring and control. We discuss in detail the need for Byzantine fault tolerance and the challenges of applying it to this problem domain. In particular, we investigate experimentally whether such sophisticated technology can meet the potentially very stringent real-time requirements of electric power grid health monitoring and control while ensuring a high degree of reliability and security.



A Byzantine faulty process may behave arbitrarily. In particular, it may disseminate conflicting information to different components of a system, which poses a serious threat to the integrity of the system (Lamport, Shostak, & Pease, 1982). Because a Byzantine faulty process may also choose not to send a message, or refuse to respond to requests, it can exhibit crash fault behavior as well. Consider a scenario in which multiple PMUs periodically report their measurements to a controller for electric power grid health monitoring. When it detects an abnormality, the controller may issue specific control instructions to actuating devices, such as Intelligent Electronic Devices (IEDs) (Hossenlopp, 2007) located at the same substation as those PMUs, to alleviate the problem. Because the controller plays such a critical role, it must be replicated to ensure high availability. Otherwise, the controller would become a single point of failure. The main components and their interactions are illustrated in Figure 1.

Figure 1. The interaction of substation devices (PMUs and IEDs) and the controller replicas
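One way to see how replication protects the actuating devices is to look at the decision an IED could make when commands arrive from several controller replicas. The sketch below is a minimal illustration and not the implementation studied in this chapter. It assumes that at most f replicas are Byzantine faulty, that messages are authenticated so that one replica cannot impersonate another, and that the IED acts only on a command reported by at least f + 1 distinct replicas, which guarantees that at least one non-faulty replica vouches for it. The function name accept_command, the replica identifiers, and the command strings are hypothetical.

```python
from collections import defaultdict


def accept_command(replies, f):
    """Return the command endorsed by at least f + 1 distinct replicas.

    replies: iterable of (replica_id, command) pairs, assumed authenticated.
    f:       assumed upper bound on the number of Byzantine faulty replicas.
    Returns None when no command has enough distinct endorsements yet.
    """
    voters = defaultdict(set)  # command -> set of replica ids that sent it
    for replica_id, command in replies:
        # A set ignores repeated messages from the same replica.
        voters[command].add(replica_id)
    for command, ids in voters.items():
        if len(ids) >= f + 1:
            return command
    return None


if __name__ == "__main__":
    # Hypothetical example: four replicas, at most one of them faulty.
    replies = [
        (0, "OPEN_BREAKER"),
        (1, "OPEN_BREAKER"),
        (2, "NO_OP"),          # a possibly faulty replica disagrees
        (3, "OPEN_BREAKER"),
    ]
    print(accept_command(replies, f=1))
```

With f = 1, a single faulty replica can neither force a command through by itself nor block one that the non-faulty replicas agree on. Agreement among the replicas themselves requires a full protocol such as PBFT (Castro & Liskov, 2002), which uses 3f + 1 replicas to tolerate f Byzantine faults; this sketch covers only the receiving side.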


However, the controller replicas, the PMUs, and the IEDs might be compromised under cyber attacks. Consider the following two scenarios:

Key Terms in this Chapter

Normal Operation: It refers to the operation of an algorithm during a period in which either there are no faults or the faults do not disrupt its operation. For example, when a backup replica crashes, the PBFT algorithm still operates as normal.

View Change: It refers to the configuration change of the group of replicas that engage in Byzantine fault tolerance. When the primary is suspected of being faulty, a new view is initiated so that a different replica is elected as the primary for the new view.

Phasor Measurement Unit (PMU): It is a device that measures the electrical waves in an electric power grid. The measurements must be synchronized with a global time source, such as GPS.

Sampling Rate: It is defined as the number of samples taken per unit of time.

Byzantine Fault Tolerance: It refers to the capability of a system to tolerate Byzantine faults.

Intelligent Electronic Device (IED): It is an actuating device that is capable of receiving commands from a controller. Example IEDs include protective relaying devices and voltage regulators.

SCADA: It is short for supervisory control and data acquisition. It is a type of industrial control system that monitors and controls industrial processes that exist physically.

Jitter: It refers to the deviation from the periodicity of a sequence of events or signals.
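To make the Sampling Rate and Jitter entries above concrete, the following sketch computes both from a list of measurement arrival times. It is only an illustration under stated assumptions: timestamps are integer milliseconds in increasing order, jitter is quantified as the deviation of each inter-arrival interval from a nominal period (one common way to measure it, among others), and the function names and sample values are hypothetical.

```python
def sampling_rate(timestamps_ms):
    """Average number of samples per second over the observed window."""
    span_ms = timestamps_ms[-1] - timestamps_ms[0]
    # n timestamps delimit n - 1 sampling intervals.
    return (len(timestamps_ms) - 1) * 1000 / span_ms


def jitter(timestamps_ms, period_ms):
    """Deviation of each inter-arrival interval from the nominal period."""
    pairs = zip(timestamps_ms, timestamps_ms[1:])
    return [(later - earlier) - period_ms for earlier, later in pairs]


if __name__ == "__main__":
    # Hypothetical arrival times for a source nominally reporting every 20 ms.
    arrivals = [0, 20, 41, 60, 80]
    print(sampling_rate(arrivals))          # 50.0 samples per second
    print(jitter(arrivals, period_ms=20))   # [0, 1, -1, 0]
```

A positive value means a measurement arrived later than the nominal period predicts, and a negative value means it arrived earlier.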
