Enhancing the Resiliency of Smart Grid Monitoring and Control

Copyright: © 2018 |Pages: 10
DOI: 10.4018/978-1-5225-2255-3.ch267


In this chapter, we present the justification for, and a feasibility study of, applying Byzantine fault tolerance (BFT) technology to electric power grid health monitoring. We propose a set of BFT mechanisms needed to handle the reporting of PMU data and the issuing of control commands to the IEDs. We report an empirical study that assesses the feasibility of using BFT technology for reliable and secure electric power grid health monitoring and control. We show that in a LAN environment, the overhead and jitter introduced by the BFT mechanisms are negligible. Consequently, Byzantine fault tolerance could readily be used to improve the security and reliability of electric power grid monitoring and control while meeting the stringent real-time communication requirements of SCADA operations.
Chapter Preview


The smart grid has been one of the most active research areas in recent years. Its development is partially driven by the fact that the traditional data communication infrastructure for the electric power grid can no longer meet the needs of new developments (Wang, Xu, & Khanna, 2011):

  • The recent deregulation would allow many independent parties to enter the utility industry by offering alternative channels for electric power generation, distribution, and trade. This inevitably demands timely, reliable and secure information exchanges among these parties (Bose, 2005).

  • The current data communication infrastructure lacks support for large-scale real-time coordination among different electric power grid health monitoring and control systems. Such coordination could have prevented the massive 2003 blackout in North America (Birman et al., 2005).

  • The use of modern computer networking technology could also revolutionize the everyday electric power grid operations, as shown by the huge benefits of substation automation and the use of Phasor Measurement Units (PMUs) for electric power grid health monitoring (Melliopoulos, 2007).

However, the openness and the ease of information sharing and cooperation brought by the smart grid also increase the likelihood of cyber attacks on the electric power grid, as demonstrated by an experiment conducted by the US Department of Energy’s Idaho Lab (CNN, 2007). To address this vulnerability, intrusion detection and intrusion tolerance techniques must be used to enhance the current and future data communication infrastructure for the electric power grid. Byzantine fault tolerance is a fundamental technique for achieving this objective (Castro & Liskov, 2002; Zhao, 2014a).

In this chapter, we focus our discussion on the security and reliability of smart grid health monitoring and control. We elaborate on the need for Byzantine fault tolerance and the challenges of applying it to this problem domain. In particular, we investigate experimentally the feasibility of using such sophisticated technology to meet the potentially very stringent real-time requirements of smart grid health monitoring and control, while ensuring a high degree of reliability and security for the system.



A Byzantine faulty process may behave arbitrarily. In particular, it may disseminate conflicting information to different components of a system, which constitutes a serious threat to the integrity of the system (Lamport, Shostak, & Pease, 1982). Because a Byzantine faulty process may also choose not to send a message, or refuse to respond to requests, it can exhibit crash fault behavior as well. Consider a scenario in which multiple PMUs periodically report their measurement results to a controller for electric power grid health monitoring. When it detects an abnormality, the controller may wish to issue specific control instructions to actuating devices, such as Intelligent Electronic Devices (IEDs) (Hossenlopp, 2007) located at the same substation as those PMUs, to alleviate the problem. Because of the critical role played by the controller, it must be replicated to ensure high availability. Otherwise, the controller would become a single point of failure. The main components and their interactions are illustrated in Figure 1.

Figure 1.

The interaction of substation devices (PMUs and IEDs) and the controller replicas
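
The way replication can mask a faulty controller replica can be illustrated with a small sketch. It assumes, as in the PBFT algorithm of Castro and Liskov, that 3f + 1 controller replicas are deployed to tolerate f Byzantine faults, and that a receiving device acts on a command only when at least f + 1 distinct replicas report the same command. The function names, replica identifiers, and command strings below are hypothetical and are not taken from the chapter; message authentication is assumed to be handled elsewhere.

```python
from collections import Counter
from typing import Dict, Optional


def max_faulty(n_replicas: int) -> int:
    """Largest f such that n_replicas >= 3f + 1 (assumed PBFT-style sizing)."""
    return (n_replicas - 1) // 3


def accept_command(commands: Dict[str, str], n_replicas: int) -> Optional[str]:
    """Return the command backed by at least f + 1 distinct replicas, else None.

    `commands` maps a replica id to the command that replica sent, so each
    replica contributes at most one vote.
    """
    if not commands:
        return None
    f = max_faulty(n_replicas)
    command, votes = Counter(commands.values()).most_common(1)[0]
    # f + 1 matching votes guarantee that at least one correct replica agrees.
    return command if votes >= f + 1 else None


# Hypothetical example: 4 replicas tolerate f = 1 faulty replica.
received = {"r0": "open_breaker", "r1": "open_breaker", "r2": "close_breaker"}
decision = accept_command(received, n_replicas=4)
```

In this example, `r2` sends a conflicting command, but the two matching commands from `r0` and `r1` reach the f + 1 threshold, so the conflicting command is ignored. With only two disagreeing messages, no command would be accepted.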


However, the controller replicas, the PMUs, and the IEDs might all be compromised under cyber attacks. Consider the following two scenarios:

Key Terms in this Chapter

Sampling Rate: It is defined as the number of samples taken per unit of time.

Intelligent Electronic Device (IED): It is an actuating device that is capable of receiving commands from a controller. Example IEDs include protective relaying devices and voltage regulators.

Phasor Measurement Unit (PMU): It is a device that measures the electrical waves in an electric power grid. The measurements must be synchronized with a global clock, such as GPS.

View Change: It refers to the configuration change of the group of replicas that engage in Byzantine fault tolerance. When the primary is suspected of being faulty, a new view is initiated so that a different replica is elected as the primary for the new view.

Jitter: It refers to the deviation from the periodicity of a sequence of events or signals.

SCADA: It is short for Supervisory Control and Data Acquisition. It is a type of industrial control system that monitors and controls industrial processes that exist physically.

Normal Operation: It refers to the operation of an algorithm during a period in which either there is no fault, or the faults do not disrupt its operation. For example, when a backup replica crashes, the PBFT algorithm would still operate as normal.

Byzantine Fault Tolerance: It refers to the capability of a system to tolerate Byzantine faults.
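
The Sampling Rate and Jitter terms above can be related with a short sketch. It takes jitter to be the deviation of each inter-arrival gap from the nominal period implied by the sampling rate, which is only one of several possible ways to quantify it and is not necessarily the measure used in the chapter's study. The function name, the 2 Hz rate, and the timestamps are hypothetical values chosen for illustration.

```python
from statistics import mean
from typing import Dict, List


def inter_arrival_jitter(arrival_times: List[float],
                         sampling_rate_hz: float) -> Dict[str, float]:
    """Summarize how far inter-arrival gaps deviate from the nominal period.

    `arrival_times` are timestamps in seconds, in arrival order.
    """
    if len(arrival_times) < 2:
        raise ValueError("need at least two arrival times")
    period = 1.0 / sampling_rate_hz  # nominal gap between consecutive samples
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    deviations = [abs(gap - period) for gap in gaps]
    return {"mean_abs": mean(deviations), "max_abs": max(deviations)}


# Hypothetical example: samples expected every 0.5 s (2 Hz), the third is late.
stats = inter_arrival_jitter([0.0, 0.5, 1.25], sampling_rate_hz=2.0)
```

Here the first gap matches the nominal period and the second is 0.25 s too long, so the summary reports a maximum deviation of 0.25 s and a mean deviation of 0.125 s.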
