1. Introduction
Continuous chip miniaturization, combined with the limits imposed by chip power dissipation, has brought the processor design industry to the end of a decades-long trend of increasing clock frequencies. Instead, next-generation computing systems are expected to be built with tens of cores per chip. Thus, to achieve high performance, computer designers are now focusing on using a larger number of cores instead of improving the performance of a single core. This paradigm shift, however, has had a profound impact on enterprises and businesses, such as power system control centers, that rely on legacy codes based on serial execution. Since processor clock frequency and single-core performance are plateauing, the performance achievable with serial execution is unlikely to improve. At the same time, due to rising electricity demand, the size of power systems is increasing, which necessitates higher computational power for performing stability analysis. Thus, power system control centers must now move towards parallel computing with multicore and many-core processors. Further, since load-imbalanced scheduling is likely to nullify the advantage obtained from parallelization, the use of efficient load-balancing techniques is also essential for achieving high performance gains.
In this paper, we present an approach for parallelizing power system contingency analysis while also achieving load balancing. We use the Chapel language (Chamberlain, Callahan, & Zima, 2007), a state-of-the-art high-performance parallel programming language. Chapel has been developed by Cray Inc. and is intended to increase the programmability, performance and portability of high-performance computing systems and applications. It also aims to enable interoperability with existing languages to promote code reuse and easy adoption by programmers. In Chapel, concurrency is controlled by the language constructs themselves and not by library extensions or compiler directives. To achieve load balancing, we use an efficient dynamic scheduling technique, viz. work-stealing scheduling (Blumofe & Leiserson, 1994), which has been shown to be efficient in terms of space requirement and execution time. We present the important features of Chapel and the design choices which enable us to achieve high performance gains.
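To make the load-balancing argument concrete, the following is a minimal, deterministic sketch of work stealing, written in Python for illustration rather than in Chapel. It is not the implementation used in this paper. The function names and the task durations are hypothetical, the cost of a steal is ignored, and the victim is chosen as the most loaded worker, whereas practical work-stealing schedulers usually pick the victim at random. Each task stands for one contingency whose simulation time is assumed to be known in advance, which is only true in a simulation like this one.

```python
from collections import deque


def static_makespan(durations, workers):
    """Finish time when tasks are split into contiguous blocks up front
    and no worker ever helps another (no load balancing)."""
    block = -(-len(durations) // workers)  # ceiling division
    loads = [sum(durations[w * block:(w + 1) * block]) for w in range(workers)]
    return max(loads)


def work_stealing_makespan(durations, workers):
    """Finish time under a simplified work-stealing policy.

    Each worker owns a deque of tasks. A worker takes tasks from the tail
    of its own deque; once that deque is empty, it steals one task from
    the head of the deque of the most loaded worker (an assumption made
    here to keep the simulation deterministic).
    """
    block = -(-len(durations) // workers)
    queues = [deque(durations[w * block:(w + 1) * block]) for w in range(workers)]
    clock = [0.0] * workers  # time at which each worker next becomes free

    while any(queues):
        # The worker that becomes free earliest acts next.
        w = min(range(workers), key=lambda i: clock[i])
        if queues[w]:
            task = queues[w].pop()  # own work: take from the tail
        else:
            victim = max(range(workers), key=lambda i: len(queues[i]))
            task = queues[victim].popleft()  # stolen work: take from the head
        clock[w] += task

    return max(clock)


if __name__ == "__main__":
    # Hypothetical simulation times: four slow contingencies, four fast ones.
    times = [8, 8, 8, 8, 1, 1, 1, 1]
    print("static split :", static_makespan(times, 2))
    print("work stealing:", work_stealing_makespan(times, 2))
```

With these hypothetical durations and two workers, the static split leaves one worker with all of the slow tasks, while under work stealing the idle worker takes over part of that backlog and the overall finish time drops. The gap between the two numbers grows with the variation in task durations, which is the situation that dynamic contingency analysis presents.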
We simulate hundreds of contingencies of a large, 13029-bus power system. We parallelize the task of contingency analysis using 2, 4, 8 and 16 cores and compare the performance with that of serial execution. The results show that our approach scales well with the number of cores and provides large computational gains over serial execution. Further, it outperforms a conventional scheduling technique, namely the master-slave scheduling algorithm. The computational advantages provided by our approach can enable real-time simulation of power system contingencies and thus help avoid the harmful effects of component failures.
Our approach is highly useful for control center operators in analyzing a large number of contingencies and thus in taking suitable corrective and preventive actions against catastrophic events such as blackouts. Our approach does not require modifying or rewriting legacy code and hence can be integrated into large commercial simulation software with minimal overhead. Several previous studies on parallelization of contingency analysis have focused only on steady-state contingency analysis (Huang, Chen, & Nieplocha, 2009). By contrast, in this paper we perform dynamic contingency analysis, which presents significant challenges to parallelization due to variation in the simulation times of different contingencies.
The rest of the paper is organized as follows. Section 2 reviews the use of high-performance computing (HPC) in power systems, presents a brief background on the Chapel language, and compares Chapel with other languages. Section 3 discusses the overall parallelization approach and explains the work-stealing-based algorithms. Section 4 discusses the salient features of our approach. Section 5 presents the experimental results, discusses the optimizations incorporated, and analyzes the results. Finally, Section 6 concludes the paper and discusses future work.