High Performance Computing (HPC) applications are scientific applications that require significant CPU capabilities; they are also data-intensive, requiring large data storage. While many researchers have examined the performance of Amazon’s EC2 platform across some HPC benchmarks, an extensive comparative study of Amazon’s EC2 and Microsoft’s Windows Azure covering metrics such as memory bandwidth, I/O performance, and communication and computational performance is largely missing. The purpose of this paper is to use existing benchmarks to evaluate and analyze these metrics for EC2 and Windows Azure, platforms that span both the Infrastructure-as-a-Service and Platform-as-a-Service types. This was accomplished by running MPI versions of the STREAM, Interleaved or Random (IOR), and NAS Parallel (NPB) benchmarks on small and medium instance types. In addition, a new EC2 medium instance type (m1.medium) was included in the analysis. Together, these benchmarks measure memory bandwidth, I/O performance, and communication and computational performance.
With the advent of cloud computing, researchers have been trying to analyze its suitability for HPC applications. In this paper, we evaluate the two public cloud-computing platforms mentioned above. Small and medium instance types were chosen for benchmarking and collecting information on memory bandwidth, I/O performance, and communication and computational performance as the number of nodes in the cluster was increased from 1 to 2, 4, 6, and 8. The existing STREAM, IOR, and NPB benchmarks were used to measure these metrics.
Hence, this paper attempts to accomplish the following objectives:
Compare and study the variability of memory bandwidth between EC2 and Windows Azure using the STREAM benchmark.
Compare and study the variability of I/O performance between EC2 and Windows Azure using the IOR benchmark as the number of nodes in the cluster was increased.
Compare and study the variability of communication and computational performance between EC2 and Windows Azure using the NPB benchmarks as the number of nodes was increased.
“Cloud” refers to a combination of hardware and software applications available over the Internet as services. These services can be used to store, retrieve, and share data among systems connected to the Internet. The large data centers used to build this “cloud” are designed to support highly scalable applications. Cloud computing platforms are categorized as Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), and Infrastructure-as-a-Service (IaaS) (Sun Microsystems, 2009).
EC2 is Amazon’s Infrastructure-as-a-Service cloud platform and supports highly scalable applications, one of the requirements for HPC applications in the cloud.
Windows Azure is Microsoft’s public cloud application platform and is offered as Platform-as-a-Service. The platform can be used to build scalable applications and supports parallel processing, such as in a cluster; on Windows Azure, this means running several role instances simultaneously, all working in parallel to perform tasks. Windows Azure provides the HPC Scheduler for distributing work across the instances. The remaining sections of the paper discuss related work in section 2, our experimentation in section 3, results in section 4, and conclusions in section 5.
2.1. Memory Bandwidth
Because CPU processing speeds are increasing more quickly than computer memory speeds, high performance computing systems are increasingly limited by memory bandwidth rather than by the computational performance of the CPU. The ratio of CPU speed to memory speed is growing rapidly in high performance systems.
Evangelinos et al. tested the memory bandwidth of EC2 instances using the STREAM benchmark (Evangelinos, 2008). The results showed high bandwidth for the standard instance type, while the High-CPU medium instance delivered better bandwidth than one would expect from two cores sharing the same socket’s pins to main memory.
2.2. Input/Output Performance
Evangelinos et al. also tested I/O subsystem performance using the IOR benchmark in POSIX mode, issuing large read and write requests against both the local /tmp disk and the remote home directory on the standard small instance (Evangelinos, 2008). The results showed an appreciable difference between the write and read performance of the standard and High-CPU instances to/from local disk. In addition, while read performance from local disk appeared close between the two instance types, most measurements for the standard instance were in the range of 800 MB/s.
Ghoshal et al. presented results from benchmarking I/O performance across different cloud and HPC platforms to identify the major bottlenecks in existing infrastructure (Ghoshal, 2011). Their paper also compares I/O performance using the IOR benchmark on two cloud platforms, Amazon and the Magellan cloud test bed. The study measured both buffered I/O and direct I/O to understand the effects of buffer caches.