Discovering Patterns for Architecture Simulation by Using Sequence Mining

Discovering Patterns for Architecture Simulation by Using Sequence Mining

Pinar Senkul (Middle East Technical University, Turkey), Nilufer Onder (Michigan Technological University, USA), Soner Onder (Michigan Technological University, USA), Engin Maden (Middle East Technical University, Turkey) and Hui Meen Nyew (Michigan Technological University, USA)
DOI: 10.4018/978-1-61350-056-9.ch013
OnDemand PDF Download:
$37.50

Abstract

The goal of computer architecture research is to design and build high performance systems that make effective use of resources such as space and power. The design process typically involves a detailed simulation of the proposed architecture followed by corrections and improvements based on the simulation results. Both simulator development and result analysis are very challenging tasks due to the inherent complexity of the underlying systems. The motivation of this work is to apply episode mining algorithms to a new domain, architecture simulation, and to prepare an environment to make predictions about the performance of programs in different architectures. We describe our tool called Episode Mining Tool (EMT), which includes three temporal sequence mining algorithms, a preprocessor, and a visual analyzer. We present empirical analysis of the episode rules that were mined from datasets obtained by running detailed micro-architectural simulations.
Chapter Preview
Top

Introduction

The goal of computer architecture research is to design and build high performance systems that make effective use of resources such as space and power. The design process typically involves a detailed simulation of the proposed architecture followed by corrections and improvements based on the simulation results. Both simulator development and result analysis are very challenging tasks due to the inherent complexity of the underlying systems. In this chapter, we present our work on applying sequence mining algorithms (Mannila et al., 1997; Laxman et al. 2007) to the analysis of computer architecture simulations (Onder, 2008). Sequence mining is an important branch of data mining and was designed for data that can be viewed as a sequence of events with associated time stamps. Using sequence mining to analyze architectural simulations carries significant advantages for three main reasons. First, a time based analysis is essential because events that repeat or certain events that are clustered temporally can affect processor performance. Second, automated and well-defined techniques give more profound insights as compared to manual analysis. In the literature, there are few studies that propose using data mining and machine learning for architecture simulation analysis. In (Hamerly, et. al., 2006), clustering is used as the basic method to find repetitive patterns in a program’s execution. In another recent work (Akoglu & Ezekwa, 2009), the use of sequence mining for improving the prefetching techniques is investigated. The existence of a considerable amount of unexplored uses of sequence mining for architecture simulation analysis is the third motivation for our study.

Our research methodology is as follows. We first take a micro-architecture definition developed using a special description language (Zhou & Onder, 2008). The definition includes a specification of the micro-architectural components of a computer system, how these components interact, and how they are controlled. We then simulate the written specification on benchmark programs and record the behavior of the system using a micro-architecture simulator (Onder, 2008). We finally feed the recorded results into the sequence based mining tool we developed. Our tool is called Episode Mining Tool (EMT) and consists of three modules. The first module is the data preprocessor which transforms the raw output data of the architecture simulation into processable data. The second module is the episode miner that takes the inputs along with the user specified options and applies sequence mining algorithms to generate the frequent episodes and rules seen in the data. The episode miner includes implementations of three algorithms, namely window based episode mining algorithm (Mannila et al., 1997), minimal occurrence based episode mining algorithm (Mannila et al., 1997), and non-overlapping occurrence based algorithm (Laxman et al, 2007). The third module of EMT is the visual analyzer, which produces graphical charts depicting the frequent episodes and rules.

In our experiments, the primary functionality of EMT is to generate a variety of patterns that show strong relationships between microprocessor events. In addition to this, relationship between event types and Instructions Per Cycle (IPC) changes can be investigated. Such an analysis provides information on how the particular software being run interacts with the processor and allows us to create concise information about the nature of the benchmark programs. As another analysis, it is possible to compare the patterns generated for two different architectures and to analyze the difference between them. Such a comparison provides helpful information to predict the behavior of new architectures without actually running simulations on them.

This chapter is organized as follows. In the Background Section, we describe the components of computer hardware that are related to this work, how processor performance is improved, and how simulation based techniques are used. In the Representation Section, we show how micro-architectural events are represented as time sequence data. In the Episode Mining Section, we present the algorithms implemented and used. In the Empirical Work Section, we explain the experiments performed and their results. In the Episode Mining Tool Section, we describe the features and usage of our tool. We conclude with a summary, lessons learned, and further potential applications of our findings.

Complete Chapter List

Search this Book:
Reset