High-Performance Customizable Computing
Domingo Benitez (University of Las Palmas de Gran Canaria, Spain)
Copyright © 2012.
30 pages.
DOI: 10.4018/978-1-61350-116-0.ch003
MLA
Benitez, Domingo. "High-Performance Customizable Computing." Handbook of Research on Computational Science and Engineering: Theory and Practice. IGI Global, 2012. 48-77. Web. 22 May. 2013. doi:10.4018/978-1-61350-116-0.ch003
APA
Benitez, D. (2012). High-Performance Customizable Computing. In J. Leng, & W. Sharrock (Eds.), Handbook of Research on Computational Science and Engineering: Theory and Practice (pp. 48-77). Hershey, PA: Engineering Science Reference. doi:10.4018/978-1-61350-116-0.ch003
Chicago
Benitez, Domingo. "High-Performance Customizable Computing." In Handbook of Research on Computational Science and Engineering: Theory and Practice, ed. J. Leng and Wes Sharrock, 48-77 (2012), accessed May 22, 2013. doi:10.4018/978-1-61350-116-0.ch003
Abstract
Many accelerator-based computers have demonstrated that they can be faster and more energy-efficient than traditional high-performance multi-core computers. Two types of programmable accelerators are available in high-performance computing: general-purpose accelerators such as GPUs, and customizable accelerators such as FPGAs, although general-purpose accelerators have received more attention. This chapter reviews the state of the art and current trends of high-performance customizable computers (HPCC) and their use in Computational Science and Engineering (CSE). A top-down approach is used to make the material more accessible to non-specialists. The “top view” is provided by a taxonomy of customizable computers. This abstract view is accompanied by a performance comparison of common CSE applications on HPCC systems and high-performance microprocessor-based computers. The “down view” examines software development, describing how CSE applications are programmed on HPCC computers. Additionally, a cost analysis and an example illustrate the origin of the benefits. Finally, the future of high-performance customizable computing is analyzed.

Introduction
Frequently, automated solutions to Computational Science and Engineering (CSE) problems require that billions to trillions of complex operations be applied to input data acquired from the real world. In many cases these solutions are time-critical and must be reported without delay; frequently, they must also be of the highest precision. Both the timely availability and the precision of information are key to resolving CSE problems, and so to making life longer and more comfortable. To reach this performance goal, high-performance computing, a research and development domain, aids the solution of CSE problems by combining high-performance computers with parallel programs.
For many years, the fastest computers integrated central processing units (CPUs) or microprocessors that were specialized in performing the greatest number of operations per second. Nowadays, however, the architectures of the fastest high-performance computers are dominated by a large population of multi-core programmable processors, many of which can also be integrated into desktop or server computers. In this time of transition, new high-performance processors provide higher levels of performance than their predecessors mainly because more processing cores are integrated on-chip. Increasing the number of processing cores on a single chip offers higher computer performance at somewhat lower power dissipation than a complex single-core microprocessor with an equivalent number of on-chip transistors. Nevertheless, the multi-core approach does not address three basic problems. Firstly, programs do not efficiently utilize the computing power available on-chip. Secondly, the connection from the processor to the external memory becomes more heavily loaded as the number of cores increases. This load, together with the difference in operating frequency between the multi-core processor and the external memory, can become a bottleneck for parallel processing and stall some or all cores. Thirdly, effective programming of multi-core systems is difficult, and in many cases software is ultimately responsible for the lack of performance scalability as the number of cores increases (Mackin & Woods, 2006).
An alternative approach has arisen: High-Performance Customizable Computing (HPCC), a different paradigm of high-performance computing. Instead of having only programmable processors, customizable computers also integrate hardware coprocessors with non-fixed architectures. These high-performance computing elements can be customized for a portion of a specific program and so accelerate the execution of key steps in the application software.
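How much a coprocessor helps when it accelerates only a portion of a program can be estimated with Amdahl's law. The text above does not state this law; it is used here as an illustrative assumption, and the function name `overall_speedup` and the sample fractions are hypothetical. The sketch shows that the overall gain depends as much on the fraction of run time that is offloaded as on the raw speedup of the customized hardware.

```python
def overall_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only part of the run time is accelerated.

    accelerated_fraction: share of the original run time spent in the offloaded kernel (0..1).
    kernel_speedup: how many times faster the kernel runs on the coprocessor.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    # New run time, expressed as a fraction of the original run time.
    remaining_time = (1.0 - accelerated_fraction) + accelerated_fraction / kernel_speedup
    return 1.0 / remaining_time


if __name__ == "__main__":
    # Illustrative values only; they are not measurements from the chapter.
    for fraction in (0.50, 0.90, 0.99):
        for speedup in (10.0, 100.0, 1000.0):
            total = overall_speedup(fraction, speedup)
            print(f"fraction={fraction:.2f} kernel={speedup:6.0f}x -> overall={total:6.2f}x")
```

Under this model, a kernel that accounts for half of the run time can never yield more than a 2x overall gain, however fast the coprocessor is, which is why the key steps chosen for customization matter.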
Customizable hardware devices can speed up several software applications because their hardware flexibility allows the same chip to be specialized and reused. This is the main property exploited in High-Performance Customizable Computing, and it is very useful for exploiting the inherent parallelism of many CSE problems. Customizable devices have shown great potential for use in high-performance computing, with much better power efficiency than programmable processors. New customizable devices are providing ever higher performance because their clock frequency and the number of transistors dedicated to specialized processing are both increasing. Additionally, customizable devices have other advantages that are exploited in embedded hardware engineering, such as reducing both the non-recurring engineering costs (Dehon, 2008) and the development time of a product (Guccione, 2008). Two types of computing systems that integrate customizable devices are common nowadays: configurable and reconfigurable systems. Configurable Systems are built from baseline chip designs that are partially specialized at design time, before fabrication (Leibson, 2006). After chip fabrication, these systems can be software-programmed but cannot be specialized any further. Reconfigurable Systems, on the other hand, are based on field-programmable devices that can be completely customized after fabrication (Chang, 2008). The main goal of this chapter is to help readers understand how customizable hardware systems can be exploited to provide high performance, i.e., how to get 10X, 100X or 1000X the performance of the equivalent number of transistors in a microprocessor-based computer with much better power efficiency. The reader will gain insight into the design, management and use of high-performance infrastructures that integrate microprocessor-based and customizable computers.
Key Terms in this Chapter

Field Programmable Gate Array (FPGA): A reconfigurable electronic device with a fine-grain architecture that implements customized computational logic specific to the application being executed, and that can be reconfigured for a wide range of tasks (Maxfield, 2004). Its reconfigurable architecture is composed of processing elements called Look-Up Tables (LUTs) that can implement any logic function of a few inputs, an interconnection network that can connect any logic cell with the rest of the circuit, memory blocks that store data to be loaded by any other element of the architecture, and special on-chip modules that efficiently perform frequently used tasks such as multiplication, digital signal processing, and external input-output interfacing.

Reconfigurable Devices: Versatile configurable electronic components that are used to build distinct hardware implementations on the same set of reconfigurable resources after chip fabrication (Compton & Hauck, 2002). Reconfigurability is achieved through an integrated configuration memory that stores information about the state and functionality of each part of the device. The device is configured by loading a configuration bitstream, which consists of a series of commands and frame data. At any time after a reconfigurable device has been powered up, it is possible to suspend its operation, load a completely new hardware configuration, and restart operation using the newly loaded configuration (Leibson, 2006).

Processor, or Central Processing Unit (CPU): The central circuit of a computer, which processes a sequence of jobs that arrive over time and actually executes the application program. Since the beginning of CPUs, their performance has been driven by higher clock rates and improved internal organization of the circuit. In recent years, new CPUs with higher clock rates have not proved commercially reliable, and the technology trend has been to integrate more than one core on a chip. Additionally, low-power CPUs are playing a central role in high-performance computers because the building of ever-larger clusters of commercial off-the-shelf hardware is being constrained by power and cooling (Donofrio et al, 2009; Henkel & Parameswaran, 2007).

High-Performance Computers (HPC): Machines that provide hardware and software infrastructures whose main goal is to accelerate the execution of customer applications or improve their fault tolerance. Usually, these machines are composed of multiple processors, large memory capacity, large disk storage, and high-bandwidth communications among all their main components (Blake et al, 2009).

Coprocessors: Specialized circuits that can be integrated into a computer and connected to a CPU to provide added performance for applications by implementing specific computational tasks (Gulati & Khatri, 2010).

Customizable Electronic Devices: Devices that can be customized to execute a task efficiently and that are frequently used as a CPU and/or coprocessor. They typically execute 10 to 1000 times faster than today’s fastest CPU while reducing power consumption by about 95%. Three types of customizable devices can be distinguished: ASICs, Configurable Processors, and Reconfigurable Devices.

Application Specific Integrated Circuit (ASIC): An integrated electronic circuit that is customizable during the design phase to execute a specific task efficiently. After fabrication, it cannot be modified to execute other tasks. This customizable device can achieve the best performance and the lowest energy consumption. However, its design and fabrication costs are very high and can only be justified if the number of chips sold is very large (Rigo et al, 2010).

Configurable Processors: Special ASICs that are based on a conventional CPU and tailored at chip design time for a specific software application. This type of processor delivers much better computing efficiency and much lower power consumption than the original CPU. After fabrication, they cannot be configured again.

Instruction Set Architecture (ISA): The set of hardware elements of the processor that can be managed by the software program. The program controls the hardware through standardized machine instructions. A program is a composition of machine instructions that are loaded sequentially by the CPU over time (Patterson & Hennessy, 2009).

Parallelization: The software technique of partitioning an application program into independent parts so that the independent resources of a computer can be kept efficiently busy, each with its own part of the program. This partitioning can be done at the instruction level, data level, thread level, procedure level or program level. Many CSE applications can be parallelized. If the resulting parallel programs are executed on HPC platforms, costs and manpower requirements are reduced (Akhter & Roberts, 2006).
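The FPGA entry above says that a Look-Up Table can implement any logic function of a few inputs. A toy software model makes the idea concrete: the "configuration" is simply a stored truth table, and evaluating the function is a table lookup. This is a minimal sketch for illustration only. The class name `LookUpTable` and its methods are hypothetical, and a real FPGA is configured through a vendor-specific bitstream rather than a Python callable.

```python
from itertools import product


class LookUpTable:
    """Toy model of an FPGA LUT: a truth table indexed by the input bits."""

    def __init__(self, num_inputs, function):
        self.num_inputs = num_inputs
        self.configure(function)

    def configure(self, function):
        # "Configuring" the LUT stores one output bit per input combination.
        # product() enumerates inputs with the first input as the most significant bit.
        self.table = [
            int(bool(function(*bits)))
            for bits in product((0, 1), repeat=self.num_inputs)
        ]

    def evaluate(self, *bits):
        if len(bits) != self.num_inputs:
            raise ValueError(f"expected {self.num_inputs} input bits")
        # Pack the input bits into an index, first input most significant.
        index = 0
        for bit in bits:
            index = (index << 1) | (bit & 1)
        return self.table[index]


if __name__ == "__main__":
    lut = LookUpTable(3, lambda a, b, c: a ^ b ^ c)  # configured as 3-input XOR
    print(lut.table, lut.evaluate(1, 0, 1))

    # The same resource is reused for a different function after "reconfiguration".
    lut.configure(lambda a, b, c: (a + b + c) >= 2)  # 3-input majority vote
    print(lut.table, lut.evaluate(1, 0, 1))
```

The same object serves first as an XOR gate and then as a majority gate, which mirrors, at a very small scale, how reconfigurable resources are reused for different hardware implementations after fabrication.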