1. Introduction
High-end graphics processing units (GPUs) have become very popular in the high-performance computing community. GPU cards such as the NVIDIA Tesla, Fermi, and Kepler series contain up to thousands of cores per chip; for example, the NVIDIA Tesla K20m GPU card has 2496 CUDA cores. These cards contain massively multi-threaded processors, and thousands of threads can be launched and executed simultaneously to fully utilize the GPU's computing power. However, these cards, called desktop GPUs, must be installed in a personal computer or a server with a desktop CPU. Moreover, building a high-performance computing platform from desktop GPU cards is costly in both money and power. For example, an NVIDIA Tesla or Fermi GPU card can cost thousands of US dollars, and the host personal computer or server can cost a further hundreds to thousands of US dollars. The overall power consumption of such a platform may reach a kilowatt. In addition, such a platform cannot easily meet immediacy and mobility requirements; for example, developers have to demonstrate their programs by remote control.
Jetson TK1 (also called Tegra K1, http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html), it shows that the speedup ratios and the performance per watt of the NVIDIA Jetson TK1 are better than those of the Apple A7 used in the iPhone 5s. James Wolfer (Wolfer, 2015) compared the NVIDIA Jetson TK1 with the Raspberry Pi Model in terms of execution time and speedup ratio. The results showed that a single NVIDIA Jetson TK1 outperforms the Raspberry Pi Model on both measures. P.S. Paolucci et al. (Paolucci et al., 2014) compared the ARM Cortex-A15 CPU in the NVIDIA Jetson TK1 with the Intel Xeon E5620 CPU in a SuperMicro server in terms of joules per synaptic event, total energy, power consumption, and execution time. On the first two measures, the NVIDIA Jetson TK1 is 4.5 and 4.4 times better than the SuperMicro server, respectively, and on power consumption it is 14.4 times better. However, the execution time on the NVIDIA Jetson TK1 is 3.3 times longer than on the SuperMicro server. S. Fu et al. (Fu et al., 2015) compared the performance, power efficiency, and cost efficiency of a desktop Intel i7-3770 multi-core CPU, a desktop NVIDIA GTX 690 GPU card, and the NVIDIA Jetson TK1. The results showed that the NVIDIA Jetson TK1 has better power efficiency and cost efficiency than both the desktop Intel i7-3770 CPU and the desktop NVIDIA GTX 690 GPU card; its performance is close to that of the desktop Intel i7-3770 multi-core CPU, while the desktop NVIDIA GTX 690 GPU has the best performance. These results indicate that the NVIDIA Jetson TK1 is a competitive embedded and mobile board compared with other embedded boards and desktop CPUs, and that it has cost and power-consumption advantages over desktop GPU cards. Hence, studying the NVIDIA Jetson TK1 is a promising new research direction for the high-performance computing community.
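The figures quoted from Paolucci et al. (2014) can be cross-checked with the relation energy = average power x execution time. The following is a minimal sketch, assuming the 14.4x power figure refers to average power over the run and the 3.3x figure to wall-clock time; the function name `energy_advantage` is hypothetical and introduced only for illustration.

```python
def energy_advantage(power_advantage: float, slowdown: float) -> float:
    """Return the energy ratio (reference / candidate).

    Assumes energy = average power x execution time, so a board that draws
    `power_advantage` times less power but runs `slowdown` times longer
    uses power_advantage / slowdown times less energy.
    """
    if slowdown <= 0:
        raise ValueError("slowdown must be positive")
    return power_advantage / slowdown


# Figures quoted in the text for Jetson TK1 vs. the SuperMicro server.
POWER_ADVANTAGE = 14.4  # server power / TK1 power
SLOWDOWN = 3.3          # TK1 execution time / server execution time

ratio = energy_advantage(POWER_ADVANTAGE, SLOWDOWN)
print(f"implied energy advantage: {ratio:.2f}x")  # prints "implied energy advantage: 4.36x"
```

Under these assumptions the implied energy advantage rounds to 4.4, which agrees with the total-energy figure reported above: the lower power draw outweighs the longer execution time.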