ROCK-CNN: Distributed Deep Learning Computations in a Resource-Constrained Cluster

Rezeda Khaydarova, Dmitriy Mouromtsev, Vladislav Fishchenko, Vladislav Shmatkov, Maxim Lapaev, Ivan Shilin
DOI: 10.4018/IJERTCS.2021070102

The paper is dedicated to distributed convolutional neural networks (CNNs) on a cluster of resource-constrained devices. The authors focus on requirements that meet the users' needs and, based on these, propose a system architecture. Two use cases of CNN computations on a ROCK-CNN cluster are presented, and algorithms for organizing distributed convolutional neural networks are described. Experiments are also conducted to validate the proposed architecture and algorithms for distributed deep learning computations.

1. Introduction

Nowadays, Internet of Things (IoT) technologies have become an indispensable part of our daily life (Khaydarova et al., 2020). Our dependence upon the Internet and connected devices is increasing at a fast pace. The key communication technologies that enable IoT are wireless sensor networks (WSNs), machine-to-machine (M2M) communication, human-machine interaction, web services, information systems, etc. (Prasanna & Rao, 2012). The range of domains in which IoT technology is implemented has grown dramatically over the last decade. One of the most popular IoT applications is smart houses and home automation. Interconnected devices that can be controlled remotely and smart metering applications that save energy, water, and other resources are current state-of-the-art topics.

Modern IoT device management systems, which are constantly emerging, support increasingly sophisticated deep learning technologies that use neural networks to capture and analyze the environment. Amazon Echo, which is intended to comprehend and carry out human voice commands, is one example (Tang et al., 2017). Deep learning applications for IoT devices often require near-real-time functionality. For example, security camera-based recognition tasks require low latency to respond to target events, such as strangers in the house or unattended objects left in a subway or airport (P. Zhang et al., 2017), (Cai et al., 2017).

Convolutional neural networks (CNNs) have been intensively researched and used in large-scale data processing due to their competitive classification accuracy (X. Zhang et al., 2018), (Song et al., 2017). However, executing a CNN locally on mobile and embedded devices requires large computational resources and a great deal of memory, which IoT platforms usually cannot provide. Moreover, as the number of devices connected to the Internet grows drastically, network latency increases.

CNNs consume a lot of computational resources and require powerful computers (supercomputers) with the latest graphics processing units (GPUs). However, supercomputers consume a lot of energy, are expensive, and take up a lot of space. Another option is to use common clusters (a number of interconnected desktop computers), which are still expensive. An alternative to common clusters is clusters of resource-constrained devices (RCDs), such as single board computers (SBCs). Interest in SBC clusters has grown dramatically since the first Raspberry Pi was released in 2012 (Basford et al., 2020). The SBC cluster domain is still being researched and developed, and new, more powerful SBCs emerge every year. One such device is the RockPro64 (RockPro64, 2020), released in June 2018. The distributed computing domain requires fair scalability as well as cluster price optimization.

In this paper, we present an SBC cluster consisting of RockPro64 single board computers (RockPro64, 2020). We design the ROCK-CNN architecture for a locally distributed convolutional neural network on a RockPro64 cluster, adapted to resource-constrained devices. Deep learning use cases, such as processing images and time-series data, are demonstrated.

We organize this paper as follows. Section 2 presents related work on SBC clusters. Section 3 describes the problem statement and the development pipeline. Section 4 is dedicated to the system architecture requirements and the description of the ROCK-CNN architecture. Section 5 introduces two use cases (face recognition and time-series data) for distributed deep learning in a resource-constrained cluster. Section 6 verifies the proposed algorithms for distributed face recognition and the CNN model designed for time-series data by testing a number of efficiency metrics. Finally, Section 7 presents the discussion and results, and Section 8 concludes the paper.
