A Scalable Big Stream Cloud Architecture for the Internet of Things

Laura Belli (University of Parma, Parma, Italy), Simone Cirani (University of Parma, Parma, Italy), Luca Davoli (University of Parma, Parma, Italy), Gianluigi Ferrari (University of Parma, Parma, Italy), Lorenzo Melegari (University of Parma, Parma, Italy), Màrius Montón (WorldSensing, Barcelona, Spain) and Marco Picone (University of Parma, Parma, Italy)
DOI: 10.4018/IJSSOE.2015100102

Abstract

The Internet of Things (IoT) will consist of billions of interconnected heterogeneous devices (50 billion expected by 2020) denoted as “Smart Objects:” tiny, constrained devices which are going to be pervasively deployed in several contexts. To meet low-latency requirements, IoT applications must rely on specific architectures designed to handle the gigantic stream of data coming from Smart Objects. This paper proposes a novel Cloud architecture for Big Stream applications that can efficiently handle data coming from Smart Objects through a Graph-based processing platform and deliver processed data to consumer applications with low latency. The authors reverse the traditional “Big Data” paradigm, in which real-time constraints are not considered, and introduce the new “Big Stream” paradigm, which better fits IoT scenarios. The paper provides a performance evaluation of a practical open-source implementation of the proposed architecture, together with other practical aspects, such as security considerations and possible business-oriented exploitation plans.

Introduction

The actors involved in IoT scenarios have extremely heterogeneous characteristics (in terms of processing and communication capabilities, energy supply and consumption, availability, and mobility), spanning from constrained devices, also denoted as “Smart Objects (SOs),” to smartphones and other personal devices, Internet hosts, and the Cloud. Smart Objects are typically equipped with sensors and/or actuators and are thus capable of perceiving and acting on the environment where they are deployed. By 2020, 50 billion Smart Objects are expected to be deployed in urban, home, industrial, and rural scenarios (Evans, 2011), in order to collect relevant information, which may be used to build new useful applications.

Shared and interoperable communication mechanisms and protocols are currently being defined and standardized, allowing heterogeneous nodes to efficiently communicate with each other and with existing common Internet-based hosts or general-purpose Internet-ready devices. The most prominent driver for interoperability in the IoT is the adoption of the Internet Protocol (IP), namely IPv6 (Postel, 1981; Deering & Hinden, 1998). An IP-based IoT will be able to extend and interoperate seamlessly with the existing Internet.

In a typical IoT scenario, sensed data are collected by SOs, deployed in and populating the IoT network, and sent uplink to collection entities (servers or the Cloud). In some cases, an intermediate element may support the Cloud, carrying out storage, communication, or computation operations in local networks (e.g., data aggregation or protocol translation). This approach is the basis of Fog Computing (Bonomi, Milito, Zhu, & Addepalli, 2012) and will be explained in more detail in the “Background” section.
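As a minimal sketch of the local aggregation role just described (the class and method names are illustrative, not part of the authors' implementation), the following Python snippet shows a fog node buffering raw sensor readings and forwarding only a compact aggregate uplink, thereby reducing the traffic that reaches the Cloud:

```python
from statistics import mean

class FogNode:
    """Toy fog node: buffers raw readings per sensor and forwards
    one aggregated message uplink instead of every raw sample."""

    def __init__(self, window_size=5):
        self.window_size = window_size
        self.buffers = {}   # sensor_id -> list of buffered raw readings
        self.uplink = []    # messages actually forwarded to the Cloud

    def on_reading(self, sensor_id, value):
        buf = self.buffers.setdefault(sensor_id, [])
        buf.append(value)
        if len(buf) >= self.window_size:
            # Local aggregation: a single uplink message per window.
            self.uplink.append((sensor_id, mean(buf)))
            buf.clear()

node = FogNode(window_size=5)
for v in [20.1, 20.3, 20.2, 20.4, 20.0]:
    node.on_reading("temp-01", v)

print(node.uplink)  # one aggregated message instead of five raw ones
```

The same intermediate element could equally perform protocol translation; aggregation is shown here only because it is the simplest case of computation moved to the network edge.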

Figure 1 shows the hierarchical structure of layers involved in data collection, processing, and distribution in IoT scenarios.

Figure 1.

The hierarchy of layers involved in IoT scenarios: the Fog works as an extension of the Cloud to the network edge to support data collection, processing, and distribution

With billions of nodes capable of gathering data and generating information, the availability of efficient and scalable mechanisms for collecting, processing, and storing data is crucial.

Big Data techniques, which were developed in the last few years and became popular due to the evolution of online and social/crowd services, address the need to process extremely large amounts of heterogeneous data for multiple purposes. These techniques have been designed mainly to deal with huge volumes of information (focusing on storage, aggregation, analysis, and provisioning of data), rather than to provide real-time processing and dispatching (Zaslavsky, Perera, & Georgakopoulos, 2013; Leavitt, 2013). Cloud Computing has found a direct application with Big Data analysis due to its scalability, robustness, and cost-effectiveness.

One of the distinctive features of IoT systems is the deployment of a huge number of heterogeneous data sources collecting data from the environment and sending information over the Internet to collectors. Taken together, these data sources generate streams at a very high frequency. Moreover, several relevant IoT scenarios (such as industrial automation, transportation, and networks of sensors and actuators) require real-time processing or predictable latency.

The number of data sources, on one side, and the resulting frequency of incoming data, on the other, create a new need for Cloud architectures able to handle such massive information flows.

Big Data approaches typically have an intrinsic inertia because they are based on batch processing. For this reason, they are not suited to the dynamic nature of IoT scenarios with real-time requirements.
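The "inertia" of batch processing can be made concrete with a small, purely illustrative contrast (the function names are hypothetical, not from the paper's platform): a batch processor cannot emit any result for an event until the whole batch containing it has been collected, whereas a stream processor dispatches each event as soon as it arrives.

```python
def batch_process(events, batch_size):
    """Batch style: every event becomes available only at the arrival
    index of the last event in its batch (intrinsic inertia)."""
    out, batch = [], []
    for i, e in enumerate(events):
        batch.append(e)
        if len(batch) == batch_size:
            # All buffered events are 'ready' only now, at index i.
            out.extend((ev, i) for ev in batch)
            batch.clear()
    return out

def stream_process(events):
    """Stream style: each event is dispatched at its own arrival index."""
    return [(e, i) for i, e in enumerate(events)]

events = ["e0", "e1", "e2", "e3"]
print(batch_process(events, batch_size=4))
# → [('e0', 3), ('e1', 3), ('e2', 3), ('e3', 3)]
print(stream_process(events))
# → [('e0', 0), ('e1', 1), ('e2', 2), ('e3', 3)]
```

In the batch case, the first event waits for three later arrivals before it can be processed; under the Big Stream paradigm advocated here, per-event dispatching keeps latency bounded by processing time rather than batch size.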
