Fog-Cloud Collaboration for Real-Time Streaming Applications: FCC for RTSAs

Biji Nair (National Institute of Technology Tiruchirappalli, India) and S. Mary Saira Bhanu (National Institute of Technology Tiruchirappalli, India)
DOI: 10.4018/978-1-5225-7335-7.ch007

Abstract

Real-time streaming applications (RTSAs) generate huge volumes of temporally ordered, unbounded, continuous, high-speed data streams that demand both real-time and long-term data analytics. Fog computing is a reliable solution for processing and analyzing real-time streaming data because it offers low-latency, location-aware, geographically distributed services at the fog node and relies on the cloud data center (DC) for long-term services. This chapter addresses the challenge of coordinating the fog nodes and the cloud for efficient processing of real-time streaming data both in motion and at rest. The fog-cloud collaboration framework proposed in this chapter employs a data stream management system (DSMS) schema at the fog node for real-time stream data processing and response generation. Representing the data as micro-clusters at the fog node and macro-clusters at the DC facilitates accurate data analytics. Coordination between the fog node and the DC is achieved through a local ontology at the fog node and a global ontology at the DC.

Introduction

Fog computing, as suggested by Luan et al. (2015) and Portelli & Anagnostopoulos (2017), is a viable technology for meeting the processing needs of real-time streaming data from RTSAs. RTSAs include a wide range of applications that generate huge volumes of high-speed, continuous data streams: telecommunication call records, credit card transaction flows, network monitoring and traffic engineering, stock exchange and financial market feeds, power supply and manufacturing data from engineering and industrial processes, video streams, RFID readings from sensing, monitoring, and surveillance, web logs, and web page click streams. The main challenges in processing RTSA data are evaluating the data stream continuously as it arrives, responding quickly to continuous queries, and providing a compact summary of the high-rate data stream. The analysis of voluminous archived historical data is as important as online data analysis.

The key factor in adopting fog computing for the processing and analytics of RTSA data is effective fog-cloud collaboration. Fog computing supports real-time data processing by providing low-cost computing resources in fog nodes close to the end devices. The cloud offers large-scale, long-term storage for the high-speed data generated, heavy computation, long-term data analytics, and wide connectivity. The proposed work deals with distributing the tasks of RTSA data stream processing and analytics among the fog nodes and the cloud. A fog node receiving live streaming data has to evaluate the stream continuously in real time for actionable decision making. Hence the granularity of the streaming data processed per unit of time (usually microseconds) depends on the real-time requirements and the memory capacity of the fog node processing it. The cloud data center (DC) stores and evaluates summaries of the data streams received from the fog nodes at a coarser granularity, i.e., at longer time scales ranging from hours and days to months and years.

This work presents a framework for the summarization and analysis of the voluminous data generated by RTSAs. The proposed framework is generic: it considers varied types of RTSAs that generate different types of streaming data. It involves a fog-cloud alliance for short-term and long-term data processing, analytics, storage, and decision making for RTSAs. Analytics is responsive locally at the fog node in real time, and globally at the DC, which runs rich data analytics over a wider scope and longer durations. Data analysis and storage are performed at different time granularities based on the urgency of decision making. This requires different granularity levels of knowledge representation at the fog level and the DC level. Hence streaming data is represented as micro clusters at the fog level and as macro clusters at the DC. Data analytics at different levels of granularity requires robust synchronization and coordination between the distributed fog layer and the centralized control at the DC.

The research work thus presents a generic fog computing framework modeled for the analysis and storage of streaming data generated by RTSAs. The framework meets real-time and long-term analytics requirements for knowledge extraction from voluminous, high-rate streaming data through fog-cloud collaboration. It addresses the issues of data stream representation, knowledge extraction at different levels of granularity, real-time stream query processing, and storage.
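
The micro-cluster representation at the fog level can be illustrated with a minimal sketch. It assumes an additive cluster-feature summary (point count, per-dimension linear sum, and per-dimension squared sum), a structure commonly used in stream clustering. The text above does not specify the exact summary used by the framework, so the class `MicroCluster`, the function `update`, and the `max_dist` threshold are hypothetical names chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MicroCluster:
    """Additive summary of the points absorbed so far (hypothetical structure)."""
    dim: int
    n: int = 0
    linear_sum: list = field(default_factory=list)
    square_sum: list = field(default_factory=list)
    last_seen: float = 0.0

    def __post_init__(self):
        # Start with zeroed sums when none are supplied.
        if not self.linear_sum:
            self.linear_sum = [0.0] * self.dim
            self.square_sum = [0.0] * self.dim

    def absorb(self, point, timestamp):
        """Fold one stream record into the summary in O(dim) time."""
        self.n += 1
        for i, x in enumerate(point):
            self.linear_sum[i] += x
            self.square_sum[i] += x * x
        self.last_seen = timestamp

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root-mean-square deviation of the absorbed points from the centroid."""
        var = sum(
            sq / self.n - (ls / self.n) ** 2
            for ls, sq in zip(self.linear_sum, self.square_sum)
        )
        return math.sqrt(max(var, 0.0))


def update(clusters, point, timestamp, max_dist):
    """Assign a point to the nearest micro cluster, or open a new one.

    max_dist is an assumed, application-specific distance threshold.
    """
    best, best_d = None, float("inf")
    for mc in clusters:
        d = math.dist(mc.centroid(), point)
        if d < best_d:
            best, best_d = mc, d
    if best is None or best_d > max_dist:
        best = MicroCluster(dim=len(point))
        clusters.append(best)
    best.absorb(point, timestamp)
    return best
```

Because each record only updates a few running sums, the fog node never needs to retain the raw stream, which fits the memory constraint mentioned above.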

The main contribution of this chapter is the fog-cloud collaboration framework for RTSA data stream processing. It includes a Data Stream Management System (DSMS) schema at the fog node, which summarizes the useful information in the streaming data as micro clusters through real-time clustering. The micro clusters are thereafter used for actionable analytics. The DSMS schema uses a local ontology for RTSA-specific data stream clustering, query processing, and analytics. The chapter also presents fog-cloud collaboration for RTSA stream processing through macro clustering at the DC of the micro clusters forwarded to it by the fog nodes, for long-term analytics. Coordination between the DC and the fog is established by updating the local ontology with the changes observed over time by the global ontology for each RTSA.
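
The macro-clustering step at the DC can be sketched in the same spirit. This is a minimal illustration, not the chapter's algorithm: it assumes each fog node forwards summaries as `(n, linear_sum)` pairs and that the DC greedily merges summaries whose centroids lie within an assumed threshold `merge_dist`. The names `merge`, `centroid`, and `macro_cluster` are hypothetical.

```python
import math


def merge(a, b):
    """Combine two (n, linear_sum) summaries; both fields are additive."""
    return (a[0] + b[0], [x + y for x, y in zip(a[1], b[1])])


def centroid(summary):
    n, linear_sum = summary
    return [v / n for v in linear_sum]


def macro_cluster(micro_summaries, merge_dist):
    """Greedily merge micro-cluster summaries into macro clusters.

    merge_dist is an assumed threshold on the distance between centroids.
    """
    macros = []
    for s in micro_summaries:
        c = centroid(s)
        nearest, nearest_d = None, float("inf")
        for i, m in enumerate(macros):
            d = math.dist(centroid(m), c)
            if d < nearest_d:
                nearest, nearest_d = i, d
        if nearest is not None and nearest_d <= merge_dist:
            macros[nearest] = merge(macros[nearest], s)
        else:
            macros.append(s)
    return macros
```

For example, summaries arriving from two fog nodes could be concatenated and passed to `macro_cluster` in one call. Since the summaries are additive, the DC works only with the compact counts and sums and does not need the raw records.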

Key Terms in this Chapter

Ontology: A formal representation of the concepts and entities in a given domain, and the relations among them, shared by several applications.

Auto-Encoders: A data compression technique that encodes input data in fewer dimensions, with as little loss of information as possible, for ease of processing. The decoder in an auto-encoder reconstructs the input data from the encoded form.

Summarization: Summarization is the step in data mining that provides concise representation of a data stream or a dataset in compressed form.

Granularity: The unit and scale of data processing based on the urgency and the level at which decision making is required by an application.

Analytics: Analytics involves analysis of the data stream through systematic processing to infer and interpret the information contained in it for data pattern recognition and decision making.

Collaboration: Collaboration refers to the association of the fog nodes and cloud DC to work together to process streaming data of RTSA.

Clustering: Clustering is an unsupervised learning method used for the abstraction of unlabeled data into classes or groups based on their similarity.
