1. Introduction
People spend most of their time at home. As society and technology advance, interest is growing in making the environments in which we live and work more intelligent. By instrumenting these environments with sensors and collecting data during daily routines, researchers can learn about people's daily behavior and its effect on the inhabitants and their surroundings (Tsai, 2014).
Incorrect use of home appliances, together with the absence of a smart energy infrastructure, leads to unnecessary energy consumption in most places. Thanks to advances in sensor technology, appliance power usage data can now be collected easily. In particular, a growing number of smart power meters, which facilitate the collection of appliance usage data, have been deployed. This enormous volume of appliance usage data may contain valuable but hidden information. Data mining algorithms are therefore needed to find appliance usage patterns in this data and make appliance usage behavior clear.
Many researchers have focused on reducing residential electricity usage because of its role in CO2 and other greenhouse gas emissions. Atanasov (2015) presents an approach to data modeling in the domain of home energy saving that extends existing solutions with context-aware concepts and relationships. However, conserving electricity is a tedious task for residential users because they lack detailed information about their electricity usage. If representative patterns of appliance electricity usage are available, inhabitants can adjust their appliance use to conserve energy effectively (Chen, 2015).
Appliance usage patterns help users understand how they use the appliances at home and detect abnormal appliance usage. They also help appliance manufacturers design intelligent control for smart appliances (Chen, 2014). Because appliance usage data is large enough to qualify as big data, extracting valuable information from it requires big data processing tools such as Spark (Karau, 2015) and Hadoop (White, 2012). Data generated in smart cities and smart homes falls into the category of big data because it exhibits the big data challenges described in (Russom, 2011), which are best characterized by the so-called 3 V's: Volume, Velocity, and Variety.
In this research, valuable sequential patterns are extracted from the real appliance usage dataset of SGSC (Motlagh, 2015) using PrefixSpan (Pei, 2001). The experiments are implemented on Spark, a novel distributed and parallel big data processing platform, and run on two different clusters. The contributions of this paper are as follows: creating usage sequences from the power usage data of each appliance; mining these sequences and extracting interesting sequential patterns with PrefixSpan on a big data platform; and findings showing that imbalance in the distribution of the dataset and the computations can affect the efficiency of PrefixSpan when it is implemented in a distributed environment such as Apache Spark.
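To make the first two contributions concrete, the following is a minimal single-machine sketch in Python of turning power readings into usage sequences and mining them with a simplified PrefixSpan. It is an illustration under stated assumptions, not the Spark implementation used in the experiments: the readings, the 10 W "on" threshold, the per-day grouping, and the function names `build_sequences` and `prefixspan` are all hypothetical, and each sequence element is restricted to a single appliance rather than an itemset.

```python
from collections import defaultdict

# Hypothetical readings (day, hour, appliance, watts); not taken from the SGSC dataset.
READINGS = [
    ("d1", 7, "kettle", 1800), ("d1", 8, "toaster", 900), ("d1", 19, "tv", 120),
    ("d2", 7, "kettle", 1750), ("d2", 19, "tv", 110), ("d2", 20, "washer", 500),
    ("d3", 7, "kettle", 1790), ("d3", 8, "toaster", 880), ("d3", 19, "tv", 2),
]
ON_THRESHOLD_WATTS = 10  # assumed cutoff above which an appliance counts as "in use"


def build_sequences(readings, threshold=ON_THRESHOLD_WATTS):
    """Group 'in use' events by day and order each day's events by hour."""
    by_day = defaultdict(list)
    for day, hour, appliance, watts in readings:
        if watts >= threshold:
            by_day[day].append((hour, appliance))
    return [[name for _, name in sorted(events)]
            for _, events in sorted(by_day.items())]


def prefixspan(sequences, min_support):
    """Return {pattern: support} for sequences whose elements are single items."""
    results = {}

    def mine(prefix, projected):
        # Count each item at most once per sequence in the projected database.
        counts = defaultdict(int)
        for seq in projected:
            for item in set(seq):
                counts[item] += 1
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Project: keep the suffix after the first occurrence of the item.
            suffixes = [seq[seq.index(item) + 1:] for seq in projected if item in seq]
            mine(pattern, suffixes)

    mine((), sequences)
    return results


if __name__ == "__main__":
    usage_sequences = build_sequences(READINGS)
    for pattern, support in prefixspan(usage_sequences, min_support=2).items():
        print(" -> ".join(pattern), support)
```

In this toy data, a pattern such as kettle followed by toaster is frequent because it occurs on two of the three days. The recursion over projected databases is the part of PrefixSpan whose workload can differ widely from one prefix to another, which is one way the imbalance noted above can arise when the work is distributed.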
The rest of the paper is organized as follows. Related work is discussed in Section 2. Section 3 presents some preliminaries. The dataset used in our experiments is described in Section 4, and the experiments and results are presented in Section 5. Finally, Section 6 discusses conclusions and future research.