Analysis of the Join Performance in Vertically Distributed Cloud Databases

Jens Kohler (Faculty of Informatics, University of Applied Sciences Mannheim, Mannheim, Germany), Kiril Simov (Linguistic Modelling Department, Bulgarian Academy of Sciences, Sofia, Bulgaria) and Thomas Specht (Faculty of Informatics, University of Applied Sciences Mannheim, Mannheim, Germany)
DOI: 10.4018/IJARAS.2015070104


Cloud Computing is becoming attractive to enterprises across all industries. Renting computing capacity from external providers avoids upfront investment, because only the resources that are actually used have to be paid for. This pay-as-you-go accounting model is particularly important in the context of “Big Data”: the dynamically scalable resources of the Cloud enable enterprises to store and analyze huge amounts of unstructured data without operating their own hardware infrastructure. However, Cloud Computing currently faces severe data security and data protection issues. These challenges call for new ways to store and analyze data, especially when large volumes of sensitive data are kept at external locations. The presented approach splits data at the database table level into independent chunks and distributes them across several clouds. This work thus contributes to a more secure and resilient cloud architecture, since multiple public and private cloud providers can be used independently to store data without violating data security and privacy constraints.

The SeDiCo approach follows the principle of vertically distributed database tables. The main idea is to divide critical database data and distribute it across different (public and private) cloud providers, so that every provider receives only a small part of the data. Each part is worthless without the others, which allows enterprises to meet their compliance rules on data security and protection and makes Cloud Computing an attractive option for storing vast amounts of data.

Concretely, a database table is divided into two partitions, and these partitions are stored at different cloud providers. Figure 1 illustrates the entire approach with Customer_Partition1 “p1” (in a public cloud) and Customer_Partition2 “p2” (in a private cloud). If the data of one partition is stolen (e.g. “creditCardNo” of p2), it cannot be misused, because the corresponding data (e.g. “name” and “dateOfBirth” of p1) is stored elsewhere, and only the owner of the data knows where.
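The vertical split and the join that restores the full rows can be sketched as follows. This is a minimal in-memory illustration, not the SeDiCo implementation: the shared key column `id`, the helper names `split_vertically` and `join_partitions`, and the sample rows are assumptions made for this example; only the column names name, dateOfBirth and creditCardNo come from the text above.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def split_vertically(rows: List[Row], key: str,
                     p1_columns: List[str],
                     p2_columns: List[str]) -> Tuple[List[Row], List[Row]]:
    """Split each row into two partial rows that both carry the key column."""
    p1 = [{key: r[key], **{c: r[c] for c in p1_columns}} for r in rows]
    p2 = [{key: r[key], **{c: r[c] for c in p2_columns}} for r in rows]
    return p1, p2


def join_partitions(p1: List[Row], p2: List[Row], key: str) -> List[Row]:
    """Rebuild full rows by matching the partial rows on the shared key."""
    p2_by_key = {r[key]: r for r in p2}
    return [{**r, **p2_by_key[r[key]]} for r in p1 if r[key] in p2_by_key]


# Made-up sample data; "id" is an assumed key column shared by both partitions.
customers = [
    {"id": 1, "name": "Alice Example", "dateOfBirth": "1980-01-01",
     "creditCardNo": "0000-0000-0000-0001"},
    {"id": 2, "name": "Bob Example", "dateOfBirth": "1975-06-15",
     "creditCardNo": "0000-0000-0000-0002"},
]

# p1 would go to the public cloud, p2 to the private cloud.
p1, p2 = split_vertically(customers, "id", ["name", "dateOfBirth"], ["creditCardNo"])

# Only a client that can reach both partitions can restore the original rows.
restored = join_partitions(p1, p2, "id")
```

Taken alone, a row of `p1` reveals a name and a date of birth but no credit card number, and a row of `p2` reveals a credit card number without its owner. Every read of a complete row requires the join across both partitions.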

Figure 1. SeDiCo Architecture

Furthermore, because data are distributed across several cloud providers, the providers' differing application programming interfaces (APIs) become a major challenge. The different APIs have to be encapsulated to provide a uniform way of accessing them. This encapsulation was realized with the jclouds framework because of its Java support. Further alternatives are mentioned later in the related work section.
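The idea of encapsulating different provider APIs behind one interface can be illustrated with a small adapter sketch. It deliberately does not use or imitate the jclouds API: the classes `PartitionStore`, `PublicProviderClient`, `PrivateProviderClient` and their methods are hypothetical stand-ins invented for this example, and both "providers" simply keep data in memory.

```python
from abc import ABC, abstractmethod
from typing import Dict


class PartitionStore(ABC):
    """Uniform interface the application codes against (hypothetical)."""

    @abstractmethod
    def save(self, name: str, payload: bytes) -> None: ...

    @abstractmethod
    def load(self, name: str) -> bytes: ...


class PublicProviderClient:
    """Stand-in for one provider's native client (invented for this sketch)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def upload_object(self, object_name: str, body: bytes) -> None:
        self._objects[object_name] = body

    def download_object(self, object_name: str) -> bytes:
        return self._objects[object_name]


class PrivateProviderClient:
    """Stand-in for a second provider with differently named calls."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def write_file(self, path: str, content: bytes) -> None:
        self._files[path] = content

    def read_file(self, path: str) -> bytes:
        return self._files[path]


class PublicStore(PartitionStore):
    """Adapter mapping the uniform interface onto the first client."""

    def __init__(self, client: PublicProviderClient) -> None:
        self._client = client

    def save(self, name: str, payload: bytes) -> None:
        self._client.upload_object(name, payload)

    def load(self, name: str) -> bytes:
        return self._client.download_object(name)


class PrivateStore(PartitionStore):
    """Adapter mapping the uniform interface onto the second client."""

    def __init__(self, client: PrivateProviderClient) -> None:
        self._client = client

    def save(self, name: str, payload: bytes) -> None:
        self._client.write_file(name, payload)

    def load(self, name: str) -> bytes:
        return self._client.read_file(name)


# The application only sees PartitionStore, regardless of the provider behind it.
stores: Dict[str, PartitionStore] = {
    "p1": PublicStore(PublicProviderClient()),
    "p2": PrivateStore(PrivateProviderClient()),
}
stores["p1"].save("Customer_Partition1", b"partition 1 data")
stores["p2"].save("Customer_Partition2", b"partition 2 data")
```

With this structure, supporting another provider means writing one more adapter, while the code that distributes and joins partitions stays unchanged. A framework such as jclouds plays this encapsulating role in the approach described above.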
