Optimising Cloud Operation Cost Utilising External Memory


Neil Hyland (BIM and Scan Ltd., Ireland) and Shawn E. O'Keeffe (BIM and Scan Ltd., Ireland)
DOI: 10.4018/IJDIBE.2020070104

Abstract

Handling large volumes of data incurs operational costs when physical hardware is considered, especially RAM, creating a need for intelligent solutions that both maintain an acceptable level of performance and enable cheaper scaling. The authors extend their previous work by converting their existing point cloud processing and analysis tool to use external memory via the STXXL C++ library, replacing the entire dataset storage layer with STXXL's intelligent caching system. A rationale for adopting this technique, and a methodology for testing the previous and modified versions of the software, are put forth, and the authors investigate the behaviour of their software tool to establish trade-offs. Competing versions of the software are fed sample datasets in E57 and IFC formats, and the results are captured and analysed. The authors find that while execution speed is lowered, reduced memory consumption contributes to higher throughput, enabling greater efficiency and real hardware cost savings.

Background

Systems designed to handle large amounts of data are prevalent in scientific computing, cloud computing, “big data” processing, high-performance computing (HPC), and distributed/cluster computing. Databases and archival storage require guarantees of integrity that function at scale (Doorn & Rivero, 2002; Ailamaki & Papadomanolakis, 2007). Medical imaging needs precision to be effective, and precision is guaranteed with greater detail, necessitating larger datasets. On the Internet, large companies such as Google, Facebook, and Amazon build and maintain huge software systems, and must therefore deal with enormous amounts of data passing through them to ensure acceptable operating times for end users of their web-based products. For surveying purposes in the Architectural, Engineering, and Construction (AEC) industries, terrestrial laser scanning (TLS) and similar methods produce point clouds containing millions of points that represent scanned 3D space, encompassing gigabytes of data. Point cloud files are multidimensional and extendable, i.e. each data point may capture spatial coordinates, surface normals, colour information, etc. The multidimensional nature of each individual point beyond a three-value Cartesian coordinate raises its memory requirement, which becomes a significant share of memory as the volume of stored points grows.
