Significance of In-Memory Computing for Real-Time Big Data Analytics

Ganesh Chandra Deka (Ministry of Labour and Employment, India)
DOI: 10.4018/978-1-4666-5864-6.ch014

Abstract

Cloud computing gives users online access to their data at any time, from anywhere, through any application and on any device. Because of their slower read/write operations, conventional disk-resident databases cannot meet the real-time Online Transaction Processing (OLTP) requirements of cloud-based applications, particularly e-Commerce applications. Because In-Memory databases store the database in RAM, they drastically reduce read/write times, leading to high throughput for cloud-based OLTP systems. This chapter discusses In-Memory real-time analytics.

Introduction

In-Memory computing has been around since the 1990s. Currently, more than 50 software vendors deliver solutions based on In-Memory technology. Systems without persistent storage, such as network routers and low-end set-top boxes, were the early users of In-Memory Database Systems (IMDS).

Because the software in these systems runs with minimal RAM and a simple processor, an IMDS, also called a Main Memory Database system (MMDB), accelerates information storage, processing and retrieval by keeping data in RAM/DRAM. As there is no reading from or writing to secondary storage, transactions can be processed very quickly and the associated processing overhead is eliminated. In-memory systems can remove buffer management and logging, at the expense of durability. IMDS are intended for distributed and scalable computing environments (VietHiP, n.d.).

Eliminating latency is the key design goal for IMDS. Virtualization, cheaper semiconductor memory and cloud computing have together revolutionized the development of advanced database systems. IMDS has great potential for systems where transaction speed is of utmost importance.

Key areas where IMDS delivers business value are:

  1. SaaS
  2. AaaS (Analytics as a Service)
    • Financial Analysis
    • Performance management
    • ERP applications
    • Business Intelligence (BI)
    • CRM
    • Mobile BI
    • Industrial and business functions like Operational Reporting, strong set of Analytical tools
  3. Optimized integrated modules
  4. Social Networking websites
  5. Online gaming
  6. Real time applications

The growing popularity of Big Data will compel many companies to use IMDS to deal with very large structured, semi-structured, unstructured and hybrid data sets. This chapter discusses the salient features of seven popular In-Memory database systems. In-database processing is also discussed in brief.

Real Time Analytics And In-Memory Computing

Analytics is the discovery of patterns in data that provide meaning to a business or other entity. Real-time analytics refers to analysing data as it comes into the system. It requires continuously refreshed results for measures such as page views, website navigation, shopping cart use or any other kind of online activity. Such data can be extremely important to businesses for dynamic analysis and reporting, allowing them to respond quickly to trends in users' online activity when planning business strategy (Janssen, n.d.).
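
To illustrate what keeping results refreshed as events arrive can look like, the following is a minimal Python sketch of a sliding-window page-view counter held entirely in memory. The class name `SlidingWindowCounter`, the 60-second window and the sample URLs are assumptions made for illustration; they do not come from the chapter or from any particular product.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Counts events per key over the most recent `window_seconds` seconds."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events = deque()    # (timestamp, key) pairs in arrival order
        self._counts = Counter()  # live count per key inside the window

    def record(self, key: str, timestamp: float) -> None:
        """Register one event; timestamps are assumed to be non-decreasing."""
        self._events.append((timestamp, key))
        self._counts[key] += 1
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        """Drop events that have fallen out of the window."""
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_key = self._events.popleft()
            self._counts[old_key] -= 1
            if self._counts[old_key] == 0:
                del self._counts[old_key]

    def top(self, n: int, now: float) -> list:
        """Return the n most viewed keys within the window ending at `now`."""
        self._evict(now)
        return self._counts.most_common(n)


counter = SlidingWindowCounter(window_seconds=60)
counter.record("/home", timestamp=0)
counter.record("/cart", timestamp=10)
counter.record("/home", timestamp=30)
counter.record("/cart", timestamp=65)  # the "/home" view at t=0 expires here
print(counter.top(2, now=65))          # [('/cart', 2), ('/home', 1)]
```

Because both the event queue and the counts live in RAM, each update and each query touches no secondary storage, which is the property the chapter attributes to in-memory systems.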

The exponential growth of cloud computing has resulted in an explosion of data sources. Internet-based applications can be easily deployed in a cloud environment simply by starting or stopping members of a cluster of web servers and application servers. Most cloud-based solutions are real-time, so In-Memory databases have strong prospects in cloud computing applications, and many database vendors are now offering In-Memory database solutions. By using in-memory database technology, real-time applications for verticals such as financial services, digital advertising, telecom and the mobile Web can gain a number of benefits. The potential users of IMDS are in the real-time enterprise sector: business analytics, capital markets (algorithmic trading, order matching engines, etc.) and real-time caches for e-Commerce and Web-based systems. An increase of 100 μs in waiting time can dramatically reduce the probability that customers will continue to interact or return. For an e-Commerce application, this directly affects profits.

Key Terms in this Chapter

Nonvolatile Random Access Memory (NVRAM): A type of memory that retains its contents when power is turned off. One type of NVRAM is SRAM made non-volatile by connecting it to a constant power source such as a battery. Another type uses EEPROM chips to save its contents when power is turned off; in this case, the NVRAM is composed of a combination of SRAM and EEPROM chips. The Sun Ultra 45 and Ultra 25 workstation motherboards use an NVRAM module that stores parameters used for configuring system startup. Many leading semiconductor companies (Intel, Samsung, IBM, Freescale, TI, Ramtron and many more) are developing alternative technologies to flash that address its drawbacks. Ramtron International Corporation (recently taken over by Cypress Semiconductor) pioneered the commercialization of F-RAM (Ferroelectric Random Access Memory) technology. F-RAM is available in large production quantities. The source code of 8051-based microcontrollers can be integrated with F-RAM. Features of F-RAM are (http://en.wikipedia.org/wiki/Ramtron_International): 1) around 10,000 times greater endurance; 2) 3,000 times less power consumption; and 3) nearly 500 times the write speed. In terms of speed and durability, F-RAM is better than flash. However, the maximum capacity available is barely 1 MB per chip. F-RAM is based on DRAM technology, in which a capacitor stores the charge but must be refreshed because the charge leaks away. F-RAM eliminates this limitation by using a capacitor with a ferroelectric material, lead zirconate titanate (PZT), which retains its state without power. At present there are few applications where F-RAM can perform far better than flash. The long-term future of F-RAM can be positive if capacity increases and the cost per MB falls.

Solid State Drive (SSD): A data storage device that emulates a hard disk drive (HDD). NAND flash SSDs are essentially arrays of flash memory devices with a controller; they emulate magnetic HDDs electrically and mechanically and are software compatible with them. In an unpowered state, NAND flash can retain its contents for seven to 10 years. The features of SSDs are: 1) no moving parts; 2) less heat and no noise; 3) lower power consumption than traditional HDDs; 4) zero rotational latency, making them faster than HDDs; and 5) durability: they are well suited to high-shock and high-vibration environments and are less susceptible to damage from environmental conditions. NAND flash development road maps show flash circuitry eventually shrinking to only 6.5 nm. At that point, read/write latency is expected to double in Multilevel Cell (MLC) flash and increase more than 2.5 times in Triple-Level Cell (TLC) flash. NAND flash-based SSDs have made inroads as data storage for Web sites, data centers and even some embedded applications. Because SSDs have no mechanical parts, they can outperform traditional hard disks for data access. Storage on an SSD eliminates physical disk I/O, resulting in better responsiveness. An IMDS boosted database write performance by 420 times. The Semiconductor Industry Association has estimated annual worldwide sales of semiconductor devices at US $300 billion. The Asia/Pacific region will be the largest market, accounting for 55% of sales, followed by the Americas (18%), Japan (15%) and Europe (12%). The driving force behind the higher growth of the semiconductor industry in the coming years will be the proliferation of consumer electronic gadgets, specifically tablet PCs and smartphones (Semiconductor and Other Electronic Component Manufacturing, First Research, Inc., January 14, 2013, http://www.marketresearch.com/First-Research-Inc-v3470/Semiconductor-Electronic-Component-Manufacturing-7306997/ ).

Memcached: Memcached is an In-Memory key-value store for small chunks of arbitrary data, such as strings and objects resulting from database calls, API calls or page rendering, and it is widely used. In 2009, Facebook used a total of 150 TB of DRAM in memcached and other caches for a database containing 200 TB of disk storage. Major Web search engines have also started keeping their search indexes entirely in DRAM.
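
The way such a cache typically sits in front of a database can be sketched as follows. This is a minimal illustration only: `TinyCache` is a hypothetical in-process stand-in for a memcached-style store (it is not the memcached client API), and `slow_database_lookup`, the key format and the 30-second expiry are likewise assumptions.

```python
import time


class TinyCache:
    """In-process stand-in for a memcached-style key-value cache with expiry."""

    def __init__(self) -> None:
        self._items = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value, ttl_seconds: float) -> None:
        self._items[key] = (value, time.monotonic() + ttl_seconds)


db_calls = 0


def slow_database_lookup(user_id: int) -> dict:
    """Hypothetical stand-in for an expensive disk-based database call."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache: TinyCache, user_id: int) -> dict:
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    user = slow_database_lookup(user_id)
    cache.set(key, user, ttl_seconds=30)
    return user


cache = TinyCache()
first = get_user(cache, 42)   # miss: goes to the database
second = get_user(cache, 42)  # hit: served from RAM
print(first == second, db_calls)  # True 1
```

The second lookup never reaches the database, which is the effect that large DRAM caches aim for at scale.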

Everspin Technologies Spin-Torque Magnetoresistive RAM (ST-MRAM): The commercialization of the first 64Mb Spin-Torque MRAM is an industry milestone along the path to broader use of more varied non-volatile memory technologies to improve storage device reliability, and to increase performance. ST-MRAM gives system designers the benefit of persistent, high endurance storage or memory for applications that demand better reliability and that need the performance boost of DDR3 speed. The 64Mb density MRAM provides an ideal entry point for non-volatile buffer and cache memory in solid state and RAID storage systems as well as storage appliances. The 64Mb device will complement existing low cost memory technologies, reducing overall system cost and complexity.

Online Transaction Processing (OLTP): Transactions are basic business operations in an organization, such as customer orders, purchase orders, receipts, time cards, invoices and payroll checks. Transaction processing systems (TPS) perform routine operations and serve as a foundation for other systems. Online transaction processing (OLTP) is a computerized processing system in which each transaction is processed immediately and the affected records are updated accordingly, without the delay of accumulating transactions into a batch. An OLTP system: 1) holds current data; 2) stores detailed data; 3) changes data dynamically through operations such as Insert, Update and Delete, for repetitive processing and a high volume of transactions; and 4) supports day-to-day decision making. By contrast, the earliest Online Analytical Processing (OLAP) systems used multidimensional arrays in memory to store data cubes and are known as multidimensional OLAP (MOLAP) systems. OLAP implementations using only relational database features are called relational OLAP (ROLAP) systems. Hybrid systems, which store some summaries in memory and store the base data and other summaries in a relational database, are called hybrid OLAP (HOLAP) systems.
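
As a rough illustration of immediate, per-transaction processing, the sketch below uses Python's standard `sqlite3` module with the special `:memory:` database name, which keeps the whole database in RAM. SQLite is used here only because it is readily available; it is not one of the systems the chapter surveys, and the table names and the `place_order` helper are hypothetical.

```python
import sqlite3

# ":memory:" asks SQLite to keep the whole database in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, quantity INTEGER)")
conn.execute("INSERT INTO stock VALUES ('widget', 5)")
conn.commit()


def place_order(customer: str, item: str, quantity: int, amount: float) -> bool:
    """Process one order immediately as a single all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO orders (customer, amount) VALUES (?, ?)",
                (customer, amount),
            )
            cur = conn.execute(
                "UPDATE stock SET quantity = quantity - ? "
                "WHERE item = ? AND quantity >= ?",
                (quantity, item, quantity),
            )
            if cur.rowcount == 0:
                # Not enough stock: abort, which also undoes the INSERT above.
                raise ValueError("insufficient stock")
        return True
    except ValueError:
        return False


print(place_order("alice", "widget", 3, 29.97))  # True: enough stock
print(place_order("bob", "widget", 4, 39.96))    # False: only 2 left
remaining = conn.execute(
    "SELECT quantity FROM stock WHERE item = 'widget'"
).fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(remaining, order_count)  # 2 1
```

Each call updates the affected records at once rather than queuing work for a later batch, and the rejected order leaves no trace in either table.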

Other NVRAM Technologies: The world is waiting for nanotechnology to produce MOSFETs and other storage devices at the nano scale. Many papers are being published, and serious research on nanotechnology is under way at hi-tech institutes around the world. One name that keeps coming up is Nano-RAM, which is made using nanotubes. Once new memory or processor devices emerge from nanotechnology, all present technologies will be dropped, just as the old vacuum tube technology was.

Embedded Database: “Embedded database” refers to a database system that is built into the software program by the application developer, is invisible to the application’s end-user and requires little or no ongoing maintenance. Many in-memory databases fit that description, but not all do. In contrast to embedded databases, a “client/server database” refers to a database system that utilizes a separate dedicated software program, called the database server, accessed by client applications via interprocess communication (IPC) or remote procedure call (RPC) interfaces. Some in-memory database systems employ the client/server model while others provide remote interfaces that facilitate access to an in-memory database that resides on another node of the network.

Memory Tables: Some DBMSs provide a feature called “memory tables” through which certain tables can be designated for all-in-memory handling. Memory tables do not change the underlying assumptions of database system design, however, and the optimization goals of a traditional DBMS are diametrically opposed to those of an IMDS. With an on-disk database, the primary burden on performance is file I/O, so its design seeks to reduce I/O, often by trading off memory consumption and CPU cycles: extra memory is used for a cache, and CPU cycles are spent maintaining it.
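
The trade-off described above, spending memory and CPU cycles to avoid file I/O, can be sketched with a small least-recently-used page cache in front of a simulated disk. Everything here is an assumption for illustration: the `BufferCache` class, the dictionary standing in for disk pages and the access pattern are hypothetical, and real DBMS buffer managers are considerably more elaborate.

```python
from collections import OrderedDict


class BufferCache:
    """LRU page cache in front of a simulated disk, as an on-disk DBMS might keep."""

    def __init__(self, disk: dict, capacity: int) -> None:
        self.disk = disk            # page number -> page contents (simulated)
        self.capacity = capacity    # number of pages held in memory
        self.pages = OrderedDict()  # cached pages, least recently used first
        self.disk_reads = 0

    def read_page(self, page_no: int) -> str:
        if page_no in self.pages:
            self.pages.move_to_end(page_no)  # CPU work to maintain the cache
            return self.pages[page_no]
        self.disk_reads += 1                 # cache miss: pay for file I/O
        data = self.disk[page_no]
        self.pages[page_no] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page
        return data


disk = {n: f"page-{n}" for n in range(100)}
access_pattern = [1, 2, 1, 3, 1, 2, 4, 1]

cached = BufferCache(disk, capacity=3)
for page in access_pattern:
    cached.read_page(page)

print(cached.disk_reads, "disk reads for", len(access_pattern), "lookups")
```

With a three-page cache, the eight lookups cost four simulated disk reads. An IMDS avoids this bookkeeping because every page is already in memory, which is why its optimization goals differ from those of a cache-based design.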

Phase Change Random Access Memory (PRAM): Phase Change Memory (PCM) is a term used to describe a class of non-volatile memory devices that employ a reversible phase change in materials to store information. IBM announced that it had created a stable, reliable, multi-bit Phase Change Memory with high performance. The memory reads and writes 100 times faster than flash, stays reliable for millions of write cycles as opposed to just thousands with flash, and is cheap enough to be used in anything from enterprise-level servers all the way down to mobile phones (Yam, M. (2011), IBM Develops Memory 100x Faster Than Flash, http://www.tomshardware.com/news/ibm-phase-change-memory-flash,13034.html ). According to an IBM press release: 1) reliable multi-bit phase-change memory technology has been demonstrated; 2) scientists have achieved a 100 times performance increase in write latency compared to flash; and 3) the technology enables a paradigm shift for enterprise IT and storage systems, including cloud computing, by 2016. PRAM's lead over flash and other NVRAM will be confirmed only after large-scale commercial production begins. Nevertheless, PRAM is one potential replacement for flash.
