Introduction to Big Data Analytics

Copyright: © 2024 | Pages: 48
DOI: 10.4018/979-8-3693-0413-6.ch001

Abstract

Big data refers to data collections that are either too large or too complex for traditional data-processing application software to manage. The three concepts initially associated with big data are volume, variety, and velocity. A fourth concept, veracity, concerns the accuracy or believability of the data. Big data analytics is the process of acquiring and analyzing massive volumes of data to discover market trends, insights, and patterns that can help firms make better business decisions. Across all corporate sectors, improved efficiency leads to smarter operations overall, higher profits, and more satisfied customers. This chapter gives an overview of how to store and manage big data, the importance of big data analytics, how to apply big data analytics using different methods and tools to benefit businesses, applications of big data analytics in various fields, and the challenges facing big data analytics.

Introduction

Digital data is created in part through the use of Internet-connected devices. Cellphones, tablets, and desktop computers transmit information about their users, and connected smart devices share data on how consumers use everyday items.

Besides connected devices, data originates from a variety of other sources, including demographic, climatic, scientific, medical, and energy usage data. These data describe the location of device users, their travels, their hobbies, their consumption patterns, their leisure activities, and their projects, as well as how tools, equipment, and infrastructure are used. The amount of digital data keeps growing as more people use the Internet and mobile devices. We currently live in an information society that is transitioning to a knowledge-based society, and we require more data in order to derive better information. In the information society, information is a key component of the political, cultural, and economic spheres (Hilbert & Lopez, 2011).

The phrase “Big data” refers to the development and application of technologies that deliver the right information, drawn from a vast and exponentially growing amount of data, to the right user at the right moment. The challenge is to deal with continuously growing volumes of data while also handling the complexity of more varied formats and of complicated, interrelated data. Because big data is a complex, polymorphic object, its definition changes depending on the community interested in it, whether as a user or as a provider of services. Originating with the web's largest companies, big data positions itself as a way to give everyone real-time access to massive databases.

Big data is difficult to define exactly because different fields have different ideas of what counts as a large amount of data. The term identifies a class of techniques and technologies rather than one specific technology. This is a new field, and the definition is evolving as we work out how to use this new paradigm and capitalize on its benefits. Big data is a broad term for data that traditional databases cannot store, process, and compute efficiently (Riahi & Riahi, 2015). Big data as a resource requires tools and techniques that can examine vast amounts of data and extract patterns from them (Najafabadi et al., 2015).

The variety and velocity of data are changing how structured data is analyzed. It is no longer sufficient simply to analyze data and generate reports; because the data are so varied, the systems in place must also be able to support the analysis itself. To help exploit the data, analysis involves automatically identifying correlations within a variety of quickly changing data.

The term “Big Data Analytics” refers to the process of gathering, compiling, and analyzing sizable data sets in order to identify patterns and other pertinent information. Big data analytics is a collection of technologies and approaches that call for novel forms of integration in order to reveal significant hidden value in datasets that are larger and more complex than typical datasets. It generally focuses on finding better and more efficient solutions to both new and old problems.

This chapter discusses the characteristics and ecosystem of Big data, as well as the types and applications of Big data analytics. The main topics covered in this chapter include the following:

  • The meaning of the term “Big data” and its characteristics
  • Big data ecosystem
  • Storage of Big data
  • Big data management technologies
  • Big data analytics
  • Big data analytics life cycle
  • Big data analytics processing
  • Big data analytics and machine learning
  • Key studies and trends in machine learning and Big data analytics
  • Current research directions, developments and challenges in machine learning and Big data analytics
  • Benefits of Big data analytics to businesses
  • Types of Big data analytics
  • Tools used in Big data analytics
  • Applications of Big data analytics
  • Ethical implications of Big data
  • Challenges and barriers for Big data analytics

Key Terms in this Chapter

Network Attached Storage (NAS): Dedicated file storage that enables multiple users and heterogeneous client devices to retrieve data from centralized disk capacity. Users on a local area network (LAN) access the shared storage via a standard Ethernet connection.

Support Vector Machine (SVM): A supervised machine learning algorithm used for both classification and regression.

Enterprise Resource Planning (ERP) System: A type of software system that helps organizations automate and manage core business processes for optimal performance.

Enterprise Data Warehouse (EDW): A relational data warehouse containing a company’s business data, including information about its customers. An EDW enables data analytics, which can yield actionable insights. Like all data warehouses, EDWs collect and aggregate data from multiple sources, acting as a repository for most or all organizational data to facilitate broad access and analysis.

Customer Relationship Management System (CRM): A technology for managing all of a company’s relationships and interactions with customers and potential customers.

Solid State Drive (SSD): A solid-state storage device that uses integrated circuit assemblies to store data persistently, typically using flash memory, and that functions as secondary storage in the hierarchy of computer storage.

Massively Parallel Processing (MPP): A way of processing large amounts of data by dividing it into parts and using many processors or computers to work on them at the same time.

Storage Area Network (SAN): A computer network that provides access to consolidated, block-level data storage.

Return on Investment (ROI): A metric that serves as a rudimentary gauge of an investment’s profitability. It is popular because of its versatility and simplicity.

Magnetic, Agile, Deep (MAD) Analysis: An approach to data analysis whose acronym its authors define as a re-imagination of the data warehouse concept. Magnetic: the warehouse encourages (attracts) new data sources and has reduced sensitivity to the cleanliness of those sources.
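
The Massively Parallel Processing entry above describes dividing data into parts and working on the parts at the same time. The following is a minimal sketch of that idea in Python, not an implementation of any particular MPP system. It assumes a thread pool on a single machine as a stand-in for the many processors or computers of a real deployment, and the function names (split_into_chunks, partial_sum, parallel_sum) are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, n_chunks):
    """Divide a list into n_chunks contiguous parts of near-equal size."""
    size, remainder = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `remainder` chunks each take one extra element.
        end = start + size + (1 if i < remainder else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def partial_sum(chunk):
    """Work done independently by one worker on its own part of the data."""
    return sum(chunk)


def parallel_sum(values, n_workers=4):
    """Sum values by splitting them across workers and combining the results."""
    chunks = split_into_chunks(values, n_workers)
    # Assumption: threads stand in for the separate processors or nodes.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # Combine step: merge the per-worker results into one answer.
    return sum(partials)


if __name__ == "__main__":
    data = list(range(1, 1001))
    print(parallel_sum(data))  # prints 500500
```

The same split, process, and combine pattern applies to other aggregations such as counts or maxima; only the per-chunk function and the final combine step change.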
