Machine Learning and Deep Learning for Big Data Analysis

Copyright: © 2024 |Pages: 32
DOI: 10.4018/979-8-3693-0413-6.ch008

Abstract

As data plays a central role in machine learning and provides insights across many sectors, organizations are placing more emphasis on collecting, organizing, and managing information. However, traditional data-analysis methods struggle to keep up with the increasing complexity and volume of big data. Advanced techniques such as machine learning and deep learning have emerged to extract insights from these datasets. In self-driving cars, for example, the analysis of sensor data relies on methodologies developed in data analytics. These trends extend beyond individual use cases: big data and deep learning are driving forces, supported by enhanced processing capabilities and the expansion of networks. Managing the complexity of processing large amounts of data requires scalable architectures that leverage distributed systems, parallel processing techniques, and technologies such as GPUs. This development is particularly relevant for industries such as banking, healthcare, and public safety, which have pressing demands for transparency and interpretability in models.
Chapter Preview

1. Introduction

Big Data is data so vast and complex that it cannot be handled by standard systems or data-warehousing technology. It cannot be stored in a relational database management system or processed using standard SQL-style queries (Ishwarappa & Anuradha, 2015). Owing to the development of technology and services, massive amounts of structured, unstructured, and semi-structured data are being generated from a variety of sources (Ishwarappa & Anuradha, 2015). Big data originates from sectors such as agriculture, medicine, and the Internet of Things (IoT). In the healthcare sector, big data is produced by medical equipment and automated medical tools, such as sensor devices and high-throughput instruments (Mohammed Alqahtani, 2023). The agriculture industry creates big data by attaching IoT devices to fields, soil, and plants to collect data in real time directly from the ground.

Big Data can be characterized by seven Vs: Velocity, Volume, Variety, Veracity, Variability, Visualization, and Value. Keep in mind that this list is provisional, and the number of Vs could go up or down in the near future (Tyagi & G, 2019).

  • “Volume” refers to the quantity of Big Data. The IoT, medical devices, cloud computing traffic, and other factors all contribute to the rapid increase in data volume. Many large companies, such as Google and Apple, hold large amounts of data in the form of logs (Ishwarappa & Anuradha, 2015).

  • “Velocity” describes the rate at which data is gathered and the rate at which it is processed, stored, and evaluated by databases.

  • “Variety” describes the range of data types. Different forms of data can be gathered from numerous sources, and the data may be structured, unstructured, or semi-structured.

  • “Variability” refers to inconsistencies in the data that is generated. These are mostly due to varied data sources, differing data layouts, or data-entry errors.

  • “Veracity” reflects the accuracy and quality of the data. When coping with massive amounts of data, the data may not be entirely accurate, and the accumulated data may contain missing information (Ishwarappa & Anuradha, 2015).

  • “Visualization” refers to the ability to convert data into pictorial representations that are easy to understand and analyse.

  • “Value” refers to the potential value of big data, which directly affects how organizations might use the information obtained.
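
Some of the Vs above can be estimated directly from a sample of records. The following is a minimal Python sketch, not a method from the chapter: it reports the record count (a stand-in for Volume), the share of missing cells (one aspect of Veracity), and the fields whose values arrive in more than one type (one aspect of Variability). The function name `profile_records`, the field names, and the sample readings are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def profile_records(records, expected_fields):
    """Summarise record count, missing cells, and type-inconsistent fields."""
    total_cells = len(records) * len(expected_fields)
    missing = 0
    types_per_field = {field: Counter() for field in expected_fields}

    for record in records:
        for field in expected_fields:
            value = record.get(field)  # absent keys count as missing
            if value is None or value == "":
                missing += 1
            else:
                types_per_field[field][type(value).__name__] += 1

    missing_ratio = missing / total_cells if total_cells else 0.0
    # A field whose values arrive in more than one type is flagged as inconsistent.
    inconsistent = sorted(
        field for field, counts in types_per_field.items() if len(counts) > 1
    )
    return {
        "volume": len(records),
        "missing_ratio": missing_ratio,
        "inconsistent_fields": inconsistent,
    }


# Hypothetical soil-sensor readings: one value stored as text, two cells missing.
readings = [
    {"sensor_id": "s1", "soil_moisture": 0.31, "temperature": 21.5},
    {"sensor_id": "s2", "soil_moisture": "0.28", "temperature": 22.0},
    {"sensor_id": "s3", "soil_moisture": None, "temperature": 20.9},
    {"sensor_id": "s4", "soil_moisture": 0.35},
]

report = profile_records(readings, ["sensor_id", "soil_moisture", "temperature"])
print(report)
```

In this sample, two of the twelve expected cells are missing and `soil_moisture` mixes numeric and text values, so it is the only field flagged as inconsistent. A real pipeline would run such checks on distributed storage rather than an in-memory list.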

Key Terms in this Chapter

Supervised Learning: In the supervised learning paradigm, a model is trained on a labeled dataset to learn the relationship between the input data and the associated output, allowing the model to be used for classification or prediction.

Reinforcement Learning: In the machine learning paradigm known as reinforcement learning, agents are trained to make decisions by interacting with their surroundings, taking feedback in the form of rewards or penalties, and modifying their behavior over time to maximize the cumulative reward.

Data Mining: Finding relevant patterns, trends, and information inside huge databases by statistical, mathematical, or computational methods is known as data mining.

Big Data: Big data refers to enormous and complex datasets that exceed the capacity of typical data-processing systems.

Deep Learning: Deep learning is a subfield of machine learning that uses neural networks with numerous layers (deep neural networks) to learn and represent complicated patterns in data.

Unsupervised Learning: Unsupervised learning, which is frequently used for clustering, dimensionality reduction, or anomaly detection, is a machine learning paradigm in which a model investigates and finds patterns in unlabeled data without explicit direction.

Machine Learning: Machine learning is a computational method that enables systems to automatically learn and improve from experience without being explicitly programmed.
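
The difference between the supervised and unsupervised paradigms defined above can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not code from the chapter: the supervised part is a nearest-centroid classifier that uses labels, and the unsupervised part is a simplified one-dimensional k-means with k = 2 that sees no labels. The function names and the toy data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def fit_nearest_centroid(points, labels):
    """Supervised: use the labels to compute one centroid per class."""
    grouped = {}
    for x, y in zip(points, labels):
        grouped.setdefault(y, []).append(x)
    return {label: mean(xs) for label, xs in grouped.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def two_means(points, iterations=10):
    """Unsupervised: split unlabeled 1-D points into two clusters."""
    low, high = min(points), max(points)  # initial cluster centers
    for _ in range(iterations):
        left = [x for x in points if abs(x - low) <= abs(x - high)]
        right = [x for x in points if abs(x - low) > abs(x - high)]
        if not right:  # all points identical: nothing to split
            break
        low, high = mean(left), mean(right)
    return low, high


# Toy data: two well-separated groups of 1-D measurements.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
labels = ["low", "low", "low", "high", "high", "high"]

centroids = fit_nearest_centroid(points, labels)  # learns from labels
centers = two_means(points)                       # never sees the labels

print(predict(centroids, 1.1), predict(centroids, 4.0))
print(centers)
```

On this toy data both approaches recover group centers near 1.0 and 5.0, but only the supervised model can attach the names "low" and "high" to them, because only it was given labels.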
