A Review of Infrastructures to Process Big Multimedia Data

Jaime Salvador, Zoila Ruiz, Jose Garcia-Rodriguez
DOI: 10.4018/978-1-7998-2460-2.ch001

Abstract

In recent years, the volume of information has grown faster than ever before, moving from small, structured datasets to huge, unstructured ones such as text, image, audio, and video. Processing these data aims to extract relevant information on trends, challenges, and opportunities, all from studies involving large volumes of data. The increase in the power of parallel computing has enabled the use of machine learning (ML) techniques that take advantage of the processing capabilities offered by new architectures on large volumes of data. For this reason, mechanisms are needed to classify and organize the data so that users can more easily extract the information they require. Processing these data requires classification techniques, which are reviewed here. This work analyzes different studies on the use of ML for processing large volumes of data (Big Multimedia Data) and proposes a classification that uses, as its criterion, the hardware infrastructures employed in parallel machine learning approaches applied to large volumes of data.

2. Big Data

Big Data is present in all areas and sectors worldwide. However, its complexity exceeds the processing power of traditional tools, and high-performance computing platforms are required to exploit the full power of Big Data (Shim, 2013). These requirements have undoubtedly become a real challenge. Many studies focus on the search for methodologies that lower computational costs while increasing the relevance of the extracted information. The need to extract useful knowledge has led researchers to apply different machine learning techniques, to compare the results obtained, and to analyze them according to the characteristics of large data volumes (volume, velocity, veracity, and variety, the 4Vs) (Mujeeb & Naidu, 2015).
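
As an illustration of this compare-and-analyze workflow (not taken from the chapter), the following Python sketch uses scikit-learn to cross-validate two common classifiers under identical conditions; load_digits is an assumed stand-in for a real large-scale multimedia dataset, and n_jobs=-1 requests all available cores, echoing the parallel-computing point above.

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Small built-in dataset as a stand-in for a large multimedia corpus.
X, y = load_digits(return_X_y=True)

# Compare two ML techniques under the same 5-fold cross-validation.
classifiers = [
    ("random forest", RandomForestClassifier(n_estimators=100, n_jobs=-1)),
    ("support vector machine", SVC(kernel="rbf")),
]
for name, clf in classifiers:
    scores = cross_val_score(clf, X, y, cv=5, n_jobs=-1)
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")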

Machine learning (ML) techniques focus on minimizing the effects of noise in digital images, videos, and hyperspectral data, among others, extracting useful information in various areas of knowledge such as civil engineering (Rashidi, 2016), medicine (Athinarayanan, 2016), and remote sensing (Torralba, 2008).
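
The chapter itself includes no code; as a minimal sketch of the denoising step mentioned above, the snippet below applies OpenCV's non-local means filter, a classical (non-ML) baseline against which learned denoising methods are often compared. The file names are hypothetical.

import cv2

# Load a noisy grayscale image (hypothetical file name).
noisy = cv2.imread("noisy_frame.png", cv2.IMREAD_GRAYSCALE)

# Non-local means denoising: h sets the filter strength; the window
# sizes control the patches compared when averaging similar regions.
denoised = cv2.fastNlMeansDenoising(noisy, None, h=10,
                                    templateWindowSize=7,
                                    searchWindowSize=21)

cv2.imwrite("denoised_frame.png", denoised)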

With the numerous image repositories generated in recent years, many computer vision algorithms try to solve problems related to finding matches for local image features in Big Data, grouping those features and labeling them (Muja, 2009).
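
Muja (2009) refers to FLANN, a library for approximate nearest-neighbor search. As a hedged sketch of this matching step, assuming OpenCV with its FLANN bindings is available, the snippet below detects ORB features in two hypothetical images and keeps distinctive matches via Lowe's ratio test.

import cv2

# Load two images to match (hypothetical file names).
img1 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("reference.jpg", cv2.IMREAD_GRAYSCALE)

# Detect local features and compute binary (ORB) descriptors.
orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# FLANN with an LSH index (algorithm=6), suited to binary descriptors.
index_params = dict(algorithm=6, table_number=6, key_size=12,
                    multi_probe_level=1)
flann = cv2.FlannBasedMatcher(index_params, dict(checks=50))

# Two nearest neighbors per descriptor; Lowe's ratio test keeps
# matches whose best neighbor is clearly closer than the second.
good = []
for pair in flann.knnMatch(des1, des2, k=2):
    if len(pair) == 2 and pair[0].distance < 0.7 * pair[1].distance:
        good.append(pair[0])

print(f"{len(good)} distinctive matches found")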

Currently, there are several information repositories covering a wide range of areas, and these datasets can be used to test the performance of algorithms.
