Tools, Technologies, and Methodologies to Support Data Science: Support Technologies for Data Science

Ricardo A. Barrera-Cámara, Ana Canepa-Saenz, Jorge A. Ruiz-Vanoye, Alejandro Fuentes-Penna, Miguel Ángel Ruiz-Jaimes, Maria Beatriz Bernábe-Loranca
DOI: 10.4018/978-1-7998-3053-5.ch004

Abstract

Devices such as smartphones, computers, tablets, biomedical equipment, sports equipment, and information systems generate large amounts of data and useful information in transactional information systems. However, this information may go unnoticed or may not be analyzed adequately for decision-making. Technologies, tools, algorithms, and models exist that support analysis, visualization, learning, and prediction. Data science involves techniques and methods for abstracting knowledge from diverse sources. It combines fields such as statistics, machine learning, data mining, visualization, and predictive analysis. This chapter aims to be a guide to the statistical and computational tools applicable in data science.

Background

This section presents work related to technologies applicable in data science.

Platforms: Hardware platforms are analyzed in terms of their specific features and the software frameworks used on them, as critical elements that must be present to execute big data algorithms (Singh & Reddy, 2014).

Machine Learning: Criteria are proposed and analyzed for selecting open source tools for machine learning with big data. Experience with processing, libraries, and machine learning frameworks is also considered (Landset et al., 2015).

Software: Open source data mining tools that are also used in data science are analyzed in terms of their operational characteristics, license, programming languages, web support, type, and domain (Barlas, 2015).

Visualization: Various data visualization tools and techniques oriented to large volumes of data are analyzed, and their functional and non-functional characteristics are presented (Caldarola & Rinaldi, 2017).

Dataset: The availability, exchange, access, use, retrieval, and search of data have made possible the emergence of data stores and public-access dataset services, including one from a company that provides internet search services (Chapman et al., 2019).

Figure 1 presents a timeline of the launch years of the technologies identified in the background of this work.

Figure 1. Timeline of technologies

Key Terms in this Chapter

Dataset: Set of data organized in tabular form, where each column usually represents a variable or field.

Methodology: Series of stages that a software or data science project follows; taken together, the stages are called the life cycle.

Machine Learning: Discipline that uses and develops techniques with the aim of providing intelligence to computers and predicting situations from data.

License: Permission granted by the owner of a product or data to the end user for its use, distribution, or modification, under the terms specified by the owner or by the end-user contract.

Repository: Place that holds various digital resources such as documents, data, and software. These are available under various user licenses, commonly on websites organized for access, consultation, or download.

Framework: Series of conceptual, practical, or normative elements used as a reference to deal with a particular type of problem.

Platform: Computer or hardware system on which various hardware or software applications, or both, are run under a given type of use license.
