Understanding Data Analytics Is Good but Knowing How to Use It Is Better!

Copyright: © 2019 |Pages: 36
DOI: 10.4018/978-1-5225-7609-9.ch005


Collecting data and generating value from it: this is certainly the key success factor for tomorrow's champions, the one that will allow companies to innovate and create new business models. Faced with the 3Vs of big data (volume, velocity, and variety), many companies are embarking on big data projects with one main objective: generating value. The goal is, through the detailed analysis of large amounts of data, to lift the veil on hitherto hidden patterns and barely perceptible correlations, each a new business opportunity that companies must seize. The key to the success of any big data analytics initiative is to define your goals, identify the specific business questions that a suitable technical architecture will need to answer, and rely on data experts to generate value from data using appropriate algorithms.
Chapter Preview


In all summaries, the problems seem simpler than they actually are.

Rollo May

Big data can be analyzed using software tools commonly used in advanced analytical disciplines, such as predictive analytics, data mining, and statistical analysis. Traditional BI software and data visualization tools may also play a role in the analysis process, but semi-structured and unstructured data may not be suitable for traditional data warehouses based on relational databases. In addition, these warehouses are sometimes unable to meet the processing requirements imposed by big data sets that must be updated frequently, or even continuously.

Big data analytics is the process of examining large datasets containing heterogeneous data types to uncover hidden patterns, unknown correlations, market trends, user preferences, and other exploitable information. Increasingly, we discuss the benefits of the data analysis from Twitter, Google, Facebook, and any other space in which more and more people are leaving digital traces and filing information that may be exploitable and exploited.

Every second, visitors interact with interconnected objects and leave behind a tremendous amount of data that companies can then use to create tailor-made experiences. Faced with such a challenge, companies must make sure that the technologies they use can correctly handle this volume of data. Big data and data analytics are being adopted more and more frequently, especially in companies that are looking for new methods to develop smarter capabilities and tackle challenges in their dynamic processes.

As a result, many companies seeking to collect, process and analyze big data have turned to new technologies, including Hadoop and related tools such as YARN, MapReduce, Spark, Hive, and Pig, as well as NoSQL databases. These technologies form the basis of an open source software infrastructure that supports the processing of large and heterogeneous data sets on clustered systems.
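As a rough illustration of the MapReduce model underlying these tools (a plain-Python sketch, not tied to any actual Hadoop API), a word count can be expressed as a map phase that emits key-value pairs and a reduce phase that aggregates them per key:

```python
from collections import defaultdict

def map_phase(documents):
    """Map step: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    """Reduce step: sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["big data needs big tools", "data creates value"]
print(reduce_phase(map_phase(docs)))
# → {'big': 2, 'data': 2, 'needs': 1, 'tools': 1, 'creates': 1, 'value': 1}
```

In a real cluster, the map and reduce steps run in parallel across many machines and the framework shuffles the intermediate pairs between them; the logic per key, however, stays this simple.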

Working with big data also means being able to manage data. A data-driven approach therefore requires irreproachable data governance: if the data is not the right data, the analytics process that follows will bring disappointment and failure. In addition, we must not forget that big data is a highly technical field that has developed very rapidly, resulting in a skills gap that it is imperative to fill. By reading this chapter, you will discover how to conduct a data analytics process and what you need to guide it effectively.


When Big Data, Analytics, and Value Creation Meet

Big data is perceived by some as “the spearhead of digital transformation”. This is notably what Klaus Schwab explains in his book “The Fourth Industrial Revolution”. According to him, the digital age is centered around data: its access and use, in order to refine products and experiences and to converge towards a world of continual adjustment.

Data is the recording of a fact: something we can record, store, modify, and transmit. The concept of data is not an exact one. From the point of view of database design, data is a meaningless series of signs from which we can derive information after processing. Ackoff (1996) defines data as symbols, information as data processed to be useful, and knowledge as the application of data and information that gives the ability to understand “what”, “why”, and “how”.

Figure 1.

From big data to knowledge


According to Taylor (1980), the value of information begins with data, which takes on value throughout its evolution until it achieves its objective and specifies an action to take during a decision. Information is a message with a higher level of meaning. It is raw data that a subject in turn transforms into knowledge through a cognitive or intellectual operation.

Big data is everywhere, especially in the business context. The most mature companies in the exploitation of data are distinguished by several criteria.

Key Terms in this Chapter

Data Science: It is a new discipline that combines elements of mathematics, statistics, computer science, and data visualization. The objective is to extract information from data sources. In this sense, data science is devoted to database exploration and analysis. This discipline has recently received much attention due to the growing interest in big data.

Machine Learning: A method of designing a sequence of actions to solve a problem that optimizes automatically through experience and with limited or no human intervention.
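To make this definition concrete, here is a minimal sketch of "optimizing through experience" (the data values and learning rate below are hypothetical, chosen for illustration): fitting a line y = w·x by gradient descent on the squared error, with no human intervention beyond supplying the observations.

```python
# Observations that roughly follow y = 2x (hypothetical data).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w = 0.0      # the parameter to be learned
lr = 0.01    # learning rate

for _ in range(1000):
    # Mean gradient of the squared error sum((w*x - y)**2) w.r.t. w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad  # adjust w in the direction that reduces the error

print(round(w, 2))  # converges to a value close to 2
```

The loop embodies the definition above: a repeated sequence of actions (compute error, adjust parameter) that improves automatically with each pass over the data.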

Hadoop: Big data software infrastructure that includes a storage system and a distributed processing tool.

Open Source: A designation for a computer program in which underlying source code is freely available for redistribution and modification.

Data Mining: The practice of extracting information from data, with the objective of drawing knowledge from large quantities of data through automatic or semi-automatic methods. Data mining uses algorithms drawn from disciplines as diverse as statistics, artificial intelligence, and computer science to develop models from data; that is, to find interesting structures or recurrent themes according to predetermined criteria and to extract the largest possible amount of knowledge useful to companies. It groups together all technologies capable of analyzing database information in order to find useful information and potentially significant relationships within the data.
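As a toy example of such a semi-automatic method (the baskets below are hypothetical, and this is only one simple data-mining technique, frequent pair counting, not a full association-rule miner), we can search purchase baskets for item pairs that co-occur often enough to be interesting:

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets.
baskets = [
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"milk", "coffee"},
    {"bread", "butter"},
]

def frequent_pairs(baskets, min_support=2):
    """Return item pairs appearing together in at least min_support baskets."""
    pair_counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}

print(frequent_pairs(baskets))
# → {('bread', 'butter'): 2, ('bread', 'milk'): 2}
```

The predetermined criterion here is the support threshold; real data-mining systems apply the same idea at scale, with pruning strategies to avoid enumerating every pair.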

MapReduce: A programming model for processing data through a parallel programming implementation; it was originally used for academic purposes associated with parallel programming techniques.
