Plan and Rules for Data Analysis Success: A Roadmap

Source Title: Big Data Analytics for Entrepreneurial Success

DOI: 10.4018/978-1-5225-7609-9.ch008

Abstract

Integrating complex big data into your projects can become one of your strengths. This mission is not limited to using sophisticated tools to solve your problems: you must also align the requirements of your activities with a data lake or data warehouse through clear and correct strategies, keeping your business goals in view. This approach supports your company at every stage of a project, from defining and gathering requirements to going into production and subsequent maintenance. Ultimately, it will help you create sustainable and stable competitive advantages.

Chapter Preview

Introduction

My freedom thus consists in moving about within the narrow frame that I have assigned myself for each one of my undertakings. . . . Whatever diminishes constraint diminishes strength. The more constraints one imposes, the more one frees one’s self of the chains that shackle the spirit. Stravinsky (1942, p 65)

Today we are witnessing strong enthusiasm for big data. Publications of all kinds and demonstrations are multiplying, as are the promises, yet without really defining the outline of the phenomenon so that it can be approached as a real project rather than as a vague and passing technological fad.

Big data is an extraordinary opportunity for a company, a sector, or even a country. It provides useful and necessary knowledge at the right time, making it easier to manage the growing complexity of operations. To take full advantage of this large amount of data, the first step is to define the process by which data should be collected, processed, and analyzed. Next, we must identify the most appropriate business domain in which to launch a pilot or a proof of concept. We must then validate the choice of appropriate tools and technologies, and finally build an organization and governance to sustain and enhance the big data initiatives.

While some companies are already engaged in big data initiatives, the difficulty for others, especially small businesses and entrepreneurs, is knowing how and where to start. Here are the key points to address when starting a big data project.

The points put forward here will allow future entrepreneurs to better understand, as a whole, how value is created through big data analytics. They will shed light on the conditions for entrepreneurial success in the big data universe and on the different actions to be implemented.

Data Analytics Workflow

Data analytics, big data, and machine learning are very popular terms in today’s business world. However, the scopes covered by these terms overlap, and each can mean different things. From a data point of view, big data refers to several Vs, in addition to the three famous Vs (volume, velocity, and variety), which highlight the limits of traditional tools in processing and analyzing the available data (collection, storage, analysis, integration, etc.).

In this chapter, I will start by describing the workflow that can be adopted to better explore the data. Specifically, I will explain how the data analytics process can be applied when working with big data in general.

Let’s go!

Note, however, that because of its experimental nature, data analytics runs this workflow empirically. As a result, this way of working is not linear but iterative: the analyst or entrepreneur who wants to work with data defines a hypothesis, implements it, and then refines it. A big data analytics process is usually represented in the following form.

Figure 1. Data analytics process

If you decide to work with data and launch your own big data project, you need a clear idea of the implementation process, because there are several steps to follow. From asking the right questions and defining goals, through preparing the data (collection, cleaning …) and exploring it, to critically analyzing the results, here is the overall data analytics workflow:

Key Terms in this Chapter

Business Model: A business model is a company's plan for how it will generate revenues and make a profit. It explains what products or services the business plans to manufacture and market, and how it plans to do so, including what expenses it will incur.

Missing Values: Occur when no data value is stored for the variable in an observation.

Data Mining: This practice consists of extracting information from data with the objective of drawing knowledge from large quantities of data through automatic or semi-automatic methods. Data mining uses algorithms drawn from disciplines as diverse as statistics, artificial intelligence, and computer science in order to develop models from data; that is, in order to find interesting structures or recurrent themes according to criteria determined beforehand and to extract the largest possible amount of knowledge useful to companies. It groups together all technologies capable of analyzing database information in order to find useful information and possible significant relationships within the data.

Data Lake: A collection of storage instances of various data assets, kept in addition to the originating data sources. These assets are stored in a near-exact, or even exact, copy of the source format. The purpose of a data lake is to present an unrefined view of data to only the most highly skilled analysts, to help them explore their data refinement and analysis techniques independent of any of the system-of-record compromises that may exist in a traditional analytic data store (such as a data mart or data warehouse).

Data Analysis: A class of statistical methods that makes it possible to process a very large volume of data and identify the most interesting aspects of its structure. Some methods help to extract relations between different sets of data and thus draw statistical information that describes the most important information contained in the data as succinctly as possible. Other techniques make it possible to group data in order to identify its common denominators clearly, and thereby understand them better.

Outliers: An outlier is an observation that lies an abnormal distance from other values in a random sample from a population. In a sense, this definition leaves it up to the analyst (or a consensus process) to decide what will be considered abnormal. Before abnormal observations can be singled out, it is necessary to characterize normal observations.

Natural Language Processing (NLP): An interdisciplinary field of computer science, artificial intelligence, and computational linguistics that focuses on programming computers and algorithms to parse, process, and understand human language.
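
As a small illustration of the Missing Values and Outliers entries above, the following Python sketch counts missing values in a sample and then flags outliers among the values that are present. The data and the function names (split_missing, iqr_outliers) are hypothetical, and the 1.5 × interquartile range fence is only one common convention for deciding what counts as abnormal; the chapter itself leaves that choice to the analyst.

```python
import statistics

# Hypothetical observations of one variable; None marks a missing value.
revenue = [10, 12, None, 11, 13, 12, None, 11, 14, 95]


def split_missing(values):
    """Separate stored values from missing ones (None) and count the latter."""
    present = [v for v in values if v is not None]
    return present, len(values) - len(present)


def iqr_outliers(values, k=1.5):
    """Flag values lying more than k * IQR below Q1 or above Q3.

    The factor k = 1.5 is an assumed convention, not a rule from the chapter.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


present, n_missing = split_missing(revenue)
outliers = iqr_outliers(present)
print(f"missing: {n_missing}, outliers: {outliers}")
```

Characterizing the normal observations first (here through the quartiles) and only then singling out abnormal ones mirrors the order described in the Outliers entry. A different fence, or a domain-specific threshold, may be more appropriate for a given project.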
