A Survey on Tools for Data Analytics and Data Science


Pankaj Pathak (Symbiosis International University (Deemed), India), Samaya Pillai Iyengar (Symbiosis International University (Deemed), India), and Minal Abhyankar (Symbiosis International University (Deemed), India)
DOI: 10.4018/978-1-7998-3053-5.ch003


Education and employment are changing rapidly, and the change is especially visible in technology education. Four to five years ago, technology education meant coding in various programming languages. More recently, data science and data analytics have become the buzzwords. Employment in this area has changed just as sharply: many new opportunities have emerged, while some existing roles have become scarce or disappeared. Businesses across domains are warming to these ideas, and industry preference for these techniques has grown. This chapter discusses both the concepts and the tools in use.
Chapter Preview

Introduction And Background

A database is a container of information, i.e., processed data. Its main objective is the storage of data. Alongside the database comes the database management system (DBMS), the software used to create the database and manage all operations on it (Codd, 1990).
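The relationship between a database and its management system can be sketched in a few lines of Python. This is only an illustration, not something taken from the chapter: it uses SQLite through Python's standard `sqlite3` module as the management system, and the `customers` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with one hypothetical table and return the connection."""
    conn = sqlite3.connect(":memory:")
    # The management system creates the storage structure ...
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
    )
    # ... and handles writes to it (sample rows are made up for illustration).
    conn.executemany(
        "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
        [(1, "Asha", "Pune"), (2, "Ravi", "Mumbai"), (3, "Meera", "Pune")],
    )
    conn.commit()
    return conn


def customers_in_city(conn, city):
    """Return the names of customers stored for the given city, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", (city,)
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = build_demo_database()
    print(customers_in_city(conn, "Pune"))
    conn.close()
```

The database here is the stored table. Creating it, inserting rows, and answering the query are all operations carried out by the management system.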

The timeline for Database is as follows:

1950s and early 1960s: Data processing and storage relied mainly on magnetic tapes, which allowed only sequential access. Punched cards were used for input.

Late 1960s and 1970s: Hard disks came into widespread use, allowing direct access to data. In database handling, the network and hierarchical data models were well regarded and used extensively. Ted Codd put forth the relational data model, which remains relevant today. The relational data model supported better transaction performance and helped enable real-time transactions.

1980s: Research on relational databases proved to have great commercial value. During this time, SQL became the de facto industry standard. Parallel and distributed database systems entered commercial use and proved highly useful for organizations. Object-oriented databases also emerged as a new concept in the database domain.

1990s: This era saw a strong push toward large decision-support systems. Data-mining applications were developed and launched, and large multi-terabyte data warehouses were designed. Web commerce also emerged as a new concept.

2000s: The XML and XQuery standards were developed and launched. Automated database administration began to appear in organizations, simplifying the work of database administrators.

Key Terms in this Chapter

Database: A collection of processed data, i.e., a storage container for information.

Data Science: Data science is the study of data. It involves developing methods of recording, storing, and analyzing data to effectively extract useful information. The goal of data science is to gain insights and knowledge from any type of data—both structured and unstructured.

Unstructured Data: Data that cannot be stored in a table structure, such as audio files, video, and images.

Structured Data: Data that can be stored in a two-dimensional table structure of rows and columns.

Churn: Customers who left the telco subscription within the last month.

Tools of Data Science: Business analytics tools are types of application software that retrieve data from one or more business systems to be reviewed and analyzed.
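The definitions of structured data and churn above can be made concrete with a small Python sketch. Everything in it is an assumption for illustration: the records, the column names `customer_id` and `left_on`, and the choice to approximate "within the last month" as a 30-day window. None of it comes from the chapter.

```python
from datetime import date, timedelta

# Hypothetical structured data: every row carries the same columns,
# so the records fit a two-dimensional table of rows and columns.
SUBSCRIPTIONS = [
    {"customer_id": 1, "left_on": None},               # still subscribed
    {"customer_id": 2, "left_on": date(2024, 5, 20)},
    {"customer_id": 3, "left_on": date(2024, 1, 3)},
    {"customer_id": 4, "left_on": date(2024, 6, 1)},
]


def churned_customers(rows, as_of, window_days=30):
    """Return the ids of customers who left within `window_days` before `as_of`.

    Assumption: "the last month" is approximated as a 30-day window.
    """
    window_start = as_of - timedelta(days=window_days)
    return [
        row["customer_id"]
        for row in rows
        if row["left_on"] is not None and window_start < row["left_on"] <= as_of
    ]


if __name__ == "__main__":
    print(churned_customers(SUBSCRIPTIONS, as_of=date(2024, 6, 10)))
```

With the reference date of 10 June 2024, customers 2 and 4 count as churn. Customer 3 left too long ago, and customer 1 never left.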
