Introduction to the Popular Open Source Statistical Software (OSSS)

Zhijian Wu (New York University, USA), Zichen Zhao (Yale University, USA) and Gao Niu (Bryant University, USA)
DOI: 10.4018/978-1-7998-2768-9.ch003

Abstract

This chapter first introduces the two most popular Open Source Statistical Software (OSSS) tools, R and Python, along with their Integrated Development Environments (IDEs) and Graphical User Interfaces (GUIs). It then introduces additional OSSS, such as JASP, PSPP, GRETL, SOFA Statistics, Octave, KNIME, and Scilab, with function descriptions and modeling examples. The chapter is intended as a reference that helps readers select the appropriate open source software for a given statistical analysis task. Each software package is described in words, and its working platform and selected numerical, descriptive, and analysis examples are provided. Readers can thus gain a direct and in-depth understanding of each package and its functional highlights.

Background

Open Source Software (OSS) is a type of computer software whose source code has been released to the public. St. Laurent (2008) indicated that users have the right to study, change, and redistribute the software under the license granted by the copyright holder. Closed source, or proprietary, software can only be modified and maintained by the people, teams, and organizations who own it. Microsoft Office and Adobe Photoshop are well-known examples of proprietary software.

Open Source Software is popular among statistical analysis practitioners, not only because it is free, but also because it adapts more readily to the rapid pace of advances in academic research.

This chapter first introduces the two most popular Open Source Statistical Software (OSSS) tools, R and Python, along with their Integrated Development Environments (IDEs) and Graphical User Interfaces (GUIs). Then, additional OSSS, such as JASP, PSPP, GRETL, SOFA Statistics, Octave, KNIME, and Scilab, are introduced with descriptions of their functions and modeling examples. Figure 1 lists all of the popular open source statistical software and IDEs that are introduced in this chapter.

Figure 1.

Logos of popular open source statistical software and their IDEs (Designed by Niu, 2019)

Figures 2 and 3 show how the popularity of the Open Source Statistical Software discussed in this chapter developed over the last five years. The value represents Google search interest. A value of 100 is the peak popularity, which occurred in the third week of 2019 for Python; a value of 50 means the software is half as popular as at that peak. The data were extracted on 12/19/2019 from trends.google.com under the category of “Science” and “Web Search”. Since R and Python dominate the popularity charts, two figures were created in order to better present the relationship between all of the software. Figure 2 shows the popularity of R and Python. Figure 3 shows the other Open Source Statistical Software (OSSS).
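
The 0 to 100 scale described above can be illustrated with a short Python sketch. It assumes the index is a simple rescaling of raw counts relative to the single highest value; the actual Google Trends computation works on sampled, relative search shares and is not reproduced here. The function name and the weekly counts are hypothetical and were invented for illustration only.

```python
def to_interest_index(raw_counts):
    """Rescale raw counts so that the largest value maps to 100.

    Simplifying assumption: the index is proportional to the raw count.
    """
    peak = max(raw_counts)
    if peak <= 0:
        raise ValueError("at least one count must be positive")
    return [round(100 * count / peak) for count in raw_counts]


# Hypothetical weekly counts, not real Google Trends data.
weekly_counts = [120, 300, 600, 450]
print(to_interest_index(weekly_counts))  # [20, 50, 100, 75]
```

In this toy series the third week is the peak and receives 100, while the second week, with half as many searches, receives 50.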

Figure 2.

Python and R popularity 2014-2019 (Designed by Niu, 2019)

Figure 3.

Stacked Area Graph for Open Source Statistical Software Popularity (Designed by Niu, 2019)

Key Terms in this Chapter

Python: Python is a programming language. It was created by Guido van Rossum and released in 1991. Python can be used for web development, software development, system scripting, statistical analysis, and many other purposes.

Open Source Software (OSS): Open source software is software that any user can share, study, inspect, modify, and enhance (What is open source?, 2019).

Integrated Development Environment (IDE): An IDE is a software application that supports programmers in software development.

JASP: JASP is a free and open source statistical software package. JASP is considered an open source alternative to SPSS. The software name stands for Jeffreys’s Amazing Statistics Program (What does JASP stand for?, 2019).

Graphical User Interface (GUI): A user interface that allows users to interact visually with a computer through items such as windows, buttons, and menus.

SOFA Statistics: SOFA Statistics is an open source statistical software package. Its name is short for Statistics Open for All. SOFA can produce many well-designed graphs for presentation.

KNIME: KNIME is short for Konstanz Information Miner (Berthold et al., 2008). It is free open source software for data processing, statistical analysis, and report generation.

PSPP: PSPP is a free and open source statistical software package that includes various advanced statistical procedures. PSPP is compatible with many spreadsheet applications, which makes it easy to move data between programs.

R: R is a programming language and open source statistical software. It has a strong statistical analysis capability and graphical visualization functionality.

Scilab: Scilab is free open source software. It is designed for engineers and academic researchers. Scilab is considered an open source alternative to MATLAB.

Gretl: A cross-platform software package for econometric analysis, written in the C programming language. GRETL is short for GNU Regression, Econometrics and Time-series Library (GNU Regression, Econometrics, and Time-Series Library, 2019).

General Public License: A license intended to guarantee all users the freedom to share, study, and modify software (GNU Operating System, 2014).
