A Survey of Open Source Statistical Software (OSSS) and Their Data Processing Functionalities

Gao Niu (Bryant University, USA), Richard S. Segall (Arkansas State University, USA), Zichen Zhao (Yale University, USA) and Zhijian Wu (New York University, USA)
Copyright: © 2021 | Pages: 20
DOI: 10.4018/IJOSSP.2021010101

Abstract

This paper discusses the definitions of open source software, free software, and freeware, and the concept of big data. The authors then introduce R and Python as the two most popular open source statistical software (OSSS) packages. Additional OSSS, such as JASP, PSPP, GRETL, SOFA Statistics, Octave, KNIME, and Scilab, are also introduced with function descriptions and modeling examples. The authors further discuss the capability of OSSS in artificial intelligence applications and modeling, as well as popular OSSS-based machine learning libraries and systems. The paper is intended as a reference to help readers select appropriate open source software for statistical analysis tasks. In addition, the working platform and selected numerical, descriptive, and analysis examples are provided for each software package, giving readers a direct and in-depth understanding of each package and its functional highlights.

2. Background

2.1 How Open Source Software, Free Software, And Freeware Differ

2.1.1 Open Source Software (OSS)

Open Source Software (OSS) is a type of computer software whose source code is released under a license in which the copyright holder grants users the rights to study, change, and distribute the software to anyone and for any purpose (Wikipedia (2019a)).

For software to be considered “Open Source”, it must meet ten conditions defined by the Open Source Initiative (OSI). The first three of these conditions are at the core of Open Source and differentiate it from other software. According to the Open Source Initiative (2007), these three conditions are:

  1. Free Redistribution: The software can be freely given away or sold.

  2. Source Code: The source code must either be included or freely obtainable.

  3. Derived Works: Redistribution of modifications must be allowed.

The other conditions are (Open Source Initiative (2007)):

  4. Integrity of The Author's Source Code: Licenses may require that modifications are redistributed only as patches.

  5. No Discrimination against Persons or Groups: No one can be locked out.

  6. No Discrimination against Fields of Endeavor: Commercial users cannot be excluded.

  7. Distribution of License: The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.

  8. License Must Not Be Specific to a Product: The program cannot be licensed only as part of a larger distribution.

  9. License Must Not Restrict Other Software: The license cannot insist that any other software it is distributed with must also be open source.

  10. License Must Be Technology-Neutral: Click-wrap licenses or other medium-specific ways of accepting the license must not be required.

Macaulay (2017) discussed benefits of open source software that are summarized in Figure 1 below.

Figure 1. Benefits of Open Source Software (OSS) (derived from Macaulay (2017))

2.1.2 Open Source License

According to Wikipedia (2019f), an open source license is a type of license for computer software and other products that allows the source code, blueprint, or design to be used, modified, and/or shared under defined terms and conditions. This allows end users and commercial companies to review and modify the source code, blueprint, or design for their own customization, curiosity, or troubleshooting needs.

Open-source licensed software is mostly available free of charge, though this does not necessarily have to be the case.

Licenses that permit only non-commercial redistribution, or modification of the source code for personal use only, are generally not considered open source licenses.
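
As an illustration only, the three core OSI conditions and the exclusion of non-commercial-only licenses can be sketched as a simple check in Python. The `LicenseTerms` class, its field names, and the two example licenses are hypothetical and introduced here for illustration; they are not part of the OSI definition or of any real library. The check also covers only four of the ten conditions, so it cannot establish that a license is open source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LicenseTerms:
    """Hypothetical, simplified description of a license (illustration only)."""
    name: str
    free_redistribution: bool     # condition 1: may be freely given away or sold
    source_available: bool        # condition 2: source included or freely obtainable
    allows_derived_works: bool    # condition 3: modifications may be redistributed
    commercial_use_allowed: bool  # condition 6: commercial users are not excluded


def failed_conditions(terms: LicenseTerms) -> List[str]:
    """Return the names of the checked conditions that the license fails."""
    checks = {
        "Free Redistribution": terms.free_redistribution,
        "Source Code": terms.source_available,
        "Derived Works": terms.allows_derived_works,
        "No Discrimination against Fields of Endeavor": terms.commercial_use_allowed,
    }
    return [name for name, satisfied in checks.items() if not satisfied]


def passes_checked_conditions(terms: LicenseTerms) -> bool:
    """True if none of the four checked conditions is violated."""
    return not failed_conditions(terms)


# Two made-up licenses: one permissive, one restricted to non-commercial use.
permissive = LicenseTerms("Example Permissive License", True, True, True, True)
noncommercial = LicenseTerms("Example Non-Commercial License", False, True, True, False)

for terms in (permissive, noncommercial):
    print(terms.name, "->", failed_conditions(terms) or "no checked condition failed")
```

In this sketch the non-commercial license fails because it cannot be sold and because it excludes commercial users, which mirrors the statement above that such licenses are generally not considered open source.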
