Introduction to Python and Its Statistical Applications

Siyu Shi (Pennsylvania State University, USA)
DOI: 10.4018/978-1-7998-2768-9.ch006

Abstract

This chapter introduces the history of Python and the IDEs (integrated development environments) and code editors used as its development environments. The history traces how Python grew from the ABC programming language in the Netherlands into a community of developers from different fields, and later became one of the most popular programming languages in the world. Popular IDEs and code editors for professional developers and beginners are also introduced, along with their advantages and disadvantages. Later in the chapter, the authors introduce Python libraries that can be used in statistical analysis and present a simple case showing how these methods can be applied.

Background: History Of Python

Python was conceived by Guido van Rossum in the late 1980s. Van Rossum received a master’s degree in mathematics and computer science from the University of Amsterdam in 1982 (https://en.wikipedia.org/wiki/Guido_van_Rossum). Although his education made him something of a mathematician, he enjoyed the fun that computers brought him. He shared ideas with students and staff in the basement of the science building: “The most important lesson I learned was about sharing: while most of the programming tricks I learned there died with the mainframe era, the idea that software needs to be shared is stronger than ever. Today we call it open source, and it’s a movement. Hold that thought!” (Guido van Rossum, King’s Day Speech).

At that time, he tried programming languages such as Pascal, C, and Fortran. These languages were designed primarily to make programs run efficiently. In the 1980s, although IBM and Apple had started the PC (personal computer) trend, these machines were far less capable than today’s PCs. The early Macintosh, for example, had only an 8 MHz CPU and 128 KB of RAM. A single large array could exhaust its memory. The job of a compiler in those days was therefore to optimize programs so that they could run properly within such limits.

This approach frustrated Guido. He knew how to write a given function in C, but the writing process could take a great deal of time even when he already knew how to build it. The other option was the shell, which had long served as the command interpreter for UNIX. Like glue, the shell could join UNIX functions together. Although it could implement in a few lines what would take hundreds of lines in C, it was not a true programming language: the shell could not call all of the functions of the computer.

Guido hoped for a programming language that could reach all of the computer's functional interfaces, as C could, and yet be as easy to program as the shell. ABC was a language that showed Guido the way. ABC was developed at CWI (Centrum Wiskunde & Informatica) in the Netherlands. Guido worked at CWI and took part in the development of ABC. Unlike most programming languages of the day, ABC aimed to be “a language that be useful for researchers, lab assistants, professional users who were not also professional programmers” (Oral History of Guido van Rossum, part 1). ABC was designed to be easy to read, use, remember, and learn.

Figure 1.

A function written in ABC that collects the set of all words in a document (ABC Programmer's Handbook)
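
For comparison, the task described in the Figure 1 caption (collecting the set of all words in a document) might be sketched in Python roughly as follows. This is an illustrative sketch, not a translation of the handbook's listing: the function name `words`, the use of whitespace splitting, and the sample text are all assumptions made here.

```python
def words(document):
    """Return the set of distinct words in an iterable of text lines."""
    collection = set()                 # start with an empty collection
    for line in document:              # visit each line of the document
        for word in line.split():      # split the line on whitespace
            collection.add(word)       # a set keeps only one copy of each word
    return collection


# Hypothetical two-line "document" used only for illustration.
sample = ["the quick brown fox", "jumps over the lazy dog"]
unique_words = words(sample)
```

The nested loops make the line-by-line, word-by-word traversal explicit, and the built-in `set` type removes the need for an explicit membership check before each insertion.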

ABC did not become popular because it demanded more of PC hardware than was common at the time. The owners of these PCs cared more about how efficiently programs ran than about how hard a language was to learn. ABC's design also had some disadvantages:

  • Lack of Extensibility: Adding new functionality to ABC required changes in many places.

  • No Direct I/O: ABC could not operate directly on files. Difficult input and output is fatal for a programming language; it is like a Porsche whose doors cannot be opened.

  • Too Revolutionary: ABC tried to use natural language for programming. Programmers, however, were used to conventional programming languages, which made ABC hard for professional programmers to pick up.

  • Difficulty of Distribution: ABC had a powerful compiler, which had to be recorded on tape. Guido needed to carry a large tape with him whenever he visited others.

At Christmas 1989, Guido began writing the compiler/interpreter for Python. The name ‘Python’ came from the TV series ‘Monty Python's Flying Circus’. He wanted Python to be a language between C and the shell: fully featured, easy to learn, and extensible.

Key Terms in this Chapter

Correlation Matrix: Correlation matrix is a table showing correlation coefficients between variables. Each cell in the table shows the correlation between two variables.

Decision Tree: Decision tree is a decision support tool that uses a tree-like model to explore possible outcomes.

Time Series: Time series is a series of data points that are listed in time order.

Python: Python is an open-source programming language created by Guido van Rossum and first released in 1991.

Integrated Development Environment (IDE): IDE is a software application that provides programmers with tools for software development.

Model Validation: Model validation is a process used to test whether the output of a statistical model is acceptable with respect to the data-generating process.

R-Squared: R-Squared is a statistical measure that represents the proportion of the variance for a dependent variable that is explained by independent variables in a model.
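
Two of the terms above, the correlation matrix and R-squared, can be illustrated with a short Python sketch. It uses only the standard library rather than any particular statistics package, and the helper names (`pearson`, `correlation_matrix`, `fit_line`, `r_squared`) and the small data set are hypothetical, invented here for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict where matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def fit_line(x, y):
    """Least-squares slope and intercept for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Proportion of the variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total
    return 1 - ss_res / ss_tot


# Made-up data: study hours, test score, and number of errors.
data = {
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 55, 61, 64, 70],
    "errors": [9, 8, 6, 5, 2],
}
matrix = correlation_matrix(data)

# Fit score as a linear function of hours and measure the fit.
slope, intercept = fit_line(data["hours"], data["score"])
predicted = [slope * h + intercept for h in data["hours"]]
r2 = r_squared(data["score"], predicted)
```

Each cell `matrix[a][b]` holds the correlation between two variables, so the diagonal is 1 and the table is symmetric. For a least-squares line with a single predictor, `r2` equals the square of the corresponding correlation coefficient, which links the two definitions.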
