Introduction to Linear Regression

DOI: 10.4018/978-1-68318-016-6.ch006


In statistical modelling, regression analysis is a statistical process for estimating the relationships among variables. More specifically, it shows how the dependent variable changes when one of the independent variables is varied while the others are held fixed. Regression analysis thus estimates the average value of the dependent variable for fixed values of the independent variables, so the estimation target is a function of the independent variables called the regression function. In limited circumstances, regression analysis can be used to infer causal relationships between the independent and dependent variables, but caution is needed because correlation does not necessarily imply causation. There are many regression analysis techniques; this chapter presents only the fundamentals.
Chapter Preview

R Vs. Python

To build a linear regression model, we studied how the Publications variable varies in relation to the Gender, R_user, Python_user, and Age variables.
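As a rough illustration of how these variables could enter a regression, the sketch below turns a few records into numeric rows. The records, the level labels ("male"/"female", "yes"/"no"), the 0/1 coding, and the helper name `encode` are all assumptions made for this example; the chapter's actual data and coding are not shown in this preview.

```python
# Hypothetical survey records: the values and the level labels are invented
# for illustration and are not taken from the chapter's data set.
records = [
    {"Gender": "male", "R_user": "yes", "Python_user": "no", "Age": 34, "Publications": 12},
    {"Gender": "female", "R_user": "no", "Python_user": "yes", "Age": 41, "Publications": 20},
    {"Gender": "female", "R_user": "yes", "Python_user": "yes", "Age": 29, "Publications": 7},
]


def encode(record):
    """Turn one record into a row of independent-variable values.

    Assumed coding: a leading 1.0 for the intercept, then 0/1 indicators for
    Gender == "male", R_user == "yes", Python_user == "yes", then Age as a number.
    """
    return [
        1.0,
        1.0 if record["Gender"] == "male" else 0.0,
        1.0 if record["R_user"] == "yes" else 0.0,
        1.0 if record["Python_user"] == "yes" else 0.0,
        float(record["Age"]),
    ]


# X holds the independent variables, y the dependent variable (Publications).
X = [encode(r) for r in records]
y = [float(r["Publications"]) for r in records]

for row, target in zip(X, y):
    print(row, "->", target)
```

Each row of `X` pairs with one entry of `y`; a regression routine in R or Python would normally build this coding internally from the categorical columns.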

Linear Regression Model

In the linear regression model, the functional relationship between the dependent variable and the independent variables is (Maroco, 2011):

$$Y_j = \beta_0 + \beta_1 X_{1j} + \beta_2 X_{2j} + \dots + \beta_p X_{pj} + \varepsilon_j, \quad j = 1, \dots, n$$

In this model, $\beta_i$ ($i = 0, 1, \dots, p$) are the regression coefficients and $\varepsilon_j$ are the errors or residuals of the model. $\beta_0$ is the y-intercept, and $\beta_i$ ($i = 1, \dots, p$) are the partial slopes (i.e., a measure of the influence of $X_i$ on $Y$: the variation in $Y$ per unit of variation in $X_i$). The term $\varepsilon_j$ (errors or residuals of the model) reflects the measurement errors and the natural variation in $Y$. If there is only one independent variable, the model is called a simple linear regression model. If the model has more than one independent variable, it is called a multiple linear regression model.
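The interpretation of a partial slope can be checked numerically. The sketch below evaluates the systematic part of the model (intercept plus slopes times independent variables, without the error term) and shows that raising one independent variable by one unit changes the result by exactly that variable's slope. The function name `regression_function` and the coefficient values are hypothetical, chosen only for illustration.

```python
def regression_function(beta, x):
    """Return beta[0] + beta[1]*x[0] + ... + beta[p]*x[p-1].

    beta[0] is the intercept; beta[1:] are the partial slopes.
    The error term of the model is deliberately left out.
    """
    if len(beta) != len(x) + 1:
        raise ValueError("beta needs one more entry than x (the intercept)")
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


# Hypothetical coefficients: intercept 1.0 and two partial slopes.
beta = [1.0, 0.5, -2.0]

base = regression_function(beta, [4.0, 3.0])
bumped = regression_function(beta, [5.0, 3.0])  # first variable raised by one unit

# The change equals the first partial slope, beta[1].
print(bumped - base)  # 0.5
```

Holding the second variable at 3.0 in both calls is what makes the difference attributable to the first slope alone.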

Because the population parameters are unknown, linear regression analysis should start by estimating the regression coefficients from a representative sample of the population under study, using the estimators $\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_p$, which produce sample estimates of the population parameters $\beta_0, \beta_1, \dots, \beta_p$. The commonly used methods for estimating these coefficients are laborious to carry out by hand. Most statistical software includes extensive linear regression modules that perform the estimation, so these methods are not presented in detail in this book.
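To give a sense of what such software does internally, here is a minimal sketch for the simplest case of one independent variable. It assumes ordinary least squares as the estimation method, which the text above does not name, and the function name `fit_simple_ols` and the sample values are hypothetical.

```python
def fit_simple_ols(xs, ys):
    """Estimate intercept and slope of y = b0 + b1*x by ordinary least squares.

    Uses the closed-form solution: slope = Sxy / Sxx and
    intercept = mean(y) - slope * mean(x).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Invented, noise-free sample lying exactly on y = 2 + 3x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [5.0, 8.0, 11.0, 14.0]

b0, b1 = fit_simple_ols(xs, ys)
print(b0, b1)  # 2.0 3.0
```

With real data the points would not lie exactly on a line, and the estimates would differ from the true coefficients by sampling error. Multiple regression needs a matrix solution, which is the part that dedicated regression modules handle.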
