Introduction to Linear Regression

DOI: 10.4018/978-1-68318-016-6.ch006

Abstract

In statistical modelling, regression analysis is a process for estimating the relationships among variables. More specifically, regression analysis helps one understand how the dependent variable changes when any one of the independent variables is varied while the others are held fixed. Most commonly, regression analysis estimates the average value of the dependent variable when the independent variables are fixed. The estimation target is therefore a function of the independent variables, called the regression function. In limited circumstances, regression analysis can be used to infer causal relationships between the independent and dependent variables. Caution is needed, however, since correlation does not imply causation. Regression analysis techniques are varied; in this chapter, we present only the fundamental analysis.

R Vs. Python

To build a linear regression model, the variability of the Publications variable was studied in relation to the Gender, R_user, Python_user, and Age variables.

Linear Regression Model

In the linear regression model, the functional relationship between the dependent variable $Y$ and the independent variables $X_i$ ($i = 1, \dots, p$) is (Maroco, 2011):

$$Y_j = \beta_0 + \beta_1 X_{1j} + \beta_2 X_{2j} + \dots + \beta_p X_{pj} + \varepsilon_j, \quad j = 1, \dots, n$$

In this model, $\beta_i$ are the regression coefficients and $\varepsilon_j$ are the errors or residuals of the model. $\beta_0$ is the y-intercept, and $\beta_i$ ($i = 1, \dots, p$) represent the partial slopes (i.e., a measure of the influence of $X_i$ on $Y$, that is, the variation in $Y$ per unit of variation in $X_i$). The term $\varepsilon_j$ reflects the measurement errors and the natural variation in $Y$. If there is only one independent variable, the model is called a simple linear regression model. If the model has more than one independent variable, it is called a multiple linear regression model.
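The roles of the y-intercept, the partial slopes, and the error term can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the coefficient values, and the choice of normally distributed errors are all hypothetical and do not come from the chapter.

```python
import random


def regression_function(x, beta0, betas):
    """Mean of Y for one observation x = (x_1, ..., x_p).

    Computes beta0 + beta_1 * x_1 + ... + beta_p * x_p (no error term).
    """
    if len(x) != len(betas):
        raise ValueError("x and betas must have the same length")
    return beta0 + sum(b * xi for b, xi in zip(betas, x))


def simulate_observation(x, beta0, betas, sigma, rng):
    """One draw of Y: the regression function plus a random error.

    Assumption for illustration only: the error is Normal(0, sigma^2).
    """
    return regression_function(x, beta0, betas) + rng.gauss(0.0, sigma)


# Hypothetical coefficients for a model with p = 2 independent variables.
beta0, betas = 1.0, [2.0, -0.5]

# Increase the first variable by one unit while holding the second fixed.
mean_a = regression_function([1.0, 1.0], beta0, betas)
mean_b = regression_function([2.0, 1.0], beta0, betas)

# The change in the mean of Y equals the first partial slope, betas[0].
partial_slope_effect = mean_b - mean_a

# A single noisy observation around mean_a (seeded for reproducibility).
y_observed = simulate_observation([1.0, 1.0], beta0, betas, sigma=0.3,
                                  rng=random.Random(0))
```

Here `partial_slope_effect` equals the first partial slope, which is what "variation in the dependent variable per unit of variation in the independent variable" means in practice; `y_observed` differs from `mean_a` only by the simulated error.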

Since the population is generally not fully known, linear regression analysis should start with the estimation of the regression coefficients from a representative sample of the population under study, using the estimators

$$\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_p$$

producing sample estimates $b_0, b_1, \dots, b_p$ of the population's parameters $\beta_0, \beta_1, \dots, \beta_p$. The methods commonly used to estimate these coefficients are laborious to apply by hand. Most statistical software includes extensive modules for linear regression, which removes the need to compute the estimates manually; a detailed presentation of these methods is therefore not given in this book.
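To give a sense of what such software computes, the following Python sketch estimates the coefficients of a simple linear regression from a sample. The text does not name the estimation method; the sketch assumes ordinary least squares, and the function name and the sample values are hypothetical, chosen only so that the result is easy to check by hand.

```python
def fit_simple_linear_regression(x, y):
    """Return (b0, b1), the least-squares estimates for y = b0 + b1 * x + error.

    Assumption: ordinary least squares with a single independent variable.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sum of squared deviations of x, and sum of cross-products of x and y.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must not be constant")
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = s_xy / s_xx           # estimated slope
    b0 = y_bar - b1 * x_bar    # estimated y-intercept
    return b0, b1


# Made-up sample lying exactly on the line y = 1 + 2x (illustration only).
x_sample = [0.0, 1.0, 2.0, 3.0]
y_sample = [1.0, 3.0, 5.0, 7.0]
b0, b1 = fit_simple_linear_regression(x_sample, y_sample)
```

Because the made-up points lie exactly on a line, the estimates recover its intercept and slope. With real data the points scatter around the fitted line, and with several independent variables the same idea is usually solved in matrix form by a library routine rather than by hand.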
