Panel Data: A Case Study Analysis

Vera Costa, Rui Portocarrero Sarmento
Copyright: © 2021 |Pages: 21
DOI: 10.4018/978-1-7998-3479-3.ch045


Panel data analysis is a type of regression analysis that uses both time data and spatial data. Thus, the behavior of groups, for example, enterprises or communities, is analyzed over a time scale. Panel data allows exploring variables that cannot be observed or measured, or variables that evolve over time but not across groups or communities. In this chapter, two different techniques used in panel data analysis are explored: fixed effects (FE) and random effects (RE). First, theoretical concepts of panel data are presented. Additionally, a case study example of the use of this type of regression is provided. Panel data analysis is performed with the R language, and a step-by-step approach is presented.
Chapter Preview


In statistical data analysis, panel data regression methods are commonly used when analyzing a dataset containing variables observed through time. Panel data research, also associated with longitudinal or cross-sectional time-series data (t=1,…,T), can be used in the study of varied types of entities, from companies or countries to individuals (i=1,…,N). From the perspective of data structure, spatial panel data models combine conventional cross-sectional and time-series data models, as represented by (Zhou & Yamaguchi, 2018):

Figure 1. Structure of panel data models


Panel data research can provide means to control for underlying variables that are not observed or measured. Thus, it accounts for individual characteristics. For example, when studying the evolution of several communities or groups in social media through time, it can account for differences in behavior across communities, or for variables that change over time but not across communities (e.g., global rules, agreements between communities, or social media platform rules).
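The point about unobserved, time-invariant characteristics can be illustrated with a minimal Python sketch. The data, the community labels, and the helper names (`slope`, `demean_by_group`) are hypothetical and chosen only for illustration. The sketch assumes a single explanatory variable and no error term, so that the effect of ignoring the community effect is easy to see.

```python
from collections import defaultdict

# Hypothetical data: rows are (community, period, x, y).
# y = 2 * x + a_i, where the community effect a_i (0, 5, 10) is NOT recorded.
rows = [
    ("A", 1, 1.0, 2.0), ("A", 2, 2.0, 4.0), ("A", 3, 3.0, 6.0),
    ("B", 1, 3.0, 11.0), ("B", 2, 4.0, 13.0), ("B", 3, 5.0, 15.0),
    ("C", 1, 5.0, 20.0), ("C", 2, 6.0, 22.0), ("C", 3, 7.0, 24.0),
]

def slope(xs, ys):
    """OLS slope of y on x for a single regressor with an intercept."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def demean_by_group(rows):
    """Subtract each community's own time average from x and y."""
    groups = defaultdict(list)
    for g, _, x, y in rows:
        groups[g].append((x, y))
    means = {
        g: (sum(x for x, _ in v) / len(v), sum(y for _, y in v) / len(v))
        for g, v in groups.items()
    }
    return [(x - means[g][0], y - means[g][1]) for g, _, x, y in rows]

# Pooled regression ignores the community effect and mixes it into the slope.
pooled_slope = slope([r[2] for r in rows], [r[3] for r in rows])

# Demeaning within each community removes anything constant over time.
within = demean_by_group(rows)
within_slope = slope([x for x, _ in within], [y for _, y in within])
```

In this constructed example, communities with a larger unrecorded effect also have larger x values, so the pooled slope overstates the true value of 2, while the within-community slope recovers it.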

This document focuses on two techniques used to analyze panel data:

  • Fixed effects

  • Random effects

Thus, the authors begin by introducing the reader to the background and state of the art regarding panel data. Then, the primary focus of the chapter is the introduction of the case study data and the results obtained. The results start with the model calibration for the linear regression. Additionally, the authors present and explain the case study results for four specifications (one-way or two-way, with fixed or random effects) and compare the final results.
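To make the one-way versus two-way distinction concrete, the following Python sketch compares the two transformations for the fixed effects case only; random effects are not covered here. It assumes a balanced panel, a single explanatory variable, and made-up data with no error term. The function names `within_transform` and `fe_slope` are hypothetical.

```python
# Hypothetical balanced panel with N = 3 entities and T = 3 periods.
x = [[1.0, 2.0, 4.0],
     [2.0, 5.0, 6.0],
     [3.0, 4.0, 9.0]]   # x[i][t]
a = [0.0, 5.0, 10.0]    # entity effects (unobserved in practice)
d = [0.0, 3.0, 6.0]     # period effects (unobserved in practice)
y = [[2.0 * x[i][t] + a[i] + d[t] for t in range(3)] for i in range(3)]

def within_transform(z, two_way):
    """Remove entity means; if two_way, also remove period means.

    The two-way formula used here is valid for balanced panels only.
    """
    n, periods = len(z), len(z[0])
    row = [sum(z[i]) / periods for i in range(n)]
    col = [sum(z[i][t] for i in range(n)) / n for t in range(periods)]
    grand = sum(row) / n
    if two_way:
        return [z[i][t] - row[i] - col[t] + grand
                for i in range(n) for t in range(periods)]
    return [z[i][t] - row[i] for i in range(n) for t in range(periods)]

def fe_slope(x, y, two_way):
    """Slope from regressing transformed y on transformed x.

    The transformed variables have mean zero, so no intercept is needed.
    """
    xt = within_transform(x, two_way)
    yt = within_transform(y, two_way)
    return sum(p * q for p, q in zip(xt, yt)) / sum(p * p for p in xt)

slope_one_way = fe_slope(x, y, two_way=False)  # controls for entities only
slope_two_way = fe_slope(x, y, two_way=True)   # controls for entities and periods
```

Because the made-up period effects grow over time together with x, the one-way slope lands above the true value of 2, whereas the two-way transformation removes both kinds of effect and recovers it.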



Historically, econometric and statistical models have been developed by using cross-sectional or time-series data (Washington, Karlaftis, & Mannering, 2003). However, in several cases, the available data are based on cross-sections of individuals (or other observational units such as firms, geographic entities, and so on) observed over time. Data that combine cross-sectional and time-series characteristics can be called panel data, pooled data, or longitudinal data (Dougherty, 2006).

Panel data can provide predictions on the evolution of a certain dependent variable according to other variables that are measured across distinct entities (cross-sectional) and time intervals (time series). Thus, it allows researchers to construct and test realistic behavioral models that cannot be identified using only cross-sectional or time-series data. Formally, a panel data set has the following form (Kunst, 2011):

$X_{it}, \quad i = 1, \dots, N, \quad t = 1, \dots, T.$

Panel data can be represented in a rectangular form, like a board. Dimension i is called the “individual dimension,” and t is the time dimension. X can be a scalar (real) variable or a vector-valued variable. Additionally, a general panel data regression model is written as (Hauser, 2013)

$y_{it} = \beta_0 + x_{it}'\beta + \epsilon_{it}$

where

$x_{it}$ is a K-dimensional vector of explanatory variables, without a constant term,

$\beta_0$, the intercept, is independent of i and t,

$\beta$, a (K×1) vector of slopes, is independent of i and t,

$\epsilon_{it}$, the error, varies over i and t.
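As a minimal illustration of this general model, the Python sketch below stores a small panel indexed by (i, t) and fits the intercept and a single slope (K = 1) by ordinary least squares, ignoring the panel structure. The data and the function name `pooled_ols` are hypothetical, and the response is generated without an error term so the fitted values are easy to check.

```python
# Hypothetical balanced panel: keys are (i, t), values are (x_it, y_it).
# Made-up data with y_it = 1.0 + 0.5 * x_it exactly (no error term).
panel = {
    (i, t): (x, 1.0 + 0.5 * x)
    for i, t, x in [
        (1, 1, 2.0), (1, 2, 4.0), (1, 3, 6.0),
        (2, 1, 1.0), (2, 2, 3.0), (2, 3, 8.0),
    ]
}

def pooled_ols(panel):
    """Fit y_it = b0 + b * x_it by OLS over all (i, t) observations."""
    xs = [x for x, _ in panel.values()]
    ys = [y for _, y in panel.values()]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope, common to all i and t
    b0 = y_bar - b * x_bar   # intercept, common to all i and t
    return b0, b

b0, b = pooled_ols(panel)
```

Here the intercept and slope are the same for every individual and period, matching the statement that both are independent of i and t.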

Individual characteristics that do not vary over time, $z_i$, may also be included in the panel data regression model, where $z_i$ is a K-dimensional vector of time-invariant individual characteristics.
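A common way to write the model with time-invariant characteristics is sketched below. The coefficient vector $\gamma$ on $z_i$ and the symbol $y_{it}$ for the dependent variable are notation assumed here; the intercept, slopes, and error term follow the definitions given above.

```latex
y_{it} = \beta_0 + x_{it}'\beta + z_i'\gamma + \epsilon_{it},
\qquad i = 1, \dots, N, \quad t = 1, \dots, T
```

Because $z_i$ carries no time index, its contribution $z_i'\gamma$ is the same in every period for a given individual.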

Key Terms in this Chapter

Panel Data: Data derived from a small number of observations over time on a large number of cross-sectional units (for example, individuals, households, firms, or governments).

Fixed Effects: Variables that are constant across individuals, i.e., do not change, or change at a constant rate over time (for example, age, sex, or ethnicity).

One-Way Error Model: A model that considers only one group of the data (for example, year or country).

OLS Regression: Commonly called linear regression. The OLS method corresponds to minimizing the sum of squared differences between the observed and predicted values.

Random Effects: Variables that are random and unpredictable (for example, the cost of a new car varies when purchased in different years).

Two-Way Error Model: A model that considers two groups of the data (for example, year and country).
