On the Mining of Cointegrated Econometric Models

J.L. van Velsen (Dutch Ministry of Justice, The Netherlands) and R. Choenni (Dutch Ministry of Justice, The Netherlands)
DOI: 10.4018/978-1-60566-908-3.ch007

The authors describe a process for extracting a cointegrated model from a database. An important part of the process is a model generator that automatically searches for cointegrated models and ranks them according to an information criterion. They build and test a non-heuristic model generator that mines for common factor models, a special kind of cointegrated model. An outlook on potential future developments is given.
Chapter Preview

Introduction And Motivation

Research and development in data mining started with the extraction of association rules from vast amounts of data. Today, data mining has evolved in a wide variety of directions, ranging from complexity control of algorithms to the development of applications for many domains, such as counterterrorism, medical diagnosis, and marketing (Antonie, Zaïane & Coman, 2001; Bach, 2003; Banek, Min Tjoa & Stolba, 2006; Bhattacharyya, 1999; Choenni, 2000; Wang & Han, 2000). The extraction of econometric models, however, has received relatively little attention in the field of data mining.

An econometric model specifies the statistical relationship that is believed to hold between its variables. These models play a central role in many fields of research and are becoming increasingly important in forecasting tools. For example, in finance, stock prices may be expressed in terms of other stock prices and macro-economic variables, such as industrial production and interest rates (Cheung & Ng, 1998; Nasseh & Strauss, 2000; Pesaran & Timmermann, 2000). Another example, within government forecasting, is the modelling of recorded crime, which may be expressed in terms of demographic and macro-economic variables, such as the number of young males and unemployment (Deadman, 2003; Greenberg, 2001; Hale & Sabbagh, 1991). Two common econometric models are the linear regression model and the cointegrated model.

The parameters of an econometric model are estimated from historical data of the variables. This requires assumptions on how the variables evolve in time. In a linear regression model, all variables are assumed stationary. Loosely speaking, a variable is stationary if its statistical properties, such as its mean, do not depend on time. (A more precise definition of stationarity is given in the text.) If one or more variables are non-stationary, the regression can be spurious. A well-known example is the regression of two independent integrated variables: based on statistical significance testing, there seems to be a strong relation between the variables, while, in fact, they are independent (Granger & Newbold, 1974). (An integrated variable is a special kind of non-stationary variable; its definition is given in the text.)
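
The spurious regression effect can be reproduced with a small simulation. The following is a minimal sketch, not taken from the chapter: it assumes Gaussian random walks as the integrated variables, a sample length of 100, and a nominal 5% two-sided t-test on the slope; the function names `slope_t_stat` and `rejection_rate` are hypothetical.

```python
import numpy as np


def slope_t_stat(y, x):
    """t-statistic of the slope b in the OLS regression y = a + b*x + e."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)      # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)      # covariance of the estimates
    return beta[1] / np.sqrt(cov[1, 1])


def rejection_rate(integrated, n_sims=500, n_obs=100, seed=0):
    """Share of simulations in which the slope looks 'significant' (|t| > 1.96).

    The two series are always generated independently of each other.
    If integrated is True they are random walks (cumulative sums of noise);
    otherwise they are plain white noise, which is stationary.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        e1 = rng.standard_normal(n_obs)
        e2 = rng.standard_normal(n_obs)
        x = np.cumsum(e1) if integrated else e1
        y = np.cumsum(e2) if integrated else e2
        if abs(slope_t_stat(y, x)) > 1.96:
            rejections += 1
    return rejections / n_sims


rate_integrated = rejection_rate(integrated=True)
rate_stationary = rejection_rate(integrated=False)
print(f"independent random walks: {rate_integrated:.2f}")
print(f"independent white noise:  {rate_stationary:.2f}")
```

For the stationary series the share of apparently significant slopes should stay close to the nominal 5%, whereas for the independent random walks it should be far higher, which is the pattern described above.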

Today, model generators exist that search for the best linear regression model out of a large group of candidate models. For example, given historical data of a stationary response variable and of a collection of N stationary candidate predictor variables, the program PcGets (Krolzig & Hendry, 2001) heuristically searches for the best model through the set of 2^N - 1 candidate models. This generator employs a form of backwards regression: Starting with a large subset of the set of N candidate predictor variables and based on significance testing and information criteria, candidate variables are dropped during iteration steps.
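
The size of the candidate set follows from counting the non-empty subsets of the N predictors, which gives 2^N - 1 models. The sketch below enumerates them exhaustively for a small N to make the count concrete; it is only an illustration of the search space, not the heuristic used by PcGets, and the predictor names are placeholders.

```python
from itertools import combinations


def candidate_models(predictors):
    """Yield every non-empty subset of the candidate predictor variables."""
    for size in range(1, len(predictors) + 1):
        for subset in combinations(predictors, size):
            yield subset


predictors = ["x1", "x2", "x3", "x4"]   # placeholder variable names
models = list(candidate_models(predictors))

# With N = 4 predictors there are 2**4 - 1 = 15 candidate models.
print(len(models), 2 ** len(predictors) - 1)
```

Because the count doubles with every additional predictor, exhaustive enumeration quickly becomes expensive, which is one reason a generator may resort to a heuristic search.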

Our objective is the mining of cointegrated models. In this case, the variables evolve in time in such a way that they are all integrated, but linear combinations of the variables exist that are stationary. Cointegrated models are very common in econometric modelling. For a detailed example with two data sets, see Johansen (1995). Because cointegrated models cannot be found with a generator that operates under the assumptions of a linear regression model (stationary variables), the mining of cointegrated models is an interesting new development in the mining of econometric models.
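
The defining property of cointegration can be illustrated with simulated data. The sketch below is an assumption-laden toy example, not the authors' model: two series share one random-walk trend (with assumed loadings 1 and 2) plus independent Gaussian noise, so each series is integrated while the combination y - 2x is stationary. The helper `lag1_autocorr` is hypothetical.

```python
import numpy as np


def lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a series."""
    centered = series - series.mean()
    return (centered[1:] @ centered[:-1]) / (centered @ centered)


rng = np.random.default_rng(1)
n_obs = 2000

# One shared stochastic trend: a random walk (an integrated variable).
trend = np.cumsum(rng.standard_normal(n_obs))

# Both variables load on the same trend, so both are integrated.
x = trend + rng.standard_normal(n_obs)
y = 2.0 * trend + rng.standard_normal(n_obs)

# The trend cancels in this linear combination, leaving only stationary noise.
z = y - 2.0 * x

print(f"lag-1 autocorrelation of x: {lag1_autocorr(x):.3f}")
print(f"lag-1 autocorrelation of y: {lag1_autocorr(y):.3f}")
print(f"lag-1 autocorrelation of z: {lag1_autocorr(z):.3f}")
```

The levels x and y should show autocorrelation close to one, as expected for integrated series, while the combination z should show autocorrelation close to zero. In practice the combination weights are unknown and must be estimated and tested, which is the task a cointegrated model generator addresses.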
