Clustering and Visualization of Multivariate Time Series

Alfredo Vellido (Universidad Politécnica de Cataluña, Spain) and Iván Olier (Universidad Politécnica de Cataluña, Spain)
DOI: 10.4018/978-1-60566-766-9.ch008


The exploratory investigation of multivariate time series (MTS) may become extremely difficult, if not impossible, for high-dimensional datasets. Paradoxically, to date, little research has been conducted on the exploration of MTS through unsupervised clustering and visualization. In this chapter, the authors describe generative topographic mapping through time (GTM-TT), a model with foundations in probability theory that performs such tasks. The standard version of this model has several limitations that restrict its applicability. Here, the authors reformulate it within a Bayesian approach using variational techniques. The resulting variational Bayesian GTM-TT, described in some detail, is shown to behave very robustly in the presence of noise in the MTS, helping to avert the problem of data overfitting.
Chapter Preview


The analysis of MTS is an established research area, and methods to carry it out have stemmed both from traditional statistics and from the Machine Learning and Computational Intelligence fields. In this chapter, we are mostly interested in the latter, although we adopt a mixed approach that can be ascribed to Statistical Machine Learning.

MTS are often analyzed for prediction and forecasting and, therefore, the problem is considered to be supervised. In comparison, little research has been conducted on the problem of unsupervised clustering for the exploration of the dynamics of multivariate time series (Liao, 2005). It is sensible to assume that, in many problems involving MTS, the states of a process may be reproduced or revisited over time; therefore, clustering structure is likely to be found in the series. Furthermore, for exploratory purposes, it would be useful to visualize the way these series evolve from one cluster or region of clusters to another over time, as this could provide intuitive visual cues for forecasting as well as for the distinction between mostly stable states, smooth dynamic regime transitions, and abrupt changes of signal regime.

It has been argued (Keogh & Lin, 2005) that, in some cases, time series clustering is a meaningless endeavour. It is generally agreed, though, that this might only apply to certain types of time series clustering methods, such as those resorting to subsequence clustering. Indeed, it has recently been shown (Simon, Lee & Verleysen, 2006) that, for univariate time series and with an adequate pre-processing based on embedding techniques, clustering using Self-Organizing Maps (SOM: Kohonen, 2001) can be meaningful. This must be understood in the sense that the distributions of two different time series over the SOM (or, in the case of the statistically principled models we present in this chapter, over the latent space of states) should be significantly different. In this chapter we analyze multivariate time series but, equally, clustering would be meaningful only if the distributions of two multivariate time series over the latent space of states clearly differed (which is the case).
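To make the idea of embedding-based pre-processing concrete, the following is a minimal sketch of delay-coordinate embedding, one common embedding technique for univariate series. Whether it matches the exact pre-processing used in the cited work is an assumption; the function name `delay_embed` and its parameters `dim` and `lag` are hypothetical and chosen only for illustration.

```python
def delay_embed(series, dim, lag=1):
    """Delay-coordinate embedding of a univariate series (illustrative sketch).

    Row t of the result is
        (x[t], x[t + lag], ..., x[t + (dim - 1) * lag]),
    so each row is a point in a dim-dimensional reconstructed state space.
    The names `dim` and `lag` are assumptions, not taken from the chapter.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    values = list(series)
    # Number of complete delay vectors that fit inside the series.
    n_rows = len(values) - (dim - 1) * lag
    if n_rows <= 0:
        raise ValueError("series is too short for this dim and lag")
    return [[values[t + i * lag] for i in range(dim)] for t in range(n_rows)]


# Toy example: six samples embedded in three dimensions with a lag of two.
embedded = delay_embed([0, 1, 2, 3, 4, 5], dim=3, lag=2)
print(embedded)
```

Each row of `embedded` can then be treated as an input vector for a clustering model such as SOM, which is what allows the distributions of different series over the map to be compared.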

One of the most commonly used Machine Learning methods for the clustering of MTS is indeed SOM, although it is generally applied without accounting for the violation of the independent and identically distributed (i.i.d.) condition. Several extensions of SOM have been developed to explicitly accommodate time series through recurrent connectivity (Chappell & Taylor, 1993; Strickert & Hammer, 2005; Tiňo, Farkas & van Mourik, 2006; Voegtlin, 2002). The SOM was originally defined as a biologically inspired model but has long since veered towards general data analysis. Despite attempts to fit it into a probabilistic framework (e.g., Kostiainen & Lampinen, 2002; Yin & Allinson, 2001), it has mostly retained its heuristic definition, which is at the origin of some of its limitations. Generative Topographic Mapping (GTM: Bishop, Svensén & Williams, 1998a) is a nonlinear latent variable model that was originally devised as a probabilistic alternative to SOM, with the aim of overcoming the aforementioned limitations. Its probability theory foundations have enabled the definition of principled extensions for hierarchical structures (Tiňo & Nabney, 2002), missing data imputation (Carreira-Perpiñan, 2000; Vellido, 2006), adaptive regularization (Bishop, Svensén & Williams, 1998b; Vellido, El-Deredy & Lisboa, 2003), discrete data modelling (Bishop, Svensén & Williams, 1998b; Girolami, 2002), and robust outlier detection and handling (Bullen, Cornford & Nabney, 2003; Vellido & Lisboa, 2006), amongst others.
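For readers unfamiliar with GTM, its probabilistic definition can be sketched as follows. This follows the commonly stated standard formulation of GTM rather than anything given in this preview, so the notation is an assumption: u_k are K latent points on a regular grid, \phi is a vector of fixed basis functions, W is a weight matrix, \beta is an inverse noise variance, and D is the dimension of the data space.

```latex
% Nonlinear mapping from latent space to data space
\mathbf{y}(\mathbf{u}; \mathbf{W}) = \mathbf{W}\,\boldsymbol{\phi}(\mathbf{u})

% Isotropic Gaussian noise model centred on each mapped latent point
p(\mathbf{x} \mid \mathbf{u}, \mathbf{W}, \beta)
  = \left(\frac{\beta}{2\pi}\right)^{D/2}
    \exp\!\left\{ -\frac{\beta}{2}\,
      \bigl\| \mathbf{y}(\mathbf{u}; \mathbf{W}) - \mathbf{x} \bigr\|^{2} \right\}

% Constrained mixture obtained by summing over the K grid points
p(\mathbf{x} \mid \mathbf{W}, \beta)
  = \frac{1}{K} \sum_{k=1}^{K} p(\mathbf{x} \mid \mathbf{u}_k, \mathbf{W}, \beta)
```

Because the model defines a proper density over the data, quantities such as likelihoods and posterior responsibilities of latent points are available, which is what makes principled extensions of the kind listed above possible.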

Key Terms in this Chapter

Latent Variable Model: A statistical or machine learning model that relates a set of observable variables (residing in the observed data space) to a set of latent variables.

Multivariate Time Series: In statistics, signal processing, and many other fields, a multivariate time series is a set of sequences of data points, typically measured at successive times spaced at (often uniform) intervals.

Clustering: The process of assigning individual data items into groups (called clusters) so that items from the same cluster are more similar to each other than items from different clusters. Often similarity is assessed according to a distance measure.

Data Visualization: The visual representation of data, aiming to convey as much information as possible through visual processes.

Nonlinear Dimensionality Reduction: The process of representing high-dimensional data in lower-dimensional spaces through mapping, projection, feature selection and other methods.

Variational Bayesian Methods: A family of techniques for approximating intractable integrals arising in Bayesian statistics and machine learning. They can be used to lower-bound the marginal likelihood of several models with a view to performing model selection, and often provide an analytical approximation to the parameter posterior probability, which is useful for prediction.

Generative Topographic Mapping: A nonlinear model for dimensionality reduction. It belongs to the manifold learning family and performs simultaneous data clustering and visualization.
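
The lower bound mentioned in the Variational Bayesian Methods entry can be written out as a short worked identity. This is the generic textbook form, not the specific bound derived for GTM-TT in the chapter, and the symbols are assumptions: X denotes the observed data, \Theta all unknown parameters and latent variables, and q(\Theta) the approximating distribution.

```latex
% Decomposition of the log marginal likelihood
\ln p(\mathbf{X})
  = \mathcal{L}(q)
  + \mathrm{KL}\bigl( q(\Theta) \,\|\, p(\Theta \mid \mathbf{X}) \bigr)

% The variational lower bound
\mathcal{L}(q)
  = \int q(\Theta)\,
    \ln \frac{p(\mathbf{X}, \Theta)}{q(\Theta)} \, d\Theta

% The KL divergence is non-negative, hence
\ln p(\mathbf{X}) \;\geq\; \mathcal{L}(q)
```

Maximizing \mathcal{L}(q) over a tractable family of distributions therefore both tightens the bound on the marginal likelihood and drives q(\Theta) towards the true posterior.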
