Clustering and Visualization of Multivariate Time Series


Alfredo Vellido (Universidad Politécnica de Cataluña, Spain) and Iván Olier (Universidad Politécnica de Cataluña, Spain)
DOI: 10.4018/978-1-60566-766-9.ch008


The exploratory investigation of multivariate time series (MTS) may become extremely difficult, if not impossible, for high-dimensional datasets. Paradoxically, to date, little research has been conducted on the exploration of MTS through unsupervised clustering and visualization. In this chapter, the authors describe generative topographic mapping through time (GTM-TT), a model with foundations in probability theory that performs such tasks. The standard version of this model has several limitations that restrict its applicability. Here, the authors reformulate it within a Bayesian approach using variational techniques. The resulting variational Bayesian GTM-TT, described in some detail, is shown to behave very robustly in the presence of noise in the MTS, helping to avert the problem of data overfitting.
Chapter Preview


The analysis of MTS is an established research area, and methods to carry it out have stemmed both from traditional statistics and from the Machine Learning and Computational Intelligence fields. In this chapter, we are mostly interested in the latter, but considering a mixed approach that can be ascribed to Statistical Machine Learning.

MTS are often analyzed for prediction and forecasting and, therefore, the problem is considered to be supervised. In comparison, little research has been conducted on the problem of unsupervised clustering for the exploration of the dynamics of multivariate time series (Liao, 2005). It is sensible to assume that, in many problems involving MTS, the states of a process may be reproduced or revisited over time; therefore, clustering structure is likely to be found in the series. Furthermore, for exploratory purposes, it would be useful to visualize the way these series evolve from one cluster or region of clusters to another over time, as this could provide intuitive visual cues for forecasting as well as for the distinction between mostly stable states, smooth dynamic regime transitions, and abrupt changes of signal regime.
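The intuition that revisited process states induce clustering structure can be illustrated with a toy example (this is our own minimal sketch, not the chapter's GTM-TT; the synthetic data, the plain k-means routine, and all sizes are illustrative choices):

```python
# Toy illustration: a synthetic 3-dimensional time series that alternates
# between two underlying process states. Because the states are revisited
# over time, a simple clustering of the observations recovers them.
import numpy as np

rng = np.random.default_rng(0)

state_a = np.array([0.0, 0.0, 0.0])
state_b = np.array([5.0, 5.0, 5.0])
# The series visits A, then B, then returns to A, then to B again.
segments = [state_a, state_b, state_a, state_b]
series = np.vstack([s + 0.3 * rng.standard_normal((50, 3)) for s in segments])

def kmeans(X, init_idx, iters=20):
    """Plain k-means, initialised from the observations indexed by init_idx."""
    centres = X[list(init_idx)].copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centres[None]) ** 2).sum(-1), axis=1)
        centres = np.stack([X[labels == j].mean(axis=0)
                            for j in range(len(centres))])
    return labels, centres

labels, centres = kmeans(series, init_idx=(0, 60))
# Observations generated from the same underlying state share a cluster,
# even though they occur in temporally distant segments of the series.
```

Tracking which cluster each time step falls into, and in what order, is precisely the kind of information that the visualization methods discussed in this chapter aim to convey.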

It has been argued (Keogh and Lin, 2005) that, in some cases, time series clustering is a meaningless endeavour. It is generally agreed, though, that this might only apply to certain types of time series clustering methods, such as those resorting to subsequence clustering. Nevertheless, it has recently been shown (Simon, Lee & Verleysen, 2006) that, for univariate time series and with an adequate pre-processing based on embedding techniques, clustering using Self-Organizing Maps (SOM: Kohonen, 2001) can indeed be meaningful. This must be understood in the sense that the distributions of two different time series over the SOM map (or, in the case of the statistically principled models we present in this chapter, over the latent space of states) should be significantly different. In this chapter we analyze multivariate time series; here, equally, clustering is meaningful only if the distributions of two multivariate time series over the latent space of states clearly differ, which is indeed the case.
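The embedding-based pre-processing alluded to above can be sketched as follows (a minimal version of delay embedding in our own code, not the exact procedure of Simon, Lee & Verleysen, 2006; the series, dimension, and lag are illustrative):

```python
# Delay embedding: a univariate series is unfolded into vectors of
# lagged values, so that local temporal structure becomes a point in a
# higher-dimensional space that can then be clustered meaningfully.
import numpy as np

def delay_embed(x, dim, lag=1):
    """Rows are delay vectors [x_t, x_{t+lag}, ..., x_{t+(dim-1)*lag}]."""
    n = len(x) - (dim - 1) * lag
    return np.column_stack([x[i * lag: i * lag + n] for i in range(dim)])

x = np.sin(np.linspace(0, 8 * np.pi, 200))   # a simple periodic series
E = delay_embed(x, dim=3, lag=2)
# E has shape (196, 3); each row is one point in the embedding space.
```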

One of the most commonly used Machine Learning methods for the clustering of MTS is indeed SOM, in general without accounting for the violation of the independent identically distributed (i.i.d.) condition. Several extensions of SOM have been developed to explicitly accommodate time series through recurrent connectivity (Chappell & Taylor, 1993; Strickert & Hammer, 2005; Tiňo, Farkas & van Mourik, 2006; Voegtlin, 2002). The SOM was originally defined as a biologically inspired model but has long since moved towards general data analysis. Despite attempts to fit it into a probabilistic framework (e.g., Kostiainen & Lampinen, 2002; Yin & Allinson, 2001), it has mostly retained its heuristic definition, which is at the origin of some of its limitations. Generative Topographic Mapping (GTM: Bishop, Svensén & Williams, 1998a) is a nonlinear latent variable model that was originally devised as a probabilistic alternative to SOM, aiming to overcome the aforementioned limitations. Its probability theory foundations have enabled the definition of principled extensions for hierarchical structures (Tiňo & Nabney, 2002), missing data imputation (Carreira-Perpiñán, 2000; Vellido, 2006), adaptive regularization (Bishop, Svensén & Williams, 1998b; Vellido, El-Deredy & Lisboa, 2003), discrete data modelling (Bishop, Svensén & Williams, 1998b; Girolami, 2002), and robust outlier detection and handling (Bullen, Cornford & Nabney, 2003; Vellido & Lisboa, 2006), amongst others.
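To make the contrast with SOM's heuristic definition concrete, the generative side of standard GTM can be sketched in a few lines (a toy sketch of standard GTM, not the temporal GTM-TT variant developed in this chapter; all sizes are illustrative, the weights would normally be fitted by EM, and the usual bias term in the basis is omitted for brevity):

```python
# Standard GTM in outline: points on a latent grid are mapped through RBF
# basis functions into data space, where they act as centres of an
# isotropic Gaussian mixture. Each data point then receives posterior
# "responsibilities" over the grid, supporting clustering and visualization.
import numpy as np

rng = np.random.default_rng(1)

K, M, D = 25, 9, 2                    # grid points, basis functions, data dims
Z = np.linspace(-1, 1, K)[:, None]    # 1-D latent grid, for brevity
mu = np.linspace(-1, 1, M)[:, None]   # RBF centres in latent space
Phi = np.exp(-((Z - mu.T) ** 2) / (2 * 0.3 ** 2))   # K x M design matrix

W = 0.1 * rng.standard_normal((M, D))  # mapping weights (fitted by EM in practice)
beta = 10.0                            # noise precision of the Gaussian components

Y = Phi @ W                            # K x D mixture centres in data space
X = rng.standard_normal((40, D))       # some data points

# Responsibilities: posterior over the latent grid points for each x_n,
# computed in the log domain for numerical stability.
sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)   # N x K squared distances
log_r = -0.5 * beta * sq
log_r -= log_r.max(axis=1, keepdims=True)
R = np.exp(log_r)
R /= R.sum(axis=1, keepdims=True)      # each row is a distribution over the grid
```

Because the centres are constrained to lie on the image of a low-dimensional grid, plotting each point's responsibility-weighted position on that grid yields the kind of latent-space visualization this chapter builds on.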

Key Terms in this Chapter

Latent Variable Model: A statistical or machine learning model that relates a set of observable variables, residing in the observed data space, to a set of latent variables.

Multivariate Time Series: In statistics, signal processing, and many other fields, a multivariate time series is a set of sequences of data points, typically measured at successive, often uniformly spaced, points in time.

Clustering: The process of assigning individual data items to groups (called clusters) so that items from the same cluster are more similar to each other than items from different clusters. Similarity is often assessed according to a distance measure.

Data Visualization: The visual representation of data, aiming to convey as much information as possible through visual means.

Nonlinear Dimensionality Reduction: The process of representing high-dimensional data in lower-dimensional spaces through mapping, projection, feature selection, and other methods.

Variational Bayesian Methods: A family of techniques for approximating the intractable integrals that arise in Bayesian statistics and machine learning. They can be used to lower-bound the marginal likelihood of a model with a view to performing model selection, and often provide an analytical approximation to the posterior distribution over the parameters, which is useful for prediction.
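For concreteness, the lower bound referred to in this definition takes the standard form (for a model with observed data X, hidden variables Z, and variational posterior q):

```latex
\log p(X) \;\geq\; \mathbb{E}_{q(Z)}\!\left[\log p(X, Z)\right] \;-\; \mathbb{E}_{q(Z)}\!\left[\log q(Z)\right]
```

Maximizing the right-hand side with respect to q tightens the bound on the marginal likelihood, which is what makes it usable for model selection.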

Generative Topographic Mapping: A nonlinear model for dimensionality reduction. It belongs to the manifold learning family and performs simultaneous data clustering and visualization.
