Improving Similarity Search in Time Series Using Wavelets

Ioannis Liabotis, Babis Theodoulidis, Mohamad Saraaee
DOI: 10.4018/978-1-59904-951-9.ch064


Sequences make up a large portion of the data stored in databases, and data mining applications must be able to process similarity queries over large volumes of time series data, so query processing performance is an important consideration. This article proposes a similarity retrieval algorithm for time series. The approach uses the wavelet transform to reduce the dimensionality of the time series, and the transformed series are indexed with an X-Tree, a spatial index structure that can efficiently index high-dimensional data. The article shows that this technique outperforms the use of the Fourier transform, since the wavelet transform provides a better approximation of the time series. The experiments indicate that optimum performance is obtained using 16 to 20 wavelet coefficients. Furthermore, a novel mechanism for reducing the cost of false alarm removal is proposed: the approximation coefficients of the penultimate level of the decomposition tree are stored and used to compute a Euclidean distance between the two sequences, which further reduces the number of false alarms before the actual Euclidean distance is calculated on the complete time series. The article concludes with a detailed performance evaluation of the proposed similarity retrieval algorithm using data from the Greek stock market and temperature measurements from Athens. The comparison is made against techniques that use the Haar transform and the R*-Tree, and the proposed algorithm is shown to outperform them.
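
The filter-and-refine idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it makes several assumptions: the orthonormal Haar wavelet is used for simplicity (the abstract does not name the wavelet used by the proposed method), a linear scan stands in for the X-Tree, series lengths are powers of two, and the names `haar_transform`, `euclidean`, `range_query`, `k_index` and `k_refine` are hypothetical. Because the transform is orthonormal, the distance computed on any prefix of the coefficients never exceeds the true Euclidean distance, so filtering on a few coefficients can produce false alarms but no false dismissals. The intermediate check on a larger prefix plays the role of the stored coarser-level approximation coefficients.

```python
import math


def haar_transform(series):
    """Orthonormal Haar decomposition of a series whose length is a power of two.

    Returns the overall approximation coefficient followed by the detail
    coefficients, ordered from the coarsest level to the finest.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    approx = list(series)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        level_details = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        # Prepend so that coarser levels end up first in the output.
        details = level_details + details
    return approx + details


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def range_query(database, query, eps, k_index=16, k_refine=32):
    """Return the names of all series within distance eps of the query.

    database maps a name to a raw series. k_index and k_refine are
    assumed, illustrative values: k_index mimics the few coefficients
    kept in the index, k_refine the larger stored set used to discard
    more false alarms before the raw series are compared.
    """
    q = haar_transform(query)
    matches = []
    for name, series in database.items():
        # In a real system the coefficients would be precomputed and stored.
        c = haar_transform(series)
        # Stage 1: cheap lower bound on the indexed coefficients.
        if euclidean(c[:k_index], q[:k_index]) > eps:
            continue
        # Stage 2: tighter lower bound on the larger coefficient prefix.
        if euclidean(c[:k_refine], q[:k_refine]) > eps:
            continue
        # Stage 3: exact distance on the complete series.
        if euclidean(series, query) <= eps:
            matches.append(name)
    return matches
```

With the coefficient ordering used here, a prefix whose length is a power of two spans the same subspace as the approximation coefficients of the corresponding decomposition level, so the stage 2 distance equals the distance between those approximation coefficients. Calling `range_query(db, q, eps)` returns the same result as a brute-force scan with `euclidean`, up to floating-point rounding at the threshold; the two early exits only avoid full-length comparisons.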
