An Approach for ECG Characterization and Classification Using the Combination of Wavelet Transform and Decision Tree Methods


Faiza Charfi (National Engineering School of Sfax, Tunisia) and Ali Kraiem (National Engineering School of Sfax, Tunisia)
DOI: 10.4018/ijsbbt.2012070103


A new automated approach for characterizing and classifying Electrocardiogram (ECG) arrhythmias, combining the wavelet transform with decision tree classification, is presented. The approach is based on two key steps. In the first step, the authors use the wavelet transform to extract the wavelet coefficients of the ECG signals as initial features, then apply a combination of Principal Component Analysis (PCA) and Fast Independent Component Analysis (FastICA) to transform these initial features into new features that are uncorrelated and mutually independent. In the second step, they apply several decision tree methods currently in use, namely C4.5, Improved C4.5, CHAID (Chi-Square Automatic Interaction Detection), and Improved CHAID, to classify ECG signals taken from the MIT-BIH database, including normal subjects and subjects affected by arrhythmia. The authors' results suggest high reliability and high classification accuracy for the C4.5 algorithm with bootstrap aggregation.
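As a rough illustration of the first step only, the sketch below extracts wavelet coefficients from a short signal segment to use as a feature vector. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the wavelet family or decomposition depth, so the Haar wavelet, the `levels` parameter, and the function names `haar_dwt` and `wavelet_features` are all hypothetical choices made for simplicity. The PCA, FastICA, and decision tree stages are not shown.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar discrete wavelet transform.

    Returns (approximation, detail) coefficient lists, each half the
    length of the input. The Haar wavelet is an assumption here; the
    source does not say which wavelet the authors used.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale
              for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_features(beat, levels):
    """Multi-level Haar decomposition of one segment (hypothetical helper).

    Returns the final approximation coefficients followed by the detail
    coefficients, ordered from the coarsest level to the finest.
    """
    if len(beat) % (2 ** levels) != 0:
        raise ValueError("segment length must be divisible by 2**levels")
    approx = list(beat)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details = detail + details  # keep coarser levels first
    return approx + details


if __name__ == "__main__":
    # Made-up sample values, not data from the MIT-BIH database.
    segment = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
    print(wavelet_features(segment, levels=3))
```

Because this transform is orthonormal, the feature vector has the same length and the same energy (sum of squares) as the input segment, which gives a quick sanity check before the features are passed on to a dimensionality reduction step.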
Article Preview


In modern society, cardiovascular disease has been one of the leading causes of death for years (Leigang et al., 2010). The electrocardiogram (ECG) is a vital signal for the clinical diagnosis of heart diseases, and many algorithms exist to improve the accuracy of ECG beat classification (Yu & Chen, 2008). Analysis of the ECG signal is used extensively in the diagnosis of various pathologies. Atrial fibrillation (AF) is a major cause of morbidity and mortality in the elderly population. It is an irregular heartbeat and the most common type of cardiac arrhythmia, affecting between 2% and 10% of the population over 50 years of age (Langley, Rieta, Stridh, Millet, Sörnmo, & Murray, 2006). Right Bundle Branch Block (RBBB), by contrast, corresponds to the deterioration of atrio-ventricular conduction in the right side of the heart during the chronic phase of the disease. Hence, ECG interpretation and classification are important for cardiologists in deciding the diagnostic categories of cardiac problems.

Earlier ECG signal analysis and classification was based on the decision tree method, but this is not the only method in use; a frequency representation of the signal is also required. To obtain one, Okajima et al. (1990) studied the frequency components using the Fast Fourier Transform (FFT) and multiple narrow-band filtering in the frequency domain at various center frequencies, followed by the inverse FFT. They obtained time–frequency representations of the QRS complex, which differed between normal subjects and patients with Coronary Artery Disease (CAD). Forgione et al. (1992) showed a focal reduction of the high-frequency components of the QRS complex (150–250 Hz) under myocardial ischemia induced by Percutaneous Transluminal Coronary Angioplasty (PTCA). Many other techniques have been reported for the classification of ECG signals, in particular fuzzy and neural network algorithms (Ceylan et al., 2009), machine learning methods (Michie et al., 1994), statistical or pattern-recognition methods such as k-nearest neighbors and Bayesian classifiers (Aha et al., 1991; Dasarathy, 1990), and expert systems (Hu et al., 1997). This body of work also covers diagnostic problems in oncology, neuropsychology, and gynaecology (Lavrac, 1999). Güler et al. (2005) describe a combined neural network model to guide model selection for the classification of ECG beats.

However, exploratory data analysis techniques are limited in the amount of data they can process effectively. The ECG data collection might contain irrelevant or redundant features that negatively affect the accuracy of the classifier algorithm. The size and dimensionality of the data make it difficult to use the available information to identify features that discriminate between the classes of interest. Thus, feature selection is an important task in effective data mining, boosting the sensitivity, specificity, and accuracy of the classification process.
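The three measures named above have standard definitions in terms of a binary confusion matrix, and the following minimal sketch computes them. The function name `classification_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives for one class of interest
    (for example, "arrhythmia" versus "normal").
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of normal cases cleared
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

Reporting all three together matters when the classes are imbalanced, because accuracy alone can look high even when most abnormal beats are missed.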

In this study, we carry out an in-depth analysis of publicly available data from the MIT-BIH Arrhythmia database (Moody & Mark, 1991) in order to detect and identify some ECG abnormalities using data mining classification methods. One main goal is to establish, apply, and evaluate the use of data mining methods in differentiating between some clinical and pathological observations.
