1. Introduction
A brain-computer interface (BCI) is a hardware and software system that enables communication with a computer or external device through brain activity (Nicolas-Alonso and Gomez-Gil, 2012). A BCI system provides a new method of human-computer interaction that does not depend on the conventional pathways of peripheral nerves and muscles; examples include the mental typewriter presented by the Berlin BCI (Blankertz et al., 2006) and an approach based on recognizing sign language from imagination (Al Qattan and Sepulveda, 2017). BCI is also a promising technology for communication with patients in a completely locked-in state (CLIS) (Nicolas-Alonso and Gomez-Gil, 2012; Hinterberger et al., 2003), as well as for interaction with video games and virtual environments (Lécuyer et al., 2008). In recent years, BCI research has also made significant progress in speech imagery analysis (Chengaiyan and Anandhan, 2015) and attention detection (Subramanian et al., 2017).
A key phenomenon observed across different recording modalities is that neurophysiological rhythmic activities recorded over the sensorimotor cortex, mostly in the alpha (7-13 Hz), mu (8-12 Hz), beta (14-30 Hz) and gamma (>30 Hz) frequency bands, are modulated by actual movement, motor intention, or motor imagery. Such rhythmic brain activities, measured over the sensorimotor cortex by EEG or other recording modalities, are collectively referred to as sensorimotor rhythms (SMRs) (Yuan and He, 2014). Motor imagery (MI) EEG signals are a type of EEG signal containing SMRs, and they are among the most popular EEG signals used as control signals in non-invasive BCI studies. Different MI tasks cause different oscillatory activities in the sensorimotor cortex, observed in the form of event-related desynchronization (ERD) or event-related synchronization (ERS) (Pfurtscheller and Da Silva, 1999).
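ERD and ERS are commonly quantified as the percentage change in band power during a task interval relative to a reference (baseline) interval. The following is a minimal sketch of that classical definition, not a procedure taken from this paper; the function name `erd_percent` and its arguments are hypothetical.

```python
def erd_percent(reference_power, activity_power):
    """Relative band-power change in percent.

    Negative values indicate a power decrease (ERD),
    positive values a power increase (ERS).
    Both inputs are assumed to be band powers in the same units.
    """
    reference_power = float(reference_power)
    return (activity_power - reference_power) / reference_power * 100.0


# Mu-band power drops from 4.0 to 3.0 during imagery: a 25 % decrease (ERD).
print(erd_percent(4.0, 3.0))  # -25.0
```

In practice the two power values would be averaged over trials within a chosen frequency band before the ratio is taken.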
Nevertheless, compared with traditional interaction modes such as the keyboard and mouse, the precision and reliability of BCI systems are still insufficient, since performance varies greatly across subjects and even within a single subject (Ahn and Jun, 2015). EEG signals are often distorted by artifacts such as electromyography (EMG) or electrooculography (EOG) activity (Nicolas-Alonso and Gomez-Gil, 2012). Moreover, in most cases the EEG signal is a non-stationary process. Therefore, accurately decoding mental activities from EEG signals remains a significant challenge.
In recent years, as the amount of available data has grown, deep learning algorithms have been shown to provide better classification performance in fields such as shape recognition and speech recognition. In BCI research, An et al. (2018) construct a deep belief net (DBN) for MI-EEG signal classification, and their results are better than those obtained with SVM. Yang et al. (2015) employ convolutional neural networks (CNN) to classify the augmented common spatial pattern (ACSP) features of MI-EEG signals. More recently, Tabar and Halici (2016) investigate convolutional neural networks and stacked autoencoders (SAE), which achieve better classification performance than state-of-the-art approaches for left/right hand MI-EEG signal classification.
Inspired by the success of the CNN-SAE proposed by Tabar and Halici (2016), we extend their CNN-based work on the classification of motor imagery EEG signals, with the aim of reducing the number of network parameters and improving performance. The proposed approach is analyzed and evaluated on BCI Competition IV dataset 2b (Leeb et al., 2008).
In this paper, we introduce a method for preparing the CNN input data that combines time, frequency and location information. First, the original signals recorded from the C3, Cz and C4 electrodes are preprocessed to extract the useful information in the MI-EEG signals. Then the short-time Fourier transform (STFT) is applied to convert the preprocessed data from each electrode into a two-dimensional feature matrix. The portions of each electrode's feature matrix that correspond to the relevant frequency bands (i.e., the mu band and the beta band) are extracted and combined into a single matrix, which constitutes the input data of the CNN.
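The input construction described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the 250 Hz sampling rate, the STFT window length, overlap and FFT size, the use of the magnitude spectrum, and the vertical stacking order are all assumptions, and the function name `trial_to_image` is hypothetical. The band edges reuse the mu (8-12 Hz) and beta (14-30 Hz) ranges given earlier in the text, and the preprocessing step is omitted.

```python
import numpy as np
from scipy.signal import stft

FS = 250  # assumed sampling rate in Hz
# Band edges taken from the ranges quoted in the text.
BANDS = {"mu": (8.0, 12.0), "beta": (14.0, 30.0)}


def trial_to_image(trial, fs=FS, nperseg=64, noverlap=50, nfft=256):
    """Build one 2-D input matrix from a multi-channel trial.

    trial: array of shape (n_channels, n_samples), e.g. rows for C3, Cz, C4.
    Returns an array of shape (n_channels * n_band_rows, n_time_frames).
    Window parameters are illustrative assumptions.
    """
    blocks = []
    for channel in trial:
        # STFT of one electrode: frequency bins x time frames.
        freqs, _, spectrum = stft(
            channel, fs=fs, nperseg=nperseg, noverlap=noverlap, nfft=nfft
        )
        magnitude = np.abs(spectrum)
        # Keep only the rows that fall inside the mu and beta bands.
        for low, high in BANDS.values():
            in_band = (freqs >= low) & (freqs <= high)
            blocks.append(magnitude[in_band, :])
    # Stack band blocks of all electrodes along the frequency axis.
    return np.vstack(blocks)


# Usage with synthetic data: 3 electrodes, 2 seconds at the assumed rate.
rng = np.random.default_rng(0)
example_trial = rng.standard_normal((3, 2 * FS))
image = trial_to_image(example_trial)
print(image.shape)
```

Stacking the electrodes along the frequency axis keeps the time axis shared across channels, so a convolution along that axis sees all three electrodes at the same instants. Other arrangements, such as one input channel per electrode, would be equally plausible.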