An Analysis on Multimodal Framework for Silent Speech Recognition

Ramkumar Narayanaswamy, Karthika Renuka D., Geetha S., Vidhyapriya R, Ashok Kumar L.
DOI: 10.4018/978-1-6684-3843-5.ch010

Abstract

A brain-computer interface (BCI) is a computer-based system that collects, analyses, and converts brain signals into commands sent to an output device to perform a desired action. BCI is used as an assistive and adaptive technology to track brain activity. A silent speech interface (SSI) is a system that enables speech communication when an acoustic signal is unavailable. An SSI creates a digital representation of speech by collecting sensor data from the human articulatory system, its neural pathways, or the brain. The data captured from any single stage is too limited for further processing on its own; by combining multiple modalities, a more complete representation of the speech production model can be developed. The goal is to detect speech tokens from speech imagery and create a language model. The proposal combines multiple modalities by taking inputs from various biosignal sensors. The main objective of the proposal is to develop a BCI-based end-to-end continuous speech recognition system.
Chapter Preview

Introduction

Speech is one of the most natural forms of human communication. A person suffering from a traumatic injury or a neurodegenerative disorder may be unable to communicate with others. Studies report that approximately 0.4% of the European population has a speech disorder, and approximately 40 million Americans suffer from communication disorders. Speech and language impairments have a large impact on human life, making every aspect of communication difficult. Furthermore, health-care professionals who interact with speech-disabled people are often not comfortable communicating with them. People with communication disorders often feel left out and become more stressed because they cannot communicate like others, which can lead to clinical depression.

The Brain-Computer Interface (BCI) is an emerging field in the health-care sector. BCI technologies establish a direct connection between the human brain and the outside world, eliminating the need for traditional channels of communication. BCI uses neural features to help people regain or improve their abilities. A speech BCI would enable real-time communication using brain correlates of attempted or imagined speech. Neural decoders, feature extraction, and brain recording models are all undergoing rapid improvement. Automatic Speech Recognition (ASR) and related sciences have been active research topics in recent years, and they provide the basis for the speech BCI.

Silent Speech Interfaces (SSIs) are an alternative to traditional acoustic-based speech interfaces. The idea behind an SSI is to support speech activity when no intelligible acoustic signal is available. In recent years, augmentative and assistive communication has emerged, and the Silent Speech Interface is one of the assistive devices used to restore oral communication. Lip reading is one of the most familiar methods of recognizing silent speech. A variety of other devices are available to capture speech-related biosignals, among them surface electromyography (sEMG) and electroencephalography (EEG). The sEMG sensor uses electrodes to capture the electrical activity of the facial muscles, while the EEG records neural activity in the anatomical regions of the brain involved in speech production. Because they enable voice communication without relying on the acoustic signal, SSIs are fundamentally a new way to restore communication capabilities to people with speech problems.

SSIs are a form of assistive technology that helps people regain the ability to communicate verbally. Acoustic signals are combined with the biosignals produced by various organs during speech production; these biosignals arise from the chemical, electrical, physical, and biological processes that occur as speech is produced.

Figure 1. Silent Speech Interface

In Figure 1 above, speech-related data is collected from various sources such as brain, muscle, acoustic, and articulatory activity. The first step is data collection; feature extraction is then performed to remove artifacts; the resulting features are fed into deep neural networks, and the output is obtained in the form of speech synthesis and speech recognition.
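The pipeline above can be sketched in code. The following is a minimal, hypothetical illustration only: the function names, window shapes, frequency bands, and the single softmax layer standing in for the deep network are all assumptions for demonstration, not the authors' implementation.

```python
import numpy as np

def extract_features(window, fs=256):
    """Reduce a (channels x samples) biosignal window to band-power features.

    Band-power over canonical EEG bands is one common feature choice;
    the bands used here (theta, alpha, beta) are illustrative.
    """
    spectrum = np.abs(np.fft.rfft(window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(window.shape[1], d=1.0 / fs)
    bands = [(4, 8), (8, 13), (13, 30)]  # theta, alpha, beta (Hz)
    feats = [spectrum[:, (freqs >= lo) & (freqs < hi)].mean(axis=1)
             for lo, hi in bands]
    return np.concatenate(feats)  # one feature vector per window

def score_tokens(features, weights, bias):
    """A single softmax layer standing in for the deep-network stage.

    In a real system this would be a trained deep neural network mapping
    features to speech tokens; here the weights are random placeholders.
    """
    logits = features @ weights + bias
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()  # probability over a small token vocabulary

# Simulated data-collection stage: 8 channels, 1 second at 256 Hz.
rng = np.random.default_rng(0)
window = rng.standard_normal((8, 256))
feats = extract_features(window)                        # 3 bands x 8 channels = 24
probs = score_tokens(feats, rng.standard_normal((24, 5)), np.zeros(5))
```

The sketch mirrors the figure's stages: raw multichannel signal in, features out, and a probability distribution over candidate speech tokens as the recognition output.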

The next section reviews the literature on state-of-the-art methods for the Silent Speech Interface using different modalities such as EEG, EMG, and lip movement.
