Improvement of Speech Coding of Turkish Speech Data Using New Hybrid Genetic Algorithms

DOI: 10.4018/978-1-6684-7679-6.ch006

Abstract

Understanding the mechanism of speech generation is critical to successful coding of the speech signal. This study first describes what speech coding is, its purposes and areas of use, and how speech coding is classified by features and techniques. It then presents a speech coding study that aims to increase speech quality on Turkish voice data using a multi-objective, genetic-algorithm-based speech coding method. In the multi-objective genetic algorithm method, a parameter that optimizes coding gain is used to improve the quality of at least two lost audio data. By combining fuzzy logic with a genetic algorithm, the authors propose a new hybrid model that increases sound quality compared with other methods. The fitness function is used to obtain the solutions that provide the best match for frequency selection. The proposed system is also compared with the wavelet audio coding methods frequently used in audio coding, and the results show that the hybrid model was more successful.

Introduction

The sound enhancement process uses sound processing tools to increase the intelligibility and quality of distorted signals that are affected by noise for various reasons. The main purpose of voice enhancement is to improve the performance of voice communication systems in noisy environments (Li et al., 2022).

Voice and speech enhancement is an important and widely researched problem in applications such as:

  • Hearing aids,

  • Voice and speaker recognition,

  • Encoding of audio signals.

Sound enhancement is a very difficult problem for two reasons (Yuenyong et al., 2022):

  • First, the nature and characteristics of noise signals can change randomly depending on time and application.

  • Second, the performance measurement criteria are defined differently in each application.

Speech enhancement is used to increase the performance of:

  • Voice recognition systems,

  • Mobile radio communication systems,

  • Low-quality audio recordings,

  • Hearing aids.

The source factors that cause sound distortion are listed below:

  • White or colored noise,

  • A periodic signal such as reverberation, sound flow,

  • Hum noise,

  • A fading signal.

The speech signal can be affected by a single noise signal, or it can be affected by more than one noise source at the same time (Chuang, Wang, & Tsao, 2022).
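The case of several simultaneous noise sources can be illustrated with a minimal sketch. It is not the chapter's method: the test tone, the 50 Hz hum, the 10 dB target, and the helper names `signal_power`, `snr_db`, and `add_noise_at_snr` are all assumptions made for illustration. The sketch sums white noise and hum, then scales the combined noise so that the mixture has a chosen signal-to-noise ratio (SNR).

```python
import math
import random


def signal_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(clean, noise):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(signal_power(clean) / signal_power(noise))


def add_noise_at_snr(clean, noise, target_snr_db):
    """Scale `noise` to the target SNR and add it to `clean`.

    Returns the noisy signal and the scaled noise.
    """
    wanted_noise_power = signal_power(clean) / (10.0 ** (target_snr_db / 10.0))
    gain = math.sqrt(wanted_noise_power / signal_power(noise))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled


# Hypothetical test data: one second at 8 kHz (values are assumptions).
fs = 8000
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 200 * t / fs) for t in range(fs)]  # stand-in for speech
white = [rng.gauss(0.0, 1.0) for _ in range(fs)]                   # white noise
hum = [math.sin(2 * math.pi * 50 * t / fs) for t in range(fs)]     # hum noise

# Two noise sources acting at the same time.
combined = [w + h for w, h in zip(white, hum)]
noisy, scaled = add_noise_at_snr(clean, combined, target_snr_db=10.0)
achieved = snr_db(clean, scaled)
```

Mixtures built this way are a common way to test an enhancement system at controlled noise levels before trying it on real recordings.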

Noise is a concern in digital voice communication, automatic voice recognition systems, and human-machine interaction interfaces. For example, in hands-free phone calls in vehicles, the transmitted signal may be distorted by reverberation and background noise. Such systems work well in noise-free conditions, but their performance is adversely affected in noisy environments.

There are two criteria for measuring sound enhancement performance:

  • The first criterion is the quality of the enhanced audio signal. Quality covers the clarity of the sound, the degree of distortion in the speech, and the level of residual noise.

  • The second criterion is the intelligibility of the enhanced voice. This is the percentage of words understood and recognized by listeners, and it is a personal (subjective) measurement.
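As a rough illustration of the intelligibility criterion, the percentage of correctly recognized words can be computed from listener responses. The word list, the responses, and the function names below are hypothetical, and real listening tests follow more careful protocols than this position-by-position comparison.

```python
def intelligibility_percent(spoken, heard):
    """Percentage of spoken words that one listener reported correctly.

    Words are compared position by position, ignoring case.
    """
    if not spoken or len(spoken) != len(heard):
        raise ValueError("word lists must be non-empty and of equal length")
    correct = sum(1 for s, h in zip(spoken, heard) if s.lower() == h.lower())
    return 100.0 * correct / len(spoken)


def mean_intelligibility(spoken, responses):
    """Average the per-listener scores over all listeners."""
    scores = [intelligibility_percent(spoken, heard) for heard in responses]
    return sum(scores) / len(scores)


# Hypothetical listening test: four words played to two listeners.
spoken = ["ev", "su", "kar", "yol"]
responses = [
    ["ev", "su", "kar", "yol"],  # listener 1 recognized every word
    ["ev", "bu", "kar", "kol"],  # listener 2 misheard two words
]
score = mean_intelligibility(spoken, responses)
```

Because the score depends on who is listening, averaging over several listeners is the usual way to make it more stable.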

Many speech enhancement systems improve signal quality at the expense of the intelligibility of the speech. Speech enhancement is a preprocessing stage applied in most audio systems before the speech analysis stage, which is where feature extraction is done. Feature extraction can be performed directly on the speech waveform in the time domain or in the frequency domain. Different speech enhancement approaches exist for different problems.
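The difference between time-domain and frequency-domain feature extraction can be sketched on a single frame. The features chosen here (short-time energy, zero-crossing rate, and the dominant frequency from a naive discrete Fourier transform) are common examples and are not necessarily the ones used in this chapter. The frame length, sampling rate, and test tone are assumptions.

```python
import cmath
import math


def short_time_energy(frame):
    """Time-domain feature: mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Time-domain feature: fraction of adjacent sample pairs that change sign."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Frequency-domain view: DFT magnitudes for bins 0..N/2 (naive O(N^2))."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(frame, fs):
    """Frequency-domain feature: frequency of the strongest non-DC bin, in Hz."""
    spectrum = magnitude_spectrum(frame)
    peak_bin = max(range(1, len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * fs / len(frame)


# Hypothetical frame: 256 samples of a 500 Hz tone sampled at 8 kHz.
fs = 8000
frame = [math.sin(2 * math.pi * 500 * t / fs) for t in range(256)]

energy = short_time_energy(frame)
zcr = zero_crossing_rate(frame)
peak_hz = dominant_frequency(frame, fs)
```

In practice a fast Fourier transform from a numerical library would replace the naive transform, and features would be computed over a sequence of overlapping, windowed frames.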

Key Terms in this Chapter

Formant: The frequencies at which sound has the highest energy.

Speech: The converted form of analog audio data that can be processed by computers.

Coding: The compression of data into an audio format.
