Research on Letter and Word Frequency and Mathematical Modeling of Frequency Distributions in the Modern Bulgarian Language

Tihomir Trifonov (St. Cyril and St. Methodius University of Veliko Tarnovo, Bulgaria) and Tsvetanka Georgieva-Trifonova (St. Cyril and St. Methodius University of Veliko Tarnovo, Bulgaria)
DOI: 10.4018/978-1-4666-6252-0.ch007

Abstract

The purpose of this chapter is to present current research on the modern Bulgarian language, one of the oldest European languages. An information system for managing an electronic archive of Bulgarian-language texts is described; it supports processing of the collected text. Detailed and comprehensive studies of letter and word frequency in modern Bulgarian are carried out on varied sources (fiction, scientific and popular science literature, press, legal texts, government bulletins, etc.), and the results are presented. The index of coincidence is computed for the Bulgarian language as a whole and for the individual sources. The results can be used by different specialists, including computer scientists, linguists, and cryptanalysts. Furthermore, using mathematical modeling, the authors obtain models of the letter and word frequency distributions and estimate their standard deviations by document.
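As an illustration of the index of coincidence mentioned above, the following is a minimal sketch using the standard textbook definition, IC = Σ nᵢ(nᵢ − 1) / (N(N − 1)), where nᵢ is the count of the i-th letter and N is the total number of letters. The function name and the choice to count every alphabetic character are assumptions made for this example; the chapter's own normalization and preprocessing are not shown in this preview.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn without replacement are equal.

    Assumption: every alphabetic character counts as a letter and case
    is ignored; the chapter's exact preprocessing is not known here.
    """
    letters = [ch for ch in text.lower() if ch.isalpha()]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters")
    counts = Counter(letters)
    # Sum over letters of n_i * (n_i - 1), divided by N * (N - 1).
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
```

A text made of a single repeated letter gives 1.0, and a text in which no letter repeats gives 0.0; natural-language text falls between these extremes.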
Chapter Preview

Researches On The Frequencies Of The Letters And The Words In The Bulgarian Language

The frequencies of letters, bigrams, and trigrams, the frequencies of the first and last letters of words, the average word length, and the frequencies of words all reflect the way people use their language and determine unique characteristics of that language.
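The characteristics listed above can be gathered in a single pass over a text. The sketch below is only an illustration: the function name `text_statistics`, the restriction to the 30 letters of the modern Bulgarian alphabet, and the decision to count bigrams and trigrams inside words only (not across word boundaries) are assumptions for this example, not the authors' procedure.

```python
import re
from collections import Counter

# The 30 letters of the modern Bulgarian alphabet (lowercase).
BULGARIAN_LETTERS = "абвгдежзийклмнопрстуфхцчшщъьюя"
WORD_RE = re.compile(f"[{BULGARIAN_LETTERS}]+")


def text_statistics(text: str) -> dict:
    """Collect letter, n-gram, and word statistics for one text."""
    words = WORD_RE.findall(text.lower())
    if not words:
        raise ValueError("no Bulgarian words found")
    letters = Counter(ch for w in words for ch in w)
    total_letters = sum(letters.values())
    return {
        # Relative frequency of each letter.
        "letter_freq": {ch: n / total_letters for ch, n in letters.items()},
        # Assumption: n-grams are counted within words only.
        "bigrams": Counter(w[i:i + 2] for w in words for i in range(len(w) - 1)),
        "trigrams": Counter(w[i:i + 3] for w in words for i in range(len(w) - 2)),
        "first_letters": Counter(w[0] for w in words),
        "last_letters": Counter(w[-1] for w in words),
        "avg_word_length": total_letters / len(words),
        "word_counts": Counter(words),
    }
```

Calling `text_statistics(text)["word_counts"].most_common(10)` would list the ten most frequent words of a text; running the function per source (fiction, press, legal texts, and so on) gives one set of statistics for each.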
