Entropy, Chaos, and Language

Daniela Lopez De Luise
DOI: 10.4018/978-1-7998-8686-0.ch013

Abstract

Natural language is a rich source of information, with a complex structure and a kaleidoscope of contents. This richness arises from the flexibility that living languages exhibit in order to reflect the speaker's intention, and from communication needs. Computational linguistics has developed multiple strategies for detecting and interpreting textual content, but a margin remains uncovered: an interpretive range that lies outside the scope of automatic processing and that requires rethinking the scope and perspectives of these tools. This chapter aims to show this gap and its implications, exploring dialectical and technical reasons. It also proposes a new perspective on the interpretation and scope of textual processing, a sort of thermodynamics of the productions involved in certain types of communication. The chapter includes a bibliographic analysis and a statistical and heuristic exploration of the proposal applied to the 20Q game.

Introduction

Language distinguishes humans from other species. It can be considered a code associated with a grammar, which is a set of rules that allow people to communicate. During childhood it is relevant for proper cognitive development and for emotional and social maturity (Sala Torrent, 2020).

Logical reasoning starts from premises and applies rules accepted as valid, link by link, until conclusions are produced; the result is known as the logical consequence. The natural deduction system shows an intimate relationship between symbolic language and the logic of Natural Language (Kemel, 2020). Both languages, each within its own scope, are coupled by rules of inference, validity, argumentation, deduction, and proof. According to Kemel, the only possible explanation for the link between the two languages is a “natural” isomorphism between them. There is a surprisingly close relationship with human thought, in a way that is still not fully explained. Natural Language manages to represent, figure, interpret, compose, and symbolize real and virtual objects in the mind in order to produce logical reasoning processes. This is precisely the subject of Kemel's work: explaining and using the capacity of Natural Language to make abstract constructions from concrete instances. This chapter also seeks to study the process of linguistic reasoning, but it starts from scratch (sentences and words without rules of grammar), assigns a certain meaning to them, and exposes the processes behind that assignment.

There are many reasons to consider Natural Language as a system working under the laws of chaos (Sala Torrent, 2020). This complex production of the human brain presents logarithmic and fractal behaviors, like many other productions of nature. There is dialectic, biological, practical, and technical background that partially supports the idea of an inborn strategy underlying the brain mechanism of spoken and written text production. Although there are practical statistical explanations that work well in general, and a Zipf-Mandelbrot fractal law that approximates the word distribution in texts with some error, the core of the linguistic mechanism is still under study. This chapter starts with a review of many of these ideas and concepts from different perspectives. These fundamentals help build an approximation of the essence and implications of productions in Natural Language.
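
As an illustration of the Zipf-Mandelbrot law mentioned above, the following minimal sketch computes the probability the law assigns to each word rank, using the usual form in which frequency is proportional to 1 / (rank + shift)^exponent. The function name and the parameter values (`exponent=1.0`, `shift=2.7`) are assumptions chosen only for illustration; they are not taken from the chapter and would normally be fitted to a corpus.

```python
def zipf_mandelbrot(num_ranks, exponent=1.0, shift=2.7):
    """Return normalized Zipf-Mandelbrot probabilities for ranks 1..num_ranks.

    exponent and shift are illustrative assumptions, not fitted values.
    """
    # Unnormalized weight of each rank: 1 / (rank + shift) ** exponent
    weights = [1.0 / (rank + shift) ** exponent
               for rank in range(1, num_ranks + 1)]
    total = sum(weights)
    # Divide by the total so the probabilities sum to 1
    return [w / total for w in weights]


# Hypothetical vocabulary of 1000 ranked words
probs = zipf_mandelbrot(1000)
print(probs[:5])
```

Setting `shift=0` reduces the expression to the plain Zipf law, so a fitted shift indicates how far the most frequent words depart from it.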

The main focus of this work is the determination of the basic rules that govern word selection when the goal is to communicate an idea, in the context of a dialog between two individuals speaking in their native language. To allow a careful analysis, the study is reduced to a simple game named 20Q, whose goal is to guess a word initially thought of by another player. Play consists of asking that player a series of questions, which can only be answered with yes or no. The words and questions studied in this chapter are in Spanish. A curious fact is that there is a high probability of guessing the right word even though there are about 93000 words, according to the 2014 edition of the Spanish dictionary of the RAE (Real Academia Española, the Royal Spanish Academy) (Real Academia Española, 2021). The tests performed here use an AI system available for free on the web. This system is a Neural Network that is trained as visitors play with it. The fact that such a system is able to guess the correct word is very impressive. This strange phenomenon could be explained by considering that language has many but very precise rules, some of them used unconsciously. The findings shown here are part of a much larger, extensive analysis intended to reveal the rules that govern language dynamics.
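
The figures quoted above can be put side by side with a short calculation. The sketch below is not part of the chapter's analysis; it assumes an idealized player whose every yes/no question splits the remaining candidate words into two equal halves, which real questions rarely do. The constant names are hypothetical.

```python
import math

VOCABULARY_SIZE = 93_000  # approximate dictionary size quoted in the text
MAX_QUESTIONS = 20        # number of yes/no questions in the 20Q game

# Each yes/no answer carries at most one bit, so n questions can
# distinguish at most 2 ** n candidate words.
distinguishable = 2 ** MAX_QUESTIONS

# Smallest number of perfectly balanced questions needed to single out
# one word among VOCABULARY_SIZE candidates.
min_questions = math.ceil(math.log2(VOCABULARY_SIZE))

print(distinguishable)  # 1048576
print(min_questions)    # 17
```

Under this assumption twenty questions are enough in principle, with about three to spare. The bound says nothing about how such well-balanced questions are found, which is where the rules of language studied in this chapter come in.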

Key Terms in this Chapter

Linguistic Reasoning: Reasoning, together with its components, articulated in order to understand and produce text in a language regulated by a grammar. It aims at evaluating the ability to think constructively, rather than simple fluency or vocabulary recognition.

Natural Isomorphism: Isomorphism between Symbolic Language and Natural Language.

Fractal Dimension: Ratio providing a statistical index of complexity comparing how detail in a pattern (strictly speaking, a fractal pattern) changes with the scale at which it is measured.

Corpus: A specific compilation of textual data that follows certain criteria, usually with added metadata of linguistic value.

Entropy: In general, the lack of order or predictability. In communications, a measure of the information that can be transmitted through a given channel.

Thermodynamics: Branch of physical science that deals with the relations between heat and other forms of energy (such as mechanical, electrical, or chemical energy), and, by extension, of the relationships between all forms of energy.

Symbolic Language: Language that employs symbols and has been artificially constructed for the purpose of precise formulations (as in symbolic logic, mathematics, or chemistry).

Lemmatization: Process of grouping together the inflected forms of a word so they can be analyzed as a single item.

Natural Language: Language that emerged naturally as an interchange between humans, in contrast to artificial languages.

Tagging: In linguistics, the action of attaching a label to some part of a text, usually to denote something of interest.

Zeta Function: Function formed by the convergent sum of an infinite series of terms raised to powers.

Fractal: Curve or shape that has the property of self-similarity (a chosen part is similar in shape to a given larger or smaller part when magnified or reduced to the same size).
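
To make the Entropy entry above concrete, the following minimal sketch estimates the Shannon entropy, in bits per word, of a list of tokens from their relative frequencies. The function name and the sample sentence are assumptions for illustration only; the chapter's own measurements may be defined differently.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Estimate Shannon entropy (bits per token) from token frequencies."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # H = -sum(p * log2(p)) over the observed token types
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())


# Hypothetical sample: a short Spanish question split into words
sample = "es un animal o es un objeto".split()
print(shannon_entropy(sample))
```

A text that repeats a single word has entropy 0, while a text in which every word is equally likely reaches the maximum, the base-2 logarithm of the number of distinct words.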
