Natural language processing (NLP) applies computational techniques to the analysis and synthesis of natural languages. Semantic and morphological analysis are two basic concepts in the NLP domain. Semantic analysis is the process of analyzing the lexical, grammatical, and syntactic aspects of words. Morphology is the study of the structure and meaning of words. In this chapter, the authors focus on various morphological analyzers developed for the Tamil language. Developing a highly accurate and adaptable morphological analyzer is a challenging task. A morphological analyzer identifies the morphemes of a word and their parts of speech for tagging. A morpheme is the smallest unit of a word that retains meaning. Morphological analyzers include phrase-level and word-level analyzers. Universal Networking Language (UNL) is a declarative formal language used to express natural language text as a semantic network. The major applications of UNL are information retrieval systems, machine translation systems, and UNL-based search engines.
Introduction
Tamil is one of the oldest languages in history. The study of Tamil morphology has been an interesting area in natural language understanding. Morphological analysis is the process of analyzing the syntactic and semantic parts of a word. It can be performed using Finite State Transducers, Finite State Machines, Tree Adjoining Grammars, and Support Vector Machines. Semantic analysis primarily concentrates on word similarity, and semantic measures of word similarity are employed in practical applications such as information retrieval and extraction. Semantic similarity is an important measure in disciplines such as Natural Language Processing, Artificial Intelligence, Sentiment Analysis, and Information Retrieval. Graph-based representations such as Semantic Networks, Dependency Graphs, Semantic Relations, and the Universal Networking Language are ways of representing semantics. The Universal Networking Language is a structured, ontological, graph-based intermediate representation of natural language. Such structured representations offer an advantage in information extraction and retrieval because node search over them has lower complexity. This chapter reviews articles published in the Natural Language Processing area that deal with UNL, Morphological Analysis, and Semantic Analysis. The chapter is organized as follows: Section 2 reviews the ideas behind the Universal Networking Language, Section 3 discusses Morphological Analysis, Section 4 elaborates on Semantic Analysis, and Section 5 provides the conclusion, followed by the references.
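To make the idea of morphological analysis concrete, the following is a minimal sketch that splits a word into a root and a suffix tag by lexicon lookup and suffix stripping. This is a much simpler stand-in for the finite-state approaches named above, not an implementation of any analyzer reviewed in this chapter. The lexicons `ROOTS` and `SUFFIXES` and the function `analyze` are hypothetical, and English words are used only for readability; a Tamil analyzer would need a far richer lexicon and rules for sandhi and stacked suffixes.

```python
# Toy root lexicon (hypothetical): root -> part of speech.
ROOTS = {"walk": "VERB", "book": "NOUN", "play": "VERB"}

# Toy suffix lexicon (hypothetical): suffix -> grammatical tag.
SUFFIXES = {"s": "PLURAL", "ed": "PAST", "ing": "PROGRESSIVE"}


def analyze(word):
    """Return (root, part_of_speech, suffix_tags), or None if no analysis is found."""
    # A bare root needs no segmentation.
    if word in ROOTS:
        return word, ROOTS[word], []
    # Try longer suffixes first so the longest matching suffix wins.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix):
            root = word[: -len(suffix)]
            if root in ROOTS:
                return root, ROOTS[root], [SUFFIXES[suffix]]
    return None


print(analyze("walked"))
print(analyze("books"))
```

The sketch handles a single suffix only; agglutinative languages such as Tamil typically require repeated or state-driven stripping, which is where transducer-based models become useful.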
Review Methodology
Research articles were collected from several sources, including Springer, Elsevier, IEEE, ACM, and arXiv. The articles were filtered by publication year, covering 2007 to 2022. Keywords such as “Universal Networking Language”, “Semantic Analysis”, “Morphological Analysis”, “Machine Translation”, and “Information Retrieval” were used to retrieve the relevant articles.
- The review sections are classified into three, namely Universal Networking Language, Morphological Analysis, and Semantic Analysis.
- A comprehensive review of Universal Networking Language (UNL) is provided as a first attempt.
- UNL graph generation using Enconversion is discussed briefly.
- Models such as the Finite State Transducer, Support Vector Machines, and Lexical Functional Grammar for developing a Morphological Analyzer are discussed concisely.
- Semantic analysis systems, classified into Word Embedding and Word Overlapping models, are discussed.
- The challenges in the development of a Morphological Analyzer, Semantic Analysis, and UNL are discussed.
- Finally, future research directions in these domains are reviewed and concluded.
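As an illustration of the word-overlap family of semantic similarity measures mentioned in the list above, the sketch below computes the Jaccard coefficient between the token sets of two texts. Jaccard is chosen here only as one common overlap measure; the function name `jaccard_similarity` and the whitespace tokenization are assumptions made for this example, not the method of any specific system reviewed.

```python
def jaccard_similarity(text_a, text_b):
    """Word-overlap similarity: |A intersect B| / |A union B| over lowercase tokens."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    # Two empty texts share no words; define their similarity as 0.0.
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


print(jaccard_similarity("the cat sat", "the cat ran"))  # 2 shared / 4 distinct = 0.5
```

Overlap measures of this kind compare surface forms only, so morphological variants of the same root count as different words unless the text is first reduced to roots or morphemes.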
Universal Networking Language: A Semantic Representation
The Universal Networking Language (UNL) is a formal language based on linguistics that is used to represent and connect diverse languages and knowledge systems. It aims to break down language barriers and facilitate cross-linguistic and cross-cultural communication and information exchange.
The UNL standardizes the representation of meaning and knowledge in a language-independent manner. It combines ideas from linguistics, logic, and computer science to develop a universal representation of language that can be comprehended and processed by humans and machines alike.
The main components of UNL include a controlled vocabulary of concepts, a set of grammatical rules, and a notation system for representing linguistic structures. By mapping the meaning of words and sentences to a common underlying structure, UNL allows for the translation, transfer, and sharing of information across multiple languages. UNL has been used in a variety of domains, including machine translation, cross-lingual information retrieval, multilingual communication, and the development of language technologies. In an increasingly interconnected world, it is a valuable tool for encouraging linguistic diversity, facilitating global collaboration, and improving cross-cultural understanding.
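The components described above can be sketched as a small data structure: concepts as graph nodes, labelled binary relations as edges, and attributes attached to nodes. The example below is a simplified illustration under stated assumptions, not the official UNL format. The class `UNLGraph` and its methods are hypothetical; the relation labels `agt` (agent) and `obj` (object) and the attribute `@entry` follow common UNL usage, while the concept labels are plain words rather than fully specified Universal Words.

```python
from dataclasses import dataclass, field


@dataclass
class UNLGraph:
    """A simplified, UNL-style semantic graph (illustrative only)."""

    # Each relation is a (label, source_concept, target_concept) triple.
    relations: list = field(default_factory=list)
    # Maps a concept to its set of attribute labels such as "@entry".
    attributes: dict = field(default_factory=dict)

    def add_relation(self, label, source, target):
        self.relations.append((label, source, target))

    def add_attribute(self, concept, attribute):
        self.attributes.setdefault(concept, set()).add(attribute)

    def _render(self, concept):
        # Append attributes in sorted order so the output is deterministic.
        suffix = "".join("." + a for a in sorted(self.attributes.get(concept, ())))
        return concept + suffix

    def to_lines(self):
        """Render every relation as 'label(source, target)'."""
        return [
            f"{label}({self._render(src)}, {self._render(tgt)})"
            for label, src, tgt in self.relations
        ]


# "John reads a book": 'read' is the entry (head) node of the sentence.
graph = UNLGraph()
graph.add_relation("agt", "read", "John")
graph.add_relation("obj", "read", "book")
graph.add_attribute("read", "@entry")
graph.add_attribute("read", "@present")

for line in graph.to_lines():
    print(line)
```

Because the graph contains only concepts and relations, the same structure could in principle be produced from a sentence in any source language, which is what makes it usable as an intermediate representation.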