1. Introduction
Natural Language Processing (NLP) has received considerable attention as a way to develop computer applications and techniques capable of automatically processing data in one or more languages (Zhou et al., 2020). Understanding the linguistic domain is a fundamental issue, especially since it is involved in many other research domains such as e-health, e-business, education and antiterrorism. For instance, in the e-health domain, NLP supports human-to-machine natural language instruction, such as robot-assisted surgery guided by human instruction (Costea et al., 2020). However, the linguistic domain is large and complex, and it exhibits considerable differences between languages (Schalley, 2019). These issues become more acute when handling lingware applications (Baklouti et al., 2010).
Representing the linguistic knowledge in a common model could help the user to understand the meaning, scope and use of related techniques and algorithms. This is particularly useful for novice users, but can also provide new perspectives for expert ones.
Various linguistic registries and glossaries have been proposed. Unfortunately, such efforts provide poor and imprecise semantic descriptions, which are not sufficient for most lingware applications. Besides, they do not support multilingualism. Ontologies have proven more useful, as they provide more precise and semantically richer results (Jarrar, 2021). However, most of the proposed ontologies represent only linguistic data (e.g. words and Part Of Speech (POS)) and neglect linguistic processing functionalities (e.g. segmentation and POS tagging) and linguistic processing features (e.g. processing level and analysis type). Moreover, they do not offer a reasoning engine that assists the user in understanding linguistic knowledge and developing lingware applications. They are also hard to use for users with little or no familiarity with ontologies, as they do not offer an ontology visualization tool to facilitate interaction. Finally, most of these ontologies do not support multilingualism.
In this paper, the authors propose LingFramework, a semantic and smart assistance framework for handling multilingual linguistic knowledge. It targets not only expert users, but also novice ones. It assists users in understanding the different aspects of the linguistic domain and eases the process of developing NLP applications. LingFramework is based on a multilingual linguistic domain ontology called LingOnto. This ontology represents linguistic data, linguistic processing functionalities and linguistic processing features. Moreover, LingOnto enables reasoning about this knowledge, via a SWRL-based reasoning engine, in order to guide the user in selecting valid NLP pipelines. For example, a user developing an annotation tool is guided through each processing functionality choice, and only the functionalities that are valid for the annotation task at that point in the processing pipeline are made available for selection. LingOnto covers the French, English and Arabic languages. LingOnto is designed to be used by users who are not necessarily ontology experts. To support such users, the authors propose a user-friendly ontology visualization tool called LingGraph. It offers an understandable visualization of LingOnto to both ontology experts and non-experts. LingGraph is based on a smart search functionality which relies on a SPARQL pattern-based approach. It extracts and visualizes an ontological view of LingOnto containing only the components that correspond to the user’s needs.
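The guided selection described above can be illustrated with a minimal sketch. It is not LingOnto's SWRL rule base: the functionality names, the prerequisite table and the task mapping below are hypothetical and serve only to show how a rule set can restrict the choices offered at each step of a pipeline.

```python
from typing import Dict, List, Set

# Hypothetical prerequisite rules (assumption, not taken from LingOnto):
# a functionality can be selected only after all of its prerequisites.
PREREQUISITES: Dict[str, Set[str]] = {
    "segmentation": set(),
    "tokenization": {"segmentation"},
    "pos_tagging": {"tokenization"},
    "lemmatization": {"pos_tagging"},
    "syntactic_parsing": {"pos_tagging"},
}

# Hypothetical mapping from a task to the functionalities relevant to it.
TASK_FUNCTIONALITIES: Dict[str, Set[str]] = {
    "annotation": {"segmentation", "tokenization", "pos_tagging", "lemmatization"},
}


def valid_next_steps(task: str, pipeline: List[str]) -> List[str]:
    """Return the functionalities that may be added next to the pipeline.

    A functionality is offered only if it is relevant to the task, is not
    already in the pipeline, and has all of its prerequisites satisfied.
    """
    done = set(pipeline)
    allowed = TASK_FUNCTIONALITIES.get(task, set())
    return sorted(
        f for f in allowed
        if f not in done and PREREQUISITES[f] <= done
    )


if __name__ == "__main__":
    pipeline: List[str] = []
    # Follow the guidance until no further functionality is offered.
    while True:
        choices = valid_next_steps("annotation", pipeline)
        if not choices:
            break
        pipeline.append(choices[0])
    print(pipeline)
```

In this sketch, syntactic parsing is never offered for the annotation task because the hypothetical task mapping excludes it, which mirrors the idea of exposing only the functionalities that are valid for the task at hand.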
The authors applied LingFramework to assist users in identifying valid NLP pipelines for NLP applications. Finally, they evaluated (i) its performance in identifying valid NLP pipelines and (ii) the usability of its user interfaces.
The current paper is organized as follows. Section 2 presents related work. Section 3 details our semantic and smart framework for handling linguistic knowledge. Section 4 presents our multilingual linguistic domain ontology. Section 5 presents our user-friendly ontology visualization tool. Section 6 gives a brief example of a typical use of LingFramework. The evaluations of the performance and usability of our framework are presented in Section 7. Finally, Section 8 draws conclusions and outlines future research directions.