Semantic Network Formalism for Knowledge Representation: Towards Consideration of Contextual Information

Souheyl Mallat (LATICE Laboratory Research Department of Computer Science, Tunisia, University of Monastir, Monastir, Tunisia), Emna Hkiri (LATICE Laboratory Research Department of Computer Science, Tunisia, University of Monastir, Monastir, Tunisia), Mohsen Maraoui (Computational Mathematics Laboratory, University of Monastir, Monastir, Tunisia) and Mounir Zrigui (LATICE Laboratory Research Department of Computer Science, Tunisia, University of Monastir, Monastir, Tunisia)
Copyright: © 2015 | Pages: 22
DOI: 10.4018/IJSWIS.2015100103

Abstract

In this paper, the authors propose a formalism for representing a knowledge base (KB) as a network. The objective is to achieve high coverage of this base. This type of network is similar to a semantic network, with the difference that the arcs are weighted by a value indicating the semantic proximity between concepts. This semantic proximity covers taxonomic relations, synonymy, and non-taxonomic (contextual) relations. The latter are discovered using an association rules model, which is based on (i) an indexing method, (ii) the French lexical database EuroWordNet (EWNF), and (iii) the Apriori algorithm. The contextual relations are the latent relations buried in the KB, carried by the semantic context. Evaluation of our representation formalism shows improved results, with about 80% coverage of the KB.
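
As an illustration of how contextual arcs could be derived, the following minimal sketch mines association rules between pairs of concepts that co-occur in KB sentences and uses rule confidence as the arc weight. It is a simplification restricted to itemsets of size two, not the full Apriori algorithm and not the authors' implementation; the function name `contextual_arcs`, the thresholds, and the choice of confidence as the proximity value are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def contextual_arcs(transactions, min_support=0.5, min_confidence=0.6):
    """Return {(a, b): confidence} for rules a -> b between concept pairs.

    Each transaction is the set of concepts found in one sentence.
    Only itemsets of size one and two are considered (assumption).
    """
    n = len(transactions)
    item_counts = Counter()
    for t in transactions:
        item_counts.update(set(t))

    # Apriori-style pruning: a pair can only be frequent
    # if both of its members are frequent on their own.
    frequent = {i for i, c in item_counts.items() if c / n >= min_support}

    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent)
        pair_counts.update(combinations(kept, 2))

    arcs = {}
    for (a, b), c in pair_counts.items():
        if c / n < min_support:
            continue
        # Evaluate the rule in both directions.
        for x, y in ((a, b), (b, a)):
            confidence = c / item_counts[x]
            if confidence >= min_confidence:
                arcs[(x, y)] = confidence
    return arcs


sentences_as_concepts = [
    {"bank", "loan"},
    {"bank", "loan", "rate"},
    {"bank", "river"},
    {"loan", "rate"},
]
arcs = contextual_arcs(sentences_as_concepts)
```

In a fuller network, taxonomic and synonymy arcs taken from a lexical database would sit alongside these contextual arcs, each with its own proximity value.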

1. Introduction

In specific domains, knowledge is mainly available in text form. This knowledge may consist of research results on specific topics. Users draw on it to perform NLP tasks (lexical disambiguation, machine translation, etc.). It contains a significant amount of the information required for a specific processing task.

Our KB is the result of an information retrieval task. Its construction is based on information extraction, a discipline of Natural Language Processing (NLP) that looks at how relevant information in a given context is expressed in written documents. In our case, the goal of information extraction is to build a list of relevant sentences in French (listRSF), also called the KB; this base is used by the method for disambiguating query words in Arabic-to-French translation. As for its construction, the KB (listRSF) corresponds to the list of relevant sentences in Arabic from the work presented in (Mallat, 2013; Hkiri, 2015).

To this end, we proposed a process for selecting relevant sentences based on a weighting function. This function assigns a weight to each query term based on two hypotheses: (1) For each query term, we look for relevant sentences in the relevant documents of the corpus. This hypothesis relies on a binary (presence/absence) formula to determine the intersection of each sentence with the response; the relevant sentences are those containing the most words related to the response. (2) The second hypothesis concerns query terms that occur with very low frequency across the sentences but are nonetheless significant. For this we use the IEF (Inverse Element Frequency) function, which has been proposed by several authors (e.g., Grabs, 2002).
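
The two hypotheses can be sketched as follows. This is a minimal illustration, not the authors' weighting function: it assumes that IEF mirrors the familiar inverse document frequency, that is log(N / n_t) with sentences playing the role of elements, and that a sentence's score is the sum of the IEF weights of the query terms it contains. The names `ief`, `score_sentence`, and `rank_sentences`, and the whitespace tokenization, are hypothetical.

```python
import math


def ief(term, tokenized_sentences):
    """Inverse element frequency, assumed here to be log(N / n_t)."""
    n_t = sum(1 for tokens in tokenized_sentences if term in tokens)
    return math.log(len(tokenized_sentences) / n_t) if n_t else 0.0


def score_sentence(query_terms, tokens, tokenized_sentences):
    # Hypothesis 1: binary presence/absence, a term counts once if present.
    # Hypothesis 2: rare terms weigh more, through their IEF value.
    return sum(
        ief(term, tokenized_sentences)
        for term in set(query_terms)
        if term in tokens
    )


def rank_sentences(query_terms, sentences):
    """Return (sentence, score) pairs sorted by decreasing score."""
    tokenized = [set(s.lower().split()) for s in sentences]
    scores = [score_sentence(query_terms, t, tokenized) for t in tokenized]
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [(sentences[i], scores[i]) for i in order]


corpus = [
    "the bank approved the loan",
    "the river bank flooded",
    "the loan was repaid",
]
ranking = rank_sentences(["bank", "flooded"], corpus)
```

Under these assumptions, a query term that appears in every sentence receives a weight of zero, so the ranking is driven by the rarer terms.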

In this way, we favor terms found in only one or very few relevant sentences. Finally, the list of relevant Arabic sentences is aligned using the alignment tool MkAlign over the Arabic-French parallel corpus. The alignment yields a list of relevant sentences in French, denoted listRSF or KB.

In many cases, information (here, the textual content of the KB) is more accessible and more easily exploitable when it is structured than when it is left as raw text, for example when it is stored in databases or structured in a network formalism. Our research falls within the context of representing information with a network formalism. Indeed, a number of works in natural language processing (NLP) build on the principles presented in (Church, 1990) to exploit networks of lexical collocations (semantic, syntactic, pragmatic). Niwa (1994) used lexical networks for word sense disambiguation. In addition, Hindle (1990) exploited them in parsing and generation. Such networks have the advantage of being easy to build automatically. For this reason, we are interested in the network representation formalism.
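
To give a sense of why such networks are easy to build automatically, here is a minimal sketch that links words co-occurring in the same sentence and weights each arc with pointwise mutual information (PMI), an association measure commonly used for collocations. The choice of PMI, the sentence-level co-occurrence window, and the function name `build_collocation_network` are assumptions for illustration; the cited works may use different measures and windows.

```python
import math
from collections import Counter
from itertools import combinations


def build_collocation_network(sentences, min_count=1):
    """Return {(word_a, word_b): pmi} for words sharing a sentence.

    Keys are alphabetically ordered pairs; probabilities are estimated
    as the fraction of sentences containing a word or a pair.
    """
    n = len(sentences)
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))

    network = {}
    for (a, b), c in pair_counts.items():
        if c < min_count:
            continue
        # PMI = log2( P(a, b) / (P(a) * P(b)) )
        network[(a, b)] = math.log2((c * n) / (word_counts[a] * word_counts[b]))
    return network


toy_corpus = [
    "strong tea",
    "strong coffee",
    "powerful computer",
    "strong tea please",
]
network = build_collocation_network(toy_corpus)
```

Raising `min_count` discards pairs seen only a few times, whose PMI estimates tend to be unreliable on small corpora.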

Over the last decade, lexical disambiguation for automatic query translation from Arabic to French has made great strides. Our disambiguation method builds on our method of KB representation.
