Exploring Fuzzy Association Rules in Semantic Network Enrichment Improvement of the Semantic Indexing Process

Souheyl Mallat, Emna Hkiri, Mounir Zrigui

Source Title: Innovations, Developments, and Applications of Semantic Web and Information Systems

DOI: 10.4018/978-1-5225-5042-6.ch006

Abstract

To improve natural language processing applications, we focus on a statistical approach to semantic indexing of multilingual text documents, based on the conceptual network formalism. We propose to use this formalism as an indexing language to represent the descriptive concepts and their weights. Our contribution consists of two steps. In the first step, we extract index terms using the multilingual lexical resource EuroWordNet (EWN). In the second step, we move from the representation of index terms to the representation of index concepts through the conceptual network formalism. This network is generated using the EWN resource and passes through a classification step based on the association rules model. The proposed indexing approach can be applied to text documents in various languages: we apply the same statistical process regardless of the language in order to extract the significant concepts and their associated weights. We show that the proposed indexing approach provides encouraging results.

1. Introduction

It is known that the ambiguities of natural language have a detrimental effect on the translation of query terms in cross-language information retrieval. However, research efforts to integrate sense disambiguation techniques into machine translation (MT) have not been successful and have produced unconvincing results. In addition, our automatic translation system (ATS) (Jianfeng et al., 2001) requires high-precision disambiguation in order to affect the selection of the best target-language translation of ambiguous words.

The semantic disambiguation of the query in the target language relies on a document written in the same language as the target-language query. This document is a list of relevant sentences (those most similar to the user query), denoted List_S; these sentences satisfy the query and are ranked by their degree of linguistic relevance (semantic, morphological). The construction of List_S is presented in (Mallat et al., 2013) and (Mallat et al., 2014). The lists of French and English sentences result from the alignment of a multilingual parallel corpus. Both versions of the list are used as resources for the disambiguation process in query translation (Arabic-French and Arabic-English). The process matches the query against the content of List_S to find the target-language words of the query that best fit List_S. A key feature of the disambiguation method is that the degree of matching between each translation of an ambiguous word and List_S depends on the highest weight.
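The matching step described above can be illustrated with a minimal sketch. It assumes that List_S has already been reduced to a dictionary of term weights; the function name `pick_translation`, the dictionary representation, and the toy weights are all hypothetical and are not taken from the chapter.

```python
from typing import Dict, List, Optional


def pick_translation(candidates: List[str],
                     list_s_weights: Dict[str, float]) -> Optional[str]:
    """Return the candidate translation with the highest weight in List_S.

    Candidates absent from List_S count as weight 0.0. If no candidate
    occurs in List_S at all, None is returned so the caller can fall back
    to another strategy.
    """
    best, best_weight = None, 0.0
    for cand in candidates:
        weight = list_s_weights.get(cand.lower(), 0.0)
        if weight > best_weight:
            best, best_weight = cand, weight
    return best


# Toy weights for terms of a French List_S (invented for illustration).
weights = {"banque": 0.8, "rive": 0.1, "argent": 0.6}

# Two hypothetical candidate translations of one ambiguous query word.
chosen = pick_translation(["rive", "banque"], weights)
print(chosen)
```

Here the candidate with the larger weight in List_S is selected; how those weights are actually computed is the subject of the indexing approach itself.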

Note that List_S is characterized by theme-specific features, such as semantic and morphological richness, and is assumed to best represent the relevant answers to a given query. Improving the disambiguation process therefore requires an effective method for representing and analyzing the contents of this list.

In this paper, we focus on extracting the concepts (descriptors), or index concepts, in order to associate with each document (List_S) a representation of its content by concepts and their associated weights.

To do this, we propose a statistical approach to semantic indexing of multilingual documents (French or English) that relies only on word-frequency calculations. We also exploit taxonomic and non-taxonomic (contextual) relations between terms. The proposed indexing approach consists of:

1. Extracting the significant words, or index terms, associated with the concepts of a document in English or French, based on two external lexical resources (multilingual thesauri) from EuroWordNet (French EWN and English EWN) (Gonzalo et al., 1998) (Vossen et al., 1997). We consider an EWN to be composed of a set of lexicons and a set of relations between them, designated by concepts.
2. Constructing and exploiting the conceptual network formalism of a document, which requires extracting the concept nodes and the relations between them from the output of the previous step. To extract relations, we rely on the EWN resource to identify taxonomic relations, and we add the fuzzy association rules model to identify non-taxonomic (contextual) relations between concepts. This model serves as an inference mechanism for discovering these latent relations, which are buried in List_S and carried by the semantic context. Its goal is to better represent the semantic content of the document. The novelty of this model involves two aspects: (1) The co-occurrence of terms is taken into account during the indexing of List_S. The model's descriptors are no longer words but sets of index terms (term-sets), which capture the intuition that semantically related terms appear near one another in List_S. (2) The importance of a word in the document is estimated not only by its frequency of occurrence but also by its semantic proximity and contextual values with respect to the other terms in List_S.
3. Generating the index concepts with new weights that better represent the content of List_S through the conceptual network formalism.
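To make the role of fuzzy association rules in the second step more concrete, the following is a minimal sketch of fuzzy support and confidence between term-sets. It rests on assumptions that the text above does not confirm: each sentence of List_S is treated as a transaction, the membership of a term in a sentence is its frequency normalized by the most frequent term of that sentence, and memberships are combined with the minimum operator. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def memberships(sentence: List[str]) -> Dict[str, float]:
    """Fuzzy membership of each term in one sentence (assumed: tf / max tf)."""
    counts = Counter(sentence)
    if not counts:
        return {}
    top = max(counts.values())
    return {term: count / top for term, count in counts.items()}


def fuzzy_support(term_set: Iterable[str],
                  rows: List[Dict[str, float]]) -> float:
    """Mean over sentences of the minimum membership among the terms."""
    terms = list(term_set)
    if not rows or not terms:
        return 0.0
    total = sum(min(row.get(t, 0.0) for t in terms) for row in rows)
    return total / len(rows)


def fuzzy_confidence(antecedent: Iterable[str],
                     consequent: Iterable[str],
                     rows: List[Dict[str, float]]) -> float:
    """Confidence of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    supp_a = fuzzy_support(antecedent, rows)
    if supp_a == 0.0:
        return 0.0
    return fuzzy_support(antecedent | consequent, rows) / supp_a


# Toy List_S: three tokenized sentences (invented for illustration).
list_s = [["bank", "loan", "bank"], ["bank", "river"], ["loan", "credit"]]
rows = [memberships(sentence) for sentence in list_s]

print(fuzzy_support({"bank"}, rows))
print(fuzzy_confidence({"bank"}, {"loan"}, rows))
```

Under these assumptions, a rule with high confidence suggests a contextual (non-taxonomic) relation between two term-sets, which could then be added as an edge of the conceptual network alongside the taxonomic relations taken from EWN.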

The paper is organized as follows. Section 2 presents the existing problems, namely the disparity of terms and the ambiguity encountered in the indexing process. Section 3 presents the background and state of the art of indexing methods. Section 4 details our indexing approach. Section 5 presents the experiments, with a comparison and discussion of the results. Section 6 concludes the paper.

2. Problematic

We address in our work three types of problems:
