Fuzzy Learning of Co-Similarities from Large-Scale Documents


Sonia Alouane-Ksouri and Minyar Sassi Hidri (Ecole Nationale d'Ingénieurs de Tunis, Université de Tunis El Manar, Tunis, Tunisia)
Copyright: © 2015 |Pages: 17
DOI: 10.4018/ijfsa.2015100104

Abstract

Analyzing and exploring a large textual corpus is generally constrained by the available main memory, and the resulting greedy computation can overload the processor. The authors address this problem in the context of computing co-similarities from large-scale documents, enhancing co-similarity learning with both upstream and downstream parallel computing. The upstream stage deploys the fuzzy linear model in a Grid environment; the downstream stage handles multi-view datasets by introducing different architectures based on several instances of a fuzzy triadic similarity algorithm.
Article Preview

Triadic Document Co-Similarity Computing

In this work, we focus on the preprocessing step, which includes data representation and similarity computing. It has been shown that the sentence is a more informative feature term for improving the effectiveness of document clustering. By considering three levels (documents-sentences-words) to represent the data set, we can capture the dependency between documents and sentences, between sentences and words and, by deduction, between documents and words.
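A minimal sketch of this three-level representation, on a hypothetical toy corpus (all names and weights here are illustrative, not the authors' exact scheme): the document-sentence and sentence-word dependencies become two weighted link matrices, and the document-word dependency follows by composition.

```python
import numpy as np

# Hypothetical toy corpus: 2 documents, each a list of sentences.
docs = [
    ["fuzzy clustering groups documents", "similarity drives clustering"],
    ["documents contain sentences", "sentences contain words"],
]

# The three levels: documents, sentences, words.
sentences = [s for d in docs for s in d]
vocab = sorted({w for s in sentences for w in s.split()})

# Document-sentence weights: 1 if the sentence occurs in the document.
DS = np.zeros((len(docs), len(sentences)))
k = 0
for i, d in enumerate(docs):
    for _ in d:
        DS[i, k] = 1.0
        k += 1

# Sentence-word weights: term frequency of each word in each sentence.
SW = np.zeros((len(sentences), len(vocab)))
for j, s in enumerate(sentences):
    for w in s.split():
        SW[j, vocab.index(w)] += 1.0

# The document-word view is deduced by composing the two levels.
DW = DS @ SW
```

Here a binary presence weight links documents to sentences and a term-frequency weight links sentences to words; any other weighting scheme (e.g. TF-IDF) would slot into the same two matrices.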

The sentence-word-based document similarity (or triadic similarity) uses a weighting scheme to compute document similarity from sentences and sentence similarity from words. A weighted value is assigned to the link from a document to a word (or sentence), indicating the presence of that word (or sentence) in the document.
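The propagation of similarity across the three levels can be sketched as follows. This is an illustrative simplification, not the authors' fuzzy triadic similarity algorithm: sentence similarity is derived from word-level weights, document similarity from sentence-level weights, and row normalization keeps values in a fuzzy-membership-like [0, 1] range.

```python
import numpy as np

def normalize(S):
    """Scale each row by its maximum so self-similarity is 1."""
    m = S.max(axis=1, keepdims=True)
    return S / np.where(m == 0, 1.0, m)

def triadic_similarity(DS, SW, iters=3):
    """Propagate similarity: words -> sentences -> documents.

    DS: document x sentence weight matrix.
    SW: sentence x word weight matrix.
    """
    S_word = np.eye(SW.shape[1])  # start: each word similar only to itself
    for _ in range(iters):
        # Sentences are similar if their (weighted) words are similar.
        S_sent = normalize(SW @ S_word @ SW.T)
        # Documents are similar if their (weighted) sentences are similar.
        S_doc = normalize(DS @ S_sent @ DS.T)
        # (A full co-similarity scheme would also update S_word from the
        #  sentence level; that back-propagation is omitted for brevity.)
    return S_doc

# Illustrative weight matrices: 2 documents, 3 sentences, 3 words.
DS = np.array([[1.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])
SW = np.array([[1.0, 1.0, 0.0],
               [0.0, 1.0, 1.0],
               [1.0, 0.0, 1.0]])
S = triadic_similarity(DS, SW)
```

The key design point is the mutual dependency: each level's similarity matrix is sandwiched between the weight matrices of the level above, so a change in word-level similarity propagates up to document-level similarity.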
