Fuzzy Learning of Co-Similarities from Large-Scale Documents


Sonia Alouane-Ksouri, Minyar Sassi Hidri
Copyright: © 2015 | Pages: 17
DOI: 10.4018/ijfsa.2015100104


When analyzing and exploring a large textual corpus, we are generally limited by the available main memory, and greedy computation can in turn drive up processor load. The authors address this problem for the computation of co-similarities from large-scale documents, proposing to enhance co-similarity learning through upstream and downstream parallel computing. The upstream approach deploys the fuzzy linear model in a Grid environment. The downstream approach handles multi-view datasets and introduces different architectures that use several instances of a fuzzy triadic similarity algorithm.

Triadic Document Co-Similarity Computing

In this work, we focus on the preprocessing step, which includes data representation and similarity computing. The sentence has been shown to be a more informative feature term for improving the effectiveness of document clustering. By considering three levels (documents, sentences, words) to represent the data set, we can handle the dependency between documents and sentences, between sentences and words and, by deduction, between documents and words.
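
As a rough illustration of the three-level representation, the sketch below builds document-sentence and sentence-word links from raw text and then deduces document-word links by following the chain document, sentence, word. The function names (`split_sentences`, `build_levels`, `derive_doc_word`), the naive punctuation-based sentence splitter, and the sample documents are all assumptions made for this example; the preview does not describe the authors' actual preprocessing.

```python
import re
from collections import defaultdict


def split_sentences(text):
    """Naive splitter on '.', '!' and '?' (an assumption, not the authors' tokenizer)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def build_levels(docs):
    """Return (doc_sent, sent_word) link tables for a list of raw document strings."""
    doc_sent = defaultdict(set)  # document index -> ids of its sentences
    sent_word = {}               # sentence id -> set of lowercase words
    sent_id = 0
    for d, text in enumerate(docs):
        for sentence in split_sentences(text):
            doc_sent[d].add(sent_id)
            sent_word[sent_id] = set(re.findall(r"[a-z]+", sentence.lower()))
            sent_id += 1
    return dict(doc_sent), sent_word


def derive_doc_word(doc_sent, sent_word):
    """Deduce document-word links by following document -> sentence -> word."""
    return {
        d: set().union(*(sent_word[s] for s in sents))
        for d, sents in doc_sent.items()
    }


# Hypothetical two-document corpus used only for illustration.
docs = ["Fuzzy sets model vagueness. Grids run jobs.", "Grids scale well."]
doc_sent, sent_word = build_levels(docs)
doc_word = derive_doc_word(doc_sent, sent_word)
```

Here the two documents never share a sentence, yet the derived `doc_word` table shows that both contain the word "grids", which is the kind of indirect dependency the three-level view makes explicit.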

The sentence-word-based document similarity (or triadic similarity) applies a weighting scheme when computing document similarity from sentences and sentence similarity from words. A weighted value may be assigned to the link from a document to a word (or sentence), indicating the presence of that word (or sentence) in the document.
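
To make the idea of weighted links concrete, here is a minimal sketch under stated assumptions. The preview does not give the authors' weighting scheme or similarity formula, so the code substitutes two simple stand-ins: link weights are counts normalized by the largest count, and two documents are compared with a fuzzy Jaccard ratio (sum of minimum weights over sum of maximum weights). The names `link_weights` and `fuzzy_jaccard` are hypothetical, and this is not the fuzzy triadic similarity algorithm itself.

```python
from collections import Counter


def link_weights(tokens):
    """Map each item to a weight in (0, 1]: its count divided by the largest count.

    Assumed weighting for illustration only; the source does not specify one.
    """
    counts = Counter(tokens)
    if not counts:
        return {}
    top = max(counts.values())
    return {item: c / top for item, c in counts.items()}


def fuzzy_jaccard(a, b):
    """Sum of min weights over sum of max weights; 0.0 when both inputs are empty."""
    keys = set(a) | set(b)
    num = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    den = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return num / den if den else 0.0


# Hypothetical documents given as word lists; the same functions would apply
# to sentence ids for document-sentence links.
d1 = link_weights("grid grid fuzzy".split())
d2 = link_weights("grid cloud".split())
sim = fuzzy_jaccard(d1, d2)
```

The same pair of functions can be applied at either level (documents over sentences, or sentences over words), which mirrors the two similarity computations the paragraph above describes.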
