Hybrid Segmentation Prototype for Arabic Text-Based Documents: Towards Plagiarism Detection

Sonia Alouane-Ksouri (National Engineering School of Tunis, University of Tunis El Manar, Tunis, Tunisia) and Minyar Sassi Hidri (National Engineering School of Tunis, University of Tunis El Manar, Tunis, Tunisia)
DOI: 10.4018/ijssmet.2015010104

Abstract

This work contributes to the analysis of Arabic text-based documents for plagiarism detection. The analysis follows the triadic computation model of document similarity. The authors propose a hybrid segmentation prototype for Arabic text-based documents that chains several processing steps to produce the similarity rate between the documents of an Arabic corpus. It combines two segmentation systems with a morphological analysis to obtain a matrix representation suited to triadic similarity computation at three abstraction levels: documents, sentences, and words.
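
As a rough illustration of the three abstraction levels named in the abstract, the sketch below builds a document-by-sentence matrix and a sentence-by-word matrix from a small corpus. It is not the authors' prototype: the punctuation-based sentence splitter and whitespace word splitter are simplifying assumptions standing in for the two segmentation systems and the morphological analysis, and all function names are hypothetical.

```python
import re

# Assumption: a sentence ends at '.', '!', '?' or the Arabic question mark (U+061F).
SENTENCE_END = re.compile(r"[.!?\u061F]+")


def split_sentences(text):
    """Naive sentence segmentation on end-of-sentence punctuation."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def split_words(sentence):
    """Naive word segmentation on whitespace (no morphological analysis)."""
    return sentence.split()


def build_matrices(documents):
    """Return (doc_sentence, sentence_word, sentences, vocabulary).

    doc_sentence[d][s] is 1 if sentence s belongs to document d, else 0.
    sentence_word[s][w] counts occurrences of word w in sentence s.
    """
    sentences = []
    doc_of_sentence = []
    for d, text in enumerate(documents):
        for s in split_sentences(text):
            sentences.append(s)
            doc_of_sentence.append(d)

    vocabulary = sorted({w for s in sentences for w in split_words(s)})
    column = {w: j for j, w in enumerate(vocabulary)}

    doc_sentence = [
        [1 if doc_of_sentence[s] == d else 0 for s in range(len(sentences))]
        for d in range(len(documents))
    ]
    sentence_word = [[0] * len(vocabulary) for _ in sentences]
    for i, s in enumerate(sentences):
        for w in split_words(s):
            sentence_word[i][column[w]] += 1
    return doc_sentence, sentence_word, sentences, vocabulary


if __name__ == "__main__":
    # Toy corpus of two short documents (illustrative only).
    corpus = ["قرأ الولد الكتاب. كتب الولد الدرس.", "قرأ الطالب الدرس."]
    doc_sentence, sentence_word, sentences, vocabulary = build_matrices(corpus)
    print(len(corpus), "documents,", len(sentences), "sentences,", len(vocabulary), "words")
```

The two matrices link documents to sentences and sentences to words, which is the kind of input a similarity computation over the three levels would need; how the paper actually weights or normalizes these matrices is not stated in the abstract.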

Particularities of Arabic Text

To delimit this field of application, we give a brief overview of the particularities of Arabic text: it is read and written from right to left; it is usually written without vowel marks and with little punctuation; words are agglutinative, so that several grammatical elements can be attached to a single written word; and word order within the sentence is irregular.
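
Two of these particularities, the absence of vowel marks and agglutination, can be illustrated with a short sketch. It is a deliberately naive example and not the morphological analysis used in the paper: the prefix list is a hypothetical, incomplete assumption, and the minimum stem length is an arbitrary guard against splitting words that merely begin with a prefix letter.

```python
import re

# Arabic vowel and related marks, fathatan to sukun (U+064B to U+0652).
DIACRITICS = re.compile(r"[\u064B-\u0652]")

# Hypothetical, incomplete list of attached prefixes, tried once each in this order:
# "and", "with/in", "the".
PREFIXES = ["و", "ب", "ال"]


def strip_diacritics(word):
    """Remove vowel marks so vocalized and unvocalized spellings compare equal."""
    return DIACRITICS.sub("", word)


def split_prefixes(word, min_stem=3):
    """Peel listed prefixes off the front of a word, keeping at least min_stem letters."""
    parts = []
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_stem:
            parts.append(prefix)
            word = word[len(prefix):]
    parts.append(word)
    return parts


if __name__ == "__main__":
    # One written word meaning roughly "and with the book".
    print(split_prefixes(strip_diacritics("وبالكتاب")))
```

A rule-based splitter like this will mis-segment many words, which is one reason a proper morphological analysis is needed before comparing documents word by word.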
