Enhancing Arabic NLP: A Comparative Study of AI-Driven Text Preprocessing Tools
Copyright: © 2027
Pages: 18
Abstract
Arabic is the first language of more than 300 million people. It has several features that make it one of the most complex languages, including multiple derivatives, an unlimited vocabulary, and diacritics, among others. Preprocessing is an essential step in preparing Arabic text for Natural Language Processing (NLP). This article presents a comparative study of several preprocessing tools for Arabic text. It explains the challenges of preprocessing Arabic as well as the techniques used in each tool. The authors followed the PRISMA guidelines for reporting systematic reviews, starting by screening 200 articles and ending up including only 30. An in-depth review of these articles shows that different tools, such as AMIRA, CAMel, and NLP packages, added value in text preprocessing. However, most of these papers considered ambiguity in Arabic orthography and dialectal variants to be the greatest challenges in Arabic NLP.