Evaluation Approach of Arabic Character Recognition

Hanan Aljuaid, Dzulkifli Mohamad, Muhammad Sarfraz
DOI: 10.4018/978-1-4666-3906-5.ch010
Abstract

This paper proposes a complete system for off-line Arabic character recognition. The system is designed primarily for Arabic handwriting recognition, but it works equally well for typed characters. It comprises several phases, including preprocessing, segmentation, and thinning, and it computes vertical and horizontal projection profiles. The recognition phase is handled by a genetic algorithm, which relies on a feature extraction algorithm that defines six features for each segment. For Arabic handwriting recognition, the algorithm achieved a recognition rate of 90.46%. Compared with other systems in the literature, the proposed system achieved the second-best recognition rate.
Chapter Preview

1. Introduction

Character Recognition (CR) is an area of Pattern Recognition that has long been an active research topic. Automating CR means translating images of characters into editable text; in other words, it is an attempt to simulate the human reading process. Handwritten character recognition is a very challenging task because of difficulties such as the high variability of handwriting styles and shapes, the uncertainty of human writing, skew and slant, the segmentation of words into characters, and the size of the lexicon.

The problem of handwriting recognition (HWR) can be classified into two main groups, off-line and on-line recognition, according to the format of the handwriting input. In off-line recognition, only the image of the handwriting is available, while in the on-line case temporal information, such as pen tip coordinates as a function of time, is also available. Many applications, such as bank processing, mail sorting, document archiving, commercial form-reading, and office automation, require off-line HWR capabilities. So far, off-line HWR remains an open problem, in spite of a dramatic increase in research in this field (Koerich et al., 2003; Plamondon, 2000; Vinciarelli, 2002) and the latest improvements in recognition methodologies (El-Yacoubi, Gilloux, Sabourin, & Suen, 1999; Vinciarelli, 2004).

Studies in Arabic handwriting recognition, although not as advanced as those devoted to other scripts (e.g. Latin), have recently attracted growing interest (Amin, 1998; Amara & Essoukri, 2003; Lorigo, 2006). It is important to point out that the techniques developed for Latin HWR are not appropriate for Arabic handwriting, because Arabic script is based on an alphabet and rules that are distinct from those of Latin. One special characteristic of Arabic is that a letter may have up to four different shapes, depending on its relative position in the text. For instance, the letter (ع) has four different shapes: at the beginning of the word (preceded by a space), in the middle of the word (no space around it), at the end of the word (followed by a space), and in isolation (preceded by an unconnected letter and followed by a space). These four possibilities, for all the Arabic characters, are shown in Table 1.

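Table 1. Different shapes of the Arabic alphabet

Letter | Isolated (I) | End (E) | Middle (M) | Beginning (B)
Alif | أ | ـا | ـا | ا
Ba | ب | ـب | ـبـ | بـ
Ta | ت | ـت | ـتـ | تـ
Tha | ث | ـث | ـثـ | ثـ
Jim | ج | ـج | ـجـ | جـ
Ha | ح | ـح | ـحـ | حـ
Kha | خ | ـخ | ـخـ | خـ
Dal | د | د | ـد | د
Dhal | ذ | ذ | ـذ | ذ
Ra | ر | ر | ـر | ر
Zan | ز | ز | ـز | ز
Siin | س | ـس | ـسـ | سـ
Shiin | ش | ـش | ـشـ | شـ
Sadd | ص | ـص | ـصـ | صـ
Dad | ض | ـض | ـضـ | ضـ
Tahn | ط | ـط | ـطـ | طـ
Zah | ظ | ـظ | ـظـ | ظـ
Ayn | ع | ـع | ـعـ | عـ
Ghayn | غ | ـغ | ـغـ | غـ
Fa | ف | ـف | ـفـ | فـ
Gaf | ق | ـق | ـقـ | قـ
Kaf | ك | ـك | ـكـ | كـ
Lam | ل | ـل | ـلـ | لـ
Miim | م | ـم | ـمـ | مـ
Noon | ن | ـن | ـنـ | نـ
Ha | ه | ـه | ـهـ | هـ
Waw | و | و | ـو | و
Ya | ي | ـي | ـيـ | يـ
Lamalif | لا | ـلا | ـلا |
Tamarbota | ة | ـة | |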
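
To make the idea of position-dependent shapes concrete, the following minimal sketch lists the four shapes of the letter Ayn (ع) as separate code points. It relies on the Unicode Arabic Presentation Forms-B block and Python's standard `unicodedata` module, neither of which is mentioned in the chapter. The names `AYN_FORMS` and `base_letter` are hypothetical, chosen only for this illustration, and the sketch is not the authors' recognition method.

```python
import unicodedata

# Assumption for illustration: the four positional glyphs of Ayn (base letter
# U+0639) taken from the Unicode Arabic Presentation Forms-B block.
AYN_FORMS = {
    "isolated": "\uFEC9",
    "end": "\uFECA",
    "beginning": "\uFECB",
    "middle": "\uFECC",
}


def base_letter(glyph: str) -> str:
    """Map a positional glyph back to its base letter via NFKC normalization."""
    return unicodedata.normalize("NFKC", glyph)


if __name__ == "__main__":
    for position, glyph in AYN_FORMS.items():
        # Print the position label, the glyph, its Unicode name, and the base letter.
        print(position, glyph, unicodedata.name(glyph), base_letter(glyph))
```

All four glyphs normalize to the same base letter, which mirrors the point made above: a recognizer sees up to four distinct shapes that must all be mapped to one character class.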