Digital Documents Recognition

Nicola Barbuti (University of Bari Aldo Moro, Italy) and Tommaso Caldarola (D.A.BI.MUS. Ltd – Digitalizzazione di Archivi, BIblioteche e MUSei, Italy)
Copyright: © 2015 | Pages: 11
DOI: 10.4018/978-1-4666-5888-2.ch379

Chapter Preview

Background

Currently, digital document recognition encompasses the following technologies:

  1. Optical Character Recognition (OCR)
  2. Intelligent Word Recognition (IWR)
  3. Intelligent Character Recognition (ICR)
  4. Pattern Matching

This section is organized according to the processing methodologies listed above and describes the features of each one.

Key Terms in this Chapter

Digital Recognition Technology: The analytical artificial intelligence technology that processes sequences of characters, whole words, or phrases to recognize the content of digital images.

Intelligent Character Recognition: The technology that recognizes and indexes the textual content of images of handwritten or printed documents through a training process based on segmentation and a self-learning approach.

Training Process: The process used to build the new ICR technology, based on self-learning of the content of digital documents. Starting from a previously defined segmentation process, it makes it possible to tune a set of parameters so as to optimize recognition performance and the quality of the results for each kind of document.

Semi-Automatic Self-Learning of Fonts: The key step of the new ICR technology, on which the whole process depends: errors or flaws at this stage may make everything that follows inaccurate. The process is iterative and incremental.

Graphic Matching: A technology based on a matching process that identifies graphic objects in digital images, such as regions or portions of them, using a shape-based approach.

Optical Character Recognition: The technology that electronically converts scanned images of printed text into machine-encoded text.

Intelligent Word Recognition: The technology that handles manuscripts and handwritten documents by segmenting the words within the image content.
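
The word segmentation mentioned in the Intelligent Word Recognition entry can be illustrated with a small sketch. It uses a vertical projection profile, one common way to split a binarized text line into words; the chapter preview does not say which segmentation method the authors use, so this is an assumption for illustration only. The function name `segment_words` and the parameter `min_gap` are hypothetical.

```python
from typing import List, Optional, Tuple


def segment_words(image: List[List[int]], min_gap: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, one per detected word.

    `image` is a binarized text line: rows of 0 (background) and 1 (ink).
    `min_gap` is a hypothetical tuning parameter: a run of blank columns at
    least this wide separates two words; narrower runs are treated as the
    spacing between letters of the same word.
    """
    if not image or not image[0]:
        return []
    width = len(image[0])
    # Vertical projection profile: amount of ink in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]

    words: List[Tuple[int, int]] = []
    start: Optional[int] = None     # first column of the word being built
    last_ink: Optional[int] = None  # most recent column that contained ink
    for x, ink in enumerate(profile):
        if ink == 0:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            # The blank run since the last inked column is wide enough
            # to count as a word boundary.
            words.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        words.append((start, last_ink + 1))
    return words


# Toy example: two "words" separated by three blank columns.
line = [
    [1, 1, 0, 1, 0, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0, 0, 0, 1, 0],
]
print(segment_words(line))  # [(0, 4), (7, 9)]
```

Real handwriting usually needs more than a fixed gap threshold, since letter and word spacing vary by writer; a parameter like `min_gap` is the kind of value a training process would tune for each kind of document.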
