A New Topological Method for Examining Historical Inscriptions

Loránd Lehel Tóth (Budapest University of Technology and Economics, Budapest, Hungary) and Gábor Hosszú (Budapest University of Technology and Economics, Budapest, Hungary)
Copyright: © 2019 | Pages: 16
DOI: 10.4018/JITR.2019040101

Abstract

The article presents a new method developed to increase the efficiency of the identification algorithm for historical inscriptions of unknown origin. The authors extracted topological properties of the symbols found on relics of different scripts and analyzed them using statistical tools. The topological properties considered include circular loops, oblique lines, vertical sections, and crossings, among others. The article describes how vectors of intersection counts between three circles and a grapheme can be used to identify unknown symbols. The number of intersections between each of the three circles and the examined symbol is stored in a feature vector. By supplementing the topological feature vectors with these circle vectors, the authors succeeded in improving the efficiency of the algorithm designed to decipher hard-to-read historical inscriptions.

Introduction

Humanities-based paleography is the science that deals with the reading of ancient writing systems (scripts). By contrast, computational paleography (in other words, engineering in paleography) is a branch of applied computer science that deals with the spatial analysis of various glyphs, models the evolution of different scripts (Hosszú & Kovács, 2016), and supports the deciphering of ancient inscriptions (Tóth et al., 2015), among other tasks. Computational paleography extends engineering modeling methods to any data of the written cultural heritage.

In this article, the basic written units of scripts are called graphemes, in other words characters. A grapheme is the smallest semantically (Sukkarieh et al., 2012) or phonetically distinguishing element in a writing system. Graphemes can be of different types: letters, ligatures, pictograms, ideographs, numbers, logograms, punctuation marks, etc. Graphemes have various properties. Each has a transliteration value written in angle brackets, one or more shapes called glyphs, one or more sound values written between slashes, and terms of usage. Among the glyphs, one is treated with the highest priority; it is usually called the typical glyph. A normalized (“ideal”) glyph is a designed typical glyph that reflects the most significant visual properties of a grapheme. These most significant visual properties are called the visual identity of the grapheme. The visual identity, the topology, the phonetic meaning, and the semantic usage of a grapheme constitute the components of a layered grapheme model developed by Pardede et al. (2016).
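The grapheme properties listed above can be pictured as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class names `Glyph` and `Grapheme`, their fields, and the sample values are hypothetical and are not taken from the layered grapheme model itself.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Glyph:
    """One shape of a grapheme (hypothetical structure)."""
    name: str
    topology: Dict[str, int]   # e.g. {"loops": 0, "crossings": 1}
    is_typical: bool = False   # True for the highest-priority glyph


@dataclass
class Grapheme:
    """A grapheme with the properties named in the text (illustrative only)."""
    transliteration: str       # shown in angle brackets
    glyphs: List[Glyph]        # one or more shapes
    sound_values: List[str]    # shown between slashes
    usage: str = ""            # terms of usage

    def label(self) -> str:
        return f"<{self.transliteration}>"

    def sounds(self) -> List[str]:
        return [f"/{s}/" for s in self.sound_values]

    def typical_glyph(self) -> Glyph:
        for glyph in self.glyphs:
            if glyph.is_typical:
                return glyph
        raise ValueError("no typical glyph is marked for this grapheme")


# Invented sample values, used only to exercise the structure.
example = Grapheme(
    transliteration="t",
    glyphs=[
        Glyph("cross", {"loops": 0, "crossings": 1}, is_typical=True),
        Glyph("slanted cross", {"loops": 0, "crossings": 1}),
    ],
    sound_values=["t"],
    usage="word-initial and word-medial",
)
print(example.label(), example.sounds(), example.typical_glyph().name)
```

Keeping the topology as a separate field of each glyph mirrors the idea that topology is one layer of the model, distinct from the phonetic and usage layers.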

The products of using a writing system are called inscriptions. Inscriptions, regardless of the writing technology applied (carving into a wall, writing on paper, etc.), are generally composed of symbols, which are the smallest individual units of an inscription from a visual perspective. Consequently, a symbol is the materialization of a particular glyph of a grapheme, and the glyph of a grapheme is the abstraction of a symbol (Hosszú, 2014). Symbols and glyphs are collectively called shapes.

The task of examining ancient inscriptions is largely a matter of mapping the symbols of the inscription to be deciphered to the grapheme set of a certain script. This problem differs from the goal of conventional optical character recognition (OCR). In OCR, it can be assumed that the normalized glyphs of the written symbols are well known, so the visual information found in the inscription only has to be matched to a known grapheme. When examining historical inscriptions, by contrast, the visual information has to be assigned to a grapheme whose typical glyph in the given age is itself not known. In the examinations performed, the visual information of the glyphs is described by a set of topological parameter values, suitably chosen by human intervention, and the resulting set of parameters is used as the input data for the identification procedure.
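To make the idea of a feature vector concrete, the following Python sketch combines hand-assigned topological counts with the three circle intersection counts mentioned in the abstract, and then picks the closest reference glyph. It is a minimal sketch under several assumptions that do not come from the article: symbols are approximated by polyline strokes, the three circles are concentric around the symbol center with radii 0.25, 0.5, and 0.75, and identification is a plain nearest-neighbor search with Euclidean distance. The actual placement of the circles and the actual identification procedure may differ.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Stroke = Sequence[Point]  # polyline approximating one stroke of a symbol


def segment_circle_crossings(p0: Point, p1: Point, center: Point, radius: float) -> int:
    """Count points where the segment p0 -> p1 meets the circle.

    The end point p1 is excluded so that a vertex shared by two consecutive
    segments of a stroke is not counted twice.
    """
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    fx, fy = p0[0] - center[0], p0[1] - center[1]
    a = dx * dx + dy * dy
    if a == 0.0:
        return 0  # degenerate segment
    b = 2.0 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return 0  # the supporting line misses the circle
    root = math.sqrt(disc)
    # A set merges the two roots when the segment is tangent to the circle.
    ts = {(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)}
    return sum(1 for t in ts if 0.0 <= t < 1.0)


def circle_vector(strokes: Sequence[Stroke], center: Point, radii: Sequence[float]) -> List[int]:
    """Intersection count of the whole symbol with each circle."""
    return [
        sum(segment_circle_crossings(s[i], s[i + 1], center, r)
            for s in strokes for i in range(len(s) - 1))
        for r in radii
    ]


CENTER: Point = (0.0, 0.0)   # assumed common center of the circles
RADII = (0.25, 0.5, 0.75)    # assumed radii, in normalized symbol units


def feature_vector(topological: Sequence[int], strokes: Sequence[Stroke]) -> List[float]:
    """Hand-assigned topological counts followed by the circle vector."""
    return [float(v) for v in topological] + [
        float(v) for v in circle_vector(strokes, CENTER, RADII)
    ]


def nearest_glyph(unknown: Sequence[float], references: Dict[str, List[float]]) -> str:
    """Name of the reference glyph with the smallest Euclidean distance."""
    return min(references, key=lambda name: math.dist(unknown, references[name]))


# Invented reference shapes; topological counts are (loops, crossings).
PLUS = [[(-1.0, 0.0), (1.0, 0.0)], [(0.0, -1.0), (0.0, 1.0)]]
BAR = [[(0.0, -1.0), (0.0, 1.0)]]
references = {
    "plus": feature_vector([0, 1], PLUS),
    "bar": feature_vector([0, 0], BAR),
}

# A slightly tilted cross standing in for an unknown symbol.
unknown_strokes = [[(-1.0, -0.1), (1.0, 0.1)], [(0.1, -1.0), (-0.1, 1.0)]]
best = nearest_glyph(feature_vector([0, 1], unknown_strokes), references)
print(best)
```

In this sketch the circle counts are computed from the geometry, while the topological counts are supplied by hand, which mirrors the human-chosen parameter values described above.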
