Designing High Accuracy Statistical Machine Translation for Sign Language Using Parallel Corpus: Case Study English and American Sign Language

Achraf Othman (Research Lab. LaTICE, University of Tunis, Tunis, Tunisia) and Mohamed Jemni (Research Lab. LaTICE, University of Tunis, Tunis, Tunisia)
Copyright: © 2019 | Pages: 25
DOI: 10.4018/JITR.2019040108

Abstract

In this article, the authors address the machine translation of written English text into sign language. They review existing systems and open issues in order to propose an implementation of statistical machine translation from written English text to American Sign Language (English/ASL) that takes several features of sign language into account. Owing to the lack of resources for sign language, the work proposes a novel approach for building an artificial corpus using grammatical dependency rules. The parallel corpus served as the input to the statistical machine translation system, which used it to create a statistical translation memory based on IBM alignment algorithms. These algorithms were enhanced and optimized by integrating Jaro–Winkler distances in order to shorten the training process. Subsequently, based on the constructed translation memory, a decoder was implemented for translating English text into ASL using a novel transcription system based on gloss annotation. The results were evaluated using the BLEU evaluation metric.

Introduction

We could easily exchange ideas, collaborate, and build strategies together if we could speak all languages. Failing that, we need a communication tool that makes such exchanges possible. Communication is the essence of human interaction. One of the most effective means of communication between human beings is language, which was born from human interaction. Communication and language are interdependent. However, this natural means of communication can become an obstacle when the languages involved differ. Nobody can ignore the obvious communication barrier between languages of different modalities, namely vocal and sign languages.

Sign Languages (SLs), used by deaf communities, whose members are deaf or hard of hearing, are visual-gestural languages. The message is transmitted through gestures and movements and received through the visual channel. Vocal languages, for their part, are audio-phonatory: their message is emitted through the phonatory channel and received through the auditory channel. The channels used by SLs are thus different from those typically used, and this distinguishes SLs from other languages. The linguistic knowledge acquired over many years of research on diverse vocal languages is difficult to transpose to SL. SL therefore became a new object of linguistic study, beginning in the 1960s. This research field is thus more recent, which explains why linguistic knowledge about SL is constantly evolving.

This paper concerns natural language processing (NLP) and focuses in particular on ASL. SLs are unusual in two respects. First, because they are realized through gestures and movements rather than sounds, they fall outside the traditional phonological framework and have no standard phonetic script, comparable to those of vocal languages, for transcribing their realizations. Second, they are multilinear: they use a common channel to convey several elements of information simultaneously, whereas the vocal apparatus can produce only one sound at a time. Language models for SLs therefore raise questions that differ from those raised by vocal languages.

Transcription is the operation that substitutes a grapheme, or a group of graphemes, of a writing system for every phoneme or sound. It thus depends on the target language, since a single phoneme can correspond to different graphemes depending on the language considered. In short, it is the writing down of words or spoken sentences in a given system. Transcription also aims to be lossless: ideally, it should be possible to reconstruct the original pronunciation from the transcription by knowing the transcription rules.

In light of the above, this paper addresses the transcription of SL and the implementation of a Statistical Machine Translation system, concentrating in particular on the machine translation of English text to ASL and vice versa. The study is organized around four axes:

  • Presenting an overview of SL and the state of the art in Sign Language Processing;

  • Proposing a novel transcription system for writing SL, called the XML-Gloss Annotation System;

  • Building an artificial parallel corpus between English and ASL using the XML-Gloss Annotation System;

  • Implementing a Statistical Machine Translation system between English and ASL based on the Artificial Parallel Corpus.
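
The abstract states that the IBM alignment algorithms were enhanced with Jaro–Winkler distances, but this preview does not describe how the metric is integrated. As background only, the following is a minimal sketch of the standard Jaro–Winkler similarity between two strings. The function names, the scaling factor `p = 0.1`, and the prefix limit of 4 are conventional choices assumed here, not details taken from the article.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if equal and within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2  # number of transpositions

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings sharing a common prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1.0 - j)


if __name__ == "__main__":
    # Hypothetical use: compare an English token with a lowercased gloss.
    print(round(jaro_winkler("reading", "READ".lower()), 4))
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
```

One plausible reason such a metric could help alignment is that an English word and its gloss often share a long common prefix, so a high similarity score can serve as a prior for candidate word pairs. That is an assumption about the general technique, not a description of the authors' method.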

WebSign Project

This work is part of the WebSign Project (Chabeb et al., 2008), developed within the Research Laboratory LaTICE of the University of Tunis. The main objective of this project is to establish a communication system that improves deaf people's access to information. The tool increases the autonomy of deaf people and does not require hearing people to acquire special skills to communicate with them. The project is multi-community: it offers the possibility of registering a sign in several languages, and it incorporates an interactive interface for creating dictionaries. Sign synthesis is based on a virtual avatar (Jemni et al., 2008). WebSign integrates a transcription system, Sign Modeling Language (SML). The synthesis kernel has enabled the development of several applications (Othman et al., 2010).
