Translation of Medical Texts using Neural Networks

Krzysztof Wolk (Polish-Japanese Academy of Information Technology, Warsaw, Poland) and Krzysztof P. Marasek (Polish-Japanese Academy of Information Technology, Warsaw, Poland)
Copyright: © 2016 | Pages: 16
DOI: 10.4018/IJRQEH.2016100104

Abstract

The quality of machine translation is rapidly evolving. Today one can find several machine translation systems on the web that provide reasonable translations, although the systems are not perfect. In some specific domains, the quality may decrease. A recently proposed approach in this field is neural machine translation. It aims at building a single, jointly tuned neural network that maximizes translation performance, a very different approach from traditional statistical machine translation. Recently proposed neural machine translation models often belong to the encoder-decoder family, in which a source sentence is encoded into a fixed-length vector that is, in turn, decoded to generate a translation. The present research examines the effects of different training methods on a Polish-English machine translation system used for medical data. The European Medicines Agency parallel text corpus was used as the basis for training both statistical and neural network-based translation systems. The implementation of a medical translator and a comparison of these approaches are the main focus of our experiments.
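The encoder-decoder idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the system evaluated in the article: the class name, the hidden size, the toy vocabularies, and the random, untrained weights are all assumptions made for illustration, so the decoded words are meaningless. The point is only that sentences of any length are compressed into a vector of the same fixed size before decoding starts.

```python
import math
import random

HIDDEN = 8  # assumed size of the fixed-length sentence vector


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyEncoderDecoder:
    """Hypothetical, untrained recurrent encoder-decoder (illustration only)."""

    def __init__(self, src_vocab, tgt_vocab, seed=0):
        rng = random.Random(seed)
        self.tgt_vocab = tgt_vocab  # must contain the end marker "</s>"
        self.src_emb = {w: make_matrix(1, HIDDEN, rng)[0] for w in src_vocab}
        self.tgt_emb = make_matrix(len(tgt_vocab), HIDDEN, rng)
        self.w_enc = make_matrix(HIDDEN, HIDDEN, rng)
        self.w_dec = make_matrix(HIDDEN, HIDDEN, rng)
        self.w_out = make_matrix(len(tgt_vocab), HIDDEN, rng)

    def encode(self, tokens):
        """Fold the whole source sentence into one vector of length HIDDEN."""
        h = [0.0] * HIDDEN
        for tok in tokens:
            x = self.src_emb[tok]
            h = [math.tanh(a + b) for a, b in zip(matvec(self.w_enc, h), x)]
        return h

    def decode(self, h, max_len=10):
        """Greedily emit target words from the sentence vector."""
        output = []
        for _ in range(max_len):
            scores = matvec(self.w_out, h)
            best = max(range(len(scores)), key=scores.__getitem__)
            if self.tgt_vocab[best] == "</s>":
                break
            output.append(self.tgt_vocab[best])
            # Feed the chosen word back into the decoder state.
            fed = self.tgt_emb[best]
            h = [math.tanh(a + b) for a, b in zip(matvec(self.w_dec, h), fed)]
        return output


model = ToyEncoderDecoder(
    ["lek", "dawka", "pacjent"], ["drug", "dose", "patient", "</s>"]
)
short_vec = model.encode(["lek"])
long_vec = model.encode(["pacjent", "lek", "dawka", "lek"])
translation = model.decode(long_vec)
```

Both `short_vec` and `long_vec` have exactly `HIDDEN` entries, which is the fixed-length bottleneck the abstract refers to. A real system would learn all of the weights jointly from a parallel corpus.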

Introduction

Machine translation (MT) is the translation of text by a computer with no human assistance. MT systems have no knowledge of language rules. Instead, they translate by analyzing large amounts of text in language pairs. They can be trained for specific domains or applications using additional data germane to a selected domain. MT systems typically deliver translations that sound fluent, although they tend to be less consistent than human translations. Statistical machine translation (SMT) has rapidly evolved in recent years. However, existing SMT systems are far from perfect, and their quality decreases significantly in specific domains. For instance, Google Translate would work well for general texts such as dialogs or traveler aids but would fail when translating medical or legal terms. To show the contrast, we will compare our results with those of the Google Translate engine. The scientific community has been engaged in SMT research. Among the greatest advantages of statistical machine translation is that perfect translations are not required for most applications (Koehn, 2007). Users are generally interested in obtaining a rough idea of a text’s topic or what it means. However, some applications require much more than this. For example, the beauty and correctness of writing may not be important in the medical field, but the adequacy and precision of the translated message are very important. A communication or translation error between a patient and a physician in regard to a diagnosis may have serious consequences for the patient’s health. Progress in SMT research has recently slowed down. As a result, new translation methods are needed. Neural networks provide a promising approach for translation (Costa-jussà, 2012) due to their rapid progress in terms of methodology and computational power. They also offer an opportunity to overcome the limits of statistical methods, which are not context-aware.

Machine translation has been applied to the medical domain owing to the recent growth of interest in, and success of, language technologies. As an example, a study was done on local and national public health websites in the USA with an analysis of the feasibility of edited machine translations for health promotional documents (Kirchhoff, 2011). It was previously assumed that machine translation was not able to deliver high-quality documents that can be used for official purposes. However, language technologies have been steadily advancing in quality. In the not-too-distant future, it is expected that machine translation will be capable of translating any text in any domain at the required quality.

The medical data field is narrow but highly relevant, and it is a promising research area for language technologies. Medical records can be translated by machine translation systems. Access to translations of a foreign patient’s medical data might even save their life. Direct speech-to-speech translation systems are also possible. An automated speech recognition (ASR) system can be used to recognize a foreign patient’s speech. Once recognized, the speech could be translated into another language and synthesized in real time. As an example, the EU-BRIDGE project intends to develop automatic transcription and translation technology. The project aims to provide innovative multimedia translation services for audiovisual materials between European and non-European languages [http://www.eu-bridge.eu].
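The speech-to-speech scenario described above chains three stages: recognition, translation, and synthesis. The sketch below shows only that chaining. The function names are hypothetical, and the three stub stages are stand-ins (text bytes instead of audio, a three-word dictionary instead of a trained model), not components of EU-BRIDGE or of the system studied here.

```python
from typing import Callable

# Toy Polish-English lexicon, assumed purely for illustration.
TOY_LEXICON = {"pacjent": "patient", "lek": "drug", "dawka": "dose"}


def speech_to_speech(
    audio: bytes,
    recognize: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run the three stages in order: ASR, then MT, then speech synthesis."""
    source_text = recognize(audio)
    target_text = translate(source_text)
    return synthesize(target_text)


def stub_recognize(audio: bytes) -> str:
    """Stand-in for ASR: pretends the audio is already UTF-8 text."""
    return audio.decode("utf-8")


def stub_translate(text: str) -> str:
    """Stand-in for MT: word-by-word lookup, unknown words pass through."""
    return " ".join(TOY_LEXICON.get(word, word) for word in text.split())


def stub_synthesize(text: str) -> bytes:
    """Stand-in for speech synthesis: returns the text as bytes."""
    return text.encode("utf-8")


result = speech_to_speech(
    b"pacjent lek", stub_recognize, stub_translate, stub_synthesize
)
```

Passing the stages in as functions keeps each one replaceable, so a statistical or neural translator could be substituted for `stub_translate` without changing the rest of the pipeline.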

Making medical information understandable is relevant to both physicians and patients (Dušek, 2014). As an example, Healthcare Technologies for the World Traveler emphasizes that a foreign patient may need a description and explanation of their diagnosis, along with a related and comprehensive set of information. In most countries, residents and immigrants communicate in languages other than the official one (Worldwide, 1998-2015).
