Development of a Text-to-Speech Scanner for Visually Impaired People

Minerva Sarma (A. D. Y. Patil University, India), Anuskha Kumar (A. D. Y. Patil University, India), Aditi Joshi (A. D. Y. Patil University, India), Suraj Kumar Nayak (National Institute of Technology Rourkela, India) and Biswajeet Champaty (A. D. Y. Patil University, India)
This chapter proposes a low-cost, efficient, real-time wearable text-to-speech scanner that enables blind persons to hear the contents of printed text. The device captures images of the text and converts them to speech. The hardware is built from a Raspberry Pi 3, a Pi camera, and an earphone. Optical character recognition (OCR) and text-to-speech synthesis (TTS) are both implemented on the Raspberry Pi 3. OCR converts the captured text images to editable text, whereas TTS scans the alphanumeric characters recognized in the processed image and converts them to speech. The proposed technology imitates the human sensory organs and nervous system: the camera mimics the human eye, and the image processing on the Raspberry Pi 3 substitutes for the human brain. The device can also help people with conditions such as dyslexia and nyctalopia (the inability to see in dim light or at night).
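The capture, OCR, and TTS stages described above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the chapter's implementation: the chapter does not name the OCR or TTS software used, so the function names `clean_ocr_text`, `read_aloud`, and `make_pi_stages`, the image path, and the choice of the `picamera`, `pytesseract`, `Pillow`, and `pyttsx3` libraries are all hypothetical. The text-cleaning step between OCR and TTS is likewise an added assumption.

```python
import re
from typing import Any, Callable


def clean_ocr_text(raw: str) -> str:
    """Tidy raw OCR output so a TTS engine can read it as fluent sentences."""
    # Heuristic: re-join words hyphenated across a line break ("scan-\nner").
    # This also merges genuine compounds that happen to split at the hyphen.
    text = re.sub(r"-\s*\n\s*", "", raw)
    # Collapse newlines, tabs, and runs of spaces into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def read_aloud(
    capture: Callable[[], Any],
    ocr: Callable[[Any], str],
    speak: Callable[[str], None],
) -> str:
    """Run one capture -> OCR -> TTS pass and return the text that was spoken."""
    image = capture()
    text = clean_ocr_text(ocr(image))
    if text:  # stay silent when no characters were recognized
        speak(text)
    return text


def make_pi_stages(image_path: str = "/tmp/page.jpg"):
    """Hypothetical wiring for a Raspberry Pi; the libraries are assumptions."""
    # Imported lazily so read_aloud can be used without this hardware stack.
    import picamera
    import pytesseract
    import pyttsx3
    from PIL import Image

    engine = pyttsx3.init()

    def capture() -> str:
        # Take a still image with the Pi camera and return its file path.
        with picamera.PiCamera() as camera:
            camera.capture(image_path)
        return image_path

    def ocr(path: str) -> str:
        # Convert the captured image to editable text.
        return pytesseract.image_to_string(Image.open(path))

    def speak(text: str) -> None:
        # Queue the text and block until it has been spoken.
        engine.say(text)
        engine.runAndWait()

    return capture, ocr, speak
```

On a device with those libraries installed, one pass would be `read_aloud(*make_pi_stages())`. Passing the three stages in as callables keeps the pipeline independent of any particular camera, OCR engine, or speech synthesizer, so it can be exercised without the hardware.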
The reader pen scanner of Akhlagi et al. (2003) is a major technological advance for learning the alphabet and is highly useful for people with reading difficulties such as dyslexia. The device is portable and pocket-sized. It has a built-in dictionary and translates text from Spanish to English and vice versa. The user passes the nib across a line of text, and the pen reads the words aloud. The main drawback of this device is that it is not suitable for visually impaired persons (Akhlagi, Lonn, & Wittrup, 2003). Sung Wook Park (2008) developed the Voice Stick, a text-scanning device that converts text to voice. The Voice Stick is shaped like a wand; when it is passed over printed text, its OCR recognizes the content and converts it to voice. The limitation of the device is that it is not user-friendly, especially for blind people, because it has to be in contact with the text material, and placing it horizontally and accurately on every line is not an easy task (Park, 2008).

