Phoenix Precision Algorithm for Blind People With Enhanced Voice Assistant

Judy Flavia B. (SRM Institute of Science and Technology, India), S. Sridevi (SRM Institute of Science and Technology, India), V. Srivathsan (SRM Institute of Science and Technology, India), Aravindak Kumar R. K. (SRM Institute of Science and Technology, India), Ashwin Kumar M. K. (SRM Institute of Science and Technology, India), S. Rubin Bose (SRM Institute of Science and Technology, India), and R. Regin (SRM Institute of Science and Technology, India)
DOI: 10.4018/979-8-3693-0502-7.ch016

Abstract

The chapter presents an innovative approach to object detection, the phoenix precision algorithm, which combines the advantages of the DETR (DEtection TRansformer) and RetinaNet models. Object detection is a basic computer vision task: identifying and locating objects in an image. The DETR model revolutionized object detection by introducing a transformer-based architecture that eliminates the need for anchor boxes and non-maximum suppression while achieving state-of-the-art performance. RetinaNet, on the other hand, is a popular single-stage object detection model known for its efficiency and accuracy. This chapter proposes a hybrid model that uses both DETR and RetinaNet. The transformer-based architecture of DETR provides an excellent understanding of the overall context, allowing the model to capture long-range dependencies and the relationships between objects. Meanwhile, RetinaNet's feature pyramid network (FPN) and focal loss enable precise localization and handling of objects at different scales.
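As a rough illustration of the loss function behind RetinaNet mentioned in the abstract, the sketch below computes the binary focal loss for a single prediction. It is a minimal sketch and not the chapter's implementation: the function name `focal_loss` is hypothetical, and the defaults `alpha=0.25` and `gamma=2.0` are commonly used values that the chapter preview does not confirm.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction (illustrative sketch).

    p     -- predicted probability of the positive class, 0 < p < 1
    y     -- ground-truth label, 1 (object) or 0 (background)
    alpha -- class-balancing weight (assumed default, not from the chapter)
    gamma -- focusing parameter (assumed default, not from the chapter)
    """
    # p_t is the probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t) ** gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction contributes far less than a badly wrong one.
easy = focal_loss(0.95, 1)
hard = focal_loss(0.10, 1)
print(f"easy example loss: {easy:.6f}")
print(f"hard example loss: {hard:.6f}")
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is why the focusing parameter is what lets training concentrate on hard examples.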

Introduction

Visual impairment is a major challenge affecting millions of people worldwide. According to the World Health Organization, more than 285 million people are visually impaired, of whom 39 million are completely blind (Abdullahi et al., 2023). The lack of visual cues makes daily activities difficult, including recognizing objects, navigating unfamiliar environments, and performing routine tasks (Anand et al., 2023). These difficulties can significantly limit the autonomy and quality of life of people with visual impairments (Angeline et al., 2023).

Recent advances in computer vision have enabled the development of assistive devices that can increase the independence and quality of life of visually impaired people (Kanyimama, 2023). One such technology is object recognition, which allows users to recognize and interact with objects in their environment (Arslan et al., 2021). Object detection algorithms use image processing techniques to identify objects in an image and locate them with bounding boxes (Aryal et al., 2022). These algorithms can detect a wide range of objects, such as vehicles, pedestrians, street signs, and obstacles. Several object detection algorithms are available in the literature, including YOLO, Faster R-CNN, and RetinaNet (Bansal et al., 2023). These algorithms have successfully improved object detection accuracy and reduced computation time (Bansal et al., 2022). However, they rely solely on visual cues, which may not be enough for visually impaired people, who rely on other senses to perceive the world around them (Das et al., 2022) (Fig. 1).

Figure 1. Phoenix precision backbone architecture

In recent years, there has been growing interest in developing object detection algorithms that can help visually impaired people (Bhardwaj et al., 2023). These algorithms use other senses, such as hearing, touch, and smell, to enhance the user’s ability to detect and recognize objects (Gunturu et al., 2023). Time-Delayed Acoustic Feedback (TDAF) is one such technique, in which the acoustic feedback of the user’s voice or other sounds is delayed. This technique could help blind users be more aware of their surroundings and recognize objects through audible cues (Shifat et al., 2023). By providing auditory feedback to the user, TDAF can greatly improve object recognition for the visually impaired (Chaturvedi et al., 2022). The DEtection TRansformer (DETR) is a recent advance in object detection that has shown promise for improving detection accuracy while reducing computation time (Cirillo et al., 2023). This algorithm uses a transformer-based architecture that can simultaneously detect and classify objects, making it an effective object detection tool (Uthiramoorthy et al., 2023). Combining DETR and TDAF can provide blind users with an efficient and accurate object recognition system using audio and visual cues (Gaayathri et al., 2023). RetinaNet is another object detection algorithm that has attracted attention due to its high accuracy and ability to handle objects at multiple scales (Devi & Rajasekaran, 2023; Srivastava & Roychoudhury, 2021). The algorithm is designed to improve object detection when objects of different sizes are in the same frame (Goswami et al., 2022). Combining RetinaNet with DETR could further improve the accuracy of object (obstacle) recognition for the blind (Kosuru & Venkitaraman, 2022; Uike et al., 2022). This chapter proposes an innovative approach to object recognition for the blind, combining DETR, TDAF, and RetinaNet (Sivapriya et al., 2023).
The proposed system uses DETR to recognize and classify objects in an image, while RetinaNet is used to enhance the accuracy of the object detection process (Jeba et al., 2023). TDAF provides auditory feedback that improves users’ ability to detect the objects and obstacles they encounter in their day-to-day lives (Kumar Jain, 2022; Venkitaraman & Kosuru, 2023). The proposed system is evaluated using real-world scenarios and shows promising results in detecting and classifying objects accurately and efficiently (Krishna Das et al., 2022; Srivastava & Roychoudhury, 2020) (Fig. 2).
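The flow described above, two detectors feeding an audio channel, can be sketched in a few lines. This is an illustration under stated assumptions and not the authors' method: the preview does not say how the DETR and RetinaNet outputs are combined, so the overlap-based fusion rule, the names `Detection`, `fuse`, `describe`, and `announcements`, the thresholds, and the sample detections are all hypothetical. The resulting strings stand in for what would be handed to a speech engine, which is not included here.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    score: float
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


def iou(a, b):
    """Intersection over union of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(detr_dets, retina_dets, iou_thr=0.5):
    """Assumed fusion rule: when both models report the same label with
    overlapping boxes, keep the higher-scoring one; keep unmatched detections."""
    fused, used = [], set()
    for d in detr_dets:
        best_j, best_iou = None, iou_thr
        for j, r in enumerate(retina_dets):
            if j in used or r.label != d.label:
                continue
            overlap = iou(d.box, r.box)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is None:
            fused.append(d)
        else:
            used.add(best_j)
            r = retina_dets[best_j]
            fused.append(d if d.score >= r.score else r)
    fused.extend(r for j, r in enumerate(retina_dets) if j not in used)
    return fused


def describe(det, image_width):
    """Turn a detection into a short phrase suitable for speech output."""
    center_x = (det.box[0] + det.box[2]) / 2
    if center_x < image_width / 3:
        side = "on the left"
    elif center_x > 2 * image_width / 3:
        side = "on the right"
    else:
        side = "ahead"
    return f"{det.label} {side}"


def announcements(dets, image_width, min_score=0.5):
    """Phrases for confident detections, most confident first."""
    ranked = sorted((d for d in dets if d.score >= min_score),
                    key=lambda d: d.score, reverse=True)
    return [describe(d, image_width) for d in ranked]


# Made-up detections for a 640-pixel-wide frame.
detr = [Detection("chair", 0.90, (100, 200, 300, 400)),
        Detection("door", 0.60, (500, 50, 620, 400))]
retina = [Detection("chair", 0.80, (110, 210, 310, 410)),
          Detection("person", 0.70, (20, 100, 120, 400))]

fused = fuse(detr, retina)
messages = announcements(fused, image_width=640)
print(messages)
```

Keeping the fusion step separate from the phrase generation means either detector, or the merging rule itself, could be swapped without touching the audio side.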
