Recent Advances in Feature Extraction and Description Algorithms: Hardware Designs and Algorithmic Derivatives

Ehab Najeh Salahat (Australian National University, Australia) and Murad Qasaimeh (Iowa State University, USA)
DOI: 10.4018/978-1-5225-2848-7.ch016


Computer vision is one of the most active research fields in technology today. Giving machines the ability to see and comprehend the world at the speed of sight creates endless applications and opportunities. Feature detection and description algorithms are considered the retina of machine vision. However, most of these algorithms are computationally intensive, which prevents them from achieving real-time performance. As such, embedded vision accelerators (FPGAs, ASICs, etc.) are attractive targets because of their inherent parallelism. This chapter provides a comprehensive study of some of the recent feature detection and description algorithms and their hardware solutions. Specifically, it begins with a synopsis of basic concepts followed by a comparative study, from which the maximally stable extremal regions (MSER) and the scale-invariant feature transform (SIFT) algorithms are selected for further analysis due to their robust performance. The chapter then reports some of their recent algorithmic derivatives and highlights their recent hardware designs and architectures.

1. Introduction

Feature analysis (e.g., extraction and description) in static and dynamic environments is an active area of research, particularly in the robotics and computer vision communities. It aims primarily at object detection, recognition, and tracking from a stream of frames, and at describing the semantics of the objects’ behavior (Hu, Tan, Wang, & Maybank, 2004). It also has a wide spectrum of promising applications, both governmental and commercial, which include, but are not limited to, access control to sensitive areas, population and crowd flux statistics, human detection and recognition, traffic analysis, detection of anomalous behaviors, vehicular tracking, drones, and detection of military targets.

Given the remarkable increase in the amount of homogeneous and inhomogeneous visual input (partly due to (1) the availability of cheap capturing devices, such as the built-in cameras in smartphones, and (2) the availability of free image hosting websites and servers), the need for novel, robust, and automated feature analysis algorithms and platforms that adapt to the application’s needs and requirements is of paramount importance. Current research tends to merge multiple disciplines, such as digital signal/image processing, pattern recognition and classification, machine learning, and circuit design.

Moreover, as is the case with many computer vision applications, there is a need for high-performance algorithms and processing platforms that are capable of supporting real-time applications. The required intensive computations (e.g., real-time feature detection and extraction from high-definition video streams or high-resolution satellite imagery) also need massive processing capabilities. Digital Signal Processors (DSPs), Field Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Systems-on-Chip (SoCs), and Graphics Processing Units (GPUs) with smarter, parallelizable, and pipelinable hardware processing designs could be targeted to alleviate this issue. However, hardware constraints (e.g., memory, power, scalability, format interfacing, etc.) constitute a major bottleneck. The typical solution to these hardware-related issues is to scale down the resolution or to trade off the accuracy and performance of the application. The state of the art in computer vision has also confirmed that it is the processing algorithms that will make a substantial contribution to resolving these issues (Liu, Chen, & Naoyuki Kubota, 2013; Wang, Tao, Di, Ye, & Shi, 2012). That is, processing algorithms might be targeted to overcome most of the issues associated with power- and memory-hungry hardware requirements, and might yield breakthroughs (Ngo, Ives, Rakvic, & Broussard, 2013). The challenge now, however, is to devise, implement, and deploy these new (enhanced) algorithms, which mainly fall into the feature detection and description category and are the fundamental tools for many visual computing applications.

To ensure their robustness, vision algorithms must be designed to cover a wide range of possible scenarios with a high level of repeatability and affinity. Studying all of the possible scenarios is virtually impossible, yet a clear understanding of the variables involved is critical for a successful design. Key factors influencing real-time performance include the processing platform (and the associated constraints on memory, power, and frequency in FPGAs, SoCs, GPUs, etc., which may force algorithmic changes that impact performance), the monitored environment (illumination, reflections, shadows, view orientation and angle), and the application of interest (targets of interest, tolerable missed-detection and false-alarm rates, the desired tradeoffs, and allowed latency). Consequently, a careful study and analysis of potential computer vision algorithms is essential.
