Detection of Small Oranges Using YOLO v3 Feature Pyramid Mechanism

Francisco de Castro, Angelin Gladston
Copyright: © 2021 | Pages: 15
DOI: 10.4018/IJNCR.2021100102

Abstract

Existing approaches to fruit detection have difficulty detecting small fruits and achieve low overall detection accuracy. Many detectors handle small fruits poorly because fruit data sets are small and are not sufficient to train earlier YOLO models. Further, the models used in fruit detection are initialized from a pre-trained model and then fine-tuned on fruit data sets. The pre-trained model was trained on the ImageNet data set, whose objects are larger in scale than the fruits in fruit images. Because fruit detection is a fundamental task for automatic yield estimation, the goal is to detect all the fruits in an image. YOLO-V3 uses multi-scale prediction to detect the final target, and its network structure is more complex than that of earlier YOLO models. Thus, in this work, YOLO-V3 is used to predict bounding boxes at different scales, which makes it more effective for detecting small targets. The feature pyramid mechanism integrates multi-scale feature information to improve detection accuracy.

1. Introduction

Agriculture is one of the largest economic sectors and plays a vital role in the global economy. Consequently, the world’s agricultural productivity has to increase sustainably and become less dependent on human labour. With the development of modern agricultural robots, more and more manual work in the field has been taken over by robots, including weeding (Lottes et al., 2018), pesticide spraying (Berenstein et al., 2018), plant growth monitoring (Das et al., 2015), yield estimation (Bargoti et al., 2017) and fruit harvesting (Vitzrabin et al., 2016). For yield estimation, planters sample one fruit field at a time in order to estimate the total yield. A traditional way of estimating fruit yield is to take pictures of the fruit trees and then count the fruits in the pictures. However, such a method is inefficient and not scalable. At present, with the help of modern agricultural robots, fruit images can be collected remotely and conveniently before the collected data are processed by computer vision. Vision-based fruit detection is a critical component of in-field automation in agriculture. With accurate knowledge of individual fruit locations in the field, it is possible to perform yield estimation and mapping, which is important for growers because it facilitates efficient utilisation of resources and improves returns per unit area and time.

Precise localisation of the fruit is also a necessary component of an automated robotic harvesting system, which can help mitigate one of the most labour-intensive tasks in an orchard. Typically, prior work uses hand-engineered features to encode visual attributes that discriminate fruit from non-fruit regions. Although these approaches are well suited to the dataset they were designed for, the feature encoding is generally specific to a particular fruit and to the conditions under which the data were captured. More recently, advances in the computer vision community have carried over to agrovision, that is, computer vision in agriculture, achieving state-of-the-art results with Deep Neural Networks (DNNs) for object detection and semantic image segmentation. These networks avoid the need for hand-engineered features by automatically learning feature representations that discriminatively capture the data distribution. Deep neural network based detectors have been demonstrated to be effective for fruit detection.

YOLO is widely used for general object detection. The method comprises a single deep convolutional neural network, originally a version of GoogLeNet and later updated to a network referred to as DarkNet, which is based on VGG. The network splits the input into a grid of cells, and each cell directly predicts a bounding box and performs object classification. This results in a large number of candidate bounding boxes, which are then consolidated into a final object prediction by a post-processing step.
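The grid-of-cells idea can be illustrated with a minimal sketch that finds the cell containing the centre of an object. The function name `responsible_cell`, the 416 x 416 input size and the 13 x 13 grid are illustrative assumptions, not values taken from this article.

```python
def responsible_cell(cx, cy, img_w, img_h, grid_size):
    """Return (row, col) of the grid cell that contains the box centre.

    cx, cy       : box centre in pixels (hypothetical inputs)
    img_w, img_h : image width and height in pixels
    grid_size    : number of cells along each side of the grid
    """
    # Scale the centre to grid units and truncate to a cell index.
    col = int(cx / img_w * grid_size)
    row = int(cy / img_h * grid_size)
    # A centre lying exactly on the right or bottom edge is clamped
    # to the last cell so the index stays inside the grid.
    col = min(col, grid_size - 1)
    row = min(row, grid_size - 1)
    return row, col


# Assumed example: a 416 x 416 image divided into a 13 x 13 grid.
print(responsible_cell(100, 300, 416, 416, 13))
```

In this sketch each object is tied to exactly one cell per grid, so a finer grid gives small, closely spaced fruits a better chance of falling into separate cells.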

YOLO generally achieves high accuracy while also being able to run in real time. The algorithm “only looks once” at the image, in the sense that it requires only one forward propagation pass through the neural network to make its predictions. After non-max suppression, which ensures that each object is detected only once, it outputs all recognized objects together with their bounding boxes. YOLO treats object detection as a regression problem: it predicts bounding box coordinates and class probabilities directly from the image, as opposed to previous algorithms that repurpose classifiers for detection (Du et al., 2021).
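Non-max suppression can be sketched in a few lines of plain Python. This is a generic greedy variant under stated assumptions, not necessarily the exact procedure used by the authors: boxes are given as `(x1, y1, x2, y2)` corner tuples, and the 0.5 overlap threshold is an illustrative default.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return the indices of the boxes that are kept."""
    # Visit candidates from the highest to the lowest confidence score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too strongly.
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Two heavily overlapping candidates and one separate candidate.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))
```

For small fruits that grow in clusters, the threshold is a trade-off: a low value may suppress genuinely distinct neighbouring fruits, while a high value may leave duplicate detections.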

The objectives of this paper are to use YOLOv3 for small orange detection; to utilize non-max suppression for better small orange detection with improved accuracy; to provide an overall architecture for the orange fruit detection system; and to use the Darknet53 classifier for feature extraction together with multi-scale prediction based on different sampling layers for enhanced small fruit detection.
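To make the multi-scale prediction concrete, the sketch below computes the grid resolution at each prediction scale. The 416 x 416 input size, the strides of 32, 16 and 8, and the three anchor boxes per cell follow the commonly published YOLOv3 configuration; they are assumptions here, since this preview does not state the settings used by the authors.

```python
def yolo_v3_grids(input_size=416, strides=(32, 16, 8), anchors_per_cell=3):
    """Return the grid size at each scale and the total candidate box count.

    All default values are assumptions based on the standard YOLOv3 setup.
    """
    # Each scale downsamples the input by its stride.
    grids = [input_size // s for s in strides]
    # Every cell at every scale predicts `anchors_per_cell` boxes.
    total_boxes = sum(g * g * anchors_per_cell for g in grids)
    return grids, total_boxes


grids, total_boxes = yolo_v3_grids()
print(grids, total_boxes)
```

Under these assumptions the finest grid has cells covering only 8 x 8 pixels, which is why the highest-resolution scale is the one best suited to small targets such as distant oranges.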
