FPN-Based Small Orange Fruit Detection From Farm Images With Occlusion

Francisco de Castro, Angelin Gladston
Copyright © 2022 | Pages: 12
DOI: 10.4018/IJKBO.296394

Abstract

Fruit detection using deep learning is yielding very good performance; the goal of this work is to detect small fruits in images under occlusion and overlapping conditions. Overlap among fruits and occlusion by foliage can lead to false and missed detections, which decrease the accuracy and generalization ability of a model. Therefore, a small orange fruit recognition method based on an improved Feature Pyramid Network was developed. First, multi-scale feature fusion was used to fuse detailed bottom-level features with high-level semantic features to improve the recognition rate of small-sized oranges. Then, repulsion loss was used in place of the original smooth L1 loss function. In addition, soft non-maximum suppression (Soft-NMS) was adopted in place of non-maximum suppression to screen the bounding boxes of oranges. Finally, the network was trained and verified on the collected image data set. The results showed that, compared with traditional detection models, the mean average precision improved from 79.7% to 82.8%.
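
Of the modifications described above, the Soft-NMS screening step lends itself to a compact illustration. Below is a minimal NumPy sketch of the Gaussian Soft-NMS variant, which decays the scores of overlapping boxes instead of discarding them outright; the sigma and score-threshold values are illustrative defaults, not parameters reported by the authors.

import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS sketch (illustrative, not the authors' code).

    boxes:  (N, 4) array of [x1, y1, x2, y2]
    scores: (N,) array of detection confidences
    Returns indices of kept boxes, highest score first.
    """
    boxes = boxes.astype(np.float64)
    scores = scores.astype(np.float64).copy()
    idxs = np.arange(len(scores))
    keep = []
    while len(idxs) > 0:
        # Select the remaining box with the highest score.
        top = np.argmax(scores[idxs])
        cur = idxs[top]
        keep.append(cur)
        idxs = np.delete(idxs, top)
        if len(idxs) == 0:
            break
        # IoU of the selected box with all remaining boxes.
        x1 = np.maximum(boxes[cur, 0], boxes[idxs, 0])
        y1 = np.maximum(boxes[cur, 1], boxes[idxs, 1])
        x2 = np.minimum(boxes[cur, 2], boxes[idxs, 2])
        y2 = np.minimum(boxes[cur, 3], boxes[idxs, 3])
        inter = np.maximum(0.0, x2 - x1) * np.maximum(0.0, y2 - y1)
        area_cur = (boxes[cur, 2] - boxes[cur, 0]) * (boxes[cur, 3] - boxes[cur, 1])
        area_rest = (boxes[idxs, 2] - boxes[idxs, 0]) * (boxes[idxs, 3] - boxes[idxs, 1])
        iou = inter / (area_cur + area_rest - inter)
        # Gaussian decay: heavily overlapped boxes keep a reduced score
        # rather than being suppressed to zero, which helps clustered fruit.
        scores[idxs] *= np.exp(-(iou ** 2) / sigma)
        idxs = idxs[scores[idxs] > score_thresh]
    return np.array(keep)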

1. Introduction

Correct localization of the fruit is a necessary step, especially when trying to locate small oranges in an orchard image (Bargoti et al., 2017). Typically, prior work utilizes hand-engineered features to encode visual attributes that discriminate fruit from non-fruit regions. Although these approaches are well suited to the dataset they are designed for, feature encoding is generally unique to a specific fruit and to the conditions under which the data were captured. More recently, advances in the computer vision community have translated to computer vision in agriculture (Lottes et al., 2018), achieving state-of-the-art results with the use of deep neural networks for object detection and semantic image segmentation. These networks avoid the need for hand-engineered features by automatically learning feature representations that discriminatively capture the data distribution. Deep neural network based detectors have been demonstrated to be effective for fruit detection (Berenstein et al., 2018).

Mask R-CNN extends Faster R-CNN by adding a branch for predicting segmentation masks on each Region of Interest (RoI), in parallel with the existing branch for classification and bounding box regression. The mask branch is a small fully convolutional network (FCN) applied to each RoI, predicting a segmentation mask in a pixel-to-pixel manner. Mask R-CNN is simple to implement and train given the Faster R-CNN framework, which facilitates a wide range of flexible architecture designs. Additionally, the mask branch adds only a small computational overhead, enabling a fast system and rapid experimentation. In principle, Mask R-CNN is an intuitive extension of Faster R-CNN, yet constructing the mask branch properly is critical for good results. Most importantly, Faster R-CNN was not designed for pixel-to-pixel alignment between network inputs and outputs. This is most evident in how RoIPool, the de facto core operation for attending to instances, performs coarse spatial quantization for feature extraction.
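
As an illustration, the following is a minimal PyTorch sketch of such a per-RoI mask branch: a small FCN followed by a transposed convolution that upsamples the prediction. The channel sizes and the 14x14-to-28x28 resolutions follow a configuration commonly used with Mask R-CNN, and the class name MaskHead is hypothetical.

import torch
import torch.nn as nn

class MaskHead(nn.Module):
    """Sketch of a Mask R-CNN-style mask branch: a small FCN mapping each
    RoI feature (e.g. 14x14 from RoIAlign) to per-class binary masks
    (e.g. 28x28). Sizes are illustrative, not the authors' exact config."""

    def __init__(self, in_channels=256, num_classes=2):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Transposed convolution upsamples 14x14 -> 28x28.
        self.deconv = nn.ConvTranspose2d(256, 256, 2, stride=2)
        self.relu = nn.ReLU(inplace=True)
        # One mask per class; a per-pixel sigmoid is applied at loss time,
        # so classes do not compete (decoupled mask / class prediction).
        self.predictor = nn.Conv2d(256, num_classes, 1)

    def forward(self, roi_features):      # (num_rois, C, 14, 14)
        x = self.convs(roi_features)
        x = self.relu(self.deconv(x))     # (num_rois, 256, 28, 28)
        return self.predictor(x)          # (num_rois, num_classes, 28, 28)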

To fix the misalignment, a simple quantization-free layer called RoIAlign is proposed, which faithfully preserves exact spatial locations. Despite being a seemingly minor change, RoIAlign has a large impact: it improves mask accuracy by a relative 10% to 50%, with bigger gains under stricter localization metrics. Second, it was found essential to decouple mask and class prediction: a binary mask is predicted for each class independently, without competition among classes, relying on the network's RoI classification branch to predict the category.
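
The bilinear-sampling behavior that distinguishes RoIAlign from RoIPool's coarse rounding can be exercised through torchvision's implementation; the feature-map size, stride, and box coordinates below are illustrative placeholders.

import torch
from torchvision.ops import roi_align

# A backbone feature map: batch of 1, 256 channels, at a stride-16 level.
features = torch.randn(1, 256, 50, 50)

# RoIs in image coordinates: (batch_index, x1, y1, x2, y2).
rois = torch.tensor([[0.0, 32.7, 14.2, 151.3, 96.8]])

# RoIAlign samples the feature map with bilinear interpolation at exact
# (non-quantized) locations. spatial_scale maps image coordinates onto
# this feature level (1/16 here); aligned=True applies the half-pixel
# offset correction.
pooled = roi_align(features, rois, output_size=(14, 14),
                   spatial_scale=1.0 / 16, sampling_ratio=2, aligned=True)
print(pooled.shape)  # torch.Size([1, 256, 14, 14])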

Without bells and whistles, Mask R-CNN surpasses all previous state-of-the-art single-model results on the COCO instance segmentation task, including the heavily engineered entries from the 2016 competition winner. As a by-product, the method also excels on the COCO object detection task. In ablation experiments, multiple basic instantiations were evaluated, which demonstrates the framework's robustness and allows the effects of core factors to be analyzed. The model can run at about 200 ms per frame on a GPU, and training on COCO takes one to two days on a single 8-GPU machine. The fast training and test speeds, together with the framework's flexibility and accuracy, should benefit and ease future research on instance segmentation. Finally, the generality of the framework is showcased via the task of human pose estimation on the COCO keypoint dataset. By viewing each keypoint as a one-hot binary mask, Mask R-CNN can be applied with minimal modification to detect instance-specific poses. Mask R-CNN surpasses the winner of the 2016 COCO keypoint competition while running at 5 fps. Mask R-CNN can therefore be seen more broadly as a flexible framework for instance-level recognition that can be readily extended to more complex tasks.

This work was motivated by the problem of detecting small fruits (Das et al., 2015) under occlusion and overlapping backgrounds for yield estimation, but it also applies to the detection of other small objects under the same conditions. Existing approaches to fruit detection have difficulty with small fruits that are occluded by leaves or overlap one another, and the overall detection accuracy suffers as a result.
