1. School of Information Science and Engineering, Shandong Normal University, Ji’nan 250358, China
2. Key Laboratory of Facility Agriculture Measurement and Control Technology and Equipment of Machinery Industry, Zhenjiang 212013, China
3. School of Engineering, Cardiff University, Cardiff CF24 3AA, United Kingdom
Abstract: In the visible spectrum, accurate recognition of target fruits is the fundamental prerequisite for orchard yield estimation and automatic robotic picking. However, this task is subject to many interferences, such as the complex unstructured orchard environment and the similar colors of green apples and background leaves, which significantly restrict the detection accuracy of target fruits and pose great challenges to machine-vision recognition. Targeting the varied illumination conditions and fruit postures of the complex orchard environment, an optimized fully convolutional one-stage (FCOS) network model for green apple recognition is proposed in this study. Firstly, building on FCOS, the new model combines the feature extraction ability of the convolutional neural network (CNN), eliminates the dependence of previous detectors on anchor boxes, and instead predicts fruit confidence and box offsets in a one-stage, fully convolutional, anchor-free manner, which greatly improves recognition speed while maintaining detection accuracy. Secondly, a bottom-up feature fusion architecture is embedded after the feature pyramid to provide more accurate localization information to the high levels and thus further improve green apple detection. Finally, an overall loss function is designed over the three output branches of FCOS to complete iterative training. To simulate the real orchard environment as closely as possible, green apple images were collected under various conditions, with different lighting, illumination angles, occlusion types and camera distances, for dataset generation and model training; the optimal model was then evaluated on a validation set containing different scenes.
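The anchor-free prediction scheme summarized above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it decodes FCOS-style per-location offsets (l, t, r, b) into a box and computes the standard FCOS centerness score; the function names are assumptions for illustration only.

```python
import math

def decode_fcos_box(px, py, l, t, r, b):
    """Map a feature-map location (px, py) and its predicted offsets
    to a box. FCOS regresses distances to the four box sides instead
    of anchor-box deltas (anchor-free, per-pixel prediction)."""
    return (px - l, py - t, px + r, py + b)

def centerness(l, t, r, b):
    """FCOS centerness target: down-weights boxes predicted far from
    the object centre; equals 1.0 exactly at the centre."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))

# A location at the exact centre of a 100x60 fruit region:
print(decode_fcos_box(50, 30, 50, 30, 50, 30))   # (0, 0, 100, 60)
print(centerness(50, 30, 50, 30))                # 1.0 at the centre
print(round(centerness(10, 30, 90, 30), 3))      # 0.333, lower off-centre
```

The centerness branch is one of the three output branches mentioned above (alongside classification and box regression); at inference it multiplies the classification score so that off-centre, low-quality boxes are suppressed by NMS.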
The experimental results show that the proposed model achieves an average precision (AP) of 85.6%, which is 0.9, 10.5, 2.5 and 1.9 percentage points higher than the state-of-the-art detection models Faster R-CNN, SSD, RetinaNet and FSAF, respectively. In terms of model design, the parameters of FCOS and the computation of the whole detection process are 32.0 M and 47.5 GFLOPs (billion floating-point operations), respectively, 9.5 M and 12.5 GFLOPs lower than those of Faster R-CNN. The comparative results show that the new model offers higher detection accuracy and recognition efficiency in the visible spectrum, which can provide theoretical and technical support for orchard yield estimation and automatic picking. In addition, the new model can also serve as a theoretical reference for detecting other kinds of fruits and vegetables.
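The per-baseline figures implied by the stated differences follow by simple arithmetic. The values below are derived from the abstract's reported gaps, not independently measured:

```python
# Reported figures: the optimized FCOS uses 32.0 M parameters and
# 47.5 GFLOPs, stated to be 9.5 M / 12.5 GFLOPs lower than Faster R-CNN.
fcos_params, fcos_flops = 32.0, 47.5
faster_rcnn_params = fcos_params + 9.5   # implied: 41.5 M
faster_rcnn_flops = fcos_flops + 12.5    # implied: 60.0 GFLOPs

# AP gaps over the four baselines (percentage points):
ap_fcos = 85.6
gaps = {"Faster R-CNN": 0.9, "SSD": 10.5, "RetinaNet": 2.5, "FSAF": 1.9}
baseline_ap = {k: round(ap_fcos - v, 1) for k, v in gaps.items()}

print(faster_rcnn_params, faster_rcnn_flops)  # 41.5 60.0
print(baseline_ap["SSD"])                     # 75.1
```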
Key words: FCOS network; Green fruits; Object detection
ZHANG Zhong-hua, JIA Wei-kuan, SHAO Wen-jing, HOU Su-juan, JI Ze, ZHENG Yuan-jie. Green Apple Detection Based on Optimized FCOS in Orchards. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2022, 42(02): 647-653.
[1] WANG Dan-dan, SONG Huai-bo, HE Dong-jian(王丹丹, 宋怀波, 何东健). Transactions of the Chinese Society of Agricultural Engineering(农业工程学报), 2017, 33(10): 59.
[2] Jia W K, Zhang Y, Lian J, et al. International Journal of Advanced Robotic Systems, 2020, 17(5-6): 25310.
[3] LI Da-hua, ZHAO Hui, YU Xiao(李大华, 赵辉, 于晓). Spectroscopy and Spectral Analysis(光谱学与光谱分析), 2019, 39(9): 2974.
[4] Koirala A, Walsh K B, Wang Z, et al. Computers and Electronics in Agriculture, 2019, 162: 219.
[5] Boogaard F P, Rongen K S A H, Kootstra G W. Biosystems Engineering, 2020, 192: 117.
[6] Bargoti S, Underwood J P. Journal of Field Robotics, 2017, 34(6): 1039.
[7] Jia W, Tian Y, Luo R, et al. Computers and Electronics in Agriculture, 2020, 172: 105380.
[8] XIONG Jun-tao, ZHENG Zhen-hui, LIANG Jia-en, et al(熊俊涛, 郑镇辉, 梁嘉恩, 等). Transactions of the Chinese Society for Agricultural Machinery(农业机械学报), 2020, 51(4): 199.
[9] Lin T Y, Maire M, Belongie S, et al. Microsoft COCO: Common Objects in Context, European Conference on Computer Vision. Springer, Cham, 2014: 740.
[10] Tian Z, Shen C, Chen H, et al. FCOS: Fully Convolutional One-Stage Object Detection, Proceedings of the IEEE International Conference on Computer Vision, 2019: 9627.
[11] Ren S, He K, Girshick R, et al. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 39(6): 1137.
[12] Liu W, Anguelov D, Erhan D, et al. SSD: Single Shot MultiBox Detector, European Conference on Computer Vision. Springer, Cham, 2016: 21.
[13] Lin T Y, Goyal P, Girshick R, et al. Focal Loss for Dense Object Detection, Proceedings of the IEEE International Conference on Computer Vision, 2017: 2980.
[14] Zhu C, He Y, Savvides M. Feature Selective Anchor-Free Module for Single-Shot Object Detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.