Strawberry Defect Detection and Visualization Via Hyperspectral Imaging
ZHAO Lu-lu1, 2, ZHOU Song-bin1, 2, LIU Yi-sen1, 2*, PANG Kun-kun1, 2, YIN Ze-xuan1, 2, CHEN Hong1, 2
1. Guangdong Institute of Intelligent Manufacturing, Guangzhou 510000, China
2. Guangdong Key Laboratory of Modern Control Technology, Guangzhou 510000, China
Abstract: Strawberries are easily damaged during harvesting, transportation, storage, packaging, and sale. Common damages and defects include bruising, frost damage, and fungal infection, which can cause substantial economic losses to fruit growers and sellers. Hyperspectral imaging combines spectral sensing with machine vision and can detect a variety of fruit quality defects non-destructively. However, two problems remain in hyperspectral modeling for fruit inspection. First, the input is usually the average spectrum, so the spatial information in the hyperspectral image is not fully utilized. Second, convolutional neural networks (CNNs) have become the mainstream approach to hyperspectral data processing, yet their receptive field is relatively narrow, which makes it difficult to capture long-range correlations across spectral bands or image regions. To address these problems and accurately detect and recognize various strawberry defects, a spatial-spectral transformer network (SSTN) was proposed to classify near-infrared hyperspectral data (900~1 700 nm) of four categories of strawberries (healthy, bruised, frost-damaged, and infected). The SSTN takes the Vision Transformer (ViT) as its backbone and uses hyperspectral data patches with encoded position information as inputs to achieve “spectral-spatial” modeling; its internal multi-head attention mechanism can capture long-range spectral and spatial correlations. In the experiment, 502 strawberries were sampled, including 128 healthy, 128 bruised, 128 frost-damaged, and 118 infected samples. The data were randomly divided into training and test sets at a 1∶1 ratio: half were used to train the defect-classification model, and the other half were used to evaluate its performance. The results show that the SSTN achieved a maximum classification accuracy of 99.20%, an improvement of 3.8%, 3.3%, and 1.5% over the one-dimensional convolutional neural network (1D-CNN), the two-dimensional convolutional neural network (2D-CNN), and the convolutional network with attention mechanism (CBAM-CNN), respectively. To further locate the various strawberry defects, the trained 2D-CNN, CBAM-CNN, and SSTN models were combined with Score-CAM for visualization. The visualization results show that the convolutional attention mechanism in CBAM-CNN improves the accuracy of defect localization, while the SSTN with its multi-head attention mechanism, combined with Score-CAM, achieved the best visualization performance, accurately displaying the location and outline of the defects. This study provides a reference for establishing a fast, non-destructive, and automatic detection method for strawberry defects.
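As a concrete illustration of the “spectral-spatial” modeling described above, the following is a minimal PyTorch sketch of a ViT-style classifier that embeds position-encoded hypercube patches and classifies them with multi-head self-attention. The band count, image size, patch size, embedding width, and depth are illustrative assumptions, not the exact SSTN configuration reported in the paper.

# Minimal spatial-spectral ViT sketch (PyTorch); hyperparameters are
# illustrative assumptions, not the authors' exact SSTN configuration.
import torch
import torch.nn as nn

class SpatialSpectralViT(nn.Module):
    def __init__(self, bands=224, img_size=64, patch=8,
                 dim=128, depth=4, heads=8, n_classes=4):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # Patch embedding: each patch keeps its full spectral vector.
        self.embed = nn.Conv2d(bands, dim, kernel_size=patch, stride=patch)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):                            # x: (B, bands, H, W)
        z = self.embed(x).flatten(2).transpose(1, 2)  # (B, N, dim) patch tokens
        cls = self.cls.expand(z.size(0), -1, -1)
        z = torch.cat([cls, z], dim=1) + self.pos    # add position encoding
        z = self.encoder(z)                  # multi-head self-attention blocks
        return self.head(z[:, 0])            # classify from the class token

model = SpatialSpectralViT()
logits = model(torch.randn(2, 224, 64, 64))          # two sample hypercubes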
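The 1∶1 random division of the 502 samples can be written down in a few lines; the class counts (128/128/128/118) come from the abstract, while the use of scikit-learn's train_test_split, the stratified option, and the placeholder feature array are assumptions.

# Illustrative 1∶1 train/test split of the 502 labeled strawberries;
# scikit-learn usage and the placeholder features are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(502, 256)          # placeholder features per sample
y = np.repeat([0, 1, 2, 3], [128, 128, 128, 118])   # healthy/bruised/frost/infected
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)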
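For the visualization step, the sketch below follows the published Score-CAM recipe (Wang et al., CVPRW 2020): each feature map of a chosen layer is upsampled to the input size, normalized, used to mask the input, and weighted by the class score the model assigns to the masked input. The hook-based implementation and the choice of layer are assumptions, not the authors' code.

# Minimal Score-CAM sketch (PyTorch); follows the published recipe,
# not the authors' implementation. `target_layer` is a chosen layer
# of a trained classifier that returns class logits.
import torch
import torch.nn.functional as F

@torch.no_grad()
def score_cam(model, target_layer, x, class_idx):
    acts = {}
    h = target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    model(x)                                   # forward pass fills the hook
    h.remove()
    a = acts["a"]                              # (1, C, h, w) feature maps
    maps = F.interpolate(a, size=x.shape[-2:], mode="bilinear",
                         align_corners=False)
    flat = maps.flatten(2)                     # normalize each map to [0, 1]
    mn, mx = flat.min(-1)[0], flat.max(-1)[0]
    maps = (maps - mn[..., None, None]) / (mx - mn + 1e-8)[..., None, None]
    scores = []
    for k in range(maps.size(1)):              # score each masked input
        scores.append(model(x * maps[:, k:k + 1])[0, class_idx])
    w = torch.softmax(torch.stack(scores), dim=0)
    cam = torch.relu((w[None, :, None, None] * maps).sum(1))
    return cam / (cam.max() + 1e-8)            # saliency map in [0, 1]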