Fast Classification Method of Black Goji Berry (Lycium Ruthenicum Murr.) Based on Hyperspectral and Ensemble Learning
LU Wei1, CAI Miao-miao1, ZHANG Qiang2, LI Shan3
1. Jiangsu Provincial Laboratory of Modern Facility Agriculture Technology and Equipment Engineering, College of Artificial Intelligence, Nanjing Agricultural University, Nanjing 210031, China
2. School of Water Resources and Hydropower, Qinghai University, Xining 810016, China
3. School of Life Science and Technology, Tongji University, Shanghai 200092, China
Abstract: Black goji berry has high nutritional and medicinal value. Quality differs between grades, and prices differ significantly as well. However, because effective detection and grading methods are lacking, the black goji berry market is disordered and inferior product is mixed with superior product, which hampers quality supervision. To achieve fast, non-destructive and high-precision grading, this paper proposes a classification method for black goji berry based on hyperspectral imaging and ensemble learning. First, 200 berries were selected from each of four grades: Nomhong 1st grade (NMH-grade1), Nomhong 2nd grade (NMH-grade2), Nomhong 3rd grade (NMH-grade3) and Nomhong 4th grade (NMH-grade4). Then, in two placement modes (carpopodium up, and lying horizontally after removal of the carpopodium), hyperspectral image cubes covering a spectral range of 391.6 to 2528.1 nm were acquired with a GaiaSorter-Dual wide-band hyperspectral sorter. After mask processing, the hyperspectral information of each single-berry ROI was extracted automatically with a cell counting algorithm, and the spectral information in the range of 500 to 2400 nm was retained. The spectra were preprocessed with FD (First Derivative), FFT (Fast Fourier Transform), HT (Hilbert Transform), SG (Savitzky-Golay), Normalize and SNV (Standard Normal Variate), and feature information was then extracted with PCA (Principal Component Analysis), SPA (Successive Projections Algorithm) and CARS (Competitive Adaptive Reweighted Sampling). Detection models were then built with LIBSVM, LDA (Linear Discriminant Analysis), KNN (k-Nearest Neighbor), RF (Random Forest) and NB (Naive Bayes). The combinations sarcocarp-Normalize-SPA-LDA, sarcocarp-FD-CARS-RF and sarcocarp-SNV-CARS-LIBSVM performed best, with accuracies of 0.9417, 0.9417 and 0.9375, respectively.
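As an illustration of two of the preprocessing steps named above, the following is a minimal sketch of SNV and a finite-difference first derivative in plain Python. It is not the authors' implementation: the function names and the five-point example spectrum are hypothetical, and the first derivative is approximated here by simple forward differences, which is an assumption about how FD might be computed.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: center one spectrum and scale it to unit
    sample standard deviation, which reduces scatter-related offsets."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(value - m) / s for value in spectrum]


def first_derivative(spectrum, wavelengths):
    """Forward-difference approximation of d(reflectance)/d(wavelength).
    Returns one value fewer than the input spectrum."""
    return [
        (spectrum[i + 1] - spectrum[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(spectrum) - 1)
    ]


# Hypothetical values for illustration only (not data from the paper).
wavelengths = [500.0, 505.0, 510.0, 515.0, 520.0]  # nm
reflectance = [0.10, 0.12, 0.15, 0.19, 0.24]

snv_spectrum = snv(reflectance)
fd_spectrum = first_derivative(reflectance, wavelengths)
```

SNV is applied per spectrum rather than per wavelength, so each berry's spectrum is standardized independently of the others.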
In addition, among the preprocessing methods, FD, HT, Normalize and SNV gave better results, and among the dimensionality reduction methods, SPA and CARS gave better results. Among the models built with LIBSVM, LDA, KNN, RF and NB, the numbers of models with a test-set accuracy of at least 0.9 were 2, 7, 0, 4 and 1, respectively, so LDA, RF and LIBSVM were the three best classifiers. To further improve the classification accuracy, LDA, RF and LIBSVM were used as meta-models to build a Stacking ensemble learning model for fast, non-destructive classification of black goji berry. With the sarcocarp-FD-SPA-Stacking combination, the accuracy improved from 0.9417 to 0.9833. A total of 17 characteristic wavelengths were extracted (in nm): 591.6, 609.1, 721.6, 989.1, 1083.3, 1111.3, 1296.1, 1564.9, 1844.9, 1934.5, 1996.1, 2046.5, 2130.5, 2292.9, 2315.3, 2320.9 and 2348.9. Among them, there are C-H overtone and absorption peaks near 721.6, 1083.3, 1111.3, 2130.5, 2292.9, 2315.3, 2320.9 and 2348.9 nm, O-H overtone and absorption peaks near 721.6, 989.1, 1934.5, 1996.1 and 2292.9 nm, and C-O absorption peaks near 2130.5 and 2292.9 nm. The results show that fast, non-destructive classification of black goji berry based on hyperspectral imaging combined with ensemble learning is feasible.
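The Stacking step described above can be sketched with scikit-learn as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the data are synthetic (800 samples with 17 features, mirroring only the sample count and the number of selected wavelengths), `SVC` stands in for LIBSVM (scikit-learn's `SVC` is built on libsvm), and the logistic-regression combiner, the 70/30 split and the 5-fold setting are assumptions that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the spectra: 4 grades x 200 berries, 17 features.
X, y = make_classification(
    n_samples=800, n_features=17, n_informative=10, n_redundant=2,
    n_classes=4, n_clusters_per_class=1, class_sep=2.0, random_state=0,
)
# Assumed 70/30 stratified split (not stated in the abstract).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The three classifiers the abstract identifies as working best.
component_models = [
    ("lda", LinearDiscriminantAnalysis()),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf"))),
]

# Their cross-validated outputs are fed to a combiner (assumed here to be
# logistic regression), which produces the final grade prediction.
stack = StackingClassifier(
    estimators=component_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
predictions = stack.predict(X_test)
```

Because the combiner is trained on out-of-fold predictions, it can learn which component model to trust for which grade. The accuracy obtained on this synthetic data says nothing about the accuracies reported in the paper.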
Keywords: Spectroscopy; Black goji berry; Ensemble learning; Anthocyanin; Non-destructive testing
卢 伟,蔡苗苗,张 强,李 珊. 高光谱和集成学习的黑枸杞快速分级方法[J]. 光谱学与光谱分析, 2021, 41(07): 2196-2204.
LU Wei, CAI Miao-miao, ZHANG Qiang, LI Shan. Fast Classification Method of Black Goji Berry (Lycium Ruthenicum Murr.) Based on Hyperspectral and Ensemble Learning. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2021, 41(07): 2196-2204.