%A LI Wei, TAN Feng, ZHANG Wei, GAO Lu-si, LI Jin-shan
%T Application of Improved Random Frog Algorithm in Fast Identification of Soybean Varieties
%0 Journal Article
%D 2023
%J SPECTROSCOPY AND SPECTRAL ANALYSIS
%R 10.3964/j.issn.1000-0593(2023)12-3763-07
%P 3763-3769
%V 43
%N 12
%U {https://www.gpxygpfx.com/CN/abstract/article_13613.shtml}
%8 2023-12-01
%X Rapid and accurate identification of soybean varieties plays an important role in assessing seed quality, purifying the seed market, and ensuring food security. Traditional methods for identifying crop varieties suffer from poor accuracy and low efficiency. Therefore, a partial least squares (PLS) identification model was established using Raman spectroscopy combined with characteristic wavelength extraction to rapidly identify four high-oil soybean varieties (Heinong 87, Heinong 89, Suinong 38 and Suinong 77) from Heilongjiang Province. Random frog (RF) is a new characteristic wavelength selection algorithm that determines the importance of each variable by iteratively calculating its selection probability, which can remove much of the redundant information in the full spectrum. However, the method has three drawbacks: a random initial variable set, a large number of iterations, and uncertain threshold selection. An improved random frog (MRF) algorithm based on LASSO regression was therefore proposed. To remove the randomness of the initial variable set in the RF algorithm, LASSO was used to extract the characteristic wavelength points most related to the attribute variable as the initial variable set F0. Iterative calculations were then carried out from this set, reducing the number of useless iterations and improving the model's prediction accuracy. In addition, RF selects variables by setting a threshold, which makes the extracted characteristic wavelengths uncertain. This step was improved as follows. First, variables with a selection probability of 0 were removed, and the remaining sorted variables were divided into intervals of 10 wavelength points. Then, partial least squares discriminant analysis (PLS-DA) models between the characteristic wavelengths and soybean varieties were built by adding one interval at a time, and the wavelength subset with the smallest RMSECV was taken as the selected characteristic wavelengths.
The PLS-DA model was established with the characteristic wavelengths selected by MRF as input variables, and its prediction performance was compared with that of models built on the full spectrum and on wavelengths selected by the RF, LASSO and ElasticNet algorithms. The results indicated that the MRF algorithm selected 300 characteristic wavelength points, accounting for only 9.37% of the full spectrum, which effectively screened the key characteristic variables and reduced the complexity of the model. The RMSEP and *R*^{2}_{p} were 0.2469 and 0.9512, respectively, and the identification accuracy reached 100%, the best among all models. Therefore, Raman spectroscopy combined with the MRF algorithm can achieve fast identification of soybean varieties and provides a new technique for the fast identification of other crop varieties.
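The threshold-free selection step described in the abstract (drop variables with a selection probability of 0, group the sorted variables into intervals of 10, add one interval at a time, keep the subset with the smallest RMSECV) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function name `select_by_intervals` and the `rmsecv` callback are hypothetical; the callback stands in for a cross-validated PLS-DA fit that the caller would supply. Sorting by descending selection probability and growing the subset cumulatively are assumptions drawn from the wording "sorted variables" and "adding one interval each time".

```python
from typing import Callable, List, Sequence, Tuple


def select_by_intervals(
    probabilities: Sequence[float],
    rmsecv: Callable[[List[int]], float],
    step: int = 10,
) -> Tuple[List[int], float]:
    """Pick the cumulative subset of ranked variables with the lowest RMSECV.

    probabilities: selection probability of each wavelength index (e.g. from RF).
    rmsecv: hypothetical callback returning the cross-validated RMSE of a model
            built on the given wavelength indices (assumed to wrap PLS-DA).
    step: interval width; the abstract uses 10 wavelength points.
    """
    # Assumption: rank by descending probability, dropping zero-probability variables.
    ranked = [
        i
        for i in sorted(range(len(probabilities)), key=lambda i: -probabilities[i])
        if probabilities[i] > 0
    ]

    best_subset: List[int] = []
    best_score = float("inf")
    # Grow the candidate subset one interval at a time; the last slice may be shorter.
    for end in range(step, len(ranked) + step, step):
        subset = ranked[:end]
        score = rmsecv(subset)
        if score < best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Toy usage with a made-up scoring rule in place of a real PLS-DA cross-validation:
# indices 1 and 4 are "informative"; every extra index adds a small penalty.
def toy_rmsecv(subset: List[int]) -> float:
    informative = {1, 4}
    missing = len(informative - set(subset))
    extras = len(set(subset) - informative)
    return missing + 0.1 * extras


toy_probabilities = [0.0, 0.9, 0.5, 0.0, 0.7, 0.2]
toy_subset, toy_score = select_by_intervals(toy_probabilities, toy_rmsecv, step=2)
```

Passing the error estimate in as a callback keeps the sketch independent of any particular PLS library; in practice the callback would fit and cross-validate a PLS-DA model on the spectra restricted to the given indices.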