Application of Improved Random Frog Algorithm in Fast Identification of Soybean Varieties
LI Wei1, TAN Feng2*, ZHANG Wei1, GAO Lu-si3, LI Jin-shan4
1. College of Engineering, Heilongjiang Bayi Agricultural University, Daqing 163319, China
2. College of Electrical and Information, Heilongjiang Bayi Agricultural University, Daqing 163319, China
3. Suihua Branch of Heilongjiang Academy of Agricultural Sciences, Suihua 152052, China
4. Daqing Green Agricultural Products Monitor Center, Daqing 163311, China
Abstract: Rapid and accurate identification of soybean varieties plays an important role in assessing seed quality, purifying the seed market and ensuring food security. Traditional methods for identifying crop varieties suffer from poor accuracy and low efficiency. Therefore, a partial least squares (PLS) identification model was established using Raman spectroscopy combined with characteristic wavelength extraction to rapidly identify four high-oil soybean varieties (Heinong 87, Heinong 89, Suinong 38 and Suinong 77) from Heilongjiang Province. Random frog (RF) is a new characteristic wavelength selection algorithm that determines the importance of each variable by iteratively calculating its selection probability, and it can remove much of the redundant information in the full spectrum. However, the method has three disadvantages: a random initial variable set, a large number of iterations and uncertain threshold selection. An improved random frog (MRF) algorithm based on LASSO regression was therefore proposed. To remove the randomness of the initial variable set in RF, LASSO was used to extract the characteristic wavelength points most related to the attribute variable as the initial variable set F0. Iterative calculations were then carried out from this set, which reduces the number of useless iterations and improves the prediction accuracy of the model. In addition, RF selects variables by setting a threshold, which makes the extracted characteristic wavelengths uncertain. This step was improved as follows. First, the variables with a selection probability of 0 were removed, and the remaining sorted variables were divided into intervals of 10 wavelength points. Then, partial least squares discriminant analysis (PLS-DA) models between the characteristic wavelengths and the soybean varieties were built by adding one interval at a time, and the wavelength subset with the smallest root mean square error of cross-validation (RMSECV) was taken as the selected characteristic wavelengths.
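The threshold-free selection step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `select_by_intervals`, its parameters, and the assumption that variables are sorted by descending selection probability are all hypothetical, and the `rmsecv` argument is a stand-in for fitting a PLS-DA model on the candidate subset and returning its cross-validated error.

```python
from typing import Callable, List, Sequence


def select_by_intervals(
    probabilities: Sequence[float],
    rmsecv: Callable[[List[int]], float],
    interval: int = 10,
) -> List[int]:
    """Return the wavelength indices of the candidate subset with the smallest RMSECV."""
    # Drop variables that were never selected, then rank the rest
    # by descending selection probability (assumed sort order).
    ranked = sorted(
        (i for i, p in enumerate(probabilities) if p > 0),
        key=lambda i: probabilities[i],
        reverse=True,
    )
    best_subset: List[int] = []
    best_score = float("inf")
    # Grow the candidate subset by one interval at a time and keep
    # the subset that gives the lowest cross-validation error.
    for end in range(interval, len(ranked) + interval, interval):
        subset = ranked[:end]
        score = rmsecv(subset)
        if score < best_score:
            best_subset, best_score = subset, score
    return best_subset


# Toy usage: a made-up error function in place of a real PLS-DA RMSECV.
# It penalises missing "informative" variables heavily and subset size lightly.
informative = {0, 1, 2}


def toy_rmsecv(subset: List[int]) -> float:
    return len(informative - set(subset)) + 0.01 * len(subset)


toy_probabilities = [0.9, 0.8, 0.7, 0.0, 0.2, 0.1]
selected = select_by_intervals(toy_probabilities, toy_rmsecv, interval=2)
```

In a real workflow the `rmsecv` callable would wrap model fitting and cross-validation on the Raman spectra; the toy function here only exists so that the loop can be run end to end.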
PLS-DA models were established with the characteristic wavelengths selected by MRF as input variables, and their prediction performance was compared with that of models built on the full spectrum and on the wavelengths selected by the RF, LASSO and ElasticNet algorithms. The results indicated that the MRF algorithm selected 300 characteristic wavelength points, only 9.37% of the full spectrum, which effectively screened the key characteristic variables and reduced the complexity of the model. The RMSEP and R2p were 0.2469 and 0.9512, respectively, and the identification accuracy reached 100%, the best among all models. Therefore, Raman spectroscopy combined with the MRF algorithm can achieve fast identification of soybean varieties and provides a new technique for the fast identification of other crop varieties.
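For readers unfamiliar with the reported figures, the sketch below shows one common way such prediction-set metrics are computed. It rests on assumptions the abstract does not state: that the varieties are coded as integers, that R2p is the coefficient of determination on the prediction set, and that a sample counts as correctly identified when its predicted value rounds to its true class code. The function name `prediction_metrics` and the sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def prediction_metrics(
    y_true: Sequence[int], y_pred: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (RMSEP, R2p, accuracy) for integer-coded classes."""
    n = len(y_true)
    residual_ss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_true = sum(y_true) / n
    total_ss = sum((t - mean_true) ** 2 for t in y_true)
    rmsep = math.sqrt(residual_ss / n)
    r2p = 1.0 - residual_ss / total_ss
    # Assumption: assign each sample to the nearest integer class code.
    hits = sum(1 for t, p in zip(y_true, y_pred) if math.floor(p + 0.5) == t)
    return rmsep, r2p, hits / n


# Made-up class codes and predictions, for illustration only.
rmsep, r2p, accuracy = prediction_metrics([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
```

Under this coding, a model can reach 100% identification accuracy while RMSEP stays above zero, because every prediction only needs to fall within 0.5 of its class code.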
Key words: Raman spectroscopy; Soybean; Characteristic wavelength selection; Random frog; LASSO
LI Wei, TAN Feng, ZHANG Wei, GAO Lu-si, LI Jin-shan(李 伟, 谭 峰, 张 伟, 高陆思, 李金山). Application of Improved Random Frog Algorithm in Fast Identification of Soybean Varieties[J]. Spectroscopy and Spectral Analysis(光谱学与光谱分析), 2023, 43(12): 3763-3769.