Wavelength Selection of Near-Infrared Spectra Based on Improved SiPLS-Random Frog Algorithm
CHENG Jie-hong1, CHEN Zheng-guang1,2*
1. College of Electrical and Information, Heilongjiang Bayi Agricultural University, Daqing 163319, China
2. Heilongjiang Engineering Technology Research Center for Rice Ecological Seedlings Device and Whole Process Mechanization, Daqing 163319, China
Abstract In near-infrared spectroscopy modeling and prediction, redundancy and collinearity in the spectral data seriously degrade the prediction accuracy and robustness of the model. Feature wavelength selection is an effective way to improve the prediction accuracy of quantitative analysis. Random frog (RF) is a feature wavelength selection algorithm built on the idea that different variables have different probabilities of being selected, and in recent years it has performed well in feature wavelength selection. The method iteratively calculates the selection probability of each variable and takes the variables with high probability as the feature wavelengths. However, the initial variable set V0 of RF is chosen at random, so it may contain useless or interfering information and its validity is difficult to guarantee; as a result, many iterations are needed and the running time is long. This paper proposes an improved feature wavelength selection algorithm, Si-RF, based on RF. SiPLS is first used to select variables from the full spectrum; the wavelengths obtained at this stage are the most sensitive to changes in the target variable, and they are used as the initial variable subset of RF to address the long running time and low efficiency. In addition, RF conventionally selects as feature wavelengths the variables whose selection probability exceeds a threshold, but there is no theoretical basis for setting this threshold, and the choice is easily influenced by human factors. In this paper, the variables are instead sorted in descending order of selection probability, and MLR models are built by adding one variable at a time. The variable subset with the lowest RMSEV is taken as the feature wavelengths, which identifies the wavelength subset giving the highest prediction accuracy.
With these two improvements, Si-RF was applied to soil near-infrared spectroscopy data sets. An MLR model was established after feature wavelength selection, and its prediction accuracy was compared with that of the RF-MLR and Full-PLSR models. The results show that RF selected 10 wavelength points after 10,000 iterations, and the RMSEP of the resulting MLR model was 1.6276. The improved Si-RF needed only 1,000 iterations to select 17 wavelength points, and the RMSEP of the MLR model fell to 0.8184, a substantial improvement in both prediction accuracy and running efficiency. Compared with the full-spectrum model, Si-RF also greatly improved the prediction accuracy and reduced model complexity. These results indicate that the improved Si-RF is an effective feature wavelength selection algorithm.
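
The threshold-free selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the calibration/validation split, and the reading of RMSEV as the root mean square error on a held-out validation set are all assumptions made for illustration, and the selection probabilities `prob` are taken as given (for example, from a random frog run).

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def fit_mlr(X, y):
    """Ordinary least-squares MLR with an intercept term."""
    A = np.column_stack([np.ones(X.shape[0]), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict with coefficients returned by fit_mlr."""
    return coef[0] + X @ coef[1:]

def select_by_ranked_probability(X_cal, y_cal, X_val, y_val, prob, max_vars=None):
    """Hypothetical helper: add variables in descending order of selection
    probability and keep the subset with the lowest validation RMSE
    (assumed here to correspond to RMSEV)."""
    order = np.argsort(-prob, kind="stable")
    if max_vars is None:
        # Keep the MLR fit determined: coefficients <= calibration samples.
        max_vars = min(len(order), X_cal.shape[0] - 1)
    best_rmsev, best_k, curve = np.inf, 0, []
    for k in range(1, max_vars + 1):
        idx = order[:k]
        coef = fit_mlr(X_cal[:, idx], y_cal)
        err = rmse(y_val, predict_mlr(coef, X_val[:, idx]))
        curve.append(err)
        if err < best_rmsev:
            best_rmsev, best_k = err, k
    return order[:best_k], best_rmsev, curve
```

Called with calibration and validation spectra and a probability vector, the helper returns the indices of the chosen wavelengths, the corresponding validation error, and the full error curve, so the minimum can be inspected rather than fixed by a hand-set probability threshold.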
Received: 2019-10-15
Accepted: 2020-02-06
Corresponding Authors:
CHEN Zheng-guang
E-mail: ruzee@sina.com