Research on Near Infrared Spectral Feature Variable Selection Method Based on Improved Harmony Search Algorithm
ZHANG Lei1, DING Xiang-qian1, GONG Hui-li1, WU Li-jun2*, BAI Xiao-li2, LUO Lin2
1. College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China
2. China Tobacco Yunnan Industry Co., Ltd., Technical Research Center, Kunming 650024, China
Abstract Near-infrared (NIR) spectroscopy is widely used for detection and analysis in many fields because it is simple, fast, efficient, low-cost, and environmentally friendly. However, NIR spectra are high-dimensional and suffer from multicollinearity, redundant information, and high-frequency noise. Building a prediction model directly on the full spectrum therefore increases modeling complexity and degrades prediction performance and generalization. To address this, a spectral feature variable selection method based on an improved Harmony Search (HS) algorithm is proposed. HS is often used to solve feature variable optimization problems. To apply HS to spectral variable selection, the feature contribution of each spectral variable is first calculated from the partial least squares (PLS) loading coefficients and used as the disturbance weight of the improved HS. During optimization of the spectral feature variables, this contribution is introduced as an excitation factor, and the initial solution vectors are generated by combining random traversal with the excitation factor. When a new harmony vector is generated, the feature contribution is applied as a penalty factor, and a balance factor is added so that the HS parameters are adjusted dynamically with the number of iterations, adapting the search to spectral variables. These changes enhance the ergodicity of the search process and the diversity of the population. To verify the effectiveness of the algorithm, NIR PLS models of nicotine, total sugar, and total nitrogen were constructed using tobacco samples. After the original spectra were preprocessed, the method was used to optimize the spectral variables.
The prediction performance of the model for each number of variables was calculated according to the cumulative frequency with which the variables were selected, and the final spectral variables were determined from the increasing trend of the root mean square error of calibration (RMSEC) with the number of variables. The three PLS models were established on the training set and the test set, respectively, and compared with models based on the full spectrum, uninformative variable elimination (UVE), and particle swarm optimization (PSO). The experimental results show that the coefficients of determination (R²) of the nicotine, total sugar, and total nitrogen models built on the selected variables are 0.9211, 0.9257, and 0.9412, respectively, and the root mean square errors of prediction (RMSEP) are 0.1023, 1.0346, and 0.0531. Compared with the other methods, the proposed method gives a low RMSEP, an R² above 0.92 for all three models, and a small number of selected spectral feature variables. This shows that the improved HS algorithm can effectively select spectral feature variables, reduce modeling complexity, and improve the prediction performance and generalization ability of the models.
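The search loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the way the contribution weights enter initialization (excitation), pitch adjustment (penalty), and the linear schedules for the memory-consideration and pitch-adjustment rates (balance factor) are all assumptions made for illustration. The function name `improved_harmony_search`, the `weights` vector (standing in for PLS-loading-based contributions scaled to [0, 1]), and the toy `mismatches` fitness are hypothetical; in practice the fitness would be an error measure from a PLS model fitted on the selected variables.

```python
import random


def improved_harmony_search(weights, fitness, hms=8, iterations=300, seed=0):
    """Binary harmony search biased by per-variable contribution weights.

    weights[j] in [0, 1] stands in for the contribution of variable j;
    fitness(mask) returns an error to minimise (for example an RMSE).
    All rates and ranges below are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = len(weights)
    # Inclusion probability mixes a random part with the excitation factor.
    p_in = [0.25 + 0.5 * w for w in weights]

    def random_bit(j):
        return 1 if rng.random() < p_in[j] else 0

    # Initial harmony memory: random traversal biased by contribution.
    memory = [[random_bit(j) for j in range(n)] for _ in range(hms)]
    scores = [fitness(h) for h in memory]
    history = []  # best score in memory after each iteration

    for it in range(iterations):
        balance = it / max(1, iterations - 1)  # grows from 0 to 1
        hmcr = 0.70 + 0.25 * balance           # rely on memory more over time
        par = 0.50 - 0.40 * balance            # perturb less over time

        new = []
        for j in range(n):
            if rng.random() < hmcr:
                bit = memory[rng.randrange(hms)][j]  # memory consideration
                if rng.random() < par:
                    # Penalised adjustment: low-contribution variables tend
                    # to be dropped, high-contribution ones tend to be added.
                    flip_prob = (1.0 - weights[j]) if bit else weights[j]
                    if rng.random() < flip_prob:
                        bit = 1 - bit
            else:
                bit = random_bit(j)                  # weighted random choice
            new.append(bit)

        # Replace the worst harmony only if the new one is strictly better.
        new_score = fitness(new)
        worst = max(range(hms), key=scores.__getitem__)
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score
        history.append(min(scores))

    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best], history


# Toy usage with hypothetical data: variables 2, 5 and 7 are informative.
target = [0, 0, 1, 0, 0, 1, 0, 1, 0, 0]
weights = [0.9 if t else 0.1 for t in target]


def mismatches(mask):
    """Toy fitness: number of positions where mask differs from target."""
    return sum(a != b for a, b in zip(mask, target))


best, score, history = improved_harmony_search(weights, mismatches)
```

Because the worst harmony is replaced only by a strictly better one, the best score in memory never gets worse from one iteration to the next. Running the search several times with different seeds and counting how often each variable appears in the returned masks would give a cumulative selection frequency of the kind the abstract uses to rank variables.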
Received: 2019-04-15
Accepted: 2019-08-04
Corresponding Author:
WU Li-jun
E-mail: wallis8@126.com