Quantitative Analysis of Near Infrared Spectroscopy Based on Orthogonal Matching Pursuit Algorithm
LI Si-hai1, LIU Dong-ling2
1. College of Information Engineering, Gansu University of Chinese Medicine, Lanzhou 730000, China
2. School of Pharmacy, Gansu University of Chinese Medicine, Lanzhou 730000, China
Abstract Compressed sensing (CS) is a recent technique for signal compression and sampling. Orthogonal Matching Pursuit (OMP), a greedy pursuit algorithm, is widely used for sparse signal reconstruction in compressed sensing. Because near-infrared (NIR) spectra are high-dimensional, small-sample signals with a sparse prior, a novel NIR variable selection method based on compressed sensing theory, Orthogonal Matching Pursuit Based Variable Selection (OMPBVS), is proposed to improve the flexibility and reliability of NIR spectral variable selection. By sparsely reconstructing the original spectral signal, OMPBVS shrinks the regression coefficients of most variables to zero and thereby selects spectral variables indirectly. Specifically, the spectral matrix serves as the sensing matrix and the response variable as the observation vector. At each iteration, the inner products between the current residual and the atoms (the columns of the spectral matrix) are computed, and the atom with the largest inner product is selected. The signal is then projected onto the subspace spanned by all selected atoms and the coefficients of all selected atoms are updated, so that the residual is orthogonal to every selected atom. Because this residual update is in essence a Gram-Schmidt orthogonalization, the orthogonal projection reduces the number of iterations and ensures accurate signal reconstruction. OMPBVS can reduce the spectral dimension to the scale of the sample size, and its variable selection capability is comparable to that of LASSO. Unlike LASSO, however, OMPBVS optimizes its loss function with a forward selection algorithm, which reduces the number of iterations and allows the number of selected variables to be controlled precisely.
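The iteration described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `omp_select`, the column normalization, the use of the absolute inner product, and the fixed number of iterations `n_nonzero` are assumptions made for illustration, and the abstract does not state how the spectra are centered or scaled or how the stopping point is chosen.

```python
import numpy as np

def omp_select(X, y, n_nonzero):
    """Sketch of OMP-style variable selection (hypothetical helper).

    X: (n_samples, n_variables) spectral matrix, used as the sensing matrix.
    y: (n_samples,) response vector, used as the observation.
    Returns the selected column indices and their regression coefficients.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assumption: scale each atom (column) to unit length so that
    # inner products with the residual are comparable across atoms.
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0
    A = X / norms
    residual = y.copy()
    selected = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Inner product of the residual with every atom.
        corr = np.abs(A.T @ residual)
        if selected:
            corr[selected] = -np.inf  # never pick the same atom twice
        selected.append(int(np.argmax(corr)))
        # Least-squares refit on all selected atoms: this is the
        # orthogonal projection of y onto their span.
        coef, *_ = np.linalg.lstsq(A[:, selected], y, rcond=None)
        # The new residual is orthogonal to every selected atom.
        residual = y - A[:, selected] @ coef
    # Undo the column scaling so coefficients apply to the original X.
    return selected, coef / norms[selected]
```

Calling `omp_select(X, y, 2)` on a spectral matrix `X` and response `y` returns two wavelength indices and their coefficients; all other coefficients are implicitly zero, which is how the number of selected variables is controlled directly.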
Variable selection experiments were performed on the beer dataset and the wheat kernels dataset to compare the performance of six methods: PLS, MCUVE, CARS, WMSCVS, LASSOLarsCV, and OMPBVS. The beer dataset contained 60 samples, split by the Kennard-Stone (KS) method into 36 training samples and 24 test samples; the response variable was the original extract concentration. The wheat kernels dataset consisted of 523 samples, with 415 training samples and 108 test samples; the response variable was protein content. On the beer dataset, OMPBVS selected 2 variables, with RMSEC and RMSEP of 0.2052 and 0.1598, respectively. On the wheat kernels dataset, it selected 9 variables, with RMSEC and RMSEP of 0.4502 and 0.4125, respectively. Its variable selection ability and model performance were better than those of the other five methods, indicating that OMPBVS is an effective method for NIR spectral variable selection and quantitative analysis. OMPBVS generalizes well in the small-sample case, reduces the number of selected variables, and improves the robustness of variable selection. In addition, spectral preprocessing based on SNV and MSC can reduce the number of selected variables to a certain extent and improve the interpretability of the model.
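The evaluation steps named above (Kennard-Stone splitting, SNV preprocessing, and the RMSEC/RMSEP error measures) can be sketched as follows. This is a hedged illustration rather than the authors' code: the helper names `kennard_stone`, `snv`, and `rmse` are hypothetical, Euclidean distance on the raw spectra is assumed for the split, and RMSEC and RMSEP are taken here to be the plain root-mean-square error on the training and test sets, which the abstract does not define explicitly.

```python
import numpy as np

def kennard_stone(X, n_train):
    """Return (train_idx, test_idx) chosen by the Kennard-Stone rule.

    Assumes n_train >= 2 and Euclidean distance between spectra.
    Builds the full distance tensor, so it suits small datasets only.
    """
    X = np.asarray(X, dtype=float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start from the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_train:
        # Distance from each candidate to its nearest selected sample.
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        # Add the candidate that is farthest from the selected set.
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining

def snv(X):
    """Standard normal variate: center and scale each spectrum (row)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def rmse(y_true, y_pred):
    """Root-mean-square error; RMSEC on training data, RMSEP on test data."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

For the beer data described above, a call such as `kennard_stone(X, 36)` would give the 36/24 split sizes; whether it reproduces the authors' exact partition depends on details the abstract does not give.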
Received: 2020-03-02
Accepted: 2020-07-24