A Wavelength Selection Method Combining Direct Orthogonal Signal Correction and Monte Carlo
XIE Lin-jiang, HONG Ming-jian*, YU Zhi-rong
School of Big Data & Software Engineering, Chongqing University, Chongqing 401331, China

Abstract In near-infrared spectroscopy, full-spectrum data contain many wavelength points with high redundancy and severe collinearity. As a result, some wavelength points contribute nothing to the calibration model and can even reduce its predictive ability. Wavelength selection has proven to be an effective way to avoid these problems. Targeting these characteristics of near-infrared spectra, this paper proposes a wavelength selection algorithm, MC-DOSC, that combines Direct Orthogonal Signal Correction (DOSC) with Monte Carlo (MC) sampling. Unlike most methods, which select wavelengths according to their “importance”, MC-DOSC selects wavelengths according to their “unimportance”, which is measured by the DOSC weight W. Specifically, W is first normalized to give the probability of each wavelength being filtered out, which establishes a probability model for wavelength selection, and Monte Carlo random sampling is then used to obtain N wavelength subsets. In each sampling run, the selected wavelength points are used to build a PLS model, and the corresponding root mean square error of cross-validation (RMSECV) is calculated. After N random samplings, the wavelength subset whose PLS model has the minimum RMSECV is taken as the candidate subset. The spectral data in the candidate subset form a new spectral matrix, and the process is repeated until the RMSECV no longer decreases. When the iteration stops, the candidate subset with the smallest RMSECV is taken as the best wavelength subset. The algorithm is compared with three others: Monte Carlo Uninformative Variable Elimination (MCUVE), the Genetic Algorithm (GA), and Competitive Adaptive Reweighted Sampling (CARS). Experimental results show that the algorithm greatly reduces the number of wavelength points while improving the prediction ability of the corresponding PLS model.
On the corn data set, the number of wavelength points is reduced from 700 in the full spectrum to 15, the correlation coefficient of the prediction set increases from 0.8282 to 0.9314, and the root mean square error of prediction (RMSEP) decreases from 0.1098 to 0.0713. On the gasoline data set, the number of wavelength points is reduced from 301 in the full spectrum to 31, the correlation coefficient of the prediction set increases from 0.9875 to 0.9939, and the RMSEP decreases from 0.255 to 0.1788. On both data sets, the proposed algorithm outperforms the three algorithms it is compared with.
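
The procedure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify how many wavelengths are filtered per draw, the number of DOSC or PLS components, or the cross-validation scheme, so the parameters `drop_frac`, `n_draws`, `n_comp` and `n_folds`, the use of the absolute DOSC weights, and all function names are assumptions made here for illustration. The DOSC step follows the usual formulation (project y onto X, take the leading principal component of the part of X orthogonal to that projection, and map it back through the pseudoinverse of X).

```python
import numpy as np

def dosc_weights(Xc, yc, n_comp=1):
    """DOSC weights W (wavelengths x n_comp) for centered Xc and yc."""
    X_pinv = np.linalg.pinv(Xc)
    y_hat = Xc @ (X_pinv @ yc)                      # projection of y onto X
    Z = Xc - np.outer(y_hat, y_hat @ Xc) / max(y_hat @ y_hat, 1e-12)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    T = U[:, :n_comp] * s[:n_comp]                  # leading scores of Z
    return X_pinv @ T                               # W such that Xc @ W approximates T

def pls1_fit(Xc, yc, n_comp):
    """Regression vector of a PLS1 (NIPALS) model on centered data."""
    Xr, yr = Xc.copy(), yc.copy()
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr
        norm = np.linalg.norm(w)
        if norm < 1e-12:                            # nothing left to model
            break
        w = w / norm
        t = Xr @ w
        tt = t @ t
        p = Xr.T @ t / tt
        c = yr @ t / tt
        Xr = Xr - np.outer(t, p)                    # deflate X and y
        yr = yr - c * t
        W.append(w); P.append(p); q.append(c)
    if not W:
        return np.zeros(Xc.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def rmsecv(X, y, n_comp=3, n_folds=5):
    """Root mean square error of k-fold cross-validation for PLS1."""
    folds = np.arange(len(y)) % n_folds
    press = 0.0
    for k in range(n_folds):
        tr, te = folds != k, folds == k
        x_mean, y_mean = X[tr].mean(axis=0), y[tr].mean()
        b = pls1_fit(X[tr] - x_mean, y[tr] - y_mean, n_comp)
        press += np.sum((y[te] - y_mean - (X[te] - x_mean) @ b) ** 2)
    return np.sqrt(press / len(y))

def mc_dosc_select(X, y, n_draws=50, drop_frac=0.2, n_comp=3, seed=0):
    """Iteratively filter 'unimportant' wavelengths (assumed sampling scheme)."""
    rng = np.random.default_rng(seed)
    selected = np.arange(X.shape[1])
    best_err = rmsecv(X, y, n_comp)                 # full-spectrum baseline
    while len(selected) > n_comp:
        Xs = X[:, selected]
        w = np.abs(dosc_weights(Xs - Xs.mean(axis=0), y - y.mean())).sum(axis=1)
        w = w + 1e-12                               # keep every probability nonzero
        prob = w / w.sum()                          # probability of being filtered
        n_drop = max(1, int(drop_frac * len(selected)))
        cand_err, cand = np.inf, None
        for _ in range(n_draws):                    # Monte Carlo sampling
            drop = rng.choice(len(selected), size=n_drop, replace=False, p=prob)
            keep = np.delete(selected, drop)
            err = rmsecv(X[:, keep], y, n_comp)
            if err < cand_err:
                cand_err, cand = err, keep
        if cand_err >= best_err:                    # RMSECV no longer drops
            break
        best_err, selected = cand_err, cand
    return selected, best_err
```

Calling `mc_dosc_select(X, y)` with a samples-by-wavelengths matrix `X` and a property vector `y` returns the indices of the retained wavelengths and the RMSECV of that subset. Filtering a fixed fraction per draw is only one way to turn the normalized weights into a sampling scheme; an independent per-wavelength draw would be an equally plausible reading of the abstract.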
Received: 2021-01-07
Accepted: 2021-04-11
Corresponding Authors:
HONG Ming-jian
E-mail: hmj@cqu.edu.cn