Wavelength Selection Algorithm Based on Minimum Correlation Coefficient for Multivariate Calibration
CHENG Jie-hong1, CHEN Zheng-guang1, 2*, YI Shu-juan2
1. College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Daqing 163319, China
2. Heilongjiang Engineering Technology Research Center for Rice Ecological Seedlings Device and Whole Process Mechanization, Daqing 163319, China
Abstract In quantitative analysis by near-infrared (NIR) spectroscopy, the collected spectral data usually have very high dimensionality because instrument precision keeps increasing. Wavelength selection is therefore essential for eliminating noise and redundant variables, simplifying the model, and improving its predictive performance. Many methods exist for selecting characteristic wavelengths in NIR spectroscopy, but multicollinearity among variables remains a key cause of poor model performance. Collinearity between variables can be analyzed with the correlation coefficient; a correlation coefficient above 0.8 indicates multicollinearity. This paper therefore takes the correlation coefficient between variables as the selection criterion and proposes a wavelength selection method that minimizes collinearity among the selected variables, called the Minimum Correlation Coefficient (MCC) method. The method is based on the correlation coefficient matrix of the spectral data. It selects the wavelengths whose correlation coefficients with the other wavelengths have a smaller mean and standard deviation as the candidate modeling wavelength set, so that the linear correlation among the wavelengths in the set is minimized and collinearity between variables is eliminated from the model. The standardized regression coefficients are then used to select the wavelengths with the greatest influence on the dependent variable, yielding the prediction model. To verify the effectiveness of the proposed algorithm, it was tested on two public NIR spectroscopy data sets (a diesel data set and a soil data set). Wavelengths were selected with the MCC algorithm, and the results were compared with those of several commonly used wavelength selection methods: the successive projections algorithm (SPA), competitive adaptive reweighted sampling (CARS), random frog (RF), and iteratively retaining informative variables (IRIV). The experimental results show that the MCC algorithm has good prediction performance: the prediction accuracy of the MCC-based model is better than that of SPA, CARS, and RF, and roughly the same as that of IRIV. The minimum correlation coefficient method is therefore an effective wavelength selection algorithm that reduces dimensionality efficiently and improves the prediction precision of the model.
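The two-stage selection described in the abstract can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the mean and standard deviation of the correlation coefficients are combined, how many wavelengths are kept at each stage, or which regression model supplies the standardized coefficients. Here the two statistics are combined by a simple rank sum, the set sizes are passed in as parameters, and ordinary least squares on z-scored data stands in for the regression step. All function and parameter names (`mcc_candidates`, `standardized_coefs`, `mcc_select`, `n_candidates`, `n_final`) are hypothetical.

```python
import numpy as np

def mcc_candidates(X, n_candidates):
    """Stage 1: keep wavelengths with small mean and std of |r| to the others.

    X is (samples, wavelengths). The rank-sum combination is an assumption.
    """
    R = np.abs(np.corrcoef(X, rowvar=False))        # (p, p) |correlation| matrix
    p = R.shape[0]
    off_diag = ~np.eye(p, dtype=bool)               # drop self-correlations
    others = R[off_diag].reshape(p, p - 1)          # row j: |r| of j vs. the rest
    mean_r = others.mean(axis=1)
    std_r = others.std(axis=1)
    # Rank each statistic (0 = smallest) and add the ranks.
    score = mean_r.argsort().argsort() + std_r.argsort().argsort()
    return np.sort(np.argsort(score, kind="stable")[:n_candidates])

def standardized_coefs(X, y):
    """Least-squares coefficients after z-scoring X and y (assumed model)."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return beta

def mcc_select(X, y, n_candidates, n_final):
    """Stage 2: among the candidates, keep the largest |standardized coef|."""
    cand = mcc_candidates(X, n_candidates)
    beta = standardized_coefs(X[:, cand], y)
    keep = np.argsort(-np.abs(beta))[:n_final]
    return np.sort(cand[keep])

# Synthetic demo: columns 0-2 are near-copies of one signal (collinear),
# columns 3-5 are independent, and y depends on column 3 only.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X_demo = np.hstack([base + 0.05 * rng.normal(size=(200, 3)),
                    rng.normal(size=(200, 3))])
y_demo = 3.0 * X_demo[:, 3] + 0.1 * rng.normal(size=200)
selected = mcc_select(X_demo, y_demo, n_candidates=3, n_final=1)
```

In this toy setting the collinear block is penalized because each of its columns has a high mean and a wide spread of correlations with the other columns. On real spectra, the set sizes would need to be tuned (for example by cross-validation), and constant columns would have to be removed before z-scoring.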
Received: 2021-02-10
Accepted: 2021-05-07
Corresponding Author:
CHEN Zheng-guang
E-mail: ruzee@sina.com