%A CHENG Jie-hong^{1}, CHEN Zheng-guang^{1, 2*}, YI Shu-juan^{2}
%T Wavelength Selection Algorithm Based on Minimum Correlation Coefficient for Multivariate Calibration
%0 Journal Article
%D 2022
%J SPECTROSCOPY AND SPECTRAL ANALYSIS
%R 10.3964/j.issn.1000-0593(2022)03-0719-07
%P 719-725
%V 42
%N 03
%U {http://www.gpxygpfx.com/CN/abstract/article_12533.shtml}
%8 2022-03-01
%X In quantitative near-infrared (NIR) spectroscopy, instrument precision keeps increasing, so the collected spectral data usually have very high dimension. Wavelength selection is therefore essential for eliminating noise and redundant variables, simplifying the model, and improving its predictive performance. Many methods exist for selecting characteristic wavelengths in NIR spectroscopy, but multicollinearity among variables remains a key cause of poor model performance. Collinearity between variables can be analyzed with the correlation coefficient: a correlation coefficient above 0.8 indicates multicollinearity. This paper therefore takes the correlation coefficient between variables as the selection criterion and proposes a wavelength selection method that minimizes the collinearity among the selected variables, called the Minimal Correlation Coefficient (MCC) method. The method is based on the correlation coefficient matrix of the spectral data. It selects the wavelengths whose correlation coefficients with the other wavelengths have the smaller mean and standard deviation as the candidate modeling wavelength set, so that the linear correlation among the wavelengths in the set is minimized and collinearity between variables is eliminated from the model. The standardized regression coefficient is then used to select the wavelengths that have a greater impact on the dependent variable, yielding the prediction model. To verify the effectiveness of the proposed algorithm, wavelength selection was carried out with MCC on two public NIR spectroscopy data sets (a diesel data set and a soil data set) and compared with several other commonly used wavelength selection methods: the successive projections algorithm (SPA), competitive adaptive reweighted sampling (CARS), random frog (RF), and iteratively retains informative variables (IRIV). 
The experimental results show that the MCC algorithm has good prediction performance: the prediction accuracy of the MCC-based model is better than that of SPA, CARS, and RF, and roughly the same as that of IRIV. The minimum correlation coefficient method is therefore an effective wavelength selection algorithm that reduces dimensionality efficiently and improves the model's prediction precision.
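
The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the mean and standard deviation of the correlation coefficients are combined, or how many wavelengths each stage keeps, so the summed score and the parameters `n_candidates` and `n_final` are assumptions, as are all function names and the synthetic data.

```python
import numpy as np

def mcc_candidates(X, n_candidates):
    """Stage 1: keep wavelengths least correlated with the others."""
    R = np.abs(np.corrcoef(X, rowvar=False))      # (p, p) |correlation| matrix
    p = R.shape[0]
    off_diag = ~np.eye(p, dtype=bool)             # drop self-correlations
    others = R[off_diag].reshape(p, p - 1)
    # Assumption: rank by mean + std; the abstract only says both are small.
    score = others.mean(axis=1) + others.std(axis=1)
    return np.argsort(score)[:n_candidates]

def standardized_coefficients(X, y):
    """Least-squares coefficients on z-scored data (non-constant columns)."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return beta

def mcc_select(X, y, n_candidates, n_final):
    """Stage 2: among candidates, keep the largest |standardized coefficient|."""
    cand = mcc_candidates(X, n_candidates)
    beta = standardized_coefficients(X[:, cand], y)
    keep = np.argsort(-np.abs(beta))[:n_final]
    return np.sort(cand[keep])

# Synthetic demo (hypothetical data): columns 0-3 are nearly collinear,
# columns 4 and 5 are independent, and y depends on column 4 only.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([base + 0.05 * rng.normal(size=(200, 4)),
               rng.normal(size=(200, 2))])
y = 3.0 * X[:, 4] + 0.1 * rng.normal(size=200)

candidates = mcc_candidates(X, n_candidates=3)
selected = mcc_select(X, y, n_candidates=3, n_final=1)
```

In this toy setting the two independent columns enter the candidate set ahead of the collinear block, and the coefficient step then keeps the column that actually drives `y`. On real spectra the candidate count and the final count would need to be tuned, for example by cross-validation.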