The Characteristic Spectral Selection Method Based on Forward and Backward Interval Partial Least Squares
QU Fang-fang1, REN Dong1*, HOU Jin-jian1,2, ZHANG Zhong1, LU An-xiang2, WANG Ji-hua1,2, XU Hong-lei3
1. College of Computer and Information Technology, Three Gorges University, Yichang 443002, China 2. Beijing Research Center for Agricultural Standards and Testing, Beijing 100097, China 3. Department of Mathematics and Statistics, Curtin University, Perth 6845, Australia
Abstract: In near-infrared spectroscopy, Forward Interval Partial Least Squares (FiPLS) and Backward Interval Partial Least Squares (BiPLS) are commonly used modeling methods based on wavelength variable selection. These methods usually achieve high prediction accuracy, but both are greedy searches, so the selected intervals may not adequately represent the analyte information. To address this problem, a spectral characteristic interval selection strategy (FB-iPLS) that combines FiPLS and BiPLS is proposed. After the spectrum is segmented into intervals, FiPLS is used to select useful intervals and BiPLS is used to delete useless ones. Selection and deletion of the characteristic variables are performed alternately, giving a two-way choice of the target characteristic variables that is intended to improve the robustness of the model. Experiments on determining the ethanol concentration in pure water were conducted by modeling with FiPLS, BiPLS and the proposed method. Because the interval size affects the model result, the experiments also compared the three methods under different numbers of intervals. When the spectrum is divided into 60 segments, the FB-iPLS method obtains the best prediction performance: the correlation coefficients (r) of the calibration set and validation set are 0.9677 and 0.9670, respectively, and the cross-validation root mean square errors (RMSECV) are 0.0888 and 0.0571, respectively. Compared with FiPLS and BiPLS, the overall prediction performance of the proposed model is better. The experiments show that the proposed method can further improve the predictive performance of the model by mitigating the greedy-search behavior of FiPLS and BiPLS, and that it selects characteristic intervals more efficiently and more representatively.
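The alternating selection and deletion described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the stopping rule, the number of latent variables, or the cross-validation scheme, so the sketch assumes that a step is accepted only when it lowers RMSECV, uses two PLS components, and uses interleaved 5-fold cross-validation. The function names (`pls1_fit_predict`, `rmsecv`, `fb_ipls`) and the synthetic data are hypothetical.

```python
import numpy as np

def pls1_fit_predict(X_train, y_train, X_test, n_components=2):
    """Fit a single-response PLS model (NIPALS) and predict X_test."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    X, y = X_train - x_mean, y_train - y_mean
    W, P, q = [], [], []
    for _ in range(min(n_components, X.shape[1])):
        w = X.T @ y
        norm = np.linalg.norm(w)
        if norm < 1e-12:           # nothing left to explain
            break
        w = w / norm
        t = X @ w                  # scores
        tt = t @ t
        p = X.T @ t / tt           # X loadings
        c = (y @ t) / tt           # y loading
        X = X - np.outer(t, p)     # deflate X and y
        y = y - c * t
        W.append(w); P.append(p); q.append(c)
    if not W:
        return np.full(X_test.shape[0], y_mean)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)   # regression coefficients
    return (X_test - x_mean) @ beta + y_mean

def rmsecv(X, y, intervals, selected, n_folds=5, n_components=2):
    """Cross-validated RMSE using only the columns of the selected intervals."""
    cols = np.concatenate([intervals[i] for i in sorted(selected)])
    n = len(y)
    residuals = []
    for k in range(n_folds):
        test_idx = np.arange(k, n, n_folds)              # interleaved folds (assumption)
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        pred = pls1_fit_predict(X[np.ix_(train_idx, cols)], y[train_idx],
                                X[np.ix_(test_idx, cols)], n_components)
        residuals.append(pred - y[test_idx])
    return float(np.sqrt(np.mean(np.concatenate(residuals) ** 2)))

def fb_ipls(X, y, n_intervals=10, max_rounds=20, n_folds=5, n_components=2):
    """Alternate a forward (add) step and a backward (delete) step over intervals."""
    intervals = np.array_split(np.arange(X.shape[1]), n_intervals)
    selected, best = set(), np.inf
    for _ in range(max_rounds):
        changed = False
        # Forward step: add the interval that lowers RMSECV the most.
        candidates = [i for i in range(n_intervals) if i not in selected]
        if candidates:
            scores = {i: rmsecv(X, y, intervals, selected | {i}, n_folds, n_components)
                      for i in candidates}
            i_add = min(scores, key=scores.get)
            if scores[i_add] < best:
                selected.add(i_add); best = scores[i_add]; changed = True
        # Backward step: delete an interval if the model improves without it.
        if len(selected) > 1:
            scores = {i: rmsecv(X, y, intervals, selected - {i}, n_folds, n_components)
                      for i in selected}
            i_del = min(scores, key=scores.get)
            if scores[i_del] < best:
                selected.remove(i_del); best = scores[i_del]; changed = True
        if not changed:            # neither step helped: stop (assumed rule)
            break
    return sorted(selected), best

# Synthetic demonstration (not the ethanol data from the paper):
# only columns 40..49 carry information about y.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 100))
y_demo = X_demo[:, 40:50].sum(axis=1) + 0.05 * rng.normal(size=60)
chosen, score = fb_ipls(X_demo, y_demo, n_intervals=10)
print("selected intervals:", chosen, "RMSECV:", round(score, 4))
```

Because every accepted step must strictly lower RMSECV, the loop cannot cycle, and the deletion step lets the search undo an earlier forward choice that later becomes redundant, which is the two-way behavior the abstract attributes to FB-iPLS. Changing `n_intervals` reproduces, in spirit, the paper's comparison across different segment counts.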