Near Infrared Spectroscopy Synergy Interval Wavelength Selection Method Using the LSSVM Model
PENG Xiu-hui, HUANG Chang-yi, LIU Fei*, LIU Yan
Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Institute of Automation, Jiangnan University, Wuxi 214122, China
Abstract: Traditional wavelength selection methods for near-infrared spectroscopy ignore nonlinear factors in the model. To address this shortcoming, this paper proposes a wavelength selection algorithm for nonlinear models, synergy interval least squares support vector machines (siLSSVM). It combines the least squares support vector machine (LSSVM), which can handle nonlinearity, with the interval strategy of wavelength selection and the idea of synergy intervals. The algorithm was verified on near-infrared spectral data of apple sugar content. Compared with the traditional siPLS wavelength selection method, the root-mean-square error of prediction (RMSEP) decreased by 37.43% and 47.88% when predicting with the PLS and LSSVM models, respectively, and the correlation coefficient of the prediction set (RP) increased by 6.04% and 7.31%, respectively. The example shows that, for spectral data with strong nonlinear factors, siLSSVM can effectively select the optimal wavelength intervals and improve the prediction accuracy and robustness of the model, offering a new approach to wavelength selection for near-infrared spectra under nonlinear factors.
Key words: Synergy interval least squares support vector machines; Nonlinear factors; Apple sugar content; Near-infrared spectroscopy; Wavelength selection
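The abstract gives only the outline of siLSSVM: split the spectrum into intervals, combine intervals, and judge each combination with an LSSVM model. The following is a minimal sketch of that idea, not the authors' implementation. The RBF kernel, the 5-fold cross-validation criterion, the hyperparameters `gamma` and `sigma`, the equal-width interval split, and all function names are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))


def lssvm_fit(X, y, gamma, sigma):
    """Solve the LSSVM regression linear system for bias b and weights alpha."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                      # constraint: sum(alpha) = 0
    A[1:, 0] = 1.0                      # bias column
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]


def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Predict with a fitted LSSVM model."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b


def rmse(y_true, y_pred):
    """Root-mean-square error; on a prediction set this is the RMSEP."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def cv_rmse(X, y, gamma, sigma, n_folds=5):
    """Cross-validated RMSE of an LSSVM model (assumed selection criterion)."""
    idx = np.arange(len(y))
    pred = np.empty_like(y, dtype=float)
    for test_idx in np.array_split(idx, n_folds):
        train_idx = np.setdiff1d(idx, test_idx)
        b, alpha = lssvm_fit(X[train_idx], y[train_idx], gamma, sigma)
        pred[test_idx] = lssvm_predict(X[train_idx], b, alpha,
                                       X[test_idx], sigma)
    return rmse(y, pred)


def si_lssvm(X, y, n_intervals=10, n_combined=2, gamma=100.0, sigma=1.0):
    """Search all combinations of n_combined intervals; return (score, combo)."""
    intervals = np.array_split(np.arange(X.shape[1]), n_intervals)
    best_score, best_combo = np.inf, None
    for combo in combinations(range(n_intervals), n_combined):
        cols = np.concatenate([intervals[i] for i in combo])
        score = cv_rmse(X[:, cols], y, gamma, sigma)
        if score < best_score:
            best_score, best_combo = score, combo
    return best_score, best_combo
```

Here `X` is a samples-by-wavelengths array and `y` the reference values (for example, sugar content). The returned `combo` holds the indices of the selected intervals; a final PLS or LSSVM model would then be trained on those columns and evaluated with `rmse` on a separate prediction set. In practice `gamma` and `sigma` would need tuning, which this sketch leaves out.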