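1. Research Center of TCM Information Engineering, Beijing University of Chinese Medicine, Beijing 100029, China 2. Ghent University-iMINDS, Department of Information Technology, Gent B-9050, Belgium 3. Henan College of Traditional Chinese Medicine, Zhengzhou, Henan 450008, China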
Genetic Algorithm Based Multi-Objective Least Square Support Vector Machine for Simultaneous Determination of Multiple Components by Near Infrared Spectroscopy
XU Bing1, WANG Xing1, 3, Dhaene Tom2, SHI Xin-yuan1*, Couckuyt Ivo2, BAI Yan3, QIAO Yan-jiang1*
1. Research Center of TCM Information Engineering, Beijing University of Chinese Medicine, Beijing 100029, China 2. Department of Information Technology, Ghent University-iMINDS, B-9050 Gent, Belgium 3. Henan College of Traditional Chinese Medicine, Zhengzhou 450008, China
Abstract: The near infrared (NIR) spectrum contains a global signature of a material's composition and therefore enables prediction of different properties of that material. In the present paper, a genetic algorithm and an adaptive modeling technique were applied to build a multi-objective least square support vector machine (MLS-SVM), intended to determine the concentrations of multiple components simultaneously by NIR spectroscopy. Both the benchmark corn dataset and an in-house Forsythia suspensa dataset were used to test the proposed approach. The results show that a genetic algorithm combined with adaptive modeling searches the LS-SVM hyperparameter space efficiently. For the corn data, the multi-objective LS-SVM performed significantly better than models built with the PLS1 and PLS2 algorithms. For the Forsythia suspensa data, its performance was equivalent to that of the PLS1 and PLS2 models. Over-fitting was observed for the RBFNN models on both datasets. The single-objective LS-SVM and the MLS-SVM did not differ much in performance, but the convenience of building one model for all components makes MLS-SVM a candidate for multicomponent NIR analysis.
Key words: Multi-objective least square support vector machine; Genetic algorithm; Near infrared; Multicomponent quantification; Adaptive modeling
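The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the kernel, the fitness function, or the genetic operators, so the RBF kernel, the shared hyperparameter pair (`gamma`, `sigma`), the averaged cross-validated RMSE used as a single fitness value, and all function names (`lssvm_fit`, `cv_rmse`, `ga_search`) are assumptions made for illustration. In particular, averaging the per-component errors is a simplification of a true multi-objective search.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Squared Euclidean distances between rows of A and rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, Y, gamma, sigma):
    # Solve the LS-SVM linear system once for all output columns of Y:
    # [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; Y]
    n = X.shape[0]
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.vstack([np.zeros((1, Y.shape[1])), Y])
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]  # bias per output (m,), alpha (n, m)

def lssvm_predict(X_train, b, alpha, sigma, X_new):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def cv_rmse(X, Y, gamma, sigma, k=5):
    # k-fold cross-validated RMSE, one value per output component.
    n = X.shape[0]
    sq = np.zeros(Y.shape[1])
    for test in np.array_split(np.arange(n), k):
        train = np.setdiff1d(np.arange(n), test)
        b, alpha = lssvm_fit(X[train], Y[train], gamma, sigma)
        pred = lssvm_predict(X[train], b, alpha, sigma, X[test])
        sq += ((pred - Y[test]) ** 2).sum(0)
    return np.sqrt(sq / n)

def ga_search(X, Y, pop_size=12, generations=8, seed=0):
    # Individuals are (log10 gamma, log10 sigma); the bounds are assumptions.
    rng = np.random.default_rng(seed)
    lo, hi = np.array([-1.0, -1.0]), np.array([4.0, 2.0])
    pop = rng.uniform(lo, hi, size=(pop_size, 2))

    def fitness(ind):
        # Scalarized objective: mean of the per-component RMSE values.
        return cv_rmse(X, Y, 10.0 ** ind[0], 10.0 ** ind[1]).mean()

    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[: pop_size // 2]]  # keep best half
        idx = rng.integers(0, len(parents), size=(pop_size - len(parents), 2))
        w = rng.uniform(size=(len(idx), 1))
        children = w * parents[idx[:, 0]] + (1.0 - w) * parents[idx[:, 1]]  # blend crossover
        children += rng.normal(scale=0.2, size=children.shape)  # Gaussian mutation
        pop = np.vstack([parents, np.clip(children, lo, hi)])
    scores = np.array([fitness(ind) for ind in pop])
    best = pop[np.argmin(scores)]
    return 10.0 ** best[0], 10.0 ** best[1], scores.min()
```

Calling `ga_search(X, Y)` with a spectra matrix `X` (samples by wavelengths) and a concentration matrix `Y` (samples by components) returns the selected `gamma`, `sigma`, and the corresponding mean cross-validated RMSE. Under the shared-hyperparameter assumption made here, each column of `alpha` equals what a separate single-output fit would give, so the practical gain in this sketch is the single model-building pass rather than a change in predictions.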