Abstract: In this paper, Gaussian process (GP) regression is applied as a chemometric method to model the complex relationship between near-infrared (NIR) spectra and ingredient content. Outliers were first detected by Monte Carlo cross-validation (MCCV) and removed from the dataset. Different preprocessing methods, such as multiplicative scatter correction (MSC), smoothing and derivatives, were then compared to find the best-performing models. Furthermore, uninformative variable elimination (UVE) was used as a variable selection technique, and the characteristic wavelengths it selected were used as model inputs. A public dataset of 80 NIR spectra of corn served as an example for evaluating the proposed approach. Optimal models for oil, starch and protein were obtained by GP regression. The performance of the final models was evaluated by the root mean square error of calibration (RMSEC), root mean square error of cross-validation (RMSECV), root mean square error of prediction (RMSEP) and correlation coefficient (r). The models show good calibration ability, with r values above 0.99, and satisfactory prediction ability, with r values higher than 0.96. Overall, the results demonstrate that the GP algorithm is an effective chemometric method and is promising for NIR analysis.
Key words: Gaussian process; Near-infrared spectroscopy; Monte Carlo cross-validation; Uninformative variable elimination; Quantitative analysis
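As an illustration of two steps named in the abstract, the following minimal Python sketch shows one common form of multiplicative scatter correction and the two kinds of evaluation statistics (root mean square error and correlation coefficient r). It is not the authors' implementation. The function names `msc`, `rmse` and `pearson_r` are hypothetical, the MSC reference is assumed to be the mean spectrum, and the RMSE is assumed to divide by the number of samples n; the abstract does not specify either choice.

```python
import math


def msc(spectra):
    """Multiplicative scatter correction (assumed reference: the mean spectrum).

    Each spectrum is regressed on the reference by least squares,
    then corrected as (spectrum - intercept) / slope.
    """
    n = len(spectra)
    p = len(spectra[0])
    # Reference spectrum: mean absorbance at each wavelength.
    ref = [sum(s[j] for s in spectra) / n for j in range(p)]
    ref_mean = sum(ref) / p
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / p
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


def rmse(y_true, y_pred):
    """Root mean square error with denominator n (an assumption).

    Applied to calibration samples this gives RMSEC, to cross-validation
    predictions RMSECV, and to an independent prediction set RMSEP.
    """
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)
```

In this sketch the same `rmse` function yields RMSEC, RMSECV or RMSEP depending only on which samples and predictions are passed in, so the three statistics differ in how the predictions were generated rather than in the formula.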