Study on the Application of Supervised Principal Component Regression Procedure to Near-Infrared Spectroscopy Quantitative Analysis
LIU Xu-hua1, 2, XU Xing-zhong1, HE Xiong-kui2, ZHANG Lu-da2*
1. College of Science, Beijing Institute of Technology, Beijing 100081, China
2. College of Science, China Agricultural University, Beijing 100193, China
Abstract: This paper introduces the principle of a new modeling method, supervised principal component regression, and uses it to establish a model for near-infrared (NIR) spectroscopy quantitative analysis. Because spectral data are high-dimensional, establishing a quantitative analysis model usually runs into difficulties such as collinearity. The new method proceeds in two steps. First, wavelength information is selected according to a chosen criterion in order to reduce the dimension of the spectral data. Second, the selected lower-dimensional spectral data set is used to establish a principal component regression model. Sixty-six wheat samples served as experimental material: forty samples were chosen at random to establish the prediction model, and the remaining twenty-six were used as the prediction set. In this example, four wavelengths, 4 632, 4 636, 5 994 and 5 997 cm⁻¹, were first selected according to the correlation coefficients between the response variable and the spectral data at each wavelength. Two principal components of the spectral data at those four wavelengths were then extracted to establish the principal component regression model, and the model was applied to the prediction set. Between the model predictions and the Kjeldahl values for protein content, the correlation coefficient was 0.991 and the average relative error was 1.5%. Selecting the most significant wavelength information from a large body of spectral data is important not only because it alleviates the influence of collinearity in modeling, but also because it can guide the design of dedicated NIR instruments for analyzing a specific component in a given kind of sample.
Key words: Near-infrared spectroscopy; Supervised principal component regression; Quantitative analysis
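The two-step procedure described in the abstract (screen wavelengths by their correlation with the response, then run principal component regression on the retained wavelengths) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the synthetic spectra, the noise levels, and the column indices that carry signal are all assumptions made for illustration, and the selection criterion is assumed to be the absolute Pearson correlation.

```python
import numpy as np

def supervised_pcr_fit(X, y, n_select=4, n_components=2):
    """Screen columns of X by |correlation| with y, then fit PCR on them."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Step 1: absolute Pearson correlation between each wavelength and y.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, np.inf)
    selected = np.sort(np.argsort(corr)[-n_select:])
    # Step 2: principal components of the selected, centered columns (SVD).
    x_mean = X[:, selected].mean(axis=0)
    Xs = X[:, selected] - x_mean
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    loadings = Vt[:n_components].T  # shape (n_select, n_components)
    scores = Xs @ loadings
    # Step 3: least-squares regression of y on the scores, with intercept.
    A = np.column_stack([np.ones(len(y)), scores])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {"selected": selected, "x_mean": x_mean,
            "loadings": loadings, "beta": beta}

def supervised_pcr_predict(model, X):
    """Project new spectra onto the stored loadings and apply the regression."""
    X = np.asarray(X, dtype=float)
    scores = (X[:, model["selected"]] - model["x_mean"]) @ model["loadings"]
    return model["beta"][0] + scores @ model["beta"][1:]

def make_synthetic_spectra(seed=0, n_samples=66, n_wavelengths=200):
    """Hypothetical data: four columns carry the response, the rest are noise."""
    rng = np.random.default_rng(seed)
    y = rng.uniform(10.0, 16.0, size=n_samples)  # stand-in for protein content
    X = rng.normal(0.0, 0.01, size=(n_samples, n_wavelengths))
    for j in (50, 51, 120, 121):  # assumed informative columns
        X[:, j] += 0.05 * y
    return X, y

# 40 calibration samples and 26 prediction samples, mirroring the abstract.
# The synthetic rows are already in random order, so a simple slice is used.
X, y = make_synthetic_spectra(seed=0)
model = supervised_pcr_fit(X[:40], y[:40], n_select=4, n_components=2)
pred = supervised_pcr_predict(model, X[40:])
r = np.corrcoef(pred, y[40:])[0, 1]
mean_rel_err = np.mean(np.abs(pred - y[40:]) / y[40:])
print("selected columns:", model["selected"])
print("correlation:", r, "mean relative error:", mean_rel_err)
```

The screening step is what makes the regression "supervised": the principal components are computed only from columns already related to the response, so the leading components are less likely to be dominated by variation that is irrelevant to it. The figures printed by this sketch come from synthetic data and are not comparable to the 0.991 and 1.5% reported for the wheat samples.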
LIU Xu-hua1, 2, XU Xing-zhong1, HE Xiong-kui2, ZHANG Lu-da2*. Study on the Application of Supervised Principal Component Regression Procedure to Near-Infrared Spectroscopy Quantitative Analysis. Spectroscopy and Spectral Analysis, 2009, 29(11): 2959-2961.