A Novel Approach to NIR Spectral Quantitative Analysis: Semi-Supervised Least-Squares Support Vector Regression Machine

doi:10.3964/j.issn.1000-0593(2011)10-2702-04


Abstract: In near-infrared (NIR) spectral quantitative analysis, the accuracy with which the samples' chemical values are measured is the theoretical limit on the precision of any quantitative analysis based on a mathematical model. However, only a small number of samples have accurately measured chemical values, and many models are built on those samples alone, ignoring the large number of samples without chemical values. To address this problem, this paper proposes a semi-supervised LS-SVR (S²LS-SVR) model, built on LS-SVR, that can use samples without chemical values (unlabeled) together with samples with chemical values (labeled). As with LS-SVR, training the model only requires solving a single system of linear equations. Finally, with a flue-cured tobacco data set as experimental material, quantitative analysis models were built for the contents of four sample components (total sugar, reducing sugar, total nitrogen and nicotine) using PLS regression, LS-SVR and S²LS-SVR. For the S²LS-SVR model, the average relative errors between predicted and actual values for the four components are 6.62%, 7.56%, 6.11% and 8.20%, respectively, and the correlation coefficients are 0.9741, 0.9733, 0.9230 and 0.9486, respectively. The comparison shows that the S²LS-SVR model outperforms PLS and LS-SVR, which supports its feasibility and effectiveness.

Key words: Near infrared spectrum; Chemometrics; Semi-supervised LS-SVR (S²LS-SVR)
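The abstract notes that, as with LS-SVR, training reduces to solving one linear system, and it reports results as average relative error and correlation coefficient. The sketch below illustrates only the standard supervised LS-SVR system and those two metrics. It is not the paper's S²LS-SVR formulation, which this page does not give. The RBF kernel and the names `gamma`, `sigma`, `lssvr_fit` and `lssvr_predict` are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvr_fit(X, y, gamma=10.0, sigma=1.0):
    """Train standard LS-SVR by solving one (n+1) x (n+1) linear system:

        [ 0   1^T         ] [ b     ]   [ 0 ]
        [ 1   K + I/gamma ] [ alpha ] = [ y ]
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

def lssvr_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Predict with f(x) = sum_i alpha_i * k(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def mean_relative_error(y_true, y_pred):
    """Average of |prediction - actual| / |actual| over all samples."""
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))

def correlation(y_true, y_pred):
    """Pearson correlation coefficient between actual and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])
```

In this sketch a spectrum would be one row of `X` and the measured chemical value the matching entry of `y`. Unlabeled spectra play no role here, which is exactly the limitation the S²LS-SVR model is meant to address.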

Received: 2010-12-18; Revised: 2011-03-21

Chinese Library Classification (CLC) number: O657.3

Corresponding author: XU Shuo (徐硕), E-mail: xush@istic.ac.cn

Cite this article:

LI Lin¹, XU Shuo²*, AN Xin³, ZHANG Lu-da⁴. A Novel Approach to NIR Spectral Quantitative Analysis: Semi-Supervised Least-Squares Support Vector Regression Machine[J]. Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2011, 31(10): 2702-2705.

Link to this article:

https://www.gpxygpfx.com/CN/10.3964/j.issn.1000-0593(2011)10-2702-04 or https://www.gpxygpfx.com/CN/Y2011/V31/I10/2702
