Wavelength Selection by siPLS-LASSO for NIR Spectroscopy and Its Application
MEI Cong-li1,2, CHEN Yao2, YIN Liang2, JIANG Hui2, CHEN Xu2, DING Yu-han2, LIU Guo-hai2
1. School of Electrical Engineering, Zhejiang University of Water Resources and Electric Power, Hangzhou 310018, China
2. School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, China
Abstract: Near-infrared (NIR) spectroscopy is widely used for production process monitoring and product quality testing, especially in the food and drug industries. Its advantages include no sample pretreatment, low cost, non-destructive detection, and fast determination. However, modeling with the whole spectrum can lead to complex models with poor stability. Synergy interval PLS (siPLS) is the most common dimensionality reduction method for spectral data, but it cannot deal with the collinearity of spectral data. The least absolute shrinkage and selection operator (LASSO) is a relatively new method for data dimensionality reduction, but its instability on small samples cannot be ignored. To address these disadvantages of siPLS and LASSO in NIR calibration, a novel wavelength selection method named siPLS-LASSO is proposed. It was validated by monitoring pH values in a wheat-straw solid-state fermentation process. In the method, siPLS is first used to select intervals of the NIR spectrum. LASSO is then used to select wavelengths within the selected intervals. Finally, the selected wavelengths are used to construct a PLS model for prediction. For comparison, several conventional wavelength selection methods were also studied. In the case study, 33 wavelengths were eventually selected by the siPLS-LASSO method and used for PLS modelling. The root mean square error of prediction (RMSEP) and the correlation coefficient of prediction (Rp) of the model were 0.0711 and 0.9808, respectively. The results show that the proposed siPLS-LASSO is an effective wavelength selection method and can improve the prediction performance of models.
Key words: NIR spectroscopy; Wavelength selection; LASSO; siPLS; Solid-state fermentation process
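As a rough illustration of the second stage described in the abstract (LASSO picking individual wavelengths inside the intervals that siPLS has already retained) and of the two reported figures of merit, the following pure-Python sketch may help. It is not the authors' implementation: the function names `lasso_select`, `rmsep`, and `pearson_r` are hypothetical, the penalty value `lam` is an arbitrary assumption, the solver is a textbook coordinate-descent LASSO, and the siPLS interval search and the final PLS regression are omitted.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO shrinkage step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_select(X, y, lam, n_iter=200):
    """Coordinate-descent LASSO; returns (selected column indices, coefficients).

    Minimizes (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 on centered data.
    X is a list of spectra (rows) restricted to the siPLS-selected intervals.
    """
    n, p = len(X), len(X[0])
    # Center columns and response so that no intercept term is needed.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    residual = [v - y_mean for v in y]  # equals yc - Xc @ beta with beta = 0
    beta = [0.0] * p
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            # Covariance of column j with the partial residual (beta_j removed).
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * Xc[i][j]
                beta[j] = new_beta
    selected = [j for j in range(p) if beta[j] != 0.0]
    return selected, beta


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Correlation coefficient between reference and predicted values (Rp)."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)
```

In such a sketch, the indices returned by `lasso_select` would be mapped back to wavelengths and passed to a PLS regression, whose predictions on a held-out set would then be scored with `rmsep` and `pearson_r`. In practice `lam` would normally be tuned, for example by cross-validation.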