Study on the Selection of Spectral Preprocessing Methods
DIWU Peng-yao, BIAN Xi-hui*, WANG Zi-fang, LIU Wei
State Key Laboratory of Separation Membranes and Membrane Processes, School of Environmental and Chemical Engineering, Tianjin Polytechnic University, Tianjin 300387, China
Abstract Spectral signals of complex samples are usually disturbed by stray light, noise, baseline drift and other undesirable factors, which can affect the final qualitative and quantitative analysis results. It is therefore necessary to preprocess the raw spectra before modeling. Finding a proper preprocessing method among the existing ones is a difficult problem. One strategy is to choose the preprocessing method by directly observing the characteristics of the spectral signal, which requires no modeling and is easier to interpret. However, this can be difficult and subjective when the interferences are subtle or multiple, and it may lead to misleading results. Another strategy is based on modeling performance, which does not require observing the spectral characteristics, but a large number of preprocessing methods must be investigated, which is time-consuming for large datasets. It is therefore necessary to explore which selection strategy is more scientific and reasonable. In this study, nine datasets were used to investigate the necessity of preprocessing and the choice of preprocessing method through permutations and combinations of 10 preprocessing methods. First, the number of latent variables of partial least squares (PLS), the window size of the first derivative (1st Der), second derivative (2nd Der) and Savitzky-Golay (SG) smoothing, and the wavelet function and decomposition scale of the continuous wavelet transform (CWT) were each optimized. Then, no preprocessing and 10 preprocessing methods, namely 1st Der, 2nd Der, CWT, multiplicative scatter correction (MSC), standard normal variate (SNV), SG smoothing, mean centering, normalization, Pareto scaling and auto scaling, were combined in the order of baseline correction, scatter correction, smoothing and scaling. A total of 120 preprocessing methods and combinations were obtained.
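The count of 120 is consistent with choosing at most one method per step in the stated order. The Python sketch below illustrates that enumeration; it is not the authors' code. The assignment of each method to a step (derivatives and CWT as baseline correction, MSC and SNV as scatter correction, SG as smoothing, the remaining four as scaling) is an assumption based on common usage, and `STEPS` and `enumerate_pipelines` are hypothetical names.

```python
from itertools import product

# Assumed grouping of the ten methods by purpose.
# "none" means the step is skipped.
STEPS = {
    "baseline": ["none", "1st Der", "2nd Der", "CWT"],
    "scatter": ["none", "MSC", "SNV"],
    "smoothing": ["none", "SG smoothing"],
    "scaling": ["none", "mean centering", "normalization",
                "Pareto scaling", "auto scaling"],
}


def enumerate_pipelines(steps=STEPS):
    """Return every pipeline as a tuple of applied methods, in step order."""
    pipelines = []
    for choice in product(*steps.values()):
        # Drop skipped steps; the empty tuple stands for no preprocessing.
        pipelines.append(tuple(m for m in choice if m != "none"))
    return pipelines


pipelines = enumerate_pipelines()
print(len(pipelines))  # 4 * 3 * 2 * 5 = 120
```

Under this grouping the empty pipeline (no preprocessing) is one of the 120 candidates, so the raw spectra are evaluated alongside every preprocessed variant.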
Finally, the characteristics of the spectral signals and the root mean squared error of prediction (RMSEP) of the PLS models for the 120 preprocessing methods were analyzed for the nine datasets and for different components of the same dataset. The results show that, compared with observing the characteristics of the spectral signals, the optimal preprocessing method can be selected more accurately from the modeling performance for the given spectra and predicted component. For most datasets, an appropriate preprocessing method improves the modeling performance. The optimal preprocessing method differs between datasets because the datasets differ in information content and complexity. Within the same dataset, the optimal preprocessing methods for different components also differ, even though the spectra are the same. It can thus be concluded that no universal preprocessing method exists; the optimal method depends on both the spectra and the predicted component. Furthermore, sorting and combining the existing preprocessing methods according to their preprocessing purpose is an effective way to select the optimal preprocessing method.
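For reference, the performance-based selection described above can be written compactly using the conventional definition of RMSEP. The notation is introduced here for illustration and is not taken from the paper: $n$ is the number of prediction-set samples, $y_i$ the reference value of sample $i$, $\hat{y}_i^{(p)}$ the PLS prediction after preprocessing scheme $p$, and $\mathcal{P}$ the set of 120 candidate schemes.

```latex
\mathrm{RMSEP}(p) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(\hat{y}_i^{(p)} - y_i\bigr)^{2}},
\qquad
p^{*} = \arg\min_{p \in \mathcal{P}} \mathrm{RMSEP}(p)
```

Because $y_i$ is specific to the component being predicted, $p^{*}$ can differ between components even when the spectra are identical, which matches the conclusion stated in the abstract.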
Received: 2018-08-02
Accepted: 2018-12-18
Corresponding Authors:
BIAN Xi-hui
E-mail: bianxihui@163.com