Spectroscopy and Spectral Analysis (光谱学与光谱分析)
A Principal Components Selection Method Based on the Modified Randomization Test for Avoiding Over-Fit and Under-Fit in Spectra Calibration
LI Li-na, LI Qing-bo, YAN Hou-lai, ZHANG Guang-jun*
Key Laboratory of Precision Opto-Mechatronics Technology, Ministry of Education, School of Instrument Science and Opto-Electronics Engineering, Beihang University, Beijing 100191, China
Abstract Using too many or too few principal components often yields an over-fitted or under-fitted quantitative calibration model. To avoid over-fitting and under-fitting in spectral calibration, a principal component selection method based on a modified randomization test is proposed. Three near-infrared spectroscopy experiments, in which the complexity of the sample components increases from one experiment to the next, are used to evaluate the proposed method, and the method is compared with cross-validation. The effect of spectral model complexity on the prediction performance of the calibration is discussed, as is the adaptability of the modified randomization test to complex spectral models with uncertainty. The results indicate that, unlike cross-validation, the proposed method does not leave any samples out: all training samples are considered when selecting principal components, so over-fitting and under-fitting can be avoided, which benefits the prediction performance of calibration in spectral analysis. The modified randomization test differs from the commonly used randomization test in that it introduces a simplified criterion and is easy to implement. The proposed method also gives the user a visual, interactive process for selecting principal components. For the three experiments, 4, 5 and 8 principal components, respectively, were selected for calibration, and these gave the best prediction results on the independent external prediction sets. The results also suggest that the proposed method is suitable for complex samples with many variables and few samples.
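The abstract does not state the simplified criterion used in the modified test, so the following is only a minimal sketch of a generic randomization (permutation) test for choosing principal components in principal component regression, not the authors' procedure. The function name `randomization_test_pcs`, the test statistic (absolute inner product between each score vector and the reference values), the significance level, and the rule of keeping leading components until the first non-significant one are all assumptions made for illustration. Note that every training sample is used; nothing is left out as in cross-validation.

```python
import numpy as np

def randomization_test_pcs(X, y, max_pcs=10, n_perm=500, alpha=0.05, seed=0):
    """Illustrative permutation test on PCA scores (hypothetical helper).

    Returns (n_selected, p_values) for the leading principal components.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)          # mean-center the spectra
    yc = y - y.mean()                # mean-center the reference values

    # PCA scores from the SVD; score columns are mutually orthogonal.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    k = min(max_pcs, int(np.sum(s > 1e-10 * s[0])))
    T = U[:, :k] * s[:k]

    def stat(y_vec):
        # Assumed statistic: |t_a^T y| for each component a.
        return np.abs(T.T @ y_vec)

    observed = stat(yc)
    exceed = np.zeros(k)
    for _ in range(n_perm):
        # Shuffling y breaks any real link between scores and y.
        exceed += stat(rng.permutation(yc)) >= observed
    p_values = (exceed + 1) / (n_perm + 1)

    # Assumed rule: keep leading components until the first one
    # whose statistic is not significant at level alpha.
    n_selected = 0
    for p in p_values:
        if p >= alpha:
            break
        n_selected += 1
    return n_selected, p_values
```

Plotting the returned `p_values` against the component index gives a simple visual aid for the choice, in the spirit of the interactive selection the abstract mentions.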
Received: 2009-12-02
Accepted: 2010-03-06
Corresponding Author:
ZHANG Guang-jun
E-mail: gjzhang@buaa.edu.cn