Spectroscopy and Spectral Analysis (光谱学与光谱分析)

An Optimal Selection Method of Samples of Calibration Set and Validation Set for Spectral Multivariate Analysis
LIU Wei1, ZHAO Zhong1*, YUAN Hong-fu2, SONG Chun-feng2, LI Xiao-yu2
1. College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
2. College of Materials Science and Engineering, Beijing University of Chemical Technology, Beijing 100029, China

Abstract The side effects in spectral multivariate modeling caused by an uneven distribution of samples across the property range of the calibration set and validation set are analyzed, and the "average" phenomenon in spectral multivariate calibration, in which samples with small property values are predicted too high and samples with large property values are predicted too low, is demonstrated in this paper. Taking the distribution of the spectral space and the property space into account simultaneously, a new optimal sample selection method named Rank-KS is proposed. Rank-KS aims to improve the uniformity of the calibration set and validation set. The Y-space is divided uniformly into several regions, and in every region the calibration samples are extracted by the Kennard-Stone (KS) algorithm and the validation samples by the Random-Select (RS) algorithm, so that the calibration set is evenly distributed and strongly representative. The proposed method was applied to the prediction of dimethylcarbonate (DMC) content in gasoline from infrared spectra and of dimethylsulfoxide content in its aqueous solution from near-infrared spectra. The "average" phenomenon observed in the predictions of the multiple linear regression (MLR) model of dimethylsulfoxide was effectively weakened by Rank-KS. For comparison, MLR models and PLS1 models of DMC and dimethylsulfoxide were constructed with calibration sets selected by the RS, KS, Rank-Select, sample set partitioning based on joint X- and Y-blocks (SPXY), and proposed Rank-KS algorithms, respectively. The application results verified that the best prediction was achieved with Rank-KS. In particular, for sample sets with many samples in the middle of the property range and few at the boundaries, or with no samples in some local regions, the prediction of a model built on a calibration set selected by Rank-KS is clearly improved.
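
The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the number of regions, the split ratios, or the order of selection, so the function names `kennard_stone` and `rank_ks_split` and the parameters `n_bins`, `calib_fraction`, `valid_fraction`, and `seed` are all assumptions made here. The sketch assumes that calibration samples are chosen first by KS within each region and that validation samples are then drawn at random from what remains in that region.

```python
import numpy as np


def kennard_stone(X, n_select):
    """Return row indices of X picked by the classic Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n_select >= n:
        return list(range(n))
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start from the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)][:n_select]
    # Repeatedly add the sample farthest from its nearest selected sample.
    while len(selected) < n_select:
        remaining = [k for k in range(n) if k not in selected]
        min_d = dist[np.ix_(remaining, selected)].min(axis=1)
        selected.append(remaining[int(np.argmax(min_d))])
    return selected


def rank_ks_split(X, y, n_bins=5, calib_fraction=0.6, valid_fraction=0.3, seed=0):
    """Split samples region by region along y (assumed reading of Rank-KS).

    n_bins, calib_fraction, valid_fraction and seed are hypothetical
    parameters; the source does not specify them.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    # Divide the property (Y) range into equal-width regions.
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    region = np.digitize(y, edges[1:-1])  # region index 0 .. n_bins - 1
    calib, valid = [], []
    for b in range(n_bins):
        idx = np.flatnonzero(region == b)
        if idx.size == 0:
            continue  # empty region: nothing to select
        # Calibration samples: Kennard-Stone in the spectral (X) space.
        n_cal = int(round(calib_fraction * idx.size))
        local = np.asarray(kennard_stone(X[idx], n_cal), dtype=int)
        cal_idx = idx[local]
        # Validation samples: random selection from the remainder.
        rest = np.setdiff1d(idx, cal_idx)
        n_val = min(rest.size, int(round(valid_fraction * idx.size)))
        val_idx = rng.choice(rest, size=n_val, replace=False)
        calib.extend(int(k) for k in cal_idx)
        valid.extend(int(k) for k in val_idx)
    return sorted(calib), sorted(valid)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_demo = rng.normal(size=(50, 4))   # stand-in for spectra
    y_demo = np.linspace(0.0, 1.0, 50)  # stand-in for property values
    cal, val = rank_ks_split(X_demo, y_demo)
    print(len(cal), len(val))
```

Because the selection is repeated inside each region of the property range, sparsely populated regions near the boundaries still contribute calibration samples in proportion to their size, which is the uniformity the abstract attributes to Rank-KS.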

Received: 2013-07-01
Accepted: 2013-10-15

Corresponding author: ZHAO Zhong
E-mail: zhaozhong@mail.buct.edu.cn