Application of High-Dimensional Infrared Spectral Data Preprocessing in the Origin Identification of Traditional Chinese Medicinal Materials
JIN Cheng-liang1, WANG Yong-jun2*, HUANG He2, LIU Jun-min3
1. School of Information and Engineering, Wenzhou Business College, Wenzhou 325035, China
2. School of Artificial Intelligence, Wenzhou Polytechnic, Wenzhou 325035, China
3. School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an 710049, China
Abstract To improve the identification of the origin of Chinese medicinal materials from high-dimensional infrared spectral data, appropriate data preprocessing (DP) should be applied first, and more advanced algorithms considered second, if necessary. For a dataset of 658 samples with wavelengths from 551 to 3 998 nm, and with the help of the support vector machine (SVM) algorithm, ten sample-based DP methods (namely non-DP, maximum and minimum normalization, standardization, centralization, moving average smoothing, SG smoothing filtering, multivariate scattering correction, regularization, first-order derivative, and second-order derivative), five spectral-feature-based methods (i.e., non-DP, centralization, maximum and minimum normalization, standardization, and regularization), and their combinations (50 kinds in total) were investigated according to prediction effectiveness and stability. Numerical results show that the right DP is conducive to improving model accuracy. Among the 10 sample-based methods, the standard variate and Max-Min average DP methods achieve the higher scores (the coefficient R2 is approximately 85%). Feature-based-only methods yield little model improvement. The sample-based-only and feature-based-only methods obtain approximately the same average ratio of 64%. The combined methods of standard normal variate or normalization processing followed by second-order derivative DP achieve the highest prediction score, with R2 of nearly 94%. However, the DP approach of data regularization added to centralization performs the worst. Suggestions are also given. The research is valuable for further analysis of medicinal efficacy and chemical composition. Furthermore, it can serve as a reference for infrared spectral data analysis and for modeling high-dimensional, small-sample data.
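The distinction between sample-based and feature-based preprocessing can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic data, and the use of a plain finite difference for the second-order derivative are all assumptions made for illustration. Sample-based methods act on each spectrum (row) separately, while feature-based methods act on each wavelength (column) across all samples.

```python
import numpy as np

def snv(X):
    """Standard normal variate: scale each spectrum (row) by its own mean and std."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    return (X - mean) / std

def min_max_rows(X):
    """Maximum and minimum normalization of each spectrum (row) to [0, 1]."""
    lo = X.min(axis=1, keepdims=True)
    hi = X.max(axis=1, keepdims=True)
    return (X - lo) / (hi - lo)

def second_derivative(X):
    """Second-order finite difference along the wavelength axis.

    Assumption: a plain difference is used here; the output has two
    fewer columns than the input.
    """
    return np.diff(X, n=2, axis=1)

def standardize_features(X):
    """Feature-based standardization: zero mean, unit std per wavelength (column)."""
    mean = X.mean(axis=0, keepdims=True)
    std = X.std(axis=0, keepdims=True)
    return (X - mean) / std

# Synthetic stand-in for spectra: 6 samples x 50 wavelengths (hypothetical data).
rng = np.random.default_rng(0)
X = rng.random((6, 50)) + np.linspace(0.0, 1.0, 50)

# One combined sample-based pipeline of the kind the abstract reports as
# strongest: SNV followed by a second-order derivative.
X_pre = second_derivative(snv(X))
```

The resulting matrix `X_pre` would then be passed to a classifier or regressor such as an SVM; that step is omitted here to keep the sketch free of extra dependencies.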
Received: 2022-05-29
Accepted: 2022-10-09
Corresponding Authors:
WANG Yong-jun
E-mail: wangyjmcvti@qq.com