Research on the Twin Check Abnormal Sample Detection Method of Mid-Infrared Spectroscopy
ZHANGZHU Shan-ying1, 2, 3, ZHANG Ruo-jing1, 2, 3, GU Han-wen5, XIE Qin-lan1, 2, 3*, ZHANG Xian-wen4*, SA Ji-ming5, LIU Yi6, 3
1. College of Biomedical Engineering, South-Central Minzu University, Wuhan 430074, China
2. Key Laboratory of Cognitive Science, State Ethnic Affairs Commission, Wuhan 430074, China
3. Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, Wuhan 430074, China
4. Linyi Grepo Garden Machinery Co., Ltd., Linyi 276700, China
5. School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
6. School of Mechanical and Electrical Engineering, Wuhan University of Technology, Wuhan 430070, China
Abstract Mid-infrared absorption spectroscopy is one of the most promising techniques for non-invasive blood glucose measurement. The accuracy of glucose concentrations measured from mid-infrared absorption spectra depends closely on the reliability of the spectral signals. However, spectral acquisition is susceptible to environmental and human factors, which can produce abnormal spectra containing a large amount of interference. Such spectra reduce the effectiveness and reliability of the prediction model, so detecting and removing abnormal samples is crucial. This study proposes a twin check abnormal sample detection method that screens and eliminates abnormal samples accurately. The algorithm has two stages. First, the Monte Carlo cross-validation abnormal sample detection method is used for a preliminary screening of abnormal samples, which improves the stability of the spectral sample set. Second, based on the theory that the squared Mahalanobis distance approximately follows a chi-square distribution, the optimal threshold is determined adaptively and the remaining data set is checked again for abnormal samples. The method was evaluated on 64 samples of a glucose-mixed simulated solution containing glucose, albumin, urea, lactic acid, fructose and cholesterol. In the first stage, the twin check method exploits the sensitivity of the sum of squared prediction errors to abnormal samples to make a preliminary judgment on the spectral data set, and 3 abnormal samples were detected. The PLS calibration model built after removing these samples has a correlation coefficient of 0.91 and an RMSECV of 60.17 mg·dL⁻¹. In the second stage, the twin check method uses the theory that the squared Mahalanobis distance approximately follows a chi-square distribution to identify abnormal samples adaptively. A total of 12 abnormal samples were detected. The PLS model built after removing all abnormal samples performed better, with a correlation coefficient of 0.99 and an RMSECV of 57.77 mg·dL⁻¹. Comparison with no abnormal sample removal, the PCA-MD method and the Monte Carlo method demonstrates the superiority of the twin check method in abnormal sample detection. Compared with the PLS model built without removing abnormal samples, the correlation coefficient increased from 0.86 to 0.99 and the RMSECV decreased from 67.51 to 57.77 mg·dL⁻¹, improvements of 15.12% and 14.42%, respectively. Existing abnormal sample detection methods are sensitive to the choice of threshold and therefore tend to flag normal samples falsely or to miss abnormal ones. This study offers a strategy for that problem, enabling abnormal samples to be detected and eliminated accurately and thereby improving the accuracy and prediction performance of the model. The method provides a way to eliminate abnormal samples from mid-infrared absorption spectra accurately.
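The second stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `flag_outliers_mahalanobis`, the 0.975 confidence level, and the use of low-dimensional score vectors (for example PCA or PLS scores) as input are all assumptions made for illustration, and the abstract does not specify how the threshold is adapted. The sketch only shows the general idea of comparing squared Mahalanobis distances against a chi-square quantile.

```python
import numpy as np
from scipy.stats import chi2


def flag_outliers_mahalanobis(scores, confidence=0.975):
    """Flag samples whose squared Mahalanobis distance exceeds a chi-square quantile.

    scores: (n_samples, n_features) array, assumed to hold score vectors of the spectra.
    confidence: chi-square quantile used as the cut-off (an assumed value).
    Returns (squared distances, threshold, boolean outlier mask).
    """
    scores = np.asarray(scores, dtype=float)
    centered = scores - scores.mean(axis=0)
    # Sample covariance of the features; atleast_2d covers the single-feature case.
    cov = np.atleast_2d(np.cov(scores, rowvar=False))
    cov_inv = np.linalg.pinv(cov)
    # Squared Mahalanobis distance of every sample to the centroid.
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    # If the data are roughly multivariate normal, d2 is approximately
    # chi-square distributed with n_features degrees of freedom.
    threshold = chi2.ppf(confidence, df=scores.shape[1])
    return d2, threshold, d2 > threshold


# Usage on synthetic data (not the paper's spectra): 60 regular points plus one gross outlier.
rng = np.random.default_rng(0)
synthetic = np.vstack([rng.normal(size=(60, 3)), [[15.0, 15.0, 15.0]]])
d2, threshold, is_outlier = flag_outliers_mahalanobis(synthetic)
print(f"threshold = {threshold:.3f}, flagged = {int(is_outlier.sum())}")
```

Because the classical mean and covariance are themselves distorted by abnormal samples, running a preliminary screening first, as the twin check method does with Monte Carlo cross-validation, makes the distance estimates in this step more trustworthy.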
Received: 2023-05-31
Accepted: 2023-09-25
Corresponding Authors:
XIE Qin-lan, ZHANG Xian-wen
E-mail: xieqinlan@126.com; zxwen84@163.com