Study on Improving Stability of Near-Infrared Spectra by Normal Distribution Screening Method
LI Xiao-xing1, 2, XIAO Jin-feng1*, ZHANG Hong-ming2*, LÜ Bo2, 3*, YIN Xiang-hui1, ZHAO Ming4, MA Fei5, FU Jia2, HU Yan1, 2, LI Zhi-hao1, 2, WANG Fu-di2, SHEN Yong-cai6, DAI Shu-yu7
1. College of Electrical Engineering, University of South China, Hengyang 421001, China
2. Institute of Plasma Physics, Hefei Institutes of Physical Sciences, Chinese Academy of Sciences, Hefei 230031, China
3. Science Island Branch, Graduate School, University of Science and Technology of China, Hefei 230031, China
4. College of Biological and Food Engineering, Anhui Polytechnic University, Hefei 241000, China
5. College of Food and Biological Engineering, Hefei University of Technology, Hefei 230009, China
6. College of Physics and Materials Engineering, Hefei Normal University, Hefei 230601, China
7. College of Physics, Dalian University of Technology, Dalian 116024, China
Abstract: In near-infrared (NIR) online monitoring of fermentation, oxygen must be passed continuously into the fermentation broth to sustain microbial growth and metabolism, so bubbles often form in the broth. When bubbles pass in front of the probe, they disturb the intensity of the NIR spectrum. To eliminate the abnormal spectra caused by bubbles and to reduce spectral fluctuation, this study proposes a normal distribution screening method. A 600 g glucose solution with a mass fraction of 10% was prepared. Every 30 s, 2 g of this solution was added to a reactor containing 600 mL of distilled water and stirred well, and the mass fraction of glucose in the reactor was calculated and recorded. Bubbles were generated by passing oxygen into the bottom of the reactor, and the NIR spectra of the glucose solution in the reactor were collected with an NIR spectrometer. Abnormal spectra affected by bubbles were removed by each of four methods: principal component analysis (PCA) combined with the Mahalanobis distance, the Euclidean distance method, isolation forest, and normal distribution screening. For each method, the remaining spectra were randomly divided into a calibration set and a prediction set at a ratio of 4:1. After spectral preprocessing, a glucose concentration prediction model was built on the calibration set by partial least squares regression (PLSR), and the prediction set was analyzed with that model. The correlation coefficient and root mean square error of the calibration set and of the prediction set were then compared.
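The abstract does not state the screening rule itself, so the following is only a minimal sketch of one plausible reading: treat the intensity recorded at a single wavelength as normally distributed across repeated scans, and discard scans that fall outside mean ± k standard deviations, repeating until no scan is rejected. The function name `normal_screen`, the threshold `k = 2`, the iterative re-estimation, and the sample intensities are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, stdev

def normal_screen(intensities, k=2.0, max_rounds=10):
    """Return indices of scans kept by a mean +/- k*sigma rule.

    Assumed criterion (not from the paper): a scan is abnormal when its
    intensity deviates from the mean of the currently kept scans by more
    than k sample standard deviations. Mean and sigma are re-estimated
    after each rejection round.
    """
    kept = list(range(len(intensities)))
    for _ in range(max_rounds):
        values = [intensities[i] for i in kept]
        if len(values) < 3:
            break  # too few scans left to estimate a spread
        mu, sigma = mean(values), stdev(values)
        if sigma == 0:
            break  # all remaining scans are identical
        survivors = [i for i in kept if abs(intensities[i] - mu) <= k * sigma]
        if len(survivors) == len(kept):
            break  # nothing rejected in this round: converged
        kept = survivors
    return kept

# Hypothetical intensities at one wavelength for 12 consecutive scans;
# the two low values mimic scans taken while a bubble passed the probe.
intensities = [1.00, 1.01, 0.99, 1.02, 0.98, 1.00,
               0.55, 1.01, 0.99, 0.60, 1.00, 1.02]
kept = normal_screen(intensities)
rejected = [i for i in range(len(intensities)) if i not in kept]
print("kept scans:", kept)
print("rejected scans:", rejected)
```

Re-estimating the mean and standard deviation after each round matters in this toy data set: the two bubble-like scans inflate the initial spread, so a single pass can leave one of them inside the acceptance band.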
The models built after removing bubble-affected spectra with the four methods gave the following results. With PCA combined with the Mahalanobis distance, the calibration-set correlation coefficient R²c was 0.998 208 and the root mean square error RMSECV was 0.000 764, while the prediction-set correlation coefficient R²p was 0.997 994 and the root mean square error RMSEP was 0.000 764. With the Euclidean distance method, R²c was 0.998 628, RMSECV was 0.000 652, R²p was 0.998 628, and RMSEP was 0.000 655. With isolation forest, R²c was 0.998 255, RMSECV was 0.000 739, R²p was 0.998 132, and RMSEP was 0.000 740. With normal distribution screening, R²c was 0.998 641, RMSECV was 0.000 645, R²p was 0.998 628, and RMSEP was 0.000 636. Of the four methods, normal distribution screening gave the highest R²c and the lowest RMSECV and RMSEP, with an R²p equal to that of the Euclidean distance method, indicating that it reduces the fluctuation of spectral intensity and removes abnormal spectra more effectively than the other methods.
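As a rough illustration of how the quantities reported above might be computed, the sketch below performs a random 4:1 split of sample indices and evaluates a correlation coefficient and a root mean square error for a set of predictions. It assumes the reported coefficient is the coefficient of determination, 1 - SS_res/SS_tot, which the abstract does not confirm. The function names, the random seed, and the sample values are hypothetical, and the PLSR model itself is not reproduced here.

```python
import math
import random

def r2_and_rmse(y_true, y_pred):
    """Return (R^2, RMSE), with R^2 taken as 1 - SS_res / SS_tot (assumed definition)."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)

def split_4_to_1(n_samples, seed=0):
    """Randomly split sample indices into calibration and prediction sets at 4:1."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = n_samples * 4 // 5
    return idx[:cut], idx[cut:]

# Hypothetical reference and predicted glucose mass fractions
y_ref = [0.010, 0.020, 0.030, 0.040, 0.050]
y_hat = [0.011, 0.019, 0.031, 0.039, 0.050]
r2, rmse = r2_and_rmse(y_ref, y_hat)

# Hypothetical set of 100 screened spectra
cal_idx, pred_idx = split_4_to_1(100)
print(f"R2 = {r2:.6f}, RMSE = {rmse:.6f}")
print(len(cal_idx), "calibration samples,", len(pred_idx), "prediction samples")
```

Fixing the seed keeps the split reproducible, which makes it easier to compare several outlier-removal methods on the same footing.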
Key words: Near-infrared spectroscopy; Abnormal spectrum removal; Mahalanobis distance; Normal distribution screening method
李晓星,肖金凤,张洪明,吕 波,尹相辉,赵 明,马 飞,符 佳,胡 艳,李志豪,王福地,沈永才,戴舒宇. 正态分布筛选法提高近红外光谱稳定性研究[J]. 光谱学与光谱分析, 2025, 45(06): 1566-1577.
LI Xiao-xing, XIAO Jin-feng, ZHANG Hong-ming, LÜ Bo, YIN Xiang-hui, ZHAO Ming, MA Fei, FU Jia, HU Yan, LI Zhi-hao, WANG Fu-di, SHEN Yong-cai, DAI Shu-yu. Study on Improving Stability of Near-Infrared Spectra by Normal Distribution Screening Method. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2025, 45(06): 1566-1577.