Application of Various Algorithms for Spectral Variable Selection in NIRS Modeling of Red Ginseng Extraction
CHEN Bei1, ZHENG En-rang1*, GUO Tuo2
1. School of Electrical and Control Engineering, Shaanxi University of Science & Technology, Xi’an 710021, China
2. School of Electronic Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi’an 710021, China
Abstract: Ginsenosides are the main active components of red ginseng extract, and their content has an important impact on the quality of downstream products. Traditional chemical quality-control methods are costly and slow. Existing studies have shown that fast, non-destructive near-infrared detection is feasible for red ginseng extraction. However, existing methods rely heavily on the data-processing algorithms built into the instrument software, which cannot meet the accuracy and speed requirements of actual production. To monitor the extraction process rapidly and accurately, this study applies several intelligent spectral variable selection algorithms to near-infrared spectroscopy (NIRS) modeling and compares their performance and robustness. To determine ginsenoside Rg1, which is present at high content, and ginsenoside Rc, which is present at low content, in red ginseng extract, 128 samples from the first two extractions of three batches were collected, NIRS data in the 1 000~2 499 nm band were acquired online, and ginsenoside content was determined by the international standard method, high-performance liquid chromatography (HPLC). First, the dimension of the input wavelengths was reduced with four wavelength selection algorithms: competitive adaptive reweighted sampling (CARS), uninformative variable elimination (UVE), random frog (RF) and the successive projections algorithm (SPA). The selected wavelengths were then used to build linear partial least squares (PLS) quantitative models. Finally, model performance was evaluated by the root mean square error (RMSE), the coefficient of determination (R2) and the residual predictive deviation (RPD).
The PLS modeling results for the four wavelength selection algorithms show that, after RF selection, the number of characteristic wavelength variables used for modeling decreased to 0.67% of the original. R2 for the ginsenoside Rg1 and Rc content in red ginseng extract exceeded 0.94, the RMSE of prediction was 0.024 6 and 0.013 5 respectively, and the RPD of the prediction set exceeded 4.84. RF selection therefore reduced the difficulty of modeling and improved its accuracy. A comparison of RF and CARS models built on the original spectrum, the full spectrum and the SNV-pretreated full spectrum shows that the RF wavelength selection algorithm gives better overall model performance. Different spectral ranges and pretreatment methods have little impact on its performance, which indicates good robustness. In conclusion, RF is a relatively ideal wavelength selection algorithm for modeling red ginseng extract. PLS based on RF allows the two red ginseng extracts to be modeled in a single step and can be used to rapidly detect ginsenoside content in the extract. The study provides theoretical support for the online control of medicine extraction.
Key words: Near-infrared spectroscopy; Red ginseng extraction; RF; Robustness; Ginsenoside
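The three evaluation metrics named in the abstract (RMSE, R2 and RPD) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and RPD is assumed here to be the sample standard deviation of the reference values divided by the RMSE of prediction, which is a common convention that the abstract itself does not spell out.

```python
import math


def rmse(y_ref, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_ref)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_ref, y_pred)) / n)


def r2(y_ref, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_ref = sum(y_ref) / len(y_ref)
    ss_res = sum((r - p) ** 2 for r, p in zip(y_ref, y_pred))
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)
    return 1.0 - ss_res / ss_tot


def rpd(y_ref, y_pred):
    """RPD, assumed here to be SD of reference values / RMSE of prediction."""
    n = len(y_ref)
    mean_ref = sum(y_ref) / n
    # Sample standard deviation (n - 1 in the denominator); an assumption.
    sd = math.sqrt(sum((r - mean_ref) ** 2 for r in y_ref) / (n - 1))
    return sd / rmse(y_ref, y_pred)
```

In use, `y_ref` would hold the HPLC reference contents of the prediction set and `y_pred` the corresponding PLS predictions; a larger RPD means the prediction error is small relative to the natural spread of the reference values.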