Support Vector Machine Optimized by Near-Infrared Spectroscopic
Technique Combined With Grey Wolf Optimizer Algorithm to
Realize Rapid Identification of Tobacco Origin
GENG Ying-rui1, SHEN Huan-chao1, NI Hong-fei2, CHEN Yong1, LIU Xue-song1*
1. College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310030, China
2. Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Hangzhou 310018, China
Abstract: Tobacco is a natural plant material with a complex composition, and the quality of tobacco leaves is directly affected by external factors such as geographic location and growth conditions. Tobacco is widely planted in China, and leaves cultivated in different areas have different styles. Different blending ratios play a decisive role in the quality of cigarettes. Thus, there is a growing need for accurate and rapid identification of the origin of tobacco leaves. Near-infrared (NIR) spectroscopy provides a rapid and convenient method for automatically identifying tobacco-producing areas. On this basis, we propose, for the first time, using the grey wolf optimizer (GWO) algorithm to optimize a support vector machine (SVM) model that identifies and classifies tobacco leaves from different origins. The study used 824 tobacco leaf samples from eight origins, which were divided into 617 training samples and 207 test samples by sample set partitioning based on joint x-y distance (SPXY). Two wavelength selection methods, competitive adaptive reweighted sampling (CARS) and random frog (RF), were applied to reduce redundant spectral information and screen characteristic wavelengths from the full spectrum; they selected 141 and 534 of the 1609 variables, respectively. The selected variables were then used as inputs to the SVM classifier. The optimization effect of GWO on the SVM model was compared with that of particle swarm optimization (PSO) and the genetic algorithm (GA) over the same search range. The results showed that the spectral variables screened by RF gave better modeling performance than those screened by CARS. The RF-GWO-SVM model achieved the best predictive performance, with an accuracy of 96.62% in identifying tobacco leaves from the eight producing areas. In addition, the running time of RF-GWO-SVM was 156 and 131 min shorter than that of RF-PSO-SVM and RF-GA-SVM, respectively. In summary, RF-GWO-SVM offers higher accuracy and faster convergence. GWO optimizes the model parameters more efficiently, and the SVM model optimized by GWO can be used for rapid identification of tobacco origin.
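The parameter search described in the abstract can be illustrated with a minimal sketch of a basic grey wolf optimizer. It is not the authors' implementation. The function names `grey_wolf_optimize` and `toy_error` are hypothetical, and the two search dimensions are assumed to stand for log-scaled SVM parameters such as a penalty factor and a kernel width, which the abstract does not specify. Because the spectral data are not available here, a simple quadratic bowl stands in for the classification error that a real run would obtain by training and validating an SVM at each candidate point.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=20, n_iter=100, seed=0):
    """Minimize `objective` over the box `bounds` with a basic GWO variant."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]

    def rank(leaders):
        # Keep the three best (score, position) pairs seen so far:
        # these play the roles of the alpha, beta and delta wolves.
        scored = leaders + [(objective(w), list(w)) for w in wolves]
        return sorted(scored, key=lambda pair: pair[0])[:3]

    leaders = []
    for t in range(n_iter):
        leaders = rank(leaders)
        a = 2.0 * (1.0 - t / n_iter)  # shrinks linearly from 2 toward 0
        for wolf in wolves:
            for d, (lo, hi) in enumerate(bounds):
                pulls = []
                for _, leader in leaders:
                    A = 2.0 * a * rng.random() - a   # step coefficient in [-a, a]
                    C = 2.0 * rng.random()           # random weight on the leader
                    distance = abs(C * leader[d] - wolf[d])
                    pulls.append(leader[d] - A * distance)
                # Move to the average of the three leader-guided estimates,
                # clipped so the wolf stays inside the search range.
                wolf[d] = min(max(sum(pulls) / len(pulls), lo), hi)

    best_score, best_pos = rank(leaders)[0]
    return best_pos, best_score


def toy_error(params):
    # ASSUMPTION: stand-in for cross-validated SVM error.
    # Its minimum sits at (3.0, -4.0) by construction.
    return (params[0] - 3.0) ** 2 + (params[1] + 4.0) ** 2


best, score = grey_wolf_optimize(toy_error, bounds=[(-8.0, 8.0), (-8.0, 8.0)])
print(best, score)
```

In a real application, `toy_error` would be replaced by a function that trains an SVM on the selected wavelengths with the candidate parameters and returns its validation error. PSO or GA could be compared by plugging them into the same objective and the same bounds.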
Keywords: Near-infrared spectroscopy; Grey wolf optimizer; Support vector machine; Tobacco; Origin identification
GENG Ying-rui, SHEN Huan-chao, NI Hong-fei, CHEN Yong, LIU Xue-song. Support Vector Machine Optimized by Near-Infrared Spectroscopic Technique Combined With Grey Wolf Optimizer Algorithm to Realize Rapid Identification of Tobacco Origin[J]. Spectroscopy and Spectral Analysis, 2022, 42(09): 2830-2835.