Identifying Multi-Class Drugs by Using Near-Infrared Spectroscopy and Variational Auto-Encoding Modeling
ZHENG An-bing1, YANG Hui-hua1,2*, PAN Xi-peng1,2, YIN Li-hui3, FENG Yan-chun3
1. School of Automation, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. School of Computer and Information Security, Guilin University of Electronic Science and Technology, Guilin 541004, China
3. China Institute for Food and Drug Control, Beijing 100050, China
Abstract: With the expansion of online pharmacies, more and more counterfeit drugs without patents or licenses will appear on the market in forged brand packaging. Without a method to identify a product's source, low-cost drugs will inevitably be sold at high prices. Such drugs evade supervision and approval procedures, harm consumers' interests, and pose great risks to the whole drug market. Near-infrared (NIR) spectroscopy offers low cost, direct measurement, non-destructive testing, and on-site testing, and it is especially suitable for rapid modeling and analysis of drugs, provided that effective feature extraction and an appropriate classifier are available. Auto-encoding is an important branch of deep learning, used mainly for non-linear dimensionality reduction and feature extraction. Variational auto-encoding (VAE) has been the most popular auto-encoding algorithm in recent years; it has strong feature extraction ability and is widely used in computer vision, speech recognition, and other fields, yet its use in NIR analysis has not been reported. Based on VAE, with a specially designed artificial neural network structure and loss function, this paper constructs an NIR classification model for multi-category, multi-manufacturer drugs. Four kinds of drugs (metformin hydrochloride tablets, chlorpromazine hydrochloride tablets, chlorphenamine maleate tablets, and cefuroxime ester tablets) produced by 29 manufacturers were used as experimental objects in multi-class classification and identification experiments. Compared with SVM, BP-ANN, PLS-DA, sparse auto-encoder (SAE), deep belief network (DBN), and convolutional neural network (CNN) models, among others, the proposed algorithm shows excellent classification performance, good robustness, and scalability.
Key words: Near-infrared spectroscopy; Drug identification; Multi-class classification; Deep learning; Variational auto-encoding
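The abstract does not specify the network structure or the loss function, so the following is only a minimal sketch of how a VAE objective is commonly combined with a classification term, not the authors' method. It assumes a diagonal Gaussian encoder, a mean-squared reconstruction error, and a softmax classifier on the latent code; the function names and the weights `beta` and `gamma` are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian, summed over latent dims."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def reconstruction_error(x, x_hat):
    """Mean squared error between a spectrum and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample; label is a class index."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_norm - logits[label]


def combined_loss(x, x_hat, mu, log_var, logits, label, beta=1.0, gamma=1.0):
    """Assumed weighting: reconstruction + beta * KL + gamma * classification."""
    return (reconstruction_error(x, x_hat)
            + beta * kl_divergence(mu, log_var)
            + gamma * cross_entropy(logits, label))
```

In such a setup the encoder would map an NIR spectrum to `mu` and `log_var`, `reparameterize` would draw the latent code, and the classifier logits would be computed from that code; the relative weights of the three terms would need tuning on real spectra.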