Deep Convolution Network Application in Identification of Multi-Variety and Multi-Manufacturer Pharmaceutical
LI Ling-qiao1, 2, PAN Xi-peng1, FENG Yan-chun3*, YIN Li-hui3, HU Chang-qin3, YANG Hui-hua1, 2*
1. School of Automation, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
3. National Institutes for Food and Drug Control, Beijing 100050, China
Abstract: Near-infrared (NIR) spectroscopy is efficient, non-destructive, environmentally friendly, and usable on site, which makes it well suited to the rapid modeling and analysis of drugs. NIR spectra, however, suffer from weak absorption intensity and overlapping bands, so a robust and reliable chemometric model is needed to analyze them. The deep convolutional neural network (DCNN) is an important branch of deep learning: it extracts features from the data layer by layer, then combines and transforms them into higher-level semantic features. DCNNs have been highly successful in computer vision, speech recognition, and other fields, but have not yet been reported for NIR analysis of drugs. This paper studies multi-class modeling of drug NIR spectra with deep convolutional networks. Several one-dimensional deep convolutional network models are designed, according to the characteristics of drug NIR data, for classifying drugs of multiple varieties and from multiple manufacturers. Alternating convolutional and pooling layers extract features from the NIR data layer by layer, and the output layer feeds a softmax classifier that predicts the class probabilities of each spectrum. A global max pooling layer placed before the output layer removes the constraint of a fixed input dimension and avoids the large number of parameters that a fully connected layer would require. Batch normalization and dropout are introduced to prevent vanishing gradients and to reduce the risk of overfitting. The effects of network depth (the number of convolutional layers) and of convolution kernel size on modeling performance are analyzed, and the influence of five classical data preprocessing methods is also explored.
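The forward pass of such a network can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the channel counts, kernel size, pool size, random weights, and all function names below are assumptions chosen for illustration. Batch normalization and dropout are omitted because they mainly matter during training. The point of the sketch is that global max pooling reduces each feature map to a single value, so spectra of different lengths yield a class-probability vector of the same size.

```python
import math
import random

def conv1d_relu(x, weights, biases):
    """Valid 1-D convolution followed by ReLU.
    x: [in_channels][length]; weights: [out_channels][in_channels][k]."""
    k = len(weights[0][0])
    length = len(x[0])
    out = []
    for w_out, b in zip(weights, biases):
        fmap = []
        for i in range(length - k + 1):
            s = b
            for c, w in enumerate(w_out):
                for j in range(k):
                    s += w[j] * x[c][i + j]
            fmap.append(max(0.0, s))
        out.append(fmap)
    return out

def max_pool1d(x, size=2):
    """Non-overlapping max pooling along each channel."""
    return [[max(ch[i:i + size]) for i in range(0, len(ch) - size + 1, size)]
            for ch in x]

def global_max_pool(x):
    """Keep one value per channel, whatever the channel length."""
    return [max(ch) for ch in x]

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def init_params(rng, channels=(4, 8), kernel_size=5, n_classes=3):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    conv, in_ch = [], 1
    for out_ch in channels:
        w = [[[rng.uniform(-0.5, 0.5) for _ in range(kernel_size)]
              for _ in range(in_ch)] for _ in range(out_ch)]
        conv.append((w, [0.0] * out_ch))
        in_ch = out_ch
    dense_w = [[rng.uniform(-0.5, 0.5) for _ in range(in_ch)]
               for _ in range(n_classes)]
    return {"conv": conv, "dense_w": dense_w, "dense_b": [0.0] * n_classes}

def forward(spectrum, params):
    x = [list(spectrum)]                      # a single input channel
    for w, b in params["conv"]:               # alternating conv and pooling
        x = max_pool1d(conv1d_relu(x, w, b))
    features = global_max_pool(x)             # fixed-size feature vector
    logits = [sum(wi * f for wi, f in zip(row, features)) + b
              for row, b in zip(params["dense_w"], params["dense_b"])]
    return softmax(logits)

rng = random.Random(0)
params = init_params(rng)
short_spectrum = [rng.random() for _ in range(200)]
long_spectrum = [rng.random() for _ in range(350)]
# Both calls return a probability vector with one entry per class.
print(len(forward(short_spectrum, params)), len(forward(long_spectrum, params)))
```

Because the output layer only sees one value per feature map, its parameter count depends on the number of channels and classes, not on the number of wavelength points in the spectrum.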
With NIR samples of cefixime and phenytoin tablets as the experimental datasets, classification models for drugs of multiple varieties and from multiple manufacturers are established. The models achieved good results in both binary and multi-class experiments. Across eighteen classification experiments with a training-to-test ratio of 7:3, the classification accuracy was (99.37±0.45)%, better than that of SVM, BP, AE, and ELM. Inference with the deep convolutional neural network was faster than with SVM and ELM, although training was slower than both. Extensive experiments showed that the deep convolutional neural network can accurately and reliably distinguish the NIR data of multi-variety, multi-manufacturer drugs, with good robustness and scalability. The proposed method can also be extended to NIR data classification in tobacco, petrochemical, and other fields.
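A result reported as mean ± standard deviation under a 7:3 split can be obtained with a protocol like the one sketched below. The abstract does not state how the eighteen experiments were aggregated, so this is only one plausible reading: repeated stratified 7:3 hold-out splits, with accuracy summarized over the runs. The data are synthetic, and a nearest-centroid classifier stands in for the DCNN purely to keep the example short; every name and parameter here is hypothetical.

```python
import random
import statistics

def stratified_split(labels, train_frac, rng):
    """Return (train_idx, test_idx), keeping about train_frac of each class."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, test = [], []
    for idx in by_class.values():
        idx = idx[:]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train += idx[:cut]
        test += idx[cut:]
    return train, test

def fit_centroids(X, y):
    """Stand-in classifier (not the DCNN): one mean spectrum per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is nearest in squared distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))

def repeated_holdout(X, y, n_runs=10, train_frac=0.7, seed=0):
    """Mean and standard deviation of test accuracy (%) over repeated splits."""
    rng = random.Random(seed)
    accs = []
    for _ in range(n_runs):
        tr, te = stratified_split(y, train_frac, rng)
        model = fit_centroids([X[i] for i in tr], [y[i] for i in tr])
        correct = sum(predict(model, X[i]) == y[i] for i in te)
        accs.append(100.0 * correct / len(te))
    return statistics.mean(accs), statistics.stdev(accs)

def make_synthetic(rng, n_classes=3, per_class=40, n_features=20):
    """Synthetic, well-separated 'spectra'; not real NIR data."""
    X, y = [], []
    for c in range(n_classes):
        for _ in range(per_class):
            X.append([c + rng.gauss(0.0, 0.3) for _ in range(n_features)])
            y.append(c)
    return X, y

X, y = make_synthetic(random.Random(42))
mean_acc, std_acc = repeated_holdout(X, y)
print(f"accuracy: {mean_acc:.2f} ± {std_acc:.2f} %")
```

Stratifying the split keeps every class, and hence every manufacturer, represented in both the training and the test set, which matters when some classes have few samples.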
LI Ling-qiao, PAN Xi-peng, FENG Yan-chun, YIN Li-hui, HU Chang-qin, YANG Hui-hua. Deep Convolution Network Application in Identification of Multi-Variety and Multi-Manufacturer Pharmaceutical[J]. Spectroscopy and Spectral Analysis, 2019, 39(11): 3606-3613.