Research of Terahertz Time-Domain Spectral Identification Based on Deep Learning

doi:10.3964/j.issn.1000-0593(2021)01-0094-06


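Abstract (translated from the Chinese): Because terahertz time-domain spectroscopy captures the "fingerprint spectrum" of a material, it is an important means of identifying substances quickly and nondestructively, with broad application prospects in areas such as the nondestructive detection of drugs and explosives. Spectral identification is one of the key directions in applied research on the technique. Existing spectral identification methods mostly rely on machine-learning classification of hand-selected features, or on thresholds set on absorption peaks. Some substances have no pronounced absorption peaks in the terahertz band, and sample concentration, air humidity and various kinds of noise disturb the time-domain spectra and lower the signal-to-noise ratio; these methods cope poorly with such conditions, and their computational cost keeps growing as the categories and number of substances increase. In recent years, with the rise of deep learning, methods typified by the convolutional neural network (CNN) and the recurrent neural network (RNN) have been widely applied in fields such as computer vision and natural language processing, with considerably better results than traditional machine-learning methods. Exploiting the strong nonlinear classification capability of deep learning, two networks based on RNN and CNN were designed for spectral identification: an RNN-based classifier for one-dimensional spectral lines and a CNN-based classifier for two-dimensional spectral images. To simulate practical application scenarios, more than 20,000 spectra of 12 substances were collected in a non-vacuum environment as the training and test sets. After analyzing how sample concentration and air humidity affect spectral features, Savitzky-Golay (S-G) filtering was used to denoise the spectra. The experimental results show that, compared with unprocessed data, S-G-preprocessed spectra have more pronounced features and yield higher identification accuracy; compared with the traditional machine-learning algorithm k-nearest neighbors (k-NN), the RNN and CNN methods achieve better accuracy on the test set and run faster; and for spectral identification, the CNN method overcomes the influence of noise better than the RNN method. Deep learning can therefore identify terahertz time-domain spectra quickly and effectively, providing a theoretical and experimental basis for new nondestructive security inspection techniques.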

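Keywords: terahertz time-domain spectroscopy; spectral identification; convolutional neural network; recurrent neural network; preprocessing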
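
The Savitzky-Golay (S-G) denoising step named in the abstract can be illustrated with a minimal sketch. The abstract does not state the window length or polynomial order the authors used, so the 5-point quadratic/cubic window below, the edge handling, and the synthetic single-peak "spectrum" are all assumptions chosen only to show the idea; this is not the authors' preprocessing code.

```python
import math
import random

# Standard tabulated 5-point quadratic/cubic S-G smoothing coefficients.
# The window size is an assumption; the paper's abstract does not give one.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(y):
    """Smooth y with a 5-point S-G filter; the two samples at each edge are copied unchanged."""
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(c * y[i + k - 2] for k, c in enumerate(SG5))
    return out


def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))


# Hypothetical data: one broad Gaussian absorption peak plus white noise.
random.seed(0)
freqs = [0.2 + 0.01 * i for i in range(200)]  # illustrative frequency axis
clean = [math.exp(-((f - 1.2) / 0.15) ** 2) for f in freqs]
noisy = [c + random.gauss(0.0, 0.05) for c in clean]
smoothed = savgol5(noisy)

print("RMS error before smoothing:", rms_error(noisy, clean))
print("RMS error after smoothing: ", rms_error(smoothed, clean))
```

Because the filter fits a low-order polynomial in each window, it suppresses point-to-point noise while leaving broad peak shapes largely intact, which is consistent with the abstract's observation that spectral features become more pronounced after S-G preprocessing.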

Abstract: Terahertz time-domain spectroscopy (THz-TDS) is an important method for rapid, nondestructive material identification owing to its spectral "fingerprint" properties, and it has broad application prospects in the nondestructive inspection of drugs and explosives. Spectral identification is one of the most important directions in applied THz-TDS research. Most existing spectral identification methods either apply machine-learning classification to manually selected features or apply a threshold to absorption peaks. These methods do not adapt well to low signal-to-noise ratios, because some materials have few or no absorption peaks in the terahertz band and spectra are easily affected by sample concentration, air humidity and noise. Moreover, the computational cost increases with the number and categories of materials. In recent years, with the rise of deep learning, methods represented by the convolutional neural network (CNN) and the recurrent neural network (RNN) have been widely applied in fields such as computer vision and natural language processing, where they have produced better results than traditional machine-learning methods. Exploiting the strong nonlinear classification capability of deep learning, two networks based on RNN and CNN were designed for spectral identification in this paper: a one-dimensional spectral line classifier based on RNN and a two-dimensional spectral image classifier based on CNN. To simulate a practical application scenario, over 20 000 terahertz time-domain spectra of 12 materials were measured in a non-vacuum environment and used as the training set and test set. After analyzing the effects of sample concentration and air humidity on the spectra, a Savitzky-Golay (S-G) filter was introduced to reduce spectral noise. Experimental results show that the S-G filter improves identification accuracy, because the processed spectra have more pronounced features than the unprocessed spectra; that the proposed RNN and CNN methods are more accurate and faster on the test set than the traditional machine-learning algorithm k-nearest neighbor (k-NN); and that CNN is more robust to noise than RNN on the spectral identification task. Therefore, deep learning can be used for quick and effective identification of terahertz time-domain spectra, which provides a theoretical and experimental basis for new nondestructive security inspection techniques.

Keywords: Terahertz time-domain spectroscopy; Spectral identification; Convolutional neural network; Recurrent neural network; Preprocessing
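
For orientation, the k-NN baseline that the abstract compares against can be sketched as follows. This is a generic k-nearest-neighbor classifier with Euclidean distance and majority voting, not the authors' implementation; the value of k, the distance metric, the helper `toy_spectrum`, and the two material labels are hypothetical. Every query is compared against every stored training spectrum, which is one way to see why the cost of such a method grows with the amount and variety of reference data.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training spectra (Euclidean distance)."""
    dists = sorted((math.dist(x, query), label) for x, label in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def toy_spectrum(peak_index, n=32):
    """Hypothetical spectrum: a single Gaussian peak centred at `peak_index`."""
    return [math.exp(-((i - peak_index) / 3.0) ** 2) for i in range(n)]


# Two made-up materials whose peaks sit in different parts of the band.
train_x = [toy_spectrum(p) for p in (8, 9, 10, 20, 21, 22)]
train_y = ["material_A"] * 3 + ["material_B"] * 3

print(knn_predict(train_x, train_y, toy_spectrum(11)))  # material_A
print(knn_predict(train_x, train_y, toy_spectrum(19)))  # material_B
```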

Received: 2019-11-15 Revised: 2020-03-12

Chinese Library Classification (CLC) number: TP391.4

Funding: Supported by the Key Research and Development Program of Anhui Province (201904e01020005)

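About the author: HU Qi-feng (胡其枫), born in 1991, algorithm engineer at 博微太赫兹信息科技有限公司 (Brainware Terahertz Information Technology Co., Ltd.). e-mail: fengmaomao1991@126.com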

Cite this article:

HU Qi-feng (胡其枫), CAI Jian (蔡健). Research of Terahertz Time-Domain Spectral Identification Based on Deep Learning[J]. Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2021, 41(01): 94-99.

Link to this article:

https://www.gpxygpfx.com/CN/10.3964/j.issn.1000-0593(2021)01-0094-06 or https://www.gpxygpfx.com/CN/Y2021/V41/I01/94