Terahertz Spectroscopic Identification with Diffusion Maps
NI Jia-peng1, SHEN Tao1, 2*, ZHU Yan2, LI Ling-jie1, MAO Cun-li1, YU Zheng-tao1
1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504, China
2. Faculty of Materials Science and Engineering, Kunming University of Science and Technology, Kunming 650093, China
Abstract: Feature extraction is the key issue in terahertz spectroscopic identification. Traditional methods identify materials by their absorption peaks, which are extracted manually as features. However, many materials show no apparent spectral features, such as peaks or valleys, in the terahertz band. For such materials, researchers reduce the dimensionality of the high-dimensional terahertz spectral data and extract features with statistical learning and machine learning methods. Because terahertz spectral data are nonlinear, linear methods are prone to large errors, especially when the spectral curves of different materials are very similar. To address this issue, this paper studies a novel terahertz spectroscopic identification approach based on Diffusion Maps (DM). DM performs nonlinear dimensionality reduction while preserving the intrinsic geometry of the data, and the manifold features it extracts have good discrimination and clustering performance. First, a Savitzky-Golay (S-G) filter and cubic spline interpolation were used to smooth the terahertz transmission spectra of ten substances and to bring them to a uniform resolution over the same frequency band. Second, the high-dimensional terahertz spectral data were mapped to a low-dimensional feature space with DM to extract the manifold features of the spectra. Finally, a Multi-class Support Vector Machine (M-SVM) classifier was applied to classify the spectra. Experimental results show that, compared with Principal Component Analysis (PCA) and Isometric Mapping (ISOMAP), the manifold features extracted by DM are more discriminative. In addition, DM directly yields an estimate of the intrinsic dimension of the terahertz spectra. The proposed method therefore provides a novel approach for identifying similar terahertz spectra quickly and accurately.
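The dimensionality-reduction step described in the abstract can be illustrated with a minimal NumPy sketch of a basic diffusion map. This is not the authors' implementation: the function name `diffusion_map`, the Gaussian kernel, the median heuristic for the kernel width `epsilon`, and the diffusion time `t` are all assumptions made for illustration, and each row of `X` is assumed to be one preprocessed spectrum.

```python
import numpy as np


def diffusion_map(X, n_components=2, epsilon=None, t=1):
    """Basic diffusion map embedding (illustrative sketch, not the paper's code).

    X            : (n_samples, n_features) array, one spectrum per row (assumed).
    n_components : number of non-trivial diffusion coordinates to keep.
    epsilon      : Gaussian kernel width; defaults to the median squared
                   distance, a common heuristic that is an assumption here.
    t            : diffusion time (integer power applied to the eigenvalues).
    Returns (embedding, eigenvalues sorted in descending order).
    """
    X = np.asarray(X, dtype=float)

    # Pairwise squared Euclidean distances, clipped at zero for round-off.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)

    if epsilon is None:
        epsilon = np.median(d2[d2 > 0])

    # Gaussian affinity matrix and its row sums (degrees).
    K = np.exp(-d2 / epsilon)
    d = K.sum(axis=1)

    # Symmetric matrix A = D^{-1/2} K D^{-1/2}, which shares its eigenvalues
    # with the Markov matrix P = D^{-1} K.
    A = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(A)

    # eigh returns ascending eigenvalues; reorder to descending.
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    # Right eigenvectors of P are D^{-1/2} times the eigenvectors of A.
    psi = vecs / np.sqrt(d)[:, None]

    # Drop the trivial constant eigenvector (eigenvalue 1) and scale the
    # remaining coordinates by eigenvalue**t.
    idx = slice(1, n_components + 1)
    embedding = psi[:, idx] * (vals[idx] ** t)
    return embedding, vals
```

The returned embedding could then be passed to a classifier such as the multi-class SVM mentioned in the abstract. The eigenvalues are returned as well because the decay of the spectrum is one common way to judge how many diffusion coordinates are worth keeping; whether this matches the intrinsic-dimension estimate used in the paper is not stated in the abstract.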