Wasserstein GAN for the Classification of Unbalanced THz Database
ZHU Rong-sheng1, 2, SHEN Tao1, 2*, LIU Ying-li1, 2, ZHU Yan1, 2, CUI Xiang-wei1, 2
1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504, China
2. Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming 650504, China
Abstract The terahertz spectrum of a substance is unique. Combined with advanced machine learning methods, terahertz spectrum recognition based on large-scale spectral databases has become a focus of terahertz application research. Classifying terahertz spectral data relies on class-balanced spectra across many materials, but such data are difficult to collect. This paper proposes a recognition method for unbalanced terahertz spectra based on WGAN (Wasserstein Generative Adversarial Networks). As a data generation method, WGAN supplements the data set with samples generated once the model reaches Nash equilibrium, and a support vector machine (SVM) is then trained on the supplemented set. The experimental results show that the generated data effectively approximate the distribution of the real data, and that mixing generated data with real data improves recognition accuracy on unbalanced spectral data. Three maltose compounds with similar characteristic spectra are used for verification. We first apply Savitzky-Golay (S-G) filtering and cubic spline interpolation to normalize the spectral data of the three substances, and then expand the unbalanced terahertz spectral data with a WGAN model until the classes are balanced. All experiments are evaluated on the same test set, and three sets of comparative experiments demonstrate the effectiveness of WGAN in handling unbalanced data sets. First, we use WGAN to generate data. As the number of iterations increases, the generated data gradually conform to the real data distribution. When the model reaches Nash equilibrium, the generated data largely conform to the original data distribution.
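One way to make "the generated data conform to the real data distribution" concrete is to measure the Wasserstein-1 distance that WGAN is built around. The sketch below is an illustration only, not the authors' training code: it computes the exact empirical distance for two equal-sized one-dimensional samples, whereas a WGAN critic approximates this quantity in high dimensions. The function name `wasserstein_1d` and all sample values are hypothetical.

```python
from typing import Sequence


def wasserstein_1d(real: Sequence[float], generated: Sequence[float]) -> float:
    """Empirical Wasserstein-1 distance between two equal-sized 1-D samples."""
    if len(real) == 0 or len(real) != len(generated):
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal transport plan pairs the sorted values,
    # so the distance is the mean absolute difference of the sorted samples.
    pairs = zip(sorted(real), sorted(generated))
    return sum(abs(r - g) for r, g in pairs) / len(real)


# Hypothetical absorbance values at a single frequency point (not from the paper).
real = [0.10, 0.20, 0.30, 0.40]
early = [0.60, 0.70, 0.80, 0.90]  # stand-in for generator output early in training
late = [0.12, 0.19, 0.33, 0.38]   # stand-in for generator output late in training

print(wasserstein_1d(real, early))  # larger distance
print(wasserstein_1d(real, late))   # smaller distance
```

A distance that shrinks over training iterations is consistent with the generated spectra moving toward the real distribution, although a single frequency point is only a coarse check.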
The experimental results show that training the SVM on the WGAN-expanded data set corrects the model's bias on the test set, where small-sample classes (Maltotriose, Maltohexaose) are misclassified toward the large-sample class (Maltoheptaose). Comparing WGAN with two traditional methods for unbalanced data sets, FWSVM and COPY, we find that all three reach a training set accuracy above 90% on dataset-1. However, because of limited generalization ability, the traditional methods perform unsatisfactorily on the test set, whereas the test set accuracy with WGAN reaches 91.54%. To examine different degrees of imbalance, data sets with imbalances of 16, 81, and 256 were used for verification. The accuracies on the three test sets are 92.08%, 91.54%, and 90.27%, respectively, which meets the practical need to handle different degrees of imbalance.
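The abstract does not define "imbalance" or give the class sizes. The sketch below assumes the common definition (largest class size divided by smallest class size) and shows how many generated spectra each class would need to reach class balance. The function names and the counts are hypothetical and chosen only so that the ratio equals 81.

```python
from typing import Dict


def imbalance_ratio(counts: Dict[str, int]) -> float:
    """Largest class size divided by smallest class size (assumed definition)."""
    return max(counts.values()) / min(counts.values())


def samples_to_generate(counts: Dict[str, int]) -> Dict[str, int]:
    """Synthetic spectra each class needs in order to match the largest class."""
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


# Hypothetical class sizes; the actual counts are not given in the abstract.
counts = {"Maltotriose": 10, "Maltohexaose": 10, "Maltoheptaose": 810}

print(imbalance_ratio(counts))
print(samples_to_generate(counts))
```

Under these assumed counts, the two minority classes would each be topped up with generated spectra before the SVM is trained, while the majority class is left unchanged.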
Received: 2020-01-15
Accepted: 2020-04-22
Corresponding Author: SHEN Tao
E-mail: shentao@kmust.edu.cn