Decomposition and Classification of Stellar Spectra Based on t-SNE
JIANG Bin, ZHAO Zi-liang, WANG Shu-ting, WEI Ji-yu, QU Mei-xia*
School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai, Weihai 264209, China
Abstract With the development of astronomy and the improvement of telescope observing capability, large sky survey telescopes have produced petabytes of stellar spectra. A stellar spectrum is a complex frequency-domain signal, usually composed of a continuum and absorption lines. Differences between spectra are mainly caused by the effective temperature, surface gravity, and chemical abundances of the stars. Automatic classification of stellar spectra is an important part of astronomical data processing and the basis for studying stellar evolution and measuring stellar parameters. Massive spectral data sets require efficient and accurate classification methods. Traditional manual classification is slow and insufficiently accurate, and cannot meet the practical need for automatic classification of massive numbers of stellar spectra. Machine learning algorithms have therefore been widely used in spectral classification. A significant feature of stellar spectra is their high dimensionality. Dimensionality reduction both extracts features and reduces the amount of computation, which makes it the primary task in spectral classification. Traditional linear dimensionality reduction reduces the spectra only according to variance, so different types of spectra overlap in the feature space, whereas manifold learning can produce clear classification boundaries that avoid overlap, which benefits subsequent classification. In this paper, the distribution of spectra in high-dimensional space and the principle by which manifold learning reduces the dimensionality of high-dimensional linear data are studied. The effects of two dimensionality reduction methods, t-SNE and principal component analysis (PCA), were compared, and an improved k-nearest neighbor algorithm based on the correlation distance of attribute values was used for spectral classification. Python and Scikit-learn were used to implement the algorithm. The method was tested on 12 000 low signal-to-noise stellar spectra from SDSS, and high-precision automatic processing and classification of the spectral data were achieved. Experimental results show that the t-SNE method, which is based on manifold learning, can recover the low-dimensional manifold structure in high-dimensional spectral data: the low-dimensional manifold features in the high-dimensional space are found and the corresponding embedding maps are solved. During dimensionality reduction, the differences between spectral samples of different categories are preserved to the greatest extent. Three-dimensional visualization of the experimental results shows that PCA leads to overlapping distributions of stellar spectra of different categories, while t-SNE produces more distinct category boundaries. After feature extraction, the k-nearest neighbor algorithm based on attribute-value correlation distance achieves satisfactory classification accuracy on the test data sets. The method used in this paper can also be applied to the automatic classification of massive spectra generated by other telescopes and to data mining for rare objects.
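The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the spectra are synthetic (the helper `make_synthetic_spectra` is hypothetical and stands in for SDSS data), and Scikit-learn's standard `KNeighborsClassifier` with `metric="correlation"` stands in for the paper's improved k-nearest neighbor algorithm, whose details are not given here. The choice of three components follows the three-dimensional visualization mentioned above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def make_synthetic_spectra(n_per_class=50, n_pixels=200, seed=0):
    """Hypothetical stand-in for real spectra: continuum + one absorption line + noise."""
    rng = np.random.default_rng(seed)
    wave = np.linspace(0.0, 1.0, n_pixels)
    spectra, labels = [], []
    # Each class gets its own continuum slope and absorption-line position (assumed values).
    for label, (slope, line_pos) in enumerate([(-0.8, 0.3), (0.0, 0.5), (0.8, 0.7)]):
        continuum = 1.0 + slope * (wave - 0.5)
        line = 0.4 * np.exp(-0.5 * ((wave - line_pos) / 0.01) ** 2)
        noise = rng.normal(0.0, 0.2, size=(n_per_class, n_pixels))  # low signal-to-noise
        spectra.append(continuum - line + noise)
        labels.append(np.full(n_per_class, label))
    return np.vstack(spectra), np.concatenate(labels)


def embed_and_classify(X, y, reducer, seed=0):
    """Reduce X to a low-dimensional embedding, then score a correlation-distance kNN."""
    # t-SNE has no transform() for unseen samples, so all spectra are embedded
    # together (labels are not used) and the split happens afterwards.
    Z = reducer.fit_transform(X)
    Z_train, Z_test, y_train, y_test = train_test_split(
        Z, y, test_size=0.3, stratify=y, random_state=seed
    )
    # Plain kNN with correlation distance; an assumption, not the paper's improved variant.
    knn = KNeighborsClassifier(n_neighbors=5, metric="correlation", algorithm="brute")
    knn.fit(Z_train, y_train)
    return Z, knn.score(Z_test, y_test)


X, y = make_synthetic_spectra()
Z_pca, acc_pca = embed_and_classify(X, y, PCA(n_components=3, random_state=0))
Z_tsne, acc_tsne = embed_and_classify(
    X, y, TSNE(n_components=3, init="pca", random_state=0)
)
print(f"PCA   + kNN accuracy: {acc_pca:.3f}")
print(f"t-SNE + kNN accuracy: {acc_tsne:.3f}")
```

The accuracies printed for synthetic data say nothing about the results reported in the paper; the sketch only shows how the two reducers can be swapped in front of the same classifier. Plotting `Z_pca` and `Z_tsne` in three dimensions, colored by `y`, would give a visual comparison of class overlap of the kind the abstract describes.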
Received: 2019-08-17
Accepted: 2019-12-26
Corresponding Authors:
QU Mei-xia
E-mail: whkunyushan@163.com