Outlier Data Mining and Analysis of LAMOST Stellar Spectra in Line Index Feature Space
WANG Guang-pei1, PAN Jing-chang1*, YI Zhen-ping1, WEI Peng2, JIANG Bin1
1. School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai, Weihai 264209, China
2. Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012, China
Abstract: Large-scale spectroscopic surveys produce massive amounts of spectral data and offer opportunities to search for rare and unknown types of spectra, which contributes to revealing the laws governing the evolution of the universe and the origin of life. Mining the outlier data in sky survey data can serve the purpose of finding such special spectra. Line indices can be used to reduce the dimensionality of spectral data while retaining the physical characteristics of the spectra as far as possible, and at the same time they effectively address the high computational complexity of clustering analysis on high-dimensional spectral data. This paper proposes a method for outlier data mining and analysis of massive stellar spectral survey data in a line index feature space. Experimental results demonstrate that (1) using line indices as the feature values of a spectrum allows outlier data mining on high-dimensional spectral data to be performed quickly and resolves the problem of high computational complexity; (2) the outlier data mining method, which operates on the clustering results, can effectively find emission-line stars, late-type stars, late M-type stars, and extremely metal-poor stars, and can even find spectra in which some data are missing; (3) outlier data mining in the line index feature space helps in analyzing the regularities that the special stars show in that feature space. The proposed outlier data mining and analysis method based on line index features can be applied to the study of survey data.
Key words: Lick line index; Outlier data mining; Stellar spectra
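The pipeline summarized in the abstract (reduce each spectrum to line indices, cluster in that feature space, then look for outliers relative to the clusters) can be illustrated with a minimal sketch. The abstract does not give the authors' clustering algorithm or outlier criterion, so everything below is an assumption made for illustration: the function names `line_index`, `kmeans` and `cluster_outliers` are hypothetical, plain k-means with a deterministic quantile-based initialization stands in for the clustering step, and a distance-to-centroid threshold stands in for the outlier rule. The line index follows the usual Lick-style equivalent-width definition, with a linear pseudo-continuum drawn between two sidebands.

```python
import numpy as np

def line_index(wave, flux, band, blue, red):
    """Lick-style equivalent width of one feature, in the units of `wave`."""
    def sideband(lo, hi):
        m = (wave >= lo) & (wave <= hi)
        return flux[m].mean(), 0.5 * (lo + hi)   # mean flux, band midpoint
    fb, wb = sideband(*blue)
    fr, wr = sideband(*red)
    m = (wave >= band[0]) & (wave <= band[1])
    w = wave[m]
    # Pseudo-continuum: straight line through the two sideband midpoints.
    cont = fb + (fr - fb) * (w - wb) / (wr - wb)
    y = 1.0 - flux[m] / cont
    # Trapezoidal integration of (1 - F/Fc) over the feature band.
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(w)))

def kmeans(Z, k, n_iter=50):
    """Plain k-means; returns the k cluster centres.

    Assumption: centres are initialised at evenly spaced quantiles of the
    first feature so that the sketch is deterministic.
    """
    n = len(Z)
    order = np.argsort(Z[:, 0])
    centers = Z[order[((np.arange(k) + 0.5) / k * n).astype(int)]].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):              # skip empty clusters
                centers[j] = Z[labels == j].mean(axis=0)
    return centers

def cluster_outliers(X, k=2, n_sigma=3.0):
    """Indices of rows of X lying unusually far from their nearest centre.

    X has one row per spectrum and one column per line index.
    Assumption: every column has non-zero variance.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)     # standardise each index
    centers = kmeans(Z, k)
    d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2).min(axis=1)
    return np.flatnonzero(d > d.mean() + n_sigma * d.std())
```

In use, one would evaluate `line_index` for each chosen feature band of each spectrum, stack the values into an array with one row per spectrum, and pass that array to `cluster_outliers`; the returned row indices are candidates for visual inspection rather than confirmed special objects. The band limits, the number of clusters `k` and the threshold `n_sigma` are free parameters of this sketch, not values taken from the paper.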
WANG Guang-pei, PAN Jing-chang, YI Zhen-ping, WEI Peng, JIANG Bin. Outlier Data Mining and Analysis of LAMOST Stellar Spectra in Line Index Feature Space [J]. Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2016, 36(10): 3364-3368.