Spectral Classification of M-Type Stars Based on Ensemble Tree Models
WANG Jing1, YI Zhen-ping1*, YUE Li-li1, DONG Hui-fen1, PAN Jing-chang1, BU Yu-de2
1. School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai, Weihai 264209, China
2. School of Mathematics and Statistics, Shandong University, Weihai, Weihai 264209, China
Abstract Located at the top of the red giant branch in the Hertzsprung-Russell diagram, M giants are the brightest stars that have evolved from Sun-like main-sequence stars. Studying M giants is crucial to understanding the Milky Way, especially the Galactic halo. At low and medium resolution, the spectra of M giants are often confused with those of M dwarfs because the distinguishing features are weak and are further obscured by noise and other factors. Previous studies typically used the molecular index diagram of CaH2+CaH3 versus TiO5 to select M giant candidates and then checked the candidates by eye. This approach relies on only three molecular band indices associated with giants and ignores other spectral features, so noise contamination of any one index can cause misclassification. Moreover, visually inspecting a large number of spectra is time-consuming, and the quality of the inspection depends on the inspector's experience, so its reliability cannot be guaranteed. Since 2011, LAMOST has released more than 9 million celestial spectra. Its latest data product, Data Release 5 (DR5), contains 520 000 M-type spectra, which calls for an automatic, accurate, and efficient method to separate M-type sub-samples of different luminosity classes. This study uses four ensemble tree models, Random Forest, GBDT, XGBoost, and LightGBM, to construct classifiers that distinguish M giants from M dwarfs. The accuracies of the four classifiers are 97.23%, 98%, 98.05%, and 98.32%, respectively. The experiments show that LightGBM achieves higher accuracy and requires less training time than the other three models. Analysis of the important features identified by the classifiers shows that ensemble tree models can efficiently extract and express the structural features that separate M giants from M dwarfs. These features include not only atomic lines and molecular bands but also the adjacent pseudo-continuum regions, which is consistent with the features and pseudo-continua traditionally used to calculate the indices. Compared with traditional classification methods, an ensemble tree model can combine tens or hundreds of important spectral features rather than only a few, which reduces misclassification caused by noise. The results show that ensemble tree algorithms have significant advantages for M giant identification and can completely replace the traditional M giant discrimination method that uses only the CaH and TiO indices. This study provides LAMOST with an effective method for processing its massive volume of celestial spectra. As the LAMOST survey continues, more and more M-type spectra will accumulate, providing abundant data for studies of the structure and evolution of the Milky Way.
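The traditional indices mentioned in the abstract are ratios of the mean flux in a feature band to the mean flux in a nearby pseudo-continuum band. The following is a minimal sketch of that calculation, not the authors' code. The band limits in `BANDS` are commonly quoted values and are an assumption here (the abstract does not give them), so they should be checked against the index definitions actually adopted; the function names and the toy spectrum are hypothetical.

```python
from statistics import mean

# Band limits in Angstroms as (feature band, pseudo-continuum band).
# ASSUMPTION: these limits are not stated in the abstract; verify them
# against the index definitions used in the paper before relying on them.
BANDS = {
    "CaH2": ((6814.0, 6846.0), (7042.0, 7046.0)),
    "CaH3": ((6960.0, 6990.0), (7042.0, 7046.0)),
    "TiO5": ((7126.0, 7135.0), (7042.0, 7046.0)),
}


def mean_flux(wave, flux, lo, hi):
    """Mean flux of all pixels whose wavelength lies in [lo, hi]."""
    values = [f for w, f in zip(wave, flux) if lo <= w <= hi]
    if not values:
        raise ValueError(f"no pixels between {lo} and {hi}")
    return mean(values)


def band_index(wave, flux, feature, continuum):
    """Ratio of mean feature-band flux to mean pseudo-continuum flux."""
    return mean_flux(wave, flux, *feature) / mean_flux(wave, flux, *continuum)


def molecular_indices(wave, flux, bands=BANDS):
    """Compute every index defined in `bands` for one spectrum."""
    return {
        name: band_index(wave, flux, feature, continuum)
        for name, (feature, continuum) in bands.items()
    }


# Toy spectrum (hypothetical): flat continuum with a 50% dip in the TiO5 band.
wave = [6800.0 + i for i in range(400)]
flux = [0.5 if 7126.0 <= w <= 7135.0 else 1.0 for w in wave]

idx = molecular_indices(wave, flux)
cah_sum = idx["CaH2"] + idx["CaH3"]  # abscissa or ordinate of the CaH2+CaH3 vs. TiO5 diagram
```

Because each index depends on only a handful of pixels, noise in any one band shifts a star's position in the CaH2+CaH3 versus TiO5 diagram, which is the weakness the ensemble tree classifiers are meant to address by drawing on many more spectral features.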
Received: 2018-06-06
Accepted: 2018-10-28
Corresponding Authors:
YI Zhen-ping
E-mail: yizhenping@sdu.edu.cn