Abstract: Outlier mining is an effective way to find abnormal celestial spectra and, through them, to discover special and previously unknown celestial bodies. This paper presents a method for mining abnormal characteristic lines in celestial spectra that combines attribute weights, derived from information entropy, with the wk-distance. An abnormal characteristic line mining system for celestial spectra was designed and implemented on this basis. First, the weight of each characteristic line is determined using information entropy, so that the weight reflects the importance of that line. Second, the massive characteristic line data set is reduced by a pruning technique based on neighborhood radius: data objects that cannot contain abnormal characteristic lines are deleted, leaving a candidate set of abnormal characteristic lines. Third, the wk-distance sum is computed from the deviations between data objects in the candidate set, and the n objects with the largest wk-distance sums are reported as abnormal characteristic line data objects. Experiments on the SDSS star spectral data set, together with the running results of the system, validate the effectiveness and feasibility of the method.
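The three steps can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption rather than the authors' definition: the weights use the common entropy weight scheme (histogram entropy per attribute, weight proportional to one minus the normalised entropy), the wk-distance is taken to be a weighted Euclidean distance to the k nearest neighbours, and pruning drops any object that has at least k neighbours within the radius. The names `entropy_weights`, `weighted_distances`, `top_n_outliers`, `bins`, `radius` and `n_top` are hypothetical.

```python
import numpy as np


def entropy_weights(X, bins=10):
    """Assumed entropy weight scheme: one weight per attribute (column)."""
    n, d = X.shape
    diversity = np.empty(d)
    for j in range(d):
        counts, _ = np.histogram(X[:, j], bins=bins)
        p = counts[counts > 0] / n
        # Shannon entropy of the histogram, normalised to [0, 1].
        entropy = -(p * np.log(p)).sum() / np.log(bins)
        diversity[j] = np.clip(1.0 - entropy, 0.0, 1.0)
    total = diversity.sum()
    if total <= 0:
        # Every attribute is maximally uniform: fall back to equal weights.
        return np.full(d, 1.0 / d)
    return diversity / total


def weighted_distances(X, w):
    """Pairwise weighted Euclidean distances (assumed distance form)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((w * diff ** 2).sum(axis=2))


def top_n_outliers(X, k=3, n_top=2, radius=None, bins=10):
    """Return indices and scores of the n_top objects with the largest
    sum of weighted distances to their k nearest neighbours."""
    w = entropy_weights(X, bins)
    D = weighted_distances(X, w)
    np.fill_diagonal(D, np.inf)  # an object is not its own neighbour

    if radius is None:
        candidates = np.arange(len(X))
    else:
        # Assumed pruning rule: an object with at least k neighbours
        # inside the radius sits in a dense region and is dropped.
        neighbours = (D <= radius).sum(axis=1)
        candidates = np.flatnonzero(neighbours < k)

    knn = np.sort(D[candidates], axis=1)[:, :k]
    scores = knn.sum(axis=1)
    order = np.argsort(-scores)[:n_top]
    return candidates[order], scores[order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(0.0, 1.0, size=(50, 3))  # synthetic stand-in data
    X[7] = [15.0, 15.0, 15.0]               # one planted outlier
    idx, scores = top_n_outliers(X, k=3, n_top=1, radius=2.0)
    print(idx, scores)
```

The sketch builds the full pairwise distance matrix, which is only practical for small data sets. For a massive spectral data set the pruning step would have to run before, not after, the distance computation, for example with a spatial index.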
LOU Sheng-jin, ZHANG Ji-fu*, YANG Hai-feng (娄圣金, 张继福*, 杨海峰). An Abnormal Characteristic Line Mining Method of Celestial Spectrum Based on Attribute Weight and wk-Distance (一种基于属性权值和wk-距离的天体光谱异常特征线挖掘方法) [J]. Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2013, 33(08): 2255-2258.