Abstract
Stellar spectral classification is an important part of astronomical data processing. Advances in astronomical telescopes have yielded massive volumes of spectral data, which makes fast classification and identification of spectra particularly important. Clustering is a common approach to classification, and the choice of cluster centers strongly affects both the accuracy and the efficiency of clustering. On this basis, a spectral clustering method based on Fast Determination of Clustering Center (FDCC) is proposed. First, preprocessing extracts confidence information for a given set of emission lines; this information serves as the input to the clustering method and thereby reduces the dimensionality of the spectral data. Next, the density and distance of every data point are computed and combined into a judgment value, and singular points are identified from the judgment values using the fact that cluster centers have high density and lie far from one another. Finally, cluster centers are selected from the singular points using the fact that the density and distance of a true center should not differ too much, and K nearest neighbors is then applied with respect to these centers to obtain all clusters. Clustering tests on spectral data from LAMOST DR5 show that FDCC effectively reduces running time and gives better clustering results than the other algorithms compared.
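The center-selection steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption borrowed from density-peak style clustering: a Gaussian-kernel density with cutoff `dc`, distance defined as the gap to the nearest higher-density point, a judgment value equal to density times distance, a mean-plus-`n_sigma`-standard-deviations rule for singular points, and a `max_ratio` bound on how far normalised density and distance may differ. Assignment to the nearest center stands in for the K-nearest-neighbor step, whose details the abstract does not state. The function name `fdcc_sketch` and all parameter names are hypothetical.

```python
import numpy as np

def fdcc_sketch(X, dc=1.0, n_sigma=3.0, max_ratio=5.0):
    """Illustrative density/distance center selection (not the paper's exact method)."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Density (assumed form): Gaussian-kernel neighbor count, excluding the point itself.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0

    # Distance: gap to the nearest point of higher density.
    # The densest point has no such neighbor and gets its largest distance instead.
    order = np.argsort(-rho)
    delta = np.empty(n)
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        delta[i] = dist[i, order[:rank]].min()

    # Judgment value (assumed form): product of density and distance.
    gamma = rho * delta

    # Singular points (assumed rule): judgment value far above the bulk.
    candidates = np.flatnonzero(gamma > gamma.mean() + n_sigma * gamma.std())

    # Keep candidates whose normalised density and distance do not differ too much.
    rho_n = rho / rho.max()
    delta_n = delta / delta.max()
    hi = np.maximum(rho_n, delta_n)
    lo = np.maximum(np.minimum(rho_n, delta_n), 1e-12)
    centers = candidates[(hi / lo)[candidates] <= max_ratio]
    if centers.size == 0:
        # Fallback (assumption): use the single point with the largest judgment value.
        centers = np.array([int(np.argmax(gamma))])

    # Stand-in for the K-nearest-neighbor step: label each point by its nearest center.
    labels = np.argmin(dist[:, centers], axis=1)
    return centers, labels
```

Calling `fdcc_sketch(features, dc=0.5)` on an array of per-spectrum feature vectors returns the indices of the selected centers and a cluster label for every row. The thresholds `dc`, `n_sigma`, and `max_ratio` would need tuning for real data.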
Authors
周永祥
杨海峰
蔡江辉
尚晓群
ZHOU Yong-xiang; YANG Hai-feng; CAI Jiang-hui; SHANG Xiao-qun (School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China)
Source
Journal of Taiyuan University of Science and Technology (《太原科技大学学报》)
2020, No. 6, pp. 425-432 (8 pages)
Funding
National Natural Science Foundation of China (U1731126)
Key Research and Development Program of Shanxi Province (201903D121116)
Keywords
clustering
star
clustering centers
preprocessing