
Research on Topic Clustering of Track Circuit Fault Text Based on Spectral Clustering
基于谱聚类的轨道电路故障文本主题聚类研究
Cited by: 1
Abstract: Track circuit fault logs are an important data record in daily on-site operation and maintenance. Because these logs are under-used in field work and manual analysis is inefficient, the paper proposes a topic clustering and mining method for track circuit fault text based on spectral clustering. First, the characteristics of the fault text data are analyzed and the text is preprocessed; a Word2vec model is trained to obtain character-level feature vectors, giving a representation of the fault text in semantic space. Second, using the graph spectral clustering properties of the Laplacian matrix, clustering of the high-dimensional fault text features is converted into a spectral graph partitioning problem: for the text data of the signaling (电务), track maintenance (工务), and power supply (供电) fault factors, the eigenvectors of the normalized Laplacian matrix are computed and a low-dimensional fault text feature matrix is constructed. The K-Means algorithm is then applied to cluster fault text topics in each of the three data sets, yielding the track circuit fault topic types and their frequencies, and the clustering results are visualized with t-distributed stochastic neighbor embedding (t-SNE). Finally, different clustering models are compared on the three data sets. The experiments show that the spectral clustering model converges better while maintaining clustering accuracy, and the visualization confirms that the fault topic categories obtained are semantically well separated. Automated clustering, mining, and statistical analysis of track circuit fault text with this method can support on-site comprehensive maintenance and fault prevention for track circuits.
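The spectral step described in the abstract (normalized Laplacian eigenvectors, then K-Means on the low-dimensional matrix) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It makes several assumptions the record does not confirm: a Gaussian (RBF) affinity with a hypothetical bandwidth `sigma`, the symmetric normalized Laplacian with row normalization, and a small hand-written K-Means. The helper `toy_vectors` is hypothetical and produces synthetic vectors that merely stand in for Word2vec-based text features. The t-SNE visualization is omitted.

```python
import numpy as np

def toy_vectors(n_per_topic=20, dim=16, seed=0):
    """Hypothetical stand-in for Word2vec text vectors: three noisy groups."""
    rng = np.random.default_rng(seed)
    centers = np.zeros((3, dim))
    centers[0, 0] = centers[1, 1] = centers[2, 2] = 10.0
    return np.vstack([c + 0.3 * rng.standard_normal((n_per_topic, dim))
                      for c in centers])

def rbf_affinity(X, sigma=1.0):
    """Gaussian similarity graph (an assumed choice of affinity)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def spectral_embedding(W, k):
    """Rows of the k eigenvectors of L_sym with the smallest eigenvalues."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    # L_sym = I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    U = eigvecs[:, :k]
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    return U / np.maximum(norms, 1e-12)  # row-normalize the embedding

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd iterations with farthest-point initialization."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def spectral_cluster(X, k, sigma=1.0):
    """Affinity graph -> spectral embedding -> K-Means labels."""
    return kmeans(spectral_embedding(rbf_affinity(X, sigma), k), k)

X = toy_vectors()
labels = spectral_cluster(X, k=3, sigma=2.0)
print(np.bincount(labels))  # cluster sizes, i.e. topic frequencies
```

On real fault text, the input rows would be document vectors built from the trained character-level embeddings, and `k` and `sigma` would need tuning for each of the three data sets.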
Authors: YAO Xinwen (姚新文); HOU Tong (侯通); ZHENG Qiming (郑启明); WANG Xiaomin (王小敏)
Affiliations: School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China; Transportation & Economics Research Institute, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China; Sichuan Province Train Operation Control Technology Engineering Research Center, Chengdu 611756, China
Source: Journal of Lanzhou Jiaotong University (《兰州交通大学学报》), CAS, 2024, No. 1, pp. 64-72 (9 pages)
Funding: Science and Technology Research and Development Program of China State Railway Group Co., Ltd. (L2022G004, P2021G053).
Keywords: track circuit; spectral clustering; text clustering; Word2vec; fault topic
