Research on Text Data Topic Mining and Association Search (cited by: 6)
Abstract: Text data is the most natural way to store and exchange information, and text mining can uncover latent knowledge patterns hidden in massive text collections. This paper studies topic mining and association search over text data. First, the text is processed through parsing and extraction, word-segmentation preprocessing, and indexing. Next, a topic discovery model based on latent semantic relations mines the topic information hidden in a large volume of text. Finally, the topic model is used to compute the degree of association between keywords for query expansion, which enables association search. A prototype system for text data mining and association search was implemented. Topic discovery and association search were performed on the Tancorp dataset, and the association search process is displayed synchronously through a visualization and a web page.
Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2017, No. B11, pp. 411-413, 456 (4 pages)
Keywords: text mining; topic discovery; association search
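The query-expansion step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the topic model or the association measure, so everything here is an assumption for illustration: the `TOPIC_WORD` weights are made-up stand-ins for the output of a trained topic model, cosine similarity over per-topic weights is one plausible way to score keyword association, and the names `word_vector`, `cosine`, and `expand_query` are hypothetical.

```python
import math

# Hypothetical topic-word weights standing in for a trained topic model.
# The values are invented for illustration and do not come from the paper.
TOPIC_WORD = {
    "sports": {"football": 0.5, "league": 0.3, "coach": 0.2, "stock": 0.0, "market": 0.0},
    "finance": {"football": 0.0, "league": 0.1, "coach": 0.0, "stock": 0.5, "market": 0.4},
}


def word_vector(word):
    """Return the word's weight under each topic, in a fixed (sorted) topic order."""
    return [TOPIC_WORD[topic].get(word, 0.0) for topic in sorted(TOPIC_WORD)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_query(keyword, top_k=2):
    """Return up to top_k words whose topic profile is most similar to the keyword's."""
    vocabulary = {w for weights in TOPIC_WORD.values() for w in weights}
    target = word_vector(keyword)
    scored = [(cosine(target, word_vector(w)), w) for w in vocabulary if w != keyword]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Words sharing no topic with the keyword (score 0) are not used for expansion.
    return [w for score, w in scored[:top_k] if score > 0.0]


if __name__ == "__main__":
    print(expand_query("stock"))
```

In this toy setup, a search for "stock" would also retrieve documents matching the expanded terms, which is the sense in which association search extends plain keyword search. A real system would take the weights from the trained model and the vocabulary from the index.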
