
SMS Text Classification Method Based on Context (基于上下文的短信文本分类方法) Cited by: 13
Abstract: Large volumes of Short Messaging Service (SMS) text contain many co-occurring words. To exploit this characteristic, a context-based SMS text classification method is proposed. Using the contextual relations between words, the method defines a word similarity measure and a context-based term weight, which give a better-founded semantic representation of each word within its category and thereby improve the efficiency of SMS text classification. Experimental results show that the method classifies better than the traditional simple vector distance classification method.
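Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2011, No. 10, pp. 41-43 (3 pages)
Funding: Huai'an Science and Technology Plan Fund project (HAG09061); Huaiyin Institute of Technology Fund key project (HGA0907)
Keywords: Short Messaging Service (SMS) text; word co-occurrence; context; word similarity; SMS text classification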
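
The abstract names word co-occurrence and word similarity but gives no formulas. The following is only a minimal sketch of one common way to derive word similarity from co-occurrence, not the paper's actual definition. It assumes messages are already tokenized, treats the whole message as the context window, and uses cosine similarity between co-occurrence count vectors. The function names and the sample messages are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(messages):
    """Map each word to a Counter of the words it co-occurs with in a message.

    Assumption: the context of a word is every other distinct word
    in the same message.
    """
    vectors = defaultdict(Counter)
    for tokens in messages:
        unique = set(tokens)
        for word in unique:
            for other in unique:
                if other != word:
                    vectors[word][other] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical, already-tokenized messages (illustrative data only).
messages = [
    ["meeting", "tomorrow", "office"],
    ["meeting", "today", "office"],
    ["discount", "sale", "today"],
]
vectors = context_vectors(messages)
similarity = cosine(vectors["tomorrow"], vectors["today"])
```

In this toy data, "tomorrow" and "today" never appear in the same message, yet they score as similar because they share the context words "meeting" and "office". Shared context of this kind is what a co-occurrence-based similarity is meant to capture.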