
An Effective Sentiment Orientation Classification Algorithm for Short Chinese Microblog Texts (cited by: 39)

ON EFFECTIVE SHORT TEXT TENDENCY CLASSIFICATION ALGORITHM FOR CHINESE MICROBLOGGING
Abstract: In this paper we study sentiment orientation (tendency) classification for short texts, which are characterized by short length, complex structure, and many variant word forms, with the aim of improving both the accuracy and the efficiency of classification. Taking the HowNet sentiment lexicon as the basis, we propose an algorithm for discovering new microblog words and use it to construct a microblog sentiment lexicon. After the text has gone through sentence segmentation, word segmentation, tagging, and sentiment processing, we construct an automaton to compute the sentiment orientation of the short text. To evaluate the method objectively, we compare it experimentally with a HowNet-based classification method and an SVM-based classification method. The experimental results show that the proposed method performs comparably to SVM on general text and clearly outperforms it on short texts. The proposed method also has a marked advantage in efficiency.
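Source: Computer Applications and Software (《计算机应用与软件》), CSCD, Peking University Core Journals, 2012, No. 10, pp. 89-93 (5 pages)
Funding: National Natural Science Foundation of China (61170112)
Keywords: orientation; sentiment; lexicon; automata; HowNet; support vector machine (SVM)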
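
The abstract does not specify the automaton's states or transitions, so the following is only a minimal sketch of how a lexicon-driven state machine could score already-segmented text. The toy lexicons, the negator and intensifier handling, and the names `score_sentence` and `classify` are all assumptions made for illustration; they are not taken from the paper, and word segmentation and tagging are assumed to have been done upstream.

```python
from typing import Dict, Iterable, List

# Toy lexicons: every entry is an illustrative assumption, not data from the paper.
SENTIMENT: Dict[str, float] = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}
NEGATORS = {"not", "never"}
INTENSIFIERS: Dict[str, float] = {"very": 2.0, "slightly": 0.5}


def score_sentence(tokens: Iterable[str]) -> float:
    """Scan tokens left to right, tracking pending modifiers as the machine state."""
    total = 0.0
    sign = 1.0    # flipped by each pending negator
    weight = 1.0  # scaled by each pending intensifier
    for tok in tokens:
        if tok in NEGATORS:
            sign = -sign
        elif tok in INTENSIFIERS:
            weight *= INTENSIFIERS[tok]
        elif tok in SENTIMENT:
            # Emit the modified polarity, then return to the start state.
            total += sign * weight * SENTIMENT[tok]
            sign, weight = 1.0, 1.0
        # Any other token leaves the state unchanged.
    return total


def classify(sentences: List[List[str]]) -> str:
    """Sum sentence scores and map the total to a coarse orientation label."""
    score = sum(score_sentence(s) for s in sentences)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    text = [["not", "very", "good"], ["the", "ending", "was", "great"]]
    print(classify(text))
```

A single left-to-right pass of this kind needs no training and runs in time linear in the number of tokens, which is one plausible reading of the efficiency advantage the abstract claims over an SVM classifier.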
