Abstract
This paper proposes a decision-tree-based method for judging the sentiment of Weibo posts and carries out an exploratory spatial data analysis of Weibo sentiment, offering a new perspective for studying patterns in the massive text of Chinese microblogging platforms. Using Sina Weibo data, we first segment the text and tag parts of speech with ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System) and compute the similarity between words with the HowNet knowledge base. We then train a decision tree with the ID3 (Iterative Dichotomiser 3) algorithm as the classifier for judging the sentiment of Weibo text, and finally apply exploratory spatial data analysis to the classification results. The results show that the proposed method reaches an accuracy of 71.5%, that Weibo users' sentiment exhibits significant positive global spatial autocorrelation (Moran's I), and that the local autocorrelation analysis reveals its spatiotemporal clustering patterns.
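The two quantitative ingredients named in the abstract, the ID3 split criterion and global Moran's I, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not list the features or the spatial weight matrix used, so the feature names (`pos_word`, `long`), the toy labels, and the chain-adjacency weights below are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """ID3 split criterion: entropy reduction from splitting on `feature`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder


def morans_i(values, weights):
    """Global Moran's I for values x_i and spatial weights w_ij (w_ii = 0)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Hypothetical toy posts: binary features and sentiment labels (assumed data).
rows = [
    {"pos_word": 1, "long": 1},
    {"pos_word": 1, "long": 0},
    {"pos_word": 0, "long": 1},
    {"pos_word": 0, "long": 0},
]
labels = ["positive", "positive", "negative", "negative"]

# ID3 would split first on the feature with the largest information gain.
best = max(["pos_word", "long"],
           key=lambda f: information_gain(rows, labels, f))
print("first split:", best)

# Hypothetical sentiment scores for four regions arranged in a line,
# with binary weights linking each region to its immediate neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print("clustered:", morans_i([1, 1, 0, 0], chain))
print("alternating:", morans_i([1, 0, 1, 0], chain))
```

In this toy setting, similar scores in neighbouring regions give a positive Moran's I and alternating scores give a negative one, which is the sense in which the abstract reports positive global spatial autocorrelation.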
Source
Journal of Geomatics (《测绘地理信息》)
2018, No. 1, pp. 123-126 (4 pages)
Funding
National Natural Science Foundation of China (Grant No. 41471327)
Keywords
spatial autocorrelation
sentimental judgment
decision tree
classification algorithm
exploratory spatial data analysis