Abstract
Existing topic popularity prediction methods rely only on the features of the topic itself and ignore correlations between different topics. In microblog contexts, however, different topics are correlated to some degree, especially topics belonging to the same event. To address this, a dynamic topic model is first employed to detect the hidden topics in microblogs and their popularity time series. The Jensen-Shannon divergence and Pearson's correlation coefficient are then computed to analyze the correlations among topic contents and among topic time series, respectively, which motivates introducing topic correlation into the prediction model. Finally, a microblog hidden topic popularity prediction algorithm based on the vector autoregressive (VAR) model is proposed, which introduces the correlations among different topic time series during model training. Experiments on real microblog data show that the proposed algorithm achieves higher prediction accuracy and better model fit than algorithms that do not consider correlations among topics.
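The three quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the lag order of 1, and the ordinary least squares fit are all assumptions made for illustration, since the abstract does not state them. The topic distributions and popularity series would come from a dynamic topic model, which is not shown here.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_var1(series):
    """Fit y_t = c + A y_{t-1} by least squares (lag order 1 is an assumption).

    series: list of K equal-length popularity series, one per topic.
    Returns K coefficient rows [c, a_1, ..., a_K], one per topic equation.
    """
    length = len(series[0])
    # Each regressor row holds a constant and every topic's previous value.
    rows = [[1.0] + [s[t - 1] for s in series] for t in range(1, length)]
    dim = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(dim)] for i in range(dim)]
    coefs = []
    for s in series:
        xty = [sum(r[i] * s[t] for t, r in enumerate(rows, start=1)) for i in range(dim)]
        coefs.append(solve(xtx, xty))
    return coefs

def predict_next(series, coefs):
    """One-step-ahead popularity forecast for every topic."""
    last = [1.0] + [s[-1] for s in series]
    return [sum(c * v for c, v in zip(row, last)) for row in coefs]
```

Because each topic's equation includes the lagged popularity of every other topic, the forecast for one topic can draw on the history of correlated topics, which is the idea the abstract describes. A univariate autoregressive model would correspond to keeping only a topic's own lag.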
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
EI
CSCD
Peking University Core Journal
2016, No. 7, pp. 616-624 (9 pages)
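Funding
Supported by:
National Natural Science Foundation of China (No. 61402123, 61300222, 61173170)
Youth Foundation of the National Computer Network Emergency Response Technical Team/Coordination Center of China (No. 2015QN-006, 2014QN01)
Beijing Science and Technology Program (No. Z161100000216128)
Open Fund of the State Key Laboratory of Software Engineering (No. SKLSE2012-09-11)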