Comparing Topics from Microblog and News Media about Specific Events (cited by: 3)
Abstract: This paper presents a topic-level comparison of news reports and microblog posts about specific events. We first use the LDA topic model to extract the topics that each medium carries about an event. We then define three indexes, the attention factor, the diversity factor and the evolution factor, give a formula for each, and improve the method for calculating topic discrepancy across media. Finally, we select four events of different types for comparative experiments and analysis. The results show that, for the same event: 1) microblogs carry more comment topics, whose attention factors are close to one another, while news reports carry more factual topics, whose attention factors vary widely; 2) between microblogs and news reports, the diversity factor of the words used is large for comment topics and small for factual topics; 3) on microblogs, comment topics last longer with little change in content, whereas in news reports it is the factual topics that last longer with little change in content.
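Authors: Zhou Zhenyu (周振宇), Li Fang (李芳)
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 1, pp. 47-55 (9 pages)
Funding: National Natural Science Foundation of China (60873134)
Keywords: topic model, microblog, news reports, contrast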
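
This record does not reproduce the paper's formula for the diversity factor. Purely as an illustration of how the vocabulary of a microblog topic and a news topic could be compared, the sketch below uses the Jensen-Shannon divergence between two topic-word distributions. This is a stand-in measure chosen for the example, not the authors' definition, and the function name and the two toy distributions are hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two word distributions.

    p and q map words to probabilities; a word missing from one
    distribution is treated as having probability 0 there.
    The result lies between 0 (identical) and 1 (no shared words).
    NOTE: a stand-in measure, not the formula from the paper.
    """
    total = 0.0
    for word in set(p) | set(q):
        pw = p.get(word, 0.0)
        qw = q.get(word, 0.0)
        mw = 0.5 * (pw + qw)  # mixture distribution
        if pw > 0:
            total += 0.5 * pw * math.log2(pw / mw)
        if qw > 0:
            total += 0.5 * qw * math.log2(qw / mw)
    return total


# Hypothetical topic-word distributions for the same event
# (for example, the top words of one LDA topic per medium).
microblog_topic = {"earthquake": 0.5, "pray": 0.3, "sad": 0.2}
news_topic = {"earthquake": 0.5, "rescue": 0.3, "casualties": 0.2}

print(js_divergence(microblog_topic, news_topic))
```

Under this stand-in measure, a pair of topics that share most of their high-probability words scores near 0, and a pair with disjoint vocabularies scores 1, which matches the intuition behind finding 2) in the abstract.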