
Original title: 基于种子文档LDA话题的演化研究 (cited by: 6)

Topic Evolution Based on Seminal Document and Topic Model
Abstract: This paper proposes a method for inferring LDA topic evolution based on seminal documents. Seminal documents are selected first and used to guide the modeling of the documents in the following time slice. The semantic distribution of the seminal documents is then used to link LDA topics across consecutive time slices, so that linked topics keep the same identity. Experiments on the NIPS paper corpus and on a collection of Chinese news reports on the NPC and CPPCC sessions show that the method recovers correct evolutions of specific topics in various forms and avoids reporting evolution between topics that are related but have no evolution relationship.
Authors: Shan Bin (单斌), Li Fang (李芳)
Source: New Technology of Library and Information Service (《现代图书情报技术》), indexed in CSSCI and the Peking University Core Journals list, 2011, No. 7, pp. 104-109 (6 pages)
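Funding: One of the research outputs of the National Natural Science Foundation of China project "Research on the Detection of News Topic Threads and Themes" (Grant No. 60873134)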
Keywords: LDA; topic evolution; seminal document; topic model
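
The linking step described in the abstract (using the semantic distribution of the seminal documents to connect topics in consecutive time slices) can be illustrated with a small sketch. This is not the authors' algorithm, since the abstract does not state the linking rule. It assumes that each seminal document has a topic-proportion vector under both the slice-t model and the slice-(t+1) model, and it links a topic to the next-slice topic that carries the most average weight in the seminal documents it dominates. The function name `link_topics`, the `min_support` parameter, and the dominant-topic rule are all hypothetical.

```python
import numpy as np


def link_topics(theta_prev, theta_next, min_support=1):
    """Link topics of slice t to topics of slice t+1 through seed documents.

    theta_prev: (n_seed_docs, K_prev) topic proportions of the seed
        documents under the slice-t model.
    theta_next: (n_seed_docs, K_next) topic proportions of the same
        documents under the slice-(t+1) model.
    min_support: assumed threshold; a topic needs at least this many
        seed documents to be linked.
    Returns a dict {prev_topic_index: next_topic_index}.
    """
    theta_prev = np.asarray(theta_prev, dtype=float)
    theta_next = np.asarray(theta_next, dtype=float)
    if theta_prev.shape[0] != theta_next.shape[0]:
        raise ValueError("both matrices must describe the same seed documents")

    # Assumption: a seed document "belongs" to its highest-weight topic.
    dominant = theta_prev.argmax(axis=1)

    links = {}
    for k in range(theta_prev.shape[1]):
        seeds = np.flatnonzero(dominant == k)
        if seeds.size < min_support:
            continue  # no seed document anchors this topic; leave it unlinked
        # Average next-slice proportions over the seed documents of topic k.
        mean_next = theta_next[seeds].mean(axis=0)
        links[k] = int(mean_next.argmax())
    return links


# Toy data: three seed documents, 2 topics at slice t, 3 topics at slice t+1.
theta_prev = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
theta_next = [[0.2, 0.7, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
print(link_topics(theta_prev, theta_next))  # {0: 1, 1: 2}
```

Topics with no supporting seminal document stay unlinked in this sketch, which mirrors the abstract's goal of not reporting evolution between topics that are only loosely related.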