
Review of the Latent Dirichlet Allocation Topic Model (LDA主题模型研究综述)
Cited by: 15
Abstract: Topic models have become a major research focus in machine learning. This paper systematically describes parameter estimation and the Gibbs sampling algorithm for the Latent Dirichlet Allocation (LDA) topic model, introduces common improved and extended LDA models, and finally analyzes applications of the LDA model in text mining.
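Authors: 祖弦 (Zu Xian), 谢飞 (Xie Fei)
Source: 《合肥师范学院学报》 (Journal of Hefei Normal University), 2015, No. 6, pp. 55-58, 61 (5 pages)
Funding: Key Project of Provincial Natural Science Research in Anhui Universities (KJ2014A198); Hefei Normal University Research Project (2015TD05)
Keywords: topic model; LDA; parameter estimation; Gibbs sampling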


