
Science and Technology Video Text Classification Based on Improved Labeled LDA Model
Cited by: 3
Abstract: When video texts in the field of science and technology are classified, technical terms that contribute strongly to classification are easily overlooked. To address this, the paper improves the traditional Labeled Latent Dirichlet Allocation (LDA) model, which is biased toward high-frequency words, and establishes the MulCHI-Labeled LDA model for video texts in the science and technology field. The model builds a domain termbase to highlight technical terms and uses chi-square weighting and text position weighting algorithms to improve the quality of topic words. The experimental results show that, compared with the Labeled LDA model, the proposed model resolves the neglect of technical terms and effectively improves both topic word quality and classification accuracy.
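The abstract names chi-square weighting but does not give its formula. The sketch below shows only the standard chi-square statistic between a term and a category label, as commonly used in text classification; it is not the paper's MulCHI weighting, and the function name `chi_square` and the toy corpus `docs` are hypothetical, chosen for illustration.

```python
def chi_square(term, label, docs):
    """Standard chi-square statistic between a term and a label.

    docs is a list of (set_of_terms, label) pairs. This is a generic
    textbook formulation, assumed here for illustration only.
    """
    a = b = c = d = 0
    for terms, doc_label in docs:
        has_term = term in terms
        in_label = doc_label == label
        if has_term and in_label:
            a += 1          # in the category, contains the term
        elif has_term:
            b += 1          # outside the category, contains the term
        elif in_label:
            c += 1          # in the category, lacks the term
        else:
            d += 1          # outside the category, lacks the term
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # The term (or the label) appears in all or none of the documents,
        # so it carries no discriminating information.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical toy corpus: "video" occurs everywhere, "laser" only in physics.
docs = [
    ({"laser", "beam", "video"}, "physics"),
    ({"laser", "optics", "video"}, "physics"),
    ({"gene", "cell", "video"}, "biology"),
    ({"gene", "protein", "video"}, "biology"),
]

laser_score = chi_square("laser", "physics", docs)
video_score = chi_square("video", "physics", docs)
```

In this toy corpus the frequent but uninformative word "video" scores 0.0, while the category-specific term "laser" scores 4.0, which illustrates why a chi-square weight can counteract a bias toward high-frequency words.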
Authors: MA Jianhong; FAN Yuexiang (School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2018, Issue 9, pp. 274-279 (6 pages)
Keywords: science and technology video; text classification; label; chi-squared weighting; database of domain words
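  • Related literature
References: 14
Second-level references: 175
Co-citing literature: 291
Co-cited literature: 24
Citing literature: 3
Second-level citing literature: 14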
