Research on Short Text Classification Based on Semi-supervised Learning by Graph Structure

Cited by: 1
Abstract: Short text classifiers built on the vector space model lose the structural information of the text, and labeling large numbers of samples creates an annotation bottleneck. To address both problems, this paper proposes a semi-supervised classification method based on graph structure. The method preserves the structural and semantic relationships within short texts and makes full use of unlabeled samples to improve classifier performance. Following the semi-supervised learning approach, a large pool of unlabeled samples is combined with a small number of labeled samples for graph-based self-training. The training set is expanded iteratively, and the final short text classifier is built from the expanded set. Comparative experiments show that the method achieves good classification results.
Authors: Zhang Qian (张倩), Liu Huailiang (刘怀亮)
Source: Library and Information Service (《图书情报工作》), CSSCI, Peking University Core Journal, 2013, Issue 21, pp. 126-132 (7 pages)
Keywords: semi-supervised learning; short text; graph structure; self-training
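
The self-training loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the paper builds its text graphs, measures similarity, or selects confident samples, so everything below is an assumption made for illustration: each text becomes a set of word nodes and adjacent-word edges, similarity is Jaccard overlap of those sets, the classifier is a nearest-neighbor rule, and the names `text_graph`, `similarity`, `predict`, `self_train`, `classify` and the `threshold` parameter are hypothetical.

```python
def text_graph(text):
    """Assumed graph model: word nodes plus directed edges between adjacent words."""
    words = text.lower().split()
    nodes = {("node", w) for w in words}
    edges = {("edge", a, b) for a, b in zip(words, words[1:])}
    return nodes | edges


def similarity(g1, g2):
    """Jaccard overlap of two graphs' combined node and edge sets (an assumption)."""
    if not g1 or not g2:
        return 0.0
    return len(g1 & g2) / len(g1 | g2)


def predict(graph, labeled):
    """Return (label, score) of the most similar labeled graph."""
    best_label, best_score = None, -1.0
    for g, label in labeled:
        score = similarity(graph, g)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


def self_train(labeled_texts, unlabeled_texts, threshold=0.2, max_rounds=10):
    """Iteratively move confidently predicted unlabeled samples into the training set."""
    labeled = [(text_graph(t), y) for t, y in labeled_texts]
    pool = [text_graph(t) for t in unlabeled_texts]
    for _ in range(max_rounds):
        confident, remaining = [], []
        for g in pool:
            label, score = predict(g, labeled)
            if score >= threshold:
                confident.append((g, label))  # pseudo-label accepted
            else:
                remaining.append(g)           # retry in a later round
        if not confident:
            break  # nothing new was learned, so stop early
        labeled.extend(confident)
        pool = remaining
    return labeled


def classify(text, labeled):
    """Classify a new short text with the expanded training set."""
    return predict(text_graph(text), labeled)[0]
```

As a usage example, calling `self_train` with two labeled texts such as `("cheap flight tickets", "travel")` and `("football match score", "sport")` plus a few unlabeled texts returns an expanded labeled set, which `classify` then uses for new texts. A sample that falls below the threshold in one round can still be absorbed later, once similar pseudo-labeled samples have joined the training set.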