Application of the Bagging Algorithm to Chinese Text Categorization

Cited by: 12
Abstract: Bagging is a popular ensemble learning algorithm. This paper designs a Chinese text classifier based on Attribute Bagging (AB), an improved variant of Bagging. Multiple training sets are obtained by resampling attributes, and kNN is used as the weak classifier. Experimental results show that Attribute Bagging achieves better classification accuracy (lower error) than Bagging.
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 5, pp. 135-137, 179 (4 pages)
Funding: National Natural Science Foundation of China (No. 60573179)
Keywords: Attribute Bagging; Bagging; Chinese text categorization; k-nearest neighbors (kNN)
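
The approach summarized in the abstract (resampling attributes to build several training views, then combining kNN weak classifiers) can be illustrated with a minimal sketch. This is not the paper's implementation. The subset size, the number of classifiers, the squared Euclidean distance, the majority vote, and the toy term-count vectors are all assumptions made for illustration, and the function names are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k=3):
    """Classify x with kNN, measuring distance only on the given feature indices."""
    dists = []
    for row, label in zip(train_X, train_y):
        # Squared Euclidean distance restricted to the sampled attributes (assumed metric).
        d = sum((row[j] - x[j]) ** 2 for j in features)
        dists.append((d, label))
    dists.sort(key=lambda t: t[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def sample_attribute_subsets(n_features, n_estimators=7, subset_size=2, seed=0):
    """Draw one random attribute subset (without replacement) per weak classifier."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subset_size))
            for _ in range(n_estimators)]


def attribute_bagging_predict(train_X, train_y, subsets, x, k=3):
    """Combine the kNN weak classifiers by simple majority vote (assumed combiner)."""
    votes = Counter(knn_predict(train_X, train_y, x, feats, k) for feats in subsets)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each row is a term-count vector over four terms.
train_X = [
    [5, 4, 0, 1], [4, 5, 1, 0], [5, 5, 0, 0],   # "sports" documents
    [0, 1, 5, 4], [1, 0, 4, 5], [0, 0, 5, 5],   # "finance" documents
]
train_y = ["sports"] * 3 + ["finance"] * 3

subsets = sample_attribute_subsets(n_features=4)
pred_a = attribute_bagging_predict(train_X, train_y, subsets, [4, 4, 1, 1])
pred_b = attribute_bagging_predict(train_X, train_y, subsets, [1, 0, 5, 4])
print(pred_a, pred_b)
```

Standard Bagging would instead resample training documents (rows) with replacement and keep every attribute. The sketch resamples columns only, which is the distinction the abstract draws between the two methods.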