
An Effective Improved Model of C4.5 (cited by: 28)

Improved decision tree of C4.5
Abstract: This paper presents R-C4.5, an effective improved decision tree model, together with its simplified versions. The aim is to build a simple tree while making the attribute selection measure more interpretable and reducing empty branches, insignificant branches, and overfitting. The model is based on the well-known C4.5 decision tree but modifies both the attribute selection and the branching strategy. During the divide-and-conquer process, R-C4.5 merges branches with high entropy, that is, branches that classify poorly, which effectively avoids problems such as fragmentation. Experiments show that R-C4.5 improves the robustness of the tree while maintaining predictive accuracy. The simplified versions, R-C4.5c and R-C4.5s, generate even simpler trees, and because R-C4.5s is carried out entirely in the data preprocessing stage, it is the easiest of the three to implement.
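Source: Journal of Tsinghua University (Science and Technology), 2006, Issue z1, pp. 996-1001 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the "211" Project of Shanghai University of Finance and Economics.
Keywords: decision tree; R-C4.5; C4.5; classifier; data mining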
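
The abstract says that R-C4.5 merges branches with high entropy because they classify poorly. The Python sketch below illustrates that idea only and is not the authors' implementation. The function names, the data layout, and the default threshold (the size-weighted mean entropy of the non-empty branches) are assumptions made for this example; the abstract does not state the actual merging criterion.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def merge_high_entropy_branches(branches, threshold=None):
    """Merge branches whose entropy is strictly above `threshold`.

    `branches` maps an attribute value to the class labels of the
    samples that fall into that branch. Empty branches are dropped.
    If `threshold` is None, the size-weighted mean branch entropy is
    used (an assumption of this sketch, not a rule from the abstract).
    Returns a dict mapping tuples of attribute values to label lists.
    """
    non_empty = {v: ls for v, ls in branches.items() if ls}
    if not non_empty:
        return {}
    ents = {v: entropy(ls) for v, ls in non_empty.items()}
    if threshold is None:
        total = sum(len(ls) for ls in non_empty.values())
        threshold = sum(len(ls) * ents[v] for v, ls in non_empty.items()) / total

    result = {}
    merged_values, merged_labels = [], []
    for value, labels in non_empty.items():
        if ents[value] > threshold:
            # Poorly classifying branch: pool it with the other such branches.
            merged_values.append(value)
            merged_labels.extend(labels)
        else:
            # Low-entropy branch: keep it as its own branch.
            result[(value,)] = list(labels)
    if merged_values:
        result[tuple(merged_values)] = merged_labels
    return result


# Hypothetical split on one attribute with five values.
example = {
    "a": ["yes", "yes", "yes", "yes"],
    "b": ["yes", "no"],
    "c": ["no", "yes", "no", "yes"],
    "d": ["no", "no", "no"],
    "e": [],
}
merged = merge_high_entropy_branches(example)
for values, labels in merged.items():
    print(values, round(entropy(labels), 3), len(labels))
```

In this example the pure branches `a` and `d` stay separate, the empty branch `e` disappears, and the two mixed branches `b` and `c` are pooled into a single branch that a tree builder could go on to split using other attributes.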

