Abstract
Pruning is an important step in decision tree classification learning: it reduces the complexity of a decision tree and improves its generalization ability, and thereby its recognition accuracy and efficiency. A coefficient function that combines the error rate and the size of the decision tree is used to form the pruning criterion. With a suitably chosen parameter for the coefficient function, the tree is traversed bottom-up and each node is tested against the criterion and pruned where appropriate. Experimental results show that, when both classification prediction accuracy and tree size are taken into account, the BASP pruning algorithm achieves better pruning results.
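The abstract does not give the coefficient function itself, so the following is only a minimal sketch of the general idea: a bottom-up pass that replaces a subtree with a leaf when a combined error-and-size score does not get worse. The score `error_rate + alpha * size`, the parameter name `alpha`, the `Node` class, and the toy tree are all assumptions made for illustration; this is not the BASP criterion from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical tree node: class counts of the training samples reaching it."""
    counts: Dict[str, int]
    children: List["Node"] = field(default_factory=list)


def leaf_errors(node: Node) -> int:
    # Errors made if this node predicts its majority class.
    return sum(node.counts.values()) - max(node.counts.values())


def subtree_errors(node: Node) -> int:
    # Errors made by the leaves of the subtree rooted at this node.
    if not node.children:
        return leaf_errors(node)
    return sum(subtree_errors(c) for c in node.children)


def subtree_size(node: Node) -> int:
    # Number of nodes in the subtree rooted at this node.
    return 1 + sum(subtree_size(c) for c in node.children)


def score(errors: int, size: int, total: int, alpha: float) -> float:
    # Assumed combined criterion: error rate plus a size penalty weighted by alpha.
    return errors / total + alpha * size


def prune(node: Node, total: int, alpha: float) -> Node:
    # Bottom-up (post-order): prune the children first, then test this node.
    for child in node.children:
        prune(child, total, alpha)
    if node.children:
        keep = score(subtree_errors(node), subtree_size(node), total, alpha)
        cut = score(leaf_errors(node), 1, total, alpha)
        if cut <= keep:
            node.children = []  # collapse the subtree into a single leaf
    return node


def example_tree() -> Node:
    # Toy tree with 100 samples; the right subtree saves only one error.
    right = Node({"a": 5, "b": 45},
                 [Node({"a": 4, "b": 45}), Node({"a": 1, "b": 0})])
    return Node({"a": 50, "b": 50}, [Node({"a": 45, "b": 5}), right])


if __name__ == "__main__":
    for alpha in (0.0, 0.02):
        tree = prune(example_tree(), total=100, alpha=alpha)
        print(alpha, subtree_size(tree), subtree_errors(tree))
```

In this toy setup a larger `alpha` favors smaller trees at the cost of a few extra training errors, which is the trade-off between accuracy and size that the abstract describes; how the paper actually selects its parameter is not stated here.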
Source
Science Technology and Engineering (《科学技术与工程》)
PKU Core Journal
2016, Issue 16, pp. 79-82 (4 pages)
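Funding
National High Technology Research and Development Program of China (2009AA062802)
National Natural Science Foundation of China (60473125)
CNPC Innovation Fund for Young and Middle-aged Researchers in Petroleum Science and Technology (05E7013)
Sub-project of a National Major Special Project (G5800-08-ZS-WX)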
Keywords
decision tree
pruning algorithm
accuracy
size