Abstract
The traditional ID3 decision tree algorithm suffers from difficult attribute selection, low classification efficiency, weak noise tolerance, and poor scalability to large data sets. To address these problems, a decision tree algorithm based on attribute significance and variable precision rough sets is proposed, which removes noisy data while keeping the decision tree from growing too large. The algorithm was validated on multiple UCI standard data sets. The experimental results show that it outperforms the ID3 algorithm in both the size of the resulting decision tree and classification accuracy.
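To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a variable-precision (β) dependency degree and an attribute significance derived from it. It assumes the common textbook formulation (an equivalence class joins the β-positive region when its majority decision label reaches a share of at least β, and significance is the drop in dependency when an attribute is removed); the paper's own definitions may differ. The function names, the toy data, and the chosen β values are illustrative assumptions, not taken from the paper.

```python
from collections import Counter, defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def beta_dependency(rows, labels, attrs, beta):
    """Share of rows lying in classes whose majority label share is >= beta.

    Assumed formulation: beta in (0.5, 1]; beta = 1.0 gives the classical
    (noise-intolerant) rough set dependency degree.
    """
    covered = 0
    for block in partition(rows, attrs):
        majority = Counter(labels[i] for i in block).most_common(1)[0][1]
        if majority / len(block) >= beta:
            covered += len(block)
    return covered / len(rows)


def significance(rows, labels, attrs, attr, beta):
    """Drop in beta-dependency when attr is removed from attrs (assumed definition)."""
    rest = [a for a in attrs if a != attr]
    return (beta_dependency(rows, labels, attrs, beta)
            - beta_dependency(rows, labels, rest, beta))


# Hypothetical toy data: "outlook" decides the label, except one noisy row (index 4).
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},   # noisy row
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
]
labels = ["play", "play", "play", "play", "stay",
          "stay", "stay", "stay", "stay", "stay"]

attrs = ["outlook", "windy"]
for beta in (1.0, 0.75):
    print("beta =", beta,
          "dependency on outlook =", beta_dependency(rows, labels, ["outlook"], beta))
for a in attrs:
    print("significance of", a, "at beta 0.75 =",
          significance(rows, labels, attrs, a, 0.75))
```

In this toy data, β = 1.0 lets the single noisy row push every "sunny" row out of the positive region, whereas β = 0.75 tolerates it, which is the kind of noise tolerance the abstract attributes to the variable precision model. An attribute with zero significance (here "windy") would be a candidate for removal before tree construction, which is one way the tree size could be kept down.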
Source
Computer Science (《计算机科学》)
CSCD
PKU Core Journal
2017, Issue B11, pp. 129-132 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (61503208)
Keywords
Decision tree
Attribute significance
Variable precision rough set
Attribute reduction
Data mining