Abstract
To address the multi-value bias of the ID3 decision tree algorithm, this paper exploits the tendency of attribute similarity to favor attributes with few values. It proposes an attribute selection criterion that uses attribute similarity as the coefficient of information entropy, together with a corresponding decision tree generation algorithm. Experimental results show that the new algorithm avoids both the multi-value bias of decision tree algorithms that use information entropy as the attribute selection criterion and the few-value bias of decision tree algorithms that use attribute similarity as the attribute selection criterion.
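The multi-value bias mentioned in the abstract can be illustrated with a minimal Python sketch of the standard ID3 information gain. The record does not give the paper's definition of attribute similarity, so the proposed criterion is not reproduced here; the function names and the toy data (`record_id`, `weather`, `labels`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """ID3 information gain of splitting `labels` by the attribute `values`."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after the split.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: six records with a binary class label.
labels = ["yes", "yes", "no", "no", "yes", "no"]
record_id = [1, 2, 3, 4, 5, 6]  # one distinct value per record
weather = ["sun", "sun", "rain", "rain", "sun", "sun"]  # two values

gain_id = information_gain(record_id, labels)
gain_weather = information_gain(weather, labels)
print(f"gain(record_id) = {gain_id:.3f}")
print(f"gain(weather)   = {gain_weather:.3f}")
```

In this toy data the many-valued `record_id` attribute obtains the maximum possible gain even though it cannot generalize to new records, which is the kind of bias a corrective coefficient on the entropy term is meant to offset.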
Source
Science Technology and Engineering (《科学技术与工程》)
2009, No. 15, pp. 4504-4505, 4522 (3 pages in total)
Keywords
attribute reduction
information entropy
attribute similarity
algorithm