Abstract
In statistical learning theory, and especially for classification problems, the VC dimension plays a central role, yet it is unknown for most commonly used algorithms. This paper calculates an upper bound on the VC dimension of binary decision tree algorithms (Theorem 2), which rises as the complexity of the decision tree and the number of adjustable parameters per node increase. As a supplement, an upper bound on the VC dimension of a single non-leaf node of univariate decision trees is also calculated (Theorem 3). To assess the numerical results of Theorem 2, related empirical conclusions are verified by experiment and found to agree with practice when the complexity of the decision tree is large. A comparison between Theorem 2 and the empirical conclusions shows a large numerical gap but the same trend.
The VC dimension plays a central role in statistical learning theory, especially for classification problems. Since no universal method exists for calculating this combinatorial dimension, its value remains unknown for most classification algorithms. Binary decision tree algorithms for classification constitute a large family in the fields of machine learning, pattern recognition, and data mining, and investigating their VC dimension is a helpful step towards further improving their generalization performance. To this end, the paper derives, in Theorem 2, an upper bound on the VC dimension of binary decision tree algorithms, which rises as the complexity of the resulting tree and the number of adjustable parameters at each node increase. To facilitate the calculation, only continuous attributes are considered, and the hypothesis class of decision tree algorithms is treated as a continuously expanding family of uniform Boolean formulas. As a supplement, an upper bound for a single non-leaf node of univariate decision tree algorithms is also calculated in Theorem 3, which reveals its distinctive classification capability compared with its multivariate counterpart. For comparison, empirical conclusions on the VC dimension of univariate decision tree algorithms are evaluated by experiment and found to be accurate when the complexity of the resulting tree is sufficiently large. Based on this observation, the empirical VC dimension values are compared numerically with the calculated upper bound; although the discrepancy between them is large, they display the same trend. Possible sources of the looseness of the upper bound are discussed. The conclusions of the paper help explain the essential purpose of improvement techniques such as pruning a grown decision tree or imposing early-stopping criteria during training, which are assumed to limit the VC dimension of the algorithm to a reasonable range and hence alleviate the serious problem of overtraining.
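The theorems themselves are not reproduced in this record, but the qualitative behavior the abstract describes matches standard results in the statistical learning literature. The following is an illustrative sketch only, assuming each internal node tests d continuous attributes; the exact bounds in the paper's Theorems 2 and 3 may differ in form and constants.

% Single multivariate linear node (halfspaces on R^d):
\[ \mathrm{VCdim}\bigl(\{\, x \mapsto \operatorname{sign}(w^{\top}x + b) \,\}\bigr) = d + 1 \]

% Single univariate (axis-parallel) threshold node, cf. Theorem 3's
% univariate/multivariate contrast:
\[ \mathrm{VCdim}\bigl(\{\, x \mapsto \operatorname{sign}(x_i - t) : 1 \le i \le d,\ t \in \mathbb{R} \,\}\bigr) = \Theta(\log d) \]

% A fixed tree structure with N internal nodes is a fixed Boolean formula
% over N node predicates; the standard bound on Boolean combinations of
% concept classes (Blumer et al., 1989) then gives, for a node class of
% VC dimension V,
\[ \mathrm{VCdim}(\text{trees with } N \text{ internal nodes}) = O\bigl(NV \log N\bigr) \]

The last form grows with both the tree complexity N and the per-node parameter count (through V), which is the qualitative behavior the abstract attributes to Theorem 2.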
Source
《计算机仿真》 (Computer Simulation), CSCD, 2005, No. 2, pp. 74-78 and 126 (6 pages in total)
Keywords
Decision tree
VC dimension
Statistical learning theory