Abstract
The ID3 algorithm is widely used in classification data mining, but when it is applied to large-scale training sample sets it occupies a large amount of main memory and executes inefficiently. Attribute reduction and group counting are used to reduce the training sample set, yielding a new, smaller training sample set, to which the ID3 algorithm is then applied for classification mining. The entire process is implemented with modern database technology and stored-procedure programming. Experiments show that the improved design raises the execution efficiency of the ID3 algorithm and enhances its scalability.
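The data-reduction idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the abstract says is built on a database with stored procedures. The function names, the toy data, and the greedy positive-region reduction below are all assumptions made for illustration: the abstract does not specify which attribute-reduction procedure is used. In a database, the group-counting step would roughly correspond to a `GROUP BY` query with `COUNT(*)`.

```python
from collections import Counter
from math import log2

def positive_region_size(rows, attrs):
    """Count rows whose equivalence class under `attrs` has a single class label."""
    labels_by_key, sizes = {}, Counter()
    for row in rows:
        key = tuple(row[i] for i in attrs)
        labels_by_key.setdefault(key, set()).add(row[-1])
        sizes[key] += 1
    return sum(n for key, n in sizes.items() if len(labels_by_key[key]) == 1)

def reduce_attributes(rows, n_attrs):
    """Greedy backward elimination (assumed variant): drop an attribute
    whenever the positive region stays the same size without it."""
    kept = list(range(n_attrs))
    target = positive_region_size(rows, kept)
    for a in range(n_attrs):
        trial = [i for i in kept if i != a]
        if positive_region_size(rows, trial) == target:
            kept = trial
    return kept

def group_count(rows, attrs):
    """Collapse identical (attribute values..., label) tuples into counts."""
    return Counter(tuple(row[i] for i in attrs) + (row[-1],) for row in rows)

def entropy(label_counts):
    total = sum(label_counts.values())
    return -sum(c / total * log2(c / total) for c in label_counts.values() if c)

def information_gain(grouped, attr_pos):
    """ID3 information gain computed from grouped counts instead of raw rows."""
    overall, by_value = Counter(), {}
    for key, n in grouped.items():
        overall[key[-1]] += n
        by_value.setdefault(key[attr_pos], Counter())[key[-1]] += n
    total = sum(overall.values())
    remainder = sum(sum(c.values()) / total * entropy(c) for c in by_value.values())
    return entropy(overall) - remainder

# Hypothetical toy data: (outlook, windy, humidity, label); the label is last.
rows = [
    ("sunny", "no", "high", "play"),
    ("sunny", "no", "high", "play"),
    ("sunny", "yes", "high", "stay"),
    ("rain", "no", "low", "play"),
    ("rain", "yes", "low", "stay"),
    ("rain", "yes", "low", "stay"),
]

kept = reduce_attributes(rows, n_attrs=3)   # indices of attributes that survive
grouped = group_count(rows, kept)           # reduced training set with counts
gain = information_gain(grouped, attr_pos=0)
print(kept, dict(grouped), gain)
```

Because entropy depends only on class frequencies, the gain computed from the grouped counts equals the gain computed from the raw rows, so ID3 can run on the smaller table without changing which attribute it selects.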
Source
Natural Science Journal of Harbin Normal University (《哈尔滨师范大学自然科学学报》)
CAS
2013, Issue 4, pp. 51-54 (4 pages)
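Funding
Teaching Research Project of the Anhui Provincial Department of Education (2012JYXM762)
Natural Science Research Project of the Anhui Provincial Department of Education (KJ2013Z090)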
Keywords
ID3 algorithm
Rough set
Attribute reduction
Group counting
Data reduction
Stored procedure