Improved C4.5 Decision Tree Classification Algorithm in Data Mining (cited by: 25)
Abstract: The traditional C4.5 decision tree classification algorithm must scan the data multiple times, which makes it inefficient. To address this defect, the author proposes an improved C4.5 decision tree classification algorithm. First, the logarithmic operations involved in deriving the information gain are optimized to reduce the running time of the algorithm. Second, the simple splitting of continuous attributes used in the traditional algorithm is replaced by splitting at the optimal partition point, which improves efficiency. Experimental results show that, compared with the traditional C4.5 decision tree classification algorithm, the improved algorithm greatly improves execution efficiency and reduces the space required.
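The abstract does not give the details of the optimal partition point step, so the following is only a minimal sketch of the textbook way to pick a split threshold for one continuous attribute by information gain. The function names `entropy` and `best_threshold` are hypothetical, the use of midpoints between adjacent distinct values is an assumption, and the sketch does not reproduce the paper's logarithm optimization.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, information_gain) for one continuous attribute.

    Assumption: candidate thresholds are the midpoints between adjacent
    distinct values after sorting. Returns (None, 0.0) if no split helps.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    total = len(pairs)
    base = entropy([lab for _, lab in pairs])
    best = (None, 0.0)
    for i in range(1, total):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        # Weighted entropy of the two partitions produced by this split.
        remainder = (len(left) / total) * entropy(left) \
            + (len(right) / total) * entropy(right)
        gain = base - remainder
        if gain > best[1]:
            best = ((lo + hi) / 2, gain)
    return best


threshold, gain = best_threshold([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"])
print(threshold, gain)
```

For this toy input the classes separate cleanly between 2.0 and 3.0, so the chosen threshold is 2.5 with an information gain of 1 bit. This naive version recomputes both partition entropies at every candidate, which is the kind of repeated logarithm work the abstract says the improved algorithm aims to reduce.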
Author: WANG Wenxia (Department of Computer Science and Technology, Yuncheng University, Yuncheng 044000, Shanxi Province, China)
Source: Journal of Jilin University: Science Edition (indexed in CAS, CSCD, and the Peking University Core Journals list), 2017, No. 5, pp. 1274-1277 (4 pages)
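Funding: National Natural Science Foundation of China (Grant No. 11241005); 131 Talent Special Fund of Yuncheng University, Shanxi Province (Grant No. JG201634)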
Keywords: data mining; C4.5 decision tree; classification algorithm; discriminative ability measure; continuous attribute