Abstract
In association rule mining, generating candidate frequent itemsets involves repeated computation and redundant candidates. This distorts the association features of the data and increases the number of times the transaction database must be scanned when computing support counts. To address this, the paper proposes a deep mining algorithm that resists feature distortion. The algorithm first preprocesses the data and computes the support count of each item in the transaction database, then compares each count with the minimum support. It uses the idea of the POS optimal solution to compute the optimal features and introduces an elimination factor to achieve deep data mining, which effectively improves the efficiency of the algorithm. Experimental data show that the algorithm mines more quickly and effectively than existing algorithms of the same kind.
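The first step named in the abstract, counting each item's support in the transaction database and comparing it with the minimum support, can be sketched as follows. This is only an illustrative sketch and not the paper's implementation: the function name `frequent_items`, the toy transactions, and the use of an absolute support count as the threshold are all assumptions, and the POS-based optimal feature step and the elimination factor are not modelled here.

```python
from collections import Counter


def frequent_items(transactions, min_support):
    """Return {item: support_count} for items with count >= min_support.

    `min_support` is treated as an absolute count (an assumption; the
    abstract does not say whether the threshold is a count or a ratio).
    """
    counts = Counter()
    for transaction in transactions:
        # Count each item at most once per transaction.
        counts.update(set(transaction))
    return {item: n for item, n in counts.items() if n >= min_support}


# Hypothetical toy transaction database (not taken from the paper).
transactions = [
    ["a", "b", "c"],
    ["a", "c"],
    ["a", "d"],
    ["b", "c"],
]

result = frequent_items(transactions, min_support=2)
print(result)
```

Every item is counted in a single pass over the database, so items that fall below the threshold can be discarded before any larger candidate itemsets are built.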
Source
《科技通报》
Peking University Core Journal
2013, Issue 12, pp. 45-47 (3 pages)
Bulletin of Science and Technology
Funding
Neijiang Vocational and Technical College online Q&A system
Keywords
data mining
feature distortion
association rules
optimal solution