Abstract
Apriori is the basic algorithm for mining frequent itemsets. Neither Apriori nor its optimized variants eliminate candidate itemset generation or repeated scans of the transaction database. Building on a close study of Apriori and its optimizations, this paper proposes a mining algorithm based on combining itemsets within single transactions: the algorithm combines the data items inside each transaction into itemsets and counts identical itemsets across the whole transaction database. It involves no iteration and generates no candidate itemsets, and mining all frequent itemsets requires only one scan of the transaction database, which improves the efficiency of frequent itemset mining.
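The idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, only one plausible reading of the abstract: every non-empty combination of items within a transaction is enumerated, identical itemsets are tallied across the database in a single pass, and the tallies are then filtered by a support threshold. The function name `mine_frequent_itemsets`, the use of an absolute-count `min_support`, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_itemsets(transactions, min_support):
    """Count every item combination inside each transaction in one pass.

    Hypothetical helper: min_support is an absolute count (an assumption
    of this sketch, not something the abstract specifies).
    """
    counts = Counter()
    for transaction in transactions:          # single scan of the database
        items = sorted(set(transaction))      # canonical order, no duplicates
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1          # tally identical itemsets
    # Keep only the itemsets that meet the support threshold.
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Made-up sample data for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
]
frequent = mine_frequent_itemsets(transactions, min_support=2)
for itemset, n in sorted(frequent.items()):
    print(itemset, n)
```

A transaction with n distinct items yields 2^n - 1 combinations, so a direct enumeration like this one is only practical when transactions are short; the abstract does not say how the paper handles long transactions.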
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2008, No. 1, pp. 196-197, 226 (3 pages in total)
Funding
Key project supported by the Natural Science Foundation of Chongqing (2006BA6015)
Keywords
Frequent itemsets
Apriori
Single-transaction itemset combination
Candidate items