摘要
Mining association rules from large database is very costly. We develop a parallel algorithm for this task on shared-memory multiprocessor (SMP). Most proposed parallel algorithms for association rules mining have to scan the database at least two times. In this article, a parallel algorithm Scan Once (SO) has been proposed for SMP, which only scans the database once. And this algorithm is fundamentally different from the known parallel algorithm Count Distribution (CD). It adopts bit matrix to store the database information and gets the support of the frequent itemsets by adopting Vector-And-Operation, which greatly improve the efficiency of generating all frequent itemsets. Empirical evaluation shows that the algorithm outperforms the known one CD algorithm.
Mining association rules from large database is very costly.We develop a parallel algorithm for this task on sharedmemory multiprocessor (SMP). Most proposed parallel algorithms for association rules mining have to scan the database at least two times. In this article, a parallel algorithm Scan Once (SO) has been proposed for SMP,which only scans the database once. And this algorithm is fundamentally different from the known parallel algorithm Count Distribution (CD). It adopts bit matrix to store the database information and gets the support of the frequent itemsets by adopting Vector-And-Operation, which greatly improve the efficiency of generating all frequent itemsets.Empirical evaluation shows that the algorithm outperforms the known one CD algorithm.