Abstract
Binary-based association rule mining algorithms were proposed to make it easier to generate candidate frequent itemsets and to compute itemset support counts. When searching for candidate frequent itemsets, however, these algorithms still start from set theory and rely on the traditional approach of searching supersets or subsets, which limits their efficiency to some extent. This paper therefore proposes a binary-based intercrossing algorithm for mining association rules. It generates candidate frequent itemsets automatically by alternating between ascending and descending numerical values, which shrinks the candidate search space, and it uses numerical characteristics when computing support counts to reduce the number of transactions that must be scanned. Together these changes markedly improve efficiency. Experimental results show that the algorithm is fast and effective compared with existing binary-based association rule mining algorithms.
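The abstract does not spell out the encoding or the search order, so the following Python sketch only illustrates the general idea of binary-based support counting under stated assumptions. It assumes each item maps to one bit, each transaction and candidate itemset is an integer bitmask, and the "numerical characteristic" used for pruning is the simple fact that a transaction containing a candidate must have an integer value at least as large as the candidate's. The function names (`encode`, `support`, `frequent_itemsets`) are hypothetical, and the exhaustive ascending enumeration below is a stand-in: it does not reproduce the paper's crosswise ascending/descending candidate generation.

```python
from typing import Dict, Iterable, List, Sequence


def encode(items: Iterable[str], universe: Sequence[str]) -> int:
    """Encode a set of items as an integer bitmask (universe[i] maps to bit i)."""
    index = {item: i for i, item in enumerate(universe)}
    mask = 0
    for item in items:
        mask |= 1 << index[item]
    return mask


def support(candidate: int, transactions_desc: List[int]) -> int:
    """Count transactions that contain every item of `candidate`.

    `transactions_desc` must be sorted in descending numeric order.
    Assumed pruning rule: a superset bitmask is never numerically smaller
    than the candidate, so scanning stops at the first smaller transaction.
    """
    count = 0
    for t in transactions_desc:
        if t < candidate:
            break  # all remaining transactions are smaller and cannot contain it
        if t & candidate == candidate:
            count += 1
    return count


def frequent_itemsets(transactions: Iterable[int], n_items: int,
                      min_support: int) -> Dict[int, int]:
    """Enumerate candidates by ascending integer value (illustrative only)."""
    ordered = sorted(transactions, reverse=True)
    result = {}
    for candidate in range(1, 1 << n_items):
        s = support(candidate, ordered)
        if s >= min_support:
            result[candidate] = s
    return result


universe = ["a", "b", "c"]
database = [encode(t, universe) for t in (["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"])]
frequent = frequent_itemsets(database, len(universe), min_support=2)
```

With this toy database, the itemset {a, b} is encoded as 3 and {b, c} as 6. Counting support for {b, c} stops after two transactions because the next one, 5, is already smaller than 6, which is the kind of reduction in scanned transactions the abstract refers to.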
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2009, No. 7, pp. 141-145 (5 pages)
Computer Engineering and Applications
Keywords
association rules
intercrossing mining
value descending
value ascending
digital character
binary