Abstract
This paper proposes an improved association rule mining algorithm for databases of digital resource access logs. The algorithm combines transaction reduction with item reduction. Candidate itemsets and their support counts are generated by a join step after each transaction has been compressed, and each candidate itemset is identified by a key. This removes the pruning and string pattern matching steps of the Apriori algorithm, so the complete set of frequent patterns can be obtained quickly. The algorithm is particularly suited to analyzing personalized information needs across the massive digital resources of a digital library.
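The abstract does not give the algorithm's details, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes that "identified by a key" can be approximated by using a sorted tuple of items as a dictionary key, and that candidates are enumerated directly from each compressed transaction. The function name `frequent_itemsets` and the parameter `min_support` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset (sorted tuple): support count} for all frequent itemsets.

    Illustrative sketch only: level-wise counting with transaction and item
    reduction, using tuple keys instead of string matching (an assumption).
    """
    # De-duplicate and sort each transaction so every itemset has one canonical key.
    current = [sorted(set(t)) for t in transactions]
    result = {}
    k = 1
    while current:
        # Count every k-item combination found inside each compressed transaction.
        counts = Counter()
        for t in current:
            counts.update(combinations(t, k))
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        if not frequent:
            break
        result.update(frequent)
        # Item reduction: keep only items that occur in some frequent k-itemset.
        alive = {item for itemset in frequent for item in itemset}
        k += 1
        # Transaction reduction: drop transactions too short to hold a k-itemset.
        reduced = []
        for t in current:
            kept = [item for item in t if item in alive]
            if len(kept) >= k:
                reduced.append(kept)
        current = reduced
    return result
```

Because supports are counted from the combinations actually present in each reduced transaction, this sketch needs no separate pruning pass. For an access log, each transaction might be the set of resources one user requested in a session, and calling `frequent_itemsets(sessions, min_support=3)` would return every resource group requested together in at least three sessions.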
Source
《计算机工程与科学》
CSCD
2007, No. 1, pp. 83-85, 108 (4 pages in total)
Computer Engineering & Science