Abstract
Mining frequent access patterns is an important task in Web log mining. To address the shortcomings of Apriori-like algorithms and the GITC algorithm, this paper presents BIPL, an algorithm for mining frequent Web access patterns that is based on a parents list and intersection and scans the database only once. It first intersects the users' access patterns pairwise to generate candidate access patterns, saving the parent patterns of each candidate during the intersection process. The support count of each candidate access pattern is then computed by a simple summation. Finally, theoretical analysis and experiments show that the algorithm is stable and efficient.
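The pairwise-intersection idea in the abstract can be illustrated with a minimal sketch. This is not the paper's BIPL implementation: the abstract does not spell out how the parents list is summed, so the support step here substitutes a direct containment count. Treating each access pattern as a set of page identifiers, and the names `mine_by_pairwise_intersection`, `sessions`, and `min_support`, are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def mine_by_pairwise_intersection(sessions, min_support):
    """Return {candidate: support} for candidates meeting min_support.

    sessions: list of sets of page identifiers (assumed representation).
    """
    patterns = [frozenset(s) for s in sessions]
    # candidate pattern -> indices of the sessions that produced it ("parents")
    parents = defaultdict(set)

    # Every non-empty session is a candidate with itself as parent.
    for i, p in enumerate(patterns):
        if p:
            parents[p].add(i)

    # Intersect each pair of sessions once and record both parents.
    for (i, a), (j, b) in combinations(enumerate(patterns), 2):
        common = a & b
        if common:
            parents[common].update((i, j))

    # Support step: direct containment count, used here as a stand-in
    # for the summation over parents that the abstract mentions.
    result = {}
    for cand in parents:
        support = sum(1 for p in patterns if cand <= p)
        if support >= min_support:
            result[cand] = support
    return result


if __name__ == "__main__":
    demo = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c"}, {"b", "d"}]
    for cand, supp in mine_by_pairwise_intersection(demo, 2).items():
        print(sorted(cand), supp)
```

The sketch only reports patterns that arise as a session or as the intersection of two sessions; a pattern that never appears in that form is not listed, even if it is frequent.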
Source
《计算机工程与应用》
CSCD
Peking University Core Journal
2008, No. 23, pp. 136-138, 156 (4 pages)
Computer Engineering and Applications
Funding
National Natural Science Foundation of China (No. 50604012)