An algorithm for temporal constraint browsing pattern mining in Weblogs
Abstract To mine useful user browsing patterns from massive Web logs effectively, this paper adds sequential and temporal constraints to a fast association rule mining algorithm and presents FPMBTC, a browsing pattern mining algorithm based on temporal constraints. The algorithm simplifies candidate pattern generation: it scans the database only once to obtain the set of contiguous subsequences of every transaction, and it computes support with set intersection and difference operations. At the same time, it progressively corrects the session transaction times to obtain the valid time of each browsing pattern. Because site structure and Web logs change continually, an incremental update mining algorithm is also given. Experimental results show that, compared with related Apriori-like algorithms, FPMBTC needs less running time and scales better, and that the mined patterns are time-sensitive, which makes the approach suitable for Web logs that change continually and have temporal characteristics. The work is a useful reference for studying Web mining techniques and has theoretical and practical value for building real Web mining systems.
Source Journal of Harbin Institute of Technology (《哈尔滨工业大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core list; 2008, No. 9, pp. 1474-1480 (7 pages)
Funding National Natural Science Foundation of China (60603092); Harbin Normal University Research Fund (KM2007-17)
Keywords Weblog mining; frequent access patterns; valid time
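
The abstract describes the method only at a high level, so the following is a minimal Python sketch of the general idea and not the FPMBTC algorithm itself. It assumes each session is a hypothetical `(pages, start, end)` tuple with integer timestamps, enumerates the contiguous subsequences of every session in a single pass, keeps the set of session ids that contain each pattern, and takes the size of that set as the support. It also assumes that a pattern's valid time is the interval spanned by its supporting sessions, widened step by step as sessions are read. The names `mine_patterns`, `contiguous_subsequences`, `min_support`, and `max_len` are illustrative, and the paper's intersection and difference step is not reproduced here.

```python
from collections import defaultdict


def contiguous_subsequences(pages, max_len):
    """Yield every contiguous subsequence of `pages` (length 1..max_len) as a tuple."""
    n = len(pages)
    for i in range(n):
        for j in range(i + 1, min(n, i + max_len) + 1):
            yield tuple(pages[i:j])


def mine_patterns(sessions, min_support, max_len=3):
    """Return {pattern: (support, (valid_start, valid_end))} for frequent patterns.

    sessions: list of (pages, start, end) tuples (assumed layout).
    """
    session_ids = defaultdict(set)   # pattern -> ids of sessions containing it
    valid_time = {}                  # pattern -> (earliest start, latest end)

    # Single pass over the sessions, mirroring the "scan once" idea.
    for sid, (pages, start, end) in enumerate(sessions):
        # set() so a pattern repeated inside one session is counted once.
        for pattern in set(contiguous_subsequences(pages, max_len)):
            session_ids[pattern].add(sid)
            if pattern in valid_time:
                lo, hi = valid_time[pattern]
                # Assumed correction rule: widen the interval to cover this session.
                valid_time[pattern] = (min(lo, start), max(hi, end))
            else:
                valid_time[pattern] = (start, end)

    return {
        pattern: (len(ids), valid_time[pattern])
        for pattern, ids in session_ids.items()
        if len(ids) >= min_support
    }


# Three toy sessions with made-up page names and timestamps.
sessions = [
    (["A", "B", "C"], 10, 20),
    (["A", "B", "D"], 15, 30),
    (["B", "C"], 40, 50),
]
result = mine_patterns(sessions, min_support=2)
for pattern, (support, interval) in sorted(result.items()):
    print(pattern, support, interval)
```

With these toy sessions, the path `A, B` is supported by the first two sessions and gets the interval `(10, 30)`, while `A, B, C` appears in only one session and is dropped. Keeping session-id sets per pattern also means new sessions can be appended later without rescanning old ones, which is one plausible way to approach the incremental setting the abstract mentions.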