
MapReduce-Based Incremental Updating Algorithm for Association Rules
Abstract: Cloud computing, with its powerful storage and computing capacity, has become an effective way to tackle massive data mining problems. FUP, the classic incremental updating algorithm for association rules, must scan the original dataset repeatedly, which makes it unsuitable for massive data. To improve the efficiency of incrementally updating association rules over massive data, this paper combines the FUP algorithm with the MapReduce programming model of cloud computing and proposes MRFUP, a MapReduce-based incremental updating algorithm for association rules. MRFUP scans the original dataset only once and takes full advantage of the storage and parallel computing capacity that cloud computing provides. Experimental results on Hadoop show that MRFUP improves both the capacity and the efficiency of processing massive data and is suitable for mining association rules from massive data.
Source: Computer Technology and Development (《计算机技术与发展》), 2012, No. 4, pp. 115-118, 122 (5 pages)
Funding: National "973" Program of China (2011CB302903); National Natural Science Foundation of China (61073189)
Keywords: massive data mining; cloud computing; MapReduce; association rules; incremental updating
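
The abstract does not spell out how MRFUP works internally, so the following is only a minimal sketch of the general FUP-style idea it describes, with the map and reduce steps simulated in a single Python process rather than on Hadoop. All function names, the `max_len` cap on itemset size, and the toy data are assumptions made for illustration; they are not taken from the paper. The sketch assumes the frequent itemsets of the original dataset and their support counts were stored at the same minimum support. Itemsets that were already frequent are then updated from the increment alone, and the original dataset is scanned once, only for new candidates that are frequent within the increment.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def map_count(transaction, max_len, wanted=None):
    """Map step: emit (itemset, 1) for each itemset found in one transaction."""
    items = sorted(set(transaction))
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            if wanted is None or combo in wanted:
                yield combo, 1


def reduce_count(pairs):
    """Reduce step: sum the emitted counts per itemset."""
    totals = Counter()
    for itemset, n in pairs:
        totals[itemset] += n
    return totals


def run_job(transactions, max_len, wanted=None):
    """Simulate one MapReduce counting job over a list of transactions."""
    return reduce_count(
        pair for t in transactions for pair in map_count(t, max_len, wanted)
    )


def incremental_update(old_db, old_frequent, new_db, min_support, max_len=2):
    """Return {itemset: count} of itemsets frequent in old_db + new_db.

    old_frequent maps each itemset that was frequent in old_db (same
    min_support, size <= max_len) to its support count there.
    """
    threshold = ceil(min_support * (len(old_db) + len(new_db)))

    # Job 1: count itemsets in the increment only.
    inc_counts = run_job(new_db, max_len)

    updated = {}
    # Previously frequent itemsets need no scan of the original dataset.
    for itemset, old_count in old_frequent.items():
        count = old_count + inc_counts.get(itemset, 0)
        if count >= threshold:
            updated[itemset] = count

    # An itemset that was infrequent before can only become frequent overall
    # if it is frequent within the increment itself.
    inc_threshold = ceil(min_support * len(new_db))
    candidates = {
        s for s, c in inc_counts.items()
        if s not in old_frequent and c >= inc_threshold
    }

    # Job 2: a single scan of the original dataset, restricted to candidates.
    if candidates:
        old_counts = run_job(old_db, max_len, wanted=candidates)
        for s in candidates:
            count = old_counts.get(s, 0) + inc_counts[s]
            if count >= threshold:
                updated[s] = count
    return updated


# Toy data (hypothetical): four original transactions plus two new ones.
old_db = [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]]
new_db = [["a", "c"], ["a", "c"]]
min_support = 0.5

old_frequent = {
    s: c for s, c in run_job(old_db, 2).items()
    if c >= ceil(min_support * len(old_db))
}
print(incremental_update(old_db, old_frequent, new_db, min_support))
```

On a real cluster each `run_job` call would correspond to a separate MapReduce job, and the candidate set passed as `wanted` would have to be shipped to every mapper; how the paper handles that distribution is not stated in the abstract.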