Distributed association rules mining algorithm by sampling and meta-learning (cited by: 3)
Abstract: This paper presents DASM, a distributed association rule mining algorithm for sites that do not share memory. DASM combines dynamic itemset counting with sampling and uses a meta-learning strategy to generate frequent itemsets. The concept of similar degree (a similarity measure) is introduced and used to improve the accuracy and completeness of the mined results. Theoretical analysis and experiments on datasets produced by the data generator from the IBM Almaden Quest research group indicate that DASM achieves high mining efficiency with low communication load. DASM is suited to applications with demanding efficiency requirements, where efficiency may matter more than exact results.
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 4, pp. 872-874, 877 (4 pages)
Funding: Natural Science Foundation of Henan Province (0211050110)
Keywords: sampling; meta-learning; dynamic itemset counting; similar degree; distributed association rule mining

References (12)

  • 1 TOIVONEN H.Sampling large databases for association rules[A].Proceedings of the 22nd Int'l Conference on Very Large Data Bases[C].Mumbai,India,1996.134-145.
  • 2 BRIN S,MOTWANI R,ULLMAN JD,et al.Dynamic itemset counting and implication rules for market basket data[A].Proceedings ACM SIGMOD International Conference on Management of Data[C].Tucson,Arizona,USA,1997.
  • 3 SAVASERE A,OMIECINSKI E,NAVATHE S.An efficient algorithm for mining association rules in large databases[A].Proceedings of the 21st Int'l Conference on Very Large Data Bases[C].Switzerland,1995.432-444.
  • 4 PARK JS,YU PS,CHEN MS.Mining association rules with adjustable accuracy[A].Proceedings of the Fourth Int'l Conf on Knowledge Discovery and Data Mining[C].New York,1998.
  • 5 CHEUNG DW,HAN JW,NG VT,et al.Fast distributed algorithm for mining association rules[A].Proceedings of Int'l Conference on Parallel and Distributed Information Systems[C].Florida,1998.31-44.
  • 6 SCHUSTER A,WOLFF R,TROCK D.A high performance distributed algorithm for mining association rules[A].Proceedings of the Third IEEE International Conference on Data Mining[C].2003.
  • 7 HAN JW,KAMBER M.Data Mining:Concepts and Techniques[M].Translated by FAN Ming,MENG Xiaofeng.Beijing:China Machine Press,2001.
  • 8 CHEN B,HAAS PJ,SCHEUERMANN P.A new two-phase sampling based algorithm for discovering association rules[A].Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining[C].Andreas,2002.
  • 9 WANG Chunhua,HUANG Houkuan.Distributed mining of association rules with adjustable accuracy using sampling[J].Journal of Computer Research and Development,2000,37(9):1101-1106. (cited by: 12)
  • 10 PRODROMIDIS L,CHAN PK.Meta-learning in distributed data mining systems:Issues and approaches[M].Advances in Distributed Data Mining,MIT:MIT Press,2000.81-113.

Second-level references (4)

  • 1 Park J S,Proc of the Fourth Int'l Conf on Knowledge Discovery and Data Mining,1998
  • 2 Chan P,Ph D dissertation,1996
  • 3 Cheung D W,Proc 1996 Int'l Conf Parallel and Distributed Information Systems,1996,p. 31
  • 4 Park J,Proc ACM SIGMOD Int Conf on Management of Data,1995,p. 175

Co-cited documents: 11

Documents cited together with this one: 22

Citing documents: 3

Second-level citing documents: 9