期刊文献+

大图挖掘中一种基于云计算的改进SpiderMine算法 被引量:1

Improved Spider Mine Algorithm Based on Cloud Computing in Big Graph Mining
下载PDF
导出
摘要 现有的图挖掘算法在云环境下难以有效地进行大规模图形的高频模式挖掘。为此,对Spider Mine算法做了改进,提出一种基于云的Spider Mine算法(c-Spider Mine)。首先,利用最小切割算法将大规模图形数据分为多个子图,使分区/融合成本最小,然后,利用Spider Mine进行模式挖掘,显著降低了大型模式生成时的组合复杂度。最后,采用一种模式键函数来保存模式,以保证所有模式可被成功恢复和融合。基于3种真实数据集的仿真实验结果表明,c-Spider Mine可高效挖掘云环境下的前K个大型模式,在不同数据规模和最小支持设置条件下,c-Spider Mine在内存使用和运行时间方面的性能均优于Spider Mine。 The existing graph mining algorithms in a cloud environment are difficult to carry out mining the high frequent patterns of a massive graph.To solve this problem,this paper has made the improvement to the Spider Mine algorithm,and an improved Spider Mine algorithm is proposed based on the cloud(c-Spider Mine).Firstly,one big graph data is divided into several sub graphs by minimum cut algorithm to minimize partition/merge costs.And then it exploits Spider Mine to mine the patterns,which generates large patterns with much lower combinational complexity.Finally,a pattern key(PK) function is proposed to preserve the patterns,which guarantees that all patterns can be successfully recovered and merged.The experiments are conducted with three real data sets,and the experimental results demonstrate that c-Spider Mine can efficiently mine top-k large patterns in the cloud,and performs well in memory usage and execution time with different data sizes and minimum supports than the Spider Mine.
机构地区 合肥学院
出处 《微型电脑应用》 2016年第1期33-37,共5页 Microcomputer Applications
基金 合肥学院校级基金(14KY12ZR)
关键词 图挖掘 云计算 高频模式 最小切割算法 模式键函数 运行时间 Graph Mining Cloud Computing Frequent Patterns Minimum Cut Algorithm Pattern Key Function Execution Time
  • 相关文献

参考文献11

  • 1孙鹤立,陈强,刘玮,黄健斌,邹建华.利用MapReduce平台实现高效并行的频繁子图挖掘[J].计算机科学与探索,2014,8(7):790-801. 被引量:4
  • 2Anchuri P, Zaki M J, Barkol O, et al. Approximate graph mining with label costs[C]. Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2013: 518-526.
  • 3Kang U, Akoglu L, Chau D H P. Big Graph Mining: Algorithms, Anomaly Detection, and Applications [J].Proceedings of the ACM ASONAM, 2013, 13: 25-28.
  • 4Zhu F, Qu Q, Lo D, et al. Mining top-k large structural patterns in a massive network[J]. Proceedings of the VLDB Endowment, 2011, 4(11): 807-818.
  • 5郭鑫,董坚峰,周清平.自适应云端的大规模导出子图提取算法[J].计算机科学,2014,41(6):155-160. 被引量:7
  • 6Akoglu L, Chau D H, Kang U, et al. Opavion: Mining and visualization in large graphs[C]. Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. ACM, 2012: 717-720.
  • 7Yuan J, Bae E, Tai X C. A study on continuous max-flow and rain-cut approaches[C]. Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010: 2217-2224.
  • 8Sarrna A D, Afrati F N, Salihoglu S, et al. Upper and lower bounds on the cost of a map-reduce computation[C] Proceedings of the VLDB Endowment. VLDB Endowment, 2013, 6(4): 277-288.
  • 9Borgelt C, Meinl T, Berthold M. Moss: a program for molecular substructure mining[C]. Proceedings of the 1st international workshop on open source data mining: frequent pattern mining implementations. ACM, 2005: 6-15.
  • 10Borgelt C. Canonical forms for frequent graph mining [M]. Advances in Data Analysis. Springer Berlin Heidelberg, 2007: 337-349.

二级参考文献38

  • 1Dehaspe L, Toivonen H, King R. Finding frequent substruc- tures in chemical compounds[C]//Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '04), Seattle, USA, Aug 22-25, 2004. New York, NY, USA: ACM, 2004: 30-36.
  • 2Fatta G, Berthold M. Dynamic load balancing for the distrib- uted mining of molecular structures[J]. IEEE Transactions on Parallel and Distributed System, 2006, 17(8): 773-785.
  • 3Wang Ke, Liu Huiqing. Discovering typical structures of documents: a road map approach[C]//Proceedings of the 21st Annual International ACM Conference on Research and Development in Information Retrieval (SIGIR '98), Melbourne, Australia, Aug 24-28, 1998. New York, NY, USA: ACM, 1998: 146-154.
  • 4Kriegel H, Schonauer S. Similarity search in structured data[C]// Proceedings of the 5th International Conference on Data Ware- housing and Knowledge Discovery (DaWak '03), Prague, Czech Republic, 2003. Berlin: Springer-Verlag, 2003: 309-319.
  • 5Fischer A, Riesen K, Bunke H. An experimental study of graph classification using prototype selection[C]//Procee- dings of the 19th International Conference on Pattern Rec- ognition (ICPR '08), Florida, USA, 2008. Washington, DC, USA: IEEE Computer Society, 2008: 1-4.
  • 6Huang Jianbin, Sun Heli, Han Jiawei, et al. SHR/NK: a struc- tural clustering algorithm for detecting hierarchical commu- nities in networks[C]//Proceedings of the 19th ACM Inter-national Conference on Information and Knowledge Manage- ment (CIKM '10), Toronto, Canada, Oct 26-30, 2010. New York, NY, USA: ACM, 2010: 219-228.
  • 7Huang Jianbin, Sun Heli, Song Qinbao, et al. Revealing density-based clustering from the core-connected tree of a network[J]. IEEE Transactions on Knowledge and Data Engineering, 2013, 25(8): 1876-1889.
  • 8Yan Xifeng, Yu P, Han Jiawei. Graph indexing: a frequent structure-based approach[C]//Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data (SIGMOD '04), Baltimore, USA, Jul 27-29, 2004. New York, NY, USA: ACM, 2004: 335-346.
  • 9Williams D W, Huan Jun, Wang Wei. Graph database indexing using structured graph decomposition[C]//Proceedings of the 23rd IEEE International Conference on Data Engineering (ICDE '07), Istanbul, Turkey, Apr 15-20, 2007. Washington, DC, USA: IEEE Computer Society, 2007: 231-235.
  • 10Hill S, Srichandan B, Sunderraman R. An iterative Map- Reduce approach to frequent subgraph mining in biological datasets[C]//Proceedings of the 3rd ACM Conference on Bioinformatics, Computational Biology and Biomedicine (BCB '12), Orlando, USA, Oct 7-10, 2012. New York, NY, USA: ACM, 2012: 661-666.

共引文献8

同被引文献8

引证文献1

二级引证文献9

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部