Improved SpiderMine Algorithm Based on Cloud Computing in Big Graph Mining
(大图挖掘中一种基于云计算的改进SpiderMine算法)

Cited by: 1

Abstract: Existing graph mining algorithms struggle to mine frequent patterns efficiently from large-scale graphs in a cloud environment. To address this, the paper improves the SpiderMine algorithm and proposes a cloud-based variant, c-SpiderMine. First, a minimum cut algorithm divides the large graph into several subgraphs so that the partition/merge cost is minimized. Next, SpiderMine is applied to mine patterns, which significantly reduces the combinatorial complexity of generating large patterns. Finally, a pattern key (PK) function is used to preserve the patterns, guaranteeing that all patterns can be successfully recovered and merged. Simulation experiments on three real data sets show that c-SpiderMine efficiently mines the top-K large patterns in the cloud and outperforms SpiderMine in both memory usage and execution time across different data sizes and minimum support settings.
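The partition, mine, and merge flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not define the pattern key function, so `pattern_key` below is a hypothetical stand-in, the partition assignment `part_of` is assumed to come from a separate minimum cut step, and one-edge labeled patterns stand in for the large patterns that SpiderMine actually mines. All function and variable names are illustrative.

```python
from collections import Counter

def pattern_key(label_u, label_v):
    # Hypothetical key: order-independent, so A-B and B-A merge into one pattern.
    return tuple(sorted((label_u, label_v)))

def mine_partition(edges, labels):
    # Local support counts of one-edge patterns within a single partition.
    return Counter(pattern_key(labels[u], labels[v]) for u, v in edges)

def mine_partitioned(labels, edges, part_of, min_support, k):
    # Group edges by partition. Edges whose endpoints lie in different
    # partitions are cut edges and are handled only at merge time.
    local, cut = {}, []
    for u, v in edges:
        if part_of[u] == part_of[v]:
            local.setdefault(part_of[u], []).append((u, v))
        else:
            cut.append((u, v))

    merged = Counter()
    for part_edges in local.values():           # each could run on its own worker
        merged.update(mine_partition(part_edges, labels))
    merged.update(mine_partition(cut, labels))  # recover patterns split by the cut

    # Keep patterns meeting the minimum support, then return the top k.
    frequent = [(key, n) for key, n in merged.items() if n >= min_support]
    frequent.sort(key=lambda item: (-item[1], item[0]))
    return frequent[:k]

# Toy labeled graph, split into two partitions (assignment assumed given).
labels = {1: "A", 2: "B", 3: "A", 4: "B", 5: "C", 6: "A"}
edges = [(1, 2), (3, 4), (2, 3), (4, 5), (5, 6), (6, 4)]
part_of = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}

top = mine_partitioned(labels, edges, part_of, min_support=2, k=2)
print(top)
```

Because every worker emits the same key for the same pattern, merging reduces to summing counters, and the merged support matches what mining the unpartitioned graph would give. Fewer cut edges mean less work deferred to the merge step, which is the motivation for a minimum cut partition.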

Authors: Liu Ying (刘莹), Du Yizhi (杜奕智), Zou Le (邹乐)

Affiliation: Hefei University (合肥学院)

Source: Microcomputer Applications (《微型电脑应用》), 2016, Issue 1, pp. 33-37 (5 pages)

Funding: Hefei University institutional fund (14KY12ZR)

Keywords: Graph Mining; Cloud Computing; Frequent Patterns; Minimum Cut Algorithm; Pattern Key Function; Execution Time

Classification: TP393 [Automation and Computer Technology: Computer Application Technology]


