期刊文献+

MR-GSpar:一种基于MapReduce的大图稀疏化算法 被引量:5

MR-GSpar:A Distributed Large Graph Sparsification Algorithm Based on MapReduce
下载PDF
导出
摘要 图的稀疏化是图聚类分析中数据预处理的关键操作,已得到广泛的关注。针对图数据日益普及、规模不断增大的现状,提出了一种基于MapReduce的面向大规模图的稀疏化算法,即MR-GSpar算法。该算法在MapReduce并行计算框架的基础上,通过对传统的最小哈希(Minhash)算法的并行化改造,使其可在分布式的集群环境中实现对大规模图数据的高效稀疏化处理。真实数据集上的实验表明了该算法的可行性与有效性。 As an important data pre-processing operation, graph sparsification has attracted wide attentions from the ar-ea of database. Nowadays the graph data is becoming popular and scale. Thus this paper proposed an efficient parallel graph sparsification algorithm, namely the MR-GSpar algorithm. The MR-GSpar algorithm is presented by reforming the traditional Minhash algorithm into a parallel and distributed algorithm using MapReduce framework,which can ar- chive efficient sparsification on large-scale graph data in a large machine cluster environment. Experiments on real data- sets show that the algorithm is feasible and effective.
出处 《计算机科学》 CSCD 北大核心 2013年第10期190-193,212,共5页 Computer Science
基金 国家自然科学基金项目(61070031 61070032)资助
关键词 图稀疏化 Minhash MAPREDUCE框架 MR-GSpar算法 Graph sparsification, Minhsh, MapReduce framework, MR-GSpar algorithm
  • 相关文献

参考文献7

  • 1Satuluri V, Parthasarathy S. Scalable graph clustering using sto- chastic flows: applications to community discovery[C]// ACM SlGKDD. 2009 : 737-746.
  • 2Kulis B, Basu S, Dhillon I, et al. Semi-supervised Graph Cluste- ring: A Kernel Approach[J]. Machine I_earning, 2009,74 (1) : 1- 22.
  • 3Satuluri V,Parthasarathy S, Ruan Y. Local graph Sparsification for Sealable Clustering[C]//SIGMOD. 2011:737-746.
  • 4李建江,崔健,王聃,严林,黄义双.MapReduce并行编程模型研究综述[J].电子学报,2011,39(11):2635-2642. 被引量:187
  • 5Lin J, Schatz M. Design patterns for efficient graph algorithms in mapreduce[C]//MLG. 2010: 78-85.
  • 6尹丹,高宏,邹兆年.一种新的高效图聚集算法[J].计算机研究与发展,2011,48(10):1831-1841. 被引量:8
  • 7Lv Qin, Josephson W, Wang Zhe, et al. Multi-probe LSH: effi- cient indexing for high-dimensional similarity seareb[C] // Proc of the 33rd Int Conf on Very Large Data Bases(VLDB'07). Vien- na Austria: VLDB Endowment, 2007: 950-961.

二级参考文献62

  • 1宁焕生,张瑜,刘芳丽,刘文明,渠慎丰.中国物联网信息服务系统研究[J].电子学报,2006,34(B12):2514-2517. 被引量:151
  • 2CNN Facebook nearly as large as U. S. population [OL]. (2009-09 -16)[2011-04-27]. http..//edition, cnn. com/2009/ TECH/09/16/facebook. profit/.
  • 3Raghavan S, Molina G H. Representing Web graphs [C] // Proc of Int Conf on Data Engineering 2003. Piscataway, NJ: IEEE, 2003:405-416.
  • 4Rjeili A A, Karypis G. Multilevel algorithms for partitioning power-law graphs [C] //Proc of Int Parallel and Distributed Processing Symp 2006. Piscataway, NJ: IEEE, 2006: 16- 575.
  • 5Tian Y Y, Hankins R A, Patel M J. Effcient aggregation for graph summarization [C] //Proc of ACM SIGMOD Int Conf on Management of Data. New York: ACM, 2008 : 567-580.
  • 6Zhang N, Tian Y Y, Patel M J. Discovery-driven graph summarization [C] //Proc of Int Conf on Data Engineering 2010. Piscataway, NJ: IEEE, 2010: 880-891.
  • 7Chakrabarti D, Faloutsos C. Graph mining: Laws, generators, and algorithms [J]. ACM Computing Surveys, 2006, 38(1): article No. 2.
  • 8Newman M E J, The structure and function of complex networks [J]. ACM Sigcsim Installation Management Review, 2003, 45: 167-256.
  • 9Chakrabarti D, Faloutsos C, Zhan Y. Visualization of large networks with min-cut plots, A-plots and R MAT [J]. Int Journal of Man-machine Studies, 2007, 65(5): 434-445.
  • 10Jun H, Wang W, Prins J, et al. Spin: Mining maximal frequent subgraphs from graph databases [C] //Proc of Knowledge Discovery and Data Mining 2004. New York: ACM, 2004:581-586.

共引文献193

同被引文献39

  • 1邓波,张玉超,金松昌,林旺群.基于MapReduce并行架构的大数据社会网络社团挖掘方法[J].计算机研究与发展,2013,50(S2):187-195. 被引量:10
  • 2Lin J, Schataz M. Design patterns for efficient graph algorithms in mapreduce [ C ]//Proceedings of the 8th Workshop on Mhaing and Learning with Graphs,Wash- ington D.C.,July 24-25,2010:78-85.
  • 3Luan T,Lu R X,Shen X M,et al.Social on the road: ena- bling secure and efficient social networking on highways [J].IEEE Wireless Communications,2015,22(1):44-51.
  • 4Yang H C,Dasdan A,Hsiao H L, et al. Map-Reduce- Merge:Sknplified rehtional data processing on large chsters[C]//Proceeding of the 2007 ACM SIGMOD In- ternational Conference on Management of Data,Beijmg, June 11-14,2007:1029-1040.
  • 5Vrba Z,Halvorsen P,Griwodz G,et al.Kahn process net- works are a flexible alternative to mapreduce[C]//Proc of the 11th IEEE International Conference on High Per- formance Computing and Communications, Seoul, June 25-27,2009:154-162.
  • 6Sandholm T,Lai K.MapReduce opthvimtion using regu- lated dynamic prioritization [ J]. Performance Evaluation Review,2009,37(1):299-310.
  • 7Liu Q,Todman T,Luk W.Combining optimizations in au- tomated low power design[C]//Proe of.Design, Automa- tion and Test in Europe,Dresden,Germany,March 8-12, 2010:1791-1796.
  • 8Chen Qnan,Zhang Daqiang,Guo Mingyi, et al.SAMR: A self-adaptive mapreduce scheduling algorithm in hetero- geneous environment[C]//Proc of the 10th IEEE Interna- tional Conference on Computer and Information Technol- ogy,Bradford,UK,June 29-July 1,2010:2736-2743.
  • 9Nicolas G P,Aida D H G. Scaling up data mining algo- rithms: Review and taxonomy[J].Process in Artificial In- telligence,2012,1 (1):71-87.
  • 10Chen E.All-disturbances sketches revisited:HIP esthna- tors for massive graphs analysis[J].IEEE Transactions on Knowledge and Data Engineering,2015(99):1.

引证文献5

二级引证文献3

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部