HBase-Based Parallel BFS Method (基于HBase的并行BFS方法)

Cited by: 4

Abstract: NoSQL databases, as a next-generation storage model for massive data, play an important role in both scientific and commercial computing and have attracted wide attention from academia and industry. This paper presents a new parallel method, based on the NoSQL database HBase, for computing the shortest-path tree. First, the Watts-Strogatz model is used to build a mathematical model of the giant network, which gives the network model a degree of clustering. Second, the recently released HBase Coprocessor is used to simplify and improve the parallel breadth-first search (BFS) method and raise its computational efficiency. In addition, a large number of experiments were designed and carried out; they yield the shortest-path tree of the giant network and verify the correctness and validity of the algorithm. A comparison with other path algorithms verifies the efficiency of the algorithm.
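The two steps named in the abstract can be illustrated with a minimal in-memory sketch. It is not the authors' HBase Coprocessor implementation: the function names `watts_strogatz` and `bfs_tree` are hypothetical, the graph lives in a Python dictionary rather than in HBase tables, and the per-level frontier expansion that a cluster would run in parallel is executed sequentially here. In an unweighted graph, the parent pointers produced by BFS form a shortest-path tree rooted at the source.

```python
import random
from collections import defaultdict


def watts_strogatz(n, k, p, seed=0):
    """Build a Watts-Strogatz small-world graph as an adjacency map.

    Starts from a ring of n nodes, each linked to its k nearest
    neighbours (k even), then rewires each lattice edge with probability p.
    """
    rng = random.Random(seed)
    adj = defaultdict(set)
    for u in range(n):
        for j in range(1, k // 2 + 1):
            v = (u + j) % n
            adj[u].add(v)
            adj[v].add(u)
    for u in range(n):
        for j in range(1, k // 2 + 1):
            v = (u + j) % n
            if rng.random() < p:
                # Pick a new endpoint that is neither u nor already adjacent to u.
                candidates = [w for w in range(n) if w != u and w not in adj[u]]
                if candidates:
                    w = rng.choice(candidates)
                    adj[u].discard(v)
                    adj[v].discard(u)
                    adj[u].add(w)
                    adj[w].add(u)
    return adj


def bfs_tree(adj, source):
    """Level-synchronous BFS returning (parent, dist) maps.

    Each pass of the while loop expands one whole frontier; this is the
    unit of work a parallel version would split across workers.
    """
    parent = {source: None}
    dist = {source: 0}
    frontier = [source]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in dist:  # first visit fixes the shortest distance
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier
    return parent, dist


if __name__ == "__main__":
    graph = watts_strogatz(n=1000, k=6, p=0.1, seed=42)
    parent, dist = bfs_tree(graph, source=0)
    print("reachable nodes:", len(dist), "tree depth:", max(dist.values()))
```

Following the `parent` map from any reachable node back to the source gives one shortest path; rewiring can in principle disconnect the graph, so unreachable nodes are simply absent from both maps.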

Authors: Qiang Yan (强彦), Lu Junzuo (卢军佐), Liu Tao (刘涛), Pei Bo (裴博)

Affiliation: College of Computer Science and Technology, Taiyuan University of Technology

Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2013, No. 3, pp. 228-231 (4 pages)

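Funding: Supported by the National Natural Science Foundation of China (61202163, 61240035), the Natural Science Foundation of Shanxi Province (2012011015-1), and the Shanxi Province Science and Technology Key Project (20120313032-3)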

Keywords: HBase, Coprocessor, parallel BFS (parallel breadth-first search), MapReduce, NoSQL database

Classification: TP301 [Automation and Computer Technology: Computer Architecture]


