iHDFS: A Distributed File System Supporting Incremental Computing


Abstract: Big data sets are often processed repeatedly after only small changes, and this is a major form of big data processing. Because the data change incrementally, an incremental computing mode can greatly improve performance. HDFS is the distributed file system of Hadoop, the most popular platform for big data analytics. However, HDFS adopts a fixed-size chunking policy, which is inefficient for incremental computing. In this paper, we therefore propose iHDFS (incremental HDFS), a distributed file system that provides a basic guarantee for parallel processing of big data. iHDFS is implemented as an extension to HDFS. In iHDFS, the Rabin fingerprint algorithm is applied to achieve content-defined chunking. This policy makes chunk boundaries much more stable under small changes, so intermediate processing results can be reused efficiently and the performance of incremental data processing improves significantly. Experimental results demonstrate the effectiveness and efficiency of iHDFS.
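To illustrate why content-defined chunking is more stable than fixed-size chunking, the following is a minimal Python sketch. It is not the iHDFS implementation: it substitutes a simple polynomial rolling hash for a true Rabin fingerprint, and every name and constant in it (`cdc_chunks`, `WINDOW`, `MASK`, the chunk-size limits) is a hypothetical choice made for illustration only.

```python
import hashlib

# All constants below are illustrative assumptions, not values from the paper.
WINDOW = 16            # bytes covered by the rolling hash
BASE = 257             # polynomial base of the rolling hash
MOD = (1 << 31) - 1    # prime modulus
MASK = (1 << 6) - 1    # cut when the low 6 bits of the hash are zero
MIN_CHUNK = 16         # never cut before the window is full
MAX_CHUNK = 512        # force a cut if no boundary is found


def cdc_chunks(data):
    """Content-defined chunking: cut where the hash of the last WINDOW bytes matches."""
    chunks = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte that leaves the window
    for i, b in enumerate(data):
        if i - start >= WINDOW:
            # Window is full: drop the oldest byte before adding the new one.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        size = i - start + 1
        if (size >= MIN_CHUNK and (h & MASK) == 0) or size >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def fixed_chunks(data, size=64):
    """Fixed-size chunking, analogous in spirit to fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def shared_fraction(old_chunks, new_chunks):
    """Fraction of new chunks whose digest already exists among the old chunks."""
    seen = {hashlib.sha1(c).digest() for c in old_chunks}
    return sum(hashlib.sha1(c).digest() in seen for c in new_chunks) / len(new_chunks)
```

Calling `shared_fraction(cdc_chunks(old), cdc_chunks(new))` on two versions of a file that differ by one inserted byte would typically report that most chunks are unchanged, whereas the same comparison with `fixed_chunks` shifts every boundary after the insertion point. Unchanged chunks are the ones whose intermediate results an incremental computation could reuse.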

Authors: Zhenhua Wang, Qingsong Ding, Fuxiang Gao, Derong Shen, Ge Yu

Affiliation: College of Information Science and Engineering Key Laboratory of Medical Image Computing

Source: Proceedings of the International Conference of Pioneering Computer Scientists, Engineers and Educators (ICPCSEE), 2015, No. 1, pp. 44-45 (2 pages)

Keywords: incremental computing; distributed file system; big data; HDFS

Classification: C5 [Sociology]
