Research on an Adaptive Deduplication Method for Fine-Grained Cloud Data
Abstract: Most conventional data deduplication methods are built on the Simhash algorithm. When applied to fine-grained cloud data, their deduplication coverage is limited, which leads to low deduplication quality and poor real-time performance. To address this, the paper develops an adaptive deduplication method for fine-grained cloud data on the basis of traditional deduplication methods. First, a similar-duplicate detection method examines the fine-grained cloud data comprehensively and uses string similarity to determine whether the dataset contains similar duplicate data. Second, the data found to have similar-duplicate properties is compressed, and deduplication features are extracted from it. On this basis, chunk-based duplicate data deduplication removes the similar duplicate data from the fine-grained cloud data. Experimental results show that, after deduplication with the proposed method, the space compression ratio exceeds 98% in all cases, removing duplicate data from fine-grained cloud data to the greatest extent possible.
Author: WANG Xiao-hong (王小红), Yichun Vocational and Technical College, Yichun 336000, Jiangxi
Source: Computer & Telecommunication (《电脑与电信》), 2023, No. 9, pp. 87-91 (5 pages)
Keywords: fine-grained; cloud data; deduplication; adaptive; method
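
The abstract names three ingredients: string-similarity detection of similar duplicates, chunk-based removal of duplicate data, and a space compression ratio used for evaluation. The sketch below illustrates each in Python and is not the paper's implementation. Everything in it is an assumption made for illustration: the use of `difflib.SequenceMatcher` as the similarity measure, the 0.9 threshold, fixed-size chunking with SHA-256 fingerprints, the function names, and the definition of the ratio as `1 - stored_size / original_size` (the abstract does not define it). The feature-extraction and compression step is omitted because the abstract gives no detail about it.

```python
import hashlib
from difflib import SequenceMatcher


def similar_duplicates(records, threshold=0.9):
    """Return index pairs (i, j), i < j, whose string similarity is >= threshold.

    Assumption: SequenceMatcher.ratio() stands in for the paper's
    string-similarity measure, which the abstract does not specify.
    """
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            ratio = SequenceMatcher(None, records[i], records[j]).ratio()
            if ratio >= threshold:
                pairs.append((i, j))
    return pairs


def chunk_dedup(data: bytes, chunk_size: int = 8):
    """Split data into fixed-size chunks and keep each distinct chunk once.

    Returns (store, recipe): store maps a SHA-256 digest to the chunk bytes,
    recipe is the ordered list of digests needed to rebuild the input.
    Assumption: fixed-size chunking; the paper's chunking scheme may differ.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # a repeated chunk is stored only once
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original byte string from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


def space_saving(original_size: int, stored_size: int) -> float:
    """Hypothetical metric: fraction of space saved, ignoring recipe overhead."""
    return 1 - stored_size / original_size


if __name__ == "__main__":
    records = ["cloud data block 0001", "cloud data block 0002", "unrelated text"]
    print(similar_duplicates(records))

    data = b"ABCDEFGH" * 100
    store, recipe = chunk_dedup(data, chunk_size=8)
    stored = sum(len(chunk) for chunk in store.values())
    print(restore(store, recipe) == data, space_saving(len(data), stored))
```

The pairwise comparison is quadratic in the number of records, so it only suits small inputs. The saving computed here also ignores the size of the recipe, so on real data the achievable ratio would be lower than this toy input suggests.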