A Reliability Model for Large-Scale Replicated Storage Systems (cited by: 7)

An Analytical Model for Large-Scale Storage System with Replicated Data

Abstract: Reliability is critical to large-scale storage systems. As device failures become increasingly frequent in such systems, replication has become one of the mainstream techniques for improving reliability. Based on a Markov model, this paper builds a theoretical model for measuring the reliability of multi-replica storage systems. The model captures the effect of failure detection latency on system reliability. It can also quantify how key storage system parameters, such as system scale, replica rank, per-node capacity, per-node mean time to failure, mean data object size, and mean repair bandwidth, affect reliability, thereby providing a theoretical basis for storage system design.

Abstract (as published in English): Storage systems keep growing, so the number of storage devices is increasing rapidly, which makes storage device failures quite frequent in large-scale storage systems. Data replication is being widely adopted to enhance storage system reliability. When designing a large-scale storage system, many factors can affect its reliability, such as failure detection latency, storage node capacity, data object size, and replica rank. On the other hand, system reliability cannot be measured precisely by experiment, so a theoretical model is needed to evaluate it. In this paper, an analytical framework is presented for evaluating the reliability of large-scale storage systems that use replication to protect data. Based on the Markov model, this analytical model provides quantitative answers about the impact of a range of design factors on storage system reliability, such as the rank of the replicated data, the capacity of the storage system, the capacity of storage nodes, the size of data objects, the repair bandwidth, and the mean failure detection latency. Hence, many storage system design tradeoffs can be reasoned about with this framework.
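To give a feel for the kind of calculation a Markov reliability model supports, here is a minimal sketch. It is not the model from the paper, whose equations are not reproduced in this record. It assumes a simple birth-death chain over the number of lost replicas of one data object, with exponentially distributed node failures and repairs, and it folds the failure detection delay into the mean repair time. The function name `mean_time_to_data_loss` and all parameter names are hypothetical.

```python
def mean_time_to_data_loss(replicas, node_mttf, detect_delay, repair_time):
    """Expected time until every replica of one object is lost.

    Assumed model (illustrative only): state i = number of lost replicas.
    From state i, the next replica fails at rate (replicas - i) / node_mttf,
    and one lost replica is restored at rate 1 / (detect_delay + repair_time).
    State `replicas` is absorbing (data loss). All times share one unit.
    """
    fail_rate_per_replica = 1.0 / node_mttf
    # Assumption: detection delay simply lengthens the mean repair time.
    repair_rate = 1.0 / (detect_delay + repair_time)

    total = 0.0  # accumulates T_0 = sum of (T_i - T_{i+1})
    step = 0.0   # holds T_{i-1} - T_i from the previous iteration
    for lost in range(replicas):
        fail = (replicas - lost) * fail_rate_per_replica
        rep = repair_rate if lost > 0 else 0.0
        # Balance equation for state `lost`, rewritten in differences:
        # fail * (T_i - T_{i+1}) = 1 + rep * (T_{i-1} - T_i)
        step = (1.0 + rep * step) / fail
        total += step
    return total


if __name__ == "__main__":
    # Compare replica ranks and detection delays (hours); values are made up.
    for k in (2, 3):
        for delay in (1.0, 24.0):
            hours = mean_time_to_data_loss(k, 100_000.0, delay, 6.0)
            print(f"replicas={k} detect_delay={delay}h -> {hours:.3e} h")
```

For two replicas the recursion reduces to the familiar closed form (3λ + μ) / (2λ²), where λ is the per-node failure rate and μ the repair rate, which is a convenient sanity check. In this sketch a longer detection delay lowers μ and therefore shortens the expected time to data loss.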

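Authors: Mu Fei, Xue Wei, Shu Jiwu, Zheng Weimin

Affiliation: Department of Computer Science and Technology, Tsinghua University; Tsinghua National Laboratory for Information Science and Technology (in preparation)

Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2009, Issue 5, pp. 756-761 (6 pages)

Funding: National Natural Science Foundation of China (90612018); Major Project of the National Key Technology R&D Program for the 11th Five-Year Plan, Ministry of Science and Technology (2006BAA02A17); National Basic Research Program of China (973 Program) (2004CB318205)

Keywords: storage system; reliability; replica; Markov model; failure detection

Classification: TP302.1 [Automation and Computer Technology / Computer System Architecture]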
