Reliability Mechanisms for Very Large Parallel File Systems
Abstract: The idea of distributed sparing is applied to a large-scale parallel file system to provide a fast recovery mechanism for systems that store data with redundancy. A Markov model is used to build an analytic model of the distribution of the mean time until data loss, which guides how to balance the requirement for high data reliability against the overhead of redundant data. According to the reliability analysis, an m-n scheme under the fast recovery mechanism can meet the reliability requirements of large-scale storage systems, provided that n ≥ m+2 and the computation time needed to recover data is negligible compared with the disk I/O time.
Source: Computer Engineering (《计算机工程》; indexed in EI, CAS, CSCD, PKU Core), 2006, No. 9, pp. 25-27 (3 pages)
Funding: National "863" Program of China (2002AA104420, 2002AA1Z2101)
Keywords: reliability; object-based storage system; parallel file system
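
The abstract's Markov-model argument can be illustrated with a minimal sketch. This is not the paper's model, which is not reproduced in this record. It assumes that "m-n" means any m of the n blocks in a redundancy group are enough to rebuild the data, that disks fail independently at a constant rate, and that one lost block is rebuilt at a time at a constant rate. The function name `mttdl` and the rate values in the demo are hypothetical. Under those assumptions the group is a birth-death chain whose state is the number of failed disks, and the mean time to data loss is the sum of the expected first-passage times from each state to the next.

```python
def mttdl(n: int, m: int, fail_rate: float, repair_rate: float) -> float:
    """Mean time to data loss for one redundancy group (illustrative model).

    Assumptions (not taken from the paper): any m of the n blocks suffice,
    disks fail independently at `fail_rate`, and a single rebuild runs at a
    time at `repair_rate`. State k is the number of failed disks; data is
    lost when k reaches n - m + 1.
    """
    if not 0 < m <= n:
        raise ValueError("need 0 < m <= n")
    if fail_rate <= 0 or repair_rate < 0:
        raise ValueError("need fail_rate > 0 and repair_rate >= 0")

    total = 0.0
    prev_passage = 0.0  # expected time to go from state k-1 to state k
    for k in range(n - m + 1):
        down = (n - k) * fail_rate           # one more disk fails
        up = repair_rate if k > 0 else 0.0   # one failed block is rebuilt
        # First-passage recursion for a birth-death chain:
        # h_k = 1/down + (up/down) * h_{k-1}
        passage = 1.0 / down + (up / down) * prev_passage
        total += passage
        prev_passage = passage
    return total


if __name__ == "__main__":
    # Hypothetical rates per hour: one failure per 100,000 h, 1 h rebuild.
    lam, mu = 1e-5, 1.0
    for n, m in [(5, 4), (6, 4)]:
        print(f"m={m}, n={n}: MTTDL = {mttdl(n, m, lam, mu):.3e} hours")
```

In this sketch, a faster rebuild (larger `repair_rate`) and each additional redundant block both multiply the mean time to data loss, which is the trade-off between reliability and redundancy overhead that the abstract describes.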