A Survey of Research on Byzantine System Technologies (cited by: 104)

Research on the Technologies of Byzantine System
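Abstract: As distributed systems grow in scale, their design complexity rises and the reliability problems they face become increasingly severe. Because Byzantine protocols can tolerate faults of every form, including human error, software bugs, and security vulnerabilities, their system techniques and implementation methods are receiving growing attention from researchers. This paper introduces and summarizes current research results on Byzantine system technologies, analyzes the state of research on Byzantine systems, and discusses their development trends. The analysis leads to three conclusions: 1) Byzantine systems still lag well behind the non-Byzantine systems already in practical use in terms of performance, and they still consume considerably more resources, so techniques for optimizing their performance and resource use need further study. 2) Reducing the faults in a system through fault detection or periodic repair is a way to extend the time the system can keep running, so new and efficient methods are needed for comprehensively detecting Byzantine servers, scheduling periodic repair sensibly, and otherwise keeping the system running. 3) Real application contexts and requirements, and the handling of their specific fault types, place different demands on Byzantine protocols and functionality, so the application and usability of Byzantine systems in practice need to be studied.

English abstract: Byzantine fault-tolerant systems have been studied widely as a way to address the reliability problems of ever-larger distributed systems, because they can tolerate arbitrary faults. This paper introduces the definitions of Byzantine systems and the evaluation methods for improving their performance. It then points out some unresolved problems and future development trends. Finally, after analyzing the state of research, it draws several conclusions: 1) The cost of running a Byzantine system is still much higher than that of a non-Byzantine system. Ways to increase performance and decrease overhead need to be explored in further study. 2) Although Byzantine fault detection and proactive recovery can keep a Byzantine system from breaking down, they still have drawbacks. How to eliminate these drawbacks should be studied. 3) Different applications require different kinds of optimization. How to build practical Byzantine systems needs to be studied.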
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list (北大核心), 2013, No. 6, pp. 1346-1360 (15 pages)
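Funding: National Natural Science Foundation of China (60925006); National High Technology Research and Development Program of China (863 Program) (2013AA013201)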
Keywords: reliability; fault tolerance; Byzantine system; state machine; Quorum
