
A Node Fault Detection Algorithm in Distributed Parallel Server (cited by: 3)
Abstract: Fault detection is the foundation of fault tolerance in distributed parallel servers. To keep the communication overhead added by fault detection as small as possible, the Autoecious Adaptive Fault Detection (A2FD) algorithm is proposed. The algorithm detects faults by relying on the information exchange that is already inherent in the system. It uses an autoregressive (AR) model to predict the transmission time and processing time of messages and automatically adjusts the fault detection threshold accordingly, so that detection adapts to the running state of the system. The implementation of the algorithm is described in pseudocode. The algorithm has been applied in the distributed parallel database system DPSQL, where it detects node faults fairly well.
Source: Journal of University of Electronic Science and Technology of China (indexed in EI, CAS, CSCD, PKU Core), 2007, No. 1, pp. 119-121, 125 (4 pages)
Keywords: distributed parallel server; fault detection; adaptive; autoeciousness
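
The idea summarized in the abstract, predicting message delay with an AR model and deriving the detection threshold from the prediction, can be illustrated with a minimal Python sketch. It is not the paper's pseudocode. The abstract does not state the AR order, the window size, or how the threshold is derived from the prediction, so the sketch assumes an AR(1) model fitted by least squares, a sliding window of 32 samples, and a threshold equal to the predicted delay plus three residual standard deviations. The names `AdaptiveTimeout` and `PiggybackDetector` are hypothetical, and "delay" here lumps transmission and processing time into one measured request-to-reply interval.

```python
from collections import deque
from statistics import mean


class AdaptiveTimeout:
    """Predicts the next delay with an AR(1) model and derives a timeout from it."""

    def __init__(self, window=32, margin=3.0, default=1.0):
        self.samples = deque(maxlen=window)  # recent observed delays, in seconds
        self.margin = margin                 # assumed safety factor on residual spread
        self.default = default               # timeout used until enough samples exist

    def observe(self, delay):
        self.samples.append(delay)

    def fit(self):
        """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi, resid_std)."""
        xs = list(self.samples)
        prev, curr = xs[:-1], xs[1:]
        mp, mc = mean(prev), mean(curr)
        var = sum((p - mp) ** 2 for p in prev)
        if var == 0:
            # All previous delays are identical: fall back to a constant model.
            c, phi = mc, 0.0
        else:
            phi = sum((p - mp) * (q - mc) for p, q in zip(prev, curr)) / var
            c = mc - phi * mp
        resid = [q - (c + phi * p) for p, q in zip(prev, curr)]
        std = (sum(r * r for r in resid) / len(resid)) ** 0.5
        return c, phi, std

    def threshold(self):
        if len(self.samples) < 3:
            return self.default
        c, phi, std = self.fit()
        predicted = c + phi * self.samples[-1]
        return max(predicted, 0.0) + self.margin * std


class PiggybackDetector:
    """Tracks ordinary request/reply traffic; sends no detection messages itself."""

    def __init__(self):
        self.pending = {}   # node -> send time of the oldest unanswered request
        self.timeouts = {}  # node -> AdaptiveTimeout

    def on_send(self, node, now):
        self.timeouts.setdefault(node, AdaptiveTimeout())
        self.pending.setdefault(node, now)

    def on_reply(self, node, now):
        sent = self.pending.pop(node, None)
        if sent is not None:
            self.timeouts[node].observe(now - sent)

    def suspected(self, node, now):
        """True if a request to node has been unanswered longer than the threshold."""
        if node not in self.pending:
            return False
        return now - self.pending[node] > self.timeouts[node].threshold()
```

A caller would invoke `on_send` and `on_reply` from the system's normal messaging path and poll `suspected` periodically. Because the threshold is recomputed from recent delays, it widens when the system slows down and tightens when it speeds up.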
