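Disk failures increasingly affect the reliability and availability of data centers. Research on disk failure prediction models that apply various machine learning methods to self-monitoring, analysis and reporting technology (SMART) attributes has achieved some success. However, these models do not deliver stable prediction performance, and they offer no way to choose the best model for different user requirements. To obtain higher accuracy and a lower false alarm rate, a method is implemented that optimizes a BP neural network prediction model with the AdaBoost algorithm. On this basis, to better suit practical deployment, a genetic algorithm (GA) based method is implemented that selects the most appropriate prediction model according to the user's performance requirements, so that different models are used under different requirements.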
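The boosting idea in this abstract can be illustrated with a minimal AdaBoost sketch. It is not the authors' implementation: the abstract boosts BP neural networks, whereas this sketch swaps in one-feature threshold stumps as base learners so that it stays short and dependency-free. The two feature columns and all sample values are invented for illustration.

```python
import math

def train_stump(X, y, w):
    """Return the (feature, threshold, polarity, weighted_error) stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                err = 0.0
                for row, label, weight in zip(X, y, w):
                    pred = polarity if row[j] > t else -polarity
                    if pred != label:
                        err += weight
                if best is None or err < best[3]:
                    best = (j, t, polarity, err)
    return best

def stump_predict(stump, row):
    j, t, polarity, _ = stump
    return polarity if row[j] > t else -polarity

def adaboost_fit(X, y, rounds=3):
    """Fit an AdaBoost ensemble; labels must be +1 (failed) or -1 (healthy)."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error away from 0 and 1 so the logarithm stays finite.
        err = min(max(stump[3], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Weighted vote of all base learners."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score > 0 else -1

# Hypothetical SMART-like samples: [reallocated sectors, seek errors].
X = [[0, 1], [1, 2], [2, 1], [50, 3], [60, 2], [80, 5]]
y = [-1, -1, -1, 1, 1, 1]
ensemble = adaboost_fit(X, y, rounds=3)
print([adaboost_predict(ensemble, row) for row in X])
```

Replacing the stump with a small neural network would bring the sketch closer to the setting the abstract describes; the weight update and the weighted vote stay the same.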
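With the growth of the Internet and the sharp increase in storage scale, data loss caused by frequent disk failures in large data centers has become a major source of loss for enterprises that can no longer be ignored. Earlier disk failure prediction models built on SMART (self-monitoring, analysis and reporting technology) attributes, using statistical and machine learning methods, achieved good results but had shortcomings in data collection and processing. Working in a real large-scale Internet data center, this work extracts SMART attribute data and proposes an attribute selection method based on the neural network weight matrix, combined with three nonparametric statistical methods: the rank-sum test, the reverse arrangements test (RAT), and z-scores. Two machine learning methods, CART decision trees and BP neural networks, are then used to build disk failure prediction models. Experiments show that both models perform well, demonstrating that machine learning algorithms can be applied effectively in a real deployment. In addition, the experiments and their analysis yield several useful conclusions that lay a foundation for future work.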
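As a rough illustration of one of the nonparametric tests named above, the following sketch scores each SMART attribute with a Wilcoxon rank-sum z statistic (normal approximation, average ranks for ties, no tie correction) and keeps the attributes whose failed and healthy samples differ clearly. The attribute names, sample values, and the 1.96 cutoff are assumptions for illustration; the abstract does not say how the three tests and the weight-matrix method are combined.

```python
import math

def rank_sum_z(failed, healthy):
    """Rank-sum z statistic comparing failed-disk values against healthy-disk values."""
    combined = sorted([(v, 0) for v in failed] + [(v, 1) for v in healthy])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over the run of tied values starting at i.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(failed), len(healthy)
    w = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean) / std

def select_attributes(samples, threshold=1.96):
    """Keep attributes whose |z| exceeds the threshold (hypothetical cutoff)."""
    return [name for name, (failed, healthy) in samples.items()
            if abs(rank_sum_z(failed, healthy)) > threshold]

# Hypothetical per-attribute samples: (values on failed disks, values on healthy disks).
samples = {
    "reallocated_sectors": ([5, 6, 7, 8], [1, 2, 3, 4]),
    "power_cycle_count": ([3, 1, 4, 2], [2, 4, 1, 3]),
}
print(select_attributes(samples))
```

With real data, `scipy.stats.ranksums` computes the same statistic along with a p-value, which is usually a better basis for the cutoff than a fixed z value.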
Disk failure prediction methods have been useful in handling a single issue, e.g., heterogeneous disks, model aging, or minority samples. However, because these issues often exist simultaneously, prediction models that can handle only one of them produce biased predictions in practice. Existing disk failure prediction methods simply fuse various models and lack discussion of training data preparation and learning patterns when facing multiple issues, although the solutions to different issues often conflict with each other. We therefore first explore training data preparation for multiple issues via a data partitioning pattern, i.e., our proposed multi-property data partitioning (MDP). Then, we treat learning with the partitioned data for multiple issues as learning multiple tasks, and introduce the model-agnostic meta-learning (MAML) framework to achieve the learning. Based on these improvements, we propose a novel disk failure prediction model named MDP-MAML. MDP addresses the challenges of uneven partitioning and difficulty in partitioning by time, and MAML addresses the challenge of learning with multiple domains and minority samples for multiple issues. In addition, MDP-MAML can assimilate emerging issues for learning and prediction. On the datasets reported by two real-world data centers, compared to state-of-the-art methods, MDP-MAML improves the area under the curve (AUC) from 0.85 to 0.89 and the failure detection rate (FDR) from 0.85 to 0.91, while reducing the false alarm rate (FAR) from 4.88% to 2.85%.
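The FDR and FAR figures quoted above can be read as follows, assuming the definitions that are usual in the disk failure prediction literature: FDR is the fraction of failed disks that the model flags, and FAR is the fraction of healthy disks that it flags by mistake. The abstract does not spell these definitions out, and the labels below are invented for illustration.

```python
def fdr_far(y_true, y_pred):
    """Return (FDR, FAR) for binary labels where 1 = failed disk and 0 = healthy disk."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fdr = tp / (tp + fn)  # detected failures / all failed disks
    far = fp / (fp + tn)  # false alarms / all healthy disks
    return fdr, far

# Hypothetical evaluation set: 4 failed disks followed by 10 healthy disks.
y_true = [1, 1, 1, 1] + [0] * 10
y_pred = [1, 1, 1, 0] + [1] + [0] * 9
fdr, far = fdr_far(y_true, y_pred)
print(f"FDR = {fdr:.2f}, FAR = {far:.2%}")
```

Because failed disks are a small minority in practice, the two rates are reported separately: plain accuracy would look high even for a model that never predicts a failure.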
Funding: Project supported by the National Natural Science Foundation of China (No. 61902135) and the Shandong Provincial Natural Science Foundation, China (No. ZR2019LZH003).