Application Research on Data Estimation in a Closed-Loop Data Quality Control Framework (Cited by: 4)

Application of Data Profiling on Closed-Loop Data Quality Control Framework
Abstract: The K-means algorithm is optimized to form the OKMI estimation algorithm. Small-sample fuzzy pre-clustering excludes interference from noisy data, so that data instances are clustered along a more accurate direction and predicted values are produced. These values support the profiling, remediation, and improvement of data quality in terms of accuracy and completeness. Experimental analysis on actual data sources from the Operation Monitoring (Control) Center verifies the effectiveness of the OKMI estimation algorithm.

Data quality in the utility industry is poor, and the capacity to manage it is lacking. Based on the data life cycle, a closed-loop data quality control framework is proposed for the SGCC Operation Monitoring Center. The framework covers the definition, profiling, measurement, and enhancement of data quality, and achieves all-round data quality management. This paper also focuses on a fuzzy clustering algorithm for missing-value imputation that is immune to noisy data. The OKMI (Optimized K-Means Imputation) method aggregates data instances into more accurate clusters, via information entropy after resampling pre-clustering and an outlier test, so that appropriate estimates can then be made. Experimental results from the SGCC Operation Monitoring Center demonstrate that the proposed OKMI achieves higher precision in completing missing values of both quantitative and nominal attributes than other classic methods, under all missingness mechanisms and at varying missing rates with abnormal values.
Affiliation: Chongqing Electric Power Company (重庆市电力公司)
Source: East China Electric Power (《华东电力》), PKU Core Journal, 2013, No. 3, pp. 546-549 (4 pages)
Keywords: data quality; operation monitoring; closed-loop control; K-means; K-means imputation; fuzzy clustering
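
The abstract describes OKMI as an optimization of K-means used to estimate missing values. The record does not give the algorithm's details, so the following is only a minimal sketch of plain K-means imputation, the baseline that OKMI builds on. It omits the fuzzy pre-clustering, outlier test, and information-entropy steps named in the abstract. The function name `kmeans_impute`, its parameters, and the evenly spaced centroid initialization are assumptions made for illustration, not the authors' method.

```python
def kmeans_impute(rows, k, iters=20):
    """Fill None entries in numeric rows with the value of the nearest centroid.

    Assumption: generic K-means imputation, not the paper's OKMI algorithm.
    """
    n_cols = len(rows[0])

    # Initial fill: replace each missing cell with the mean of the
    # observed values in its column.
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(sum(observed) / len(observed))
    filled = [
        [col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]

    # Deterministic initialization (an assumption): take k rows spaced
    # evenly through the data as starting centroids.
    step = max(1, len(filled) // k)
    centroids = [list(filled[min(i * step, len(filled) - 1)]) for i in range(k)]

    def sq_dist(a, b):
        # Squared Euclidean distance between two rows.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    for _ in range(iters):
        # Assignment step: each row joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(row, centroids[c]))
            for row in filled
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [filled[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Imputation step: overwrite only the originally missing cells.
        for i, r in enumerate(rows):
            for j in range(n_cols):
                if r[j] is None:
                    filled[i][j] = centroids[labels[i]][j]
    return filled
```

Calling `kmeans_impute(data, k=2)` on a list of equal-length rows returns a new list with every `None` replaced; observed values and the input list itself are left unchanged. Because plain K-means treats every row equally, noisy or abnormal rows pull the centroids and therefore the estimates, which is the weakness the abstract says OKMI is designed to address.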