Abstract
Fuel consumption data in carbon emission reports contain two kinds of missing data: single, non-consecutive missing values and consecutive missing values. Estimating both kinds with a single method yields large errors, so a combined estimation method based on cluster analysis is proposed. The method first uses the K-medoids clustering algorithm to classify the data into single non-consecutive missing data and consecutive missing data. It then fills the single non-consecutive gaps using the Naive Bayes (NB) method and the consecutive gaps using the Dynamic Time Warping (DTW) method, and finally evaluates the estimation results at 1%, 2%, and 3% root mean square error. Experimental results show that the NB-DTW combined method based on cluster analysis effectively reduces the estimation error: at 1%, 2%, and 3% root mean square error, the error is 9.3%, 12.1%, and 12.96% lower, respectively, than with the NB method, and 35.46%, 43.62%, and 55.04% lower, respectively, than with the DTW method.
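As a rough illustration of the building blocks named in the abstract, the following Python sketch locates runs of missing values, computes a textbook DTW distance, and computes the root mean square error. It is not the authors' implementation: the function names are hypothetical, `None` is assumed to mark a missing reading, and a simple run-length rule (length 1 versus length greater than 1) stands in for the K-medoids classification step. The NB estimator itself is not shown.

```python
import math


def missing_runs(series):
    """Return (start_index, length) for each maximal run of None values."""
    runs, start = [], None
    for i, value in enumerate(series):
        if value is None:
            if start is None:
                start = i          # a new gap begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:          # gap reaching the end of the series
        runs.append((start, len(series) - start))
    return runs


def split_gaps(series):
    """Stand-in for the clustering step: separate single gaps from longer ones."""
    runs = missing_runs(series)
    single = [r for r in runs if r[1] == 1]
    continuous = [r for r in runs if r[1] > 1]
    return single, continuous


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    d = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible warping steps
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[len(a)][len(b)]


def rmse(truth, estimate):
    """Root mean square error between two equal-length sequences."""
    squared = sum((t - e) ** 2 for t, e in zip(truth, estimate))
    return math.sqrt(squared / len(truth))
```

In a pipeline of this shape, the gaps returned in `single` would be passed to the point estimator and those in `continuous` to a DTW-based matcher that searches for the most similar complete segment; `rmse` would then score the filled values against held-out true values.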
Authors
李舒
张伟业
汪坤
段照斌
LI Shu; ZHANG Wei-ye; WANG Kun; DUAN Zhao-bin (COMAC Software Co., Ltd., Chengdu 610000, China; Engineering Technology Training Center, Civil Aviation University of China, Tianjin 300300, China)
Source
《计算机与现代化》
2022, Issue 8, pp. 65-69 (5 pages)
Computer and Modernization
Funding
Supported by the National Natural Science Foundation of China (61703406).