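Abstract: Various factors in real applications can cause missing data, which affects the analysis in downstream tasks. Imputing the missing values of a data set is therefore important. However, compared with leaving the data unimputed, incorrect imputed values can bias the analysis even more severely. To address this problem, a new missing-value imputation algorithm with dual discriminators, DDC-GAIN (Dual Discriminator based on C-GAIN), is proposed on the basis of the conditional generative adversarial imputation network (C-GAIN). The algorithm uses an auxiliary discriminator to assist the main discriminator in judging whether predicted values are real or fake; that is, it judges whether a sample was generated according to the global information of that sample, placing more emphasis on the relationships among features, and estimates the predicted values on that basis. In comparative experiments against five classical imputation algorithms on four data sets, the results show that, under the same conditions, DDC-GAIN achieves the lowest root mean square error (RMSE) when the sample size is large; on the Default credit card data set with a missing rate of 15%, the RMSE of DDC-GAIN is 28.99% lower than that of the second-best algorithm, C-GAIN. This indicates that using an auxiliary discriminator to help the main discriminator learn the relationships among features is effective.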
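The RMSE comparison reported above can be illustrated with a small sketch. The abstract does not describe the evaluation protocol, so the following assumes the common setup in which known entries are hidden by a binary mask, imputed, and scored only on the hidden positions. Column-mean imputation is used here purely as a stand-in imputer; it is not DDC-GAIN or C-GAIN. All function names (`mean_impute`, `masked_rmse`, `relative_reduction`) and the toy data are hypothetical.

```python
import math

def column_means(rows, mask):
    """Mean of the observed entries in each column (mask value 1 = observed)."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r, m in zip(rows, mask) if m[j] == 1]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means

def mean_impute(rows, mask):
    """Stand-in imputer (assumption): fill each hidden entry with its column mean."""
    means = column_means(rows, mask)
    return [[r[j] if m[j] == 1 else means[j] for j in range(len(r))]
            for r, m in zip(rows, mask)]

def masked_rmse(truth, imputed, mask):
    """RMSE computed only over the entries that were hidden (mask value 0)."""
    squared = [(t[j] - p[j]) ** 2
               for t, p, m in zip(truth, imputed, mask)
               for j in range(len(t)) if m[j] == 0]
    return math.sqrt(sum(squared) / len(squared))

def relative_reduction(baseline_rmse, new_rmse):
    """Percentage by which new_rmse is lower than baseline_rmse."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Toy data (hypothetical): two features, two hidden entries.
truth = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
mask = [[1, 1], [1, 0], [0, 1], [1, 1]]  # 0 marks a hidden entry
imputed = mean_impute(truth, mask)
rmse = masked_rmse(truth, imputed, mask)
```

A statement of the form "the RMSE is X% lower than that of the second-best algorithm" corresponds to `relative_reduction(rmse_second_best, rmse_proposed)` under this reading.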
Funding: Supported in part by the National Natural Science Foundation of China (51975075) and the Chongqing Technology Innovation and Application Program (cstc2018jszx-cyzdX0183).
Abstract: Energy consumption prediction of a CNC machining process is important for energy efficiency optimization strategies. To improve generalization ability, more and more parameters are acquired for energy prediction modeling. However, the data collected from workshops may be incomplete because of misoperation, unstable network connections, frequent transfers, etc. This work proposes a framework for energy modeling based on incomplete data to address this issue. First, some necessary preliminary operations are applied to the incomplete data sets. Then, missing values are estimated with generative adversarial imputation nets (GAIN) to generate a new complete data set. Next, the gene expression programming (GEP) algorithm is used to train the energy model on the generated data sets. Finally, the predictive accuracy of the obtained model is tested. Computational experiments are designed to investigate the performance of the proposed framework at different rates of missing data. Experimental results demonstrate that even when the missing data rate increases to 30%, the proposed framework can still make efficient predictions, with corresponding RMSE and MAE of 0.903 kJ and 0.739 kJ, respectively.
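The experimental setup described above (removing data at a given rate, then scoring the energy model by RMSE and MAE) might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how entries were removed, so uniform random removal is assumed, and the GAIN imputation and GEP training steps are not implemented here. The helper names `make_missing`, `rmse`, and `mae`, as well as the toy values, are hypothetical.

```python
import math
import random

def make_missing(rows, rate, seed=0):
    """Hide a fraction `rate` of all entries, chosen uniformly at random (assumption)."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    hidden = set(rng.sample(cells, round(rate * len(cells))))
    return [[None if (i, j) in hidden else v for j, v in enumerate(r)]
            for i, r in enumerate(rows)]

def rmse(actual, predicted):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Toy process data (hypothetical): 10 samples with 3 parameters each.
complete = [[float(i + j) for j in range(3)] for i in range(10)]
incomplete = make_missing(complete, rate=0.30)
n_hidden = sum(v is None for row in incomplete for v in row)

# Toy energy values (hypothetical) standing in for measured vs. predicted energy.
measured = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
errors = (rmse(measured, predicted), mae(measured, predicted))
```

In the framework described in the abstract, `incomplete` would first be completed by the imputation step and the model trained on the completed data; the two metric functions would then be applied to the model's energy predictions, so both errors carry the unit of the energy values.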