Abstract
Wind power prediction and power curve modeling for wind turbines rely on historical operating data. However, the large amount of abnormal data accumulated in these records makes such work difficult to carry out effectively. Researchers in China and abroad have proposed a variety of abnormal data detection methods, but an overall understanding of their respective strengths, weaknesses, and applicable scenarios is still lacking. To this end, this paper compares four common wind power outlier detection methods: density-based clustering, the local outlier factor algorithm, the Thompson-tau quartile method, and isolation forest. To evaluate the different detection methods, an evaluation index based on the standard power curve is proposed. The experimental results show that the isolation forest algorithm achieves higher accuracy than the other three methods, can handle abnormal data with different distributions, and has a shorter cleaning time.
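As an illustration of what one of the compared families of methods does, the sketch below applies a plain quartile (interquartile range) rule to wind speed and power pairs, bin by bin. It is not the paper's implementation and not the Thompson-tau variant: the abstract gives no procedural details, so the function name `quartile_filter`, the 1 m/s bin width, the factor `k=1.5`, the toy power curve, and the synthetic data are all assumptions made for this example.

```python
import random
import statistics


def toy_power_curve(v, rated_kw=2000.0, cut_in=3.0, rated_speed=12.0):
    """Hypothetical cubic power curve, used only to generate synthetic data."""
    if v < cut_in:
        return 0.0
    if v >= rated_speed:
        return rated_kw
    return rated_kw * (v**3 - cut_in**3) / (rated_speed**3 - cut_in**3)


def quartile_filter(samples, bin_width=1.0, k=1.5):
    """Flag (wind_speed, power) samples whose power lies outside
    [Q1 - k*IQR, Q3 + k*IQR] within their wind-speed bin."""
    bins = {}
    for i, (v, _) in enumerate(samples):
        bins.setdefault(int(v // bin_width), []).append(i)

    is_outlier = [False] * len(samples)
    for indices in bins.values():
        if len(indices) < 4:  # too few points to estimate quartiles
            continue
        powers = [samples[i][1] for i in indices]
        q1, _, q3 = statistics.quantiles(powers, n=4)
        iqr = q3 - q1
        low, high = q1 - k * iqr, q3 + k * iqr
        for i in indices:
            p = samples[i][1]
            is_outlier[i] = p < low or p > high
    return is_outlier


# Synthetic data (assumption): noisy points on the toy curve, followed by
# injected "turbine stopped" points with zero power at mid-range wind speeds.
rng = random.Random(0)
normal = []
for _ in range(2000):
    v = rng.uniform(3.0, 15.0)
    normal.append((v, toy_power_curve(v) + rng.gauss(0.0, 20.0)))

n_injected = 20
injected = [(rng.uniform(8.0, 11.0), 0.0) for _ in range(n_injected)]

samples = normal + injected
flags = quartile_filter(samples)
print("flagged:", sum(flags), "of", len(samples))
```

A per-bin rule like this is simple and fast, but it assumes that abnormal points are a minority in every wind-speed bin. Densely stacked abnormal data can shift the quartiles themselves, which is one reason a comparison against other detectors such as isolation forest is worthwhile.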
Authors
封焯文
朱世平
赵志华
孙铭仁
董密
宋冬然
FENG Zhuo-wen; ZHU Shi-ping; ZHAO Zhi-hua; SUN Ming-ren; DONG Mi; SONG Dong-ran (Hunan Electric Power Design Institute Co., Ltd., CEEC, Changsha 410007, China; School of Automation, Central South University, Changsha 410083, China)
Source
《电工电能新技术》
CSCD
Peking University Core Journal
2021, No. 7, pp. 55-61 (7 pages)
Advanced Technology of Electrical Engineering and Energy
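Funding
Hunan Province Strategic Emerging Industries Science and Technology Research and Major Scientific and Technological Achievement Transformation Project (2018GK4002).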
Keywords
wind power data
outlier characteristics
data cleaning
isolation forest