Abstract
Massive volumes of power operation text data currently lack effective application. To address this, the paper proposes a preprocessing method for power operation text data comprising text cleaning and text analysis, and further designs a power operation information model based on text feature recognition. The model uses the term frequency-inverse document frequency (TF-IDF) algorithm to extract feature terms from the preprocessed text and feeds them to a long short-term memory (LSTM) network, which learns the features automatically and finally outputs the text classification results. Case-study results show that the proposed TF-IDF-LSTM algorithm achieves higher accuracy in power operation text classification than the LSTM and TF-IDF-SVM algorithms.
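The TF-IDF feature extraction step named in the abstract can be illustrated with a minimal sketch. The abstract does not state which weighting variant the authors use, so this assumes the common unsmoothed form, tf × log(N/df); the function name `tfidf` and the tokenized sample texts are hypothetical and not taken from the paper. In the model described above, the resulting weights would then be passed to the LSTM, which is not shown here.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / doc_length) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-cleaned and tokenized text snippets (not from the paper).
docs = [
    ["meter", "fault", "repair"],
    ["meter", "billing", "complaint"],
    ["outage", "repair", "request"],
]
weights = tfidf(docs)
```

Terms that occur in fewer documents receive larger weights, so in the first sample "fault" outweighs "meter", which also appears in the second sample.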
Authors
俞阳
邹云峰
康雨萌
孙少辰
YU Yang; ZOU Yunfeng; KANG Yumeng; SUN Shaochen (Marketing Service Center, State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210000, China)
Source
《电子设计工程》
2023, No. 1, pp. 102-106 (5 pages)
Electronic Design Engineering
Funding
Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. (J2020062).