Abstract
A big-data early-warning model for gas pipeline safety risk is built with the Waikato Environment for Knowledge Analysis (WEKA). The data preprocessing workflow comprises raw data acquisition, data cleaning, feature variable selection and extraction, missing-value imputation, and training sample selection. Internal-factor data are pipe age, pipe material, pipe diameter, pressure level, burial depth, and management unit; external-factor data cover 3 categories that affect pipeline corrosion: railways, electrified rail lines such as subways, and water bodies (rivers and lakes). One positive sample and four negative samples, each with 855 sample points, are drawn at random from the database. The training data are divided into three groups: training sample 1 (positive sample + negative sample 1), training sample 2 (positive sample + negative sample 2), and training sample 3 (positive sample + negative sample 3). Missing values are imputed with the KNN algorithm. Six algorithms are selected to train the early-warning model: C4.5 decision tree, random forest, Bayesian network, naive Bayes, support vector machine, and logistic regression. With these algorithms, and with both internal and external factors taken into account, the early-warning models are trained (i.e., the experiments). Comparison of the experimental results identifies random forest as the best algorithm. Considering internal and external factors together raises model accuracy by 5.07% compared with considering internal factors alone.
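The two preprocessing steps the abstract names most concretely, KNN missing-value imputation and pairing the single positive sample with each negative sample, can be sketched as follows. This is an illustrative Python sketch only: the study itself used WEKA, and the function names, the choice of k, the Euclidean distance, and the numeric-only feature handling are all assumptions not stated in the abstract (categorical fields such as pipe material would need encoding first).

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries in numeric rows (hypothetical helper).

    Each missing value is replaced by the mean of that feature over the
    k nearest complete rows; distance is Euclidean over the features
    that are observed in the incomplete row.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row to impute from")
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [i for i, v in enumerate(row) if v is not None]

        def distance(candidate):
            return math.sqrt(sum((row[i] - candidate[i]) ** 2 for i in observed))

        neighbours = sorted(complete, key=distance)[:k]
        filled.append([
            v if v is not None else sum(n[i] for n in neighbours) / len(neighbours)
            for i, v in enumerate(row)
        ])
    return filled


def build_training_sets(positive, negatives):
    """Pair the one positive sample with each negative sample in turn."""
    return [positive + negative for negative in negatives]


if __name__ == "__main__":
    # Toy rows of (pipe age in years, burial depth in m); values are made up.
    rows = [[10.0, 1.2], [12.0, None], [30.0, 0.8], [11.0, 1.1]]
    print(knn_impute(rows, k=2))
```

With 855 points per sample, each training set built this way holds 1710 points with the two classes balanced; the abstract lists three such sets, built from negative samples 1 to 3.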
Authors
刘江涛
张涛
吴波
顾先凯
李春青
关鸿鹏
李夏喜
曹印峰
詹淑慧
甘颖涛
荫东锦
任立坤
LIU Jiangtao;ZHANG Tao;WU Bo;GU Xiankai;LI Chunqing;GUAN Hongpeng;LI Xiaxi;CAO Yinfeng;ZHAN Shuhui;GAN Yingtao;YIN Dongjin;REN Likun
Source
《煤气与热力》
2018, No. 12, pp. 36-42 (7 pages)
Gas & Heat