Abstract
China National Petroleum Corporation (CNPC) sells a wide variety and large volume of goods, including oil products and non-oil products, through a large number of gas stations. Each product follows a different sales pattern at each station, and in retail analytics practice it is unrealistic to build a separate model for every product at every station. A unified model must therefore be used to forecast sales across products, which requires the model's fitter to distinguish the differences among products and stations. One-hot encoding every product and every station yields high-dimensional data, and there is not enough sales data of a matching scale to train on. The paper therefore proposes introducing a knowledge graph and a graph neural network model to map products and stations into a low-dimensional vector space, so that the fitter can tell products or stations apart using only a few dimensions, which makes a unified model feasible. Taking sales forecasting for the refined oil retail business as an example, GATNE (a graph-embedding model for heterogeneous graph neural networks) is used to represent station and product characteristics as vectors, which are concatenated with historical data vectors and fed as input to sales forecasting models such as XGBoost (extreme gradient boosting trees). Validation shows that, compared with a unified model built directly on product-station groups, a unified sales forecasting model built along the gas station dimension with graph vector representations of oil product nodes achieves higher prediction accuracy. In addition, compared with the results of modeling each product number at each station separately, the station-dimension unified model is consistent with business logic and has practical application value.
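The input construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embedding values are seeded random placeholders standing in for GATNE output, and the embedding size, the station and product counts, the identifiers, and the lagged sales figures are all hypothetical. It only shows how concatenating low-dimensional station and product vectors with historical sales gives a much narrower feature row than one-hot encoding would.

```python
import random

EMBED_DIM = 4  # assumed size; the abstract only says "a few dimensions"


def placeholder_embeddings(ids, dim=EMBED_DIM, seed=0):
    """Stand-in for graph-embedding output: one fixed-length vector per node id.

    The values are random placeholders, not trained GATNE embeddings.
    """
    rng = random.Random(seed)
    return {node_id: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for node_id in ids}


def build_features(station_id, product_id, lagged_sales, station_emb, product_emb):
    """Concatenate the station vector, product vector and historical sales into one row."""
    return station_emb[station_id] + product_emb[product_id] + list(lagged_sales)


# Hypothetical catalogue sizes and identifiers.
stations = [f"station_{i}" for i in range(1000)]
products = [f"product_{i}" for i in range(200)]

station_emb = placeholder_embeddings(stations, seed=1)
product_emb = placeholder_embeddings(products, seed=2)

# Hypothetical sales for the previous three periods.
lags = [120.0, 135.5, 128.0]
row = build_features("station_7", "product_3", lags, station_emb, product_emb)

# Width of the same row if stations and products were one-hot encoded instead.
one_hot_width = len(stations) + len(products) + len(lags)

print(len(row), one_hot_width)  # prints: 11 1203
```

Rows built this way could then be passed to a gradient-boosted regressor such as XGBoost; that training step is left out here so the sketch runs with the standard library alone.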
Authors
刘增霞
蔺光岭
刘华中
刘速
杨文军
LIU Zengxia; LIN Guangling; LIU Huazhong; LIU Su; YANG Wenjun (Kunlun Shuzhi Technology Co., Ltd.)
Source
《油气与新能源》
2023, Issue 3, pp. 66-75 (10 pages in total)
Petroleum and New Energy