Abstract
To improve a random-forest-based early warning model for complaints arising from repeated calls, this paper optimizes the model in three respects: data, features, and parameters. For data processing, the SMOTE algorithm is used to balance the ratio of complaint to non-complaint samples, which both guards against overfitting and removes the effect of class imbalance on model performance. For feature selection, the Gini coefficient is used to select features, which reduces noise in the data and improves prediction accuracy. For parameter tuning, R is used to adjust the number of decision trees and the maximum number of features; the final model has an OOB error rate of 5.03%, and both its accuracy and recall exceed 70%, which reaches an applicable level. The complaint warning model is now in pilot use: it identifies likely complaints in advance and, combined with corresponding service strategies, has reduced service escalation events, lowered the customer complaint rate, and effectively improved customer perception.
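The two data-side steps named in the abstract, SMOTE oversampling and Gini-based feature scoring, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation (the paper reports using R): the function names, the parameters `k`, `n_new` and `seed`, and the toy data are all assumptions made for illustration, and the "Gini coefficient" is assumed here to mean the Gini impurity used when splitting decision trees.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic minority samples (hypothetical helper).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def gini(labels):
    """Gini impurity 1 - sum(p_c ** 2) of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting one feature at threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = sum(len(part) / n * gini(part) for part in (left, right) if part)
    return gini(labels) - weighted


# Toy data (assumed): 4 complaint samples versus 12 non-complaint samples.
complaints = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
extra = smote(complaints, n_new=8)      # 4 + 8 = 12, classes now balanced
balanced_complaints = complaints + extra
```

A random forest ranks features by summing such impurity decreases over every split that uses a feature, averaged across trees; features with a low total are candidates for removal. The tree count and the number of features tried per split would then be tuned against the OOB error, which this sketch does not cover.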
Authors
朱龙珠
宫立华
刘鲲鹏
杨菁
赵强
ZHU Long-zhu; GONG Li-hua; LIU Kun-peng; YANG Jing; ZHAO Qiang (State Grid Corporation Customer Service Center, Tianjin 300000, China; Beijing Data Ocean Wisdom Technology Co., Ltd., Beijing 100081, China)
Source
《电力信息与通信技术》
2018, No. 8, pp. 60-65 (6 pages)
Electric Power Information and Communication Technology
Keywords
parameter optimization
random forest
repeated call
SMOTE algorithm
complaint warning