Abstract
This paper uses an XGBoost model to build a customer application scoring model on credit data from a domestic bank (Bank Y), and applies the model to predict the default probability of new customers. To evaluate predictive performance, the model scores are binned on the training set and the test set, and the cumulative recall of bad customers is computed for each score band. The model achieves AUC values of 0.97 on the training set and 0.93 on the test set. The 5% of customers with the highest scores (the score being the model's predicted probability that a customer is bad) cover 78.7% of the bad customers on the training set and 55.6% on the test set. In practice, different risk policies can be applied to the highest-scoring and lowest-scoring customer groups, in order to minimize the risk to the bank's financial assets and maximize the company's returns.
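The evaluation step described in the abstract (binning scores and computing the cumulative recall of bad customers per score band) can be illustrated with a minimal sketch. It is not the authors' code: the function name `cumulative_bad_recall`, the equal-size binning, and the toy data are assumptions made for illustration. Scores stand for predicted bad-customer probabilities, and labels use 1 for a bad customer.

```python
import numpy as np

def cumulative_bad_recall(scores, labels, n_bins=20):
    """Cumulative share of bad customers captured per score band.

    Customers are sorted by score from highest to lowest and split into
    n_bins bands of (nearly) equal size. With n_bins=20, the first band
    is the top 5% of scores. Hypothetical helper, not from the paper.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Sort indices by descending score; stable sort keeps ties in input order.
    order = np.argsort(-scores, kind="stable")
    sorted_labels = labels[order]
    # Split the sorted labels into score bands and count bad customers in each.
    bands = np.array_split(sorted_labels, n_bins)
    bad_per_band = np.array([band.sum() for band in bands])
    # Cumulative recall: bad customers captured so far / all bad customers.
    return np.cumsum(bad_per_band) / labels.sum()

# Toy data (made up): 10 customers, 4 of them bad, 5 score bands.
scores = np.array([0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10])
labels = np.array([1, 1, 0, 1, 0, 0, 0, 1, 0, 0])
recall = cumulative_bad_recall(scores, labels, n_bins=5)
print(recall)
```

On this toy data the highest-scoring band (the top 20% of customers) already contains half of the bad customers, which is the same kind of statement as the abstract's "top 5% covers 78.7%" figure.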
Source
Journal of Shanghai Lixin University of Accounting and Finance (《上海立信会计金融学院学报》)
2020, No. 1, pp. 17-28 (12 pages)
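Funding
General Project of Humanities and Social Sciences Research, Ministry of Education: "Research on a Credit Index System for Small and Micro Enterprises Based on Big Data Thinking" (17YJA790059)
General Research Project of the Zhejiang Provincial Department of Education: "Research on User Participation Motivation in Online Crowdfunding Based on Data Mining" (Y201940884)
General Project under the Basic Research Operating Funds of Zhejiang Financial College: "A Statistical Study of Investor Behavior Patterns in Online Crowdfunding from a Big Data Perspective" (2019YB66).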