Customers Credit Scoring Based On Ensemble Classification (基于组合分类的消费者信用评估) Cited by: 14
Abstract To reduce the instability of a single classification model and improve its accuracy in consumer credit scoring, this paper proposes a consumer credit scoring method based on ensemble classification. Supervised clustering first partitions the samples of each class into several subsets, so that all samples in a subset belong to the same class. Subsets from different classes are then paired to form training subsets, and a separate classification model is built on each training subset. When the model outputs are combined, each model is weighted by its classification performance on the training samples nearest to the sample being classified, and a weighted vote over the models yields the final class. The empirical study uses decision trees as the base classifier. Comparisons on real consumer credit data sets show that the proposed method is more accurate than other ensemble classification methods and can be applied effectively to consumer credit scoring.

With the rapid growth of the credit industry, credit scoring models are widely used to classify customers' credit as either accepted (good) or rejected (bad) based on characteristics such as age, income, and marital status. For commercial banks and creditors, efficient credit scoring models are important because they can reduce risk and operational cost and increase market competitiveness. Customer credit scoring is therefore an important research topic in business administration.

Most existing studies employ only one classification model for credit scoring. However, it has been shown theoretically and experimentally that using multiple classifiers tends to improve on the accuracy and stability of a single classifier, so it is beneficial to introduce ensemble classification into credit scoring. An effective ensemble requires classifiers that are both accurate and diverse in their predictions. Current research on ensemble classification often uses random sampling (in either the instance space or the feature space) to generate the classifiers. The diversity of the classifiers is therefore not guaranteed, which may degrade classification performance. To overcome this problem, we propose an ensemble classification approach based on supervised clustering and apply it to customer credit scoring. Without loss of generality, only binary classification is considered in this paper.

The proposed approach first uses supervised clustering to partition the original data set into a number of subsets, each of which contains observations from a single class. Subsets from different classes are then paired to form the training subsets. The idea behind supervised clustering is that observations may follow different patterns even within the same class. Clustering groups observations with similar patterns or characteristics into the same cluster, so the training subsets obtained by pairwise combination of subsets from different classes can better represent the different patterns of customers. K-means clustering is used because it is effective for partitioning large samples, and a validity index is used to optimize the clustering result. A different classifier is then constructed on each training subset. To predict the class label of an unknown instance, the outputs of the classifiers are combined by weighted voting, where the weight of each classifier is determined by its classification performance in the neighborhood of the unknown instance.

In the empirical study, two real-world data sets are used to evaluate the predictive accuracy of the proposed approach. The widely used bagging and random subspace method (RSM) ensembles serve as benchmarks. The results indicate that the proposed approach improves the accuracy of ensemble classification. In addition, the classifiers generated through supervised clustering and pairwise combination of subsets from different classes are more diverse than those in bagging and RSM, which favors an effective ensemble.

The proposed ensemble classification approach based on supervised clustering and weighted voting is therefore a promising approach for customer credit scoring.
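The abstract gives the method only in outline, so the following Python sketch is one possible reading of its three steps (per-class clustering, pairwise training subsets, locally weighted voting), not the author's implementation. Several parts are assumptions made to keep it short and dependency-free: a nearest-centroid rule stands in for the decision trees used in the paper, k-means uses a deterministic farthest-first initialisation, the number of clusters `k` is fixed rather than chosen by a validity index, and all function names and the toy data are hypothetical.

```python
import math

def centroid(cluster):
    """Component-wise mean of a list of feature tuples."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))

def kmeans(points, k, iters=20):
    """Plain k-means; farthest-first initialisation (an assumption) keeps it deterministic."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Keep the old center when a cluster ends up empty.
        centers = [centroid(c) if c else centers[j] for j, c in enumerate(clusters)]
    return [c for c in clusters if c]

def train_ensemble(good, bad, k=2):
    """Cluster each class separately, then build one model per (good, bad) cluster pair.

    Each model is a pair of centroids (stand-in for a decision tree
    trained on the union of the two clusters).
    """
    good_clusters = kmeans(good, k)
    bad_clusters = kmeans(bad, k)
    return [(centroid(g), centroid(b)) for g in good_clusters for b in bad_clusters]

def base_predict(model, x):
    """Nearest-centroid rule: 1 = good, 0 = bad."""
    good_center, bad_center = model
    return 1 if math.dist(x, good_center) <= math.dist(x, bad_center) else 0

def predict(models, train, x, n_neighbors=3):
    """Weighted vote; each model's weight is its accuracy on the neighbors of x."""
    neighbors = sorted(train, key=lambda s: math.dist(x, s[0]))[:n_neighbors]
    votes = {0: 0.0, 1: 0.0}
    for model in models:
        correct = sum(base_predict(model, f) == y for f, y in neighbors)
        votes[base_predict(model, x)] += correct / len(neighbors)
    # Ties (including all-zero weights) fall back to class 1 in this sketch.
    return 1 if votes[1] >= votes[0] else 0

# Toy data: each class contains two distinct patterns (an XOR-like layout).
good = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
bad = [(1.0, 5.0), (1.1, 5.2), (0.9, 4.8), (5.0, 1.0), (5.2, 1.1), (4.8, 0.9)]
train = [(p, 1) for p in good] + [(p, 0) for p in bad]
models = train_ensemble(good, bad, k=2)

if __name__ == "__main__":
    for query in [(1.1, 1.0), (5.0, 1.2)]:
        print(query, "->", predict(models, train, query))
```

In this toy layout no single linear rule separates the two classes, but each pairwise model is reliable near the two clusters it was built from, and the neighborhood-based weights let those local models dominate the vote. Swapping `base_predict` for a real decision tree and choosing `k` with a cluster validity index would bring the sketch closer to the setup the abstract describes.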
Author: Wang Yu (王昱)
Source: Journal of Industrial Engineering and Engineering Management (《管理工程学报》), CSSCI, PKU Core Journal, 2015, No. 1, pp. 30-38 (9 pages)
Funding: National Natural Science Foundation of China (71001112); Fundamental Research Funds for the Central Universities (CDJSK 11066)
Keywords: supervised clustering; ensemble classification; credit scoring; decision tree

