Prediction for high-value mobile communication users based on efficient feature selection

Cited by: 5
Abstract: Predicting high-value mobile communication users is an important part of telecom customer relationship management. To address the high dimensionality, large scale, and class imbalance of the data encountered when building a prediction model, this paper proposes a prediction method based on efficient feature selection. Multiple balanced training sets are extracted from the initial imbalanced dataset by under-sampling. A feature selection strategy that combines Pearson correlation analysis with random forest feature importance assessment is then applied, and weighting and voting mechanisms embedded in the ensemble learning method yield the best feature subset. The final prediction model is built with the random forest algorithm. Experimental results show that the proposed model effectively reduces the dimension of the feature set and improves prediction performance for high-value mobile communication users.
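The under-sampling and voting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameters, so every name here (`balanced_subsets`, `select_features`, `top_k`, `min_votes`) is hypothetical. To stay within the Python standard library, the sketch scores each feature only by its absolute Pearson correlation with the class label and uses an unweighted vote. The random forest importance scores and the weighting scheme that the paper combines with Pearson analysis are left out.

```python
import random
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(vx * vy)


def balanced_subsets(rows, labels, n_subsets, rng):
    """Under-sample the majority class to build several balanced training sets."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    for _ in range(n_subsets):
        # Keep every minority sample; draw an equal number from the majority.
        idx = minority + rng.sample(majority, len(minority))
        yield [rows[i] for i in idx], [labels[i] for i in idx]


def select_features(rows, labels, n_subsets=5, top_k=2, min_votes=3, seed=0):
    """Keep features ranked in the top_k by |r| in at least min_votes subsets."""
    rng = random.Random(seed)
    votes = Counter()
    n_features = len(rows[0])
    for sub_rows, sub_labels in balanced_subsets(rows, labels, n_subsets, rng):
        scores = [
            abs(pearson([r[j] for r in sub_rows], sub_labels))
            for j in range(n_features)
        ]
        ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
        votes.update(ranked[:top_k])  # each subset votes for its best features
    return sorted(j for j, v in votes.items() if v >= min_votes)


# Synthetic, assumed data: 20 "high-value" users among 200.
# Feature 0 tracks the label, feature 1 is noise, feature 2 is constant.
data_rng = random.Random(42)
labels = [1] * 20 + [0] * 180
rows = [[y + data_rng.gauss(0, 0.1), data_rng.random(), 1.0] for y in labels]
selected = select_features(rows, labels, top_k=1)
print(selected)
```

Voting across several balanced subsets keeps a feature only if it ranks well repeatedly, which makes the selection less sensitive to any single random draw of majority-class samples.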
Source: Journal of Wuhan University of Science and Technology (《武汉科技大学学报》), indexed in CAS and the Peking University Core Journals list (北大核心), 2017, No. 2, pp. 149-154 (6 pages)
Funding: National Natural Science Foundation of China (60975031)
Keywords: mobile communication user; imbalanced dataset; feature selection; Pearson correlation analysis; random forest; prediction model
