Research on Interest Classification of Online Shopping Users Based on Data Mining (original title: 基于数据挖掘的网络购物用户兴趣分类研究)

Cited by: 1
Abstract: Classifying the interests of online shopping users makes it possible to recommend products they are likely to want, which is more convenient for users and increases store sales. In existing methods the initial cluster centers are chosen at random and the result is strongly affected by isolated points (outliers) in the interest data, so classification stability and accuracy are poor. This paper proposes a data-mining-based method for classifying the interests of online shopping users. The method analyzes and calculates the web page dwell time, the interest duration, and the interest score. A threshold on the interest score distinguishes strong interest, weak interest, and non-interest. Interests are divided into short-term and long-term interests, and a user interest model is built. To improve the stability of the classification and exclude the influence of isolated points, the K-Means algorithm is improved: the interest data are sampled multiple times, and the better initial cluster centers are selected to classify user interests. Simulation results show that the proposed method is markedly less affected by isolated points, that the classification results are closer to the actual data distribution, and that CPU time and the number of iterations are low.
Author: HAN Jing-ling (韩景灵), Business College of Shanxi University, Taiyuan, Shanxi 030031, China
Source: Computer Simulation (《计算机仿真》), PKU Core Journal, 2018, No. 7, pp. 418-421 (4 pages)
Keywords: data mining; online shopping; user; interest; classification
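
The abstract describes improving K-Means by sampling the interest data several times and keeping the better initial cluster centers, but it does not give the procedure itself. The following is a minimal sketch of one plausible reading, not the author's implementation. It assumes each user is a tuple of numeric features, clusters several random subsamples, and keeps the candidate centers with the lowest sum of squared errors on the full data. All function names and the parameters `n_samples`, `sample_size`, and `seed` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def total_sse(points, centers):
    """Sum over all points of the squared distance to the nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


def kmeans(points, centers, max_iter=100):
    """Standard K-Means iterations starting from the given centers.

    Returns (centers, sum of squared errors, iterations used).
    """
    centers = list(centers)
    iterations = 0
    for iterations in range(1, max_iter + 1):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [mean_point(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, total_sse(points, centers), iterations


def sampled_initial_centers(points, k, n_samples=10, sample_size=50, seed=0):
    """Assumed selection rule: cluster several random subsamples and keep
    the centers that give the lowest error on the full data set."""
    rng = random.Random(seed)
    size = min(sample_size, len(points))
    best_centers, best_sse = None, float("inf")
    for _ in range(n_samples):
        sample = rng.sample(points, size)
        start = rng.sample(sample, k)  # requires k <= size
        centers, _, _ = kmeans(sample, start)
        sse = total_sse(points, centers)
        if sse < best_sse:
            best_centers, best_sse = centers, sse
    return best_centers


def cluster_interests(points, k, **kwargs):
    """Pick initial centers by repeated sampling, then cluster all points."""
    init = sampled_initial_centers(points, k, **kwargs)
    return kmeans(points, init)
```

A call such as `cluster_interests(points, 3, n_samples=10, sample_size=50)` takes a list of feature tuples and returns the final centers, the sum of squared errors, and the iteration count. Drawing starting points from small subsamples makes a rare isolated point less likely to be picked as an initial center, which is the intuition the abstract gives for the improvement. Whether the paper scores candidates by squared error is an assumption here.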
