Abstract
Classifying the interests of online shoppers makes it possible to recommend products a user is likely to want, which is more convenient for users and raises a store's sales. Existing methods select the initial cluster centers at random and are strongly affected by isolated points (outliers) in the interest data, so their classification stability and accuracy are poor. This paper proposes a data-mining-based method for classifying online shopping user interest. The method analyzes and computes three quantities: page dwell time, interest duration, and interest score. Thresholds on the interest score distinguish strong interest, weak interest, and non-interest. User interests are further divided into short-term and long-term interests, and a user interest model is built on that basis. To improve the stability of the classification results and exclude the influence of isolated points, the K-Means algorithm is modified: the interest data are sampled multiple times, and the better initial cluster centers found are used to classify user interest. Simulation results show that the proposed method is markedly less affected by isolated points, yields classification results closer to the actual data distribution, and requires less CPU time and fewer iterations.
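The abstract names two mechanisms without giving formulas: score thresholds that separate strong, weak, and non-interest, and a K-Means variant that samples the data several times to choose better initial cluster centers. The Python sketch below is one possible reading of those two ideas, not the paper's algorithm. The threshold values `strong=0.6` and `weak=0.3`, the function names, the sample sizes, and the use of the sum of squared errors to compare candidate centers are all assumptions made for illustration. The paper's explicit handling of isolated points is not reproduced here.

```python
import random


def classify_interest(score, strong=0.6, weak=0.3):
    """Label an interest score with two thresholds (values are hypothetical)."""
    if score >= strong:
        return "strong"
    if score >= weak:
        return "weak"
    return "none"


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def kmeans(points, centers, max_iter=100):
    """Standard Lloyd iterations starting from the given initial centers."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def sampled_init(points, k, n_samples=5, sample_size=20, seed=0):
    """Cluster several random subsamples and keep the centers with the
    lowest SSE on the full data (assumed selection criterion)."""
    rng = random.Random(seed)
    best_cost, best_centers = None, None
    for _ in range(n_samples):
        sample = rng.sample(points, min(sample_size, len(points)))
        centers = kmeans(sample, rng.sample(sample, k))
        cost = sse(points, centers)
        if best_cost is None or cost < best_cost:
            best_cost, best_centers = cost, centers
    return best_centers


if __name__ == "__main__":
    rng = random.Random(1)
    data = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
    data += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(50)]
    init = sampled_init(data, k=2)
    print(kmeans(data, init))
```

Choosing centers from several subsamples makes the final clustering less dependent on any single random draw, which is the stability concern the abstract raises about randomly chosen initial centers.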
Author
韩景灵
HAN Jing-ling (Business College of Shanxi University, Taiyuan, Shanxi 030031, China)
Source
《计算机仿真》
Peking University Core Journal
2018, No. 7, pp. 418-421 (4 pages)
Computer Simulation
Keywords
Data mining
Network shopping
User
Interest
Classification