Abstract
To address the data sparsity of traditional collaborative filtering algorithms and the excessive dependence of similarity calculation on items co-rated by users, a collaborative filtering recommendation algorithm that combines improved similarity with matrix factorization is proposed. First, matrix factorization is used to reduce the dimensionality of the user rating matrix, mitigating the impact of data sparsity on recommendation accuracy. Second, the Bhattacharyya coefficient is incorporated into the user similarity formula to remove the reliance of traditional methods on ratings shared between users, and the user attribute similarity is fused with the improved similarity by weighting to address the scarcity of rating data for new users. Finally, recommendations are generated for target users from the predicted ratings. The algorithm is compared on the MovieLens data set with the traditional collaborative filtering algorithm, the algorithm improved with the Bhattacharyya coefficient and user attributes, and the recommendation algorithm based on matrix factorization, in terms of precision, recall, and F1 value. The experimental results show that, when the number of neighbors is 25, the proposed algorithm improves on these three algorithms in precision by 38.89%, 37.97%, and 33.71%, in recall by 38.92%, 38.05%, and 33.84%, and in F1 value by 38.88%, 38.02%, and 33.74%, respectively.
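The similarity step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' formula, which the abstract does not give. It assumes a 1-5 star rating scale, computes the Bhattacharyya coefficient over each user's overall rating histogram (so no co-rated items are needed), and fuses the result with an attribute similarity through a single weight. The names `rating_distribution`, `attribute_similarity`, `fused_similarity`, and the parameter `weight` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt

RATING_VALUES = (1, 2, 3, 4, 5)  # assumption: 1-5 star rating scale


def rating_distribution(ratings):
    """Empirical distribution of one user's ratings over RATING_VALUES."""
    if not ratings:
        raise ValueError("user has no ratings")
    counts = Counter(ratings)
    return [counts[v] / len(ratings) for v in RATING_VALUES]


def bhattacharyya_coefficient(p, q):
    """BC(p, q) = sum_i sqrt(p_i * q_i); 1 for identical, 0 for disjoint."""
    return sum(sqrt(a * b) for a, b in zip(p, q))


def attribute_similarity(attrs_a, attrs_b):
    """Hypothetical attribute similarity: share of matching common fields."""
    keys = attrs_a.keys() & attrs_b.keys()
    if not keys:
        return 0.0
    return sum(attrs_a[k] == attrs_b[k] for k in keys) / len(keys)


def fused_similarity(rating_sim, attribute_sim, weight=0.5):
    """Weighted fusion; `weight` is an assumed tuning parameter in [0, 1]."""
    return weight * rating_sim + (1 - weight) * attribute_sim


# Two users who share no rated items can still be compared.
u = rating_distribution([5, 5, 4, 3])
v = rating_distribution([5, 4, 4, 1])
a = attribute_similarity({"age": "25-34", "gender": "F"},
                         {"age": "25-34", "gender": "M"})
print(fused_similarity(bhattacharyya_coefficient(u, v), a, weight=0.7))
```

For a new user with few ratings, a smaller `weight` would shift the fused score toward the attribute similarity, which is the role the abstract assigns to user attributes.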
Authors
SUN Haijiao (孙海娇)
CHEN Hailong (陈海龙)
YAN Wuyue (闫五岳)
CHENG Miao (程苗)
College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
Source
Journal of Natural Science of Heilongjiang University (《黑龙江大学自然科学学报》)
CAS
2020, Issue 4, pp. 419-426 (8 pages)
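Funding
General Program of the National Natural Science Foundation of China (61772160)
Harbin Science and Technology Innovation Talent Research Special Fund Project (2017RAQXJ045)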
Keywords
collaborative filtering
matrix factorization
Bhattacharyya coefficient
user attribute