Abstract
Collaborative filtering algorithms are widely used in personalized recommendation systems. To keep recommendation efficient and accurate as the number of users grows, this paper designs PK-CF, a collaborative filtering recommendation algorithm based on PCA dimensionality reduction and bisecting K-means clustering. To reduce the similarity calculation error caused by an extremely sparse user-item rating matrix, the algorithm applies principal component analysis (PCA) to the matrix, discarding dimensions that carry little information and retaining only those that best represent user characteristics. To reduce the time cost of similarity calculation in large-scale systems, it performs bisecting K-means clustering in the reduced low-dimensional vector space, which narrows the search range for the target user's nearest neighbors. Performance tests of the traditional collaborative filtering algorithm, a K-means-based collaborative filtering algorithm, and PK-CF on the MovieLens dataset show that PK-CF not only improves the precision and recall of the recommendation results but also achieves higher time efficiency.
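The pipeline summarized in the abstract (PCA on the rating matrix, bisecting K-means in the reduced space, then a neighbor search restricted to the target user's cluster) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the toy rating matrix, the choice of splitting the cluster with the largest within-cluster squared error, and the use of cosine similarity are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def pca_reduce(ratings, n_components):
    """Project each user (row) onto the top principal components."""
    centered = ratings - ratings.mean(axis=0)
    # Rows of vt are the principal directions of the centered matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def two_means(points, rng, n_iter=20):
    """Plain 2-means (Lloyd's algorithm); returns a boolean mask for cluster 1."""
    centers = points[rng.choice(len(points), size=2, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in (0, 1):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels == 1

def bisecting_kmeans(points, k, seed=0):
    """Repeatedly split the cluster with the largest SSE (assumed rule)."""
    rng = np.random.default_rng(seed)
    clusters = [np.arange(len(points))]
    while len(clusters) < k:
        sse = [((points[idx] - points[idx].mean(axis=0)) ** 2).sum()
               for idx in clusters]
        target = clusters.pop(int(np.argmax(sse)))
        if len(target) < 2:
            clusters.append(target)  # nothing left to split
            break
        mask = two_means(points[target], rng)
        if mask.all() or not mask.any():
            clusters.append(target)  # degenerate split, stop early
            break
        clusters.extend([target[mask], target[~mask]])
    labels = np.empty(len(points), dtype=int)
    for cluster_id, idx in enumerate(clusters):
        labels[idx] = cluster_id
    return labels

def nearest_neighbors(points, labels, user, n_neighbors):
    """Rank neighbors by cosine similarity, searching only the user's cluster."""
    members = np.flatnonzero(labels == labels[user])
    members = members[members != user]
    vecs = points[members]
    norms = np.linalg.norm(vecs, axis=1) * np.linalg.norm(points[user])
    sims = (vecs @ points[user]) / np.where(norms == 0, 1.0, norms)
    return members[np.argsort(-sims)[:n_neighbors]]

# Hypothetical toy data: 6 users x 5 items, 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 0, 1],
    [4, 5, 0, 1, 0],
    [5, 5, 1, 0, 0],
    [0, 1, 5, 4, 5],
    [1, 0, 4, 5, 4],
    [0, 0, 5, 5, 5],
], dtype=float)

low_dim = pca_reduce(ratings, n_components=2)
labels = bisecting_kmeans(low_dim, k=2)
neighbors = nearest_neighbors(low_dim, labels, user=0, n_neighbors=2)
print(labels, neighbors)
```

In this sketch the neighbor search for user 0 touches only the members of that user's cluster rather than all users, which is the source of the time saving the abstract describes. The number of retained components and the number of clusters are tuning parameters here, not values taken from the paper.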
Authors
CHEN Xi (陈希)
LI Ling-juan (李玲娟)
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Source
Computer Technology and Development (《计算机技术与发展》)
2020, Issue 2, pp. 138-142 (5 pages)
Funding
National Natural Science Foundation of China (61571238)
Keywords
principal component analysis
bisecting K-means clustering
collaborative filtering
personalized recommendation