Abstract
This paper proposes a method for mining users' preferred browsing paths from Web logs, based on a "three-matrix" model. Building on a cell-array storage structure (the storage matrix), the method constructs a session matrix and a path matrix whose basic elements are browsing interest values. On the session matrix, the cosine of the angle between two page vectors serves as the page distance formula for similar users. Pages are clustered with this measure to obtain the set of related pages for similar users. Path choice preference is then applied to the path matrix of those similar users to mine their preferred browsing paths. Experimental results show that the method is reasonable and effective, and that it yields more accurate preferred browsing paths.
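The page-distance step described in the abstract can be illustrated with a minimal sketch. It assumes the session matrix is a list of rows (one per session) whose columns are pages and whose entries are browsing interest values. The function names, the threshold parameter, and the sample data are hypothetical and do not come from the paper. The paper's actual clustering procedure and its path choice preference formula are not given in the abstract and are not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def page_vectors(session_matrix):
    """Transpose the session matrix so that each page becomes one vector over sessions."""
    return [list(column) for column in zip(*session_matrix)]


def similar_page_pairs(session_matrix, threshold):
    """Return (page_i, page_j, cosine) for every page pair at or above the threshold.

    The threshold is an assumed parameter for this illustration only.
    """
    pages = page_vectors(session_matrix)
    pairs = []
    for i in range(len(pages)):
        for j in range(i + 1, len(pages)):
            score = cosine_similarity(pages[i], pages[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


# Hypothetical data: 3 sessions (rows) x 3 pages (columns) of interest values.
sessions = [
    [3, 6, 0],
    [1, 2, 0],
    [0, 0, 5],
]
pairs = similar_page_pairs(sessions, threshold=0.9)
```

In this toy data, pages 0 and 1 receive proportional interest across sessions, so their vectors point in the same direction and they are the only pair reported. Page 2 is visited in a disjoint session and has cosine 0 with both.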
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2009, No. 8, pp. 47-49 (3 pages)
Computer Engineering
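Funding
National Natural Science Foundation of China (60603047)
Scientific Research Fund for Higher Education Institutions, Liaoning Provincial Department of Education (2008341)
Liaoning Province Science and Technology Plan Fund (2008216014)
Dalian Outstanding Young Scientific and Technological Talents Fund (2008J23JH026)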
Keywords
Web logs
browsing interest level
page clustering algorithm