Abstract
In Web log mining, data preprocessing is the foundation of the entire mining process and directly affects the quality and results of the mining. This paper proposes a data preprocessing method for Web log mining based on a user access tree. During preprocessing, the method builds a user access tree from the Web logs and uses it to identify users and transactions, so the logs can be preprocessed accurately even when the site's topology is unavailable.
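The abstract does not spell out how the tree is built, so the following is only a minimal Python sketch of the general idea, not the paper's algorithm. It assumes that a user is identified by the (IP address, user agent) pair, that each request is attached under the most recent page matching its referrer, and that each root-to-leaf path is treated as one transaction. The names `Node`, `build_access_trees`, and `transactions` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """One page visit in a user's access tree (hypothetical structure)."""
    url: str
    children: List["Node"] = field(default_factory=list)


def build_access_trees(records) -> Dict[Tuple[str, str], Node]:
    """Build one access tree per user from log records.

    Each record is (ip, agent, timestamp, url, referrer).
    Assumption: a user is the (ip, agent) pair, and the referrer field
    is enough to place a request without knowing the site topology.
    """
    trees: Dict[Tuple[str, str], Node] = {}
    seen_by_user: Dict[Tuple[str, str], Dict[str, Node]] = {}
    for ip, agent, _ts, url, referrer in sorted(records, key=lambda r: r[2]):
        user = (ip, agent)
        root = trees.setdefault(user, Node(url="<root>"))
        seen = seen_by_user.setdefault(user, {})
        # Attach under the latest visit of the referrer page; if the
        # referrer was never visited by this user, attach under the root.
        parent = seen.get(referrer, root)
        node = Node(url)
        parent.children.append(node)
        seen[url] = node
    return trees


def transactions(root: Node) -> List[List[str]]:
    """Return every root-to-leaf path as one transaction (an assumption)."""
    paths: List[List[str]] = []

    def walk(node: Node, path: List[str]) -> None:
        if not node.children:
            paths.append(path)
            return
        for child in node.children:
            walk(child, path + [child.url])

    for child in root.children:
        walk(child, [child.url])
    return paths
```

Calling `build_access_trees` on a list of parsed log records returns one tree per assumed user, and `transactions` can then be applied to each tree's root to list that user's access paths.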
Source
Computer Science (《计算机科学》)
CSCD
PKU Core Journal (北大核心)
2009, No. 9, pp. 154-156, 210 (4 pages)
Keywords
Web log mining, Data preprocessing, User identification, Transaction identification