12 articles found
Data-driven human and bot recognition from web activity logs based on hybrid learning techniques
1
Authors: Marek Gajewski, Olgierd Hryniewicz, Agnieszka Jastrzębska, Mariusz Kozakiewicz, Karol Opara, Jan Wojciech Owsiński, Sławomir Zadrozny, Tomasz Zwierzchowski. Digital Communications and Networks (SCIE, CSCD), 2024, No. 4, pp. 1178-1188 (11 pages)
Distinguishing between web traffic generated by bots and humans is an important task in the evaluation of online marketing campaigns. One of the main challenges is related to only partial availability of the performance metrics: although some users can be unambiguously classified as bots, the correct label is uncertain in many cases. This calls for the use of classifiers capable of explaining their decisions. This paper demonstrates two such mechanisms based on features carefully engineered from web logs. The first is a man-made rule-based system. The second is a hierarchical model that first performs clustering and then classification using human-centred, interpretable methods. The stability of the proposed methods is analyzed, and a minimal set of features that convey the class-discriminating information is selected. The proposed data processing and analysis methodology is successfully applied to real-world data sets from online publishers.
Keywords: web logs; classification; clustering; web traffic; bots; interpretability
Web Mining Model Based on Rough Set Theory
2
Authors: 吴冰, 赵林度. Journal of Southeast University (English Edition) (EI, CAS), 2002, No. 1, pp. 54-58 (5 pages)
Because the Web log file contains a great deal of valuable information, the results of Web mining can be used to enhance decision making for electronic commerce (EC) operation and management. Because Web log files are ambiguous and abundant, a least decision-making model based on rough set theory is presented for Web mining, and an example is given to explain the model. The model can simplify the decision-making table so that the least solution of the table can be obtained. According to the least solution, the corresponding decisions for individual service can be made in sequence. Web mining based on rough set theory is also, at present, an original and distinctive method.
Keywords: web mining; rough sets; electronic commerce; knowledge reasoning; web log
Web multimedia information retrieval using improved Bayesian algorithm (cited by: 3)
3
Authors: 余铁军, 陈纯, 余铁民, 林怀忠. Journal of Zhejiang University Science (EI, CSCD), 2003, No. 4, pp. 415-420 (6 pages)
The main thrust of this paper is the application of a novel data mining approach to the log of users' feedback to improve web multimedia information retrieval performance. A user space model was constructed based on data mining and then integrated into the original information space model to improve the accuracy of the new information space model. It can remove clutter and irrelevant text information and help to eliminate the mismatch between the page author's expression and the user's understanding and expectation. The user space model was also utilized to discover the relationship between high-level and low-level features for assigning weights. The authors proposed an improved Bayesian algorithm for data mining. Experiments showed that the authors' proposed algorithm was efficient.
Keywords: relevant feedback; web log mining; improved Bayesian algorithm; user space model
Testing and Evaluation for Web Usability Based on Extended Markov Chain Model (cited by: 2)
4
Authors: MAO Cheng-ying, LU Yan-sheng. Wuhan University Journal of Natural Sciences (EI, CAS), 2004, No. 5, pp. 687-693 (7 pages)
With the increasing popularity and complexity of Web applications and the emergence of their new characteristics, the testing and maintenance of large, complex Web applications are becoming more complex and difficult. Web applications generally contain many pages and are used by enormous numbers of users. Statistical testing is an effective way of ensuring their quality. Web usage can be accurately described by a Markov chain, which has been proved to be an ideal model for software statistical testing. The results of unit testing can be utilized in the later stages, which is an important strategy for bottom-to-top integration testing; the other improvement of the extended Markov chain model (EMM) is to present the error type vector, which is treated as a part of the page node. This paper also proposes an algorithm for generating test cases of usage paths. Finally, optional usage reliability evaluation methods and an incremental usability regression testing model for testing and evaluation are presented.
CLC number: TP311.5. Foundation item: Supported by the National Defence Research Project (No. 41315.9.2) and the National Science and Technology Plan (2001BA102A04-02-03). Biography: MAO Cheng-ying (1978-), male, Ph.D. candidate, research direction: software testing. Research direction: advanced database system, software testing, component technology and data mining.
Keywords: statistical testing; evaluation for web usability; extended Markov chain model (EMM); web log mining; reliability evaluation
The design and implementation of web mining in web sites security (cited by: 2)
5
Authors: LI Jian, ZHANG Guo-yin, GU Guo-chang, LI Jian-li (College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China). Journal of Marine Science and Application, 2003, No. 1, pp. 81-86 (6 pages)
Backdoors or information leaks in Web servers can be detected by applying Web mining techniques to abnormal Web log and Web application log data. The security of Web servers can thereby be enhanced and the damage from illegal access avoided. Firstly, a system for discovering the patterns of information leakage in CGI scripts from Web log data was proposed. Secondly, those patterns were provided to system administrators so that they could modify their code and enhance their Web site security. The following aspects were described: one is to combine the web application log with the web log to extract more information, so that web data mining can be used to mine the web log for information that the firewall and the Information Detection System cannot find. Another is to propose an operation module of the web site to enhance Web site security. In the cluster server session, a density-based clustering technique is used to reduce resource cost and obtain better efficiency.
Keywords: data mining; web log mining; web sites security; density-based clustering
Fuzzy Clustering Method for Web User Based on Pages Classification (cited by: 2)
6
Authors: ZHAN Li-qiang, LIU Da-xin. Wuhan University Journal of Natural Sciences (EI, CAS), 2004, No. 5, pp. 553-556 (4 pages)
A new method for fuzzy clustering of Web users based on analysis of user interest characteristics is proposed in this article. The method first defines fuzzy page categories according to the links on the index page of the site, then computes the fuzzy degree of cross pages by aggregating Web log data. After that, using the fuzzy comprehensive evaluation method, it constructs user interest vectors according to page viewing times and frequency of hits, and derives the fuzzy similarity matrix for the Web users from the interest vectors. Finally, it obtains the clustering result through the fuzzy clustering method. The experimental results show the effectiveness of the method.
CLC number: TP18; TP311; TP391. Foundation item: Supported by the Natural Science Foundation of Heilongjiang Province of China (F0304). Biography: ZHAN Li-qiang (1966-), male, Lecturer, Ph.D., research direction: the theory methods of data mining and theory of database.
Keywords: web log mining; fuzzy similarity matrix; fuzzy comprehensive evaluation; fuzzy clustering
Semantic Session Analysis for Web Usage Mining (cited by: 1)
7
Authors: ZHANG Hui, SONG Hantao, XU Xiaomei. Wuhan University Journal of Natural Sciences (CAS), 2007, No. 5, pp. 773-776 (4 pages)
A semantic session analysis method for partitioning Web usage logs is presented. A semantic Web usage log preparation model enhances usage logs with semantics. A Markov chain model based on ontology semantic measurement is used to identify which active session a request should belong to. The competitive method is applied to determine the end of the sessions. Compared with other algorithms, more successful sessions are additionally detected by semantic outlier analysis.
Keywords: web usage mining; web log preparation; session analysis
Asynchronous and Synchronous Communication in College English Writing in Web-based Learning Environment
8
Author: 丁洁. 《科技信息》, 2009, No. 1 (2 pages)
In a web-based learning environment, College English writing has always been a thorny issue. Here, both asynchronous and synchronous communication in college English writing represent the new interactive teaching belief. This paper attempts to blend the two into the traditional learning and teaching of writing in college English in order to promote a more flexible, efficient and interactive learning environment in accordance with students' interests and needs.
Keywords: asynchronous and synchronous communication; web-based learning; e-mail; key pal; web log
An Effective Network Traffic Data Control Using Improved Apriori Rule Mining (cited by: 1)
9
Authors: Subbiyan Prakash, Murugasamy Vijayakumar. Circuits and Systems, 2016, No. 10, pp. 3162-3173 (12 pages)
The increasing usage of the internet requires a significant system for effective communication. To provide effective communication for internet users, based on the nature of their queries, the shortest routing path is usually preferred for data forwarding. But when a large amount of data chooses the same path, a bottleneck occurs in the traffic, which leads to data loss or provides irrelevant data to the users. In this paper, a Rule Based System using Improved Apriori (RBS-IA) rule mining framework is proposed for effective monitoring of traffic occurrence over the network and for controlling the network traffic. The RBS-IA framework integrates both the traffic control and decision-making systems to enhance internet usage. At first, the network traffic data are analyzed and the incoming and outgoing data information is processed using the apriori rule mining algorithm. After the set of rules is generated, the network traffic condition is analyzed. Based on the traffic conditions, the decision rule framework is introduced, which derives and assigns the set of suitable rules to the appropriate states of the network. The decision rule framework improves the effectiveness of network traffic control by updating the traffic condition states for identifying the relevant route path for packet data transmission. Experimental evaluation is conducted by extracting the Dodgers loop sensor data set from the UCI repository to assess the effectiveness of the proposed RBS-IA rule mining framework. Performance evaluation shows that the proposed RBS-IA rule mining framework provides significant improvement in managing the network traffic control scheme. The RBS-IA rule mining framework is evaluated on factors such as the accuracy of the decision obtained, the interestingness measure and execution time.
Keywords: network traffic; internet; traffic condition; rule mining; decision rule framework; interestingness; traffic data; web log
Discovering User Profiles for Web Personalized Recommendation (cited by: 2)
10
Authors: Ai-Bo Song, Mao-Xian Zhao, Zuo-Peng Liang, Yi-Sheng Dong, Jun-Zhou Luo. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2004, No. 3, pp. 320-328 (9 pages)
With the growing popularity of the World Wide Web, a large volume of user access data has been gathered automatically by Web servers and stored in Web logs. Discovering and understanding user behavior patterns from log files can provide Web personalized recommendation services. In this paper, a novel clustering method for log files is presented, called Clustering large Weblog based on Key Path Model (CWKPM), which is based on a user browsing key path model, to obtain user behavior profiles. Compared with the previous Boolean model, the key path model considers the major features of users' access to the Web: ordinal, contiguous and duplicate. Moreover, for clustering, it has fewer dimensions. The analysis and experiments show that CWKPM is an efficient and effective approach for clustering large and high-dimension Web logs.
Keywords: web log; user profile; personalization; generalized suffix tree; clustering
Web log classification framework with data augmentation based on GANs (cited by: 1)
11
Authors: He Mingshu, Jin Lei, Wang Xiaojuan, Li Yuan. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2020, No. 5, pp. 34-46 (13 pages)
Attacks on web servers are among the most serious threats in the field of network security. Analyzing logs of web attacks is an effective approach for identifying malicious behavior. Traditionally, machine learning models based on labeled data have been popular identification methods. Some deep learning models have also recently been introduced for analyzing logs based on web log classification. However, they are limited by the amount of labeled data available for model training. Web logs with labels that mark specific categories of data are difficult to obtain. Consequently, it is necessary to address the problem of data generation, with a focus on learning similar feature representations from the original data and improving the accuracy of the classification model. In this paper, a novel framework is proposed, which differs in two important aspects: one is that long short-term memory (LSTM) is incorporated into generative adversarial networks (GANs) to generate the logs of web attacks. The other is that a data augmentation model is proposed that adds logs of web attacks generated by GANs to the original dataset and improves the performance of the classification model. The experimental results demonstrate the effectiveness of the proposed method, which improved the classification accuracy from 89.04% to 95.04%.
Keywords: generative adversarial networks (GANs); web log; data augmentation; classification
Conceptualizing Mining of Firm's Web Log Files (cited by: 1)
12
Authors: Ruangsak TRAKUNPHUTTHIRAK, Yen CHEUNG, Vincent C.S. LEE. Journal of Systems Science and Information (CSCD), 2017, No. 6, pp. 489-510 (22 pages)
In this era of a data-driven society, useful data (Big Data) is often unintentionally ignored owing to a lack of convenient tools and to expensive software. For example, web log files can be used to identify explicit information about browsing patterns when users access web sites. Some hidden information, however, cannot be directly derived from the log files. We may need external resources to discover more knowledge from browsing patterns. The purpose of this study is to investigate the application of web usage mining based on web log files. The outcome of this study sets further directions for this investigation into what implicit information embedded in log files can be extracted, and how it can be extracted efficiently and effectively. Further work involves combining the use of social media data to improve business decision quality.
Keywords: web usage mining; web log files; Big Data; machine learning; business intelligence