
Webpage filter algorithm model and its key algorithms based on network security (基于网络安全的网页过滤模型及其关键算法) Cited by: 1
Abstract: As the World Wide Web continues to grow, identifying illegal text that carries harmful information across the large volume of Web pages, and blocking it effectively, is an emerging area of information filtering research. Building on traditional keyword-based methods, the crawled pages are first preprocessed and a weighted keyword dictionary is constructed. Then, applying the notion of same-category words from Chinese corpora and working from the perspective of lexical relevance, a relevance filtering algorithm based on the mean weight of same-category words is proposed. Finally, the algorithm is evaluated from two perspectives; according to the authors, it is more efficient and copes well with the anti-keyword-filtering strategies used by objectionable websites.
Source: Journal of Central South University of Forestry & Technology (《中南林业科技大学学报》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2011, Issue 12, pp. 197-201 (5 pages)
Keywords: webpage filtering; matrix dictionary; weight mean
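
The abstract describes scoring pages with a weighted keyword dictionary in which same-category words share a mean weight. The record gives no formulas, so the following is only a minimal sketch of one way such a scheme could work, not the authors' algorithm. The dictionary contents, the weights, the threshold, and the function names (`category_mean_weights`, `page_score`, `should_block`) are all hypothetical, and plain substring counting stands in for the preprocessing step.

```python
from statistics import mean

# Hypothetical weighted dictionary: category -> {word: weight}.
# Neither the categories nor the weights come from the paper.
CATEGORY_DICT = {
    "gambling": {"casino": 0.9, "betting": 0.7, "jackpot": 0.5},
    "violence": {"weapon": 0.8, "attack": 0.4},
}


def category_mean_weights(dictionary):
    """Return the mean weight of the words in each category."""
    return {cat: mean(words.values()) for cat, words in dictionary.items()}


def page_score(text, dictionary):
    """Sum, over categories, (number of keyword hits) * (category mean weight).

    Every hit is scored with its category's mean weight rather than the
    word's own weight, so swapping a heavily weighted word for a lightly
    weighted word of the same category does not lower the score.
    """
    lowered = text.lower()
    means = category_mean_weights(dictionary)
    score = 0.0
    for cat, words in dictionary.items():
        # Substring counting; no word segmentation is attempted here.
        hits = sum(lowered.count(word) for word in words)
        score += hits * means[cat]
    return score


def should_block(text, dictionary, threshold=1.0):
    """Flag a page whose score reaches the (assumed) threshold."""
    return page_score(text, dictionary) >= threshold
```

Under these assumptions, `should_block("Visit our casino for a jackpot", CATEGORY_DICT)` flags the page because two hits in the same category each contribute that category's mean weight, while a page with no dictionary words scores zero.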

References (4)

  • 1 Knuth D E, Morris J H, Pratt V R. Fast pattern matching in strings[J]. SIAM Journal on Computing, 1977, 6(2): 323-350.
  • 2 Boyer R S, Moore J S. A fast string searching algorithm[J]. Communications of the ACM, 1977, 20(10): 762-772.
  • 3 Aho A V, Corasick M J. Efficient string matching: an aid to bibliographic search[J]. Communications of the ACM, 1975, 18(6): 333-340.
  • 4 Han Kesong, Wang Yongcheng, Teng Wei. Research on automatic extraction of the topics of Chinese text in Web pages[J]. 情报学报, 2001, 20(2): 217-223. Cited by: 12

Second-level references: 3

Co-citing documents: 11

Co-cited documents: 9

Citing documents: 1

Second-level citing documents: 3
