A Topic-Focused Crawling Strategy Based on Dynamic Evaluation of URL Link Structure

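Abstract: Building on an in-depth analysis of the hyperlink structure of HTML pages, this paper adds anchor-text content analysis weights and a dynamic evaluation strategy to improve the algorithm, producing a search strategy based on URL link structure that incorporates dynamic value. The improved algorithm assigns different weight factors to different link types and uses dynamic value evaluation to cross "tunnels". It simplifies the priority computation and effectively reduces both the "myopia" problem and "topic drift", making it an efficient and practical strategy for topic-focused collection.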
Author: 郑凯 (Zheng Kai)
Source: 《福建电脑》 (Journal of Fujian Computer), 2010, No. 2, pp. 83-84, 96 (3 pages in total)
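The abstract describes a link priority that combines a weight factor per link type, an anchor-text relevance score, and a dynamically evaluated value used to cross "tunnels". The record gives no formulas, so the following is only a minimal sketch under stated assumptions, not the paper's method. The link-type names, weight values, topic vocabulary, the `alpha` and `decay` parameters, and the linear combination itself are all hypothetical and chosen for illustration.

```python
import heapq

# Hypothetical weight factors per link type; the abstract gives no values.
LINK_TYPE_WEIGHTS = {
    "downward": 1.0,   # link to a deeper page on the same site
    "crosswise": 0.8,  # link to a sibling page
    "upward": 0.5,     # link back toward the site root
    "outward": 0.3,    # link to another site
}

# Illustrative topic vocabulary (an assumption, not from the paper).
TOPIC_TERMS = {"crawler", "search", "retrieval"}


def anchor_relevance(anchor_text):
    """Return the fraction of topic terms that occur in the anchor text."""
    words = set(anchor_text.lower().split())
    return len(words & TOPIC_TERMS) / len(TOPIC_TERMS)


def link_priority(link_type, anchor_text, parent_value, alpha=0.6, decay=0.5):
    """Combine anchor relevance with a decayed value inherited from the parent.

    The inherited term is what lets the crawler "tunnel": a link with an
    off-topic anchor still gets a non-zero priority when it sits on a page
    whose evaluated value (parent_value, in [0, 1]) is high.
    """
    inherited = decay * parent_value
    combined = alpha * anchor_relevance(anchor_text) + (1 - alpha) * inherited
    return LINK_TYPE_WEIGHTS[link_type] * combined


class Frontier:
    """Max-priority URL queue built on heapq (a min-heap, so scores are negated)."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker that keeps insertion order stable

    def push(self, url, priority):
        heapq.heappush(self._heap, (-priority, self._counter, url))
        self._counter += 1

    def pop(self):
        neg_priority, _, url = heapq.heappop(self._heap)
        return url, -neg_priority

    def __len__(self):
        return len(self._heap)
```

In this sketch a crawler would re-evaluate `parent_value` each time a page is downloaded, then push every outgoing link into the `Frontier` with the score from `link_priority`. Because the inherited term decays at each hop, a chain of off-topic pages loses priority quickly, which is one simple way to limit topic drift while still allowing short tunnels.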