
Discovery of Deep Web Interface Based on Heuristic Rules (cited by: 1)
Abstract: To make effective use of Deep Web resources, Deep Web data integration has become one of the active topics in current research. Discovering Deep Web sites efficiently is the foundation of, and the key to, Deep Web data integration. This paper proposes a Deep Web interface discovery method with two parts: determining suitable query submission terms based on domain knowledge, and discovering the Deep Web interfaces within a domain using heuristic rules. Experimental results show that the method achieves high accuracy and recall, and that it is feasible and practical.
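Source: Journal of Hebei University (Natural Science Edition) (《河北大学学报(自然科学版)》), 2010, No. 1, pp. 107-112 (6 pages). Indexed in CAS and the Peking University Core Journals list (北大核心).
Funding: Key Scientific Research Project of the Hebei Provincial Department of Education (ZH200804)
Keywords: domain knowledge; heuristic rules; Deep Web interface discovery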
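
The abstract names heuristic rules as the mechanism for recognizing Deep Web interfaces but does not list the rules themselves. The following is a minimal sketch of what a rule-based form filter could look like, not the authors' method. The four rules, the `DOMAIN_TERMS` and `NON_QUERY_MARKERS` vocabularies, and all function and class names are assumptions made for illustration (here, a hypothetical book-search domain).

```python
from html.parser import HTMLParser

# Hypothetical domain vocabulary; the paper's actual term list is not given.
DOMAIN_TERMS = {"title", "author", "isbn", "publisher"}
# Hypothetical markers of non-query forms (login, registration, subscription).
NON_QUERY_MARKERS = {"login", "password", "register", "subscribe"}


class FormCollector(HTMLParser):
    """Collects, for each <form>, its control types and the words it contains."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per form found
        self._current = None  # form being parsed, or None outside a form

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"controls": [], "words": set()}
            self.forms.append(self._current)
        elif self._current is not None and tag in ("input", "select", "textarea"):
            # <input> defaults to type "text" when the attribute is absent.
            kind = (attrs.get("type") or "text").lower() if tag == "input" else tag
            self._current["controls"].append(kind)
            for key in ("name", "id", "value"):
                self._add_words(attrs.get(key) or "")

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._add_words(data)

    def _add_words(self, text):
        # Lowercase and split on anything that is not a letter or digit.
        cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
        self._current["words"].update(cleaned.split())


def is_query_interface(form):
    """Apply the assumed heuristic rules to one collected form."""
    controls = form["controls"]
    if "password" in controls:               # rule 1: password field => login form
        return False
    if form["words"] & NON_QUERY_MARKERS:    # rule 2: login/registration wording
        return False
    fillable = [c for c in controls if c in ("text", "search", "select", "textarea")]
    if not fillable:                         # rule 3: nothing the user can query with
        return False
    return bool(form["words"] & DOMAIN_TERMS)  # rule 4: mentions a domain term


def find_query_interfaces(html):
    """Return the forms in `html` that pass every rule."""
    collector = FormCollector()
    collector.feed(html)
    return [f for f in collector.forms if is_query_interface(f)]
```

A page with one book-search form and one login form would yield only the search form. Thresholds and vocabularies of this kind would need tuning against labeled pages for each domain.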
