Abstract
XPath plays an important role in Web information extraction, but XPath rules usually have to be written by hand. This paper discusses how to inductively learn XPath rules in an information extraction system that combines XPath with text-context-based rules. The generated XPath rules are structurally simple and give the text-context-based extraction system a reasonably accurate way to locate information.
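To make the idea of learning a simple XPath rule from examples concrete, here is a minimal sketch. It is not the paper's algorithm, which the abstract does not specify. It assumes the training examples are absolute paths made only of tag names with optional positional indices, and it generalizes them by keeping an index only where every example agrees. The function names `parse_path` and `generalize` are hypothetical.

```python
import re

# One location step: a tag name with an optional positional index, e.g. "td[3]".
STEP = re.compile(r"^([^\[\]]+)(?:\[(\d+)\])?$")


def parse_path(xpath):
    """Split an absolute path such as /html/body/tr[2] into (tag, index) steps."""
    steps = []
    for part in xpath.strip("/").split("/"):
        match = STEP.match(part)
        if match is None:
            raise ValueError(f"unsupported step: {part!r}")
        tag, index = match.group(1), match.group(2)
        steps.append((tag, int(index) if index else None))
    return steps


def generalize(xpaths):
    """Build one path covering all examples (assumed strategy, not the paper's).

    A positional index is kept only where every example has the same one;
    where the examples disagree, the index is dropped so the step matches
    any sibling with that tag.
    """
    if not xpaths:
        raise ValueError("at least one example path is required")
    parsed = [parse_path(p) for p in xpaths]
    first = parsed[0]
    tags = [tag for tag, _ in first]
    for steps in parsed[1:]:
        if [tag for tag, _ in steps] != tags:
            raise ValueError("examples must share the same tag sequence")
    out = []
    for i, (tag, index) in enumerate(first):
        agree = all(steps[i][1] == index for steps in parsed)
        out.append(f"{tag}[{index}]" if agree and index is not None else tag)
    return "/" + "/".join(out)


if __name__ == "__main__":
    examples = [
        "/html/body/table[1]/tr[2]/td[3]",
        "/html/body/table[1]/tr[5]/td[3]",
    ]
    print(generalize(examples))
```

A rule generalized this way only narrows the search to a region of the page. In a combined system of the kind the abstract describes, the text-context rules would then select the target value inside that region.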
Source
Computer Technology and Development (《计算机技术与发展》)
2007, No. 3, pp. 98-101 (4 pages)
Funding
High-Tech Research Program of Jiangsu Province (G2004034)