Single-choice question: One of the difficulties in building an SQL-like query language for the Web is the absence of a database schema for this huge, heterogeneous repository of information. However, if we are interested in HTML documents only, we can construct a virtual (66) from the implicit structure of these files. Thus, at the highest level of (67), every such document is identified by its Uniform Resource Locator (URL) and has a title and a text. Also, Web servers provide some additional information, such as the type, length, and last modification date of a document. So, for data mining purposes, we can consider the set of all HTML documents as a relation:
Document (url, (68), text, type, length, modify)
where all the (69) are character strings. In this framework, an individual document is identified with a (70) in this relation. Of course, if some optional information is missing from the HTML document, the associated fields will be left blank, but this is not uncommon in any database.

[Correct answer] D

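[Explanation] [Reference translation] One of the difficulties in building an SQL-like query language for the Web is the lack of a database schema for this huge, heterogeneous repository of information. However, if we restrict ourselves to HTML documents, we can construct a virtual schema from the implicit structure of these files. Thus, at the highest level of abstraction, every such document is identified by its Uniform Resource Locator (URL) and has a title and a text. In addition, Web servers provide some further information, such as the type, length, and last modification date of a document. So, for data mining purposes, we can consider the set of all HTML documents as a relation:
Document (url, title, text, type, length, modify)
where all the attributes are character strings. In this framework, an individual document is identified with a tuple in this relation. Of course, if some optional information is missing from an HTML document, the corresponding fields are left blank, but this is common in any database.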
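
The relation described in the passage can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the query language the passage refers to: it uses Python's standard `sqlite3` module with an in-memory database, the two sample rows and the `example.org` URLs are made up, and the helper name `find_by_title` is hypothetical. Every attribute is stored as a character string, each document is one tuple, and missing optional information is left as an empty string.

```python
import sqlite3

# In-memory database holding the virtual relation
# Document(url, title, text, type, length, modify).
# All attributes are character strings, as the passage states.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Document ('
    '"url" TEXT PRIMARY KEY, "title" TEXT, "text" TEXT, '
    '"type" TEXT, "length" TEXT, "modify" TEXT)'
)

# Hypothetical sample data: each HTML document is one tuple.
# The second document lacks a title, length, and modification date,
# so those fields are left blank.
rows = [
    ("http://example.org/a.html", "Query languages",
     "SQL-like queries for the Web", "text/html", "2048", "1996-05-01"),
    ("http://example.org/b.html", "",
     "An untitled page", "text/html", "", ""),
]
conn.executemany("INSERT INTO Document VALUES (?, ?, ?, ?, ?, ?)", rows)


def find_by_title(conn, keyword):
    """Return the URLs of documents whose title contains the keyword."""
    cur = conn.execute(
        'SELECT "url" FROM Document WHERE "title" LIKE ? ORDER BY "url"',
        (f"%{keyword}%",),
    )
    return [row[0] for row in cur]


# Documents whose optional title field was left blank.
untitled = [
    row[0]
    for row in conn.execute(
        'SELECT "url" FROM Document WHERE "title" = ? ORDER BY "url"', ("",)
    )
]
```

Calling `find_by_title(conn, "Query")` selects tuples by an attribute value, which is the kind of SQL-like query the virtual schema is meant to make possible.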