Abstract
This paper outlines the basic concepts and analysis methods of data mining, and on that basis analyzes and compares automatic text clustering algorithms. It then reviews research, both in China and abroad, on data mining and knowledge extraction from the MEDLINE literature database. Finally, it briefly describes the basic concepts and document format of the Extensible Markup Language (XML) and its applications in data management and data mining.
Source
《图书馆学研究》
CSSCI
2008, No. 1, pp. 22-24, 38 (4 pages)
Research on Library Science