Abstract
[Purpose/significance] The explosive growth of information resources is pushing the organization of knowledge in scientific and technological literature toward automation, and automatic indexing is the foundation and key to building digital resources for such literature. To address two problems in current automatic indexing of scientific and technological literature, namely insufficiently precise semantic granularity and difficulty scaling to massive document collections, this paper proposes an automatic indexing method based on fine-grained semantic hierarchy. [Method/process] Building on traditional automatic indexing methods based on knowledge organization, the method mines the semantic resources in knowledge organization tools in depth and uses the semantic hierarchy among concepts to semantically extend concept information. It also designs a concept selection method based on fine-grained semantic hierarchy to address the low indexing efficiency of traditional methods, thereby achieving efficient concept indexing of large-scale literature. [Result/conclusion] Experimental results show that the proposed method achieves good concept representation, effectively reduces the probability of irrelevant concepts appearing in the indexing results, and greatly reduces the time required for indexing while improving the relevance of the indexing results to the literature. The method makes deeper use of knowledge organization tools in automatic indexing from the perspective of semantic relationships, and provides valuable reference and support for mining and computing over digital resources in scientific and technological literature.
Source
《情报理论与实践》
CSSCI
Peking University Core Journals (北大核心)
2024, No. 5, pp. 194-203, 193 (11 pages in total)
Information Studies: Theory & Application
Keywords
automatic indexing
semantic hierarchy
semantic extension
semantic relationship
word vector