Abstract
Accurate scientific topic prediction can clarify the future direction of a discipline and inform development planning and management decisions in scientific research. This paper focuses on predicting emerging scientific topics. From the perspective of knowledge unit reorganization, it treats the representation relationship between topics and feature words as analogous to that between scientific concepts and knowledge units, and proposes a scientific topic prediction method on that basis. First, an LDA (latent Dirichlet allocation) topic model is used to obtain the global topics, feature words, and probability matrix, and the feature word vectors are obtained by transposing the vector space. Second, an ARIMA (autoregressive integrated moving average) model predicts the frequencies of the feature words, from which vector adjustment coefficients are calculated to yield the feature word prediction vectors; the t-SNE (t-distributed stochastic neighbor embedding) algorithm reduces the dimensionality of the prediction vectors, and the fuzzy C-means algorithm clusters the low-dimensional prediction vectors into predicted topics, thereby reorganizing the knowledge units. Finally, predicted topics that aggregate several original topics and carry a new interpretation are selected as the scientific topic prediction results. An empirical study is conducted on the field of "knowledge management-knowledge organization-knowledge service". The method predicts new terms and related topics that had not yet appeared in existing research in the field, such as think tanks, digital humanities, and knowledge payment, and it derives the basic meaning and related research content of these emerging topics through two topic mapping modes: direct aggregation of feature words and concept integration. The empirical results show that the proposed method can accurately predict emerging scientific topics.
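The core of the pipeline (transposing the topic-word matrix, adjusting word vectors by predicted frequency, and clustering with fuzzy C-means) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the formula for the vector adjustment coefficient, so the ratio of predicted to last observed frequency used here is an assumption. The LDA, ARIMA, and t-SNE steps are omitted: the toy `topic_word` matrix and the `predicted_freq` values are made-up stand-ins for their outputs, and all function names are hypothetical.

```python
import random

def transpose(matrix):
    """Turn a topic-by-word matrix into one vector per feature word."""
    return [list(col) for col in zip(*matrix)]

def adjust_vectors(word_vectors, last_freq, predicted_freq):
    """Scale each word vector by predicted/last frequency.

    ASSUMPTION: this ratio is only an illustrative coefficient; the
    abstract does not state the formula the paper actually uses.
    """
    adjusted = []
    for vec, last, pred in zip(word_vectors, last_freq, predicted_freq):
        coeff = pred / last if last > 0 else 1.0
        adjusted.append([coeff * x for x in vec])
    return adjusted

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=200, seed=0):
    """Plain fuzzy C-means; returns (centers, membership matrix)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    power = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership**m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Memberships follow from relative distances to every center.
        for i in range(n):
            dists = [max(sum((points[i][d] - c[d]) ** 2
                             for d in range(dim)) ** 0.5, 1e-12)
                     for c in centers]
            for k in range(n_clusters):
                u[i][k] = 1.0 / sum((dists[k] / dj) ** power for dj in dists)
    return centers, u

# Toy stand-in for an LDA topic-word probability matrix (2 topics x 4 words).
topic_word = [[0.45, 0.40, 0.05, 0.10],
              [0.05, 0.10, 0.45, 0.40]]
last_freq = [10, 20, 5, 8]        # last observed word frequencies (made up)
predicted_freq = [12, 18, 6, 8]   # stand-in for ARIMA forecasts (made up)

word_vectors = transpose(topic_word)
adjusted = adjust_vectors(word_vectors, last_freq, predicted_freq)
centers, memberships = fuzzy_c_means(adjusted, n_clusters=2)

# Hard label per word: the cluster with the highest membership degree.
labels = [max(range(2), key=lambda k: row[k]) for row in memberships]
print(labels)
```

Because fuzzy C-means assigns each word a membership degree in every cluster rather than a single label, a word can contribute to more than one predicted topic, which fits the idea of recombining knowledge units drawn from several original topics.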
Authors
梁继文
杨建林
王伟
Liang Jiwen; Yang Jianlin; Wang Wei (School of Information Management, Nanjing University, Nanjing 210023; Jiangsu Key Laboratory of Data Engineering & Knowledge Service, Nanjing 210023)
Source
《情报学报》
CSSCI
CSCD
Peking University Core Journal
2023, Issue 5, pp. 511-524 (14 pages)
Journal of the China Society for Scientific and Technical Information
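Funding
Key Project of the National Social Science Fund of China, "Research on Domain Knowledge Processing and Organization Models in the Big Data Environment" (20ATQ006).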
Keywords
knowledge unit
scientific concepts
science topics
topic prediction
vector adjustment