92,452 articles found
Aspect-Based Sentiment Classification Using Deep Learning and Hybrid of Word Embedding and Contextual Position
1
Authors: Waqas Ahmad, Hikmat Ullah Khan, Fawaz Khaled Alarfaj, Saqib Iqbal, Abdullah Mohammad Alomair, Naif Almusallam. 《Intelligent Automation & Soft Computing》 (SCIE), 2023, No. 9, pp. 3101-3124 (24 pages)
Aspect-based sentiment analysis aims to detect and classify sentiment polarities as negative, positive, or neutral while associating them with their identified aspects from the corresponding context. In this regard, prior methodologies widely utilize either word embedding or tree-based representations. Meanwhile, the separate use of deep features such as word embedding and tree-based dependencies has become a significant cause of information loss. Generally, word embedding preserves the syntactic and semantic relations between a pair of terms in a sentence. Besides, the tree-based structure conserves the grammatical and logical dependencies of context. In addition, sentence-oriented word position is a critical factor that influences the contextual information of a targeted sentence. Therefore, knowledge of the position-oriented information of words in a sentence has been considered significant. In this study, we propose to use word embedding, tree-based representation, and contextual position information in combination to evaluate whether their combination improves the effectiveness of the results. Their joint utilization enhances the accurate identification and extraction of targeted aspect terms, which also influences their classification. In this research paper, we propose a method named Attention Based Multi-Channel Convolutional Neural Network (Att-MC-CNN) that jointly utilizes these three deep features: word embedding, tree-based structure, and contextual position information. These three features are delivered to a Multi-Channel Convolutional Neural Network (MC-CNN) that identifies and extracts the potential terms and classifies their polarities. In addition, these terms are further filtered with the attention mechanism, which determines the most significant words. The empirical analysis shows the proposed approach’s effectiveness compared to existing techniques when evaluated on standard datasets. The experimental results show that our approach outperforms them in the F1 measure, with an overall achievement of 94% in identifying aspects and 92% in the task of sentiment classification.
Keywords: sentiment analysis; word embedding; aspect extraction; consistency tree; multichannel convolutional neural network; contextual position information
Enhanced Image Captioning Using Features Concatenation and Efficient Pre-Trained Word Embedding
2
Authors: Samar Elbedwehy, T. Medhat, Taher Hamza, Mohammed F. Alrahmawy. 《Computer Systems Science & Engineering》 (SCIE, EI), 2023, No. 9, pp. 3637-3652 (16 pages)
One of the issues in Computer Vision is the automatic development of descriptions for images, sometimes known as image captioning. Deep Learning techniques have made significant progress in this area. The typical architecture of image captioning systems consists mainly of an image feature extractor subsystem followed by a caption generation lingual subsystem. This paper aims to find optimized models for these two subsystems. For the image feature extraction subsystem, the research tested eight different concatenations of pairs of vision models to find the most expressive extracted feature vector of the image. For the caption generation lingual subsystem, this paper tested three different pre-trained language embedding models, GloVe (Global Vectors for Word Representation), BERT (Bidirectional Encoder Representations from Transformers), and TaCL (Token-aware Contrastive Learning), to select the most accurate one. Our experiments showed that an image captioning system that uses a concatenation of the two Transformer-based models SWIN (Shifted window) and PVT (Pyramid Vision Transformer) as the image feature extractor, combined with the TaCL language embedding model, gives the best result among the tested combinations.
Keywords: image captioning; word embedding; concatenation; transformer
Word Embeddings and Semantic Spaces in Natural Language Processing
3
Authors: Peter J. Worth. 《International Journal of Intelligence Science》, 2023, No. 1, pp. 1-21 (21 pages)
One of the critical hurdles, and breakthroughs, in the field of Natural Language Processing (NLP) in the last two decades has been the development of techniques for text representation that solve the so-called curse of dimensionality, a problem which plagues NLP in general given that the feature set for learning starts as a function of the size of the language in question, upwards of hundreds of thousands of terms typically. As such, much of the research and development in NLP in the last two decades has been in finding and optimizing solutions to this problem, to feature selection in NLP effectively. This paper looks at the development of these various techniques, leveraging a variety of statistical methods which rest on linguistic theories that were advanced in the middle of the last century, namely the distributional hypothesis, which suggests that words that are found in similar contexts generally have similar meanings. In this survey paper we look at the development of some of the most popular of these techniques from a mathematical as well as data structure perspective, from Latent Semantic Analysis to Vector Space Models to their more modern variants, which are typically referred to as word embeddings. In this review of algorithms such as Word2Vec, GloVe, ELMo and BERT, we explore the idea of semantic spaces more generally beyond applicability to NLP.
Keywords: Natural Language Processing; Vector Space Models; Semantic Spaces; word embeddings; Representation Learning; Text Vectorization; Machine Learning; Deep Learning
Pattern Matching of Industrial Alarm Floods Using Word Embedding and Dynamic Time Warping
4
Authors: Wenkai Hu, Xiangxiang Zhang, Jiandong Wang, Guang Yang, Yuxin Cai. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2023, No. 4, pp. 1096-1098 (3 pages)
Dear Editor, This letter proposes a new pattern matching method based on word embedding and dynamic time warping (DTW) to identify groups of similar alarm floods. First, alarm messages are transformed into numeric values that represent alarms and also reflect the relationships between alarm occurrences. Then, similarities between numerically encoded alarm flood sequences are calculated by DTW, and groups of similar floods are identified via clustering. The effectiveness of the proposed method is demonstrated by a case study with alarm & event data obtained from a public industrial simulation model.
Keywords: word; alarm; DTW
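The DTW step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the letter's implementation: it assumes each alarm has already been encoded as a single number and uses the absolute difference as the local cost, whereas the letter derives its encodings from word embedding. The function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    Assumption: alarms are scalar codes and the local cost is |x - y|.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, or match
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


# Two floods that differ only by a repeated alarm align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

A pairwise matrix of such distances could then be passed to any clustering routine to group similar floods; which clustering method the letter uses is not stated in the abstract.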
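Short Text Feature Expansion and Classification Based on Word Embedding (Cited by: 8)
5
Authors: 孟欣, 左万利. 《小型微型计算机系统》 (CSCD, PKU Core), 2017, No. 8, pp. 1712-1717 (6 pages)
The surge of short texts in recent years has challenged traditional automatic text classification techniques. To address the sparse features and low feature coverage of short texts, a classification method that expands short-text features using word embedding is proposed. Word embedding is a distributed representation of words in the form of low-dimensional continuous vectors, and a well-trained word embedding model can encode many linguistic regularities and patterns. This paper uses the spatial distribution of word embeddings and the linear regularities they contain to propose a new text feature expansion method. With the expanded features, short text classification experiments were run on two datasets, Google search snippets and China Daily news summaries; compared with a classification method that represents text features with bag-of-words only, accuracy improved by 8.59% and 7.42%, respectively.
Keywords: word embedding; text features; semantic reasoning; short text classification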
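Disambiguation of Alphabetic Abbreviation Terms Based on Word Embedding Semantic Similarity (Cited by: 6)
6
Authors: 于东, 荀恩东. 《中文信息学报》 (CSCD, PKU Core), 2014, No. 5, pp. 51-59 (9 pages)
This paper proposes a Word Embedding-based semantic representation for the multiple senses of an ambiguous word, enabling knowledge-base-driven unsupervised disambiguation of alphabetic abbreviation terms. The method clusters in two steps. First, salient-similarity clustering is used to obtain high-confidence clusters, from which a document set with semantic labels is built as training data. Multiple Word Embedding models are trained on this data, and the mean cosine similarity represents the semantic relation between two words. In the second clustering step, feature word expansion and semantic linear weighting are proposed to improve ambiguity discrimination and disambiguation performance. The method expands the feature word set of the document to be disambiguated according to semantic similarity, mines semantic information missing from the clustered documents, and linearly weights feature words by semantic similarity. Disambiguation experiments on 25 polysemous abbreviation terms show that feature word expansion raises the system F-score by about 4%, and semantic linear weighting raises it by about a further 2%, to 89.40%.
Keywords: alphabetic abbreviation terms; term disambiguation; word embedding; semantic similarity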
Novel Representations of Word Embedding Based on the Zolu Function
7
Authors: Jihua Lu, Youcheng Zhang. 《Journal of Beijing Institute of Technology》 (EI, CAS), 2020, No. 4, pp. 526-530 (5 pages)
Two learning models, Zolu-continuous bags of words (ZL-CBOW) and Zolu-skip-grams (ZL-SG), based on the Zolu function are proposed. The slope of ReLU in word2vec has been changed by the Zolu function. The proposed models can process extremely large data sets as well as word2vec without increasing the complexity. Also, the models outperform several word embedding methods both in word similarity and syntactic accuracy. The method of ZL-CBOW outperforms CBOW in accuracy by 8.43% on the training set of capital-world, and by 1.24% on the training set of plural-verbs. Moreover, experimental simulations on word similarity and syntactic accuracy show that ZL-CBOW and ZL-SG are superior to LL-CBOW and LL-SG, respectively.
Keywords: Zolu function; word embedding; continuous bags of words; word similarity; accuracy
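Remote Sensing Image Detection and Segmentation Based on Word Embedding (Cited by: 5)
8
Authors: 尤洪峰, 田生伟, 禹龙, 吕亚龙. 《电子学报》 (EI, CAS, CSCD, PKU Core), 2020, No. 1, pp. 75-83 (9 pages)
Remote sensing image detection and segmentation usually requires extracting image features and mining deep image features with deep learning algorithms. However, traditional features (such as color, texture, and spatial relation features) cannot fully describe the semantic information of an image, and single-structure or serial algorithms cannot fully mine the deep features and contextual semantic information of an image. To address these problems, this paper first uses word embedding to map spatial relation features into dense real-valued vectors and combines them with color and texture features. Second, it builds a parallel algorithm for remote sensing image detection and segmentation based on attention-guided graph convolution networks and an independently recurrent neural network (Attention Graph Convolution Networks and Independently Recurrent Neural Network, ATGIR). The algorithm first assigns probability weights to the combined features through the attention mechanism; it then uses graph convolution networks (GCNs) to further mine the highly weighted features and generate direction labels, while an independently recurrent neural network (IndRNN) mines the contextual information in the image features; finally, a Sigmoid classifier completes the detection and segmentation task. Taking detection and segmentation of Populus euphratica forest remote sensing images as an example, we verify that the proposed feature extraction method and the ATGIR algorithm effectively improve performance on this task.
Keywords: attention mechanism; graph convolution network; independently recurrent neural network; parallel algorithm; word embedding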
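Sentiment Classification Model Based on Word Embedding and CNN (Cited by: 19)
9
Authors: 蔡慧苹, 王丽丹, 段书凯. 《计算机应用研究》 (CSCD, PKU Core), 2016, No. 10, pp. 2902-2905, 2909 (5 pages)
This work attempts to combine word embedding with a convolutional neural network (CNN) to solve the sentiment classification problem. First, the skip-gram model is used to train a word embedding for every word in the dataset; the word embeddings appearing in each sample are then assembled into a two-dimensional feature matrix that serves as the CNN input, and the input features are also updated as parameters in each training iteration. Second, a network structure with three different convolution kernel sizes is designed to automatically extract multiple kinds of local abstract features. Compared with traditional machine learning methods, the proposed sentiment classification model based on word embedding and CNN improves classification accuracy by 5.04%.
Keywords: convolutional neural network; natural language processing; deep learning; word embedding; sentiment classification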
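Word Embedding Algorithms Based on Count Models
10
Authors: 裴楠, 王裴岩, 张桂平. 《沈阳航空航天大学学报》, 2017, No. 2, pp. 66-72 (7 pages)
Word Embedding is a very popular technique for text processing tasks. Compared with prediction models, count-based Word Embedding is simple, fast, easy to train, and good at capturing word similarity. Based on count models, five Word Embedding models were built by choosing from two kinds of context, two weighting methods, and two similarity measures. The five models were compared and analyzed on a word similarity task. Word representations after dimensionality reduction were found to outperform those before reduction, and among the five models, the one using window context, PMI weighting, and cosine similarity performed best on the word similarity task. The five models were also compared with the prediction-based Skip-gram model; the results show that with 100-dimensional training vectors, most count-based models match or exceed Skip-gram on the word similarity task.
Keywords: word representation; count model; distributed word representation; word similarity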
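The best-performing configuration named in the abstract above (window context, PMI weighting, cosine similarity) can be sketched in a few lines. This is an illustration under assumptions, not the paper's code: it uses positive PMI (negative values dropped), a symmetric window, sparse dictionary vectors, and omits the dimensionality-reduction step the abstract reports as beneficial. All function names are hypothetical.

```python
import math
from collections import Counter


def cooccurrence_counts(sentences, window=2):
    """Count (word, context) pairs within a symmetric window."""
    pair_counts = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair_counts[(word, tokens[j])] += 1
    return pair_counts


def ppmi_vectors(pair_counts):
    """Turn pair counts into sparse positive-PMI vectors, one per word."""
    total = sum(pair_counts.values())
    word_totals, context_totals = Counter(), Counter()
    for (w, c), n in pair_counts.items():
        word_totals[w] += n
        context_totals[c] += n
    vectors = {}
    for (w, c), n in pair_counts.items():
        pmi = math.log(n * total / (word_totals[w] * context_totals[c]))
        if pmi > 0:  # keep only positive associations (assumption: PPMI)
            vectors.setdefault(w, {})[c] = pmi
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


corpus = [["cat", "eats", "fish"], ["dog", "eats", "meat"]]
vecs = ppmi_vectors(cooccurrence_counts(corpus, window=1))
print(cosine(vecs["cat"], vecs["dog"]))
```

In this toy corpus "cat" and "dog" share their only context word, so their vectors point in the same direction; a realistic run would need a much larger corpus and, following the abstract, a reduction step such as truncated SVD.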
TWE‐WSD: An effective topical word embedding based word sense disambiguation
11
Authors: Lianyin Jia, Jilin Tang, Mengjuan Li, Jinguo You, Jiaman Ding, Yinong Chen. 《CAAI Transactions on Intelligence Technology》 (EI), 2021, No. 1, pp. 72-79 (8 pages)
Word embedding has been widely used in word sense disambiguation (WSD) and many other tasks in recent years because it can represent the semantics of words well. However, existing word embedding methods mostly represent each word as a single vector, without considering the homonymy and polysemy of the word; thus, their performance is limited. To address this problem, an effective topical word embedding (TWE)‐based WSD method, named TWE‐WSD, is proposed, which integrates Latent Dirichlet Allocation (LDA) and word embedding. Instead of generating a single word vector (WV) for each word, TWE‐WSD generates a topical WV for each word under each topic. Effective integrating strategies are designed to obtain high quality contextual vectors. Extensive experiments on SemEval‐2013 and SemEval‐2015 for English all‐words tasks showed that TWE‐WSD outperforms other state‐of‐the‐art WSD methods, especially on nouns.
Keywords: embedding; word; WSD
Bayesian estimation-based sentiment word embedding model for sentiment analysis
12
Authors: Jingyao Tang, Yun Xue, Ziwen Wang, Shaoyang Hu, Tao Gong, Yinong Chen, Haoliang Zhao, Luwei Xiao. 《CAAI Transactions on Intelligence Technology》 (SCIE, EI), 2022, No. 2, pp. 144-155 (12 pages)
Sentiment word embedding has been extensively studied and used in sentiment analysis tasks. However, most existing models have failed to differentiate high-frequency and low-frequency words. Accordingly, the sentiment information of low-frequency words is insufficiently captured, resulting in inaccurate sentiment word embedding and degradation of the overall performance of sentiment analysis. A Bayesian estimation-based sentiment word embedding (BESWE) model, which aims to precisely extract the sentiment information of low-frequency words, has been proposed. In the model, a Bayesian estimator is constructed based on the co-occurrence probabilities and sentiment probabilities of words, and a novel loss function is defined for sentiment word embedding learning. The experimental results based on the sentiment lexicons and the Movie Review dataset show that BESWE outperforms many state-of-the-art methods, for example, C&W, CBOW, GloVe, SE-HyRank and DLJT1, in sentiment analysis tasks, which demonstrates that Bayesian estimation can effectively capture the sentiment information of low-frequency words and integrate the sentiment information into the word embedding through the loss function. In addition, replacing the embedding of low-frequency words in the state-of-the-art methods with BESWE can significantly improve the performance of those methods in sentiment analysis tasks.
Keywords: function; embedding; estimation
Application of Word Embedding to Drug Repositioning
13
Authors: Duc Luu Ngo, Naoki Yamamoto, Vu Anh Tran, Ngoc Giang Nguyen, Dau Phan, Favorisen Rosyking Lumbanraja, Mamoru Kubo, Kenji Satou. 《Journal of Biomedical Science and Engineering》, 2016, No. 1, pp. 7-16 (10 pages)
As a key technology of rapid and low-cost drug development, drug repositioning is getting popular. In this study, a text mining approach to the discovery of unknown drug-disease relations was tested. Using a word embedding algorithm, senses of over 1.7 million words were well represented in sufficiently short feature vectors. Through various analyses including clustering and classification, the feasibility of our approach was tested. Finally, our trained classification model achieved 87.6% accuracy in the prediction of drug-disease relations in cancer treatment and succeeded in discovering novel drug-disease relations that were actually reported in recent studies.
Keywords: Distributed Representation of Word Sense; Discovery of Drug-Disease Relation; Word Analogy
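A Word Embedding-Based Method for Mining Semantically Related Words in the Software Engineering Domain (Cited by: 2)
14
Authors: 胡望胜. 《计算机与现代化》, 2017, No. 9, pp. 19-23, 49 (6 pages)
Code search is frequently needed during software development and maintenance. Code search based on keyword matching faces the same problem as traditional information retrieval: the user's query keywords do not match the wording of the code text. To improve code search precision, semantically related words in software need to be mined for query expansion. This paper designs a Word Embedding-based method for mining semantically related words in the software engineering domain and, using documents from the IT Q&A site Stack Overflow as the training corpus, obtains a table of semantically related words covering 19332 words in total. Comparative experiments against previous work verify that the semantically related words mined by this method effectively improve code search precision.
Keywords: code search; query expansion; semantically related words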
Statute Recommendation Based on Word Embedding
15
Authors: Peitang Ling, Zian Wang, Yi Feng, Jidong Ge, Mengting He, Chuanyi Li, Bin Luo. 《国际计算机前沿大会会议论文集》, 2019, No. 1, pp. 546-548 (3 pages)
The statute recommendation problem is a subproblem of the automated decision system, which can help legal staff handle cases in an intelligent and automated way. In this paper, an improved common word similarity algorithm is proposed for normalization. Meanwhile, the word mover’s distance (WMD) algorithm is applied to similarity measurement and the statute recommendation problem, extending it beyond the classification setting for which it was originally used. Finally, a variety of recommendation strategies different from traditional collaborative filtering methods are proposed. The experimental results show that the approach achieves a best F-measure of 0.799, and the comparative experiment shows that the WMD algorithm achieves better results than the TF-IDF and LDA algorithms.
Keywords: statute recommendation; word embedding; word mover’s distance; collaborative filtering
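Exploring Digital Collaborative Development of Technical Document Preparation Assisted by Word VBA
16
Authors: 赵静, 赵方鑫. 《计算机应用文摘》, 2024, No. 6, pp. 82-84 (3 pages)
Document preparation is an important part of the daily work of design and R&D staff. Preparing technical documents involves applying many document-structure and drafting rules. Based on the document-writing requirements in the relevant standards, this article uses Word VBA programming to assist with formatting in technical document writing and automates the preparation of standard-format technical documents in Word, ensuring that documents conform to the standard format requirements and effectively improving work efficiency.
Keywords: technical documents; Word; VBA programming; automation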
Enhancing visual security: An image encryption scheme based on parallel compressive sensing and edge detection embedding
17
Authors: 王一铭, 黄树锋, 陈煌, 杨健, 蔡述庭. 《Chinese Physics B》 (SCIE, EI, CAS, CSCD), 2024, No. 1, pp. 287-302 (16 pages)
A novel image encryption scheme based on parallel compressive sensing and edge detection embedding technology is proposed to improve visual security. Firstly, the plain image is sparsely represented using the discrete wavelet transform. Then, the coefficient matrix is scrambled and compressed to obtain a size-reduced image using the Fisher–Yates shuffle and parallel compressive sensing. Subsequently, to increase the security of the proposed algorithm, the compressed image is re-encrypted through permutation and diffusion to obtain a noise-like secret image. Finally, an adaptive embedding method based on edge detection for different carrier images is proposed to generate a visually meaningful cipher image. To improve the plaintext sensitivity of the algorithm, the counter mode is combined with the hash function to generate keys for chaotic systems. Additionally, an effective permutation method is designed to scramble the pixels of the compressed image in the re-encryption stage. The simulation results and analyses demonstrate that the proposed algorithm performs well in terms of visual security and decryption quality.
Keywords: visual security; image encryption; parallel compressive sensing; edge detection embedding
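The Fisher–Yates scrambling step named in the abstract above can be sketched as a keyed, invertible permutation. This is only an illustration: the paper drives its permutation from chaotic-system keys, whereas this sketch substitutes Python's `random.Random(seed)` as a stand-in key stream, which is not cryptographically secure. The function names are hypothetical.

```python
import random


def fisher_yates_permutation(n, seed):
    """Build a permutation of range(n) with the Fisher-Yates shuffle.

    Assumption: `seed` stands in for the chaotic key of the paper.
    """
    rng = random.Random(seed)
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def scramble(values, seed):
    """Reorder `values` according to the keyed permutation."""
    perm = fisher_yates_permutation(len(values), seed)
    return [values[p] for p in perm]


def unscramble(scrambled, seed):
    """Invert `scramble` by regenerating the same permutation."""
    perm = fisher_yates_permutation(len(scrambled), seed)
    restored = [None] * len(scrambled)
    for position, p in enumerate(perm):
        restored[p] = scrambled[position]
    return restored


coefficients = [10, 20, 30, 40, 50]
mixed = scramble(coefficients, seed=2024)
print(unscramble(mixed, seed=2024) == coefficients)
```

Because the permutation depends only on the length and the key, the receiver can undo the scrambling without transmitting the permutation itself; a flattened coefficient matrix can be treated the same way as the list here.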
A chaotic hierarchical encryption/watermark embedding scheme for multi-medical images based on row-column confusion and closed-loop bi-directional diffusion
18
Authors: 张哲祎, 牟俊, Santo Banerjee, 曹颖鸿. 《Chinese Physics B》 (SCIE, EI, CAS, CSCD), 2024, No. 2, pp. 228-237 (10 pages)
Security during remote transmission has been an important concern for researchers in recent years. In this paper, a hierarchical multi-image encryption scheme for people with different security levels is designed, and a multi-image encryption (MIE) algorithm with row and column confusion and closed-loop bi-directional diffusion is adopted. While ensuring secure communication of medical image information, people with different security levels have different levels of decryption keys, and differentiated visual effects can be obtained by using the strong sensitivity of chaotic keys. The highest security level can obtain decrypted images without watermarks, and at the same time, patient information and copyright attribution can be verified by obtaining watermark images. The experimental results show that the scheme is sufficiently secure as an MIE scheme with visualized differences and that the encryption and decryption efficiency is significantly improved compared to other works.
Keywords: chaotic hierarchical encryption; multi-medical image encryption; differentiated visual effects; row-column confusion; closed-loop bi-directional diffusion; transform domain watermark embedding
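An Analysis of Practical Operations and Applications of the Office Software Word
19
Authors: 黄美琴. 《数字通信世界》, 2024, No. 3, pp. 142-144 (3 pages)
Among office software, Word is a basic tool used mainly for text editing and typesetting. Its features are varied, however, and unskilled operation can hinder their proper use. This article therefore introduces the practical operation and application of the office software Word, in the hope of providing a reference for the work of relevant staff.
Keywords: Word; computer; office software; document operations; document management; text processing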
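Topic Evolution and Hot Topic Identification of Machine Learning Research in Library and Information Science Based on LDA-Word2vec
20
Authors: 胡泽文, 韩雅蓉, 王梦雅. 《现代情报》 (PKU Core), 2024, No. 4, pp. 154-167 (14 pages)
[Purpose/Significance] Against the background of rapid development and profound change in artificial intelligence technology and applications, new research topics and methods keep emerging in machine learning, and deep learning and reinforcement learning continue to develop. It is therefore necessary to explore how machine learning research topics evolve in different fields and to identify hot and emerging topics. [Method/Process] Taking machine learning research papers in library and information science from the Web of Science database for 2011-2022 as an example, this paper combines LDA and Word2vec for topic modeling and topic evolution analysis, and introduces topic intensity, topic influence, topic attention, and topic novelty indicators to identify hot topics and emerging hot topics. [Result/Conclusion] The results show that: (1) combining the semantic processing capability of Word2vec with the topic evolution capability of LDA identifies research topics more accurately and intuitively shows their stage-by-stage evolution; (2) machine learning research topics in library and information science fall into three broad categories: natural language processing and text analysis, data mining and analysis, and information and knowledge services. The topics are strongly related and show associated evolution; (3) the designed topic intensity, topic influence, and topic attention indicators, together with the composite indicator, identify hot topics well in the three periods 2011-2014, 2015-2018, and 2019-2022.
Keywords: machine learning; LDA model; word2vec; topic evolution; hot topics; topic influence; topic attention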