Journal Articles
A total of 59 articles were found; the 20 results on this page are listed below.
Classification of Conversational Sentences Using an Ensemble Pre-Trained Language Model with the Fine-Tuned Parameter
1
Authors: R. Sujatha, K. Nimala. Computers, Materials & Continua (SCIE, EI), 2024, No. 2, pp. 1669-1686 (18 pages)
Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic highlights than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress, and comparing impacts. An ensemble pre-trained language model is used here to classify sentences from a conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. These classification label sequences are used to analyze conversation progress and predict the pecking order of the conversation. An ensemble of Bidirectional Encoder for Representation of Transformer (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus, and hyperparameter tuning is carried out for better sentence classification performance. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models, and the ensemble model with fine-tuned parameters achieved an F1-score of 0.88.
Keywords: Bidirectional encoder for representation of transformer; conversation; ensemble model; fine-tuning; generalized autoregressive pretraining for language understanding; generative pre-trained transformer; hyperparameter tuning; natural language processing; robustly optimized BERT pretraining approach; sentence classification; transformer models
Download PDF
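Not the authors' implementation: a minimal sketch of the soft-voting ensemble idea described in entry 1, using the Hugging Face transformers library. The checkpoint names, the three-model subset, and the four-label scheme are illustrative assumptions; in practice each checkpoint would already be fine-tuned on the conversation corpus.

```python
# Hedged sketch: soft-voting ensemble of transformer sentence classifiers.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["information", "question", "directive", "commission"]
# Assumed placeholders; in practice these would be fine-tuned checkpoints.
CHECKPOINTS = ["bert-base-uncased", "roberta-base", "distilbert-base-uncased"]

def ensemble_predict(sentence: str) -> str:
    probs = []
    for name in CHECKPOINTS:
        tok = AutoTokenizer.from_pretrained(name)
        model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=len(LABELS))
        inputs = tok(sentence, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits
        probs.append(torch.softmax(logits, dim=-1))
    avg = torch.stack(probs).mean(dim=0)      # soft voting: average class probabilities
    return LABELS[int(avg.argmax(dim=-1))]

print(ensemble_predict("Could you send me the meeting agenda?"))
```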
Construction and application of knowledge graph for grid dispatch fault handling based on pre-trained model
2
Authors: Zhixiang Ji, Xiaohui Wang, Jie Zhang, Di Wu. Global Energy Interconnection (EI, CSCD), 2023, No. 4, pp. 493-504 (12 pages)
With the construction of new power systems, the power grid has become extremely large, with an increasing proportion of new energy and AC/DC hybrid connections. The dynamic characteristics and fault patterns of the power grid are complex; additionally, power grid control is difficult, operation risks are high, and the task of fault handling is arduous. Traditional power-grid fault handling relies primarily on human experience. Differences in, and a lack of, knowledge reserves among control personnel restrict the accuracy and timeliness of fault handling. Therefore, this mode of operation is no longer suitable for the requirements of new systems. Based on the multi-source heterogeneous data of power-grid dispatch, this paper proposes a joint entity-relationship extraction method for power-grid dispatch fault processing based on a pre-trained model, constructs a knowledge graph of power-grid dispatch fault processing, and designs and develops a fault-processing auxiliary decision-making system based on the knowledge graph. The system was applied in a provincial dispatch control center, where it effectively improved the fault-handling capability and the intelligence level of accident management and control of the power grid.
Keywords: Power-grid dispatch fault handling; knowledge graph; pre-trained model; auxiliary decision-making
Download PDF
Leveraging Vision-Language Pre-Trained Model and Contrastive Learning for Enhanced Multimodal Sentiment Analysis
3
Authors: Jieyu An, Wan Mohd Nazmee Wan Zainon, Binfen Ding. Intelligent Automation & Soft Computing (SCIE), 2023, No. 8, pp. 1673-1689 (17 pages)
Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes, such as text and image, to accurately assess sentiment. However, conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities. This limitation is attributed to their training on unimodal data, and it necessitates the use of complex fusion mechanisms for sentiment analysis. In this study, we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method. Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework. We employ a Transformer architecture to integrate these representations, thereby enabling the capture of rich semantic information in image-text pairs. To further enhance the representation learning of these pairs, we introduce our proposed multimodal contrastive learning method, which leads to improved performance in sentiment analysis tasks. Our approach is evaluated through extensive experiments on two publicly accessible datasets, where we demonstrate its effectiveness. We achieve a significant improvement in sentiment analysis accuracy, indicating the superiority of our approach over existing techniques. These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.
Keywords: Multimodal sentiment analysis; vision-language pre-trained model; contrastive learning; sentiment classification
Download PDF
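A multimodal contrastive objective of the kind described in entry 3 is commonly written as a symmetric InfoNCE loss over matched image-text embedding pairs. The sketch below is a generic PyTorch illustration, not the paper's implementation; it assumes the two encoders already produce fixed-size embeddings, and the batch size, dimension, and temperature are arbitrary.

```python
# Hedged sketch of a symmetric image-text contrastive (InfoNCE-style) loss.
import torch
import torch.nn.functional as F

def multimodal_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature          # cosine-similarity matrix
    targets = torch.arange(img.size(0))           # matched pairs lie on the diagonal
    loss_i2t = F.cross_entropy(logits, targets)   # image-to-text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text-to-image direction
    return (loss_i2t + loss_t2i) / 2

loss = multimodal_contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```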
A Classification–Detection Approach of COVID-19 Based on Chest X-ray and CT by Using Keras Pre-Trained Deep Learning Models (Cited by: 4)
4
Authors: Xing Deng, Haijian Shao, Liang Shi, Xia Wang, Tongling Xie. Computer Modeling in Engineering & Sciences (SCIE, EI), 2020, No. 11, pp. 579-596 (18 pages)
The Coronavirus Disease 2019 (COVID-19) is wreaking havoc around the world, putting enormous pressure on national health systems and medical staff. One of the most effective and critical steps in the fight against COVID-19 is to examine the patient's lungs using chest X-ray and CT images generated by radiation imaging. In this paper, five Keras-related deep learning models (ResNet50, InceptionResNetV2, Xception, transfer learning, and pre-trained VGGNet16) are applied to formulate classification-detection approaches for COVID-19. Two benchmark methods, SVM (Support Vector Machine) and CNN (Convolutional Neural Network), are provided for comparison with the classification-detection approaches based on performance indicators, i.e., precision, recall, F1 scores, confusion matrix, classification accuracy, and three types of AUC (Area Under Curve). The highest classification accuracies derived by the classification-detection approaches based on 5857 chest X-rays and 767 chest CTs are 84% and 75%, respectively, which shows that the Keras-related deep learning approaches facilitate accurate and effective COVID-19-assisted detection.
Keywords: COVID-19 detection; deep learning; transfer learning; pre-trained models
Download PDF
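A typical Keras transfer-learning setup of the kind compared in entry 4 can be sketched as follows. The input size, frozen base, dense head, and binary output are illustrative assumptions; the paper's exact configuration may differ, and the training datasets are not shown.

```python
# Hedged sketch: Keras transfer learning from an ImageNet-pre-trained ResNet50
# for a binary chest-image classification task (illustrative configuration only).
import tensorflow as tf

base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                      input_shape=(224, 224, 3))
base.trainable = False                       # freeze pre-trained convolutional features

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # e.g., COVID-19 vs. normal
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", tf.keras.metrics.AUC()])
model.summary()
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets not shown here
```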
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey (Cited by: 4)
5
Authors: Xiao Wang, Guangyao Chen, Guangwu Qian, Pengcheng Gao, Xiao-Yong Wei, Yaowei Wang, Yonghong Tian, Wen Gao. Machine Intelligence Research (EI, CSCD), 2023, No. 4, pp. 447-482 (36 pages)
With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT), generative pre-trained transformers (GPT), etc. Inspired by the success of these models in single domains (like computer vision and natural language processing), multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper could provide new insights and help fresh researchers track the most cutting-edge works. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training works in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss the MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future works. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.
Keywords: Multi-modal (MM); pre-trained model (PTM); information fusion; representation learning; deep learning
Full-text delivery
Fine-Tuning Pre-Trained CodeBERT for Code Search in Smart Contract
6
Authors: JIN Huan, LI Qinying. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2023, No. 3, pp. 237-245 (9 pages)
Smart contracts, which automatically execute on decentralized platforms like Ethereum, require high security and low gas consumption. As a result, developers have a strong demand for semantic code search tools that utilize natural language queries to efficiently search for existing code snippets. However, existing code search models face a semantic gap between code and queries, which requires a large amount of training data. In this paper, we propose a fine-tuning approach to bridge the semantic gap in code search and improve the search accuracy. We collect 80,723 different pairs of <comment, code snippet> from Etherscan.io and use these pairs to fine-tune, validate, and test the pre-trained CodeBERT model. Using the fine-tuned model, we develop a code search engine specifically for smart contracts. We evaluate the Recall@k and Mean Reciprocal Rank (MRR) of the fine-tuned CodeBERT model using different proportions of the fine-tuned data. It is encouraging that even a small amount of fine-tuned data can produce satisfactory results. In addition, we perform a comparative analysis between the fine-tuned CodeBERT model and two state-of-the-art models. The experimental results show that the fine-tuned CodeBERT model has superior performance in terms of Recall@k and MRR. These findings highlight the effectiveness of our fine-tuning approach and its potential to significantly improve code search accuracy.
Keywords: code search; smart contract; pre-trained code models; program analysis; machine learning
Full-text delivery
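Code search with a CodeBERT-style encoder, as in entry 6, typically ranks snippets by embedding similarity to a natural-language query and then computes Recall@k and MRR over the ranks. The sketch below is an illustration only: mean pooling, the public "microsoft/codebert-base" checkpoint, and the toy query/snippets are assumptions, not the paper's fine-tuned setup.

```python
# Hedged sketch: rank code snippets against a query with CodeBERT embeddings,
# then compute MRR for the query. Pooling and checkpoint are illustrative choices.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("microsoft/codebert-base")
enc = AutoModel.from_pretrained("microsoft/codebert-base")

def embed(text: str) -> torch.Tensor:
    inputs = tok(text, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        hidden = enc(**inputs).last_hidden_state      # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)              # mean-pooled embedding

query = "transfer tokens to another address"
snippets = ["function transfer(address to, uint256 amount) public returns (bool) { }",
            "function balanceOf(address owner) public view returns (uint256) { }"]

scores = torch.stack([torch.cosine_similarity(embed(query), embed(s), dim=0) for s in snippets])
ranked = scores.argsort(descending=True).tolist()
relevant_index = 0                                    # assume the first snippet is the true match
mrr = 1.0 / (ranked.index(relevant_index) + 1)
print("MRR for this query:", mrr)
```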
Unsupervised statistical text simplification using pre-trained language modeling for initialization
7
Authors: Jipeng QIANG, Feng ZHANG, Yun LI, Yunhao YUAN, Yi ZHU, Xindong WU. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 1, pp. 81-90 (10 pages)
Unsupervised text simplification has attracted much attention due to the scarcity of high-quality parallel text simplification corpora. Recently, an unsupervised statistical text simplification system based on phrase-based machine translation (UnsupPBMT) achieved good performance; it initializes the phrase tables using similar words obtained by word embedding modeling. Since word embedding modeling only considers the relevance between words, the phrase table in UnsupPBMT contains a lot of dissimilar words. In this paper, we propose an unsupervised statistical text simplification method that uses the pre-trained language model BERT for initialization. Specifically, we use BERT as a general linguistic knowledge base for predicting similar words. Experimental results show that our method outperforms state-of-the-art unsupervised text simplification methods on three benchmarks, and even outperforms some supervised baselines.
Keywords: text simplification; pre-trained language modeling; BERT; word embeddings
Full-text delivery
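Using BERT as a "general linguistic knowledge base" for similar-word prediction, as entry 7 describes, can be illustrated with a masked-language-model query: mask a word in context and take the top predictions as substitution candidates. This is a sketch of the idea, not the paper's pipeline; the checkpoint and example sentence are assumptions.

```python
# Hedged sketch: in-context substitution candidates from BERT's masked LM,
# the kind of signal that can seed simplification phrase tables.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
sentence = "The committee will [MASK] the proposal next week."
for cand in fill(sentence, top_k=5):
    print(cand["token_str"], round(cand["score"], 3))
```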
DynamicRetriever: A Pre-trained Model-based IR System Without an Explicit Index
8
Authors: Yu-Jia Zhou, Jing Yao, Zhi-Cheng Dou, Ledell Wu, Ji-Rong Wen. Machine Intelligence Research (EI, CSCD), 2023, No. 2, pp. 276-288 (13 pages)
Web search provides a promising way for people to obtain information and has been extensively studied. With the surge of deep learning and large-scale pre-training techniques, various neural information retrieval models have been proposed, and they have demonstrated the power to improve search (especially ranking) quality. All these existing search methods follow a common paradigm, i.e., index-retrieve-rerank, where they first build an index of all documents based on document terms (i.e., a sparse inverted index) or representation vectors (i.e., a dense vector index), then retrieve and rerank the retrieved documents based on the similarity between the query and documents via ranking models. In this paper, we explore a new paradigm of information retrieval without an explicit index but only with a pre-trained model. Instead, all of the knowledge of the documents is encoded into model parameters, which can be regarded as a differentiable indexer and optimized in an end-to-end manner. Specifically, we propose a pre-trained model-based information retrieval (IR) system called DynamicRetriever, which directly returns document identifiers for a given query. Under such a framework, we implement two variants to explore how to train the model from scratch and how to combine the advantages of dense retrieval models. Compared with existing search methods, the model-based IR system parameterizes the traditional static index with a pre-training model, which converts the document semantic mapping into a dynamic and updatable process. Extensive experiments conducted on the public search benchmark Microsoft machine reading comprehension (MS MARCO) verify the effectiveness and potential of our proposed new paradigm for information retrieval.
Keywords: Information retrieval (IR); document retrieval; model-based IR; pre-trained language model; differentiable search index
Full-text delivery
Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-level Backdoor Attacks
9
Authors: Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Xin Jiang, Maosong Sun. Machine Intelligence Research (EI, CSCD), 2023, No. 2, pp. 180-193 (14 pages)
The pre-training-then-fine-tuning paradigm has been widely used in deep learning. Due to the huge computation cost of pre-training, practitioners usually download pre-trained models from the Internet and fine-tune them on downstream datasets, while the downloaded models may suffer from backdoor attacks. Different from previous attacks aiming at a target task, we show that a backdoored pre-trained model can behave maliciously in various downstream tasks without foreknowing the task information. Attackers can restrict the output representations (the values of output neurons) of trigger-embedded samples to arbitrary predefined values through additional training, namely a neuron-level backdoor attack (NeuBA). Since fine-tuning has little effect on model parameters, the fine-tuned model will retain the backdoor functionality and predict a specific label for the samples embedded with the same trigger. To provoke multiple labels in a specific task, attackers can introduce several triggers with predefined contrastive values. In experiments on both natural language processing (NLP) and computer vision (CV), we show that NeuBA can well control the predictions for trigger-embedded instances with different trigger designs. Our findings sound a red alarm for the wide use of pre-trained models. Finally, we apply several defense methods to NeuBA and find that model pruning is a promising technique to resist NeuBA by omitting backdoored neurons.
Keywords: pre-trained language models; backdoor attacks; transformers; natural language processing (NLP); computer vision (CV)
Full-text delivery
Vision Enhanced Generative Pre-trained Language Model for Multimodal Sentence Summarization
10
Authors: Liqiang Jing, Yiren Li, Junhao Xu, Yongcan Yu, Pei Shen, Xuemeng Song. Machine Intelligence Research (EI, CSCD), 2023, No. 2, pp. 289-298 (10 pages)
Multimodal sentence summarization (MMSS) is a new yet challenging task that aims to generate a concise summary of a long sentence and its corresponding image. Although existing methods have gained promising success in MMSS, they overlook the powerful generation ability of generative pre-trained language models (GPLMs), which have been shown to be effective in many text generation tasks. To fill this research gap, we propose to use GPLMs to promote the performance of MMSS. Notably, adopting GPLMs to solve MMSS inevitably faces two challenges: 1) What fusion strategy should we use to inject visual information into GPLMs properly? 2) How to keep the GPLM's generation ability intact to the utmost extent when the visual feature is injected into the GPLM? To address these two challenges, we propose a vision enhanced generative pre-trained language model for MMSS, dubbed Vision-GPLM. In Vision-GPLM, we obtain features of the visual and textual modalities with two separate encoders and utilize a text decoder to produce a summary. In particular, we utilize multi-head attention to fuse the features extracted from the visual and textual modalities to inject the visual feature into the GPLM. Meanwhile, we train Vision-GPLM in two stages: a vision-oriented pre-training stage and a fine-tuning stage. In the vision-oriented pre-training stage, we particularly train the visual encoder via the masked language model task while the other components are frozen, aiming to obtain homogeneous representations of text and image. In the fine-tuning stage, we train all the components of Vision-GPLM on the MMSS task. Extensive experiments on a public MMSS dataset verify the superiority of our model over existing baselines.
Keywords: Multimodal sentence summarization (MMSS); generative pre-trained language model (GPLM); natural language generation; deep learning; artificial intelligence
Full-text delivery
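Injecting visual features into a text model via multi-head attention, as entry 10 describes, can be sketched with a cross-attention block in which text hidden states attend over visual region features. The shapes, dimensions, and residual wiring below are illustrative assumptions rather than the Vision-GPLM architecture.

```python
# Hedged sketch: fuse visual features into text hidden states with multi-head
# cross-attention (text as queries, image regions as keys/values).
import torch
import torch.nn as nn

d_model = 768
cross_attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=8, batch_first=True)

text_hidden = torch.randn(2, 40, d_model)    # (batch, text_len, dim) from the text encoder
visual_feats = torch.randn(2, 49, d_model)   # (batch, regions, dim) from the visual encoder

fused, _ = cross_attn(query=text_hidden, key=visual_feats, value=visual_feats)
text_hidden = text_hidden + fused            # residual connection before decoding
print(text_hidden.shape)                     # torch.Size([2, 40, 768])
```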
Improving Extraction of Chinese Open Relations Using Pre-trained Language Model and Knowledge Enhancement
11
Authors: Chaojie Wen, Xudong Jia, Tao Chen. Data Intelligence (EI), 2023, No. 4, pp. 962-989 (28 pages)
Open Relation Extraction (ORE) is a task of extracting semantic relations from a text document. Current ORE systems have significantly improved their efficiency in obtaining Chinese relations when compared with conventional systems that heavily depend on feature engineering or syntactic parsing. However, these ORE systems do not use robust neural networks such as pre-trained language models to take advantage of large-scale unstructured data effectively. In response to this issue, a new system entitled Chinese Open Relation Extraction with Knowledge Enhancement (CORE-KE) is presented in this paper. The CORE-KE system employs a pre-trained language model (with the support of a Bidirectional Long Short-Term Memory (BiLSTM) layer and a Masked Conditional Random Field (Masked CRF) layer) on unstructured data in order to improve Chinese open relation extraction. Entity descriptions in Wikidata and additional knowledge (in terms of triple facts) extracted from Chinese ORE datasets are used to fine-tune the pre-trained language model. In addition, syntactic features are further adopted in the training stage of the CORE-KE system for knowledge enhancement. Experimental results of the CORE-KE system on two large-scale datasets of open Chinese entities and relations demonstrate that the CORE-KE system is superior to other ORE systems. The F1-scores of the CORE-KE system on the two datasets show relative improvements of 20.1% and 1.3%, respectively, when compared with benchmark ORE systems. The source code is available at https://github.com/cjwen15/CORE-KE.
Keywords: Chinese open relation extraction; pre-trained language model; knowledge enhancement
Full-text delivery
Offline Pre-trained Multi-agent Decision Transformer
12
Authors: Linghui Meng, Muning Wen, Chenyang Le, Xiyun Li, Dengpeng Xing, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Yaodong Yang, Bo Xu. Machine Intelligence Research (EI, CSCD), 2023, No. 2, pp. 233-248 (16 pages)
Offline reinforcement learning leverages previously collected offline datasets to learn optimal policies with no need to access the real environment. Such a paradigm is also desirable for multi-agent reinforcement learning (MARL) tasks, given the combinatorially increased interactions among agents and with the environment. However, in MARL, the paradigm of offline pre-training with online fine-tuning has not been studied, and neither datasets nor benchmarks for offline MARL research are available. In this paper, we facilitate the research by providing large-scale datasets and using them to examine the usage of the decision transformer in the context of MARL. We investigate the generalization of MARL offline pre-training in the following three aspects: 1) between single agents and multiple agents, 2) from offline pre-training to online fine-tuning, and 3) to multiple downstream tasks with few-shot and zero-shot capabilities. We start by introducing the first offline MARL dataset with diverse quality levels based on the StarCraft II environment, and then propose the novel architecture of the multi-agent decision transformer (MADT) for effective offline learning. MADT leverages the transformer's modelling ability for sequence modelling and integrates it seamlessly with both offline and online MARL tasks. A significant benefit of MADT is that it learns generalizable policies that can transfer between different types of agents under different task scenarios. On the StarCraft II offline dataset, MADT outperforms the state-of-the-art offline reinforcement learning (RL) baselines, including BCQ and CQL. When applied to online tasks, the pre-trained MADT significantly improves sample efficiency and enjoys strong performance in both few-shot and zero-shot cases. To the best of our knowledge, this is the first work that studies and demonstrates the effectiveness of offline pre-trained models in terms of sample efficiency and generalizability enhancements for MARL.
Keywords: pre-training model; multi-agent reinforcement learning (MARL); decision making; transformer; offline reinforcement learning
Full-text delivery
Pre-trained models for natural language processing: A survey (Cited by: 115)
13
Authors: QIU XiPeng, SUN TianXiang, XU YiGe, SHAO YunFan, DAI Ning, HUANG XuanJing. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2020, No. 10, pp. 1872-1897 (26 pages)
Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy from four different perspectives. Next, we describe how to adapt the knowledge of PTMs to downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
Keywords: deep learning; neural network; natural language processing; pre-trained model; distributed representation; word embedding; self-supervised learning; language modelling
Full-text delivery
Satellite and instrument entity recognition using a pre-trained language model with distant supervision
14
Authors: Ming Lin, Meng Jin, Yufu Liu, Yuqi Bai. International Journal of Digital Earth (SCIE, EI), 2022, No. 1, pp. 1290-1304 (15 pages)
Earth observations, especially satellite data, have produced a wealth of methods and results in meeting global challenges, often presented in unstructured texts such as papers or reports. Accurate extraction of satellite and instrument entities from these unstructured texts can help to link and reuse Earth observation resources. The direct use of an existing dictionary to extract satellite and instrument entities suffers from the problem of poor matching, which leads to low recall. In this study, we present a named entity recognition model to automatically extract satellite and instrument entities from unstructured texts. Due to the lack of manually labeled data, we apply distant supervision to automatically generate labeled training data. Accordingly, we fine-tune the pre-trained language model with early stopping and a weighted cross-entropy loss function. We propose a dictionary-based self-training method to correct the incomplete annotations caused by the distant supervision method. Experiments demonstrate that our method achieves significant improvements in both precision and recall compared to dictionary matching or standard adaptation of pre-trained language models.
Keywords: Earth observation; named entity recognition; pre-trained language model; distant supervision; dictionary-based self-training
Full-text delivery
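The weighted cross-entropy loss used when fine-tuning a pre-trained model on distantly supervised NER labels, as in entry 14, can be sketched as below. The label set and class weights are illustrative assumptions, not values from the paper, and the logits/labels are random placeholders standing in for model outputs and distant annotations.

```python
# Hedged sketch: weighted cross-entropy over token-classification logits, with
# padding positions ignored. Class weights are illustrative, not the paper's values.
import torch
import torch.nn as nn

labels_map = ["O", "B-SATELLITE", "I-SATELLITE", "B-INSTRUMENT", "I-INSTRUMENT"]
class_weights = torch.tensor([0.2, 1.0, 1.0, 1.0, 1.0])   # down-weight the dominant "O" class
loss_fn = nn.CrossEntropyLoss(weight=class_weights, ignore_index=-100)

logits = torch.randn(4, 32, len(labels_map))               # (batch, seq_len, num_labels)
labels = torch.randint(0, len(labels_map), (4, 32))         # distantly supervised tags
labels[:, 30:] = -100                                        # padded positions are ignored

loss = loss_fn(logits.view(-1, len(labels_map)), labels.view(-1))
print(loss.item())
```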
A comparative analysis of paddy crop biotic stress classification using pre-trained deep neural networks
15
Authors: Naveen N. Malvade, Rajesh Yakkundimath, Girish Saunshi, Mahantesh C. Elemmi, Parashuram Baraki. Artificial Intelligence in Agriculture, 2022, No. 1, pp. 167-175 (9 pages)
The agriculture sector is no exception to the widespread usage of deep learning tools and techniques. In this paper, an automated detection method based on pre-trained Convolutional Neural Network (CNN) models is proposed to identify and classify paddy crop biotic stresses from field images. The proposed work also provides an empirical comparison among the leading CNN models with transfer learning from the ImageNet weights, namely Inception-V3, VGG-16, ResNet-50, DenseNet-121, and MobileNet-28. Brown spot, hispa, and leaf blast, three of the most common and destructive paddy crop biotic stresses that occur during the flowering and ripening growth stages, are considered for the experimentation. The experimental results reveal that the ResNet-50 model achieves the highest average paddy crop stress classification accuracy of 92.61%, outperforming the other considered CNN models. The study explores the feasibility of CNN models for paddy crop stress identification as well as the applicability of automated methods to non-experts.
Keywords: Paddy crop; stress classification; biotic stress; PlantVillage; ImageNet; pre-trained CNN models
Full-text delivery
Medical Named Entity Recognition from Un-labelled Medical Records based on Pre-trained Language Models and Domain Dictionary
16
Authors: Chaojie Wen, Tao Chen, Xudong Jia, Jiang Zhu. Data Intelligence, 2021, No. 3, pp. 402-417 (16 pages)
Medical named entity recognition (NER) is an area in which medical named entities are recognized from medical texts, such as diseases, drugs, surgery reports, anatomical parts, and examination documents. Conventional medical NER methods do not make full use of un-labelled medical texts embedded in medical documents. To address this issue, we proposed a medical NER approach based on pre-trained language models and a domain dictionary. First, we constructed a medical entity dictionary by extracting medical entities from labelled medical texts and collecting medical entities from other resources, such as the Yidu-N4K dataset. Second, we employed this dictionary to train domain-specific pre-trained language models using un-labelled medical texts. Third, we employed a pseudo-labelling mechanism on un-labelled medical texts to automatically annotate texts and create pseudo labels. Fourth, the BiLSTM-CRF sequence tagging model was used to fine-tune the pre-trained language models. Our experiments on the un-labelled medical texts, which were extracted from Chinese electronic medical records, show that the proposed NER approach achieves strict and relaxed F1 scores of 88.7% and 95.3%, respectively.
Keywords: Medical named entity recognition; pre-trained language model; domain dictionary; pseudo labelling; un-labelled medical data
Full-text delivery
May ChatGPT be a tool producing medical information for common inflammatory bowel disease patients' questions? An evidence-controlled analysis
17
Authors: Antonietta Gerarda Gravina, Raffaele Pellegrino, Marina Cipullo, Giovanna Palladino, Giuseppe Imperio, Andrea Ventura, Salvatore Auletta, Paola Ciamarra, Alessandro Federico. World Journal of Gastroenterology (SCIE, CAS), 2024, No. 1, pp. 17-33 (17 pages)
Artificial intelligence is increasingly entering everyday healthcare. Large language model (LLM) systems such as Chat Generative Pre-trained Transformer (ChatGPT) have become potentially accessible to everyone, including patients with inflammatory bowel diseases (IBD). However, significant ethical issues and pitfalls exist in innovative LLM tools. The hype generated by such systems may lead to unweighted patient trust in these systems. Therefore, it is necessary to understand whether LLMs (trendy ones, such as ChatGPT) can produce plausible medical information (MI) for patients. This review examined ChatGPT's potential to provide MI regarding questions commonly addressed by patients with IBD to their gastroenterologists. From the review of the outputs provided by ChatGPT, this tool showed some attractive potential while having significant limitations in updating and detailing information and providing inaccurate information in some cases. Further studies and refinement of ChatGPT, possibly aligning the outputs with the leading medical evidence provided by reliable databases, are needed.
Keywords: Crohn's disease; ulcerative colitis; inflammatory bowel disease; Chat Generative Pre-trained Transformer; large language model; artificial intelligence
Download PDF
Personality Trait Detection via Transfer Learning
18
Authors: Bashar Alshouha, Jesus Serrano-Guerrero, Francisco Chiclana, Francisco P. Romero, Jose A. Olivas. Computers, Materials & Continua (SCIE, EI), 2024, No. 2, pp. 1933-1956 (24 pages)
Personality recognition plays a pivotal role when developing user-centric solutions such as recommender systems or decision support systems across various domains, including education, e-commerce, or human resources. Traditional machine learning techniques have been broadly employed for personality trait identification; nevertheless, the development of new technologies based on deep learning has led to new opportunities to improve their performance. This study focuses on the capabilities of pre-trained language models such as BERT, RoBERTa, ALBERT, ELECTRA, ERNIE, or XLNet to deal with the task of personality recognition. These models are able to capture structural features from textual content and comprehend a multitude of language facets and complex features such as hierarchical relationships or long-term dependencies. This makes them suitable for classifying multi-label personality traits from reviews while mitigating computational costs. The focus of this approach centers on developing an architecture based on different layers able to capture the semantic context and structural features from texts. Moreover, it fine-tunes the previous models using the MyPersonality dataset, which comprises 9,917 status updates contributed by 250 Facebook users. These status updates are categorized according to the well-known Big Five personality model, setting the stage for a comprehensive exploration of personality traits. To test the proposal, a set of experiments has been performed using different metrics such as the exact match ratio, Hamming loss, zero-one loss, precision, recall, F1-score, and weighted averages. The results reveal that ERNIE is the top-performing model, achieving an exact match ratio of 72.32%, an accuracy rate of 87.17%, and an F1-score of 84.41%. The findings demonstrate that the tested models substantially outperform other state-of-the-art studies, enhancing the accuracy by at least 3% and confirming them as powerful tools for personality recognition. These findings represent substantial advancements in personality recognition, making them appropriate for the development of user-centric applications.
Keywords: Personality trait detection; pre-trained language model; big five model; transfer learning
Download PDF
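The evaluation measures listed in entry 18 (exact match ratio, Hamming loss, zero-one loss, F1) are standard multi-label metrics and can be computed with scikit-learn as sketched below. The toy label matrices are illustrative placeholders, not data from the study; each column stands for one Big Five trait.

```python
# Hedged sketch: multi-label Big Five evaluation metrics with scikit-learn.
import numpy as np
from sklearn.metrics import accuracy_score, hamming_loss, zero_one_loss, f1_score

# Toy ground truth and predictions; columns = traits (O, C, E, A, N).
y_true = np.array([[1, 0, 1, 1, 0],
                   [0, 1, 0, 1, 1],
                   [1, 1, 0, 0, 0]])
y_pred = np.array([[1, 0, 1, 0, 0],
                   [0, 1, 0, 1, 1],
                   [1, 1, 0, 0, 1]])

print("Exact match ratio:", accuracy_score(y_true, y_pred))   # all traits correct per sample
print("Hamming loss:", hamming_loss(y_true, y_pred))
print("Zero-one loss:", zero_one_loss(y_true, y_pred))
print("Weighted F1:", f1_score(y_true, y_pred, average="weighted"))
```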
SHEL: a semantically enhanced hardware-friendly entity linking method
19
Authors: 亓东林 (QI Donglin), CHEN Shudong, DU Rong, TONG Da, YU Yong. High Technology Letters (EI, CAS), 2024, No. 1, pp. 13-22 (10 pages)
With the help of pre-trained language models, the accuracy of the entity linking task has made great strides in recent years. However, most models with excellent performance require fine-tuning on a large amount of training data using large pre-trained language models, which is a hardware threshold for accomplishing this task. Some researchers have achieved competitive results with less training data through ingenious methods, such as utilizing information provided by the named entity recognition model. This paper presents a novel semantic-enhancement-based entity linking approach, named semantically enhanced hardware-friendly entity linking (SHEL), which is designed to be hardware friendly and efficient while maintaining good performance. Specifically, SHEL's semantic enhancement approach consists of three aspects: (1) semantic compression of entity descriptions using a text summarization model; (2) maximizing the capture of mention contexts using asymmetric heuristics; (3) calculating a fixed-size mention representation through pooling operations. This series of semantic enhancement methods effectively improves the model's ability to capture semantic information while taking into account hardware constraints, and it significantly improves the model's convergence speed by more than 50% compared with the strong baseline model proposed in this paper. In terms of performance, SHEL is comparable to the previous method, with superior performance on six well-established datasets, even though SHEL is trained using a smaller pre-trained language model as the encoder.
Keywords: entity linking (EL); pre-trained models; knowledge graph; text summarization; semantic enhancement
Download PDF
An Approach to Detect Structural Development Defects in Object-Oriented Programs
20
Authors: Maxime Seraphin Gnagne, Mouhamadou Dosso, Mamadou Diarra, Souleymane Oumtanaga. Open Journal of Applied Sciences, 2024, No. 2, pp. 494-510 (17 pages)
Structural development defects essentially refer to code structure that violates object-oriented design principles. They make program maintenance challenging and deteriorate software quality over time. Various detection approaches, ranging from traditional heuristic algorithms to machine learning methods, are used to identify these defects. Ensemble learning methods have strengthened the detection of these defects. However, existing approaches do not simultaneously exploit the capabilities of extracting relevant features from pre-trained models and the performance of neural networks for the classification task. Therefore, our goal has been to design a model that combines a pre-trained model to extract relevant features from code excerpts through transfer learning and a bagging method with a base estimator, a dense neural network, for defect classification. To achieve this, we composed multiple samples of the same size with replacement from the imbalanced MLCQ dataset. For all the samples, we used the CodeT5-small variant to extract features and trained a bagging method with the neural network Roberta Classification Head to classify defects based on these features. We then compared this model to RandomForest, one of the ensemble methods that yields good results. Our experiments showed that the number of base estimators to use for bagging depends on the defect to be detected. Next, we observed that it was not necessary to use a data balancing technique with our model when the imbalance rate was 23%. Finally, for Blob detection, RandomForest had a median MCC value of 0.36 compared to 0.12 for our method. However, our method was predominant in Long Method detection, with a median MCC value of 0.53 compared to 0.42 for RandomForest. These results suggest that the performance of ensemble methods in detecting structural development defects is dependent on the specific defects.
Keywords: Object-Oriented Programming; Structural Development Defect Detection; Software Maintenance; Pre-trained Models; Feature Extraction; Bagging; Neural Network
Download PDF
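The combination described in entry 20 (features extracted by a pre-trained code model, then a bagging ensemble of neural classifiers) can be sketched with scikit-learn as follows. The feature dimensionality, hyperparameters, and random feature matrix are illustrative assumptions; CodeT5 feature extraction itself is not shown, and this is not the authors' implementation.

```python
# Hedged sketch: bagging with a small dense-network base estimator over feature
# vectors assumed to come from a pre-trained code model (e.g., CodeT5 embeddings).
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 512))          # placeholder for pre-trained-model features
y = rng.integers(0, 2, size=500)         # 1 = defect (e.g., Blob / Long Method), 0 = clean

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = BaggingClassifier(MLPClassifier(hidden_layer_sizes=(64,), max_iter=300),
                          n_estimators=10, random_state=0)
model.fit(X_tr, y_tr)
print("Held-out accuracy:", model.score(X_te, y_te))
```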