期刊文献+
共找到298篇文章
< 1 2 15 >
每页显示 20 50 100
Unlocking the Potential:A Comprehensive Systematic Review of ChatGPT in Natural Language Processing Tasks
1
作者 Ebtesam Ahmad Alomari 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第10期43-85,共43页
As Natural Language Processing(NLP)continues to advance,driven by the emergence of sophisticated large language models such as ChatGPT,there has been a notable growth in research activity.This rapid uptake reflects in... As Natural Language Processing(NLP)continues to advance,driven by the emergence of sophisticated large language models such as ChatGPT,there has been a notable growth in research activity.This rapid uptake reflects increasing interest in the field and induces critical inquiries into ChatGPT’s applicability in the NLP domain.This review paper systematically investigates the role of ChatGPT in diverse NLP tasks,including information extraction,Name Entity Recognition(NER),event extraction,relation extraction,Part of Speech(PoS)tagging,text classification,sentiment analysis,emotion recognition and text annotation.The novelty of this work lies in its comprehensive analysis of the existing literature,addressing a critical gap in understanding ChatGPT’s adaptability,limitations,and optimal application.In this paper,we employed a systematic stepwise approach following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses(PRISMA)framework to direct our search process and seek relevant studies.Our review reveals ChatGPT’s significant potential in enhancing various NLP tasks.Its adaptability in information extraction tasks,sentiment analysis,and text classification showcases its ability to comprehend diverse contexts and extract meaningful details.Additionally,ChatGPT’s flexibility in annotation tasks reducesmanual efforts and accelerates the annotation process,making it a valuable asset in NLP development and research.Furthermore,GPT-4 and prompt engineering emerge as a complementary mechanism,empowering users to guide the model and enhance overall accuracy.Despite its promising potential,challenges persist.The performance of ChatGP Tneeds tobe testedusingmore extensivedatasets anddiversedata structures.Subsequently,its limitations in handling domain-specific language and the need for fine-tuning in specific applications highlight the importance of further investigations to address these issues. 展开更多
关键词 Generative AI large languagemodel(LLM) natural language processing(NLP) ChatGPT GPT(generative pretraining transformer) GPT-4 sentiment analysis NER information extraction ANNOTATION text classification
下载PDF
Automatic Generation of Attribute-Based Access Control Policies from Natural Language Documents
2
作者 Fangfang Shan Zhenyu Wang +1 位作者 Mengyao Liu Menghan Zhang 《Computers, Materials & Continua》 SCIE EI 2024年第9期3881-3902,共22页
In response to the challenges of generating Attribute-Based Access Control(ABAC)policies,this paper proposes a deep learning-based method to automatically generate ABAC policies from natural language documents.This me... In response to the challenges of generating Attribute-Based Access Control(ABAC)policies,this paper proposes a deep learning-based method to automatically generate ABAC policies from natural language documents.This method is aimed at organizations such as companies and schools that are transitioning from traditional access control models to the ABAC model.The manual retrieval and analysis involved in this transition are inefficient,prone to errors,and costly.Most organizations have high-level specifications defined for security policies that include a set of access control policies,which often exist in the form of natural language documents.Utilizing this rich source of information,our method effectively identifies and extracts the necessary attributes and rules for access control from natural language documents,thereby constructing and optimizing access control policies.This work transforms the problem of policy automation generation into two tasks:extraction of access control statements andmining of access control attributes.First,the Chat General Language Model(ChatGLM)isemployed to extract access control-related statements from a wide range of natural language documents by constructing unique prompts and leveraging the model’s In-Context Learning to contextualize the statements.Then,the Iterated Dilated-Convolutions-Conditional Random Field(ID-CNN-CRF)model is used to annotate access control attributes within these extracted statements,including subject attributes,object attributes,and action attributes,thus reassembling new access control policies.Experimental results show that our method,compared to baseline methods,achieved the highest F1 score of 0.961,confirming the model’s effectiveness and accuracy. 展开更多
关键词 Access control policy generation natural language deep learning
下载PDF
Literature classification and its applications in condensed matter physics and materials science by natural language processing
3
作者 吴思远 朱天念 +5 位作者 涂思佳 肖睿娟 袁洁 吴泉生 李泓 翁红明 《Chinese Physics B》 SCIE EI CAS CSCD 2024年第5期117-123,共7页
The exponential growth of literature is constraining researchers’access to comprehensive information in related fields.While natural language processing(NLP)may offer an effective solution to literature classificatio... The exponential growth of literature is constraining researchers’access to comprehensive information in related fields.While natural language processing(NLP)may offer an effective solution to literature classification,it remains hindered by the lack of labelled dataset.In this article,we introduce a novel method for generating literature classification models through semi-supervised learning,which can generate labelled dataset iteratively with limited human input.We apply this method to train NLP models for classifying literatures related to several research directions,i.e.,battery,superconductor,topological material,and artificial intelligence(AI)in materials science.The trained NLP‘battery’model applied on a larger dataset different from the training and testing dataset can achieve F1 score of 0.738,which indicates the accuracy and reliability of this scheme.Furthermore,our approach demonstrates that even with insufficient data,the not-well-trained model in the first few cycles can identify the relationships among different research fields and facilitate the discovery and understanding of interdisciplinary directions. 展开更多
关键词 natural language processing text mining materials science
原文传递
Identification of Software Bugs by Analyzing Natural Language-Based Requirements Using Optimized Deep Learning Features
4
作者 Qazi Mazhar ul Haq Fahim Arif +4 位作者 Khursheed Aurangzeb Noor ul Ain Javed Ali Khan Saddaf Rubab Muhammad Shahid Anwar 《Computers, Materials & Continua》 SCIE EI 2024年第3期4379-4397,共19页
Software project outcomes heavily depend on natural language requirements,often causing diverse interpretations and issues like ambiguities and incomplete or faulty requirements.Researchers are exploring machine learn... Software project outcomes heavily depend on natural language requirements,often causing diverse interpretations and issues like ambiguities and incomplete or faulty requirements.Researchers are exploring machine learning to predict software bugs,but a more precise and general approach is needed.Accurate bug prediction is crucial for software evolution and user training,prompting an investigation into deep and ensemble learning methods.However,these studies are not generalized and efficient when extended to other datasets.Therefore,this paper proposed a hybrid approach combining multiple techniques to explore their effectiveness on bug identification problems.The methods involved feature selection,which is used to reduce the dimensionality and redundancy of features and select only the relevant ones;transfer learning is used to train and test the model on different datasets to analyze how much of the learning is passed to other datasets,and ensemble method is utilized to explore the increase in performance upon combining multiple classifiers in a model.Four National Aeronautics and Space Administration(NASA)and four Promise datasets are used in the study,showing an increase in the model’s performance by providing better Area Under the Receiver Operating Characteristic Curve(AUC-ROC)values when different classifiers were combined.It reveals that using an amalgam of techniques such as those used in this study,feature selection,transfer learning,and ensemble methods prove helpful in optimizing the software bug prediction models and providing high-performing,useful end mode. 展开更多
关键词 natural language processing software bug prediction transfer learning ensemble learning feature selection
下载PDF
Deep Learning with Natural Language Processing Enabled Sentimental Analysis on Sarcasm Classification 被引量:2
5
作者 Abdul Rahaman Wahab Sait Mohamad Khairi Ishak 《Computer Systems Science & Engineering》 SCIE EI 2023年第3期2553-2567,共15页
Sentiment analysis(SA)is the procedure of recognizing the emotions related to the data that exist in social networking.The existence of sarcasm in tex-tual data is a major challenge in the efficiency of the SA.Earlier... Sentiment analysis(SA)is the procedure of recognizing the emotions related to the data that exist in social networking.The existence of sarcasm in tex-tual data is a major challenge in the efficiency of the SA.Earlier works on sarcasm detection on text utilize lexical as well as pragmatic cues namely interjection,punctuations,and sentiment shift that are vital indicators of sarcasm.With the advent of deep-learning,recent works,leveraging neural networks in learning lexical and contextual features,removing the need for handcrafted feature.In this aspect,this study designs a deep learning with natural language processing enabled SA(DLNLP-SA)technique for sarcasm classification.The proposed DLNLP-SA technique aims to detect and classify the occurrence of sarcasm in the input data.Besides,the DLNLP-SA technique holds various sub-processes namely preprocessing,feature vector conversion,and classification.Initially,the pre-processing is performed in diverse ways such as single character removal,multi-spaces removal,URL removal,stopword removal,and tokenization.Secondly,the transformation of feature vectors takes place using the N-gram feature vector technique.Finally,mayfly optimization(MFO)with multi-head self-attention based gated recurrent unit(MHSA-GRU)model is employed for the detection and classification of sarcasm.To verify the enhanced outcomes of the DLNLP-SA model,a comprehensive experimental investigation is performed on the News Headlines Dataset from Kaggle Repository and the results signified the supremacy over the existing approaches. 展开更多
关键词 Sentiment analysis sarcasm detection deep learning natural language processing N-GRAMS hyperparameter tuning
下载PDF
Numerical‐discrete‐scheme‐incorporated recurrent neural network for tasks in natural language processing 被引量:1
6
作者 Mei Liu Wendi Luo +3 位作者 Zangtai Cai Xiujuan Du Jiliang Zhang Shuai Li 《CAAI Transactions on Intelligence Technology》 SCIE EI 2023年第4期1415-1424,共10页
A variety of neural networks have been presented to deal with issues in deep learning in the last decades.Despite the prominent success achieved by the neural network,it still lacks theoretical guidance to design an e... A variety of neural networks have been presented to deal with issues in deep learning in the last decades.Despite the prominent success achieved by the neural network,it still lacks theoretical guidance to design an efficient neural network model,and verifying the performance of a model needs excessive resources.Previous research studies have demonstrated that many existing models can be regarded as different numerical discretizations of differential equations.This connection sheds light on designing an effective recurrent neural network(RNN)by resorting to numerical analysis.Simple RNN is regarded as a discretisation of the forward Euler scheme.Considering the limited solution accuracy of the forward Euler methods,a Taylor‐type discrete scheme is presented with lower truncation error and a Taylor‐type RNN(T‐RNN)is designed with its guidance.Extensive experiments are conducted to evaluate its performance on statistical language models and emotion analysis tasks.The noticeable gains obtained by T‐RNN present its superiority and the feasibility of designing the neural network model using numerical methods. 展开更多
关键词 deep learning natural language processing neural network text analysis
下载PDF
Word Embeddings and Semantic Spaces in Natural Language Processing 被引量:1
7
作者 Peter J. Worth 《International Journal of Intelligence Science》 2023年第1期1-21,共21页
One of the critical hurdles, and breakthroughs, in the field of Natural Language Processing (NLP) in the last two decades has been the development of techniques for text representation that solves the so-called curse ... One of the critical hurdles, and breakthroughs, in the field of Natural Language Processing (NLP) in the last two decades has been the development of techniques for text representation that solves the so-called curse of dimensionality, a problem which plagues NLP in general given that the feature set for learning starts as a function of the size of the language in question, upwards of hundreds of thousands of terms typically. As such, much of the research and development in NLP in the last two decades has been in finding and optimizing solutions to this problem, to feature selection in NLP effectively. This paper looks at the development of these various techniques, leveraging a variety of statistical methods which rest on linguistic theories that were advanced in the middle of the last century, namely the distributional hypothesis which suggests that words that are found in similar contexts generally have similar meanings. In this survey paper we look at the development of some of the most popular of these techniques from a mathematical as well as data structure perspective, from Latent Semantic Analysis to Vector Space Models to their more modern variants which are typically referred to as word embeddings. In this review of algoriths such as Word2Vec, GloVe, ELMo and BERT, we explore the idea of semantic spaces more generally beyond applicability to NLP. 展开更多
关键词 natural Language Processing Vector Space Models Semantic Spaces Word Embeddings Representation Learning Text Vectorization Machine Learning Deep Learning
下载PDF
Natural Language Processing with Optimal Deep Learning-Enabled Intelligent Image Captioning System
8
作者 Radwa Marzouk Eatedal Alabdulkreem +5 位作者 Mohamed KNour Mesfer Al Duhayyim Mahmoud Othman Abu Sarwar Zamani Ishfaq Yaseen Abdelwahed Motwakel 《Computers, Materials & Continua》 SCIE EI 2023年第2期4435-4451,共17页
The recent developments in Multimedia Internet of Things(MIoT)devices,empowered with Natural Language Processing(NLP)model,seem to be a promising future of smart devices.It plays an important role in industrial models... The recent developments in Multimedia Internet of Things(MIoT)devices,empowered with Natural Language Processing(NLP)model,seem to be a promising future of smart devices.It plays an important role in industrial models such as speech understanding,emotion detection,home automation,and so on.If an image needs to be captioned,then the objects in that image,its actions and connections,and any silent feature that remains under-projected or missing from the images should be identified.The aim of the image captioning process is to generate a caption for image.In next step,the image should be provided with one of the most significant and detailed descriptions that is syntactically as well as semantically correct.In this scenario,computer vision model is used to identify the objects and NLP approaches are followed to describe the image.The current study develops aNatural Language Processing with Optimal Deep Learning Enabled Intelligent Image Captioning System(NLPODL-IICS).The aim of the presented NLPODL-IICS model is to produce a proper description for input image.To attain this,the proposed NLPODL-IICS follows two stages such as encoding and decoding processes.Initially,at the encoding side,the proposed NLPODL-IICS model makes use of Hunger Games Search(HGS)with Neural Search Architecture Network(NASNet)model.This model represents the input data appropriately by inserting it into a predefined length vector.Besides,during decoding phase,Chimp Optimization Algorithm(COA)with deeper Long Short Term Memory(LSTM)approach is followed to concatenate the description sentences 4436 CMC,2023,vol.74,no.2 produced by the method.The application of HGS and COA algorithms helps in accomplishing proper parameter tuning for NASNet and LSTM models respectively.The proposed NLPODL-IICS model was experimentally validated with the help of two benchmark datasets.Awidespread comparative analysis confirmed the superior performance of NLPODL-IICS model over other models. 展开更多
关键词 natural language processing information retrieval image captioning deep learning metaheuristics
下载PDF
A Natural Language Generation Algorithm for Greek by Using Hole Semantics and a Systemic Grammatical Formalism
9
作者 Ioannis Giachos Eleni Batzaki +2 位作者 Evangelos C.Papakitsos Stavros Kaminaris Nikolaos Laskaris 《Journal of Computer Science Research》 2023年第4期27-37,共11页
This work is about the progress of previous related work based on an experiment to improve the intelligence of robotic systems,with the aim of achieving more linguistic communication capabilities between humans and ro... This work is about the progress of previous related work based on an experiment to improve the intelligence of robotic systems,with the aim of achieving more linguistic communication capabilities between humans and robots.In this paper,the authors attempt an algorithmic approach to natural language generation through hole semantics and by applying the OMAS-III computational model as a grammatical formalism.In the original work,a technical language is used,while in the later works,this has been replaced by a limited Greek natural language dictionary.This particular effort was made to give the evolving system the ability to ask questions,as well as the authors developed an initial dialogue system using these techniques.The results show that the use of these techniques the authors apply can give us a more sophisticated dialogue system in the future. 展开更多
关键词 natural language processing natural language generation natural language understanding Dialog system Systemic grammar formalism OMAS-III HRI Virtual assistant Hole semantics
下载PDF
Inquiring Natural Language Processing Capabilities on Robotic Systems through Virtual Assistants:A Systemic Approach
10
作者 Ioannis Giachos Evangelos C.Papakitsos +1 位作者 Petros Savvidis Nikolaos Laskaris 《Journal of Computer Science Research》 2023年第2期28-36,共9页
This paper attempts to approach the interface of a robot from the perspective of virtual assistants.Virtual assistants can also be characterized as the mind of a robot,since they manage communication and action with t... This paper attempts to approach the interface of a robot from the perspective of virtual assistants.Virtual assistants can also be characterized as the mind of a robot,since they manage communication and action with the rest of the world they exist in.Therefore,virtual assistants can also be described as the brain of a robot and they include a Natural Language Processing(NLP)module for conducting communication in their human-robot interface.This work is focused on inquiring and enhancing the capabilities of this module.The problem is that nothing much is revealed about the nature of the human-robot interface of commercial virtual assistants.Therefore,any new attempt of developing such a capability has to start from scratch.Accordingly,to include corresponding capabilities to a developing NLP system of a virtual assistant,a method of systemic semantic modelling is proposed and applied.For this purpose,the paper briefly reviews the evolution of virtual assistants from the first assistant,in the form of a game,to the latest assistant that has significantly elevated their standards.Then there is a reference to the evolution of their services and their continued offerings,as well as future expectations.The paper presents their structure and the technologies used,according to the data provided by the development companies to the public,while an attempt is made to classify virtual assistants,based on their characteristics and capabilities.Consequently,a robotic NLP interface is being developed,based on the communicative power of a proposed systemic conceptual model that may enhance the NLP capabilities of virtual assistants,being tested through a small natural language dictionary in Greek. 展开更多
关键词 natural language processing Robotic systems Virtual assistant Human-robot interface
下载PDF
DPAL-BERT:A Faster and Lighter Question Answering Model
11
作者 Lirong Yin Lei Wang +8 位作者 Zhuohang Cai Siyu Lu Ruiyang Wang Ahmed AlSanad Salman A.AlQahtani Xiaobing Chen Zhengtong Yin Xiaolu Li Wenfeng Zheng 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第10期771-786,共16页
Recent advancements in natural language processing have given rise to numerous pre-training language models in question-answering systems.However,with the constant evolution of algorithms,data,and computing power,the ... Recent advancements in natural language processing have given rise to numerous pre-training language models in question-answering systems.However,with the constant evolution of algorithms,data,and computing power,the increasing size and complexity of these models have led to increased training costs and reduced efficiency.This study aims to minimize the inference time of such models while maintaining computational performance.It also proposes a novel Distillation model for PAL-BERT(DPAL-BERT),specifically,employs knowledge distillation,using the PAL-BERT model as the teacher model to train two student models:DPAL-BERT-Bi and DPAL-BERTC.This research enhances the dataset through techniques such as masking,replacement,and n-gram sampling to optimize knowledge transfer.The experimental results showed that the distilled models greatly outperform models trained from scratch.In addition,although the distilled models exhibit a slight decrease in performance compared to PAL-BERT,they significantly reduce inference time to just 0.25%of the original.This demonstrates the effectiveness of the proposed approach in balancing model performance and efficiency. 展开更多
关键词 DPAL-BERT question answering systems knowledge distillation model compression BERT Bi-directional long short-term memory(BiLSTM) knowledge information transfer PAL-BERT training efficiency natural language processing
下载PDF
Terrorism Attack Classification Using Machine Learning: The Effectiveness of Using Textual Features Extracted from GTD Dataset
12
作者 Mohammed Abdalsalam Chunlin Li +1 位作者 Abdelghani Dahou Natalia Kryvinska 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第2期1427-1467,共41页
One of the biggest dangers to society today is terrorism, where attacks have become one of the most significantrisks to international peace and national security. Big data, information analysis, and artificial intelli... One of the biggest dangers to society today is terrorism, where attacks have become one of the most significantrisks to international peace and national security. Big data, information analysis, and artificial intelligence (AI) havebecome the basis for making strategic decisions in many sensitive areas, such as fraud detection, risk management,medical diagnosis, and counter-terrorism. However, there is still a need to assess how terrorist attacks are related,initiated, and detected. For this purpose, we propose a novel framework for classifying and predicting terroristattacks. The proposed framework posits that neglected text attributes included in the Global Terrorism Database(GTD) can influence the accuracy of the model’s classification of terrorist attacks, where each part of the datacan provide vital information to enrich the ability of classifier learning. Each data point in a multiclass taxonomyhas one or more tags attached to it, referred as “related tags.” We applied machine learning classifiers to classifyterrorist attack incidents obtained from the GTD. A transformer-based technique called DistilBERT extracts andlearns contextual features from text attributes to acquiremore information from text data. The extracted contextualfeatures are combined with the “key features” of the dataset and used to perform the final classification. Thestudy explored different experimental setups with various classifiers to evaluate the model’s performance. Theexperimental results show that the proposed framework outperforms the latest techniques for classifying terroristattacks with an accuracy of 98.7% using a combined feature set and extreme gradient boosting classifier. 展开更多
关键词 Artificial intelligence machine learning natural language processing data analytic DistilBERT feature extraction terrorism classification GTD dataset
下载PDF
Comparing Fine-Tuning, Zero and Few-Shot Strategies with Large Language Models in Hate Speech Detection in English
13
作者 Ronghao Pan JoséAntonio García-Díaz Rafael Valencia-García 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第9期2849-2868,共20页
Large Language Models(LLMs)are increasingly demonstrating their ability to understand natural language and solve complex tasks,especially through text generation.One of the relevant capabilities is contextual learning... Large Language Models(LLMs)are increasingly demonstrating their ability to understand natural language and solve complex tasks,especially through text generation.One of the relevant capabilities is contextual learning,which involves the ability to receive instructions in natural language or task demonstrations to generate expected outputs for test instances without the need for additional training or gradient updates.In recent years,the popularity of social networking has provided a medium through which some users can engage in offensive and harmful online behavior.In this study,we investigate the ability of different LLMs,ranging from zero-shot and few-shot learning to fine-tuning.Our experiments show that LLMs can identify sexist and hateful online texts using zero-shot and few-shot approaches through information retrieval.Furthermore,it is found that the encoder-decoder model called Zephyr achieves the best results with the fine-tuning approach,scoring 86.811%on the Explainable Detection of Online Sexism(EDOS)test-set and 57.453%on the Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter(HatEval)test-set.Finally,it is confirmed that the evaluated models perform well in hate text detection,as they beat the best result in the HatEval task leaderboard.The error analysis shows that contextual learning had difficulty distinguishing between types of hate speech and figurative language.However,the fine-tuned approach tends to produce many false positives. 展开更多
关键词 Hate speech detection zero-shot few-shot fine-tuning natural language processing
下载PDF
AI-Driven Learning Management Systems:Modern Developments, Challenges and Future Trends during theAge of ChatGPT
14
作者 Sameer Qazi Muhammad Bilal Kadri +4 位作者 Muhammad Naveed Bilal AKhawaja Sohaib Zia Khan Muhammad Mansoor Alam Mazliham Mohd Su’ud 《Computers, Materials & Continua》 SCIE EI 2024年第8期3289-3314,共26页
COVID-19 pandemic restrictions limited all social activities to curtail the spread of the virus.The foremost and most prime sector among those affected were schools,colleges,and universities.The education system of en... COVID-19 pandemic restrictions limited all social activities to curtail the spread of the virus.The foremost and most prime sector among those affected were schools,colleges,and universities.The education system of entire nations had shifted to online education during this time.Many shortcomings of Learning Management Systems(LMSs)were detected to support education in an online mode that spawned the research in Artificial Intelligence(AI)based tools that are being developed by the research community to improve the effectiveness of LMSs.This paper presents a detailed survey of the different enhancements to LMSs,which are led by key advances in the area of AI to enhance the real-time and non-real-time user experience.The AI-based enhancements proposed to the LMSs start from the Application layer and Presentation layer in the form of flipped classroom models for the efficient learning environment and appropriately designed UI/UX for efficient utilization of LMS utilities and resources,including AI-based chatbots.Session layer enhancements are also required,such as AI-based online proctoring and user authentication using Biometrics.These extend to the Transport layer to support real-time and rate adaptive encrypted video transmission for user security/privacy and satisfactory working of AI-algorithms.It also needs the support of the Networking layer for IP-based geolocation features,the Virtual Private Network(VPN)feature,and the support of Software-Defined Networks(SDN)for optimum Quality of Service(QoS).Finally,in addition to these,non-real-time user experience is enhanced by other AI-based enhancements such as Plagiarism detection algorithms and Data Analytics. 展开更多
关键词 Learning management systems chatbots ChatGPT online education Internet of Things(IoT) artificial intelligence(AI) convolutional neural networks natural language processing
下载PDF
Comparative Analysis of Machine Learning Algorithms for Email Phishing Detection Using TF-IDF, Word2Vec, and BERT
15
作者 Arar Al Tawil Laiali Almazaydeh +3 位作者 Doaa Qawasmeh Baraah Qawasmeh Mohammad Alshinwan Khaled Elleithy 《Computers, Materials & Continua》 SCIE EI 2024年第11期3395-3412,共18页
Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information,a practice known as phishing.This study utilizes three distinct methodologies,Te... Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information,a practice known as phishing.This study utilizes three distinct methodologies,Term Frequency-Inverse Document Frequency,Word2Vec,and Bidirectional Encoder Representations from Transform-ers,to evaluate the effectiveness of various machine learning algorithms in detecting phishing attacks.The study uses feature extraction methods to assess the performance of Logistic Regression,Decision Tree,Random Forest,and Multilayer Perceptron algorithms.The best results for each classifier using Term Frequency-Inverse Document Frequency were Multilayer Perceptron(Precision:0.98,Recall:0.98,F1-score:0.98,Accuracy:0.98).Word2Vec’s best results were Multilayer Perceptron(Precision:0.98,Recall:0.98,F1-score:0.98,Accuracy:0.98).The highest performance was achieved using the Bidirectional Encoder Representations from the Transformers model,with Precision,Recall,F1-score,and Accuracy all reaching 0.99.This study highlights how advanced pre-trained models,such as Bidirectional Encoder Representations from Transformers,can significantly enhance the accuracy and reliability of fraud detection systems. 展开更多
关键词 ATTACKS email phishing machine learning security representations from transformers(BERT) text classifeir natural language processing(NLP)
下载PDF
Sentiment Analysis of Low-Resource Language Literature Using Data Processing and Deep Learning
16
作者 Aizaz Ali Maqbool Khan +2 位作者 Khalil Khan Rehan Ullah Khan Abdulrahman Aloraini 《Computers, Materials & Continua》 SCIE EI 2024年第4期713-733,共21页
Sentiment analysis, a crucial task in discerning emotional tones within the text, plays a pivotal role in understandingpublic opinion and user sentiment across diverse languages.While numerous scholars conduct sentime... Sentiment analysis, a crucial task in discerning emotional tones within the text, plays a pivotal role in understandingpublic opinion and user sentiment across diverse languages.While numerous scholars conduct sentiment analysisin widely spoken languages such as English, Chinese, Arabic, Roman Arabic, and more, we come to grapplingwith resource-poor languages like Urdu literature which becomes a challenge. Urdu is a uniquely crafted language,characterized by a script that amalgamates elements from diverse languages, including Arabic, Parsi, Pashtu,Turkish, Punjabi, Saraiki, and more. As Urdu literature, characterized by distinct character sets and linguisticfeatures, presents an additional hurdle due to the lack of accessible datasets, rendering sentiment analysis aformidable undertaking. The limited availability of resources has fueled increased interest among researchers,prompting a deeper exploration into Urdu sentiment analysis. This research is dedicated to Urdu languagesentiment analysis, employing sophisticated deep learning models on an extensive dataset categorized into fivelabels: Positive, Negative, Neutral, Mixed, and Ambiguous. The primary objective is to discern sentiments andemotions within the Urdu language, despite the absence of well-curated datasets. To tackle this challenge, theinitial step involves the creation of a comprehensive Urdu dataset by aggregating data from various sources such asnewspapers, articles, and socialmedia comments. Subsequent to this data collection, a thorough process of cleaningand preprocessing is implemented to ensure the quality of the data. The study leverages two well-known deeplearningmodels, namely Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), for bothtraining and evaluating sentiment analysis performance. Additionally, the study explores hyperparameter tuning tooptimize the models’ efficacy. Evaluation metrics such as precision, recall, and the F1-score are employed to assessthe effectiveness of the models. The research findings reveal that RNN surpasses CNN in Urdu sentiment analysis,gaining a significantly higher accuracy rate of 91%. This result accentuates the exceptional performance of RNN,solidifying its status as a compelling option for conducting sentiment analysis tasks in the Urdu language. 展开更多
关键词 Urdu sentiment analysis convolutional neural networks recurrent neural network deep learning natural language processing neural networks
下载PDF
Enhancing Arabic Cyberbullying Detection with End-to-End Transformer Model
17
作者 Mohamed A.Mahdi Suliman Mohamed Fati +2 位作者 Mohamed A.G.Hazber Shahanawaj Ahamad Sawsan A.Saad 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第11期1651-1671,共21页
Cyberbullying,a critical concern for digital safety,necessitates effective linguistic analysis tools that can navigate the complexities of language use in online spaces.To tackle this challenge,our study introduces a ... Cyberbullying,a critical concern for digital safety,necessitates effective linguistic analysis tools that can navigate the complexities of language use in online spaces.To tackle this challenge,our study introduces a new approach employing Bidirectional Encoder Representations from the Transformers(BERT)base model(cased),originally pretrained in English.This model is uniquely adapted to recognize the intricate nuances of Arabic online communication,a key aspect often overlooked in conventional cyberbullying detection methods.Our model is an end-to-end solution that has been fine-tuned on a diverse dataset of Arabic social media(SM)tweets showing a notable increase in detection accuracy and sensitivity compared to existing methods.Experimental results on a diverse Arabic dataset collected from the‘X platform’demonstrate a notable increase in detection accuracy and sensitivity compared to existing methods.E-BERT shows a substantial improvement in performance,evidenced by an accuracy of 98.45%,precision of 99.17%,recall of 99.10%,and an F1 score of 99.14%.The proposed E-BERT not only addresses a critical gap in cyberbullying detection in Arabic online forums but also sets a precedent for applying cross-lingual pretrained models in regional language applications,offering a scalable and effective framework for enhancing online safety across Arabic-speaking communities. 展开更多
关键词 CYBERBULLYING offensive detection Bidirectional Encoder Representations from the Transformers(BERT) continuous bag of words Social Media natural language processing
下载PDF
Audio-visual keyword transformer for unconstrained sentence-level keyword spotting
18
作者 Yidi Li Jiale Ren +3 位作者 Yawei Wang Guoquan Wang Xia Li Hong Liu 《CAAI Transactions on Intelligence Technology》 SCIE EI 2024年第1期142-152,共11页
As one of the most effective methods to improve the accuracy and robustness of speech tasks,the audio-visual fusion approach has recently been introduced into the field of Keyword Spotting(KWS).However,existing audio-... As one of the most effective methods to improve the accuracy and robustness of speech tasks,the audio-visual fusion approach has recently been introduced into the field of Keyword Spotting(KWS).However,existing audio-visual keyword spotting models are limited to detecting isolated words,while keyword spotting for unconstrained speech is still a challenging problem.To this end,an Audio-Visual Keyword Transformer(AVKT)network is proposed to spot keywords in unconstrained video clips.The authors present a transformer classifier with learnable CLS tokens to extract distinctive keyword features from the variable-length audio and visual inputs.The outputs of audio and visual branches are combined in a decision fusion module.As humans can easily notice whether a keyword appears in a sentence or not,our AVKT network can detect whether a video clip with a spoken sentence contains a pre-specified keyword.Moreover,the position of the keyword is localised in the attention map without additional position labels.Exper-imental results on the LRS2-KWS dataset and our newly collected PKU-KWS dataset show that the accuracy of AVKT exceeded 99%in clean scenes and 85%in extremely noisy conditions.The code is available at https://github.com/jialeren/AVKT. 展开更多
关键词 artificial intelligence multimodal approaches natural language processing neural network speech processing
下载PDF
A comprehensive review of existing corpora and methods for creating annotated corpora for event extraction tasks
19
作者 Mohd Hafizul Afifi Abdullah Norshakirah Aziz +3 位作者 Said Jadid Abdulkadir Kashif Hussain Hitham Alhussian Noureen Talpur 《Journal of Data and Information Science》 CSCD 2024年第4期196-238,共43页
Purpose:The purpose of this study is to serve as a comprehensive review of the existing annotated corpora.This review study aims to provide information on the existing annotated corpora for event extraction,which are ... Purpose:The purpose of this study is to serve as a comprehensive review of the existing annotated corpora.This review study aims to provide information on the existing annotated corpora for event extraction,which are limited but essential for training and improving the existing event extraction algorithms.In addition to the primary goal of this study,it provides guidelines for preparing an annotated corpus and suggests suitable tools for the annotation task.Design/methodology/approach:This study employs an analytical approach to examine available corpus that is suitable for event extraction tasks.It offers an in-depth analysis of existing event extraction corpora and provides systematic guidelines for researchers to develop accurate,high-quality corpora.This ensures the reliability of the created corpus and its suitability for training machine learning algorithms.Findings:Our exploration reveals a scarcity of annotated corpora for event extraction tasks.In particular,the English corpora are mainly focused on the biomedical and general domains.Despite the issue of annotated corpora scarcity,there are several high-quality corpora available and widely used as benchmark datasets.However,access to some of these corpora might be limited owing to closed-access policies or discontinued maintenance after being initially released,rendering them inaccessible owing to broken links.Therefore,this study documents the available corpora for event extraction tasks.Research limitations:Our study focuses only on well-known corpora available in English and Chinese.Nevertheless,this study places a strong emphasis on the English corpora due to its status as a global lingua franca,making it widely understood compared to other languages.Practical implications:We genuinely believe that this study provides valuable knowledge that can serve as a guiding framework for preparing and accurately annotating events from text corpora.It provides comprehensive guidelines for researchers to improve the quality of corpus annotations,especially for event extraction tasks across various domains.Originality/value:This study comprehensively compiled information on the existing annotated corpora for event extraction tasks and provided preparation guidelines. 展开更多
关键词 Information extraction Event extraction Text mining Large language model natural language processing
下载PDF
A Joint Entity Relation Extraction Model Based on Relation Semantic Template Automatically Constructed
20
作者 Wei Liu Meijuan Yin +1 位作者 Jialong Zhang Lunchong Cui 《Computers, Materials & Continua》 SCIE EI 2024年第1期975-997,共23页
The joint entity relation extraction model which integrates the semantic information of relation is favored by relevant researchers because of its effectiveness in solving the overlapping of entities,and the method of... The joint entity relation extraction model which integrates the semantic information of relation is favored by relevant researchers because of its effectiveness in solving the overlapping of entities,and the method of defining the semantic template of relation manually is particularly prominent in the extraction effect because it can obtain the deep semantic information of relation.However,this method has some problems,such as relying on expert experience and poor portability.Inspired by the rule-based entity relation extraction method,this paper proposes a joint entity relation extraction model based on a relation semantic template automatically constructed,which is abbreviated as RSTAC.This model refines the extraction rules of relation semantic templates from relation corpus through dependency parsing and realizes the automatic construction of relation semantic templates.Based on the relation semantic template,the process of relation classification and triplet extraction is constrained,and finally,the entity relation triplet is obtained.The experimental results on the three major Chinese datasets of DuIE,SanWen,and FinRE showthat the RSTAC model successfully obtains rich deep semantics of relation,improves the extraction effect of entity relation triples,and the F1 scores are increased by an average of 0.96% compared with classical joint extraction models such as CasRel,TPLinker,and RFBFN. 展开更多
关键词 natural language processing deep learning information extraction relation extraction relation semantic template
下载PDF
上一页 1 2 15 下一页 到第
使用帮助 返回顶部