As Natural Language Processing (NLP) continues to advance, driven by the emergence of sophisticated large language models such as ChatGPT, there has been a notable growth in research activity. This rapid uptake reflects increasing interest in the field and raises critical questions about ChatGPT’s applicability in the NLP domain. This review paper systematically investigates the role of ChatGPT in diverse NLP tasks, including information extraction, Named Entity Recognition (NER), event extraction, relation extraction, Part of Speech (PoS) tagging, text classification, sentiment analysis, emotion recognition, and text annotation. The novelty of this work lies in its comprehensive analysis of the existing literature, addressing a critical gap in understanding ChatGPT’s adaptability, limitations, and optimal application. In this paper, we employed a systematic stepwise approach following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to direct our search process and seek relevant studies. Our review reveals ChatGPT’s significant potential in enhancing various NLP tasks. Its adaptability in information extraction tasks, sentiment analysis, and text classification showcases its ability to comprehend diverse contexts and extract meaningful details. Additionally, ChatGPT’s flexibility in annotation tasks reduces manual effort and accelerates the annotation process, making it a valuable asset in NLP development and research. Furthermore, GPT-4 and prompt engineering emerge as complementary mechanisms, empowering users to guide the model and enhance overall accuracy. Despite its promising potential, challenges persist. The performance of ChatGPT needs to be tested using more extensive datasets and diverse data structures. Moreover, its limitations in handling domain-specific language and the need for fine-tuning in specific applications highlight the importance of further investigations to address these issues.
In response to the challenges of generating Attribute-Based Access Control (ABAC) policies, this paper proposes a deep learning-based method to automatically generate ABAC policies from natural language documents. This method is aimed at organizations such as companies and schools that are transitioning from traditional access control models to the ABAC model. The manual retrieval and analysis involved in this transition are inefficient, prone to errors, and costly. Most organizations have high-level specifications defined for security policies that include a set of access control policies, which often exist in the form of natural language documents. Utilizing this rich source of information, our method effectively identifies and extracts the necessary attributes and rules for access control from natural language documents, thereby constructing and optimizing access control policies. This work transforms the problem of automated policy generation into two tasks: extraction of access control statements and mining of access control attributes. First, the Chat General Language Model (ChatGLM) is employed to extract access control-related statements from a wide range of natural language documents by constructing unique prompts and leveraging the model’s In-Context Learning to contextualize the statements. Then, the Iterated Dilated-Convolutions-Conditional Random Field (ID-CNN-CRF) model is used to annotate access control attributes within these extracted statements, including subject attributes, object attributes, and action attributes, thus reassembling new access control policies. Experimental results show that our method, compared to baseline methods, achieved the highest F1 score of 0.961, confirming the model’s effectiveness and accuracy.
The exponential growth of literature is constraining researchers’ access to comprehensive information in related fields. While natural language processing (NLP) may offer an effective solution to literature classification, it remains hindered by the lack of labelled datasets. In this article, we introduce a novel method for generating literature classification models through semi-supervised learning, which can generate labelled datasets iteratively with limited human input. We apply this method to train NLP models for classifying literature related to several research directions, i.e., battery, superconductor, topological material, and artificial intelligence (AI) in materials science. When applied to a larger dataset distinct from the training and testing data, the trained NLP ‘battery’ model achieves an F1 score of 0.738, which indicates the accuracy and reliability of this scheme. Furthermore, our approach demonstrates that even with insufficient data, the not-yet-well-trained model of the first few cycles can identify the relationships among different research fields and facilitate the discovery and understanding of interdisciplinary directions.
Software project outcomes heavily depend on natural language requirements, which often cause diverse interpretations and issues such as ambiguities and incomplete or faulty requirements. Researchers are exploring machine learning to predict software bugs, but a more precise and general approach is needed. Accurate bug prediction is crucial for software evolution and user training, prompting an investigation into deep and ensemble learning methods. However, these studies neither generalize well nor remain efficient when extended to other datasets. Therefore, this paper proposes a hybrid approach combining multiple techniques to explore their effectiveness on bug identification problems. The methods involve feature selection, which reduces the dimensionality and redundancy of features and keeps only the relevant ones; transfer learning, which trains and tests the model on different datasets to analyze how much of the learning carries over to other datasets; and an ensemble method, which explores the increase in performance obtained by combining multiple classifiers in a model. Four National Aeronautics and Space Administration (NASA) datasets and four Promise datasets are used in the study, showing an increase in the model’s performance through better Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values when different classifiers were combined. The study reveals that an amalgam of techniques such as those used here (feature selection, transfer learning, and ensemble methods) proves helpful in optimizing software bug prediction models and providing a high-performing, useful end model.
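The classifier-combination step described above can be illustrated with a minimal sketch. It assumes soft voting (averaging predicted probabilities) as the ensemble rule, which the abstract does not specify, and the function names `soft_vote` and `auc_roc` as well as the toy scores are hypothetical. The AUC-ROC is computed from its pairwise-ranking definition rather than with any particular library.

```python
def soft_vote(prob_lists):
    """Average per-sample bug probabilities across classifiers (assumed ensemble rule)."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (buggy, clean) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration: 1 = buggy module, 0 = clean module.
labels = [1, 0, 1, 0]
clf_a = [0.9, 0.6, 0.4, 0.2]
clf_b = [0.5, 0.3, 0.8, 0.7]
ensemble = soft_vote([clf_a, clf_b])
```

On this toy data the averaged scores rank every buggy module above every clean one even though neither classifier does so alone. That is a property of the invented numbers, not a guarantee that ensembles always help.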
Sentiment analysis (SA) is the procedure of recognizing the emotions expressed in social networking data. The presence of sarcasm in textual data is a major challenge to the effectiveness of SA. Earlier works on sarcasm detection in text utilize lexical as well as pragmatic cues, namely interjections, punctuation, and sentiment shift, which are vital indicators of sarcasm. With the advent of deep learning, recent works leverage neural networks to learn lexical and contextual features, removing the need for handcrafted features. In this context, this study designs a deep learning with natural language processing enabled SA (DLNLP-SA) technique for sarcasm classification. The proposed DLNLP-SA technique aims to detect and classify the occurrence of sarcasm in the input data. The technique comprises several sub-processes, namely preprocessing, feature vector conversion, and classification. Initially, preprocessing is performed in several steps: single character removal, multi-space removal, URL removal, stopword removal, and tokenization. Secondly, the text is transformed into feature vectors using the N-gram feature vector technique. Finally, mayfly optimization (MFO) with a multi-head self-attention based gated recurrent unit (MHSA-GRU) model is employed for the detection and classification of sarcasm. To verify the enhanced outcomes of the DLNLP-SA model, a comprehensive experimental investigation is performed on the News Headlines Dataset from the Kaggle repository, and the results indicate its superiority over existing approaches.
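The preprocessing and N-gram steps listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the tiny stopword list, the lowercasing, and the punctuation stripping are all assumptions added to make the example self-contained.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; the abstract does not name one.
STOPWORDS = {"an", "the", "is", "are", "of", "to", "in", "on", "and"}


def preprocess(text):
    """Apply the cleaning steps named in the abstract and return a token list."""
    text = text.lower()                                 # assumed normalization
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URL removal
    text = re.sub(r"[^a-z0-9\s]", " ", text)            # punctuation stripping (assumed)
    text = re.sub(r"\b[a-z0-9]\b", " ", text)           # single-character removal
    text = re.sub(r"\s+", " ", text).strip()            # multi-space removal
    tokens = text.split(" ") if text else []            # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]    # stopword removal


def ngram_counts(tokens, n=2):
    """Count contiguous n-grams; the counts act as a sparse feature vector."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


tokens = preprocess("Oh great, a new   update! See https://example.com/x now")
features = ngram_counts(tokens, n=2)
```

Here `tokens` ends up as `['oh', 'great', 'new', 'update', 'see', 'now']`. Note that stripping punctuation discards cues such as exclamation marks, so a real sarcasm pipeline may prefer to keep them.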
A variety of neural networks have been presented to deal with issues in deep learning in the last decades. Despite the prominent success achieved by neural networks, theoretical guidance for designing an efficient neural network model is still lacking, and verifying the performance of a model requires excessive resources. Previous research has demonstrated that many existing models can be regarded as different numerical discretizations of differential equations. This connection sheds light on designing an effective recurrent neural network (RNN) by resorting to numerical analysis. The simple RNN is regarded as a forward Euler discretisation. Considering the limited solution accuracy of the forward Euler method, a Taylor‐type discrete scheme with lower truncation error is presented, and a Taylor‐type RNN (T‐RNN) is designed under its guidance. Extensive experiments are conducted to evaluate its performance on statistical language modelling and emotion analysis tasks. The noticeable gains obtained by T‐RNN demonstrate its superiority and the feasibility of designing neural network models using numerical methods.
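The link between a recurrent update and the forward Euler method can be written out generically. The following is an illustrative formulation only: the symbols $f$, $h$, $x$, and $\Delta t$ are assumptions introduced here, and the specific Taylor‐type scheme used by T‐RNN is not given in the abstract, so only the standard Taylor expansion that higher-order schemes build on is shown.

```latex
% Continuous-time hidden-state dynamics (illustrative assumption):
\frac{dh(t)}{dt} = f\bigl(h(t), x(t)\bigr)

% Forward Euler step with step size \Delta t, read as a recurrent update:
h_{t+1} = h_t + \Delta t \, f(h_t, x_t),
\qquad \text{local truncation error } O(\Delta t^{2})

% Taylor expansion from which lower-error schemes can be derived:
h(t + \Delta t) = h(t) + \Delta t \, h'(t) + \frac{\Delta t^{2}}{2} \, h''(t) + O(\Delta t^{3})
```

Forward Euler keeps only the first-order term of the expansion, which is why its per-step error is second order in $\Delta t$; a scheme that accounts for further terms reduces that error.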
One of the critical hurdles, and breakthroughs, in the field of Natural Language Processing (NLP) in the last two decades has been the development of techniques for text representation that solve the so-called curse of dimensionality, a problem which plagues NLP in general given that the feature set for learning starts as a function of the size of the language in question, typically upwards of hundreds of thousands of terms. As such, much of the research and development in NLP in the last two decades has been devoted to finding and optimizing solutions to this problem, which is effectively a problem of feature selection. This paper looks at the development of these various techniques, which leverage a variety of statistical methods resting on linguistic theories advanced in the middle of the last century, namely the distributional hypothesis, which suggests that words found in similar contexts generally have similar meanings. In this survey paper we look at the development of some of the most popular of these techniques from a mathematical as well as a data structure perspective, from Latent Semantic Analysis to Vector Space Models to their more modern variants, which are typically referred to as word embeddings. In this review of algorithms such as Word2Vec, GloVe, ELMo and BERT, we explore the idea of semantic spaces more generally, beyond their applicability to NLP.
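The distributional hypothesis mentioned above can be made concrete with a small count-based vector space. This sketch is an assumption-laden toy, not a method from the surveyed paper: the function names, the window size, and the three example sentences are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a sparse vector of counts of words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus, invented for illustration.
corpus = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "milk"],
    ["stocks", "fell", "sharply", "today"],
]
vecs = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vecs["cat"], vecs["dog"])
sim_cat_stocks = cosine(vecs["cat"], vecs["stocks"])
```

Because "cat" and "dog" occur in identical contexts here, their vectors coincide, while "cat" and "stocks" share no context words. Raw count vectors like these grow with vocabulary size, which is the dimensionality problem that embeddings address.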
Recent developments in Multimedia Internet of Things (MIoT) devices, empowered with Natural Language Processing (NLP) models, point to a promising future for smart devices. NLP plays an important role in industrial applications such as speech understanding, emotion detection, home automation, and so on. If an image needs to be captioned, then the objects in that image, their actions and connections, and any salient feature that remains under-projected or missing from the image should be identified. The aim of the image captioning process is to generate a caption for an image. In the next step, the image should be provided with one of the most significant and detailed descriptions that is syntactically as well as semantically correct. In this scenario, a computer vision model is used to identify the objects and NLP approaches are followed to describe the image. The current study develops a Natural Language Processing with Optimal Deep Learning Enabled Intelligent Image Captioning System (NLPODL-IICS). The aim of the presented NLPODL-IICS model is to produce a proper description for an input image. To attain this, the proposed NLPODL-IICS follows two stages, namely encoding and decoding. Initially, on the encoding side, the proposed NLPODL-IICS model makes use of Hunger Games Search (HGS) with the Neural Architecture Search Network (NASNet) model. This model represents the input data appropriately by embedding it into a vector of predefined length. During the decoding phase, the Chimp Optimization Algorithm (COA) with a deeper Long Short Term Memory (LSTM) approach is followed to concatenate the description sentences produced by the method. The application of the HGS and COA algorithms helps in accomplishing proper parameter tuning for the NASNet and LSTM models, respectively. The proposed NLPODL-IICS model was experimentally validated with the help of two benchmark datasets. A wide-ranging comparative analysis confirmed the superior performance of the NLPODL-IICS model over other models. (CMC, 2023, vol. 74, no. 2)
This work reports progress on previous related work based on an experiment to improve the intelligence of robotic systems, with the aim of achieving richer linguistic communication between humans and robots. In this paper, the authors attempt an algorithmic approach to natural language generation through hole semantics and by applying the OMAS-III computational model as a grammatical formalism. In the original work, a technical language was used, while in the later works this has been replaced by a limited Greek natural language dictionary. This particular effort was made to give the evolving system the ability to ask questions, and the authors developed an initial dialogue system using these techniques. The results show that the techniques the authors apply can lead to a more sophisticated dialogue system in the future.
This paper attempts to approach the interface of a robot from the perspective of virtual assistants. Virtual assistants can be characterized as the mind of a robot, since they manage communication and action with the rest of the world they exist in. Virtual assistants can therefore also be described as the brain of a robot, and they include a Natural Language Processing (NLP) module for conducting communication in their human-robot interface. This work is focused on investigating and enhancing the capabilities of this module. The problem is that little is revealed about the nature of the human-robot interface of commercial virtual assistants. Therefore, any new attempt at developing such a capability has to start from scratch. Accordingly, to add corresponding capabilities to a developing NLP system of a virtual assistant, a method of systemic semantic modelling is proposed and applied. For this purpose, the paper briefly reviews the evolution of virtual assistants from the first assistant, in the form of a game, to the latest assistant, which has significantly raised their standards. It then discusses the evolution of their services and their continued offerings, as well as future expectations. The paper presents their structure and the technologies used, according to the data the development companies have made public, and attempts to classify virtual assistants based on their characteristics and capabilities. Consequently, a robotic NLP interface is being developed, based on the communicative power of a proposed systemic conceptual model that may enhance the NLP capabilities of virtual assistants, and is being tested with a small natural language dictionary in Greek.
Recent advancements in natural language processing have given rise to numerous pre-trained language models in question-answering systems. However, with the constant evolution of algorithms, data, and computing power, the increasing size and complexity of these models have led to increased training costs and reduced efficiency. This study aims to minimize the inference time of such models while maintaining computational performance. It proposes a novel Distillation model for PAL-BERT (DPAL-BERT), which employs knowledge distillation, using the PAL-BERT model as the teacher model to train two student models: DPAL-BERT-Bi and DPAL-BERTC. This research enhances the dataset through techniques such as masking, replacement, and n-gram sampling to optimize knowledge transfer. The experimental results show that the distilled models greatly outperform models trained from scratch. In addition, although the distilled models exhibit a slight decrease in performance compared to PAL-BERT, they significantly reduce inference time to just 0.25% of the original. This demonstrates the effectiveness of the proposed approach in balancing model performance and efficiency.
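The teacher-student idea behind knowledge distillation can be sketched with the commonly used soft-target loss. This is a generic formulation assumed for illustration, not the DPAL-BERT training objective, which the abstract does not spell out; the function names, the temperature value, and the example logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by T^2 (assumed standard form)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(p_teacher, p_student) if p > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a three-way answer choice.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]
loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
```

A student whose outputs track the teacher's incurs a small loss, and one that disagrees incurs a large loss. In practice this term is usually combined with an ordinary supervised loss on the ground-truth labels.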
One of the biggest dangers to society today is terrorism, where attacks have become one of the most significant risks to international peace and national security. Big data, information analysis, and artificial intelligence (AI) have become the basis for making strategic decisions in many sensitive areas, such as fraud detection, risk management, medical diagnosis, and counter-terrorism. However, there is still a need to assess how terrorist attacks are related, initiated, and detected. For this purpose, we propose a novel framework for classifying and predicting terrorist attacks. The proposed framework posits that neglected text attributes included in the Global Terrorism Database (GTD) can influence the accuracy of the model’s classification of terrorist attacks, since each part of the data can provide vital information that enriches classifier learning. Each data point in a multiclass taxonomy has one or more tags attached to it, referred to as “related tags.” We applied machine learning classifiers to classify terrorist attack incidents obtained from the GTD. A transformer-based technique called DistilBERT extracts and learns contextual features from text attributes to acquire more information from text data. The extracted contextual features are combined with the “key features” of the dataset and used to perform the final classification. The study explored different experimental setups with various classifiers to evaluate the model’s performance. The experimental results show that the proposed framework outperforms the latest techniques for classifying terrorist attacks, with an accuracy of 98.7% using a combined feature set and an extreme gradient boosting classifier.
Large Language Models (LLMs) are increasingly demonstrating their ability to understand natural language and solve complex tasks, especially through text generation. One of the relevant capabilities is contextual learning, which involves the ability to receive instructions in natural language or task demonstrations and generate expected outputs for test instances without additional training or gradient updates. In recent years, the popularity of social networking has provided a medium through which some users can engage in offensive and harmful online behavior. In this study, we investigate the ability of different LLMs to detect such behavior, using approaches ranging from zero-shot and few-shot learning to fine-tuning. Our experiments show that LLMs can identify sexist and hateful online texts using zero-shot and few-shot approaches through information retrieval. Furthermore, it is found that the encoder-decoder model called Zephyr achieves the best results with the fine-tuning approach, scoring 86.811% on the Explainable Detection of Online Sexism (EDOS) test set and 57.453% on the Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (HatEval) test set. Finally, it is confirmed that the evaluated models perform well in hate text detection, as they beat the best result on the HatEval task leaderboard. The error analysis shows that contextual learning had difficulty distinguishing between types of hate speech and figurative language, whereas the fine-tuned approach tends to produce many false positives.
COVID-19 pandemic restrictions limited all social activities to curtail the spread of the virus. Among the sectors most affected were schools, colleges, and universities. The education systems of entire nations shifted to online education during this time. Many shortcomings of Learning Management Systems (LMSs) in supporting online education were detected, which spawned research into Artificial Intelligence (AI) based tools that the research community is developing to improve the effectiveness of LMSs. This paper presents a detailed survey of the different enhancements to LMSs, led by key advances in AI, that improve the real-time and non-real-time user experience. The AI-based enhancements proposed for LMSs start from the Application layer and Presentation layer, in the form of flipped classroom models for an efficient learning environment and appropriately designed UI/UX for efficient utilization of LMS utilities and resources, including AI-based chatbots. Session layer enhancements are also required, such as AI-based online proctoring and user authentication using biometrics. These extend to the Transport layer, to support real-time and rate-adaptive encrypted video transmission for user security and privacy and for the satisfactory working of AI algorithms. Support from the Networking layer is also needed for IP-based geolocation features, the Virtual Private Network (VPN) feature, and Software-Defined Networks (SDN) for optimum Quality of Service (QoS). Finally, the non-real-time user experience is enhanced by other AI-based features such as plagiarism detection algorithms and data analytics.
Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information, a practice known as phishing. This study utilizes three distinct feature extraction methods, Term Frequency-Inverse Document Frequency, Word2Vec, and Bidirectional Encoder Representations from Transformers, to evaluate the effectiveness of various machine learning algorithms in detecting phishing attacks. These feature extraction methods are used to assess the performance of the Logistic Regression, Decision Tree, Random Forest, and Multilayer Perceptron algorithms. With Term Frequency-Inverse Document Frequency, the best results were obtained by the Multilayer Perceptron (Precision: 0.98, Recall: 0.98, F1-score: 0.98, Accuracy: 0.98). With Word2Vec, the best results were also obtained by the Multilayer Perceptron (Precision: 0.98, Recall: 0.98, F1-score: 0.98, Accuracy: 0.98). The highest performance was achieved using the Bidirectional Encoder Representations from Transformers model, with Precision, Recall, F1-score, and Accuracy all reaching 0.99. This study highlights how advanced pre-trained models, such as Bidirectional Encoder Representations from Transformers, can significantly enhance the accuracy and reliability of fraud detection systems.
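Of the three feature extraction methods named above, Term Frequency-Inverse Document Frequency is simple enough to sketch directly. The version below is one textbook variant (relative term frequency times an unsmoothed logarithmic inverse document frequency), chosen as an assumption; the study does not state which variant or library it used, and the function name and toy emails are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy emails, invented for illustration.
emails = [
    ["verify", "your", "account"],
    ["meeting", "your", "notes"],
]
weights = tf_idf(emails)
```

A term that appears in every document, such as "your" here, receives weight zero, while a term confined to one email, such as "verify", receives a positive weight. Library implementations typically add smoothing and normalization, so their numbers will differ from this sketch.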
Sentiment analysis, a crucial task in discerning emotional tones within text, plays a pivotal role in understanding public opinion and user sentiment across diverse languages. While numerous scholars conduct sentiment analysis in widely spoken languages such as English, Chinese, Arabic, Roman Arabic, and more, resource-poor languages such as Urdu remain a challenge. Urdu is a uniquely crafted language, characterized by a script that amalgamates elements from diverse languages, including Arabic, Parsi, Pashtu, Turkish, Punjabi, Saraiki, and more. Urdu literature, characterized by distinct character sets and linguistic features, presents an additional hurdle due to the lack of accessible datasets, rendering sentiment analysis a formidable undertaking. The limited availability of resources has fueled increased interest among researchers, prompting a deeper exploration into Urdu sentiment analysis. This research is dedicated to Urdu language sentiment analysis, employing sophisticated deep learning models on an extensive dataset categorized into five labels: Positive, Negative, Neutral, Mixed, and Ambiguous. The primary objective is to discern sentiments and emotions within the Urdu language, despite the absence of well-curated datasets. To tackle this challenge, the initial step involves the creation of a comprehensive Urdu dataset by aggregating data from various sources such as newspapers, articles, and social media comments. Following data collection, a thorough process of cleaning and preprocessing is implemented to ensure the quality of the data. The study leverages two well-known deep learning models, namely Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), for both training and evaluating sentiment analysis performance. Additionally, the study explores hyperparameter tuning to optimize the models’ efficacy. Evaluation metrics such as precision, recall, and the F1-score are employed to assess the effectiveness of the models. The research findings reveal that RNN surpasses CNN in Urdu sentiment analysis, attaining a significantly higher accuracy of 91%. This result underscores the strong performance of RNN, making it a compelling option for sentiment analysis tasks in the Urdu language.
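The evaluation metrics named above can be computed per label from true and predicted labels. This is a minimal sketch under stated assumptions: the function name and the toy labels are hypothetical, and the study's own averaging scheme across its five labels is not given in the abstract.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one label treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels, invented for illustration.
y_true = ["Positive", "Negative", "Positive", "Neutral"]
y_pred = ["Positive", "Positive", "Negative", "Neutral"]
scores = precision_recall_f1(y_true, y_pred, positive="Positive")
```

For the "Positive" label in this toy example there is one true positive, one false positive, and one false negative, so all three metrics equal 0.5. Averaging the per-label F1 values across the labels would give a macro F1.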
Cyberbullying, a critical concern for digital safety, necessitates effective linguistic analysis tools that can navigate the complexities of language use in online spaces. To tackle this challenge, our study introduces a new approach, E-BERT, employing the Bidirectional Encoder Representations from Transformers (BERT) base model (cased), originally pretrained in English. This model is uniquely adapted to recognize the intricate nuances of Arabic online communication, a key aspect often overlooked in conventional cyberbullying detection methods. Our model is an end-to-end solution that has been fine-tuned on a diverse dataset of Arabic social media (SM) tweets. Experimental results on this dataset, collected from the ‘X platform’, demonstrate a notable increase in detection accuracy and sensitivity compared to existing methods. E-BERT shows a substantial improvement in performance, evidenced by an accuracy of 98.45%, precision of 99.17%, recall of 99.10%, and an F1 score of 99.14%. The proposed E-BERT not only addresses a critical gap in cyberbullying detection in Arabic online forums but also sets a precedent for applying cross-lingual pretrained models in regional language applications, offering a scalable and effective framework for enhancing online safety across Arabic-speaking communities.
As one of the most effective methods to improve the accuracy and robustness of speech tasks, the audio-visual fusion approach has recently been introduced into the field of Keyword Spotting (KWS). However, existing audio-visual keyword spotting models are limited to detecting isolated words, while keyword spotting for unconstrained speech is still a challenging problem. To this end, an Audio-Visual Keyword Transformer (AVKT) network is proposed to spot keywords in unconstrained video clips. The authors present a transformer classifier with learnable CLS tokens to extract distinctive keyword features from the variable-length audio and visual inputs. The outputs of the audio and visual branches are combined in a decision fusion module. Just as humans can easily notice whether a keyword appears in a sentence, the AVKT network can detect whether a video clip with a spoken sentence contains a pre-specified keyword. Moreover, the position of the keyword is localised in the attention map without additional position labels. Experimental results on the LRS2-KWS dataset and our newly collected PKU-KWS dataset show that the accuracy of AVKT exceeded 99% in clean scenes and 85% in extremely noisy conditions. The code is available at https://github.com/jialeren/AVKT.
Purpose: The purpose of this study is to serve as a comprehensive review of the existing annotated corpora. This review study aims to provide information on the existing annotated corpora for event extraction, which are limited but essential for training and improving the existing event extraction algorithms. In addition to this primary goal, the study provides guidelines for preparing an annotated corpus and suggests suitable tools for the annotation task. Design/methodology/approach: This study employs an analytical approach to examine available corpora that are suitable for event extraction tasks. It offers an in-depth analysis of existing event extraction corpora and provides systematic guidelines for researchers to develop accurate, high-quality corpora. This ensures the reliability of the created corpus and its suitability for training machine learning algorithms. Findings: Our exploration reveals a scarcity of annotated corpora for event extraction tasks. In particular, the English corpora are mainly focused on the biomedical and general domains. Despite this scarcity, several high-quality corpora are available and widely used as benchmark datasets. However, access to some of these corpora might be limited owing to closed-access policies or discontinued maintenance after their initial release, with broken links rendering them inaccessible. Therefore, this study documents the available corpora for event extraction tasks. Research limitations: Our study focuses only on well-known corpora available in English and Chinese. Within that scope, it places a strong emphasis on the English corpora, since English, as a global lingua franca, is more widely understood than other languages. Practical implications: We believe that this study provides valuable knowledge that can serve as a guiding framework for preparing and accurately annotating events from text corpora. It provides comprehensive guidelines for researchers to improve the quality of corpus annotations, especially for event extraction tasks across various domains. Originality/value: This study comprehensively compiled information on the existing annotated corpora for event extraction tasks and provided preparation guidelines.
The joint entity relation extraction model, which integrates the semantic information of relations, is favored by researchers because of its effectiveness in handling overlapping entities. The method of manually defining semantic templates of relations is particularly effective because it can capture the deep semantic information of relations. However, this method has some problems, such as reliance on expert experience and poor portability. Inspired by the rule-based entity relation extraction method, this paper proposes a joint entity relation extraction model based on an automatically constructed relation semantic template, abbreviated as RSTAC. This model refines the extraction rules of relation semantic templates from a relation corpus through dependency parsing and realizes the automatic construction of relation semantic templates. Based on the relation semantic template, the process of relation classification and triplet extraction is constrained, and finally, the entity relation triplet is obtained. The experimental results on the three major Chinese datasets DuIE, SanWen, and FinRE show that the RSTAC model successfully obtains rich deep semantics of relations and improves the extraction of entity relation triples, with F1 scores increased by an average of 0.96% compared with classical joint extraction models such as CasRel, TPLinker, and RFBFN.
Abstract: As Natural Language Processing (NLP) continues to advance, driven by the emergence of sophisticated large language models such as ChatGPT, there has been a notable growth in research activity. This rapid uptake reflects increasing interest in the field and induces critical inquiries into ChatGPT’s applicability in the NLP domain. This review paper systematically investigates the role of ChatGPT in diverse NLP tasks, including information extraction, Named Entity Recognition (NER), event extraction, relation extraction, Part of Speech (PoS) tagging, text classification, sentiment analysis, emotion recognition and text annotation. The novelty of this work lies in its comprehensive analysis of the existing literature, addressing a critical gap in understanding ChatGPT’s adaptability, limitations, and optimal application. In this paper, we employed a systematic stepwise approach following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to direct our search process and seek relevant studies. Our review reveals ChatGPT’s significant potential in enhancing various NLP tasks. Its adaptability in information extraction tasks, sentiment analysis, and text classification showcases its ability to comprehend diverse contexts and extract meaningful details. Additionally, ChatGPT’s flexibility in annotation tasks reduces manual efforts and accelerates the annotation process, making it a valuable asset in NLP development and research. Furthermore, GPT-4 and prompt engineering emerge as a complementary mechanism, empowering users to guide the model and enhance overall accuracy. Despite its promising potential, challenges persist. The performance of ChatGPT needs to be tested using more extensive datasets and diverse data structures. Subsequently, its limitations in handling domain-specific language and the need for fine-tuning in specific applications highlight the importance of further investigations to address these issues.
Funding: Supported by the National Natural Science Foundation of China Project (No. 62302540; https://www.nsfc.gov.cn/, accessed on 18 June 2024); the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness (No. HNTS2022020; http://xt.hnkjt.gov.cn/data/pingtai/, accessed on 18 June 2024); and the Natural Science Foundation of Henan Province Youth Science Fund Project (No. 232300420422; https://kjt.henan.gov.cn/2022/09-02/2599082.html, accessed on 18 June 2024).
Abstract: In response to the challenges of generating Attribute-Based Access Control (ABAC) policies, this paper proposes a deep learning-based method to automatically generate ABAC policies from natural language documents. This method is aimed at organizations such as companies and schools that are transitioning from traditional access control models to the ABAC model. The manual retrieval and analysis involved in this transition are inefficient, prone to errors, and costly. Most organizations have high-level specifications defined for security policies that include a set of access control policies, which often exist in the form of natural language documents. Utilizing this rich source of information, our method effectively identifies and extracts the necessary attributes and rules for access control from natural language documents, thereby constructing and optimizing access control policies. This work transforms the problem of automated policy generation into two tasks: extraction of access control statements and mining of access control attributes. First, the Chat General Language Model (ChatGLM) is employed to extract access control-related statements from a wide range of natural language documents by constructing unique prompts and leveraging the model’s In-Context Learning to contextualize the statements. Then, the Iterated Dilated-Convolutions-Conditional Random Field (ID-CNN-CRF) model is used to annotate access control attributes within these extracted statements, including subject attributes, object attributes, and action attributes, thus reassembling new access control policies. Experimental results show that our method, compared to baseline methods, achieved the highest F1 score of 0.961, confirming the model’s effectiveness and accuracy.
Funding: Funded by the Informatization Plan of Chinese Academy of Sciences (Grant No. CASWX2021SF-0102), the National Key R&D Program of China (Grant Nos. 2022YFA1603903, 2022YFA1403800, and 2021YFA0718700), the National Natural Science Foundation of China (Grant Nos. 11925408, 11921004, and 12188101), and the Chinese Academy of Sciences (Grant No. XDB33000000).
Abstract: The exponential growth of literature is constraining researchers’ access to comprehensive information in related fields. While natural language processing (NLP) may offer an effective solution to literature classification, it remains hindered by the lack of labelled datasets. In this article, we introduce a novel method for generating literature classification models through semi-supervised learning, which can generate labelled datasets iteratively with limited human input. We apply this method to train NLP models for classifying literature related to several research directions, i.e., battery, superconductor, topological material, and artificial intelligence (AI) in materials science. The trained NLP ‘battery’ model, applied to a larger dataset different from the training and testing datasets, achieves an F1 score of 0.738, which indicates the accuracy and reliability of this scheme. Furthermore, our approach demonstrates that even with insufficient data, the not-well-trained model in the first few cycles can identify the relationships among different research fields and facilitate the discovery and understanding of interdisciplinary directions.
Funding: This research is funded by Researchers Supporting Project Number (RSPD2024R947), King Saud University, Riyadh, Saudi Arabia.
Abstract: Software project outcomes heavily depend on natural language requirements, often causing diverse interpretations and issues like ambiguities and incomplete or faulty requirements. Researchers are exploring machine learning to predict software bugs, but a more precise and general approach is needed. Accurate bug prediction is crucial for software evolution and user training, prompting an investigation into deep and ensemble learning methods. However, these studies do not generalize well and lose efficiency when extended to other datasets. Therefore, this paper proposes a hybrid approach combining multiple techniques to explore their effectiveness on bug identification problems. The methods involve feature selection, which is used to reduce the dimensionality and redundancy of features and select only the relevant ones; transfer learning, which is used to train and test the model on different datasets to analyze how much of the learning is passed to other datasets; and an ensemble method, which is used to explore the increase in performance upon combining multiple classifiers in a model. Four National Aeronautics and Space Administration (NASA) and four Promise datasets are used in the study, showing an increase in the model’s performance through better Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values when different classifiers were combined. The results reveal that an amalgam of techniques such as those used in this study (feature selection, transfer learning, and ensemble methods) proves helpful in optimizing software bug prediction models and providing a high-performing, useful end model.
Funding: Supported through the Annual Funding track by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Project No. AN000685].
Abstract: Sentiment analysis (SA) is the procedure of recognizing the emotions related to the data that exist in social networking. The existence of sarcasm in textual data is a major challenge to the efficiency of SA. Earlier works on sarcasm detection in text utilize lexical as well as pragmatic cues, namely interjections, punctuation, and sentiment shift, that are vital indicators of sarcasm. With the advent of deep learning, recent works leverage neural networks to learn lexical and contextual features, removing the need for handcrafted features. In this aspect, this study designs a deep learning with natural language processing enabled SA (DLNLP-SA) technique for sarcasm classification. The proposed DLNLP-SA technique aims to detect and classify the occurrence of sarcasm in the input data. The DLNLP-SA technique comprises several sub-processes, namely preprocessing, feature vector conversion, and classification. Initially, the preprocessing is performed in diverse ways such as single character removal, multi-space removal, URL removal, stopword removal, and tokenization. Secondly, the transformation of feature vectors takes place using the N-gram feature vector technique. Finally, mayfly optimization (MFO) with a multi-head self-attention based gated recurrent unit (MHSA-GRU) model is employed for the detection and classification of sarcasm. To verify the enhanced outcomes of the DLNLP-SA model, a comprehensive experimental investigation is performed on the News Headlines Dataset from the Kaggle Repository, and the results indicate superiority over the existing approaches.
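The preprocessing and N-gram steps named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the tiny stopword list, the punctuation-stripping step, and the order of operations are all assumptions, since the abstract only names the steps.

```python
import re

# Hypothetical, deliberately tiny stopword list; the abstract does not say which list was used.
STOPWORDS = {"an", "the", "is", "are", "of", "to", "in", "on", "and"}

def preprocess(text):
    """Apply the cleaning steps named in the abstract and return a list of tokens."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URL removal
    text = re.sub(r"[^A-Za-z\s]", " ", text).lower()    # strip punctuation and digits (assumed step)
    text = re.sub(r"\b[a-z]\b", " ", text)              # single-character removal
    text = re.sub(r"\s+", " ", text).strip()            # multi-space removal
    tokens = text.split()                               # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]    # stopword removal

def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = preprocess("Oh great,  another Monday!  See https://example.com x")
print(tokens)             # ['oh', 'great', 'another', 'monday', 'see']
print(ngrams(tokens, 2))  # bigram features for a downstream classifier
```

Counting these n-grams per document would give the feature vectors that a classifier such as the MHSA-GRU model consumes; how the paper maps n-grams to vectors is not stated in the abstract.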
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62176109; in part by the Tibetan Information Processing and Machine Translation Key Laboratory of Qinghai Province under Grant 2021‐Z‐003; in part by the Natural Science Foundation of Gansu Province under Grant 21JR7RA531 and Grant 22JR5RA487; in part by the Fundamental Research Funds for the Central Universities under Grant lzujbky‐2022‐23; in part by the CAAI‐Huawei MindSpore Open Fund under Grant CAAIXSJLJJ‐2022‐020A; in part by the Supercomputing Center of Lanzhou University; and in part by Sichuan Science and Technology Program No. 2022nsfsc0916.
Abstract: A variety of neural networks have been presented to deal with issues in deep learning in the last decades. Despite the prominent success achieved by neural networks, theoretical guidance for designing an efficient neural network model is still lacking, and verifying the performance of a model needs excessive resources. Previous research studies have demonstrated that many existing models can be regarded as different numerical discretizations of differential equations. This connection sheds light on designing an effective recurrent neural network (RNN) by resorting to numerical analysis. The simple RNN is regarded as a forward Euler discretisation of a differential equation. Considering the limited solution accuracy of the forward Euler method, a Taylor‐type discrete scheme with lower truncation error is presented, and a Taylor‐type RNN (T‐RNN) is designed with its guidance. Extensive experiments are conducted to evaluate its performance on statistical language models and emotion analysis tasks. The noticeable gains obtained by T‐RNN demonstrate its superiority and the feasibility of designing neural network models using numerical methods.
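One common way to read the forward Euler claim in the abstract above is sketched below. The continuous-time model is an assumption chosen for illustration (the abstract does not state the underlying differential equation), and the Taylor‐type scheme used by T‐RNN is not given in the abstract, so only the forward Euler case and the term it drops are shown.

```latex
% Assumed continuous-time hidden-state dynamics (not given in the abstract):
\frac{dh}{dt} = -h(t) + \sigma\bigl(W h(t) + U x(t) + b\bigr)

% Forward Euler with step size \Delta t:
h_{t+1} = h_t + \Delta t\,\bigl[-h_t + \sigma(W h_t + U x_t + b)\bigr]

% With \Delta t = 1 this reduces to the simple RNN update:
h_{t+1} = \sigma(W h_t + U x_t + b)

% Taylor expansion of the exact solution, showing what forward Euler drops:
h(t+\Delta t) = h(t) + \Delta t\,h'(t) + \frac{\Delta t^{2}}{2}\,h''(t) + O(\Delta t^{3})
```

Forward Euler keeps only the first two terms of the expansion, so its local truncation error is of order \Delta t^{2}. A scheme that retains or approximates higher-order terms reduces that error, which is the motivation the abstract gives for a Taylor‐type design.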
Abstract: One of the critical hurdles, and breakthroughs, in the field of Natural Language Processing (NLP) in the last two decades has been the development of techniques for text representation that solve the so-called curse of dimensionality, a problem which plagues NLP in general given that the feature set for learning starts as a function of the size of the language in question, typically upwards of hundreds of thousands of terms. As such, much of the research and development in NLP in the last two decades has been in finding and optimizing solutions to this problem, that is, in performing feature selection in NLP effectively. This paper looks at the development of these various techniques, which leverage a variety of statistical methods resting on linguistic theories that were advanced in the middle of the last century, namely the distributional hypothesis, which suggests that words found in similar contexts generally have similar meanings. In this survey paper we look at the development of some of the most popular of these techniques from a mathematical as well as a data structure perspective, from Latent Semantic Analysis to Vector Space Models to their more modern variants, which are typically referred to as word embeddings. In this review of algorithms such as Word2Vec, GloVe, ELMo and BERT, we explore the idea of semantic spaces more generally, beyond applicability to NLP.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R161), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4310373DSR33).
Abstract: The recent developments in Multimedia Internet of Things (MIoT) devices, empowered with Natural Language Processing (NLP) models, point to a promising future for smart devices. NLP plays an important role in industrial models such as speech understanding, emotion detection, home automation, and so on. If an image needs to be captioned, then the objects in that image, their actions and connections, and any salient feature that remains under-projected or missing from the image should be identified. The aim of the image captioning process is to generate a caption for an image. In the next step, the image should be provided with one of the most significant and detailed descriptions that is syntactically as well as semantically correct. In this scenario, a computer vision model is used to identify the objects and NLP approaches are followed to describe the image. The current study develops a Natural Language Processing with Optimal Deep Learning Enabled Intelligent Image Captioning System (NLPODL-IICS). The aim of the presented NLPODL-IICS model is to produce a proper description for an input image. To attain this, the proposed NLPODL-IICS follows two stages, namely encoding and decoding. Initially, at the encoding side, the proposed NLPODL-IICS model makes use of Hunger Games Search (HGS) with the Neural Architecture Search Network (NASNet) model. This model represents the input data appropriately by inserting it into a vector of predefined length. During the decoding phase, the Chimp Optimization Algorithm (COA) with a deeper Long Short Term Memory (LSTM) approach is followed to concatenate the description sentences produced by the method. The application of the HGS and COA algorithms helps in accomplishing proper parameter tuning for the NASNet and LSTM models respectively. The proposed NLPODL-IICS model was experimentally validated with the help of two benchmark datasets. A widespread comparative analysis confirmed the superior performance of the NLPODL-IICS model over other models. (CMC, 2023, vol. 74, no. 2)
Abstract: This work reports progress on previous related work based on an experiment to improve the intelligence of robotic systems, with the aim of achieving more linguistic communication capabilities between humans and robots. In this paper, the authors attempt an algorithmic approach to natural language generation through hole semantics and by applying the OMAS-III computational model as a grammatical formalism. In the original work, a technical language is used, while in the later works, this has been replaced by a limited Greek natural language dictionary. This particular effort was made to give the evolving system the ability to ask questions, and the authors developed an initial dialogue system using these techniques. The results show that the techniques the authors apply can lead to a more sophisticated dialogue system in the future.
Abstract: This paper attempts to approach the interface of a robot from the perspective of virtual assistants. Virtual assistants can be characterized as the mind of a robot, since they manage communication and action with the rest of the world they exist in. Therefore, virtual assistants can also be described as the brain of a robot, and they include a Natural Language Processing (NLP) module for conducting communication in their human-robot interface. This work is focused on investigating and enhancing the capabilities of this module. The problem is that little is revealed about the nature of the human-robot interface of commercial virtual assistants. Therefore, any new attempt at developing such a capability has to start from scratch. Accordingly, to add corresponding capabilities to a developing NLP system of a virtual assistant, a method of systemic semantic modelling is proposed and applied. For this purpose, the paper briefly reviews the evolution of virtual assistants from the first assistant, in the form of a game, to the latest assistant that has significantly elevated their standards. Then there is a reference to the evolution of their services and their continued offerings, as well as future expectations. The paper presents their structure and the technologies used, according to the data provided by the development companies to the public, while an attempt is made to classify virtual assistants based on their characteristics and capabilities. Consequently, a robotic NLP interface is being developed, based on the communicative power of a proposed systemic conceptual model that may enhance the NLP capabilities of virtual assistants, and is being tested through a small natural language dictionary in Greek.
Funding: Supported by Sichuan Science and Technology Program (2023YFSY0026, 2023YFH0004).
Abstract: Recent advancements in natural language processing have given rise to numerous pre-training language models in question-answering systems. However, with the constant evolution of algorithms, data, and computing power, the increasing size and complexity of these models have led to increased training costs and reduced efficiency. This study aims to minimize the inference time of such models while maintaining computational performance. It proposes a novel Distillation model for PAL-BERT (DPAL-BERT), which employs knowledge distillation, using the PAL-BERT model as the teacher model to train two student models: DPAL-BERT-Bi and DPAL-BERTC. This research enhances the dataset through techniques such as masking, replacement, and n-gram sampling to optimize knowledge transfer. The experimental results show that the distilled models greatly outperform models trained from scratch. In addition, although the distilled models exhibit a slight decrease in performance compared to PAL-BERT, they significantly reduce inference time to just 0.25% of the original. This demonstrates the effectiveness of the proposed approach in balancing model performance and efficiency.
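The abstract above does not state which distillation objective DPAL-BERT uses. As a hedged illustration of the general technique, the sketch below computes the widely used soft-target loss: the KL divergence between temperature-softened teacher and student distributions, scaled by the squared temperature. The function names and the default temperature are assumptions for this example.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, computed stably by subtracting the maximum."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by temperature squared.

    This is the generic soft-target loss, assumed here for illustration;
    it is not confirmed to be the loss used for DPAL-BERT.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# A student that matches the teacher incurs zero loss; a mismatched one is penalized.
print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0]))
```

In practice this term is usually combined with the ordinary cross-entropy loss on the ground-truth labels; the weighting between the two is a tunable choice.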
Abstract: One of the biggest dangers to society today is terrorism, where attacks have become one of the most significant risks to international peace and national security. Big data, information analysis, and artificial intelligence (AI) have become the basis for making strategic decisions in many sensitive areas, such as fraud detection, risk management, medical diagnosis, and counter-terrorism. However, there is still a need to assess how terrorist attacks are related, initiated, and detected. For this purpose, we propose a novel framework for classifying and predicting terrorist attacks. The proposed framework posits that neglected text attributes included in the Global Terrorism Database (GTD) can influence the accuracy of the model’s classification of terrorist attacks, where each part of the data can provide vital information to enrich the ability of classifier learning. Each data point in a multiclass taxonomy has one or more tags attached to it, referred to as “related tags.” We applied machine learning classifiers to classify terrorist attack incidents obtained from the GTD. A transformer-based technique called DistilBERT extracts and learns contextual features from text attributes to acquire more information from text data. The extracted contextual features are combined with the “key features” of the dataset and used to perform the final classification. The study explored different experimental setups with various classifiers to evaluate the model’s performance. The experimental results show that the proposed framework outperforms the latest techniques for classifying terrorist attacks with an accuracy of 98.7% using a combined feature set and extreme gradient boosting classifier.
Funding: This work is part of the research projects LaTe4PoliticES (PID2022-138099OBI00), funded by MICIU/AEI/10.13039/501100011033 and the European Regional Development Fund (ERDF)-A Way of Making Europe, and LT-SWM (TED2021-131167B-I00), funded by MICIU/AEI/10.13039/501100011033 and the European Union NextGenerationEU/PRTR. Mr. Ronghao Pan is supported by the Programa Investigo grant, funded by the Region of Murcia, the Spanish Ministry of Labour and Social Economy and the European Union-NextGenerationEU under the “Plan de Recuperación, Transformación y Resiliencia (PRTR).”
Abstract: Large Language Models (LLMs) are increasingly demonstrating their ability to understand natural language and solve complex tasks, especially through text generation. One of the relevant capabilities is contextual learning, which involves the ability to receive instructions in natural language or task demonstrations to generate expected outputs for test instances without the need for additional training or gradient updates. In recent years, the popularity of social networking has provided a medium through which some users can engage in offensive and harmful online behavior. In this study, we investigate the ability of different LLMs to detect such content, with approaches ranging from zero-shot and few-shot learning to fine-tuning. Our experiments show that LLMs can identify sexist and hateful online texts using zero-shot and few-shot approaches through information retrieval. Furthermore, it is found that the encoder-decoder model called Zephyr achieves the best results with the fine-tuning approach, scoring 86.811% on the Explainable Detection of Online Sexism (EDOS) test-set and 57.453% on the Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (HatEval) test-set. Finally, it is confirmed that the evaluated models perform well in hate text detection, as they beat the best result in the HatEval task leaderboard. The error analysis shows that contextual learning had difficulty distinguishing between types of hate speech and figurative language. However, the fine-tuned approach tends to produce many false positives.
Abstract: COVID-19 pandemic restrictions limited all social activities to curtail the spread of the virus. Foremost among the sectors affected were schools, colleges, and universities. The education systems of entire nations shifted to online education during this time. Many shortcomings of Learning Management Systems (LMSs) in supporting online education were detected, which spawned research into Artificial Intelligence (AI) based tools that the research community is developing to improve the effectiveness of LMSs. This paper presents a detailed survey of the different enhancements to LMSs, which are led by key advances in the area of AI to enhance the real-time and non-real-time user experience. The AI-based enhancements proposed for LMSs start from the Application layer and Presentation layer in the form of flipped classroom models for an efficient learning environment and appropriately designed UI/UX for efficient utilization of LMS utilities and resources, including AI-based chatbots. Session layer enhancements are also required, such as AI-based online proctoring and user authentication using biometrics. These extend to the Transport layer to support real-time and rate-adaptive encrypted video transmission for user security/privacy and satisfactory working of AI algorithms. Support from the Networking layer is also needed for IP-based geolocation features, the Virtual Private Network (VPN) feature, and Software-Defined Networks (SDN) for optimum Quality of Service (QoS). Finally, non-real-time user experience is enhanced by other AI-based enhancements such as plagiarism detection algorithms and data analytics.
Abstract: Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information, a practice known as phishing. This study utilizes three distinct feature extraction methods, Term Frequency-Inverse Document Frequency, Word2Vec, and Bidirectional Encoder Representations from Transformers, to evaluate the effectiveness of various machine learning algorithms in detecting phishing attacks. The study uses these feature extraction methods to assess the performance of Logistic Regression, Decision Tree, Random Forest, and Multilayer Perceptron algorithms. With Term Frequency-Inverse Document Frequency, the best-performing classifier was Multilayer Perceptron (Precision: 0.98, Recall: 0.98, F1-score: 0.98, Accuracy: 0.98). With Word2Vec, the best-performing classifier was also Multilayer Perceptron (Precision: 0.98, Recall: 0.98, F1-score: 0.98, Accuracy: 0.98). The highest performance was achieved using the Bidirectional Encoder Representations from Transformers model, with Precision, Recall, F1-score, and Accuracy all reaching 0.99. This study highlights how advanced pre-trained models, such as Bidirectional Encoder Representations from Transformers, can significantly enhance the accuracy and reliability of fraud detection systems.
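To make the first of the three feature extraction methods concrete, here is a minimal sketch of Term Frequency-Inverse Document Frequency weighting in plain Python. It uses one simple variant (relative term frequency times the unsmoothed log inverse document frequency); the abstract does not say which variant or library the study used, and the function name and toy emails are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute TF-IDF weights for tokenized documents.

    docs: list of token lists. Returns one {term: weight} dict per document,
    using tf = count / document length and idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Toy emails, already tokenized (hypothetical data).
emails = [
    ["verify", "your", "account", "now"],
    ["meeting", "notes", "for", "your", "team"],
]
for vector in tfidf(emails):
    print(vector)
```

Terms that occur in every document, such as "your" here, receive a weight of zero, while terms specific to one email are weighted up; the resulting vectors would then be fed to a classifier such as Logistic Regression or a Multilayer Perceptron.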
Abstract: Sentiment analysis, a crucial task in discerning emotional tones within text, plays a pivotal role in understanding public opinion and user sentiment across diverse languages. While numerous scholars conduct sentiment analysis in widely spoken languages such as English, Chinese, Arabic, Roman Arabic, and more, grappling with resource-poor languages such as Urdu remains a challenge. Urdu is a uniquely crafted language, characterized by a script that amalgamates elements from diverse languages, including Arabic, Parsi, Pashtu, Turkish, Punjabi, Saraiki, and more. Urdu literature, characterized by distinct character sets and linguistic features, presents an additional hurdle due to the lack of accessible datasets, rendering sentiment analysis a formidable undertaking. The limited availability of resources has fueled increased interest among researchers, prompting a deeper exploration into Urdu sentiment analysis. This research is dedicated to Urdu language sentiment analysis, employing sophisticated deep learning models on an extensive dataset categorized into five labels: Positive, Negative, Neutral, Mixed, and Ambiguous. The primary objective is to discern sentiments and emotions within the Urdu language, despite the absence of well-curated datasets. To tackle this challenge, the initial step involves the creation of a comprehensive Urdu dataset by aggregating data from various sources such as newspapers, articles, and social media comments. Subsequent to this data collection, a thorough process of cleaning and preprocessing is implemented to ensure the quality of the data. The study leverages two well-known deep learning models, namely Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), for both training and evaluating sentiment analysis performance. Additionally, the study explores hyperparameter tuning to optimize the models’ efficacy. Evaluation metrics such as precision, recall, and the F1-score are employed to assess the effectiveness of the models. The research findings reveal that RNN surpasses CNN in Urdu sentiment analysis, attaining a significantly higher accuracy rate of 91%. This result accentuates the exceptional performance of RNN, solidifying its status as a compelling option for conducting sentiment analysis tasks in the Urdu language.
Funding: Funded by Scientific Research Deanship at University of Ha’il-Saudi Arabia through Project Number RG-23092.
Abstract: Cyberbullying, a critical concern for digital safety, necessitates effective linguistic analysis tools that can navigate the complexities of language use in online spaces. To tackle this challenge, our study introduces a new approach employing the Bidirectional Encoder Representations from Transformers (BERT) base model (cased), originally pretrained in English. This model is uniquely adapted to recognize the intricate nuances of Arabic online communication, a key aspect often overlooked in conventional cyberbullying detection methods. Our model is an end-to-end solution that has been fine-tuned on a diverse dataset of Arabic social media (SM) tweets. Experimental results on this dataset, collected from the ‘X platform’, demonstrate a notable increase in detection accuracy and sensitivity compared to existing methods. E-BERT shows a substantial improvement in performance, evidenced by an accuracy of 98.45%, precision of 99.17%, recall of 99.10%, and an F1 score of 99.14%. The proposed E-BERT not only addresses a critical gap in cyberbullying detection in Arabic online forums but also sets a precedent for applying cross-lingual pretrained models in regional language applications, offering a scalable and effective framework for enhancing online safety across Arabic-speaking communities.
Funding: Science and Technology Plan of Shenzhen, Grant/Award Number: JCYJ20200109140410340; National Natural Science Foundation of China, Grant/Award Number: 62073004.
Abstract: As one of the most effective methods to improve the accuracy and robustness of speech tasks, the audio-visual fusion approach has recently been introduced into the field of Keyword Spotting (KWS). However, existing audio-visual keyword spotting models are limited to detecting isolated words, while keyword spotting for unconstrained speech is still a challenging problem. To this end, an Audio-Visual Keyword Transformer (AVKT) network is proposed to spot keywords in unconstrained video clips. The authors present a transformer classifier with learnable CLS tokens to extract distinctive keyword features from the variable-length audio and visual inputs. The outputs of the audio and visual branches are combined in a decision fusion module. Just as humans can easily notice whether a keyword appears in a sentence or not, our AVKT network can detect whether a video clip with a spoken sentence contains a pre-specified keyword. Moreover, the position of the keyword is localised in the attention map without additional position labels. Experimental results on the LRS2-KWS dataset and our newly collected PKU-KWS dataset show that the accuracy of AVKT exceeded 99% in clean scenes and 85% in extremely noisy conditions. The code is available at https://github.com/jialeren/AVKT.
Abstract: Purpose: This study is a comprehensive review of existing annotated corpora for event extraction, which are limited but essential for training and improving event extraction algorithms. In addition to this primary goal, it provides guidelines for preparing an annotated corpus and suggests suitable tools for the annotation task. Design/methodology/approach: This study employs an analytical approach to examine available corpora that are suitable for event extraction tasks. It offers an in-depth analysis of existing event extraction corpora and provides systematic guidelines for researchers to develop accurate, high-quality corpora. This ensures the reliability of the created corpus and its suitability for training machine learning algorithms. Findings: Our exploration reveals a scarcity of annotated corpora for event extraction tasks. In particular, the English corpora are mainly focused on the biomedical and general domains. Despite this scarcity, several high-quality corpora are available and widely used as benchmark datasets. However, access to some of these corpora might be limited owing to closed-access policies or to discontinued maintenance after the initial release, which leaves broken links. Therefore, this study documents the available corpora for event extraction tasks. Research limitations: Our study focuses only on well-known corpora available in English and Chinese. It places a strong emphasis on the English corpora because English, as a global lingua franca, is more widely understood than other languages. Practical implications: We believe that this study provides valuable knowledge that can serve as a guiding framework for preparing and accurately annotating events from text corpora. It provides comprehensive guidelines for researchers to improve the quality of corpus annotations, especially for event extraction tasks across various domains. Originality/value: This study comprehensively compiles information on the existing annotated corpora for event extraction tasks and provides preparation guidelines.
Funding: Supported by the National Natural Science Foundation of China (Nos. U1804263, U1736214, 62172435) and the Zhongyuan Science and Technology Innovation Leading Talent Project (No. 214200510019).
Abstract: Joint entity relation extraction models that integrate the semantic information of relations are favored by researchers because they handle overlapping entities effectively. Among them, methods that manually define relation semantic templates extract particularly well because they capture the deep semantic information of relations. However, such methods rely on expert experience and have poor portability. Inspired by rule-based entity relation extraction, this paper proposes a joint entity relation extraction model based on automatically constructed relation semantic templates, abbreviated as RSTAC. The model derives extraction rules for relation semantic templates from a relation corpus through dependency parsing and thereby constructs the templates automatically. The relation semantic templates then constrain relation classification and triplet extraction, from which the final entity relation triplets are obtained. Experimental results on three major Chinese datasets (DuIE, SanWen, and FinRE) show that the RSTAC model obtains rich deep semantics of relations and improves the extraction of entity relation triplets, with F1 scores increasing by an average of 0.96% over classical joint extraction models such as CasRel, TPLinker, and RFBFN.