Offensive messages on social media,have recently been frequently used to harass and criticize people.In recent studies,many promising algorithms have been developed to identify offensive texts.Most algorithms analyze ...Offensive messages on social media,have recently been frequently used to harass and criticize people.In recent studies,many promising algorithms have been developed to identify offensive texts.Most algorithms analyze text in a unidirectional manner,where a bidirectional method can maximize performance results and capture semantic and contextual information in sentences.In addition,there are many separate models for identifying offensive texts based on monolin-gual and multilingual,but there are a few models that can detect both monolingual and multilingual-based offensive texts.In this study,a detection system has been developed for both monolingual and multilingual offensive texts by combining deep convolutional neural network and bidirectional encoder representations from transformers(Deep-BERT)to identify offensive posts on social media that are used to harass others.This paper explores a variety of ways to deal with multilin-gualism,including collaborative multilingual and translation-based approaches.Then,the Deep-BERT is tested on the Bengali and English datasets,including the different bidirectional encoder representations from transformers(BERT)pre-trained word-embedding techniques,and found that the proposed Deep-BERT’s efficacy outperformed all existing offensive text classification algorithms reaching an accuracy of 91.83%.The proposed model is a state-of-the-art model that can classify both monolingual-based and multilingual-based offensive texts.展开更多
For the existing aspect category sentiment analysis research,most of the aspects are given for sentiment extraction,and this pipeline method is prone to error accumulation,and the use of graph convolutional neural net...For the existing aspect category sentiment analysis research,most of the aspects are given for sentiment extraction,and this pipeline method is prone to error accumulation,and the use of graph convolutional neural network for aspect category sentiment analysis does not fully utilize the dependency type information between words,so it cannot enhance feature extraction.This paper proposes an end-to-end aspect category sentiment analysis(ETESA)model based on type graph convolutional networks.The model uses the bidirectional encoder representation from transformers(BERT)pretraining model to obtain aspect categories and word vectors containing contextual dynamic semantic information,which can solve the problem of polysemy;when using graph convolutional network(GCN)for feature extraction,the fusion operation of word vectors and initialization tensor of dependency types can obtain the importance values of different dependency types and enhance the text feature representation;by transforming aspect category and sentiment pair extraction into multiple single-label classification problems,aspect category and sentiment can be extracted simultaneously in an end-to-end way and solve the problem of error accumulation.Experiments are tested on three public datasets,and the results show that the ETESA model can achieve higher Precision,Recall and F1 value,proving the effectiveness of the model.展开更多
首先利用bidirectional encoder representations from transformers(BERT)模型的强大的语境理解能力来提取数据法律文本的深层语义特征,然后引入细粒度特征提取层,依照注意力机制,重点关注文本中与数据法律问答相关的关键部分,最后对...首先利用bidirectional encoder representations from transformers(BERT)模型的强大的语境理解能力来提取数据法律文本的深层语义特征,然后引入细粒度特征提取层,依照注意力机制,重点关注文本中与数据法律问答相关的关键部分,最后对所采集的法律问答数据集进行训练和评估.结果显示:与传统的多个单一模型相比,所提出的模型在准确度、精确度、召回率、F1分数等关键性能指标上均有提升,表明该系统能够更有效地理解和回应复杂的数据法学问题,为研究数据法学的专业人士和公众用户提供更高质量的问答服务.展开更多
In the context of interdisciplinary research,using computer technology to further mine keywords in cultural texts and carry out semantic analysis can deepen the understanding of texts,and provide quantitative support ...In the context of interdisciplinary research,using computer technology to further mine keywords in cultural texts and carry out semantic analysis can deepen the understanding of texts,and provide quantitative support and evidence for humanistic studies.Based on the novel A Dream of Red Mansions,the automatic extraction and classification of those sentiment terms in it were realized,and detailed analysis of large-scale sentiment terms was carried out.Bidirectional encoder representation from transformers(BERT) pretraining and fine-tuning model was used to construct the sentiment classifier of A Dream of Red Mansions.Sentiment terms of A Dream of Red Mansions are divided into eight sentimental categories,and the relevant people in sentences are extracted according to specific rules.It also tries to visually display the sentimental interactions between Twelve Girls of Jinling and Jia Baoyu along with the development of the episode.The overall F_(1) score of BERT-based sentiment classifier reached 84.89%.The best single sentiment score reached 91.15%.Experimental results show that the classifier can satisfactorily classify the text of A Dream of Red Mansions,and the text classification and interactional analysis results can be mutually verified with the text interpretation of A dream of Red Mansions by literature experts.展开更多
文摘Offensive messages on social media,have recently been frequently used to harass and criticize people.In recent studies,many promising algorithms have been developed to identify offensive texts.Most algorithms analyze text in a unidirectional manner,where a bidirectional method can maximize performance results and capture semantic and contextual information in sentences.In addition,there are many separate models for identifying offensive texts based on monolin-gual and multilingual,but there are a few models that can detect both monolingual and multilingual-based offensive texts.In this study,a detection system has been developed for both monolingual and multilingual offensive texts by combining deep convolutional neural network and bidirectional encoder representations from transformers(Deep-BERT)to identify offensive posts on social media that are used to harass others.This paper explores a variety of ways to deal with multilin-gualism,including collaborative multilingual and translation-based approaches.Then,the Deep-BERT is tested on the Bengali and English datasets,including the different bidirectional encoder representations from transformers(BERT)pre-trained word-embedding techniques,and found that the proposed Deep-BERT’s efficacy outperformed all existing offensive text classification algorithms reaching an accuracy of 91.83%.The proposed model is a state-of-the-art model that can classify both monolingual-based and multilingual-based offensive texts.
基金Supported by the National Key Research and Development Program of China(No.2018YFB1702601).
文摘For the existing aspect category sentiment analysis research,most of the aspects are given for sentiment extraction,and this pipeline method is prone to error accumulation,and the use of graph convolutional neural network for aspect category sentiment analysis does not fully utilize the dependency type information between words,so it cannot enhance feature extraction.This paper proposes an end-to-end aspect category sentiment analysis(ETESA)model based on type graph convolutional networks.The model uses the bidirectional encoder representation from transformers(BERT)pretraining model to obtain aspect categories and word vectors containing contextual dynamic semantic information,which can solve the problem of polysemy;when using graph convolutional network(GCN)for feature extraction,the fusion operation of word vectors and initialization tensor of dependency types can obtain the importance values of different dependency types and enhance the text feature representation;by transforming aspect category and sentiment pair extraction into multiple single-label classification problems,aspect category and sentiment can be extracted simultaneously in an end-to-end way and solve the problem of error accumulation.Experiments are tested on three public datasets,and the results show that the ETESA model can achieve higher Precision,Recall and F1 value,proving the effectiveness of the model.
文摘首先利用bidirectional encoder representations from transformers(BERT)模型的强大的语境理解能力来提取数据法律文本的深层语义特征,然后引入细粒度特征提取层,依照注意力机制,重点关注文本中与数据法律问答相关的关键部分,最后对所采集的法律问答数据集进行训练和评估.结果显示:与传统的多个单一模型相比,所提出的模型在准确度、精确度、召回率、F1分数等关键性能指标上均有提升,表明该系统能够更有效地理解和回应复杂的数据法学问题,为研究数据法学的专业人士和公众用户提供更高质量的问答服务.
基金supported by the Fundamental Research Funds for the Central Universities (2019XD-A03-3)the Beijing Key Lab of Network System and Network Culture (NSNC-202 A09)。
文摘In the context of interdisciplinary research,using computer technology to further mine keywords in cultural texts and carry out semantic analysis can deepen the understanding of texts,and provide quantitative support and evidence for humanistic studies.Based on the novel A Dream of Red Mansions,the automatic extraction and classification of those sentiment terms in it were realized,and detailed analysis of large-scale sentiment terms was carried out.Bidirectional encoder representation from transformers(BERT) pretraining and fine-tuning model was used to construct the sentiment classifier of A Dream of Red Mansions.Sentiment terms of A Dream of Red Mansions are divided into eight sentimental categories,and the relevant people in sentences are extracted according to specific rules.It also tries to visually display the sentimental interactions between Twelve Girls of Jinling and Jia Baoyu along with the development of the episode.The overall F_(1) score of BERT-based sentiment classifier reached 84.89%.The best single sentiment score reached 91.15%.Experimental results show that the classifier can satisfactorily classify the text of A Dream of Red Mansions,and the text classification and interactional analysis results can be mutually verified with the text interpretation of A dream of Red Mansions by literature experts.