Abstract: Emotion Recognition in Conversations (ERC) is fundamental to creating emotionally intelligent machines. Graph-Based Network (GBN) models have gained popularity for detecting conversational context in ERC tasks. However, their limited ability to collect and acquire contextual information hinders their effectiveness. To address this, we propose a Text Augmentation-based computational model for recognizing emotions using transformers (TA-MERT). The proposed model uses the Multimodal Emotion Lines Dataset (MELD), which ensures a balanced representation for recognizing human emotions. The model applies text augmentation techniques to produce more training data, improving its accuracy. The deep neural network (DNN) model is trained with transformer encoders, in particular Bidirectional Encoder (BE) representations, which capture both forward and backward contextual information. This integration improves the accuracy and robustness of the proposed model. Furthermore, we present a method for balancing the training dataset by creating augmented samples from the original data. By balancing the dataset across all emotion categories, we lessen the adverse effects of class imbalance on the model's accuracy. Experimental results on the MELD dataset show that TA-MERT outperforms earlier methods, achieving a weighted F1 score of 62.60% and an accuracy of 64.36%. Overall, TA-MERT addresses the weaknesses of GBN models in obtaining contextual data for ERC: it recognizes human emotions more accurately by employing text augmentation and transformer-based encoding, and the balanced dataset and additional training samples enhance its resilience. These findings highlight the significance of transformer-based approaches to emotion recognition in conversations.
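The abstract names the ingredients (text augmentation, a bidirectional transformer encoder, an emotion classification head) but not their exact configuration, so the following is a minimal sketch of such a pipeline, assuming a BERT-style encoder from Hugging Face Transformers and a simple word-dropout augmenter; the model name, the augmentation function, and all hyperparameters are illustrative stand-ins, not taken from the paper.

```python
import random

import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # stand-in for the paper's bidirectional encoder
NUM_EMOTIONS = 7                  # MELD labels: anger, disgust, fear, joy, neutral, sadness, surprise


def augment_by_word_dropout(utterance: str, drop_prob: float = 0.1) -> str:
    """Create an extra training sample by randomly dropping words.

    A placeholder for the paper's (unspecified) augmentation step; synonym
    replacement or back-translation would slot into the pipeline the same way.
    """
    words = utterance.split()
    kept = [w for w in words if random.random() > drop_prob]
    return " ".join(kept) if kept else utterance


class EmotionClassifier(nn.Module):
    """Bidirectional transformer encoder with a linear emotion head."""

    def __init__(self) -> None:
        super().__init__()
        self.encoder = AutoModel.from_pretrained(MODEL_NAME)
        self.head = nn.Linear(self.encoder.config.hidden_size, NUM_EMOTIONS)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token summarises the utterance
        return self.head(cls)


tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = EmotionClassifier()

# Each original utterance plus its augmented copy is fed to the same encoder,
# which is how the augmentation both enlarges and rebalances the training set.
utterance = "I can't believe you did that!"
batch = tokenizer([utterance, augment_by_word_dropout(utterance)],
                  padding=True, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])  # shape: (2, NUM_EMOTIONS)
```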
Abstract: Automated Speech Emotion Recognition (SER) has become more popular and increasingly applicable. SER concentrates on automatically identifying the emotional state of a human being from speech signals. It depends mainly on in-depth analysis of the speech signal, extracting features that carry emotional details and applying pattern-recognition techniques to identify the emotional state. The major problem in automatic SER is extracting discriminative, powerful, and emotionally salient features from the acoustic content of speech signals. The proposed model aims to detect and classify three emotional states of speech: happy, neutral, and sad. It uses a Convolutional Neural Network-Gated Recurrent Unit (CNN-GRU) based feature extraction technique that derives a set of feature vectors. A comprehensive simulation is carried out on the Berlin German Database and the SJTU Chinese Database, which comprise numerous audio files under a collection of different emotion labels.
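As a rough illustration of how a CNN-GRU front end turns a speech spectrogram into an emotion prediction, here is a minimal PyTorch sketch: the convolution captures local spectral patterns, and the GRU summarises their evolution over time into a fixed-length feature vector. The mel-band count, layer sizes, and input shape are assumptions, since the abstract does not give the paper's exact configuration.

```python
import torch
import torch.nn as nn

N_MELS = 40        # mel-spectrogram bands; a common SER front end (assumed, not from the paper)
NUM_EMOTIONS = 3   # happy, neutral, sad, as in the abstract


class CNNGRUExtractor(nn.Module):
    """1-D convolutions over spectrogram frames followed by a GRU classifier."""

    def __init__(self, hidden: int = 128) -> None:
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(N_MELS, 64, kernel_size=5, padding=2),  # local spectral patterns
            nn.ReLU(),
            nn.MaxPool1d(2),                                  # halve the time axis
        )
        self.gru = nn.GRU(64, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, NUM_EMOTIONS)

    def forward(self, spectrogram: torch.Tensor) -> torch.Tensor:
        # spectrogram: (batch, n_mels, time)
        x = self.conv(spectrogram)       # (batch, 64, time / 2)
        x = x.transpose(1, 2)            # GRU expects (batch, time, features)
        _, h_n = self.gru(x)             # final hidden state: (1, batch, hidden)
        return self.classifier(h_n.squeeze(0))


model = CNNGRUExtractor()
dummy = torch.randn(8, N_MELS, 200)  # a batch of clips, 200 spectrogram frames each
print(model(dummy).shape)            # torch.Size([8, 3])
```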