The estimation of pain intensity is critical for the medical diagnosis and treatment of patients. With the development of image monitoring technology and artificial intelligence, automatic pain assessment based on facial expression and behavioral analysis shows potential value in clinical applications. This paper reports a framework of a convolutional neural network with a global and local attention mechanism (GLA-CNN) for the effective detection of pain intensity at four levels using facial expression images. GLA-CNN includes two modules, namely the global attention network (GANet) and the local attention network (LANet). LANet is responsible for extracting representative local patch features of faces, while GANet extracts whole-face features to compensate for the correlative features between patches that would otherwise be ignored. In the end, the global correlative and local subtle features are fused for the final estimation of pain intensity. Experiments on the UNBC-McMaster Shoulder Pain database demonstrate that GLA-CNN outperforms other state-of-the-art methods. Additionally, a visualization analysis of the feature maps of GLA-CNN intuitively shows that it extracts not only local pain features but also globally correlated facial ones. Our study demonstrates that pain assessment based on facial expression is a non-invasive and feasible method, and it can be employed as an auxiliary pain-assessment tool in clinical practice.
Funding: supported by the National Natural Science Foundation of China under Grant No. 62276051, the Natural Science Foundation of Sichuan Province under Grant No. 2023NSFSC0640, and the Medical Industry Information Integration Collaborative Innovation Project of Yangtze Delta Region Institute under Grant No. U0723002.
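The global/local design described above can be pictured as a two-branch CNN whose embeddings are fused before classification. The PyTorch sketch below illustrates only that idea; the layer sizes, the fixed patch crops, and fusion by concatenation are assumptions for illustration, not the authors' released GLA-CNN.

```python
import torch
import torch.nn as nn

class TwoBranchPainNet(nn.Module):
    """Toy global+local model for 4-level pain intensity estimation."""
    def __init__(self, n_levels=4):
        super().__init__()
        # Global branch: whole-face convolutional features (GANet-like role).
        self.global_branch = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())        # -> (B, 64)
        # Local branch: one small CNN shared across face patches (LANet-like role).
        self.patch_cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())        # -> (B*P, 32)
        self.classifier = nn.Linear(64 + 32, n_levels)

    def forward(self, face, patches):
        # face: (B, 3, H, W); patches: (B, P, 3, h, w) crops (eyes, brows, mouth).
        g = self.global_branch(face)
        b, p = patches.shape[:2]
        l = self.patch_cnn(patches.flatten(0, 1)).view(b, p, -1).mean(dim=1)
        return self.classifier(torch.cat([g, l], dim=1))  # fuse global + local

logits = TwoBranchPainNet()(torch.randn(2, 3, 128, 128),
                            torch.randn(2, 4, 3, 32, 32))  # (2, 4) class scores
```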
To address the complex model structures and excessive training parameters of facial expression recognition algorithms, we propose a residual network structure with a multi-headed channel attention (MCA) module. A transfer learning algorithm is used to pre-train the convolutional layer parameters and mitigate the overfitting caused by the insufficient number of training samples. The designed MCA module is integrated into the ResNet18 backbone network. The attention mechanism highlights important information and suppresses irrelevant information by assigning different coefficients or weights, and the multi-head structure focuses more on the local features of the pictures, which improves the efficiency of facial expression recognition. Experimental results demonstrate that the proposed model achieves excellent recognition results on the FER2013, CK+, and JAFFE datasets, with accuracy rates of 72.7%, 98.8%, and 93.33%, respectively.
Funding: funded by Anhui Province Quality Engineering Project No. 2021jyxm0801, the Natural Science Foundation of Anhui University of Chinese Medicine under Grant Nos. 2020zrzd18 and 2019zrzd11, Humanity Social Science Foundation Grants 2021rwzd20 and 2020rwzd07, and Anhui University of Chinese Medicine Quality Engineering Project No. 2021zlgc046.
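As a rough picture of what a multi-head channel attention block can do, the sketch below splits the channels into head groups and gates each group SE-style before the features re-enter the residual path. The head count, reduction ratio, and shared gate MLP are assumptions, not the paper's exact MCA design.

```python
import torch
import torch.nn as nn

class MultiHeadChannelAttention(nn.Module):
    """SE-style channel gating applied independently per head group."""
    def __init__(self, channels, heads=4, reduction=4):
        super().__init__()
        assert channels % heads == 0
        self.heads = heads
        group = channels // heads
        self.gate = nn.Sequential(                 # shared MLP, applied per head
            nn.Linear(group, group // reduction), nn.ReLU(),
            nn.Linear(group // reduction, group), nn.Sigmoid())

    def forward(self, x):                          # x: (B, C, H, W)
        b, c, _, _ = x.shape
        squeezed = x.mean(dim=(2, 3)).view(b, self.heads, c // self.heads)
        weights = self.gate(squeezed).view(b, c, 1, 1)  # per-channel weights
        return x * weights                         # emphasize useful channels

out = MultiHeadChannelAttention(64)(torch.randn(2, 64, 56, 56))
```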
Accurately recognizing facial expressions is essential for effective social interactions. Non-human primates (NHPs) are widely used in the study of the neural mechanisms underpinning facial expression processing, yet it remains unclear how well monkeys can recognize the facial expressions of other species such as humans. In this study, we systematically investigated how monkeys process the facial expressions of conspecifics and humans using eye-tracking technology and sophisticated behavioral tasks, namely the temporal discrimination task (TDT) and the face scan task (FST). We found that monkeys showed prolonged subjective time perception in response to negative monkey facial expressions, while showing longer reaction times to negative human facial expressions. Monkey faces also reliably induced divergent pupil contraction in response to different expressions, whereas human faces and scrambled monkey faces did not. Furthermore, viewing patterns in the FST indicated that monkeys showed a bias toward emotional expressions only when observing monkey faces. Finally, masking the eye region marginally decreased the viewing duration for monkey faces but not for human faces. By probing facial expression processing in monkeys, our study demonstrates that monkeys are more sensitive to the facial expressions of conspecifics than to those of humans, thus shedding new light on inter-species communication through facial expressions between NHPs and humans.
Funding: supported by the National Natural Science Foundation of China (U20A2017), the Guangdong Basic and Applied Basic Research Foundation (2022A1515010134, 2022A1515110598), the Youth Innovation Promotion Association of the Chinese Academy of Sciences (2017120), the Shenzhen-Hong Kong Institute of Brain Science–Shenzhen Fundamental Research Institutions (NYKFKT2019009), and the Shenzhen Technological Research Center for Primate Translational Medicine (F-2021-Z99-504979).
In computer vision, emotion recognition using facial expression images is considered an important research issue. Deep learning advances in recent years have aided in attaining improved results on this issue. According to recent studies, multiple facial expressions may be included in facial photographs representing a particular type of emotion, and it is feasible and useful to convert face photos into collections of visual words and carry out global expression recognition. The main contribution of this paper is a facial expression recognition model (FERM) based on an optimized support vector machine (SVM). To test the performance of the proposed model, AffectNet is used; AffectNet was collected by querying three major search engines with 1,250 emotion-related keywords in six different languages, yielding over 1,000,000 facial photos online. FERM is composed of three main phases: (i) the data preparation phase, (ii) applying grid search for optimization, and (iii) the categorization phase. Linear discriminant analysis (LDA) is used to categorize the data into eight labels (neutral, happy, sad, surprised, fear, disgust, angry, and contempt); using LDA noticeably enhances the categorization performance of the SVM. Grid search is used to find the optimal values for the SVM hyperparameters (C and gamma). The proposed optimized SVM algorithm achieved an accuracy of 99% and an F1 score of 98%.
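The grid-search step maps directly onto standard scikit-learn machinery. The sketch below is one plausible reading of the pipeline, with illustrative grid values; it is not the paper's exact configuration.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# LDA can keep at most n_classes - 1 = 7 components for the 8 emotion labels.
pipe = Pipeline([("lda", LinearDiscriminantAnalysis(n_components=7)),
                 ("svm", SVC(kernel="rbf"))])
search = GridSearchCV(pipe,
                      param_grid={"svm__C": [0.1, 1, 10, 100],
                                  "svm__gamma": ["scale", 0.01, 0.001]},
                      cv=5, scoring="f1_macro")
# search.fit(X_train, y_train); search.best_params_ gives the tuned C and gamma.
```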
A deep fusion model is proposed for a facial expression-based human-computer interaction system. Initially, image preprocessing, i.e., the extraction of the facial region from the input image, is performed. Thereafter, more discriminative and distinctive deep learning features are extracted from the facial regions. To prevent overfitting, in-depth features of facial images are extracted and assigned to the proposed convolutional neural network (CNN) models. Various CNN models are then trained. Finally, the outputs of the CNN models are fused to obtain the final decision for the seven basic classes of facial expressions, i.e., fear, disgust, anger, surprise, sadness, happiness, and neutral. For experimental purposes, three benchmark datasets, i.e., SFEW, CK+, and KDEF, are utilized. The performance of the proposed system is compared with some state-of-the-art methods on each dataset. Extensive performance analysis reveals that the proposed system outperforms the competitive methods in terms of various performance metrics. Finally, the proposed deep fusion model is used to control a music player through the recognized emotions of the users.
Funding: supported by the Researchers Supporting Project (No. RSP-2021/395), King Saud University, Riyadh, Saudi Arabia.
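Decision-level fusion of several trained CNNs can be as simple as averaging their class probabilities. The helper below sketches that, assuming each model returns raw logits over the seven expression classes; the fusion rule is an assumption, not necessarily the paper's.

```python
import torch
import torch.nn.functional as F

def fused_prediction(models, images):
    """Average the softmax outputs of independently trained CNNs."""
    probs = torch.stack([F.softmax(m(images), dim=1) for m in models])
    return probs.mean(dim=0).argmax(dim=1)   # fused class per image
```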
Video-oriented facial expression recognition has always been an important issue in emotion perception. At present, the key challenge for most existing methods is how to effectively extract robust features to characterize the facial appearance and geometry changes caused by facial motions. In this paper, each video is therefore divided into multiple segments, each of which is simultaneously described by optical flow and a facial landmark trajectory. To mine the emotional information in these two representations, we propose a Deep Spatiotemporal Network with Dual-flow Fusion (DSN-DF), which highlights the region and strength of expressions through spatiotemporal appearance features and the speed of change through spatiotemporal geometry features. Finally, experiments on the CK+ and MMI datasets demonstrate the superiority of the proposed method.
Funding: supported by the Natural Science Foundation of China (Grant No. 61903056), the Major Project of the Science and Technology Research Program of the Chongqing Education Commission of China (Grant No. KJZDM201900601), the Chongqing Research Program of Basic Research and Frontier Technology (Grant Nos. cstc2019jcyj-msxmX0681, cstc2021jcyj-msxmX0530, and cstc2021jcyjmsxmX0761), the Chongqing Municipal Key Laboratory of Institutions of Higher Education (Grant No. cqupt-mct-201901), the Chongqing Key Laboratory of Mobile Communications Technology (Grant No. cqupt-mct-202002), and the Engineering Research Center of Mobile Communications, Ministry of Education (Grant No. cqupt-mct202006).
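Per segment, the two descriptors can be computed with off-the-shelf tools. The sketch below uses OpenCV's Farneback dense flow for appearance motion and frame-to-frame landmark displacements for geometry motion; the flow parameters are common defaults and the feature layout is an assumption, not the paper's extractor.

```python
import cv2
import numpy as np

def segment_descriptors(frames_gray, landmarks):
    """frames_gray: list of HxW uint8 frames; landmarks: (T, 68, 2) float array."""
    flows = [cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                          0.5, 3, 15, 3, 5, 1.2, 0)
             for prev, nxt in zip(frames_gray[:-1], frames_gray[1:])]
    appearance = np.stack(flows)            # (T-1, H, W, 2) dense motion field
    geometry = np.diff(landmarks, axis=0)   # (T-1, 68, 2) landmark trajectory
    return appearance, geometry
```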
Prediction of students' engagement in a collaborative learning setting is essential to improve the quality of learning. Collaborative learning is a strategy of learning through groups or teams: when cooperative learning behavior occurs, each student in the group should participate in the teaching activities, and researchers have shown that students who are actively involved in a class gain more. Gaze behavior and facial expression are important nonverbal indicators of engagement in collaborative learning environments. Previous studies required the wearing of sensors or eye-tracking devices, which impose cost barriers and technical interference on daily teaching practice. In this paper, student engagement is analyzed automatically using computer vision. We tackle the problem of engagement in collaborative learning using a multi-modal deep neural network (MDNN), combining facial expression and gaze direction as its two individual components to predict engagement levels in collaborative learning environments. Our multi-modal solution was evaluated in a real collaborative environment. The results show that the model can accurately predict students' performance in the collaborative learning environment.
Funding: supported by the National Natural Science Foundation of China (No. 61977031) and XPCC's Plan for Tackling Key Scientific and Technological Problems in Key Fields (No. 2021AB023-3).
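The two modalities can be combined with a simple late-fusion network: one branch embeds an expression feature vector, the other a gaze-direction vector, and a shared head predicts the engagement level. Everything below (feature dimensions, three engagement levels) is an illustrative assumption rather than the paper's MDNN.

```python
import torch
import torch.nn as nn

class EngagementMDNN(nn.Module):
    def __init__(self, expr_dim=128, gaze_dim=3, n_levels=3):
        super().__init__()
        self.expr = nn.Sequential(nn.Linear(expr_dim, 64), nn.ReLU())
        self.gaze = nn.Sequential(nn.Linear(gaze_dim, 16), nn.ReLU())
        self.head = nn.Linear(64 + 16, n_levels)   # engagement-level logits

    def forward(self, expr_feat, gaze_vec):
        fused = torch.cat([self.expr(expr_feat), self.gaze(gaze_vec)], dim=1)
        return self.head(fused)

scores = EngagementMDNN()(torch.randn(8, 128), torch.randn(8, 3))
```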
Facial expression recognition (FER) remains a hot research area among computer vision researchers and is still challenging because of high intra-class variation. Conventional techniques for this problem rely on hand-crafted features, namely LBP, SIFT, and HOG, together with a classifier trained on a database of videos or images. Many perform well on image datasets captured under controlled conditions, but not on more challenging datasets with partial faces and greater image variation. Recently, many studies have presented end-to-end architectures for facial expression recognition using deep learning methods. This study therefore develops an earthworm optimization with improved SqueezeNet-based FER (EWOISN-FER) model. The presented EWOISN-FER model first applies the contrast-limited adaptive histogram equalization (CLAHE) technique as a pre-processing step. In addition, the improved SqueezeNet model is exploited to derive an optimal set of feature vectors, and the hyperparameter tuning process is performed by the stochastic gradient boosting (SGB) model. Finally, EWO with a sparse autoencoder (SAE) is employed for the FER process, and the EWO algorithm appropriately chooses the SAE parameters. A wide-ranging experimental analysis is carried out to examine the performance of the proposed model. The experimental outcomes indicate the supremacy of the presented EWOISN-FER technique.
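The CLAHE pre-processing step is a standard OpenCV operation, reproduced below; the clip limit and tile size are common defaults rather than values from the paper.

```python
import cv2

def clahe_preprocess(gray_face):
    """Contrast-limited adaptive histogram equalization on a grayscale face."""
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(gray_face)
```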
Facial expression recognition (FER) has been an important field of research for several decades. Extraction of emotional characteristics is crucial to FER but is complex to process because of significant intra-class variance, and facial characteristics have not been completely explored in static pictures. Previous studies used convolutional neural networks (CNNs) based on transfer learning and hyperparameter optimization for static facial emotion recognition, and particle swarm optimization (PSO) has also been used for tuning hyperparameters. However, these methods achieve only about 92 percent accuracy, and the existing algorithms have issues with FER accuracy and precision, so overall FER performance is degraded significantly. To address this issue, this work proposes a combination of CNNs and long short-term memories (LSTMs), called the HCNN-LSTM (hybrid CNN and LSTM) approach, for FER. The work is evaluated on the benchmark dataset Facial Expression Recog Image Ver (FERC). Viola-Jones (VJ) algorithms recognize faces from preprocessed images, followed by HCNN-LSTM feature extraction and FER classification. Further, the success rate of deep learning techniques (DLTs) has increased with the tuning of hyperparameters such as epochs, batch sizes, initial learning rates, regularization parameters, shuffling types, and momentum. This work uses improved weight-based whale optimization algorithms (IWWOAs) to select near-optimal settings for these parameters using best fitness values. The experimental findings demonstrate that the proposed HCNN-LSTM system outperforms the existing methods.
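The hybrid CNN-LSTM idea, per-frame convolutional features feeding a recurrent classifier, can be sketched as follows. All sizes are illustrative assumptions, and the whale-optimization hyperparameter search is not reproduced here.

```python
import torch
import torch.nn as nn

class HybridCNNLSTM(nn.Module):
    def __init__(self, n_classes=7):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())      # -> (B*T, 256)
        self.lstm = nn.LSTM(256, 64, batch_first=True)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, clips):                           # clips: (B, T, 1, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.fc(out[:, -1])                      # classify last state

logits = HybridCNNLSTM()(torch.randn(2, 8, 1, 48, 48))  # (2, 7) class scores
```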
Facial landmarks can provide valuable information for expression-related tasks. However, most approaches only use landmarks for segmentation preprocessing or directly feed them into fully connected layers of the neural network. Such simple combinations not only fail to pass spatial information to the network but also increase the computational cost. The method proposed in this paper integrates a facial landmark-driven representation into a triplet network. The spatial information provided by landmarks is introduced into the feature extraction process so that the model can better capture location relationships; in addition, coordinate information is integrated into the triplet loss calculation to further enhance similarity prediction. Specifically, for each image, the coordinates of 68 landmarks are detected, and a region attention map based on these landmarks is then generated. The feature map output by the shallow convolutional layers is multiplied by this attention map to correct the feature activations, strengthening key regions and weakening unimportant ones. Finally, the optimized embedding output can be used for downstream tasks; the three embeddings output by the network for three images can be regarded as a triplet representation for similarity computation. The effectiveness of this optimized feature extraction is verified on the CK+ dataset, after which it is applied to facial expression similarity tasks. The results on the facial expression comparison (FEC) dataset show that the accuracy rate is significantly improved once the landmark information is introduced.
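One way to realize a landmark-driven region attention map is to place a Gaussian bump at each of the 68 landmark coordinates and take their pointwise maximum; the sketch below does exactly that, with the bump width and map size as assumptions (the paper's exact map construction is not specified here). The resulting mask multiplies the shallow feature map, and triplet training could then use torch.nn.TripletMarginLoss on the embeddings.

```python
import torch

def landmark_attention_map(landmarks, size=56, sigma=3.0):
    """landmarks: (68, 2) (x, y) coords in [0, size); returns (1, size, size)."""
    ys, xs = torch.meshgrid(torch.arange(size, dtype=torch.float32),
                            torch.arange(size, dtype=torch.float32),
                            indexing="ij")
    d2 = ((xs[None] - landmarks[:, 0, None, None]) ** 2 +
          (ys[None] - landmarks[:, 1, None, None]) ** 2)   # (68, size, size)
    bumps = torch.exp(-d2 / (2 * sigma ** 2))
    return bumps.amax(dim=0, keepdim=True)   # union of per-landmark bumps

# usage: feats = feats * landmark_attention_map(lms)  # strengthen key regions
```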
Analyzing human facial expressions using machine vision systems is a challenging yet fascinating problem in computer vision and artificial intelligence. Facial expressions are a primary means through which humans convey emotions, making their automated recognition valuable for various applications including human-computer interaction, affective computing, and psychological research. Pre-processing techniques are applied to every image with the aim of standardizing the images; frequently used techniques include scaling, blurring, rotation, altering the contour of the image, grayscale conversion, and normalization. Feature extraction follows, and traditional classifiers are then applied to infer facial expressions. Improving the performance of such a system is difficult in the typical machine learning approach because the feature extraction and classification phases are separate, whereas in deep neural networks (DNNs) the two phases are combined into one. Convolutional neural network (CNN) models therefore give better accuracy in facial expression recognition than traditional classifiers, but their performance is still hampered by noisy and deviated images in the dataset. Motivated by these drawbacks, this work studies the use of image pre-processing techniques, namely resizing, grayscale conversion, and normalization, to enhance the performance of deep learning methods for facial expression recognition, and shows the influence of data pre-processing on the further processing of images. The accuracy obtained with each pre-processing method is compared, combinations of the methods are analyzed, and the appropriate pre-processing techniques are identified and implemented to assess the variability of accuracies in predicting facial expressions.
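The three named steps translate directly into a few OpenCV calls. The 48x48 target size below is an assumption (a size commonly used for FER datasets), not a value fixed by the study.

```python
import cv2
import numpy as np

def preprocess(img_bgr, size=48):
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)  # grayscale conversion
    gray = cv2.resize(gray, (size, size))             # resizing
    return gray.astype(np.float32) / 255.0            # normalization to [0, 1]
```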
A novel fuzzy linear discriminant analysis method based on canonical correlation analysis (fuzzy-LDA/CCA) is presented and applied to facial expression recognition. The fuzzy method is used to evaluate the degree of class membership to which each training sample belongs. CCA is then used to establish the relationship between each facial image and the corresponding class membership vector, and the class membership vector of a test image is estimated using this relationship. Moreover, the fuzzy-LDA/CCA method is also generalized to deal with nonlinear discriminant analysis problems via the kernel method. The performance of the proposed method is demonstrated using real data.
Funding: the National Natural Science Foundation of China (Nos. 60503023, 60872160), the Natural Science Foundation for Universities of Jiangsu Province (No. 08KJD520009), and the Intramural Research Foundation of Nanjing University of Information Science and Technology (No. Y603).
It is unknown whether the ability of the Portuguese to identify the NimStim data set, which was created in America to provide facial expressions that could be recognized by untrained people, is similar to that of Americans. To test this hypothesis, the performance of Portuguese participants in recognizing the Happiness, Surprise, Sadness, Fear, Disgust, and Anger NimStim facial expressions was compared with that of Americans, and no significant differences were found. In both populations the easiest emotion to identify was Happiness, while Fear was the most difficult. However, with the exception of Surprise, the Portuguese tended to show a lower accuracy rate for all the emotions studied. The results highlight some cultural differences.
Facial expression recognition is a hot topic in computer vision, but it remains challenging due to the feature inconsistency caused by person-specific characteristics of facial expressions. To address this challenge, and inspired by the recent success of the deep identity network (DeepID-Net) for face identification, this paper proposes a novel deep learning based framework for recognising human expressions from facial images. Compared to existing deep learning methods, our proposed framework, which is based on multi-scale global images and local facial patches, achieves significantly better performance on facial expression recognition. Finally, we verify the effectiveness of our proposed framework through experiments on the public benchmarking datasets JAFFE and extended Cohn-Kanade (CK+).
Funding: supported by the Academy of Finland (267581), the D2I SHOK project from Digile Oy, and Nokia Technologies (Tampere, Finland).
A facial expression emotion recognition based human-robot interaction (FEER-HRI) system is proposed, for which a four-layer system framework is designed. The FEER-HRI system enables robots not only to recognize human emotions but also to generate facial expressions for adapting to human emotions. A facial emotion recognition method based on 2D-Gabor features, the uniform local binary pattern (LBP) operator, and a multiclass extreme learning machine (ELM) classifier is presented, which is applied to real-time facial expression recognition for robots. Facial expressions of robots are represented by simple cartoon symbols and displayed on an LED screen equipped in the robots, which can be easily understood by humans. Four scenarios, i.e., guiding, entertainment, home service, and scene simulation, are performed in the human-robot interaction experiment, in which smooth communication is realized by facial expression recognition of humans and facial expression generation of robots within 2 seconds. As prospective applications, the FEER-HRI system can be applied in home service, smart homes, safe driving, and so on.
Funding: supported by the National Natural Science Foundation of China (61403422, 61273102), the Hubei Provincial Natural Science Foundation of China (2015CFA010), the 111 Project (B17040), and the Fundamental Research Funds for National Universities, China University of Geosciences (Wuhan).
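The 2D-Gabor stage can be approximated with OpenCV's built-in kernel generator; the eight-orientation bank and filter parameters below are common choices, not necessarily those used in FEER-HRI.

```python
import cv2
import numpy as np

def gabor_responses(gray, ksize=21):
    """Filter a grayscale face with an 8-orientation Gabor bank."""
    responses = []
    for theta in np.arange(0, np.pi, np.pi / 8):
        kernel = cv2.getGaborKernel((ksize, ksize), sigma=4.0, theta=theta,
                                    lambd=10.0, gamma=0.5)
        responses.append(cv2.filter2D(gray, cv2.CV_32F, kernel))
    return np.stack(responses)   # (8, H, W) filter responses
```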
Functional magnetic resonance imaging was used during emotion recognition to identify changes in functional brain activation in 21 first-episode, treatment-naive major depressive disorder patients before and after antidepressant treatment. Following escitalopram oxalate treatment, patients exhibited decreased activation in the bilateral precentral gyrus, bilateral middle frontal gyrus, left middle temporal gyrus, bilateral postcentral gyrus, left cingulate, and right parahippocampal gyrus, and increased activation in the right superior frontal gyrus, bilateral superior parietal lobule, and left occipital gyrus during sad facial expression recognition. After antidepressant treatment, patients also exhibited decreased activation in the bilateral middle frontal gyrus, bilateral cingulate, and right parahippocampal gyrus, and increased activation in the right inferior frontal gyrus, left fusiform gyrus, and right precuneus during happy facial expression recognition. Our experimental findings indicate that the limbic-cortical network might be a key target region for antidepressant treatment in major depressive disorder.
Funding: supported by research grants from the National Natural Science Foundation of China (No. 81071099), the Liaoning Science and Technology Foundation (No. 2008225010-14), and the Doctoral Foundation of the First Affiliated Hospital of China Medical University (No. 2010).
The local binary pattern (LBP) is an important method for texture feature extraction in facial expression analysis. However, it also has shortcomings: high dimensionality, slow feature extraction, and no effective local or global features extracted. To solve these problems, a facial expression feature extraction method based on an improved LBP is proposed. Firstly, LBP is converted into the double local binary pattern (DLBP). Then, by combining a Taylor expansion (TE) with DLBP, the DLBP-TE algorithm is obtained. Finally, the DLBP-TE algorithm combined with an extreme learning machine (ELM) is applied to seven kinds of facial expression images, and the corresponding experiments are carried out on the Japanese adult female facial expression (JAFFE) database. The results show that the proposed method can significantly improve the facial expression recognition rate.
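For reference, the baseline LBP code that DLBP extends can be computed in a few lines of NumPy; the DLBP and Taylor-expansion steps themselves are specific to the paper and are not reproduced here.

```python
import numpy as np

def lbp_3x3(gray):
    """8-neighbour LBP codes for the interior pixels of a 2-D uint8 image."""
    h, w = gray.shape
    center = gray[1:-1, 1:-1].astype(np.int16)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]     # clockwise neighbours
    code = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx].astype(np.int16)
        code |= (neighbour >= center).astype(np.uint8) << bit
    return code   # a histogram of these codes is the texture feature
```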
In this paper, a novel method based on the dual-tree complex wavelet transform (DT-CWT) and rotation-invariant local binary patterns (LBP) for facial expression recognition is proposed. The quarter sample shift (Q-shift) DT-CWT can provide a group delay of 1/4 of a sample period and satisfy the usual 2-band filter bank constraints of no aliasing and perfect reconstruction. To resolve illumination variation in expression verification, the low-frequency coefficients produced by the DT-CWT are set to zero, the high-frequency coefficients are used to reconstruct the image, and the basic LBP histogram is mapped onto the reconstructed image by means of histogram specification. LBP is capable of encoding the texture and shape information of the preprocessed images. The histograms built from multi-scale rotation-invariant LBPs are combined to serve as the feature for further recognition. Template matching is adopted to classify facial expressions for its simplicity. The experimental results show that the proposed approach performs well in both efficiency and accuracy.
Facial expression recognition (FER) has been an interesting area of research in settings with human-computer interaction, as human psychology, emotions, and behaviors can be analyzed through FER. Classifiers used in FER have performed well on normal faces but have been found to be constrained on occluded faces. Recently, deep learning techniques (DLT) have gained popularity in real-world applications, including the recognition of human emotions. The human face reflects emotional states and human intentions, and an expression is the most natural and powerful way of communicating non-verbally. Systems that mediate such communication are termed human-machine interaction (HMI) systems, and FER can improve HMI systems because human expressions convey useful information to an observer. This paper proposes a FER scheme called EECNN (enhanced convolutional neural network with attention mechanism) to recognize seven types of human emotions, with satisfying results in its experiments: the proposed EECNN achieved 89.8% accuracy in classifying the images.
OBJECTIVE: To summarize and analyze the brain signal patterns of empathy for pain evoked by facial expressions of pain, using activation likelihood estimation, a meta-analysis method. DATA SOURCES: Studies concerning the underlying brain mechanism were retrieved from the Science Citation Index, ScienceDirect, PubMed, DeepDyve, Cochrane Library, SinoMed, Wanfang, VIP, China National Knowledge Infrastructure, and other databases such as SpringerLink, AMA, Science Online, and Wiley Online. A time limitation of up to 13 December 2016 was applied. DATA SELECTION: Studies meeting all of the following criteria were considered for inclusion: use of functional magnetic resonance imaging; neutral and pained facial expression stimuli; involvement of healthy adult human participants over 18 years of age whose empathic ability showed no difference from that of healthy adults; a painless baseline state; results presented in Talairach or Montreal Neurological Institute coordinates; multiple studies by the same team were allowed as long as they used different raw data. OUTCOME MEASURES: Activation likelihood estimation was used to calculate the combined main activated brain regions under the stimulation of pained facial expressions. RESULTS: Eight studies were included, containing 178 subjects. The meta-analysis results suggested that the anterior cingulate cortex (BA32), anterior central gyrus (BA44), fusiform gyrus, and insula (BA13) were positively activated as the major brain areas under the stimulation of pained facial expressions. CONCLUSION: Our study shows that pained facial expressions alone, without viewing of painful stimuli, activated brain regions related to pain empathy, further contributing to revealing the brain's mechanisms of pain empathy.
Funding: supported by the National Natural Science Foundation of China, Nos. 81473769 and 81772430 (to WW), and a grant from the Training Program of Innovation and Entrepreneurship for Undergraduates of Southern Medical University, Guangdong Province, China, in 2016, No. 201612121057 (to WW).