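Abstract: From "Tequila Sunrise" to "Lying Eyes" and on to "Hotel California," the country-rock supergroup The Eagles wrote an immortal chapter in the history of popular music with their unadorned singing. After more than twenty years of ups and downs, the band released a new album, Long Road out of Eden. It keeps the country-folk style of the early Eagles, and every track is a pleasure to hear. With forty years of performing behind them, the Eagles appear to have lost none of their flair. The tender "Do Something" is shared here.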
Abstract: Intelligent personal assistants play a pivotal role in in-vehicle systems, significantly enhancing everyday efficiency, driving safety, and decision-making support. This study discusses the multi-modal design elements of intelligent personal assistants in the context of visual, auditory, and somatosensory interaction with drivers. It explores their impact on the driver's psychological state through modes such as visual imagery, voice interaction, and gesture interaction. The study also introduces innovative designs for in-vehicle intelligent personal assistants, guided by design principles such as driver-centricity, prioritizing passenger safety, and timely feedback. Additionally, it employs design methods such as driver behavior research and driving situation analysis to strengthen the emotional connection between drivers and their vehicles, ultimately improving driver satisfaction and trust.
Abstract: This paper applies support vector machines (SVMs) to emotion recognition in Chinese speech. Both binary and multi-class discrimination are discussed. The results indicate that the emotional features pose a nonlinear problem in the input space, and that SVMs based on nonlinear mapping solve it more effectively than linear methods. A multi-class classifier based on SVMs with a soft decision function is constructed to classify four emotional states. Compared with the principal component analysis (PCA) method and a modified PCA method, SVMs with nonlinear kernel mapping achieve the best results in multi-class discrimination.
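The claim that a nonlinear mapping helps where linear methods fail can be illustrated with a toy problem. The following is a minimal sketch only: it uses a kernel perceptron as a simple stand-in for SVM training (it is not an SVM solver and not the paper's method), and the XOR-style data, function names, and `gamma` value are all assumptions made for illustration.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel: an implicit nonlinear mapping of the inputs."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def linear_kernel(a, b):
    """Plain dot product: no nonlinear mapping."""
    return sum(x * y for x, y in zip(a, b))

def decision(X, y, alpha, kernel, x):
    """Kernel decision value; the +1.0 term plays the role of a bias."""
    return sum(a * yj * (kernel(xj, x) + 1.0) for a, xj, yj in zip(alpha, X, y))

def train(X, y, kernel, epochs=50):
    """Kernel perceptron: bump alpha[i] whenever sample i is misclassified."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * decision(X, y, alpha, kernel, xi) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training sample is classified correctly
            break
    return alpha

def accuracy(X, y, alpha, kernel):
    """Fraction of samples whose predicted sign matches the target."""
    hits = sum((1 if decision(X, y, alpha, kernel, x) > 0 else -1) == t
               for x, t in zip(X, y))
    return hits / len(X)

# Hypothetical XOR-style data: not separable by any line in the input space.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [-1, 1, 1, -1]

acc_rbf = accuracy(X, y, train(X, y, rbf_kernel), rbf_kernel)
acc_linear = accuracy(X, y, train(X, y, linear_kernel), linear_kernel)
print("RBF kernel accuracy:", acc_rbf)
print("linear kernel accuracy:", acc_linear)
```

With the RBF kernel the four points become separable, while no linear decision function can classify all of them, which mirrors the abstract's argument at a very small scale.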
Funding: The National Natural Science Foundation of China (No. 61231002, 61273266, 61571106); the Foundation of the Department of Science and Technology of Guizhou Province (No. [2015]7637)
Abstract: To recognize emotion effectively in spontaneous, non-prototypical, and unsegmented speech, and thereby create more natural human-machine interaction, a novel speech emotion recognition algorithm is proposed that combines the emotional data field (EDF) with the ant colony search (ACS) strategy; it is called the EDF-ACS algorithm. More specifically, the inter-relationship among the turn-based acoustic feature vectors of different labels is established by using the potential function in the EDF. To perform spontaneous speech emotion recognition, an artificial ant colony is used to mimic the turn-based acoustic feature vectors. Then, the canonical ACS strategy is used to investigate the movement direction of each artificial ant in the EDF, which is regarded as the emotional label of the corresponding turn-based acoustic feature vector. The proposed EDF-ACS algorithm is evaluated on the continuous audio/visual emotion challenge (AVEC) 2012 dataset, which contains spontaneous, non-prototypical, and unsegmented speech emotion data. The experimental results show that the proposed EDF-ACS algorithm outperforms the existing state-of-the-art algorithm in turn-based speech emotion recognition.
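The abstract does not give the EDF potential function, so the sketch below assumes the Gaussian-like form commonly used in data field theory, where each labelled vector acts as a unit-mass source. It only shows how a potential can relate a new feature vector to labelled ones; the ant colony search step is not reproduced, and the label names, sample vectors, and `sigma` are hypothetical.

```python
import math

def potential(x, sources, sigma=1.0):
    """Assumed Gaussian-like potential at x produced by unit-mass sources."""
    return sum(math.exp(-(math.dist(x, s) / sigma) ** 2) for s in sources)

def label_by_potential(x, labelled, sigma=1.0):
    """Pick the label whose labelled vectors exert the largest potential on x."""
    return max(labelled, key=lambda lab: potential(x, labelled[lab], sigma))

# Hypothetical two-dimensional turn-level feature vectors grouped by label.
labelled = {
    "high_arousal": [(0.9, 0.8), (1.0, 1.0)],
    "low_arousal": [(0.0, 0.1), (0.1, 0.0)],
}

print(label_by_potential((0.8, 0.9), labelled))
print(label_by_potential((0.05, 0.05), labelled))
```

In the proposed algorithm the label is instead read off from the direction in which an artificial ant moves through the field; the direct comparison of potentials here is a simplification.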
Funding: The National Natural Science Foundation of China (No. 61273266, 61231002, 61301219, 61375028); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20110092130004); the Natural Science Foundation of Shandong Province (No. ZR2014FQ016)
Abstract: To solve the problem of mismatched features across experimental databases, a key issue in cross-corpus speech emotion recognition, an auditory attention model based on the Chirplet is proposed for feature extraction. First, to extract spectral features, the auditory attention model is employed to detect variational emotion features. Then, a selective attention mechanism model is proposed to extract the salient gist features, which show their relation to the expected performance in cross-corpus testing. Furthermore, Chirplet time-frequency atoms are introduced into the model. By forming a complete atom database, the Chirplet can improve spectral feature extraction, including the amount of information captured. Samples from multiple databases have multi-component characteristics. Accordingly, the Chirplet expands the scale of the feature vector in the time-frequency domain. Experimental results show that, compared with the traditional feature model, the proposed feature extraction approach with the prototypical classifier achieves a significant improvement in cross-corpus speech recognition. In addition, the proposed method is more robust to inconsistent sources of the training and testing sets.
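For readers unfamiliar with Chirplet atoms, a Gaussian chirplet is a Gaussian-windowed sinusoid whose frequency changes linearly with time. The sketch below generates one unnormalised atom; the parameter names (`tc`, `fc`, `chirp`, `sigma`) and their default values are assumptions chosen for illustration, and the abstract does not state which atom dictionary the authors actually used.

```python
import cmath
import math

def chirplet_atom(t, tc=0.0, fc=5.0, chirp=2.0, sigma=0.5):
    """Unnormalised Gaussian chirplet evaluated at time t (seconds).

    tc: time centre, fc: centre frequency (Hz),
    chirp: frequency slope (Hz/s), sigma: width of the Gaussian envelope.
    """
    tau = t - tc
    envelope = math.exp(-0.5 * (tau / sigma) ** 2)
    phase = 2.0 * math.pi * (fc * tau + 0.5 * chirp * tau ** 2)
    return envelope * cmath.exp(1j * phase)

def instantaneous_frequency(t, tc=0.0, fc=5.0, chirp=2.0):
    """Phase derivative divided by 2*pi: a straight line with slope `chirp`."""
    return fc + chirp * (t - tc)

# Sample one atom on a 10 ms grid from -1 s to +1 s.
times = [n / 100.0 for n in range(-100, 101)]
atom = [chirplet_atom(t) for t in times]

print(len(atom), abs(atom[100]), instantaneous_frequency(1.0))
```

Varying `tc`, `fc`, `chirp`, and `sigma` over a grid yields a dictionary of atoms, which is the general idea behind covering multi-component signals in the time-frequency plane.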
Funding: The National Natural Science Foundation of China (No. 61231002, 61273266); the Ph.D. Program Foundation of Ministry of Education of China (No. 20110092130004); China Postdoctoral Science Foundation (No. 2015M571637)
Abstract: To identify speech emotion information accurately, the discriminant-cascading effect in dimensionality reduction for speech emotion recognition is investigated. Based on the existing locality preserving projections and the graph embedding framework, a novel discriminant-cascading dimensionality reduction method is proposed, named discriminant-cascading locality preserving projections (DCLPP). The proposed method uses supervised embedding graphs and keeps the original space for the inner products of samples, so as to retain enough information for speech emotion recognition. The kernel DCLPP (KDCLPP) is also proposed to extend the mapping form. In experiments on the EMO-DB and eNTERFACE'05 corpora, the proposed method clearly outperforms common existing dimensionality reduction methods, such as principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projections (LPP), local discriminant embedding (LDE), and graph-based Fisher analysis (GbFA), with different categories of classifiers.
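As background for the supervised embedding graphs mentioned above, the sketch below builds a class-aware affinity graph and its Laplacian, the usual starting point for LPP-style methods in the graph embedding framework. It is not DCLPP itself: the heat-kernel weighting, the parameter `t`, and the sample data are assumptions, and the projection step (a generalized eigenproblem) is omitted.

```python
import math

def supervised_affinity(X, labels, t=1.0):
    """Heat-kernel weights between samples that share a class label."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and labels[i] == labels[j]:
                W[i][j] = math.exp(-math.dist(X[i], X[j]) ** 2 / t)
    return W

def laplacian(W):
    """Graph Laplacian L = D - W, where D holds the row sums of W."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical two-dimensional feature vectors with emotion labels.
X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["anger", "anger", "sadness", "sadness"]

W = supervised_affinity(X, labels)
L = laplacian(W)
print(W[0][1], W[0][2])
```

Edges exist only within a class, so a projection that keeps connected samples close also keeps each emotion class compact.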
Funding: The National Natural Science Foundation of China (No. 61231002, 61273266); the Ph.D. Programs Foundation of Ministry of Education of China (No. 20110092130004)
Abstract: Semi-supervised discriminant analysis (SDA), which uses a combination of multiple embedding graphs, and kernel SDA (KSDA) are adopted in supervised speech emotion recognition. When the emotional factors of speech signal samples are preprocessed, different categories of features, including pitch, zero-crossing rate, energy, duration, formants, and Mel-frequency cepstral coefficients (MFCCs), as well as their statistical parameters, are extracted from the utterances. In the dimensionality reduction stage, before the feature vectors are sent to the classifiers, parameter-optimized SDA and KSDA are applied. Experiments on the Berlin speech emotion database show that SDA for supervised speech emotion recognition outperforms other state-of-the-art dimensionality reduction methods based on spectral graph learning, such as linear discriminant analysis (LDA), locality preserving projections (LPP), and marginal Fisher analysis (MFA), when multi-class support vector machine (SVM) classifiers are used. Additionally, KSDA achieves better recognition performance than the above methods, including SDA, owing to its kernelized data mapping.
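The feature extraction stage described above can be pictured with two of the simpler features. The sketch below computes frame-level zero-crossing rate and short-time energy, then summarizes them with utterance-level statistics. The frame size, hop, choice of statistics, and toy signal are assumptions for illustration; pitch, formants, and MFCCs need more machinery and are left out.

```python
import statistics

def frames(signal, size, hop):
    """Split a signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
    return crossings / (len(frame) - 1)

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)

def utterance_statistics(signal, size=4, hop=2):
    """Frame-level features reduced to mean, max, and min per utterance."""
    fs = frames(signal, size, hop)
    feats = {
        "zcr": [zero_crossing_rate(f) for f in fs],
        "energy": [short_time_energy(f) for f in fs],
    }
    reducers = (("mean", statistics.mean), ("max", max), ("min", min))
    return {f"{name}_{stat}": fn(vals)
            for name, vals in feats.items() for stat, fn in reducers}

# Hypothetical toy signal: an alternating segment followed by a flat one.
signal = [1.0, -1.0, 1.0, -1.0, 0.5, 0.5, 0.5, 0.5]
stats = utterance_statistics(signal)
print(stats)
```

Statistics of this kind form the fixed-length feature vector that is then passed to a dimensionality reduction step such as SDA before classification.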