Auditory information can convey emotional messages through both rhythm and semantic meaning. In some cases, the emotional implications of the rhythm and of the semantics may even be inconsistent. If such complex auditory information is presented simultaneously with visual information such as faces, what kind of interaction occurs when people process it? The present study aimed to investigate this question. The sound materials were produced by two professional speakers, one male and one female. In Experiment 1, the speakers read neutral words with either a happy or an angry rhythm, and the participants' task was to judge whether the emotion of the speech rhythm was consistent with the facial expression presented at the same time. In Experiment 2, the speakers read positive words with an angry rhythm or negative words with a happy rhythm, and the participants' task was to judge whether either the emotion of the speech rhythm or the semantics of the speech was consistent with the facial expression presented at the same time. Forty-three participants (21 females) took part in Experiment 1 and 40 participants (20 females) in Experiment 2. Each experiment included 120 trials. In Experiment 1, half of the trials were consistent and the other half inconsistent. In Experiment 2, half of the trials were consistent with the semantic information and the other half with the rhythm information. The key-response mapping was counterbalanced across participants. Repeated-measures ANOVAs were performed on the participants' mean reaction times and accuracy, with facial expression and rhythm-emotion consistency as the within-subjects factors in Experiment 1 and with facial expression and judgment cue as the within-subjects factors in Experiment 2.
The results revealed that (1) when the facial expression was positive, participants were more accurate in judging the relationship between the information in the visual and auditory channels [Experiment 1: F(1, 42) = 15.41, p < .001, partial η² = .27; Experiment 2: F(1, 39) = 6.82, p < .05, partial η² = .15]. (2) In Experiment 1, when the valence of the rhythm was consistent with the facial expression, the judgment of the relationship between the information in the visual and auditory channels was faster [F(1, 42) = 37.63, p < .001, partial η² = .47] and more accurate [F(1, 42) = 21.80, p < .001, partial η² = .34]. (3) In Experiment 2, when the facial expression was negative, the semantic cues, compared with the rhythm cues of the words, facilitated participants' responses in judging the relationship between the visual and auditory stimuli [F(1, 39) = 15.78, p < .001, partial η² = .41]. The results suggest that when visual and auditory stimuli are presented at the same time, the visual information is processed first and then affects the processing of the auditory information. Whether or not the emotional valence of the visual and auditory stimuli conflicted, a positive facial expression promoted the cognitive judgment of the relationship between the visual and auditory information. When the emotional valence of the visual and auditory stimuli was congruent, an Easy Processing phenomenon emerged. When the emotional valence of the visual and auditory stimuli conflicted, a negative facial expression and the semantic information in the auditory channel promoted each other's processing. The present study explored the separate roles of the semantic emotional information and the rhythmic emotional information of a word in the judgment of the relationship between visual and auditory stimuli.
It provides initial evidence that, when the visual stimulus was a negative facial expression, the semantic information had a speed advantage whereas the rhythm information had an accuracy advantage.
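As a minimal sketch, the effect sizes reported for Experiment 1 can be related to their F statistics, assuming the standard relation partial η² = (F × df_effect) / (F × df_effect + df_error). The function name `partial_eta_squared` and the dictionary labels are illustrative and are not taken from the study's own analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Experiment 1 statistics as reported in the abstract: (F, df_effect, df_error).
experiment_1 = {
    "facial expression, accuracy": (15.41, 1, 42),
    "consistency, reaction time": (37.63, 1, 42),
    "consistency, accuracy": (21.80, 1, 42),
}

for label, (f_value, df_effect, df_error) in experiment_1.items():
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: partial eta squared = {eta:.2f}")
```

Rounded to two decimals, these three values agree with the .27, .47, and .34 reported for Experiment 1.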
Journal of Psychological Science
cross-modal, facial expression, emotional voice, rhythm-semantic conflicts