Short Research Article

Are Face-Incongruent Voices Harder to Process?

Effects of Face–Voice Gender Incongruency on Basic Cognitive Information Processing

Published Online: https://doi.org/10.1027/1618-3169/a000440

Abstract. Based on current integration theories of face–voice processing, the present study had participants process 1,152 videos of faces uttering digits. Half of the videos contained face–voice gender-incongruent stimuli; the other half contained congruent stimuli. Participants indicated digit magnitude or parity. Tasks were presented in pure blocks (only one task) and in task-switching blocks (with colored cues specifying the task). The results indicate significant congruency effects in pure blocks, but partially reversed congruency effects in task switching, probably due to enhanced assignment of capacity toward resolving difficult situational demands. Congruency effects did not dissipate over time, which rules out the possibility that initial surprise associated with incongruent stimuli drove the effects. The results show that interference between two task-irrelevant person-related dimensions (face/voice gender) can affect processing of a third, task-relevant dimension (digit identity), suggesting greater processing ease associated with more authentic voices (i.e., voices that do not violate face-based expectancies).
