Do Individual Response Styles Matter?
Assessing Differential Item Functioning for Men and Women in the NEO-PI-R
Abstract
The occurrence of differential item functioning (DIF) for gender indicates that an instrument may not be functioning equivalently for men and women. Aside from DIF effects, item responses in personality questionnaires can also be influenced by response styles. This study analyzes the German NEO-PI-R for differential item functioning between men and women while taking response styles into account. To this end, mixed Rasch models were first estimated to identify latent classes that differed in their response style. These latent classes were identified as extreme response style (ERS) and nonextreme response style (NERS). DIF analyses were then conducted separately for each response style class and compared with DIF results for the complete sample. Several items, especially on Neuroticism, Agreeableness, and Conscientiousness facets, showed gender DIF and thus functioned differentially for men and women. DIF results differed mainly in size between the complete sample and the response style subsamples, although DIF classification was consistent overall across ERS, NERS, and the complete sample.