Original Article

Not Very Powerful

The Influence of Negations and Vague Quantifiers on the Psychometric Properties of Questionnaires

Published online: https://doi.org/10.1027/1015-5759/a000539

Abstract. Several guidelines for constructing questionnaire items exist, yet the literature lacks empirical evidence of their effectiveness. To investigate whether adding negations and vague quantifiers worsens the psychometric properties of an established questionnaire, 872 participants completed one of four versions of the Positive and Negative Affect Schedule (PANAS): the German original, a negated version, a version with vague quantifiers, or a version with both negations and vague quantifiers. Reliability estimates, item-total correlations, Confirmatory Factor Analysis (CFA) model fit, and fit to the Partial Credit Model (PCM) were compared across the four conditions. No PANAS version was clearly superior, as no systematic pattern emerged in the psychometric properties. Our findings question the general applicability of item-construction guidelines as well as the effectiveness of widely used statistical analyses for assessing scale quality. The results should encourage researchers to focus more strongly on careful item construction, as relying on psychometric properties alone might not be sufficient to develop valid questionnaires.
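The four criteria named in the abstract map onto standard R routines from packages cited in the reference list (psych, lavaan, eRm). The sketch below shows how they might be computed for a single PANAS version; it is a minimal illustration under stated assumptions, not the authors' analysis code. The data frame panas and the item names pa1-pa10 and na1-na10 are hypothetical placeholders.

    library(psych)   # alpha(): reliability and item-total statistics
    library(lavaan)  # cfa(), fitMeasures(): CFA model fit
    library(eRm)     # PCM(), LRtest(): Partial Credit Model fit

    # `panas`: hypothetical data frame of item responses (1-5), one row per person
    pa_items <- paste0("pa", 1:10)
    na_items <- paste0("na", 1:10)

    # Reliability (Cronbach's alpha) and corrected item-total correlations
    alpha_pa <- psych::alpha(panas[, pa_items])
    alpha_pa$total$raw_alpha       # alpha estimate for the positive affect scale
    alpha_pa$item.stats$r.drop     # corrected item-total correlations

    # Two-factor CFA (positive vs. negative affect) with common fit indices
    model <- paste(
      "PA =~", paste(pa_items, collapse = " + "), "\n",
      "NegAff =~", paste(na_items, collapse = " + ")
    )
    fit <- lavaan::cfa(model, data = panas)
    lavaan::fitMeasures(fit, c("chisq", "df", "cfi", "rmsea", "srmr"))

    # Partial Credit Model; eRm expects category scores starting at 0,
    # so the 1-5 responses are shifted down by one
    pcm_pa <- eRm::PCM(panas[, pa_items] - 1)
    eRm::LRtest(pcm_pa)            # Andersen's likelihood-ratio test of fit

In the study these criteria were compared across the four questionnaire versions; in a sketch like this, the same calls would simply be repeated per condition.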

References

  • Anastasi, A., & Urbina, S. (1997). Psychological testing. Englewood Cliffs, NJ: Prentice-Hall International.

  • Andersen, E. B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123–140. https://doi.org/10.1007/BF02291180

  • Andrich, D. (2013). An expanded derivation of the threshold structure of the polytomous Rasch model that dispels any “threshold disorder controversy”. Educational and Psychological Measurement, 73, 78–124. https://doi.org/10.1177/0013164412450877

  • Barnette, J. J. (2000). Effects of stem and Likert response option reversals on survey internal consistency: If you feel the need, there is a better alternative to using those negatively worded stems. Educational and Psychological Measurement, 60, 361–370. https://doi.org/10.1177/00131640021970592

  • Bassili, J. N., & Scott, B. S. (1996). Response latency as a signal to question problems in survey research. Public Opinion Quarterly, 60, 390–399. https://doi.org/10.1086/297760

  • Beauducel, A., & Wittmann, W. W. (2005). Simulation study on fit indexes in CFA based on data with slightly distorted simple structure. Structural Equation Modeling, 12, 41–75. https://doi.org/10.1207/s15328007sem1201_3

  • Billiet, J. B., & McClendon, M. J. (2000). Modeling acquiescence in measurement models for two balanced sets of items. Structural Equation Modeling, 7, 608–628. https://doi.org/10.1207/S15328007SEM0704_5

  • Budescu, D. V., & Wallsten, T. S. (1985). Consistency in interpretation of probabilistic phrases. Organizational Behavior and Human Decision Processes, 36, 391–405. https://doi.org/10.1016/0749-5978(85)90007-X

  • Canty, A., & Ripley, B. (2017). boot: Bootstrap R (S-Plus) functions (R package version 1.3-20). Retrieved from https://cran.r-project.org/web/packages/boot/boot.pdf

  • Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling, 14, 464–504. https://doi.org/10.1080/10705510701301834

  • Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI) professional manual. Odessa, FL: Psychological Assessment Resources.

  • Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334. https://doi.org/10.1007/BF02310555

  • Diers, C. J. (1964). Social desirability and acquiescence in response to personality items. Journal of Consulting Psychology, 28, 71–77. https://doi.org/10.1037/h0043753

  • Eggert, D. (1974). Eysenck-Persönlichkeits-Inventar: EPI: Handanweisung für die Durchführung und Auswertung [Eysenck Personality Inventory: EPI: Manual for administration and scoring]. Oxford, UK: Verlag für Psychologie.

  • Flake, J. K., Pek, J., & Hehman, E. (2017). Construct validation in social and personality research: Current practice and recommendations. Social Psychological and Personality Science, 8, 370–378. https://doi.org/10.1177/1948550617693063

  • García-Pérez, M. A. (2017). An analysis of (dis)ordered categories, thresholds, and crossings in difference and divide-by-total IRT models for ordered responses. The Spanish Journal of Psychology, 20, E10. https://doi.org/10.1017/sjp.2017.11

  • Gnambs, T. (2015). Facets of measurement error for scores of the Big Five: Three reliability generalizations. Personality and Individual Differences, 84, 84–89. https://doi.org/10.1016/j.paid.2014.08.019

  • Hakel, M. D. (1968). How often is often? American Psychologist, 23, 533–534. https://doi.org/10.1037/h0037716

  • Heene, M., Hilbert, S., Draxler, C., Ziegler, M., & Bühner, M. (2011). Masking misfit in confirmatory factor analysis by increasing unique variances: A cautionary note on the usefulness of cutoff values of fit indices. Psychological Methods, 16, 319–336. https://doi.org/10.1037/a0024917

  • Herche, J., & Engelland, B. (1996). Reversed-polarity items and scale unidimensionality. Journal of the Academy of Marketing Science, 24, 366–374. https://doi.org/10.1177/0092070396244007

  • Hinz, A., Brähler, E., Geyer, M., & Körner, A. (2003). Urteilseffekte beim NEO-FFI [Response sets measured with NEO-FFI]. Diagnostica, 49, 157–163. https://doi.org/10.1026//0012-1924.49.4.157

  • Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55. https://doi.org/10.1080/10705519909540118

  • Korkmaz, S., Goksuluk, D., & Zararsiz, G. (2014). MVN: An R package for assessing multivariate normality (R package version 4.0). The R Journal, 6, 151–162. https://doi.org/10.32614/RJ-2014-031

  • Krohne, H. W., Egloff, B., Kohlmann, C. W., & Tausch, A. (1996). Untersuchungen mit einer deutschen Version der “Positive and Negative Affect Schedule” (PANAS) [Investigations with a German version of the Positive and Negative Affect Schedule (PANAS)]. Diagnostica, 42(2), 139–156. https://doi.org/10.1037/t49650-000

  • Löhr, F. J., & Angleitner, A. (1980). Eine Untersuchung zu sprachlichen Formulierungen der Items in deutschen Persönlichkeitsfragebogen [An investigation of item wording in German personality questionnaires]. Zeitschrift für Differentielle und Diagnostische Psychologie, 1, 217–235.

  • Mair, P., & Hatzinger, R. (2007). Extended Rasch modeling: The eRm package for the application of IRT models in R (R package version 0.15-7). Journal of Statistical Software, 20, 1–20. Retrieved from http://www.jstatsoft.org/v20/i09

  • Marsh, H. W. (1996). Positive and negative global self-esteem: A substantively meaningful distinction or artifactors? Journal of Personality and Social Psychology, 70, 810–819. https://doi.org/10.1037/0022-3514.70.4.810

  • Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149–174. https://doi.org/10.1007/BF02296272

  • McDonald, R. P. (1970). The theoretical foundations of principal factor analysis, canonical factor analysis, and alpha factor analysis. British Journal of Mathematical and Statistical Psychology, 23, 1–21. https://doi.org/10.1111/j.2044-8317.1970.tb00432.x

  • Meloni, F., & Gana, K. (2001). Wording effects in the Italian version of the Penn State Worry Questionnaire. Clinical Psychology and Psychotherapy, 8, 282–287. https://doi.org/10.1002/cpp.294

  • Moxey, L. M., & Sanford, A. J. (1992). Context effects and the communicative functions of quantifiers: Implications for their use in attitude research. In N. Schwarz & S. Sudman (Eds.), Context effects in social and psychological research (pp. 279–296). New York, NY: Springer.

  • Moxey, L. M., & Sanford, A. J. (1993). Prior expectation and the interpretation of natural language quantifiers. European Journal of Cognitive Psychology, 5, 73–91. https://doi.org/10.1080/09541449308406515

  • Nelson Laird, T. F., Korkmaz, A., & Chen, P. D. (2008, April). How often is “often” revisited: The meaning and linearity of vague quantifiers used on the National Survey of Student Engagement. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA.

  • Ortuño-Sierra, J., Santarén-Rosell, M., Pérez de Albéniz, A., & Fonseca-Pedrero, E. (2015). Dimensional structure of the Spanish version of the Positive and Negative Affect Schedule (PANAS) in adolescents and young adults. Psychological Assessment, 27, e1–e9. https://doi.org/10.1037/pas0000107

  • Pargent, F., Hilbert, S., Eichhorn, K., & Bühner, M. (2018). Can’t make it better nor worse: An empirical study about the effectiveness of general rules of item construction on psychometric properties. European Journal of Psychological Assessment. Advance online publication. https://doi.org/10.1027/1015-5759/a000471

  • Payne, S. L. (1951). The art of asking questions. Princeton, NJ: Princeton University Press.

  • Penfield, R. D., Myers, N. D., & Wolfe, E. W. (2008). Methods for assessing item, step, and threshold invariance in polytomous items following the partial credit model. Educational and Psychological Measurement, 68, 717–733. https://doi.org/10.1177/0013164407312602

  • Pepper, S., & Prytulak, L. S. (1974). Sometimes frequently means seldom: Context effects in the interpretation of quantitative expressions. Journal of Research in Personality, 8, 95–101. https://doi.org/10.1016/0092-6566(74)90049-X

  • Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2012). Sources of method bias in social science research and recommendations on how to control it. Annual Review of Psychology, 63, 539–569. https://doi.org/10.1146/annurev-psych-120710-100452

  • R Core Team. (2015). R: A language and environment for statistical computing (Version 3.3.3). Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://www.R-project.org/

  • Revelle, W. (2017). psych: Procedures for personality and psychological research (R package version 1.8.4). Evanston, IL: Northwestern University. Retrieved from https://CRAN.R-project.org/package=psych

  • Rohrmann, B. (1978). Empirische Studien zur Entwicklung von Antwortskalen für die sozialwissenschaftliche Forschung [Empirical studies on the development of response scales for social science research]. Zeitschrift für Sozialpsychologie, 9, 222–245.

  • Rosseel, Y. (2012). lavaan: An R package for structural equation modeling (R package version 0.5-22). Journal of Statistical Software, 48, 1–36. Retrieved from http://www.jstatsoft.org/v48/i02/

  • Rost, J., & von Davier, M. (1995). Mixture distribution Rasch models. In G. Fischer & I. Molenaar (Eds.), Rasch models (pp. 257–268). New York, NY: Springer.

  • Royston, J. P. (1983). Some techniques for assessing multivariate normality based on the Shapiro-Wilk W. Applied Statistics, 32, 121–133. Retrieved from http://www.jstor.org/stable/2347291

  • Schaeffer, N. C. (1991). Hardly ever or constantly? Group comparisons using vague quantifiers. Public Opinion Quarterly, 55, 395–423. https://doi.org/10.1086/269270

  • Spreen, O., & Spreen, G. (1963). The MMPI in a German-speaking population: Standardization report and methodological problems of cross-cultural interpretations. Acta Psychologica, 21, 265–273. https://doi.org/10.1016/0001-6918(63)90052-0

  • Swain, S. D., Weathers, D., & Niedrich, R. W. (2008). Assessing three sources of misresponse to reversed Likert items. Journal of Marketing Research, 45, 116–131. https://doi.org/10.1509/jmkr.45.1.116

  • von Davier, M. (2001). WINMIRA 2001 (Version 1.45). St. Paul, MN: Assessment Systems Corporation.

  • von Davier, M., & Rost, J. (1995). Polytomous mixed Rasch models. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models (pp. 371–379). New York, NY: Springer.

  • Weijters, B., & Baumgartner, H. (2012). Misresponse to reversed and negated items in surveys: A review. Journal of Marketing Research, 49, 737–747. https://doi.org/10.1509/jmr.11.0368

  • Wetzel, E., & Carstensen, C. H. (2014). Reversed thresholds in Partial Credit Models: A reason for collapsing categories? Assessment, 21, 765–774. https://doi.org/10.1177/1073191114530775

  • Wright, D. B., Gaskell, G. D., & O’Muircheartaigh, C. A. (1994). How much is “quite a bit”? Mapping between numerical values and vague quantifiers. Applied Cognitive Psychology, 8, 479–496. https://doi.org/10.1002/acp.2350080506

  • Ziegler, M. (2014). Comments on item selection procedures. European Journal of Psychological Assessment, 30, 1–2. https://doi.org/10.1027/1015-5759/a000196