Original Article

Experimental Test Validation

Examining the Path From Test Elements to Test Performance

Published Online: https://doi.org/10.1027/1015-5759/a000393

Abstract. Although the vast majority of validation studies rely on correlational validity evidence, there is increasing recognition that validation should also examine whether variations in the focal psychological attribute lead to variations in the measurement outcomes. Calls have therefore been made to gather validity evidence through experiments as well. Existing experimental validation strategies focus on manipulating psychological attributes and examining the effects on measurement outcomes. In the current manuscript, we present an additional, complementary approach that manipulates test elements (instead of psychological attributes) that are considered indispensable for test functioning. Examples from the personality, situational judgment, emotional intelligence, and reading comprehension domains illustrate our approach. The presented approach is integrated into existing validation strategies.
