Experimental Test Validation
Examining the Path From Test Elements to Test Performance
Abstract
Although the vast majority of validation studies rely on correlational validity evidence, there is increasing recognition that validation should also examine whether variations in the focal psychological attribute lead to variations in the measurement outcomes. Researchers have therefore called for validity evidence to be gathered through experiments as well. Existing experimental validation strategies manipulate psychological attributes and examine the effects on measurement outcomes. In the current manuscript, we present an additional and complementary approach that manipulates test elements (instead of psychological attributes) that are considered indispensable for test functioning. We illustrate the approach with examples from the personality, situational judgment, emotional intelligence, and reading comprehension domains, and we integrate it into existing validation strategies.