Test Validation Without Measurement
Disentangling Scientific Explanation of Item Responses and Justification of Focused Assessment Policies Based on Test Data
Abstract
Test validation based on the usual statistical analyses is paradoxical: from a falsificationist perspective, these analyses do not test whether test data are ordinal measurements, and from an ethical perspective, they do not justify the use of test scores. Starting from the examples of memory accuracy and suicidality as scored by two widely used clinical tests/questionnaires, this paper (i) proposes some basic definitions in which measurement is a special case of scientific explanation. It then shows (ii) how to elicit the logic of the observable test events underlying the test scores, and (iii) how the measurability of the target theoretical quantities, memory accuracy and suicidality, can and should be tested at the scale of the individual respondent rather than at the scale of aggregates of respondents. (iv) Criterion-related validity is revisited to stress that invoking the explanatory power of test data should draw attention to counterexamples rather than to statistical summaries. (v) Finally, it is argued that justifying the use of test scores in specific settings should be part of the test validation task because, as test specialists, psychologists are responsible for proposing their tests for social uses.