Original Article

Test Validation Without Measurement

Disentangling Scientific Explanation of Item Responses and Justification of Focused Assessment Policies Based on Test Data

Abstract. Test validation based on the usual statistical analyses is paradoxical: from a falsificationist perspective, these analyses do not test whether test data are ordinal measurements, and, from an ethical perspective, they do not justify the use of test scores. This paper (i) proposes some basic definitions in which measurement is a special case of scientific explanation, starting from the examples of memory accuracy and suicidality as scored by two widely used clinical tests/questionnaires. It then shows (ii) how to elicit the logic of the observable test events underlying the test scores, and (iii) how the measurability of the target theoretical quantities (memory accuracy and suicidality) can and should be tested at the respondent scale rather than at the scale of aggregates of respondents. (iv) Criterion-related validity is revisited to stress that invoking the explanatory power of test data should draw attention to counterexamples instead of statistical summarization. (v) Finally, it is argued that justifying the use of test scores in specific settings should be part of the test validation task because, as test specialists, psychologists are responsible for proposing their tests for social uses.

