Find the Mistake!
Psychometric Properties of an Innovative Response Format for Figural Matrices Tasks
Abstract
Reasoning ability is commonly regarded as the best predictor of academic and occupational success. Owing to concerns about the validity of multiple-choice (MC) formats, breaches of test security, and the fact that most existing reasoning assessments target their difficulty at the population mean, there is an ongoing need for new reliable and valid test instruments to assess fluid intelligence in advanced cognitive performance areas. We developed a novel computerized figural matrices test to assess nonverbal reasoning for university student aptitude assessment. In two studies, we generated, revised, and empirically validated the Isometric Matrices Test (IMT). Our results show that the IMT is less prone to test-wiseness strategies than existing reasoning tests. In a third study, we created and evaluated an innovative Find the Mistake (FtM) response format as an alternative to classical multiple-choice formats. Overall, both response formats showed satisfactory psychometric quality in terms of item difficulty and discrimination, test-retest reliability, construct and criterion validity, and Rasch or two-parameter logistic (2PL) model fit; in one MC version, however, internal consistency was low owing to negative discrimination indices. The MC response format proved easier than the FtM format, with men slightly outperforming women in both response modes. We propose the IMT as a useful tool for assessing nonverbal reasoning ability in above-average performance areas and discuss the automatic generation of larger IMT item pools for adaptive testing in order to increase test security and reliability.
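The link the abstract draws between negative discrimination indices and low internal consistency can be illustrated on simulated data. The sketch below (hypothetical data, not IMT data; all variable names are illustrative) computes corrected item-total correlations as a discrimination index and Cronbach's alpha, then reverse-codes one item to show how a negatively discriminating item depresses alpha:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate dichotomous (0/1) responses of 200 examinees to 10 items,
# all driven by a single latent ability (a Rasch-type model).
theta = rng.normal(size=(200, 1))
difficulty = np.linspace(-1.5, 1.5, 10)
p_correct = 1 / (1 + np.exp(-(theta - difficulty)))
responses = (rng.random((200, 10)) < p_correct).astype(float)

def cronbach_alpha(items):
    """Alpha = k/(k-1) * (1 - sum of item variances / total-score variance)."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def corrected_item_total(items):
    """Discrimination: each item's correlation with the rest-score
    (total score excluding that item)."""
    total = items.sum(axis=1)
    return np.array(
        [np.corrcoef(items[:, j], total - items[:, j])[0, 1]
         for j in range(items.shape[1])]
    )

alpha_clean = cronbach_alpha(responses)

# Reverse-code one item: its discrimination index turns negative,
# and internal consistency drops accordingly.
flawed = responses.copy()
flawed[:, 0] = 1 - flawed[:, 0]
alpha_flawed = cronbach_alpha(flawed)

print(corrected_item_total(flawed)[0])  # negative discrimination
print(alpha_clean, alpha_flawed)        # alpha drops for the flawed set
```

In operational test construction, such negatively discriminating items would be flagged for revision or removal, which is the remedy the studies reported here imply for the affected MC version.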