Original Article

On the Impact of the Response Options’ Position on Item Difficulty in Multiple-Choice-Items

Published Online: https://doi.org/10.1027/1015-5759/a000615

Abstract. The multiple-choice item format is widely used in test construction and large-scale assessment. So far, there has been little research on the impact of the position of the solution among the response options, and the few existing results are inconsistent. Because altering the position of the response options would be an easy way to create parallel items for group settings, the influence of the response options’ position on item difficulty should be examined. The Linear Logistic Test Model (Fischer, 1972) was used to analyze the data of 829 students aged 8–20 years who worked on general knowledge items. It was found that the position of the solution among the response options influences item difficulty: items are easiest when the solution is in first place and more difficult when the solution is placed in a middle position or at the end of the set of response options.
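
For readers who want to set up this kind of analysis themselves, the reference list points to the eRm package (Mair, Hatzinger, & Maier, 2018) for R (R Core Team, 2018), which estimates the Linear Logistic Test Model by conditional maximum likelihood. The following is a minimal sketch of how a design matrix coding the solution’s position could be passed to eRm’s LLTM() function; the simulated responses, the item-to-position assignment, and all variable names are illustrative assumptions and do not reproduce the authors’ actual analysis.

    ## Minimal sketch (R, package eRm): LLTM with a design matrix that codes
    ## the position of the solution; responses and design are hypothetical.
    library(eRm)

    set.seed(1)

    ## Hypothetical dichotomous responses: 200 persons x 6 items
    X <- matrix(rbinom(200 * 6, size = 1, prob = 0.6), nrow = 200, ncol = 6)

    ## Hypothetical design matrix W (rows = items, columns = basic parameters):
    ## a dummy for "hard" item content plus dummies for the solution appearing
    ## in a middle or last position (solution in first place is the reference).
    W <- cbind(hard   = c(0, 0, 0, 1, 1, 1),
               middle = c(0, 1, 0, 0, 1, 0),
               last   = c(0, 0, 1, 0, 0, 1))

    ## Unrestricted Rasch model and restricted LLTM, both estimated by
    ## conditional maximum likelihood (note: eRm parameterizes item easiness)
    rasch_fit <- RM(X)
    lltm_fit  <- LLTM(X, W)

    summary(lltm_fit)   # estimated basic parameters, including position effects

    ## Likelihood-ratio test of the LLTM restriction against the Rasch model
    lr_stat <- 2 * (rasch_fit$loglik - lltm_fit$loglik)
    df      <- rasch_fit$npar - lltm_fit$npar
    pchisq(lr_stat, df = df, lower.tail = FALSE)

In the study’s setting, the position dummies would be derived from the actual placement of the solution in each administered item; if the likelihood-ratio test does not reject the LLTM restriction, the position effects (together with the remaining basic parameters) account for the Rasch item difficulties.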

References

  • Andersen, E. B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123–140. https://doi.org/10.1007/BF02291180

  • Arendasy, M. (2005). Automatic generation of Rasch-calibrated items: Figural matrices test GEOM and endless-loops test EC. International Journal of Testing, 5, 197–224. https://doi.org/10.1207/s15327574ijt0503_2

  • Arendasy, M., & Sommer, M. (2012). Using automatic item generation to meet the increasing item demands of high-stakes educational and occupational assessment. Learning and Individual Differences, 22, 112–117. https://doi.org/10.1177/0022022110397360

  • Arendasy, M., Sommer, M., Gittler, G., & Hergovich, A. (2006). Automatic generation of quantitative reasoning items: A pilot study. Journal of Individual Differences, 27, 2–14. https://doi.org/10.1027/1614-0001.27.1.2

  • Attali, Y., & Bar-Hillel, M. (2003). Guess where: The position of correct answers in multiple-choice test items as a psychometric variable. Journal of Educational Measurement, 40(2), 109–128. https://doi.org/10.1111/j.1745-3984.2003.tb01099.x

  • Bar-Hillel, M., & Attali, Y. (2002). Seek whence: Answer sequences and their consequences in key-balanced multiple-choice tests. The American Statistician, 56(4), 299–303. https://doi.org/10.1198/000313002623

  • Becker, N., Schmitz, F., Falk, A. M., Feldbrügge, J., Preckel, F., Wilhelm, O., & Spinath, F. M. (2016). Preventing of response elimination strategies improves the construct validity of figural matrices. Journal of Intelligence, 4(1), Article 2. https://doi.org/10.3390/jintelligence4010002

  • Birenbaum, M., & Tatsuoka, K. K. (1987). Open-ended versus multiple-choice response formats–it does make a difference for diagnostic purposes. Applied Psychological Measurement, 11, 385–395. https://doi.org/10.1177/014662168701100404

  • Birenbaum, M., Tatsuoka, K. K., & Gutvirtz, Y. (1992). Effects of response format on diagnostic assessment of scholastic achievement. Applied Psychological Measurement, 16, 353–361. https://doi.org/10.1177/014662169201600406

  • Bliss, L. B. (1980). A test of Lord’s assumption regarding examinee guessing behavior on multiple-choice tests using elementary school students. Journal of Educational Measurement, 17(2), 147–153.

  • Budescu, D., & Bar-Hillel, M. (1993). To guess or not to guess: A decision-theoretic view of formula scoring. Journal of Educational Measurement, 30(4), 277–291.

  • Bush, M. (2001). A multiple choice test that rewards partial knowledge. Journal of Further and Higher Education, 25, 157–163. https://doi.org/10.1080/03098770120050828

  • Cizek, G. J. (1994). The effect of altering the position of options in a multiple-choice examination. Educational and Psychological Measurement, 54, 8–20. https://doi.org/10.1177/0013164494054001002

  • Debeer, D., & Janssen, R. (2013). Modeling item-position effects within an IRT framework. Journal of Educational Measurement, 50, 164–185. https://doi.org/10.1111/jedm.12009

  • DiBattista, D., & Kurzawa, L. (2011). Examination of the quality of multiple-choice items on classroom tests. The Canadian Journal for the Scholarship of Teaching and Learning, 2(2), 1–23. https://doi.org/10.5206/cjsotl-rcacea.2011.2.4

  • Fischer, G. H. (1972). Conditional maximum-likelihood estimations of item parameters for a linear logistic test model (Research Bulletin No. 9). Psychological Institute, University of Vienna.

  • Fischer, G. H. (1974). Einführung in die Theorie psychologischer Tests. Grundlagen und Anwendungen [Introduction to the theory of psychological tests: Basics and applications]. Huber.

  • Fischer, G. H., & Molenaar, I. W. (Eds.). (1995). Rasch models: Foundations, recent developments, and applications. Springer.

  • Fischer, G. H., & Scheiblechner, H. H. (1970). Algorithmen und Programme für das probabilistische Testmodell von Rasch [Algorithms and programs for the probabilistic Rasch model]. Psychologische Beiträge, 12, 23–51.

  • Gierl, M. J., Bulut, O., Guo, Q., & Zhang, X. (2017). Developing, analyzing, and using distractors for multiple-choice tests in education: A comprehensive review. Review of Educational Research, 87(6), 1082–1116. https://doi.org/10.3102/0034654317726529

  • Guo, J., Tay, L., & Drasgow, F. (2009). Conspiracies and test compromise: An evaluation of the resistance of test systems to small-scale cheating. International Journal of Testing, 9(4), 283–309.

  • Hahne, J. (2008). Analyzing position effects within reasoning items using the LLTM for structurally incomplete data. Psychology Science Quarterly, 50, 379–390.

  • Haladyna, T. M., Downing, S. M., & Rodriguez, M. C. (2002). A review of multiple-choice item-writing guidelines. Applied Measurement in Education, 15(3), 309–334. https://doi.org/10.1207/S15324818AME1503_5

  • Hartig, J., & Buchholz, J. (2012). A multilevel item response model for item position effects and individual persistence. Psychological Test and Assessment Modeling, 54, 418–431.

  • Hohensinn, C., & Baghaei, P. (2017). Does the position of response options in multiple-choice tests matter? Psicológica, 38, 93–109.

  • Hohensinn, C., & Kubinger, K. (2009). On varying item difficulty by changing the response format for a mathematical competence test. Austrian Journal of Statistics, 38(4), 231–239.

  • Hohensinn, C., Kubinger, K. D., Reif, M., Holocher-Ertl, S., Khorramdel, L., & Frebort, M. (2008). Examining item-position effects in large-scale assessment using the Linear Logistic Test Model. Psychology Science Quarterly, 50, 391–402.

  • Hohensinn, C., Kubinger, K. D., Reif, M., Schleicher, E., & Khorramdel, L. (2011). Analysing item position effects due to test booklet design within large-scale assessment. Educational Research and Evaluation, 17, 497–509.

  • Katz, I. R., Bennett, R. E., & Berger, A. E. (2000). Effects of response format on difficulty of SAT-mathematics items: It’s not the strategy. Journal of Educational Measurement, 37, 39–57. https://doi.org/10.1111/j.1745-3984.2000.tb01075.x

  • Kubinger, K. D. (2005). Psychological test calibration using the Rasch model – some critical suggestions on traditional approaches. International Journal of Testing, 5(4), 377–394. https://doi.org/10.1207/s15327574ijt0504_3

  • Kubinger, K. D. (2009). Applications of the Linear Logistic Test Model in psychometric research. Educational and Psychological Measurement, 69(2), 232–244. https://doi.org/10.1177/0013164408322021

  • Kubinger, K. D., & Gottschall, C. H. (2007). Item difficulty of multiple choice tests dependant on different item response formats – an experiment in fundamental research on psychological assessment. Psychology Science, 49(4), 361–374.

  • Kubinger, K. D., & Hagenmüller, B. (2019). Gruppentest zur Erfassung der Intelligenz auf Basis des AID (AID-G) [Intelligence diagnosticum for group administration, AID-G]. Hogrefe.

  • Kubinger, K. D., Reif, M., & Yanagida, T. (2011). Branched adaptive testing with a Rasch-Model-calibrated test: Analysing item presentation’s sequence effects using the Rasch-model-based LLTM. Educational Research and Evaluation, 17(5), 373–385. https://doi.org/10.1080/13803611.2011.630549

  • Mair, P., Hatzinger, R., & Maier, M. J. (2018). eRm: Extended Rasch modeling (Version 0.16-0). http://erm.r-forge.r-project.org/

  • McNamara, W. J., & Weitzman, E. (1945). The effect of choice placement on the difficulty of multiple choice questions. Journal of Educational Psychology, 36, 103–113.

  • Mittring, G., & Rost, D. H. (2008). Die verflixten Distraktoren. Über den Nutzen einer theoretischen Distraktorenanalyse bei Matrizentests (für besser Begabte und Hochbegabte) [The darn distractors. About the benefits of a theoretical distractor analysis in matrices tests (for the highly talented and gifted)]. Diagnostica, 54(4), 193–201. https://doi.org/10.1026/0012-1924.54.4.193

  • R Core Team. (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing.

  • Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests. University of Chicago Press.

  • Reif, M., & Steinfeld, J. (2017). PP: Estimation of person parameters for the 1, 2, 3, 4-PL model and the GPCM (R package version 0.6.1). The R Foundation. https://cran.r-project.org/web/packages/pp

  • Rowley, G. L., & Traub, R. E. (1977). Formula scoring, number-right scoring and test-taking strategy. Journal of Educational Measurement, 14(1), 15–22. https://doi.org/10.1111/j.1745-3984.1977.tb00024.x

  • Schroeder, J., Murphy, K. L., & Holme, T. A. (2012). Investigating factors that influence item performance on ACS exams. Journal of Chemical Education, 89, 346–350.

  • Sherriffs, A. C., & Boomer, D. S. (1954). Who is penalized by the penalty for guessing? The Journal of Educational Psychology, 45(2), 81–90. https://doi.org/10.1037/h0053756

  • Smith, J. K. (1982). Converging on correct answers: A peculiarity of multiple choice items. Journal of Educational Measurement, 19, 211–220. https://doi.org/10.1111/j.1745-3984.1982.tb00129.x

  • Snijders, T. A. B. (2001). Asymptotic null distribution of person fit statistics with estimated person parameter. Psychometrika, 66, 331–342. https://doi.org/10.1007/BF02294437

  • Thissen, A., Koch, M., Becker, N., & Spinath, F. M. (2018). Construct your own response: The cube construction task as a novel format for the assessment of spatial ability. European Journal of Psychological Assessment, 34(5), 304–311. https://doi.org/10.1027/1015-5759/a000342

  • Vigneau, F., Caissie, A. F., & Bors, D. A. (2006). Eye-movement analysis demonstrates strategic influences on intelligence. Intelligence, 34(3), 261–272. https://doi.org/10.1016/j.intell.2005.11.003

  • Weirich, S., Hecht, M., Penk, C., Roppelt, A., & Böhme, K. (2017). Item position effects are moderated by changes in test-taking effort. Applied Psychological Measurement, 41, 115–129. https://doi.org/10.1177/0146621616676791

  • Weitzman, R. A. (1970). Ideal multiple-choice items. Journal of the American Statistical Association, 65(329), 71–88. https://doi.org/10.1080/01621459.1970.10481063