Multiple-Choice Exams at Universities?
A Literature Review and a Plea for More Practice-Oriented Research
Abstract
Abstract. Multiple-choice questions (MCQs) are particularly efficient for measuring achievement in large student groups. Owing to the high volume of examinations in the bachelor–master system, German universities are administering MCQ exams with increasing frequency. But what is the diagnostic quality of exams based on MCQs, and what benefits and drawbacks are associated with their use? In this literature review we draw four central conclusions: (1) In many cases, high-quality MCQs share similar diagnostic characteristics with constructed-response questions. (2) Effective strategies exist to address the problem of guessing on MCQs. (3) Effects of the MCQ format on learning and test-taking strategies are hardly avoidable. (4) The multiple-response and multiple-true-false formats, as well as computer-based item formats, are particularly suitable for university exams. In addition, we identify a considerable lack of research in this area and propose research desiderata so that the diagnostic value of MCQs in higher education can be reliably evaluated in the future.