Abstract
The item selection rule (ISR) most commonly used in computerized adaptive testing (CAT) is to administer the item with maximum Fisher information at the current trait estimate (point Fisher information, PFI). Several alternative ISRs have been proposed. Among them, Fisher information considered over an interval (FI*I), Fisher information weighted by the likelihood function (FI*L), Kullback-Leibler information considered over an interval (KL*I), and Kullback-Leibler information weighted by the likelihood function (KL*L) have shown greater precision of trait estimation at the early stages of a CAT. A new ISR is proposed, Fisher information by interval with geometric mean (FI*IG), which aims to rectify some problems detected in FI*I. We evaluate estimation accuracy and item bank security for these six ISRs. FI*IG is the only ISR that simultaneously outperforms PFI on both variables. For the remaining ISRs there appears to be a trade-off between accuracy and security: PFI shows the worst accuracy and the greatest security, whereas the ISRs that use the likelihood function show the reverse pattern.
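To make the contrast between the point-based and interval-based rules concrete, the selection step can be sketched for a 2PL item bank. The specifics below (item information I(θ) = a²P(1−P), a fixed symmetric integration interval of half-width delta, and a uniform quadrature grid) are illustrative assumptions for this sketch, not the exact specification evaluated in the study:

```python
import numpy as np

def prob_2pl(theta, a, b):
    """2PL item response function P(theta)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = prob_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

def select_pfi(theta_hat, a, b, administered=()):
    """PFI: pick the unused item with maximum information at theta_hat."""
    score = fisher_info(theta_hat, a, b)
    score[list(administered)] = -np.inf
    return int(np.argmax(score))

def select_fi_interval(theta_hat, a, b, administered=(), delta=1.0, n_q=41):
    """FI*I: mean information (proportional to the integral) over
    [theta_hat - delta, theta_hat + delta]."""
    grid = np.linspace(theta_hat - delta, theta_hat + delta, n_q)
    score = fisher_info(grid[:, None], a, b).mean(axis=0)
    score[list(administered)] = -np.inf
    return int(np.argmax(score))

def select_fi_interval_geo(theta_hat, a, b, administered=(), delta=1.0, n_q=41):
    """FI*IG (illustrative form): geometric mean of information over the
    interval, which penalizes items whose information collapses anywhere
    in it, unlike the arithmetic mean used by FI*I."""
    grid = np.linspace(theta_hat - delta, theta_hat + delta, n_q)
    score = np.exp(np.log(fisher_info(grid[:, None], a, b)).mean(axis=0))
    score[list(administered)] = -np.inf
    return int(np.argmax(score))
```

With equal discriminations and difficulties b = (−1, 0, 1), all three rules agree at θ̂ = 0; they diverge once discriminations vary or the interval is wide relative to the uncertainty in θ̂, which is where the early-stage differences among ISRs described above arise.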