Original Article

Assessing the Quality and Effectiveness of the Factor Score Estimates in Psychometric Factor-Analytic Applications

Published Online: https://doi.org/10.1027/1614-2241/a000170

Abstract. This article proposes an approach, intended for factor-analytic (FA) psychometric applications, which aims to assess the extent to which the FA-derived individual score estimates are accurate and allow the respondents to be consistently ordered and effectively differentiated over the range of trait values that is appropriate given the purposes of the test. Three groups of properties are assessed: (a) fineness, (b) probability, and (c) range, and, within each group, different indices are proposed. Overall, the proposal is comprehensive in that it can be used with (a) different factor score estimates derived from both the linear model and the categorical variable methodology model, and (b) any type of unrestricted or restricted FA solution. All the indices proposed have been implemented in a non-commercial and widely known program for exploratory factor analysis. The usefulness of the proposal is illustrated with a real-data example in the personality domain.
