Original Article

Degrees of Freedom in Multigroup Confirmatory Factor Analyses

Are Models of Measurement Invariance Testing Correctly Specified?

Published Online: https://doi.org/10.1027/1015-5759/a000500

Abstract. Measurement invariance is a key concept in psychological assessment and a fundamental prerequisite for meaningful comparisons across groups. In the prevalent approach, multigroup confirmatory factor analysis (MGCFA), specific measurement parameters are constrained to equality across groups. The degrees of freedom (df) for these models readily follow from the hypothesized measurement model and the invariance constraints. In light of research questioning the soundness of statistical reporting in psychology, we examined how often the reported df match the df recalculated from information given in the publications. More specifically, we reviewed 128 studies from six leading peer-reviewed journals focusing on psychological assessment and recalculated the df for 302 measurement invariance testing procedures. Overall, about a quarter of all articles included at least one discrepancy, with metric and scalar invariance models being the most frequently affected. We discuss moderators of these discrepancies and identify typical pitfalls in measurement invariance testing. Moreover, we provide example syntax for different methods of scaling latent variables and introduce a tool that allows for the recalculation of df in common MGCFA models to improve the statistical soundness of invariance testing in psychological research.
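The article's supplementary syntax is not reproduced on this page. As a minimal sketch of the testing procedure the abstract describes, the following R code uses lavaan (one common choice, not necessarily the authors' software) to fit the standard forward sequence of invariance models for a one-factor, four-indicator model across two groups. The data frame mydata, the grouping variable country, and the indicators x1-x4 are hypothetical placeholders.

    # Configural, metric, and scalar invariance in lavaan (illustrative sketch)
    library(lavaan)

    model <- '
      f1 =~ x1 + x2 + x3 + x4
    '

    # Configural: same factor structure, all parameters free in each group
    fit_configural <- cfa(model, data = mydata, group = "country")

    # Metric: factor loadings constrained to equality across groups
    fit_metric <- cfa(model, data = mydata, group = "country",
                      group.equal = "loadings")

    # Scalar: loadings and intercepts constrained to equality
    fit_scalar <- cfa(model, data = mydata, group = "country",
                      group.equal = c("loadings", "intercepts"))

    # Compare the nested models; the df differences should match hand calculation
    anova(fit_configural, fit_metric, fit_scalar)

Under lavaan's default marker-variable scaling with a mean structure, this example yields df = 4 (configural), 7 (metric), and 10 (scalar): equating the three free loadings saves 3 df, and equating the four intercepts while freeing the second group's latent mean saves another 3. Discrepancies between such hand recalculations and the df reported in publications are precisely what the article examines.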
