Degrees of Freedom in Multigroup Confirmatory Factor Analyses
Are Models of Measurement Invariance Testing Correctly Specified?
Abstract
Measurement invariance is a key concept in psychological assessment and a fundamental prerequisite for meaningful comparisons across groups. In the prevalent approach, multigroup confirmatory factor analysis (MGCFA), specific measurement parameters are constrained to equality across groups. The degrees of freedom (df) for these models readily follow from the hypothesized measurement model and the invariance constraints. In light of research questioning the soundness of statistical reporting in psychology, we examined how often reported df match the df recalculated from the information given in the publications. More specifically, we reviewed 128 studies from six leading peer-reviewed journals focusing on psychological assessment and recalculated the df for 302 measurement invariance testing procedures. Overall, about a quarter of all articles included at least one discrepancy, with metric and scalar invariance models affected most frequently. We discuss moderators of these discrepancies and identify typical pitfalls in measurement invariance testing. Moreover, we provide example syntax for different methods of scaling latent variables and introduce a tool that allows for the recalculation of df in common MGCFA models to improve the statistical soundness of invariance testing in psychological research.
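The df bookkeeping described above can be illustrated with a minimal sketch. The function below is a hypothetical helper, not the tool introduced in the article; it assumes a single-factor model with marker-variable scaling (first loading fixed to 1, factor means fixed to 0 in the configural model) and mean structures included, so that each group contributes p(p+1)/2 (co)variances plus p means.

```python
def mgcfa_df(p, g, level="configural"):
    """Model df for a single-factor MGCFA with p indicators and g groups.

    Assumes marker-variable scaling: the first loading is fixed to 1 in
    every group, and factor means are fixed to 0 in the configural model.
    """
    # Observed moments: p(p+1)/2 (co)variances + p means, per group.
    moments = g * (p * (p + 1) // 2 + p)
    if level == "configural":
        # Per group: (p-1) free loadings + 1 factor variance
        # + p intercepts + p residual variances = 3p parameters.
        params = g * 3 * p
    elif level == "metric":
        # Loadings equated across groups; factor variances stay free.
        params = (p - 1) + g * (1 + p + p)
    elif level == "scalar":
        # Intercepts also equated; factor means freed in g-1 groups.
        params = (p - 1) + p + g * (1 + p) + (g - 1)
    else:
        raise ValueError(f"unknown invariance level: {level}")
    return moments - params
```

For example, with four indicators and two groups this yields df = 4 (configural), 7 (metric), and 10 (scalar); each invariance step adds (g − 1)(p − 1) df, which is one of the checks a reader can run against reported values.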
References
(2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
(2012). The rules of the game called psychological science. Perspectives on Psychological Science, 7, 543–554. https://doi.org/10.1177/1745691612459060
(2011). The (mis)reporting of statistical results in psychology journals. Behavior Research Methods, 43, 666–678. https://doi.org/10.3758/s13428-011-0089-5
(2014). Latent variable modeling using R: A step by step guide. New York, NY: Routledge/Taylor & Francis Group.
(2006a). When does measurement invariance matter? Medical Care, 44, 176–181. https://doi.org/10.1097/01.mlr.0000245143.08679.cc
(2006b). The attack of the psychometricians. Psychometrika, 71, 425–440. https://doi.org/10.1007/s11336-006-1447-6
(2017). The seven deadly sins of Psychology: A manifesto for reforming the culture of scientific practice. Princeton, NJ: Princeton University Press.
(2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233–255. https://doi.org/10.1207/S15328007SEM0902_5
(2017). Degrees of freedom in SEM: Are we testing the models that we claim to test? Organizational Research Methods, 20, 350–378. https://doi.org/10.1177/1094428116676345
(2014). The new statistics: Why and how. Psychological Science, 25, 7–29. https://doi.org/10.1177/0956797613504966
(2014). Business not as usual. Psychological Science, 25, 3–6. https://doi.org/10.1177/0956797613512465
(2013). Data analysis with Mplus. New York, NY: Guilford Press.
(2006). A non-arbitrary method of identifying and scaling latent variables in SEM and MACS models. Structural Equation Modeling, 13, 59–72. https://doi.org/10.1207/s15328007sem1301_3
(2013). Frontiers of test validity theory: Measurement, causation, and meaning. New York, NY: Routledge.
(2001). When trivial constraints are not trivial: The choice of uniqueness constraints in confirmatory factor analysis. Structural Equation Modeling, 8, 1–17. https://doi.org/10.1207/S15328007SEM0801_1
(2011). Statistical approaches to measurement invariance. New York, NY: Routledge.
(1998–2017). Mplus user’s guide (8th ed.). Los Angeles, CA: Muthén & Muthén.
(2018). Psychology’s renaissance. Annual Review of Psychology, 69, 511–534. https://doi.org/10.1146/annurev-psych-122216-011836
(2015). Promoting an open research culture. Science, 348, 1422–1425. https://doi.org/10.1126/science.aab2374
(2016). Synthpop: Bespoke creation of synthetic data in R. Journal of Statistical Software, 74, 1–26. https://doi.org/10.18637/jss.v074.i11
(2016). The prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods, 48, 1205–1226. https://doi.org/10.3758/s13428-015-0664-2
(2017). Measurement equivalence: A non-technical primer on categorical multi-group confirmatory factor analysis in school psychology. Journal of School Psychology, 60, 65–82. https://doi.org/10.1016/j.jsp.2016.11.002
(2007). Sharing detailed research data is associated with increased citation rate. PLoS One, 2, e308. https://doi.org/10.1371/journal.pone.0000308
(2018). R: A language and environment for statistical computing. [Computer software]. Retrieved from https://www.r-project.org/
(2002). Measurement equivalence: A comparison of methods based on confirmatory factor analysis and item response theory. Journal of Applied Psychology, 87, 517–529. https://doi.org/10.1037/0021-9010.87.3.517
(2001). Testing for metric invariance using structural equation models: Solving the standardization problem. In C. A. Schriesheim & L. L. Neider (Eds.), Equivalence in measurement (Research in management) (pp. 21–50). Greenwich, CT: Information Age.
(2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48, 1–36. https://doi.org/10.18637/jss.v048.i02
(2017). Statcheck does not work: All the numbers. Reply to Nuijten et al. (2017). https://doi.org/10.31234/osf.io/hr6qy
(2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359–1366. https://doi.org/10.1177/0956797611417632
(2013). Just post it: The lesson from two cases of fabricated data detected by statistics alone. Psychological Science, 24, 1875–1888. https://doi.org/10.1177/0956797613480366
(2018). Using OSF to share data: A step-by-step guide. Advances in Methods and Practices in Psychological Science, 1, 115–120. https://doi.org/10.1177/2515245918757689
(2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3, 4–70. https://doi.org/10.1177/109442810031002
(2016). The importance of measurement invariance in neurocognitive ability testing. The Clinical Neuropsychologist, 30, 1006–1016. https://doi.org/10.1080/13854046.2016.1205136
(2010). Measurement invariance in confirmatory factor analysis: An illustration using IQ test performance of minorities. Educational Measurement: Issues and Practice, 29, 39–47. https://doi.org/10.1111/j.1745-3992.2010.00182.x
(2012). Using the margins command to estimate and interpret adjusted predictions and marginal effects. Stata Journal, 12, 308–331.
(2007). Detecting violations of factorial invariance using data-based specification searches: A Monte Carlo study. Structural Equation Modeling, 14, 435–463. https://doi.org/10.1080/10705510701301677