Original Article

Eine Einführung in die Plausible-Values-Technik für die psychologische Forschung

Oliver Lüdtke

Leibniz-Institut für Pädagogik der Naturwissenschaften und Mathematik, Kiel

Zentrum für internationale Bildungsvergleichsstudien (ZIB), München


and

Alexander Robitzsch

Leibniz-Institut für Pädagogik der Naturwissenschaften und Mathematik, Kiel

Zentrum für internationale Bildungsvergleichsstudien (ZIB), München


Published online: February 28, 2017
https://doi.org/10.1026/0012-1924/a000175

Abstract

An Introduction to the Plausible Value Technique for Psychological Research

Abstract. In psychological research, most measures used to assess constructs are affected by measurement error. Measurement error leads to biased estimates of population parameters and their standard errors. Over the past few decades, the plausible values technique has become established in large-scale assessment studies as a procedure for correcting relationships between latent variables and observed covariates that are distorted by measurement error. The present article introduces this complex statistical technique using a simple example from classical test theory. It shows that alternative procedures for estimating person parameters generally result in biased estimates of relationships at the population level. A simulation study extends these findings to an item response theory (IRT) model for dichotomous indicators. From a diagnostic perspective, the article emphasizes that plausible values should not be used to estimate individual ability levels and are not appropriate for individual psychological assessment. Finally, methodological challenges in applying the plausible values technique and its potential for psychological research are discussed.
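The central claim of the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' own example: it assumes a toy classical test theory model with a standard normal covariate X, a true score theta with unit variance, and an observed score Y = theta + error. The parameter values (beta = 0.5, reliability = 0.6) and the function names `pearson` and `simulate` are hypothetical choices made for this illustration. The population parameters are treated as known here, whereas in practice they would have to be estimated.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=50_000, beta=0.5, reliability=0.6, seed=1):
    """Correlate three kinds of person values with a covariate X.

    Assumed toy model (not taken from the article):
      X ~ N(0, 1), theta = beta * X + residual with Var(theta) = 1,
      Y = theta + error with Var(theta) / Var(Y) = reliability.
    """
    rng = random.Random(seed)
    tau2 = 1.0 - beta ** 2                      # Var(theta | X)
    err2 = (1.0 - reliability) / reliability    # error variance of Y
    post_var = 1.0 / (1.0 / tau2 + 1.0 / err2)  # Var(theta | Y, X)

    xs, thetas, ys, shrunken, pvs = [], [], [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        theta = beta * x + rng.gauss(0.0, math.sqrt(tau2))
        y = theta + rng.gauss(0.0, math.sqrt(err2))
        # Kelley-type estimate: shrink Y toward the grand mean (0), ignoring X.
        shrunken.append(reliability * y)
        # Plausible value: one random draw from the posterior of theta
        # given Y and X, using the known (assumed) population parameters.
        post_mean = post_var * (beta * x / tau2 + y / err2)
        pvs.append(rng.gauss(post_mean, math.sqrt(post_var)))
        xs.append(x)
        thetas.append(theta)
        ys.append(y)

    return {
        "true": pearson(thetas, xs),
        "observed": pearson(ys, xs),
        "shrunken": pearson(shrunken, xs),
        "plausible": pearson(pvs, xs),
    }


if __name__ == "__main__":
    for name, r in simulate().items():
        print(f"{name:>9}: r = {r:.3f}")
```

Under these assumptions, the correlation of X with the observed score is attenuated toward beta times the square root of the reliability. The shrunken estimate is a linear rescaling of Y and therefore has the same attenuated correlation. The plausible values, which condition on X, should recover a correlation close to beta. A single plausible value is still a random draw, which is why such values describe population relationships and not individual ability levels.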

Volume 63, Issue 3, July 2017

ISSN: 0012-1924, eISSN: 2190-622X

Acknowledgments:

This work was carried out within the Zentrum für internationale Bildungsvergleichsstudien (ZIB), which is funded by the German Federal Ministry of Education and Research (BMBF) and the Kultusministerkonferenz (Standing Conference of the Ministers of Education and Cultural Affairs).
