Abstract

Myths and paradoxes of classical test theory (I): About test length, reliability, and validity

Abstract. A recommendation commonly derived from classical test theory (the Spearman-Brown formula) is to assemble tests from as many items as possible. Using mathematical derivations, this article shows that the reliability and validity of a scale necessarily increase with test length only under very strict assumptions (parallel or Rasch-homogeneous items). If these assumptions are not met, lengthening a test can impair both quality criteria. Even when items are selected at random, whether a longer test yields higher or lower reliability and validity depends on the characteristics of the item pool. Consequently, a negative relationship between test length and reliability and validity can arise not only when items are selected deliberately.
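The contrast between parallel and non-parallel items can be illustrated with a small numerical sketch. It is not the derivation used in the article. It assumes a congeneric measurement model with uncorrelated errors and a true-score variance of 1, and the function names, loadings, and error variances are hypothetical values chosen for illustration.

```python
def spearman_brown(rho_item: float, k: float) -> float:
    """Spearman-Brown prediction for a test of k parallel items,
    each with reliability rho_item."""
    return k * rho_item / (1.0 + (k - 1.0) * rho_item)


def sum_score_reliability(loadings, error_variances):
    """Reliability of the unweighted sum score under an assumed congeneric
    model: item_i = loading_i * T + e_i, Var(T) = 1, uncorrelated errors."""
    true_var = sum(loadings) ** 2      # variance of the summed true-score parts
    error_var = sum(error_variances)   # variance of the summed error parts
    return true_var / (true_var + error_var)


# Three parallel items (hypothetical values): item reliability 0.64.
parallel3 = sum_score_reliability([0.8] * 3, [0.36] * 3)

# Adding a fourth parallel item raises reliability, as Spearman-Brown predicts.
parallel4 = sum_score_reliability([0.8] * 4, [0.36] * 4)

# Adding a weak, noisy fourth item (not parallel) lowers reliability instead.
mixed4 = sum_score_reliability([0.8, 0.8, 0.8, 0.1], [0.36, 0.36, 0.36, 0.99])

print(round(parallel3, 3), round(spearman_brown(0.64, 3), 3))
print(round(parallel4, 3), round(spearman_brown(0.64, 4), 3))
print(round(mixed4, 3))
```

With these assumed values, the three-item test has a reliability of about 0.84. A fourth parallel item raises it to about 0.88, while the weak fourth item lowers it to about 0.75, although the test has become longer.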


Volume 51, Issue 1, January 2005

ISSN: 0012-1924, eISSN: 2190-622X

History

Published online: January 2005

Acknowledgments:

I thank Prof. Dr. Manfred Amelang, Klaus Dieter Horlacher, Nabil Yousfi, and the two anonymous reviewers for their numerous suggestions and corrections.
