Abstract
Abstract. Two aspects of the reliability of scale scores on intelligence measures can be distinguished: the amount of variance in a scale score that is accounted for by all underlying cognitive abilities (composite reliability) and the degree to which the scale score reflects one specific ability (construct reliability). Composite and construct reliability are not identical if scale scores represent multidimensional composites that contain variance due to general cognitive ability (g) as well as variance due to specific cognitive abilities. In this paper, we illustrate this problem using the scales of the Berlin Intelligence Structure Test and data from a student sample (N = 910). We estimated composite and construct reliability with a method based on the model parameters of confirmatory factor analyses. Composite reliabilities were satisfactory (values ranged from .77 for memory to .93 for g). Construct reliabilities of the specific cognitive abilities, however, were inadequate regardless of the construct reliability coefficient applied (values ranged from .17 for quantitative ability to .67 for reasoning). Possible implications for the diagnosis of individuals’ cognitive abilities are discussed.
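The distinction between composite and construct reliability can be sketched numerically. The snippet below is a minimal illustration, assuming a bifactor-style decomposition with standardized indicators and a unit-weighted scale score; the loadings are hypothetical and are not the BIS-Test estimates reported in the paper. Composite reliability counts all systematic variance in the scale score, whereas construct reliability counts only the variance attributable to the targeted specific ability:

```python
def reliability_components(g_loadings, s_loadings):
    """Illustrative bifactor decomposition for a unit-weighted scale score.

    g_loadings: standardized loadings of the indicators on g
    s_loadings: standardized loadings on the specific ability factor
    Returns (composite reliability, construct reliability).
    """
    # True-score variance contributed by g and by the specific factor
    var_g = sum(g_loadings) ** 2
    var_s = sum(s_loadings) ** 2
    # Residual variance per standardized indicator: 1 - lambda_g^2 - lambda_s^2
    err = sum(1 - lg**2 - ls**2 for lg, ls in zip(g_loadings, s_loadings))
    total = var_g + var_s + err
    composite = (var_g + var_s) / total  # all systematic variance
    construct = var_s / total            # specific-ability variance only
    return composite, construct

# Hypothetical loadings: three indicators, g loadings .6, specific loadings .4
comp, cons = reliability_components([0.6, 0.6, 0.6], [0.4, 0.4, 0.4])
print(round(comp, 3), round(cons, 3))  # → 0.765 0.235
```

Even with these modest illustrative loadings, the composite reliability is respectable while the construct reliability is low, mirroring the pattern reported in the abstract: a scale score can measure *something* reliably without reliably measuring the specific ability it is named after.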