Original Article

Analyse von Lernumwelten

Ansätze zur Bestimmung der Reliabilität und Übereinstimmung von Schülerwahrnehmungen

Oliver Lüdtke

Forschungsbereich Erziehungswissenschaft und Bildungssysteme, Max-Planck-Institut für Bildungsforschung, Berlin

Ulrich Trautwein

Forschungsbereich Erziehungswissenschaft und Bildungssysteme, Max-Planck-Institut für Bildungsforschung, Berlin

Mareike Kunter

Forschungsbereich Erziehungswissenschaft und Bildungssysteme, Max-Planck-Institut für Bildungsforschung, Berlin

Jürgen Baumert

Forschungsbereich Erziehungswissenschaft und Bildungssysteme, Max-Planck-Institut für Bildungsforschung, Berlin

Published Online: October 03, 2006
https://doi.org/10.1024/1010-0652.20.12.85

Abstract

Summary. Research in educational psychology on instructional quality frequently draws on student ratings. These ratings are averaged across the students in a class when the goal is a measure of the shared school environment (e.g., instructional quality) that can be related to other constructs (e.g., school achievement, school type). What often goes unexamined is the extent to which the students in a class actually agree in their perceptions of the school environment and how reliable the aggregated student perceptions are. This article draws on procedures from organizational psychology to determine the reliability and agreement of student perceptions. The application of the indices presented is illustrated with a set of scales on perceived instruction from the TIMS Study (N = 2064 students in 100 classes). Finally, the usefulness of these indices for analyzing multilevel structures is discussed, and recommendations for research practice are given.

The Analysis of Learning Environments: Approaches to Determine the Reliability and Agreement of Student Ratings

Abstract. The majority of studies in educational research rely on student ratings to assess characteristics of the learning environment. At the classroom level, the aggregated student ratings reflect perceptions of the shared learning environment, corrected for individual idiosyncrasies. Although this strategy is often applied in research on learning and instruction, neither the reliability and validity of the aggregated student ratings nor the amount of within-group agreement among the students in a class has received much investigation. The present study introduces and discusses different procedures proposed in organizational psychology for assessing the reliability and agreement of students' ratings of their instruction. The proposed indices are illustrated by reanalyzing the students' ratings of their mathematics lessons in the TIMS Study (N = 2064 students in 100 classes).
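To make the distinction between reliability and agreement concrete, here is a minimal sketch in Python. The abstract does not say which indices the article uses, so the choice here is an assumption: the intraclass correlations ICC(1) and ICC(2) from a one-way ANOVA for reliability, and the single-item r_WG index with a uniform null distribution for within-class agreement. The function names, the equal-class-size restriction, and the toy ratings are hypothetical and are not taken from the TIMS data.

```python
from statistics import mean, variance


def icc_one_way(classes):
    """Return (ICC1, ICC2) from a one-way random-effects ANOVA.

    `classes` is a list of lists, one list of student ratings per class.
    Assumption for this sketch: every class has the same number of raters.
    """
    k = len(classes[0])
    if any(len(c) != k for c in classes):
        raise ValueError("this sketch assumes equal class sizes")
    n_classes = len(classes)
    grand_mean = mean(x for c in classes for x in c)
    # Mean square between classes: spread of the class means.
    ms_between = k * sum((mean(c) - grand_mean) ** 2 for c in classes) / (n_classes - 1)
    # Mean square within classes: pooled within-class variance (equal sizes).
    ms_within = sum(variance(c) for c in classes) / n_classes
    # ICC(1): reliability of a single student's rating.
    icc1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    # ICC(2): reliability of the class mean across k students.
    icc2 = (ms_between - ms_within) / ms_between
    return icc1, icc2


def rwg(ratings, n_options):
    """Single-item r_WG: 1 minus observed variance over uniform-null variance."""
    expected_variance = (n_options ** 2 - 1) / 12  # discrete uniform over n_options
    return 1 - variance(ratings) / expected_variance


# Hypothetical ratings on a 4-point scale from three classes of four students.
toy_classes = [[1, 2, 1, 2], [3, 4, 3, 4], [2, 3, 2, 3]]
icc1, icc2 = icc_one_way(toy_classes)
print(f"ICC(1) = {icc1:.3f}, ICC(2) = {icc2:.3f}")
for ratings in toy_classes:
    print(f"r_WG = {rwg(ratings, n_options=4):.3f}")
```

ICC(1) and ICC(2) depend on how much the class means differ from one another, whereas r_WG is computed separately for each class and only asks how tightly the students in that class cluster. A class can therefore show high agreement even when the scale as a whole separates classes poorly.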

Volume 20, Issue 1/2, January 2006

ISSN: 1010-0652, eISSN: 1664-2910

Acknowledgments:

We would like to thank Alexander Robitzsch and Olaf Köller for valuable suggestions and comments on earlier versions of this article.
