Original Article

Estimation of and Confidence Interval Formation for Reliability Coefficients of Homogeneous Measurement Instruments

Ken Kelley

Department of Management, University of Notre Dame, IN, USA

and

Ying Cheng

Department of Psychology, University of Notre Dame, IN, USA

Published Online: January 01, 2012. https://doi.org/10.1027/1614-2241/a000036

Abstract

The reliability of a composite score is a fundamental and important topic in the social and behavioral sciences. The most commonly used reliability estimate of a composite score is coefficient α. However, under regularity conditions, the population value of coefficient α is only a lower bound on the population reliability, unless the items are essentially τ-equivalent, an assumption that is likely violated in most applications. A generalization of coefficient α, termed ω, is discussed and generally recommended. Furthermore, a point estimate itself almost certainly differs from the population value. Therefore, it is important to provide confidence interval limits so as not to overinterpret the point estimate. Analytic and bootstrap methods are described in detail for confidence interval construction for ω. We go on to recommend the bias-corrected bootstrap approach for ω and provide open source and freely available R functions via the MBESS package to implement the methods discussed.
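
The lower-bound relationship between α and ω described above can be illustrated numerically. The following is a minimal sketch in Python, not the MBESS implementation: it works with population values under an assumed single-factor (homogeneous) model with uncorrelated errors, and the function names, loadings, and error variances are hypothetical choices made only for illustration. It does not estimate anything from data and does not construct a confidence interval.

```python
# Illustrative sketch only: population alpha vs. omega under an assumed
# one-factor model, Sigma = lambda * lambda' + diag(psi).
# All names and numbers below are hypothetical, not taken from the article.

def one_factor_covariance(loadings, uniquenesses):
    """Build the model-implied item covariance matrix as nested lists."""
    k = len(loadings)
    return [
        [loadings[i] * loadings[j] + (uniquenesses[i] if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]

def coefficient_alpha(cov):
    """Coefficient alpha computed from an item covariance matrix."""
    k = len(cov)
    total = sum(sum(row) for row in cov)      # variance of the sum score
    trace = sum(cov[i][i] for i in range(k))  # sum of the item variances
    return (k / (k - 1)) * (1 - trace / total)

def coefficient_omega(loadings, uniquenesses):
    """Omega for a one-factor model with uncorrelated errors."""
    common = sum(loadings) ** 2               # variance due to the factor
    return common / (common + sum(uniquenesses))

# Unequal loadings: the items are congeneric, not essentially tau-equivalent.
lam = [0.5, 0.6, 0.7, 0.8]
psi = [1 - l * l for l in lam]                # unit item variances (assumed)
alpha_unequal = coefficient_alpha(one_factor_covariance(lam, psi))
omega_unequal = coefficient_omega(lam, psi)

# Equal loadings: tau-equivalence holds, so alpha and omega coincide.
lam_eq = [0.7, 0.7, 0.7, 0.7]
psi_eq = [1 - l * l for l in lam_eq]
alpha_equal = coefficient_alpha(one_factor_covariance(lam_eq, psi_eq))
omega_equal = coefficient_omega(lam_eq, psi_eq)

print(alpha_unequal < omega_unequal)
print(abs(alpha_equal - omega_equal) < 1e-12)
```

With unequal loadings, α falls below ω; with equal loadings, the two agree. In practice both quantities are estimated from sample data, which is why the article pairs the point estimate with a confidence interval.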

Volume 8, Issue 2, August 2012

ISSN: 1614-1881, eISSN: 1614-2241

History

Accepted: November 22, 2010
