Sufficient Sample Sizes for Multilevel Modeling
Abstract
An important problem in multilevel modeling is what constitutes a sufficient sample size for accurate estimation. In multilevel analysis, the major restriction is often the higher-level sample size. In this paper, a simulation study is used to determine how the sample size at the group level influences the accuracy of the estimates (regression coefficients and variances) and of their standard errors. In addition, the influence of other factors is examined: the lowest-level sample size and the distribution of variance across the levels (that is, different intraclass correlations). The results show that only a small sample size at level two (a sample of 50 or fewer groups) leads to biased estimates of the second-level standard errors. In all of the other simulated conditions, the estimates of the regression coefficients, the variance components, and the standard errors are unbiased and accurate.
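To make the role of the group-level sample size concrete, the following is a minimal sketch of a simulation in the same spirit, not the design used in the paper. It assumes a random-intercept model without predictors, balanced groups, and total variance fixed at 1. It also swaps in the one-way ANOVA estimator of the intraclass correlation in place of multilevel (maximum likelihood) estimation. The function names and all numeric settings are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_two_level(n_groups, group_size, icc, rng):
    """Draw y_ij = u_j + e_ij with total variance 1 and the given ICC.

    Assumption: intercept-only model with balanced groups.
    """
    u = rng.normal(0.0, np.sqrt(icc), size=n_groups)  # level-two effects
    e = rng.normal(0.0, np.sqrt(1.0 - icc), size=(n_groups, group_size))
    return u[:, None] + e


def anova_icc(y):
    """One-way ANOVA estimator of the ICC for balanced data."""
    n_groups, n = y.shape
    group_means = y.mean(axis=1)
    # Between-group and within-group mean squares.
    msb = n * np.sum((group_means - y.mean()) ** 2) / (n_groups - 1)
    msw = np.sum((y - group_means[:, None]) ** 2) / (n_groups * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)


def icc_sampling_summary(n_groups, group_size, icc, n_reps=500, seed=0):
    """Mean and standard deviation of the ICC estimate over replications."""
    rng = np.random.default_rng(seed)
    estimates = np.array([
        anova_icc(simulate_two_level(n_groups, group_size, icc, rng))
        for _ in range(n_reps)
    ])
    return estimates.mean(), estimates.std(ddof=1)


if __name__ == "__main__":
    # Illustrative settings only: vary the number of groups, hold the rest fixed.
    for n_groups in (10, 30, 100):
        mean, sd = icc_sampling_summary(n_groups, group_size=30, icc=0.2)
        print(f"groups={n_groups:3d}  mean ICC={mean:.3f}  SD={sd:.3f}")
```

Running the loop with different numbers of groups shows how the spread of the estimate changes with the level-two sample size while the group size and true intraclass correlation stay fixed. Assessing the bias of standard errors, as the paper does, would additionally require fitting a multilevel model in each replication and comparing the reported standard errors with the empirical spread.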