Sample Size Issues for Cluster Randomized Trials With Discrete-Time Survival Endpoints
Abstract
In cluster randomized trials, complete groups of subjects are randomized to treatment conditions. An important question may be whether and when the subjects experience a particular event, such as smoking initiation or recovery from disease. In the social sciences, the timing of such events is often measured in discrete time by using time intervals. In the planning phase of a cluster randomized trial, one should choose the number of clusters and the cluster size such that parameters are estimated accurately and sufficient power is achieved for the test of the treatment effect. On the basis of a simulation study, it is concluded that regression coefficients are estimated more accurately than the variance of the random cluster effect. In addition, it is shown that power increases with cluster size and number of clusters, and that sufficient power cannot always be achieved by using larger cluster sizes at a fixed number of clusters.
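The last finding, that larger clusters cannot always rescue power when the number of clusters is fixed, can be made plausible with a rough analogy. The sketch below is not the simulation study described above and does not use a discrete-time survival model; it only applies the standard design-effect formula for clustered data, 1 + (m − 1)ρ, under an assumed intraclass correlation ρ. The function names and the value ρ = 0.05 are illustrative assumptions.

```python
def effective_sample_size(n_clusters, cluster_size, icc):
    """Approximate number of independent subjects that a clustered sample
    is worth, using the standard design effect 1 + (m - 1) * icc.

    This is an analogy borrowed from clustered designs in general; it is
    not the power calculation of the simulation study in the abstract.
    """
    design_effect = 1.0 + (cluster_size - 1) * icc
    return n_clusters * cluster_size / design_effect


def effective_sample_size_limit(n_clusters, icc):
    """Upper bound approached as cluster size grows without limit (icc > 0)."""
    return n_clusters / icc


if __name__ == "__main__":
    icc = 0.05          # assumed intraclass correlation (hypothetical value)
    n_clusters = 20     # fixed number of clusters (hypothetical value)
    for cluster_size in (10, 50, 250, 1250):
        ess = effective_sample_size(n_clusters, cluster_size, icc)
        print(f"cluster size {cluster_size:5d}: effective n = {ess:7.1f}")
    limit = effective_sample_size_limit(n_clusters, icc)
    print(f"limit for infinitely large clusters: {limit:.1f}")
```

Under these assumptions the effective sample size keeps rising with cluster size but never exceeds the number of clusters divided by ρ, whereas adding clusters raises it without bound. This is consistent with the reported pattern, although the exact behavior for discrete-time survival endpoints would have to come from the simulation study itself.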