Original Article

Optimal Sample Sizes for Testing the Equivalence of Two Means

Jiin-Huarng Guo

Department of Applied Mathematics, National Pingtung University, Pingtung City, Taiwan

Hubert J. Chen

Department of Statistics, University of Georgia, Athens, GA, USA

Wei-Ming Luh

https://orcid.org/0000-0002-7043-9188

Institute of Education, National Cheng Kung University, Tainan City, Taiwan

Published Online: August 29, 2019. https://doi.org/10.1027/1614-2241/a000171

Abstract

Equivalence tests (also known as similarity or parity tests) have become increasingly popular as a complement to equality tests. However, in testing the equivalence of two population means, the approximate sample sizes produced by conventional techniques in the literature are usually too small and therefore yield less statistical power than required. In this paper, the authors first explain the reason for this problem and then provide a solution that uses an exhaustive local search algorithm to find the optimal sample size. The proposed method is not only accurate but also flexible: unequal variances or different sampling unit costs across groups can be accommodated through different sample size allocations. Figures and a numerical example are presented to demonstrate various configurations. An R Shiny App is also available for easy use (https://optimal-sample-size.shinyapps.io/equivalence-of-means/).
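To make the idea of searching over sample size allocations concrete, the following is a minimal sketch and not the authors' algorithm. It assumes known standard deviations and a normal approximation to the power of the two one-sided tests (TOST) procedure, which is exactly the kind of approximation the abstract warns can understate the required sample size; an exact calculation would replace `tost_power_normal`. All function and parameter names (`tost_power_normal`, `cheapest_design`, `delta`, `cost1`, `cost2`, `n_max`) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, stdlib only


def tost_power_normal(n1, n2, sigma1, sigma2, delta, true_diff=0.0, alpha=0.05):
    """Approximate TOST power for equivalence margin (-delta, delta).

    Assumption: standard deviations are treated as known, so the test
    statistics are normal rather than (noncentral) t distributed.
    """
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z = _STD_NORMAL.inv_cdf(1 - alpha)
    upper = (delta - true_diff) / se - z
    lower = (-delta - true_diff) / se + z
    # The rejection region is empty when upper <= lower, so power is 0 there.
    return max(0.0, _STD_NORMAL.cdf(upper) - _STD_NORMAL.cdf(lower))


def cheapest_design(sigma1, sigma2, delta, cost1=1.0, cost2=1.0,
                    target_power=0.80, alpha=0.05, true_diff=0.0, n_max=500):
    """Exhaustively search (n1, n2) for the lowest total cost meeting target_power.

    Returns (total_cost, n1, n2), or None if no design up to n_max qualifies.
    Assumes both unit costs are positive.
    """
    best = None
    for n1 in range(2, n_max + 1):
        for n2 in range(2, n_max + 1):
            cost = cost1 * n1 + cost2 * n2
            if best is not None and cost >= best[0]:
                break  # larger n2 only costs more for this n1
            power = tost_power_normal(n1, n2, sigma1, sigma2, delta,
                                      true_diff, alpha)
            if power >= target_power:
                best = (cost, n1, n2)
                break  # smallest feasible n2 is the cheapest for this n1
    return best


if __name__ == "__main__":
    # Equal variances and equal unit costs.
    print(cheapest_design(sigma1=1.0, sigma2=1.0, delta=0.5))
    # Group 1 is four times as expensive to sample as group 2.
    print(cheapest_design(sigma1=1.0, sigma2=1.0, delta=0.5, cost1=4.0, cost2=1.0))
```

Because the search enumerates integer allocations directly, unequal variances or unit costs simply change which pair (n1, n2) is cheapest; no closed-form allocation ratio is needed.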


Volume 15, Issue 3, August 2019

ISSN: 1614-1881, eISSN: 1614-2241

History

Received: October 27, 2017
Revised: September 1, 2018
Accepted: May 13, 2019
Published online: August 29, 2019

Acknowledgments:

We wish to thank Prof. Yu-Sheng Hsu at Georgia State University for his helpful comments, T. Y. Chao for his technical assistance, and the editor and reviewers for their insightful suggestions.

Funding:

The research was supported by a National Science Council grant, Taiwan (NSC97-2118-M-153-001), and a 2009-2010 Fulbright Foundation grant, USA, to Jiin-Huarng Guo; by a National Science Council grant, Taiwan (NSC98-2118-M-006-001), to Hubert J. Chen; and by a National Science Council grant, Taiwan (NSC104-2410-H-006-015), to Wei-Ming Luh.
