Effect Size Estimation From t-Statistics in the Presence of Publication Bias
A Brief Review of Existing Approaches With Some Extensions
Abstract
Publication bias hampers the estimation of true effect sizes. Specifically, effect sizes are systematically overestimated when studies report only significant results. In this paper we show how this overestimation depends on the true effect size and on the sample size. Furthermore, we review and extend methods originally suggested by Hedges (1984), Iyengar and Greenhouse (1988), and Rust, Lehmann, and Farley (1990) that allow the true effect size to be estimated from published test statistics (e.g., from the t-values of reported significant results). Moreover, we adapt these methods so that meta-analysts can estimate the percentage of researchers in a research domain who consign undesired results to the file drawer. We also apply the same logic to the case in which significant results tend to be underreported. We demonstrate the application of these procedures for conventional one-sample and two-sample t-tests. Finally, we provide R and MATLAB versions of a computer program to estimate the true unbiased effect size and the prevalence of publication bias in the literature.
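The overestimation described in the abstract can be illustrated with a small simulation. The Python sketch below is not the authors' R or MATLAB program. It assumes a one-sample design with unit-variance normal data and a strict file drawer that publishes only studies with t above the critical value. The function name, its parameters, and the chosen values (true d = 0.2, n = 20) are hypothetical and serve only as an illustration.

```python
import math
import random
import statistics


def simulate_published_effect(true_d, n, t_crit, n_studies=5000, seed=1):
    """Monte Carlo sketch: mean observed effect size with and without selection.

    Assumptions (not taken from the paper): one-sample design, normal data
    with standard deviation 1, and a file drawer that keeps only studies
    whose t-statistic exceeds t_crit.
    """
    rng = random.Random(seed)
    all_d, published_d = [], []
    for _ in range(n_studies):
        sample = [rng.gauss(true_d, 1.0) for _ in range(n)]
        d_obs = statistics.fmean(sample) / statistics.stdev(sample)
        t_obs = d_obs * math.sqrt(n)  # one-sample t-statistic
        all_d.append(d_obs)
        if t_obs > t_crit:  # only significant results get "published"
            published_d.append(d_obs)
    return statistics.fmean(all_d), statistics.fmean(published_d), len(published_d)


# t_crit = 2.093 is the two-sided 5% critical value for df = 19 (n = 20).
mean_all, mean_published, n_published = simulate_published_effect(
    true_d=0.2, n=20, t_crit=2.093
)
print(f"all studies:      mean d = {mean_all:.2f}")
print(f"significant only: mean d = {mean_published:.2f} ({n_published} studies)")
```

Under these assumptions every published study must show an observed d above t_crit / sqrt(n), which is about 0.47 here, so the mean published effect necessarily exceeds the true value of 0.2. Raising `n` or `true_d` lowers that threshold relative to the true effect and shrinks the bias.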
References
(2017). Sample-size planning for more accurate statistical power: A method adjusting sample effect sizes for publication bias and uncertainty. Psychological Science, 28, 1547–1562. https://doi.org/10.1177/0956797617723724
(2005). Fail-safe N or file-drawer number. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment and adjustments (pp. 111–125). Chichester, UK: Wiley. https://doi.org/10.1002/0470870168.ch7
(1988). Publication bias: A problem in interpreting medical data. Journal of the Royal Statistical Society. Series A (Statistics in Society), 151, 419–463. https://doi.org/10.2307/2982993
(1994). Operating characteristics of a rank correlation test for publication bias. Biometrics, 50, 1088–1101. Retrieved from http://www.jstor.org/stable/2533446
(2003). Implications of the tobacco industry documents for public health and policy. Annual Review of Public Health, 24, 267–288. https://doi.org/10.1146/annurev.publhealth.24.100901.140813
(2009). Introduction to meta-analysis. Chichester, UK: Wiley. https://doi.org/10.1002/9780470743386
(2016). p-curve and p-hacking in observational research. PLoS One, 11, e0149144. https://doi.org/10.1371/journal.pone.0149144
(2017). A parsimonious weight function for modeling publication bias. Psychological Methods, 22, 28–41. https://doi.org/10.1037/met0000119
(2015). Publication bias as a function of study characteristics. Psychological Methods, 20, 310–330. https://doi.org/10.1037/met0000046
(1988). Statistical power analysis for the behavioral sciences (Vol. 2). Hillsdale, NJ: Erlbaum.
(1992). A power primer. Psychological Bulletin, 112, 155–159. https://doi.org/10.1037/0033-2909.112.1.155
(1997). Finding the missing science: The fate of studies submitted for review by a human subjects committee. Psychological Methods, 2, 447–452. https://doi.org/10.1037/1082-989X.2.4.447
(1986). Effect of positive findings on submission and acceptance rates: A note on meta-analysis bias. Professional Psychology: Research and Practice, 17, 136–137. https://doi.org/10.1037/0735-7028.17.2.136
(2015). A surge of p-values between 0.041 and 0.049 in recent decades (but negative results are increasing rapidly too). PeerJ, 3, e733. https://doi.org/10.7717/peerj.733
(2005). Publication bias: Recognizing the problem, understanding its origins and scope, and preventing harm. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment and adjustments (pp. 9–33). Chichester, UK: Wiley.
(2015). Does publication bias inflate the apparent efficacy of psychological treatment for major depressive disorder? A systematic review and meta-analysis of US National Institutes of Health-funded trials. PLoS One, 10, e0137864. https://doi.org/10.1371/journal.pone.0137864
(2005). The trim and fill method. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment and adjustments (pp. 127–144). Chichester, UK: Wiley.
(2000). Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56, 455–463. https://doi.org/10.1111/j.0006-341x.2000.00455.x
(1997). Bias in meta-analysis detected by a simple, graphical test. British Medical Journal, 315, 629–634. https://doi.org/10.1136/bmj.315.7109.629
(2016). A Bayesian perspective on the reproducibility project: Psychology. PLoS One, 11, 1–13. https://doi.org/10.1371/journal.pone.0149794
(2010a). Do pressures to publish increase scientists’ bias? An empirical support from US states data. PLoS One, 5, 1–7. https://doi.org/10.1371/journal.pone.0010271
(2010b). “Positive” results increase down the hierarchy of the sciences. PLoS One, 5, 1–10. https://doi.org/10.1371/journal.pone.0010068
(2012). Negative results are disappearing from most disciplines and countries. Scientometrics, 90, 891–904. https://doi.org/10.1007/s11192-011-0494-7
(1915). Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika, 10, 507–521. https://doi.org/10.2307/2331838
(2014). The frequency of excess success for articles in Psychological Science. Psychonomic Bulletin & Review, 21, 1180–1187. https://doi.org/10.3758/s13423-014-0601-x
(2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345, 1502–1505. https://doi.org/10.1126/science.1255484
(2004). Meta-analysis of intellectual and neuropsychological test performance in attention-deficit/hyperactivity disorder. Neuropsychology, 18, 543–555. https://doi.org/10.1037/0894-4105.18.3.543
(2015). The distribution of probability values in medical abstracts: An observational study. BMC Research Notes, 8, 721. https://doi.org/10.1186/s13104-015-1691-x
(1975). Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82, 1–20. https://doi.org/10.1037/h0076157
(2016). A Bayesian approach to mitigation of publication bias. Psychonomic Bulletin & Review, 23, 74–86. https://doi.org/10.3758/s13423-015-0868-6
(2015). The extent and consequences of p-hacking in science. PLoS Biology, 13, e1002106. https://doi.org/10.1371/journal.pbio.1002106
(1981). Distribution theory for Glass’s estimator of effect size and related estimators. Journal of Educational and Behavioral Statistics, 6, 107–128. https://doi.org/10.3102/10769986006002107
(1984). Estimation of effect size under nonrandom sampling: The effects of censoring studies yielding statistically insignificant mean differences. Journal of Educational and Behavioral Statistics, 9, 61–85. https://doi.org/10.3102/10769986009001061
(1992). Modeling publication selection effects in meta-analysis. Statistical Science, 7, 246–255. https://doi.org/10.1214/ss/1177011364
(2017). Plausibility and influence in selection models: A comment on Citkowicz and Vevea (2017). Psychological Methods, 22, 42–46. https://doi.org/10.1037/met0000108
(1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.
(1996). Estimating effect size under publication bias: Small sample properties and robustness of a random effects selection model. Journal of Educational and Behavioral Statistics, 21, 299–332. https://doi.org/10.3102/10769986021004299
(2005). Selection method approaches. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment and adjustments (pp. 145–174). Chichester, UK: Wiley.
(2000). Shameful science: Four decades of the German tobacco industry’s hidden research on smoking and health. Tobacco Control, 9, 242–248. https://doi.org/10.1136/tc.9.2.242
(2005). Introduction to mathematical statistics (6th ed.). New Delhi, India: Pearson Prentice Hall.
(2012). The behavior of the p-value when the alternative hypothesis is true. Biometrics, 53, 11–22. Retrieved from http://www.jstor.org/stable/2533093
(1990). Gender differences in mathematics performance – A meta-analysis. Psychological Bulletin, 107, 139–155. https://doi.org/10.1037//0033-2909.107.2.139
(2005). Why most published research findings are false. PLoS Medicine, 2, e124. https://doi.org/10.1371/journal.pmed.0020124
(2014). How to make more published research true. PLoS Medicine, 11, e1001747. https://doi.org/10.1371/journal.pmed.1001747
(2015). Failure to replicate: Sound the alarm. Cerebrum cer-12-a-1(Nov–Dec), 1–12.
(1988). Selection models and the file drawer problem. Statistical Science, 3, 133–135. https://doi.org/10.1214/ss/1177013019
(2015). Statistical methods for dealing with publication bias in meta-analysis. Statistics in Medicine, 34, 343–360. https://doi.org/10.1002/sim.6342
(2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23, 524–532. https://doi.org/10.1177/0956797611430953
(1994). Continuous univariate distributions (Vol. 2). New York, NY: Wiley.
(2015). Evidential value that exercise improves BMI z-score in overweight and obese children and adolescents. BioMed Research International, 2015, 1–5. https://doi.org/10.1155/2015/151985
(2003). Mathematics interventions for children with special educational needs. Remedial and Special Education, 24, 97–114.
(2015a). On the challenges of drawing conclusions from p-values just below 0.05. PeerJ, 3, e1142. https://doi.org/10.7717/peerj.1142
(2015b). What p-hacking really looks like: A comment on Masicampo and LaLande (2012). The Quarterly Journal of Experimental Psychology, 68, 829–832. https://doi.org/10.1080/17470218.2014.982664
(1978). Estimating effect size: Bias resulting from the significance criterion in editorial decisions. British Journal of Mathematical and Statistical Psychology, 31, 107–112.
(2006). Evidence based medicine: The case of the misleading funnel plot. British Medical Journal, 333, 597–600. https://doi.org/10.1136/bmj.333.7568.597
(1984). Summing up: The science of reviewing research. Cambridge, MA: Harvard University Press.
(2013). Overestimated effect of Epo administration on aerobic exercise capacity: A meta-analysis. American Journal of Sports Science and Medicine, 1, 17–27. https://doi.org/10.12691/ajssm-1-2-2
(2012). A peculiar prevalence of p values just below .05. The Quarterly Journal of Experimental Psychology, 65, 2271–2279. https://doi.org/10.1080/17470218.2012.711335
(2014). You cannot step into the same river twice: When power analyses are optimistic. Perspectives on Psychological Science, 9, 612–625. https://doi.org/10.1177/1745691614548513
(2016). Adjusting for publication bias in meta-analysis: An evaluation of selection methods and some cautionary notes. Perspectives on Psychological Science, 11, 730–749. https://doi.org/10.1177/1745691616662243
(2011). Aggregate and individual replication probability within an explicit model of the research process. Psychological Methods, 16, 337–360. https://doi.org/10.1037/a0023347
(2009). Is the mirror neuron system involved in imitation? A short review and meta-analysis. Neuroscience and Biobehavioral Reviews, 33, 975–980. https://doi.org/10.1016/j.neubiorev.2009.03.010
(1964). A simplex method for function minimization. The Computer Journal, 7, 308–313.
(2015). Restoring Study 329: Efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. British Medical Journal, 351, h4320. https://doi.org/10.1136/bmj.h4320
(2015). Estimating the reproducibility of psychological science. Science, 349, aac4716. https://doi.org/10.1126/science.aac4716
(2012). Is the replicability crisis overblown? Three arguments examined. Perspectives on Psychological Science, 7, 531–536. https://doi.org/10.1177/1745691612463401
(2007). Performance of the trim and fill method in the presence of publication bias and between-study heterogeneity. Statistics in Medicine, 26, 4544–4562. https://doi.org/10.1002/sim.2889
(2003). One hundred years of social psychology quantitatively described. Review of General Psychology, 7, 331–363. https://doi.org/10.1037/1089-2680.7.4.331
(1996). Measures of effect size. Behavior Research Methods, Instruments, & Computers, 28, 12–22. https://doi.org/10.3758/BF03203631
(1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86, 638–641. https://doi.org/10.1037/0033-2909.86.3.638
(1963). The interpretation of levels of significance by psychological researchers. The Journal of Psychology, 55, 33–38. https://doi.org/10.1080/00223980.1963.9916596
(1980). Introduction to probability models (2nd ed.). New York, NY: Academic Press.
(2005). Publication bias in meta-analysis: Prevention, assessment and adjustments. Chichester, UK: Wiley.
(1990). Estimating publication bias in meta-analysis. Journal of Marketing Research, 27, 220–226. Retrieved from http://www.jstor.org/stable/3172848
(2000). The effects of psychological therapies under clinically representative conditions: A meta-analysis. Psychological Bulletin, 126, 512–529. https://doi.org/10.1037//0033-2909.126.4.512
(2008). Vectorized adaptive quadrature in MATLAB. Journal of Computational and Applied Mathematics, 211, 131–140. https://doi.org/10.1016/j.cam.2006.11.021
(2015). Romance, risk, and replication: Can consumer choices and risk-taking be primed by mating motives? Journal of Experimental Psychology: General, 144, 142–158. https://doi.org/10.1037/xge0000116
(2014). p-Curve: A key to the file-drawer. Journal of Experimental Psychology: General, 143, 534–547. https://doi.org/10.1037/a0033242
(2014). p-Curve and effect size: Correcting for publication bias using only significant results. Perspectives on Psychological Science, 9, 666–681. https://doi.org/10.1177/1745691614553988
(1964). The importance of negative results in psychological research. The Canadian Psychologist, 5, 225–232. https://doi.org/10.1037/h0083036
(1959). Publication decisions and their possible effects on inferences drawn from tests of significance – or vice versa. Journal of the American Statistical Association, 54, 30–34. https://doi.org/10.1080/01621459.1959.10501497
(1995). Publication decisions revisited: The effect of the outcome of statistical tests on the decision to publish and vice versa. The American Statistician, 49, 108–112. https://doi.org/10.1080/00031305.1995.10476125
(2005). The funnel plot. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment and adjustments (pp. 75–98). Chichester, UK: Wiley.
(2005). Regression methods to detect publication and other bias in meta-analysis. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment and adjustments (pp. 99–110). Chichester, UK: Wiley.
(2005). Evidence concerning the consequences of publication and related biases. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment and adjustments (pp. 175–192). Chichester, UK: Wiley.
(2000). Modelling publication bias in meta-analysis: A review. Statistical Methods in Medical Research, 9, 421–445. https://doi.org/10.1191/096228000701555244
(2017). Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. PLoS Biology, 15, 1–18. https://doi.org/10.1371/journal.pbio.2000797
(2000). Publication bias in meta-analysis: Its causes and consequences. Journal of Clinical Epidemiology, 53, 207–216. https://doi.org/10.1016/S0895-4356(99)00161-4
(2017). Some properties of the p-curves, with an application to gradual publication bias. Psychological Methods. Advance Online Publication. https://doi.org/10.1037/met0000125
(2015). Underpowered samples, false negatives, and unconscious learning. Psychonomic Bulletin & Review, 23, 87–102. https://doi.org/10.3758/s13423-015-0892-6
(2014). Why publishing everything is more effective than selective publishing of statistically significant results. PLoS One, 9, e84896. https://doi.org/10.1371/journal.pone.0084896
(2015). Meta-analysis using effect size distributions of only statistically significant studies. Psychological Methods, 20, 293–309. https://doi.org/10.1037/met0000025
(1995). A general linear model for estimating effect size in the presence of publication bias. Psychometrika, 60, 419–435. https://doi.org/10.1007/BF02294384
(2005). Publication bias in research synthesis: Sensitivity analysis using a priori weight functions. Psychological Methods, 10, 428–443. https://doi.org/10.1037/1082-989X.10.4.428