Abstract
Most questions across science call for quantitative answers: ideally, a single best estimate plus information about the precision of that estimate. A confidence interval (CI) expresses both efficiently. Early experimental psychologists sought quantitative answers, but for the last half century psychology has been dominated by the nonquantitative, dichotomous thinking of null hypothesis significance testing (NHST). The authors argue that psychology should rejoin mainstream science by asking better questions, those that demand quantitative answers, and by using CIs to answer them. They explain CIs and a range of ways to think about them and to use them in interpreting data, especially by treating CIs as prediction intervals, which provide information about replication. They explain how to calculate CIs on means, proportions, correlations, and standardized effect sizes, and illustrate symmetric and asymmetric CIs. They also argue that the information provided by CIs is more useful than that provided by p values, or by values of Killeen's p_rep, the probability of replication.
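As a rough illustration of the asymmetric CIs mentioned above, the following minimal sketch computes a 95% CI on a proportion and on a correlation. It is not taken from the article: it assumes the Wilson score interval for the proportion and the Fisher z transformation for the correlation, two common textbook methods, and the function names (z_crit, wilson_ci, fisher_r_ci) are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def z_crit(level=0.95):
    """Two-sided standard normal critical value for the given confidence level."""
    return NormalDist().inv_cdf(0.5 + level / 2)


def wilson_ci(successes, n, level=0.95):
    """Wilson score CI on a proportion (assumed method, not necessarily the article's).

    The interval is generally asymmetric around the observed proportion,
    most visibly when that proportion is near 0 or 1.
    """
    z = z_crit(level)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


def fisher_r_ci(r, n, level=0.95):
    """Approximate CI on a Pearson correlation via the Fisher z transformation.

    The interval is symmetric on the z scale and becomes asymmetric
    once transformed back to the r scale.
    """
    z = z_crit(level)
    se = 1 / sqrt(n - 3)  # approximate standard error on the z scale
    zr = atanh(r)
    return tanh(zr - z * se), tanh(zr + z * se)


if __name__ == "__main__":
    print(wilson_ci(2, 20))      # CI on an observed proportion of .10
    print(fisher_r_ci(0.6, 30))  # CI on an observed correlation of .60
```

In both cases the point estimate does not sit at the midpoint of the interval: the longer arm extends away from the nearer boundary of the scale (0 for the proportion, 1 for the correlation).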