Which Data to Meta-Analyze, and How?
A Specification-Curve and Multiverse-Analysis Approach to Meta-Analysis
Abstract
Which data to analyze, and how, are fundamental questions of all empirical research. Because data-analytic decisions always allow numerous flexibilities (a “garden of forking paths”), these questions pose perennial problems for all empirical research. Specification-curve analysis and multiverse analysis have recently been proposed as solutions to these issues. Building on the structural analogies between primary data analysis and meta-analysis, we transform and adapt these approaches to the meta-analytic level, in tandem with combinatorial meta-analysis. We explain the rationale of this idea, suggest descriptive and inferential statistical procedures as well as graphical displays, provide code for meta-analytic practitioners to generate and use these, and present a fully worked real example from digit ratio (2D:4D) research, totaling 1,592 meta-analytic specifications. Specification-curve and multiverse meta-analysis holds promise to resolve conflicting meta-analyses, contested evidence, controversial empirical literatures, and polarized research, and to mitigate their detrimental effects on research progress.
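The core idea of crossing "which data" choices with "how to analyze" choices can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy effect sizes, the factor names (`sex`, `hand`), and the helper functions `pool` and `specification_curve` are hypothetical and are not the article's data or code. The sketch enumerates every combination of two study-selection factors and two pooling models, computes an inverse-variance summary effect for each, and sorts the results by magnitude to form a descriptive specification curve.

```python
import itertools
import math

# Hypothetical toy data (NOT from the article): Fisher's z effect sizes,
# their sampling variances, and two coded study characteristics.
STUDIES = [
    {"z": 0.10, "var": 0.010, "sex": "male", "hand": "right"},
    {"z": -0.05, "var": 0.020, "sex": "female", "hand": "right"},
    {"z": 0.02, "var": 0.015, "sex": "male", "hand": "left"},
    {"z": 0.08, "var": 0.025, "sex": "female", "hand": "left"},
    {"z": -0.01, "var": 0.012, "sex": "male", "hand": "right"},
    {"z": 0.04, "var": 0.018, "sex": "female", "hand": "left"},
]


def pool(subset, method):
    """Inverse-variance pooling; 'random' adds a DerSimonian-Laird tau^2."""
    weights = [1.0 / s["var"] for s in subset]
    fixed = sum(w * s["z"] for w, s in zip(weights, subset)) / sum(weights)
    tau2 = 0.0
    if method == "random" and len(subset) > 1:
        # Cochran's Q and the DerSimonian-Laird moment estimator of tau^2.
        q = sum(w * (s["z"] - fixed) ** 2 for w, s in zip(weights, subset))
        c = sum(weights) - sum(w * w for w in weights) / sum(weights)
        tau2 = max(0.0, (q - (len(subset) - 1)) / c)
    weights = [1.0 / (s["var"] + tau2) for s in subset]
    estimate = sum(w * s["z"] for w, s in zip(weights, subset)) / sum(weights)
    return estimate, math.sqrt(1.0 / sum(weights))


def specification_curve(studies):
    """Run every Which x How combination and sort by summary effect."""
    which_sex = ["all", "male", "female"]   # "which data" factor 1
    which_hand = ["all", "left", "right"]   # "which data" factor 2
    how = ["fixed", "random"]               # "how to analyze" factor
    results = []
    for sex, hand, method in itertools.product(which_sex, which_hand, how):
        subset = [s for s in studies
                  if sex in ("all", s["sex"]) and hand in ("all", s["hand"])]
        if not subset:  # skip combinations that select no studies
            continue
        estimate, se = pool(subset, method)
        results.append({"sex": sex, "hand": hand, "method": method,
                        "k": len(subset), "estimate": estimate, "se": se})
    return sorted(results, key=lambda r: r["estimate"])


curve = specification_curve(STUDIES)
for row in curve:
    print(f'{row["sex"]:>6} {row["hand"]:>5} {row["method"]:>6} '
          f'k={row["k"]} est={row["estimate"]:+.3f} se={row["se"]:.3f}')
```

With three levels for each selection factor and two pooling models, this toy setup yields 3 × 3 × 2 = 18 specifications. The sketch covers only the descriptive step; it does not implement the inferential procedures or graphical displays the abstract mentions.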