Abstract
Over the past few decades, psychology and its cognate disciplines have undergone substantial scientific reform, ranging from advances in statistical methodology to significant changes in academic norms. One aspect of experimental design that has received comparatively little attention is incentivization, i.e., the way that participants are monetarily rewarded for their participation in experiments and surveys. While incentive-compatible designs are the norm in disciplines like economics, the majority of studies in psychology and experimental philosophy are constructed such that individuals’ incentives to maximize their payoffs in many cases stand opposed to their incentives to state their true preferences honestly. This is in part because the subject matter is often self-report data about subjective topics, and the sample is drawn from online platforms like Prolific or MTurk, where many participants are out to make a quick buck. One mechanism that allows for the introduction of an incentive-compatible design in such circumstances is the Bayesian Truth Serum (BTS; Prelec, 2004), which rewards participants based on how surprisingly common their answers are. Recently, Schoenegger (2021) applied this mechanism in the context of Likert-scale self-reports, finding that its introduction significantly altered response behavior. In this registered report, we further investigate this mechanism by (1) attempting to directly replicate the previous result and (2) analyzing whether the Bayesian Truth Serum’s effect is distinct from the effects of its constituent parts (an increase in expected earnings and the addition of prediction tasks). We fail to find significant differences in response behavior between participants who were simply paid for completing the study and participants who were incentivized with the BTS.
Per our pre-registration, we regard this as evidence in favor of a null effect of up to V = .1 and a failure to replicate, but we reserve judgment as to whether the BTS mechanism should be adopted in social science fields that rely heavily on Likert-scale items reporting subjective data, since smaller effect sizes might still be of practical interest and results may differ for items other than the ones we studied. Further, we provide weak evidence that the prediction task itself influences response distributions and that this task’s effect is distinct from an increase in expected earnings, suggesting a complex interaction between the BTS’s constituent parts and its truth-telling instructions.
References
(2017). Bayesian markets to elicit private information. Proceedings of the National Academy of Sciences, 114(30), 7958–7962. 10.1073/pnas.1703486114
(2019). Noncompliant responding: Comparing exclusion criteria in MTurk personality research to improve data quality. Personality and Individual Differences, 143(6), 84–89. 10.1016/j.paid.2019.02.015
(2013). Truth, correspondence, and gender. Review of Philosophy and Psychology, 4(4), 621–638. 10.1007/s13164-013-0155-2
(2020). Recruiting method and its impact on participant behavior. In K. E. Karim (Ed.), Advances in accounting behavioral research (pp. 1–19). Emerald Publishing. 10.1108/S1475-148820200000023001
(2011). Amazon's Mechanical Turk. Perspectives on Psychological Science, 6(1), 3–5. 10.1177/1745691610393980
(1999). The effects of financial incentives in experiments: A review and capital-labor-production framework. Journal of Risk and Uncertainty, 19(1), 7–48. 10.1023/A:1007850605129
(2019). Knowledge-how, understanding-why and epistemic luck: An experimental study. Review of Philosophy and Psychology, 10(4), 701–734
(2014). How well do we report on compensation systems in studies of return to work: A systematic review. Journal of Occupational Rehabilitation, 24(1), 111–124. 10.1007/s10926-013-9435-z
(2013). Evaluating Amazon's Mechanical Turk as a tool for experimental behavioral research. PloS One, 8(3), e57410. 10.1371/journal.pone.0057410
(2013). The effect of what we think may happen on our judgments of responsibility. Review of Philosophy and Psychology, 4(2), 259–269. 10.1007/s13164-013-0133-8
(2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160. 10.3758/brm.41.4.1149
(2017). Validating Bayesian Truth Serum in large-scale online human experiments. PloS ONE, 12(5), e0177385. 10.1371/journal.pone.0177385
(2018). Asking about social circles improves election predictions. Nature Human Behaviour, 2(2), 187–193. 10.1038/s41562-018-0302-y
(2015). Public views on policies involving nudges. Review of Philosophy and Psychology, 6(3), 439–453. 10.1007/s13164-015-0263-2
(2019). Improving psychological science through transparency and openness: An overview. Perspectives on Behavior Science, 42(1), 13–31. 10.1007/s40614-018-00186-8
(2016). Attentive turkers: MTurk participants perform better on online attention checks than do subject pool participants. Behavior Research Methods, 48(1), 400–407. 10.3758/s13428-015-0578-z
(2001). Experimental practices in economics: A methodological challenge for psychologists? Behavioral and Brain Sciences, 24(3), 383–403. 10.1017/s0140525x01004149
(2009). The wisdom of many in one mind. Psychological Science, 20(2), 231–237. 10.1111/j.1467-9280.2009.02271.x
(2015, May). Incentivizing high quality crowdwork. In Proceedings of the 24th International Conference on World Wide Web (pp. 419–429). 10.1145/2736277.2741102
(1985). Counterfactual reasoning and accuracy in predicting personal events. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11(4), 719–731. 10.1037/0278-7393.11.1-4.719
(2011). Predicting new product adoption using Bayesian Truth Serum. Journal of Medical Marketing, 11(1), 6–16. 10.1057/jmm.2010.19
(2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532. 10.1177/0956797611430953
(2017). An analysis of data quality: Professional panels, student subject pools, and Amazon's Mechanical Turk. Journal of Advertising, 46(1), 141–155. 10.1080/00913367.2016.1269304
(2017). Systems perspective of Amazon Mechanical Turk for organizational research: Review and recommendations. Frontiers in Psychology, 8, Article 1359. 10.3389/fpsyg.2017.01359
(2007). The reporting of monetary compensation in research articles. Journal of Empirical Research on Human Research Ethics, 2(4), 61–67. 10.1525/jer.2007.2.4.61
(1980). Reasons for confidence. Journal of Experimental Psychology: Human Learning and Memory, 6(2), 107–118. 10.1037/0278-7393.6.2.107
(2014). The first cut is the deepest: Effects of social projection and dialectical bootstrapping on judgmental accuracy. Social Cognition, 32(4), 315–336. 10.1521/soco.2014.32.4.315
(2015). The relationship between motivation, monetary compensation, and data quality among US- and India-based workers on Mechanical Turk. Behavior Research Methods, 47(2), 519–528. 10.3758/s13428-014-0483-x
(1984). Considering the opposite: A corrective strategy for social judgment. Journal of Personality and Social Psychology, 47(6), 1231–1243. 10.1037//0022-3514.47.6.1231
(2011). How social influence can undermine the wisdom of crowd effect. Proceedings of the National Academy of Sciences, 108(22), 9020–9025. 10.1073/pnas.1008636108
(2014). Incentivizing responses to self-report questions in perceptual deterrence studies: An investigation of the validity of deterrence theory using Bayesian Truth Serum. Journal of Quantitative Criminology, 30(4), 677–707. 10.1007/s10940-014-9219-4
(1987). Ten years of research on the false-consensus effect: An empirical and theoretical review. Psychological Bulletin, 102(1), 72–90. 10.1037/0033-2909.102.1.72
(2009, June). Financial incentives and the performance of crowds. In Proceedings of the ACM SIGKDD Workshop on Human Computation (pp. 77–85). 10.1145/1600150.1600175
(2020). Folk intuitions and the conditional ability to do otherwise. Philosophical Psychology, 33(7), 968–996. 10.1080/09515089.2020.1817884
(2014). Registered reports. Social Psychology, 45(3), 137–141. 10.1027/1864-9335/a000192
(2018). Preregistration becoming the norm in psychological science. APS Observer, 31(3). https://www.psychologicalscience.org/observer/preregistration-becoming-the-norm-in-psychological-science/comment-page-1?pdf=true
(2009). A truth serum for non-Bayesians: Correcting proper scoring rules for risk attitudes. The Review of Economic Studies, 76(4), 1461–1489. 10.1111/j.1467-937x.2009.00557.x
(2018). Instructional manipulation checks: A longitudinal analysis with implications for MTurk. International Journal of Research in Marketing, 35(2), 258–269. 10.1016/j.ijresmar.2018.01.003
(2017). Beyond the Turk: Alternative platforms for crowdsourcing behavioral research. Journal of Experimental Social Psychology, 70(3), 153–163. 10.1016/j.jesp.2017.01.006
(2022). Data quality of platforms and panels for online behavioral research. Behavior Research Methods, 54(4), 1643–1662. 10.3758/s13428-021-01694-3
(2004). A Bayesian Truth Serum for subjective data. Science, 306(5695), 462–466. 10.1126/science.1102081
(2013). A robust Bayesian Truth Serum for non-binary signals. In Proceedings of the 27th AAAI Conference on Artificial Intelligence (AAAI'13) (pp. 833–839). 10.1609/aaai.v27i1.8677
(2020). Crowdsourcing as a tool for research: Methodological, fair, and political considerations. Bulletin of Science, Technology & Society, 40(3-4), 40–53. 10.1177/02704676211003808
(1977). The "false consensus effect": An egocentric bias in social perception and attribution processes. Journal of Experimental Social Psychology, 13(3), 279–301. 10.1016/0022-1031(77)90049-X
(2015). A reliability analysis of Mechanical Turk data. Computers in Human Behavior, 43(2), 304–307. 10.1016/j.chb.2014.11.004
(2015). A penny for your thoughts: A survey of methods for eliciting beliefs. Experimental Economics, 18(3), 457–490. 10.1007/s10683-014-9416-x
(2021). Experimental philosophy and the incentivisation challenge: A proposed application of the Bayesian Truth Serum. Review of Philosophy and Psychology, 1–26. 10.1007/s13164-021-00571-4
(2022). Data and materials for “Taking A Closer Look At The Bayesian Truth Serum: A Registered Report”. https://osf.io/5gnzu/
(2006). The double-edged sword of rewards for participation in psychology experiments. Canadian Journal of Behavioural Science / Revue canadienne des sciences du comportement, 38(3), 269–277. 10.1037/cjbs2006014
(2014). The ticking time bomb: When the use of torture is and is not endorsed. Review of Philosophy and Psychology, 5(4), 543–563. 10.1007/s13164-014-0199-y
(2013). Creating truth-telling incentives with the Bayesian Truth Serum. Journal of Marketing Research, 50(3), 289–302. 10.1509/jmr.09.0039
(2017). It's what's on the inside that counts… or is it? Virtue and the psychological criteria of modesty. Review of Philosophy and Psychology, 8(3), 653–669. 10.1007/s13164-017-0333-8
(2012). A robust Bayesian Truth Serum for small populations. In Proceedings of the 26th AAAI Conference on Artificial Intelligence (AAAI'12).
(2019). Long-term forecasts for energy commodities price: What the experts think. Energy Economics, 84, 104484. 10.1016/j.eneco.2019.104484