Open Access · Replication

Replication of “Experiencing Physical Warmth Promotes Interpersonal Warmth” by



We report the results of three high-powered, independent replications of Study 2 from Williams and Bargh (2008). Participants evaluated hot or cold instant therapeutic packs before choosing a reward for participation that was framed as a prosocial (i.e., treat for a friend) or self-interested reward (i.e., treat for the self). Williams and Bargh predicted that evaluating the hot pack would lead to a higher probability of making a prosocial choice compared to evaluating the cold pack. We did not replicate the effect in any individual laboratory or when considering the results of the three replications together (total N = 861). We conclude that there is no evidence that brief exposure to warm therapeutic packs induces greater prosocial responding than exposure to cold therapeutic packs.

Williams and Bargh (2008) investigated the effects of physical warmth on interpersonal warmth and prosocial behavior in two studies. Inspired by prior research that underscores the importance of interpersonal warmth on interpersonal judgments (e.g., Asch, 1946; Cuddy, Fiske, & Glick, 2008; but see Nauts, Langner, Huijsmans, Vonk, & Wigboldus, 2014), Williams and Bargh hypothesized that exposure to physically warmer temperatures would lead to more positive judgments of strangers and increase prosocial decision making. In their first study, participants briefly held a coffee cup containing either warm or iced coffee. In line with predictions, participants who held the warm beverage judged a target individual to have a warmer personality (i.e., more generous and caring) compared to participants who held the cold beverage.

The second study involved participants ostensibly conducting a product evaluation and subsequently making choices that could be construed as prosocial or as self-interested. The key manipulation was whether participants evaluated a hot or a cold instant therapeutic gel pack. Following the evaluation, participants made a choice that was framed either as a prosocial gift for a friend or as a personal treat. Williams and Bargh (2008) observed that those who had evaluated the hot pack were more likely to make the prosocial choice (OR = 3.52, 95% CI = [1.06, 11.73]). More specifically, 75% of the participants who evaluated a cold pack selected a reward for themselves, whereas 46% of the participants who evaluated a warm pack did the same (analyzed N = 50). The conclusion from this work was that experiences of physical warmth unconsciously impact our impressions of others and prosocial behavior. The basic idea is that physical feelings of warmth translate to greater interpersonal warmth.

The Williams and Bargh (2008) paper was published in a prestigious journal (Science), and the paper has been highly cited (more than 470 times according to Google Scholar, more than 160 citations in Web of Science). The findings also received coverage in the popular press (e.g., Bartlett, 2013; Tierney, 2008) and, despite not having been directly replicated, have influenced subsequent research investigating how experiences of hot and cold can prime other behaviors (e.g., Bargh & Shalev, 2012; Kang, Williams, Clark, Gray, & Bargh, 2011; Leander, Chartrand, & Bargh, 2012; Williams, Huang, & Bargh, 2009). In this paper, we seek to replicate the findings of Study 2 from Williams and Bargh (2008).


Power Analysis and Sampling Plan

Based on the effect size from the original study and requiring statistical power of .95 with an alpha level of .05, we estimated the required sample size to be 300 participants (Epicalc; Chongsuvivatwong, 2012). This is a conservative estimate, allowing for the detection of a smaller effect than that observed in the original study (see Table A1, online supplementary materials). Three independent replications were conducted, each with a target sample size of 300. Replications took place at two US locations (as in the original study), Kenyon College and Michigan State University, and at one UK location, the University of Manchester, following agreed-upon procedures.
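The logic of such a power calculation can be sketched with the textbook normal-approximation formula for comparing two independent proportions. This is an illustrative sketch only: it is not the Epicalc routine the authors used, the function name `n_per_group` is hypothetical, and the proportions plugged in (.75 and .46 selfish choices) are the rounded values reported for the original study. The calculation behind the figure of 300 targeted a smaller effect than these proportions imply, so the sketch is not expected to reproduce that number.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.95):
    """Per-group sample size for a two-sided test of two proportions.

    Uses the normal approximation with unpooled variances. This is an
    assumed, simplified formula, not the routine used in the paper.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Proportions of selfish choices reported for the original study (cold, warm).
n_original = n_per_group(0.75, 0.46)
print(n_original, 2 * n_original)
```

With the original proportions, the total comes out well below 300, which fits the description of 300 as a conservative target that leaves room for a smaller true effect.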

Materials and Procedure

We preregistered the study proposal on the Open Science Framework (OSF) website and followed the procedures of the original study as closely as possible, with some minor modifications. For example, the choice of rewards offered varied depending on local availability. We also used different brands of therapeutic packs, again due to availability. Additionally, in all three replications, research assistants were blind to participants’ assigned temperature conditions, to reduce experimenter expectancy effects. Full details of the procedures at each location can be found in the online supplemental materials.

Researchers set up tables and testing areas at each event, and passersby were approached to take part in a product-evaluation study. Participants were brought to the testing area, where they were separated from the researcher by partitions. Once the consent form was signed, participants were given a questionnaire booklet. The cover page served to hide the second page that instructed the participant which of two boxes in front of them they should open; one box contained a hot pack, and one contained a cold pack. The cover page ensured that researchers were blind to the temperature pack condition to which participants were assigned.

On the questionnaire (see online supplementary materials), participants evaluated the effectiveness of the hot or cold pack on a scale ranging from 1 = not at all to 7 = extremely and indicated to what extent they would recommend the product to their family, friends, or strangers on the same 7-point scale. (In the original study, participants indicated whether they would or would not recommend the product to their family, friends, or strangers as a dichotomous choice.) Finally, participants estimated the internal temperature of the gel pack in degrees Celsius (UK site) or Fahrenheit (US sites). The first four questions were included in the original study to support the initial cover story, and the final question was intended as a manipulation check.

Once participants completed these questions, the questionnaire directed them to place the therapeutic gel pack product back in its original box. This also served to ensure that researchers remained unaware of the participant’s condition. The next page of the questionnaire included the main dependent variable, which consisted of the reward choice.

Each participant then completed a short funneled debriefing questionnaire (see online supplementary materials), which allowed us to evaluate whether the participant was suspicious of the study or guessed the underlying hypothesis. Once the participant had completed the funneled debriefing, they were brought away from the testing area and were given their chosen reward together with a page explaining the true nature of the study.

All three sites used the same 4″ × 5″ hot and cold instant therapeutic packs. Brand names were obscured from the packs with black marker. Hot packs were HeatMax brand hand and body warmers; cold packs were Dynarex brand (see Figure 1).

Figure 1. Example of therapeutic packs used in the study. Both packs were 4″ × 5″ size.


Demographic information for all participants is reported in Table 1.

Table 1. Demographics for the three study locations

Kenyon College

Participants were 306 individuals from the Mt. Vernon, Ohio community area (N = 289) and Kenyon College psychology research pool participants (N = 17). Community participants were recruited outdoors at a local summertime festival in June and July of 2013. Kenyon College participants completed the study indoors. The rewards for participating in the study were a Snapple beverage (available immediately) or a voucher worth $2 for a local cupcake shop (located within walking distance of the data collection site).

Michigan State University

Participants were 250 individuals recruited at various indoor and outdoor locations on the Michigan State University campus during October and November 2013. Two hundred and fifty was the maximum number of participants that could be collected given the available time for data collection. The rewards for participating in the study were a Snapple beverage (available immediately) or a voucher for an ice cream at the campus dairy store. The voucher was actually worth $2 in US currency but the cash value was not mentioned to participants.

University of Manchester

Participants (N = 305) were recruited over several days during September and October 2013, at indoor and outdoor events around the University of Manchester (Open Days, Welcome Week), with a small proportion tested at an army reserve training day (N = 13). The rewards for participating in the study were a voucher for either a fruit juice or a fruit smoothie.


The dependent variable was whether participants made a prosocial or selfish choice on the critical reward question. The temperature of the pack (hot/cold) and reward framing (i.e., the counterbalancing of each reward as prosocial or selfish) served as between-participants independent variables.


Data Preparation and Manipulation Check

Each replication study was conducted independently, and there was no discussion of results between the groups until all data were collected. The analyses reported for each study have also been verified by at least one other group. We report all data exclusions, manipulations, measures, and how we determined our sample sizes. All data are available on the OSF project page.

Participants who met any of several a priori agreed-upon rules for exclusion were removed prior to analysis. Grounds for exclusion included (1) giving a pack temperature estimate more than 3 SD away from the mean within their condition, (2) failing to choose a reward for participation (the key dependent measure), and (3) making a connection in the debriefing form between physical and interpersonal warmth. There were 12 Kenyon College, 13 Michigan State University, and 23 University of Manchester participants excluded from the analysis on these grounds. The N on which all analyses are based is listed in Table 1. Additional information about excluded participants is in the supplemental materials.
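The three exclusion rules can be expressed as a simple filter. The following is a minimal sketch under stated assumptions: the record layout (`condition`, `temp_estimate`, `reward_choice`, `guessed_hypothesis`), the function name `apply_exclusions`, and the toy data are all hypothetical and are not taken from the study's materials or data files.

```python
from statistics import mean, stdev


def apply_exclusions(participants):
    """Return the participants who survive the three exclusion rules."""
    # Rule 1 bounds: mean +/- 3 SD of temperature estimates, per condition.
    bounds = {}
    for cond in {p["condition"] for p in participants}:
        temps = [p["temp_estimate"] for p in participants
                 if p["condition"] == cond]
        m, s = mean(temps), stdev(temps)
        bounds[cond] = (m - 3 * s, m + 3 * s)

    kept = []
    for p in participants:
        low, high = bounds[p["condition"]]
        if not (low <= p["temp_estimate"] <= high):
            continue  # rule 1: extreme temperature estimate
        if p["reward_choice"] is None:
            continue  # rule 2: no reward chosen
        if p["guessed_hypothesis"]:
            continue  # rule 3: linked physical and interpersonal warmth
        kept.append(p)
    return kept


def make(cond, temp):
    """Build one hypothetical participant record."""
    return {"condition": cond, "temp_estimate": temp,
            "reward_choice": "friend", "guessed_hypothesis": False}


# Invented data: one wild estimate in the hot condition, plus one missing
# reward choice and one hypothesis guesser in the cold condition.
hot = [make("hot", t) for t in [100.0] * 19 + [1000.0]]
cold = [make("cold", t) for t in [40.0, 45.0, 50.0]]
cold[0]["reward_choice"] = None
cold[1]["guessed_hypothesis"] = True

kept = apply_exclusions(hot + cold)
print(len(hot) + len(cold), "recruited,", len(kept), "retained")
```

Computing the bounds before filtering means the 3 SD criterion is evaluated against the full sample in each condition, which is one plausible reading of the rule; the source does not specify the order of operations.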

Participants at all three sites rated the hot pack as warmer than the cold pack: ds = 2.50 (Kenyon), 2.22 (Michigan State), 2.61 (Manchester), suggesting that the manipulation was effective. These effect sizes are comparable in size to the one obtained in the original study (d = 2.98; Williams & Bargh, 2008). Descriptive statistics are in Table 2. The effectiveness item and the three recommendation items were highly intercorrelated (Cronbach’s α = .95 at Kenyon, .93 at Michigan State, and .93 at Manchester) and were averaged together into a scale. There was no evidence of a consistent difference between packs on this scale across sites (see Table 2). Michigan State participants rated the cold pack as more effective/recommendable than the hot pack (d = 0.42); Kenyon College and University of Manchester participants did not distinguish between the packs on this scale (ds = 0.05 and 0.20, respectively). Williams and Bargh’s participants completed dichotomous recommendation ratings. However, on the single effectiveness item, Williams and Bargh (2008) found that the cold pack was rated more effective than the warm pack (d = 0.93).
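For readers less familiar with the d statistic used in this manipulation check, the sketch below computes Cohen's d as a mean difference divided by a pooled standard deviation. The pooled-SD variant is an assumption (the source does not state which formula was used), the function name `cohens_d` is hypothetical, and the temperature estimates are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Invented temperature estimates (degrees Fahrenheit) for two conditions.
hot_estimates = [100.0, 110.0, 120.0, 130.0, 140.0]
cold_estimates = [60.0, 70.0, 80.0, 90.0, 100.0]

d = cohens_d(hot_estimates, cold_estimates)
print(f"d = {d:.2f}")
```

A d above 2, as reported at all three sites, means the two conditions' mean temperature estimates sit more than two pooled standard deviations apart.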

Table 2. Ratings of hot and cold packs

Kenyon College

A chi-square test of pack temperature (cold vs. hot) on selfish behavior for each reward frame was conducted.¹ The analysis was significant for the “Snapple is selfish” framing, though in the opposite direction to that predicted by Williams and Bargh (2008), χ²(1) = 5.276, p = .022. In this framing condition, 61.3% of participants who evaluated the hot pack made the selfish choice, whereas 42.5% of participants who evaluated the cold pack did the same. The analysis was not significant for the “Cupcake is selfish” framing, χ²(1) = 0.259, p = .611. In this framing condition, 41.1% of participants who evaluated the hot pack made the selfish choice; 37.0% of participants who evaluated the cold pack did the same. Collapsing across framing conditions, the analysis yielded no evidence that participants exposed to cold packs were more selfish than participants exposed to warm packs. The chi-square value was statistically significant, χ²(1) = 4.00, p = .045, OR = 0.61, 95% CI = [0.38, 0.98], though it was in the opposite direction of the original Williams and Bargh (2008) prediction, with 51.4% of hot pack participants making the selfish choice compared to 39.7% of cold pack participants.²

An additional set of exploratory analyses was conducted to probe the robustness of the results under different selection conditions. First, the analysis was repeated with an additional 10 participants’ data removed. These participants experienced a variety of procedural problems that made us question whether they should stay in the analysis (see supplementary materials). The previously reported findings hold with these 10 participants removed (they also hold with all available data included, i.e., with no exclusions). Further, we tested whether restricting the participants to only the community participants would matter; the findings held with just this subset of participants. Finally, we examined whether June 2013 participants (taken on an unseasonably cool evening) would differ from July 2013 participants (a much warmer evening). The findings did not achieve statistical significance in either month examined in isolation, but the pattern of results appeared similar in both months.

A chi-square analysis predicting prosocial behavior from pack temperature separately for men and women was conducted. Among women, 38.0% made the selfish choice after exposure to the cold pack, compared to 53.8% after exposure to the warm pack, OR = 0.53, 95% CI = [0.28, 0.99], a statistically significant difference. Among men, 41.8% made the selfish choice after exposure to the cold pack, compared to 47.8% after exposure to the warm pack, which was not statistically significant, OR = 0.78, 95% CI = [0.40, 1.55]. Thus the pattern of the results appeared similar for men and women, but perhaps the pattern was somewhat more pronounced for women participants.

Overall, these analyses indicate that we failed to replicate the findings of Williams and Bargh (2008). In fact, in one framing condition, the observed effect was in the opposite direction to that predicted, such that participants who held the warm pack were actually more selfish than participants who held the cold pack.

Michigan State University

An identical set of analyses was conducted. The results revealed that there was no effect of pack temperature on selfishness of choice in either framing condition (“Snapple is selfish”: χ²(1) = 0.234, p = .629; “Ice cream is selfish”: χ²(1) = 0.019, p = .889) or collapsed across framing, χ²(1) = 0.039, p = .842, OR = 0.92, 95% CI = [0.56, 1.53]. In the “Snapple is selfish” framing, 44.1% of participants in the hot pack condition and 39.7% of participants in the cold pack condition made the selfish choice. In the “Ice cream is selfish” framing, 62.7% of participants in the hot pack condition and 63.9% of participants in the cold pack condition made the selfish choice. Collapsed across framing conditions, 53.4% of hot pack and 52.1% of cold pack participants made the selfish choice.

The inclusion of the data from the 13 removed participants did not change this result. Furthermore, the hypothesis was not supported for either male or female participants examined in isolation. Thus, these data also failed to replicate the original Williams and Bargh (2008) results.

University of Manchester

We originally intended to counterbalance which items were framed as the self-interested and prosocial options, but due to a printing error, fruit juice was always the self-interested option in the warm condition, whereas fruit smoothie was always the self-interested option in the cold condition. This modification should not impact our results as (1) the items (fruit juice, smoothie) were chosen for their similarity, (2) the specifics of the reward item are not theoretically relevant, only whether participants make the prosocial or the self-interested choice, and (3) no effect of temperature condition on type of reward was observed in the results of the other two replication sites reported in this paper. We therefore proceeded with a chi-square test of pack temperature on prosociality of choice.

Analysis revealed that 64.2% of people made the selfish choice overall, but that this choice was not significantly related to the temperature priming condition: χ²(1) = 1.10, p = .295, OR = 0.77, 95% CI = [0.47, 1.26]. Specifically, 61.2% (N = 85) chose the selfish response in the cold condition, and 67.1% (N = 96) chose the selfish option in the warm condition. The inclusion of the data from the 23 removed participants did not change this result. Thus, the overall result did not replicate the original finding. Instead, the observed numerical trend was slightly in the opposite direction to that predicted.
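The Manchester statistics can be approximately reconstructed from a 2 × 2 table. The sketch below computes a Pearson chi-square (without continuity correction), the odds ratio, and a Wald 95% confidence interval. The selfish counts (85 and 96) are reported above; the prosocial counts (54 and 47) are an assumption, inferred from the reported percentages, which imply group sizes of 139 (cold) and 143 (warm). The function name `two_by_two` is hypothetical, and the choice of an uncorrected chi-square and a Wald interval is likewise an assumption about the original analysis.

```python
from math import exp, log, sqrt


def two_by_two(a, b, c, d):
    """Pearson chi-square, odds ratio, and Wald 95% CI for a 2 x 2 table.

    a, b: selfish and prosocial counts in the cold condition
    c, d: selfish and prosocial counts in the warm condition
    """
    n = a + b + c + d
    # Shortcut formula for the uncorrected Pearson chi-square with 1 df.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Odds of a selfish choice in the cold condition relative to the warm one.
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (exp(log(odds_ratio) - 1.96 * se), exp(log(odds_ratio) + 1.96 * se))
    return chi2, odds_ratio, ci


# 85 and 96 are reported; 54 and 47 are inferred (assumed) counts.
chi2, odds_ratio, (ci_low, ci_high) = two_by_two(85, 54, 96, 47)
print(f"chi2(1) = {chi2:.2f}, OR = {odds_ratio:.2f}, "
      f"95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

Under these assumptions the output agrees with the reported values to two decimal places. An odds ratio below 1 here means lower odds of a selfish choice after the cold pack than after the warm pack, which is the opposite of the original prediction.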

Given the large sample it is possible that subsets of the participants displayed the pattern predicted by the original study, but that the pattern did not generalize to the overall sample. With this in mind, we examined whether significant effects of the temperature manipulation could be observed if we divided the sample by whether participants took part indoors (N = 157) or outdoors (N = 125), were native (N = 220) or non-native (N = 62) speakers of English, or were male (N = 139) or female (N = 143). There were no significant effects of temperature condition for any of these binary groupings.

Omnibus Analysis

The data from the three replication sites were combined into one chi-square analysis to determine the impact of pack temperature on reward choice (selfish vs. prosocial). There was no significant effect, although the result approached statistical significance in the opposite direction of that predicted by Williams and Bargh (2008), χ²(1) = 3.402, p = .065, OR = 0.77, 95% CI = [0.58, 1.02]. The results are displayed in Figures 2 and 3.

Figure 2. Proportion of participants who made the selfish choice across studies. Error bars represent 95% CI. MSU = Michigan State University.
Figure 3. Odds ratios (OR) for tendency to make the selfish choice after exposure to cold versus warm temperatures. Values on the right hand side represent ORs and 95% CIs (lower and upper values). An OR of 1.0 indicates a null effect (i.e., even odds of selfish responding).


Williams and Bargh (2008) found that participants who previously held a hot pack made a more prosocial choice than participants who previously held a cold pack. We attempted three high-powered, independent replications of this original study, but we did not replicate the original result. We found no indication that participants who held warm packs were more prosocial than participants who held cold packs when prosocial actions were defined as opting for a token reward gift for a friend as opposed to a treat for the self. In our samples, the effect was in the opposite direction, such that participants who evaluated a cold pack were somewhat more prosocial than participants who evaluated a hot pack, but this difference did not reach statistical significance at the p < .05 level.

There may be several reasons why we did not observe significant effects in these replications. One possibility is expectancy effects, which have previously been suggested as explanations following failures to replicate other social priming effects (see, e.g., Klein et al., 2012, for a recent discussion of expectancy effects in experimental studies). The effect observed by Williams and Bargh may have been due, in part, to unconscious cues given by the researcher. In the original study, the researcher interacted directly with participants as they received their hot/cold packs, and so it is possible that unplanned cues were transmitted during this brief exchange (e.g., giving cues to behave more prosocially if participants were in the hot condition). In our study, the researchers were not aware of the condition the participants were in, at least until the debriefing procedure took place (and only then if participants verbally revealed details of their condition), and so could not provide unconscious cues consistent with the study predictions.

Of course there are many other possible explanations for why effects were found in the original study and not in the replication attempts (e.g., small sample sizes of original studies, random variations in the data, influence of unknown moderators). However, it is important to emphasize that the current results do not suggest that there are no influences of temperature on people’s behavior or that the current and related effects in the hot and cold priming literature are false positives. In the first case, there are many other examples demonstrating links between temperature change and behavioral outcomes, although the general tendency has been to find links between increased aggression and higher temperatures (e.g., Anderson, 2001), rather than higher temperatures being associated with more prosocial behaviors. In the second case, while it is clear that the temperature priming effect observed by Williams and Bargh (2008) cannot be reliably observed using highly similar procedures, it is important that evidence for any given priming effect in the literature should be considered on its own merits; effects should be investigated and replicated independently and not automatically dismissed beforehand. In short, we suggest more work is needed on this topic and conclude that the current results suggest some degree of added caution is needed when considering whether exposure to hot or cold temperatures impacts prosocial behavior. More broadly, there is a need for greater specification of the theoretical underpinnings and limitations of priming effects by researchers (Cesario, 2014) and more details of experimental procedures and analyses conducted (see, e.g., Klein et al., 2012; Nosek, Spies, & Motyl, 2012 for detailed suggestions in this vein). In this way, we can look forward to building a more robust social psychology for the future.

Note From the Editors

A commentary and a rejoinder on this paper are available (Lynott et al., 2014; Williams, 2014; doi: 10.1027/1864-9335/a000205).

¹ In the preregistered version of the study we indicated that we would use logistic regressions to analyze the data, following the analyses employed by Williams and Bargh (2008). However, we felt that, given the complexities of interpreting the significant interaction of the logistic regression analysis in the 2008 study, chi-square analyses provided a clearer test of the data, with more easily interpretable results. Nonetheless, analyzing the data from the replication studies using logistic regressions yields the same patterns as the chi-square analyses reported.

² For comparison purposes, in Williams and Bargh’s (2008) data the effect of the temperature pack manipulation was statistically significant in the “ice cream is selfish” framing condition, χ²(1) = 6.032, p = .014 (% selfish in the cold condition = 92.9%; % selfish in the hot condition = 50.0%). The analysis was not significant in the “Snapple is selfish” framing condition, χ²(1) = .729, p = .527 (% selfish in the cold condition = 50.0%; % selfish in the hot condition = 42.9%). The analysis was statistically significant collapsing across framing, χ²(1) = 4.327, p = .038 (% selfish in the cold condition = 75.0%; % selfish in the hot condition = 46.2%).


The first and second authors contributed equally to this manuscript. This research was funded by a grant to the first and second authors from the Center for Open Science. Designed research: D. L., K. S. C., B. D., R. E. L., K. O.; Performed research: D. L., K. S. C., J. W., L. C.; Analyzed data: D. L., K. S. C., J. W., B. D.; Wrote initial draft: D. L., K. S. C.; Provided editorial comments: D. L., K. S. C., L. C., M. B. D., R. E. L., K. B., J. W. The authors declare no conflict of interest with the content of this article. The authors would like to give enormous thanks to Pat Watkinson and Jackie Harte for help with data collection at Manchester University and to Melanie Crank and Manchester University Student Union for help in finding space to run the study. We also thank Simon Garcia at Kenyon College for his assistance in evaluating the temperature of the therapeutic packs, as well as Katie Finnigan, Melek Spinel, Olivia Siulagi, and Maureen Hirt for their assistance in data collection. Finally, we would like to thank Lawrence Williams for helpful comments on the replication proposal and for making his data freely available. All materials, data, images of the procedure, and the preregistered design are available at

Katherine S. Corker, Department of Psychology, Kenyon College, Gambier, OH, 43022, USA
Dermot Lynott, Department of Psychology, Lancaster University, Lancaster LA1 4YF, UK