Open Access

Survey Design Moderates Negativity Bias but not Positivity Bias in Self-Reported Job Stress

Results From a Randomized Split Ballot Experiment

Published Online: https://doi.org/10.1027/1015-5759/a000806

Abstract

Self-report measures are both frequently used and criticized in studies of job-related stress. The question remains whether affective dispositions lead to biased assessments. In this study, we examine the extent to which survey characteristics are susceptible to bias by the characteristics of the person making the assessment. Participants (N = 1,509) in an online split ballot experiment were randomly assigned to report their job stressors using a 2 (task vs. person-related items) × 2 (frequency vs. agreement response format) factorial design. Participants high in neuroticism or negative affectivity, but not positive affectivity, reported more job stressors when responding to person-related items compared to task-related items. Individuals high in neuroticism reported more job stressors when assessed with agreement compared to frequency response format. However, the response format did not alter the relationship between self-reported job stressors and positive or negative affectivity. Findings indicate how survey design can reinforce affectivity bias in the assessments of job stressors. If an assessment is intended to evaluate objective circumstances rather than subjective experiences at work (e.g., the presence of general risk factors within psychosocial risk assessment), it is recommended to employ condition-related questionnaires with task-related item wordings and frequency response formats.

Self-report measures have a long tradition in job stress research. At the same time, self-reports of job stress have been criticized for being prone to bias: Bias describes how a variable that is neither the cause nor the effect of a particular intended construct distorts the assessment of that construct (Spector & Brannick, 1995; Spector et al., 2000). Self-report measures of stress thus reflect not only the variance of the intended construct, i.e., differences in the amount of job stress and its error variance, but also biasing factors that create variance in the measure beyond the variance explained by the underlying theoretical construct. Responses to self-report measures may, for example, reflect personality characteristics that determine more general outlooks on life (Semmer et al., 2004). Thus, self-reported job stress is a result both of the actual amount of stress at work and of the characteristics of the person making the assessment.

Bias by dispositional characteristics is of particular concern when decisions about whether or not to employ measures for occupational health and safety are derived from self-reported job stress. The German Occupational Health and Safety Act (§5 ArbSchG), for example, mandates risk assessments of the working conditions that are independent of interindividual characteristics; still, employees’ self-reports of their working conditions are commonly used in psychosocial risk assessments at work. In the context of psychosocial risk assessment, a key distinction is therefore whether employees provide information about their work (i.e., stressors) or whether they provide information about themselves (i.e., strain; Rau, 2010).

One of the most commonly investigated effects is negativity bias, which assumes that individuals high in neuroticism or negative affectivity tend to report more job stressors even in the absence of objective stress (Semmer et al., 2004; Spector et al., 2019; Watson et al., 1987). The exact role of negative affectivity in the assessment of job stress and its observed correlations with strain is not clear, as negative affectivity may have both biasing and substantive effects on self-reported job stress (Spector et al., 2019; Spector et al., 2000). As a consequence, the question of how to deal with negative affectivity in job stress research has been debated controversially: While some authors recommended not controlling for negative affectivity (Spector et al., 2000), others suggested including negative affectivity in stress-strain analyses, as well as measures of positivity bias that may lead to underreporting of stressful working conditions (Judge et al., 2000). However, the extent to which characteristics of the survey instrument are related to potential biasing factors remains largely unknown. Knowledge about the method effects of survey instruments is crucial to improving the measurement quality of the intended constructs.

In this contribution, we therefore suggest a third way of dealing with bias in self-reported measures of job stress and that is to build assessments in a way that they are more robust against potential biases. We aim to contribute to the theory about what drives measurement variance by experimentally investigating how two main characteristics of a job stressor measurement, that is, item wording and response format, interact with employees’ dispositional characteristics, that is, neuroticism and affectivity, as potential biasing factors. The results of our study advance the field of knowledge on how questionnaire design can buffer or reinforce bias in the assessment of self-reported job stress. This information is crucial for researchers who are designing measurement instruments as well as for practitioners, particularly in occupational safety and health management, who need to decide between existing measurement instruments.

Measurement Bias and Questionnaire Design

The measure-centric approach to method variance (Spector et al., 2019) is a recent re-conceptualization of method variance, suggesting that bias is not due to the measurement per se, but rather each measured construct can be affected by a dedicated set of biasing factors that is unique to the measured construct and that determines the extent to which an observed correlation is inflated or deflated. In testing the assumptions made by the measure-centric approach to method variance, Spector et al. (2023) showed that five different potential biasing factors (hostile attribution bias, negative affectivity, mood, neutral object satisfaction, and social desirability) show differential associations with three job stress measures. The actual mechanism by which certain attributes of job stress assessments are susceptible to dedicated biasing factors is however still an open research question. In this contribution, we therefore discuss two attributes of job stress measures, that is, item wording and response format, and their relationship with biasing factors.

In their seminal work, Campbell and Fiske (1959) discussed the item format of measurement instruments as a source of common method variance. A common shortcoming of self-report measures of stress and strain is their overlap in item wordings: Items intended to operationalize stress frequently incorporate the appraisal of job characteristics as stressful (Semmer et al., 2004). The common variance components of stress and strain are thus not a result of a generalized biasing effect caused by self-report measures, but can rather be explained by variance caused by the theoretical construct, which was introduced into both measurements simultaneously. To avoid this confounding, a common suggestion is to operationalize job stress with items that are condition-related and non-evaluative of the situation (Rau, 2010; Semmer et al., 2004). In a pilot study, Lang et al. (2020) showed that correlations between neuroticism and job stress increased when using person-related rather than task-related item wording. The authors assumed that including keywords like “I” or “my” in the assessment would establish a reference to the working conditions of the person making the assessment. The extent to which task-related versus person-related item wordings are more susceptible to bias should therefore be reflected in the correlation of the respective job stress measure with personality characteristics. In replicating and extending the finding from this pilot study, we therefore hypothesize that correlations of job stress with neuroticism (Hypothesis 1a, H1a) as well as with negative affectivity (Hypothesis 1b, H1b) and with positive affectivity (Hypothesis 1c, H1c) will be higher if job stress is surveyed with person-related rather than with task-related items.

In terms of response formats, it is assumed that agreement formats are particularly prone to response patterns, such as the tendency to extreme response categories or the tendency to agree with items (Baumgartner & Steenkamp, 2001; Weijters et al., 2010). While studies on the relationship between organizational citizenship behavior and counterproductive work behavior found increased correlations between the two constructs when agreement rather than frequency response formats were used (Dalal, 2005; Spector et al., 2010), Spector and Nixon (2019) did not find substantial effects of response format on common method variance of stress and strain measures. In addition, Marfeo et al. (2014) suggested that frequency response formats perform better in assessing specific behaviors or situations, whereas agreement response formats better suit assessments of subjective attitudes and affect as well as feelings of work-related behavioral health functioning. In particular, this last finding suggests that agreement response formats are more appropriate for eliciting strain, while more descriptive statements about the frequency of certain events at work are better suited to measuring stress. We therefore hypothesize that correlations of job stress with neuroticism (Hypothesis 2a, H2a) as well as with negative affectivity (Hypothesis 2b, H2b) and with positive affectivity (Hypothesis 2c, H2c) will be higher if job stress is surveyed with agreement rather than with frequency response formats.

Methods

Experimental Procedure

We used a randomized split ballot design to assign participants to one of four experimental conditions that differed on two properties of the job stress measure, that is, item wording and response format: Condition 1 used the original job stress measure with task-related item wording, for example, “Within the activity regular recovery breaks are taken” and a 4-point frequency response format ranging from 1 = at no time or some of the time to 4 = most or all of the time. For Condition 2, we reworded the items of the original job stress measure to be person-related instead of task-related, for example, “Within my activity I can take regular recovery breaks” while maintaining the frequency response format. Condition 3 used task-related item wording but changed the response options to an agreement format ranging from 1 = I strongly agree to 4 = I strongly disagree. Condition 4 used person-related item wording and agreement response format. Response options were displayed to participants without numerical coding. Item wordings and response options of all experimental conditions are provided in the supplementary material (Pauli & Lang, 2023). Each experimental condition was open for participation until the completion of 400 responses. Participants in one experimental condition were excluded from participation in further experimental conditions.

Sample Description

Participants were recruited via Amazon Mechanical Turk (MTurk; Buhrmester et al., 2011). We used Cloud Research to target 1,600 MTurk workers, which resulted in a return of 1,660 participants due to oversampling (see supplementary material 14 for the sample size calculation; Pauli & Lang, 2023). Participation was open only to MTurk workers resident in the United States of America and currently employed. A total of 151 participants were removed due to insufficient effort responding (see supplementary material 14 for dropout reasoning and analysis; Pauli & Lang, 2023), which led to a final sample of N = 1,509 participants with a mean age of 40.9 years (SD = 11.0). Of these, 51.1% were female; 62.5% worked in the private sector, 27.9% in the public sector, and 9.2% in the nonprofit sector. Average weekly working time was 39.6 hr (SD = 8.26); 88.9% were employed full-time and 9.1% part-time. Participants removed from the final sample did not show considerable differences in these characteristics.

Measures

Job Stressors were measured using the English version of a 33-item questionnaire (Kuczynski et al., 2020) that assessed job characteristics with eight subscales related to the work environment, social relations with colleagues as well as with supervisors, work intensity, task clarity, work continuity, decision latitude, and emotional challenges. The response format is a 4-point frequency scale ranging from 1 = at no time or some of the time to 4 = most or all of the time with labels not only for the endpoints but for all response categories. Job characteristics were coded in a way that higher values indicate more stressful working conditions.

Neuroticism was measured with the eight items of the neuroticism subscale of the Big Five Inventory (John & Srivastava, 1999). Participants indicated on a 5-point rating scale (1 = disagree strongly to 5 = agree strongly, labels for all response categories) the extent to which they agreed with various statements. An example item is: “I see myself as someone who gets nervous easily”. In the present sample, reliability was high (α = .91, ω = .92).

Positive and Negative Affectivity were measured using the Positive and Negative Affect Schedule (PANAS; Watson et al., 1987). Participants indicated on a 5-point intensity scale (1 = very slightly or not at all to 5 = extremely, labels for all response categories) the extent to which certain feelings and emotions generally apply to them. PANAS measures positive affectivity and negative affectivity with 10 items each. Sample items for positive affectivity are “attentive,” “inspired,” or “proud”. Sample items for negative affectivity are “distressed,” “irritable,” or “afraid”. In the present sample, reliability was high for both positive affectivity (α = .93, ω = .93) and negative affectivity (α = .92, ω = .92).

Analytic Strategy

We calculated Pearson correlations of neuroticism, negative affectivity, and positive affectivity with a mean index of all 33 stressor items across the experimental conditions as well as with the eight job stressor subscales. We used Benjamini-Hochberg corrected z-tests to identify significant differences between these correlations from independent samples and linear regression modeling to investigate the moderation effects of questionnaire design on the relationship between job stress and personality characteristics. Marginal means and simple slopes were estimated with emmeans version 1.8.5 (Lenth, 2023) in R version 4.2.3 (R Core Team, 2021).
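The comparison of correlations from independent samples and the multiple-testing correction described above can be sketched as follows. This is a generic Python illustration, not the authors' code (the study used R); the function names and the group size of 755 per wording condition are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def fisher_z_diff(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two Pearson
    correlations from independent samples (Fisher r-to-z transform)."""
    z1, z2 = np.arctanh(r1), np.arctanh(r2)
    se = np.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    return z, 2 * stats.norm.sf(abs(z))

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (step-up FDR control)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    adj = np.empty(m)
    running_min = 1.0
    for k in range(m - 1, -1, -1):  # walk from largest to smallest p
        i = order[k]
        running_min = min(running_min, p[i] * m / (k + 1))
        adj[i] = running_min
    return adj

# Example: compare r = .28 vs. r = .41 with ~755 participants per group
z, p = fisher_z_diff(0.28, 755, 0.41, 755)
```

The step-up loop in `benjamini_hochberg` enforces monotone adjusted p-values, so the smallest raw p-value can never receive a larger adjusted value than a bigger one.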

Results

Effects of Item Wording on Self-Reported Job Stress – Testing of Hypothesis 1

Across all participants, the mean for neuroticism was 2.5 (SD = 1.01), for negative affectivity 1.54 (SD = 0.65), and for positive affectivity 3.3 (SD = 0.85). Table 1 shows the correlations of different job stressors with neuroticism, negative affectivity, and positive affectivity separated by task-related and person-related item wording.

Table 1 Comparing Pearson correlations of job stressors and biasing factors across different item wordings

The first two rows of Table 1 compare the mean index of all 33 items of the job stress measure for task-related and person-related item wording. With task-related item wording, the correlation between job stress and neuroticism (r = .28, p < .001) is smaller than with person-related item wording (r = .41, p < .001); this difference between the two sample correlation coefficients is statistically significant (z = 2.03, p = .02) after applying Benjamini-Hochberg correction for multiple testing. The same pattern is found in the correlations of job stress with negative affectivity (z = 3.39, p < .001), but not with positive affectivity (z = 0.15, p = .44). In addition, linear regression analyses with neuroticism or negative affectivity predicting job stress indicate that more variance in job stress is explained when using person-related rather than task-related item wordings (ΔR2 = .09 for neuroticism; ΔR2 = .17 for negative affectivity; see supplementary material 5; Pauli & Lang, 2023). Thus, Hypotheses 1a and 1b are supported, whereas Hypothesis 1c is not supported. In the following section, we provide in-depth analyses of the structure of the relationships between item wording, job stressors, and biasing factors.

For all eight job stressor characteristics, the correlation between job stressors and neuroticism is higher when job stressors were measured with person-related item wording. For five of these eight job stressor characteristics, this difference is statistically significant. Differences were nonsignificant for the work environment, work continuity, and emotional challenges. The same pattern is found for the correlations of the job stressor characteristics with negative affectivity. When decision latitude is measured with person-related item wording, the correlation with positive affectivity is significantly higher compared to task-related measurement. For the remaining seven of the eight subscales, there is no significant difference in correlations with item wording and positive affectivity.

We built separate moderation models for neuroticism, negative affectivity, and positive affectivity in which we first entered the potential biasing factor (Step 1), followed by the experimental condition, that is, a dichotomous variable coded 0 = task-related and 1 = person-related item wording (Step 2), and the interaction term of the experimental condition and the biasing factor (Step 3).
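The structure of this stepwise moderation model can be illustrated with simulated data. This is a minimal sketch; the data-generating coefficients below are hypothetical and chosen only to mirror the model form, not the study's estimates.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
neuroticism = rng.normal(2.5, 1.0, n)          # biasing factor (Step 1)
wording = rng.integers(0, 2, n).astype(float)  # 0 = task-, 1 = person-related (Step 2)
interaction = neuroticism * wording            # moderation term (Step 3)

# Hypothetical data-generating model with a wording x neuroticism interaction
stress = (1.4 + 0.10 * neuroticism - 0.10 * wording
          + 0.06 * interaction + rng.normal(0, 0.3, n))

# Step 3 model: job stress ~ biasing factor + condition + interaction
X = np.column_stack([np.ones(n), neuroticism, wording, interaction])
beta, *_ = np.linalg.lstsq(X, stress, rcond=None)
b0, b_neuro, b_wording, b_inter = beta  # approximately recovers 1.4, 0.10, -0.10, 0.06
```

A significant `b_inter` would indicate that the slope of the biasing factor on job stress differs between the two wording conditions, which is the moderation effect tested here.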

Table 2 shows a significant interaction between neuroticism and item wording on job stress (β = 0.23, p = .02) as well as a significant interaction between negative affectivity and item wording on job stress (β = 0.27, p = .01). The interaction of positive affectivity and item wording does not have a significant effect on job stress (β = 0.10, p = .50). The estimated job stress for individuals assessed with task-related item wording and high neuroticism is 1.80 (95% CI [1.74, 1.85]). For individuals with task-related item wording and low neuroticism, the estimated job stress is 1.59 (95% CI [1.54, 1.65]). This difference is larger for individuals assessed with person-related item wording: For individuals high in neuroticism, the estimated job stress is 1.84 (95% CI [1.79, 1.89]); for individuals assessed with person-related item wording and low neuroticism, the estimated job stress is 1.51 (95% CI [1.45, 1.56]). The same pattern is shown for the estimated marginal means of job stressors at different levels of negative affectivity across experimental conditions, whereas estimated marginal means of job stressors at different levels of positive affectivity do not substantially differ across experimental conditions (see supplementary material 1; Pauli & Lang, 2023).

Table 2 Results from hierarchical regression analyses on the moderation effect of item wording on the relationship between job stress and biasing factors

Simple slope analyses (see supplementary material 2; Pauli & Lang, 2023) indicate that for individuals assessed with task-related item wording, each unit increase in neuroticism is associated with a 0.096 unit increase in job stress (b = 0.096, 95% CI [0.060, 0.132]). For individuals assessed with person-related item wording, each unit increase in neuroticism is associated with a 0.157 unit increase in job stress (b = 0.157, 95% CI [0.124, 0.190]). Thus, the slope of neuroticism on job stress for individuals assessed with task-related item wording is significantly smaller than for person-related item wording (difference = −0.061, p = .02); in other words, neuroticism affects self-reports of job stress especially when these are worded with personal reference compared to when they are worded with reference to the working conditions. The same pattern is shown for negative affectivity, where the slope of negative affectivity on job stress for task-related item wording is significantly smaller than for person-related item wording (difference = −0.113, p = .01). The difference in slopes of positive affectivity on job stress for task-related item wording and person-related item wording is not significant (difference = −0.020, p = .51). The plot of the interaction shows that the relationship between job stressors and neuroticism (Figure 1) as well as negative affectivity (Figure 2) depends on item wording.
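The simple slopes reported above follow directly from the coefficients of the interaction model; the short arithmetic check below uses the study's unstandardized estimates (wording coded 0 = task-related, 1 = person-related).

```python
# Unstandardized coefficients reported in the study
b_neuroticism = 0.096  # slope of neuroticism when wording = 0 (task-related)
b_interaction = 0.061  # change in that slope when wording = 1 (person-related)

slope_task = b_neuroticism                    # 0.096
slope_person = b_neuroticism + b_interaction  # 0.157
```

The reported slope difference of −0.061 (task minus person) is simply the negative of the interaction coefficient.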

Figure 1 Moderating effect of item wording on the relationship between job stressors and neuroticism. Vertical bars indicate point-wise 95% confidence intervals for the mean.
Figure 2 Moderating effect of item wording on the relationship between job stressors and negative affectivity. Vertical bars indicate point-wise 95% confidence intervals for the mean.

Effects of Response Format on Self-Reported Job Stress – Testing of Hypothesis 2

Table 3 shows the correlations of different job stressors with neuroticism, negative affectivity, and positive affectivity separated by frequency and agreement response formats and row-wise z-test comparisons of correlation coefficients across experimental conditions.

Table 3 Comparing Pearson correlations of job stressors and biasing factors across different response formats

With the frequency response format, the correlation between job stressors and neuroticism (r = .28, p < .001) is slightly smaller than with the agreement response format (r = .32, p < .001); however, this difference between the two sample correlation coefficients is not statistically significant (z = 0.60, p = .27) after applying Benjamini-Hochberg correction for multiple testing. The same pattern is found in the correlations of job stressors with negative affectivity (z = 0.31, p = .38). Correlations of job stressors with positive affectivity are slightly higher with the frequency response format (r = −.29, p < .001) compared to the agreement response format (r = −.27, p < .001). However, this difference is not significant (z = 0.03, p = .38). According to linear regression analysis (results not shown), negligibly more variance in job stress is explained when using agreement rather than frequency response formats (ΔR2 = 0.02 for neuroticism; ΔR2 = 0.01 for negative affectivity). There is thus no clear, statistically validated pattern for the effect of the response format on the relationship between job stressors and biasing factors, which is why Hypotheses 2a, 2b, and 2c are not supported. In the following section, we provide in-depth analyses of the structure of the relationships between response format, job stressors, and biasing factors.

According to the results from hierarchical linear regression analyses in Table 4, both response format and the potential biasing factors, that is, neuroticism, negative affectivity, and positive affectivity, explain significant variation in job stressors (Step 2). The moderation effect of response format on the relationship between job stressors and neuroticism is small and significant only at a more liberal α-level (β = 0.17, p = .10), and is not significant for negative affectivity (β = 0.06, p = .53) or positive affectivity (β = −0.07, p = .62). Considering different levels of neuroticism, the difference in estimated marginal means of job stress measured with the frequency response format (1.80 − 1.61 = 0.19) is slightly smaller than the difference in estimated marginal means of job stress measured with the agreement response format (2.01 − 1.73 = 0.28). The same pattern, albeit less pronounced, is visible for job stressors at different levels of negative affectivity and for positive affectivity, where the difference in estimated marginal means of job stress measured with the frequency response format (1.79 − 1.60 = 0.19) is slightly smaller than the difference in estimated marginal means of job stress measured with the agreement response format (2.00 − 1.78 = 0.22) (see supplementary material 3; Pauli & Lang, 2023).

Table 4 Results from hierarchical regression analyses on the moderation effect of response format on the relationship between job stress and biasing factors

The slope of neuroticism on job stress for the frequency-response format is significantly smaller than for the agreement-response format (difference = −0.045, p = .10), considering a more liberal α-level of .10, that is, individuals high in neuroticism report more job stress when assessed with agreement rather than frequency response format (see Figure 3).

Figure 3 Moderating effect of Response Format on the relationship between job stressors and neuroticism. Vertical bars indicate point-wise 95% confidence intervals for the mean.

Neither the marginal means nor the simple slope tables support an interaction of negative or positive affectivity with response format (see supplementary material 4; Pauli & Lang, 2023). However, participants in the agreement response format group report more job stress than participants in the frequency response format group, independent of personality characteristics.

Additional analyses of the interaction effects of the questionnaire and dispositional characteristics with all eight stressor subscales are provided in supplementary materials 6–13 (Pauli & Lang, 2023).

Discussion

Self-report measures of job stress are suspected of being prone to bias by the characteristics of the person making the assessment. According to the measure-centric approach to method variance (Spector et al., 2019) measurements of theoretical constructs are affected by unique sets of biasing factors, while it is still unclear how bias is related to measurement characteristics. The present study was designed to investigate what drives measurement variance in the assessment of job stress and to what extent characteristics of a job stress measure are susceptible to bias by characteristics of the person making the assessment. Since employee self-reports are a common foundation for psychosocial risk assessments, providing measurements that are as unbiased as possible is a prerequisite for effective occupational health and safety measures.

In a randomized split ballot experiment, we varied item wording and response format of a self-report job stress measure and compared correlations of job stress with neuroticism, negative and positive affectivity across experimental conditions. In showing that individuals high in neuroticism report more job stress when assessed with person-related rather than task-related item wordings, this study replicates findings from a pilot study by Lang et al. (2020) and extends prior knowledge on survey design effects by showing that individuals high in negative affectivity also report more job stress when assessed with person-related rather than task-related item wordings. Moreover, both neuroticism and negative affectivity explain more variance in job stress when assessed with person-related rather than task-related item wordings. Even minor changes in item wordings – “The activity requires …” versus “My activity requires me to …” – lead to significant differences in the measurement properties of the survey instruments. The strength of the bias, however, varies across different types of job stressors: While moderate to strong correlations are found for the mean index of all job stressors, correlations with individual job stress subscales are lower. Notably, this finding was language invariant, that is, reproduced in two independent study populations and across different language versions of the survey instrument, as Lang et al. (2020) used a sample from Germany whereas this study used an American sample. In both studies, the relevance of personality characteristics for the outcome of the self-report measurement was increased when job stressors were assessed with person-related rather than task-related item wordings.
Our interpretation of these findings is that the use of person-related keywords such as “I” and “my” introduces variance components into the job stress assessment which provide shared variance among the assessment of job stressors as well as personality characteristics. This stimulates statements about both employees’ working conditions as well as statements about themselves, and ultimately leads to a confounding of stress and strain assessments.

Contrary to expectations, there were only minor or no differences in the associations of job stress and personality traits across response formats. Individuals high in neuroticism reported more job stress when assessed with agreement rather than frequency response formats. However, this effect was comparatively small, significant only at a more liberal α-level of .10, and absent for the interactions of job stress with negative affectivity and positive affectivity. Notably, the intercepts of the relationship between job stress and neuroticism as well as negative and positive affectivity differed marginally across response formats, that is, job stress was slightly higher in the agreement response format condition. Given that objective work analyses provide more conservative estimates of job stress than subjective self-reports (Semmer et al., 1999), this finding may be interpreted to mean that frequency response formats better suit assessments of objective working conditions than of self-reported subjective experience.

Although item wording and response format were randomized in the study, the biasing factors, that is, personality characteristics, could not be randomized. Thus, one cannot be sure that the interaction is causal in the sense that some hypothetical intervention on the biasing factors (e.g., reducing negative affect) would actually alter the effects of item wording and response format on reported job stress. The authors thank the anonymous reviewer for pointing out this important aspect. Still, the correlation of a third variable (e.g., personality characteristics) with the magnitude of the causal effect of another variable (e.g., survey characteristics) is important information (Rohrer & Arslan, 2021). To this end, our findings are of special interest to (a) researchers developing measures of job stress who must decide which item wording and response format to use and (b) researchers who want to apply existing measures and must decide which one to choose. There are cases in which employees’ subjective interpretation of their work situation is of interest. Often, however, self-reported job stress is used to infer employees’ actual working conditions, for example, in the context of psychosocial risk assessment. In these cases, work characteristics should be described as objectively as possible. According to the results of the present study, assessments of job stress are correlated with characteristics of the person making the assessment especially when person-related item wordings and agreement response formats are used. To reduce subjective bias in the assessment of working conditions, task-related item wordings and frequency response formats should be preferred.

In comparing correlations of different response formats with personality characteristics, our study provides insights into what drives measurement variance beyond the findings of previous studies on the effect of response format on stressor-strain relationships (Dalal, 2005; Spector et al., 2010; Spector & Nixon, 2019). It has been suggested that biasing factors such as negative affectivity or mood, while probably influencing constructs related to emotions, might be irrelevant for non-affective constructs (Spector & Nixon, 2019). For example, job stressors related to affect, such as emotional challenges at work or social relations with colleagues and supervisors, should be more susceptible to bias than job stressors that focus on characteristics of the task, such as work intensity or characteristics of the physical work environment. However, our results do not provide evidence for this assumption. Neither the content of the specific work stressor nor whether specific job stressors were surveyed with agreement or frequency response formats accounts for significant differences in the correlation of the respective stressors with personality characteristics.

Interestingly, no significant effect of item wording or response format was found for the association of job stressors with positive affectivity. Prior studies showed that positive affectivity may well provide a means of coping with stressful situations (Schenk et al., 2018). Accordingly, positive affectivity may interact with stressor-strain relationships (Jex & Spector, 1996), but not with the appraisal of stressful situations. One explanation might be that positive affectivity is less sensitive than negative affectivity to being personally addressed (as in person-related item wordings), so that people high in positive affectivity express their disposition in their responses less than people high in negative affectivity. Nevertheless, positivity bias can also be a significant confounding variable in the assessment of job stress, for example, through self-deception about stressful events in order to maintain positive attitudes (Judge et al., 2000).

A limitation of our study is that differences in item wording and response format were examined only with respect to their correlations with biasing factors, not with respect to their impact on stressor-strain relationships. Given that the proportion of variance in job stress explained by personality characteristics varies across questionnaire designs, these variance components can be assumed to appear in measures of strain as well. Beyond the mere control of biasing factors in stressor-strain studies, experimental studies on the influence of different questionnaire designs on stressor-strain correlations can therefore contribute valuably to understanding the sources of method variance. In contrast to the multi-level designs prominent in job stress research, which disentangle within-group from between-group variance components, we were able to randomly assign participants to item wordings and response formats and thus minimize the effects of potentially confounding variables, which increases the likelihood that observed differences are due to our experimental manipulation of the independent variable. In addition, it was beyond the scope of this contribution to investigate whether job stressor assessments vary in their psychometric properties across experimental conditions. We suspect that common proportions of variance with personality characteristics are found at the level of the individual items of the respective job stressor subscales. A task for future research is therefore to investigate differences in the measurement quality of job stressor measures when controlling for potential biasing factors.

Conclusion

Affective dispositions can bias measurements of self-reported job stress. Survey design, that is, characteristics of the job stressor measure, can reinforce this effect. Self-report measures of job stress should therefore keep stressor assessments as condition-related and value-neutral as possible, especially when striving for objective assessments of working conditions. Our results suggest this can be achieved using survey instruments that apply task-related rather than person-related question wordings and potentially frequency instead of agreement response formats.

References

  • Baumgartner, H., & Steenkamp, J.-B. E. (2001). Response styles in marketing research: A cross-national investigation. Journal of Marketing Research, 38(2), 143–156. https://doi.org/10.1509/jmkr.38.2.143.18840

  • Buhrmester, M., Kwang, T., & Gosling, S. D. (2011). Amazon’s Mechanical Turk: A new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6(1), 3–5. https://doi.org/10.1177/1745691610393980

  • Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81–105. https://doi.org/10.1037/h0046016

  • Dalal, R. S. (2005). A meta-analysis of the relationship between organizational citizenship behavior and counterproductive work behavior. The Journal of Applied Psychology, 90(6), 1241–1255. https://doi.org/10.1037/0021-9010.90.6.1241

  • Jex, S. M., & Spector, P. E. (1996). The impact of negative affectivity on stressor-strain relations: A replication and extension. Work and Stress, 10(1), 36–45. https://doi.org/10.1080/02678379608256783

  • John, O. P., & Srivastava, S. (1999). The Big Five trait taxonomy: History, measurement, and theoretical perspectives. In L. A. Pervin & O. P. John (Eds.), Handbook of personality: Theory and research (2nd ed., pp. 102–113). The Guilford Press.

  • Judge, T. A., Erez, A., & Thoresen, C. J. (2000). Why negative affectivity (and self-deception) should be included in job stress research: Bathing the baby with the bath water. Journal of Organizational Behavior, 21(1), 101–111. https://doi.org/10.1002/(SICI)1099-1379(200002)21:1<101::AID-JOB966>3.0.CO;2-Q

  • Kuczynski, I., Mädler, M., Taibi, Y., & Lang, J. (2020). The assessment of psychosocial work conditions and their relationship to well-being: A multi-study report. International Journal of Environmental Research and Public Health, 17(5), Article 1654. https://doi.org/10.3390/ijerph17051654

  • Lang, J., Pauli, R., Lazic, A., & Kuczynski, I. (2020). Der Einfluss von Neurotizismus auf zwei Formulierungsvarianten einer psychischen Belastungsmessung: ein randomisiertes Split-Ballot-Experiment [The influence of neuroticism on two variants of a measure of mental stress: A randomized split-ballot experiment]. In Deutsche Gesellschaft für Arbeitsmedizin und Umweltmedizin (DGAUM) (Ed.), 60. Wissenschaftliche Jahrestagung der DGAUM: Conference proceedings (pp. 71–72). Gentner.

  • Lenth, R. V. (2023). emmeans: Estimated marginal means, aka least-squares means. R package version 1.8.5. https://CRAN.R-project.org/package=emmeans

  • Marfeo, E. E., Ni, P., Chan, L., Rasch, E. K., & Jette, A. M. (2014). Combining agreement and frequency rating scales to optimize psychometrics in measuring behavioral health functioning. Journal of Clinical Epidemiology, 67(7), 781–784. https://doi.org/10.1016/j.jclinepi.2014.02.005

  • Pauli, R., & Lang, J. (2023). Data and supplementary materials for “Survey design moderates negativity bias but not positivity bias in self-reported job stress. Results from a randomized split ballot experiment”. https://doi.org/10.17605/OSF.IO/FYG2R

  • R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/

  • Rau, R. (2010). Questioning or observation or both together? Which instruments should be used when psychic work load and strain have to be analyzed? Zentralblatt für Arbeitsmedizin, Arbeitsschutz und Ergonomie, 60(9), 294–301. https://doi.org/10.1007/BF03344299

  • Rohrer, J. M., & Arslan, R. C. (2021). Precise answers to vague questions: Issues with interactions. Advances in Methods and Practices in Psychological Science, 4(2), 1–19. https://doi.org/10.1177/25152459211007368

  • Schenk, H. M., Jeronimus, B. F., van der Krieke, L., Bos, E. H., de Jonge, P., & Rosmalen, J. G. M. (2018). Associations of positive affect and negative affect with allostatic load: A lifelines cohort study. Psychosomatic Medicine, 80(2), 160–166. https://doi.org/10.1097/PSY.0000000000000546

  • Semmer, N. K., Grebner, S., & Elfering, A. (2004). Beyond self report: Using observational physiological and situation based measures in research on occupational stress. In P. L. Perrewe & D. C. Ganster (Eds.), Research in occupational stress and well being: Vol. 3. Emotional and physiological processes and positive intervention strategies (1st ed., pp. 205–263). JAI.

  • Semmer, N. K., Zapf, D., & Dunckel, H. (1999). Instrument zur streßbezogenen Tätigkeitsanalyse ISTA [An instrument for stress-related activity analyses]. In H. Dunckel (Ed.), Handbuch psychologischer Arbeitsanalyseverfahren. Mensch, Technik, Organisation, 14 (pp. 179–204). Vdf Hochschulverlag.

  • Spector, P. E., Bauer, J. A., & Fox, S. (2010). Measurement artifacts in the assessment of counterproductive work behavior and organizational citizenship behavior: Do we know what we think we know? Journal of Applied Psychology, 95(4), 781–790. https://doi.org/10.1037/a0019477

  • Spector, P. E., & Brannick, M. (1995). The nature and effects of method variance in organizational research. In C. L. Cooper & I. T. Robertson (Eds.), International review of industrial and organizational psychology (pp. 249–274). Wiley.

  • Spector, P. E., Gray, C. E., & Rosen, C. C. (2022). Are biasing factors idiosyncratic to measures? A comparison of interpersonal conflict, organizational constraints, and workload. Journal of Business and Psychology, 38, 983–1002. https://doi.org/10.1007/s10869-022-09838-8

  • Spector, P. E., & Nixon, A. E. (2019). How often do I agree: An experimental test of item format method variance in stress measures. Occupational Health Science, 3(2), 125–143. https://doi.org/10.1007/s41542-019-00039-z

  • Spector, P. E., Rosen, C. C., Richardson, H. A., Williams, L. J., & Johnson, R. E. (2019). A new perspective on method variance: A measure-centric approach. Journal of Management, 45(3), 855–880. https://doi.org/10.1177/0149206316687295

  • Spector, P. E., Zapf, D., Chen, P. Y., & Frese, M. (2000). Why negative affectivity should not be controlled in job stress research: Don’t throw out the baby with the bath water. Journal of Organizational Behavior, 21(1), 79–95. https://doi.org/10.1002/(SICI)1099-1379(200002)21:1<79::AID-JOB964>3.0.CO;2-G

  • Watson, D., Pennebaker, J. W., & Folger, R. (1987). Beyond negative affectivity. Journal of Organizational Behavior Management, 8(2), 141–158. https://doi.org/10.1300/J075v08n02_09

  • Weijters, B., Geuens, M., & Schillewaert, N. (2010). The individual consistency of acquiescence and extreme response style in self-report questionnaires. Applied Psychological Measurement, 34(2), 105–121. https://doi.org/10.1177/0146621609338593