Open Access · Original Article

Introducing the scr-OLBI

Examination of a Short Measure for Assessing Burnout Using Item Response Theory

Published Online: https://doi.org/10.1027/2698-1866/a000029

Abstract

The Greek version of the Oldenburg Burnout Inventory (OLBI), used to assess exhaustion and disengagement from work, was analyzed using item response theory to investigate the dimensionality and the psychometric properties of the measure’s items. The OLBI was administered to 617 Greek employees, of whom 314 also participated in the validation study. The results indicated that four negatively keyed items from the original measure exhibited excellent psychometric properties (item/test information functions), and these items were used to construct a shorter version of the OLBI. The scr-OLBI, whose prefix is formed from the first letters of the word “screening,” was tested for differential item functioning between male and female employees; no bias was detected in relation to gender. Our results indicate that the scr-OLBI is a reliable and valid indicator of work-related burnout, which appears to be functionally equivalent to the original version both theoretically and empirically, yet offers the advantages of a short measure.

Occupational burnout has received increasing attention worldwide (Demerouti & Bakker, 2008). Burnout is characterized by two dimensions: exhaustion and disengagement from work. Exhaustion refers to the consequences of intense physical, affective, and cognitive effort at work, resulting from prolonged exposure to high job demands. Exhausted people feel excessively stressed and unwell at work. Similarly, disengagement refers to individuals’ negative attitudes toward work goals, their unwillingness to maintain their position, and feelings of depersonalization. Disengaged individuals feel less challenged by their work and are more prone to resign (Demerouti et al., 2005).

Research distinguishes between clinical and subclinical burnout (Demerouti et al., 2005; Schaufeli et al., 2001). The clinical facet refers to severe impairment due to personal distress, diminished performance, severe symptoms, and inability to perform at work (e.g., Oosterholt et al., 2014). In contrast, the subclinical facet refers to a mild form of unwell-being that does not prevent individuals from working but rather challenges them during work hours. One measure developed for assessing the subclinical facet of burnout is the Oldenburg Burnout Inventory (OLBI; Demerouti et al., 2003). The OLBI measures work-related feelings of exhaustion and disengagement from work. It is a freely available, self-report instrument with 16 items, including eight positively and eight negatively worded items (Demerouti et al., 2010). The OLBI burnout score is obtained by summing the scores for each subscale, with higher scores indicating more intense work-related burnout. The OLBI has been adapted in several countries, including the United States (e.g., Halbesleben & Demerouti, 2005), Portugal, Brazil (e.g., Campos et al., 2012), the Netherlands (e.g., Demerouti & Bakker, 2008), Malaysia (e.g., Subburaj & Vijayadurai, 2016), and Greece (e.g., Reis et al., 2015). Previous research focused exclusively on factor analysis techniques and yielded divergent results regarding factor structure and item characteristics. Halbesleben and Demerouti (2005) were the first to raise concerns about the measure’s dynamics, focusing on the items’ characteristics. A major concern is the low factor loadings of certain items and the possible creation of artificial factors based on wording and/or function, which may be a source of common method bias.

Methodological Considerations

Demerouti and Bakker (2008) administered the OLBI to Dutch employees and found significant factor cross-loadings on four items. The researchers suggested that the content of these items disturbed the nomological network of burnout. Deficiencies regarding the construct validity of the measure are also reported by Campos et al. (2012) due to the factor cross-loadings of certain items. The proposed 2-factor model showed reasonable fit, but a modified version with certain items deleted exhibited a much better fit to their data. Reis et al. (2015) examined the factorial structure and measurement invariance of the OLBI across German and Greek students. The researchers used a modified version of the OLBI (focusing on academic burnout) to assess the construct’s equivalence. The results confirmed the proposed 2-factor model for the Greek version. However, discrepancies were found for the German disengagement subscale due to context-specific issues that need further investigation. Mahadi et al. (2018) examined the factor structure of the OLBI in the Malay population and found unacceptable fit for both the proposed 2-factor and 1-factor models. The researchers identified items with poor factor loadings on their proposed construct and applied a stepwise removal approach, deleting seven items. Their modified version of the OLBI included nine items and showed an acceptable and improved fit to the data compared to the original measure. Another shorter version was examined on an independent sample (Mészáros et al., 2020); this modified version comprised five items on each factor (both positively and negatively keyed). Factor analysis of this modified version of the OLBI brought certain discrepancies to light, underscoring the need for more elaborate techniques.

Discrepancies in the factor structure of the OLBI may be due to the measure’s theoretical background. As suggested by Demerouti and Bakker, the OLBI was originally constructed to measure “burnout and its hypothetical opposite state of work engagement” (2008, p. 75). According to the authors, the subscales of exhaustion and disengagement include items that refer to their opposite dimensions, namely vigor and dedication at work; their score is assessed by summing responses to the positively keyed items without recoding them. Research on work engagement has confirmed its nomological distinction from burnout, and new measures of work engagement are now available (e.g., Schaufeli et al., 2001).

Standardized questionnaires with multiple items represent the state of the art in psychometrics, as they are valid and reliable measures for assessing the construct under consideration (e.g., Kruyen et al., 2012). However, in surveys and research, long tests can be time-consuming, can increase respondent fatigue, and can frustrate individuals with attention deficits, resulting in potentially unreliable or missing data (Greer & Liu, 2016). In contrast, short instruments can be considered more specific and offer certain advantages for research purposes: They are quick and inexpensive, provide direct and valid estimates of a trait, and are appropriate for participants with limited attention capabilities. While a short instrument is advantageous in terms of cost, time, and applicability, there are also challenges in its construction. Therefore, examination of the item properties of the OLBI is considered imperative.

Research Objectives

This study adds to the existing literature and has two objectives. The first objective is to provide a shorter tool for measuring occupational burnout, developed with a more elaborate technique, namely item response theory (IRT). IRT provides item/test information curves and category thresholds that are fundamental in scrutinizing the measure’s items. To our knowledge, no research has applied IRT to examine the full spectrum of the OLBI. Gustavsson et al. (2010) used IRT to analyze only the positively keyed items (eight items), with data from Swedish nurses. This study builds on the findings of Gustavsson and colleagues and proposes a shorter, concise tool for assessing subclinical burnout. This shorter measure is named scr-OLBI, with the prefix formed from the first letters of the word “screening,” and is intended to be a valuable tool for research purposes. Our second objective is to demonstrate the validity and internal reliability of the instrument by examining its relationship to gender, education, and leadership practices with a focus on ethical behaviors and attitudes.

IRT Analysis

IRT is widely applied to evaluate item parameters of unidimensional constructs (Greer & Liu, 2016). There are several types of IRT models for assessing factor structure and item properties, all of which represent the nonlinear relation between individuals’ trait level and their probability of choosing a certain response category for each item (Embretson & Reise, 2000). We used two models in our analysis: the Rasch rating scale model (RRSM; Andrich, 1978), a polytomous extension of the dichotomous Rasch model, and the graded response model (GRM; Samejima, 1997). The RRSM is a latent structure model for polytomous responses to a set of test items. Under the RRSM, each item i is described only by threshold difficulty parameters (βij), where j indexes the thresholds between response categories, so that each item has one threshold fewer than it has response options. Each βij specifies a response category difficulty and is interpreted as the trait value (θ) associated with the probability of responding in or above threshold j. The GRM is an extension of the 2-parameter logistic model; each item is described by the same kind of threshold difficulty parameters (βij) and, in addition, by a discrimination parameter (α). The α parameter expresses the relationship between the observed item response and the unobserved latent trait and characterizes the slope of the item’s characteristic curve. It reports the degree to which item response categories distinguish among trait levels; it is constant for all thresholds within an item yet varies across items within the same measure. The GRM was chosen over alternative polytomous models (e.g., the partial credit model) because it is considered a more natural model for rating scales and provides the best fit to many measures examined (e.g., Khorramdel & von Davier, 2016). Further details about these models can be found in Embretson and Reise (2000). Building on previous studies, these procedures are used to check the item properties of the OLBI and to assist in the construction of a new, shorter instrument.
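To make the GRM description above more concrete, the following minimal sketch computes category response probabilities for a single item under the logistic form of the model. It is an illustration only and not the authors’ code (their analyses were run in R); the function name and the example parameter values are hypothetical and are not taken from the OLBI results.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a logistic graded response model.

    theta: latent trait value
    a: item discrimination
    thresholds: ordered threshold difficulties (one fewer than the response options)
    """
    # Cumulative probability of responding in or above each threshold.
    cum = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    # Pad with 1 (below the first threshold) and 0 (above the last one).
    bounds = [1.0] + cum + [0.0]
    # Each category probability is the difference between adjacent cumulative curves.
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]

# Hypothetical 4-option item (values chosen for illustration only).
probs = grm_category_probs(theta=0.5, a=2.0, thresholds=[-1.0, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

Setting the discrimination to the same value for every item mimics the equal-slope restriction of a Rasch-type model, which is the contrast between the two models described above.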

Burnout and Ethical Leadership

Burnout is associated with various occupational behaviors and outcomes. Individuals exhibiting burnout symptoms show deterioration in their job performance, self-esteem, and social interaction and exhibit higher levels of stress and distorted conceptions of self and others, such as their leaders (e.g., Demerouti et al., 2003). Leadership is an important determinant of employees’ work-related health outcomes, including work well-being and job satisfaction. Consequently, research on leadership practices and employees’ well-being is considered critical.

The literature has shown that leaders who are perceived as honorable, honest, and true to their occupation and who express a genuine concern for their employees are likely to influence employees’ burnout (e.g., Mitropoulou et al., 2019). Ethical leaders are characterized by their ethical practices and by the promotion of these practices to employees through role modeling and observational learning. They show genuine concern about multiple stakeholders, community welfare, and sustainability issues. They are morally committed to ethical standards, care about the consequences of their actions, and promote work strategies that are fully in line with their ethical values. Ethical leaders act as role models and conduct their occupational and personal life with integrity, honesty, sincerity, trustworthiness, and humility.

Ethical leadership, measured by a general ethical leadership factor, is found to have a mediating effect on employees’ burnout (e.g., Demerouti et al., 2003; Mitropoulou et al., 2019). When employees discern deception, unfairness, and dishonesty, they are more likely to exhibit feelings of exhaustion and depersonalization from work. Conversely, leaders who sustain equal opportunities for advancement, support employees, communicate about ethics, and clarify responsibilities, expectations, and performance goals are more likely to mitigate burnout symptoms (e.g., Kalshoven et al., 2011). Nonetheless, the concept of ethical leadership has proven to be more complicated than originally expected. Ethical leadership is considered a multifaceted construct including both personality traits and behavioral patterns. Therefore, to further examine the association of burnout and ethical leadership, we employ a more elaborated, multidimensional model of ethical leadership, which includes seven dimensions (Kalshoven et al., 2011). We predict that burnout will exhibit a negative association with all ethical leadership dimensions.

Finally, to demonstrate divergent validity, we examined certain demographic variables considered unrelated to burnout, such as gender, education, and organizational tenure (e.g., Demerouti & Bakker, 2008). In general, employees tend to feel disengaged and exhausted from their work, regardless of their age, education, and tenure. Therefore, we expect to find nonsignificant correlations among burnout, demographic variables, and organizational tenure.

Methods

Participants and Procedure

In total, 617 Greek employees participated in this study, with a mean age of 47.16 years (SD = 9.01, age range 25–72 years); 66% held a university degree and 49.1% were female. Participants were employed in three different organizations (manufacturing, public services, and sales). From our total sample, 314 employees were also asked to complete an ethical leadership instrument. Data were collected between July 2013 and March 2015 as part of a larger research project and are publicly available (Mitropoulou, 2022a). Snowball and cluster sampling procedures were used for data collection, including organizations from various industries (e.g., sales, manufacturing, and public services). For the cluster sampling process, medium/large organizations located in Greece were contacted via e-mail by the first researcher through the website https://www.greatplacetowork.gr (n = 40). Three organizations agreed to participate (response rate 7.5%). The first is a public service organization, while the other two are private organizations dealing with manufacturing and marketing; all three organizations employed more than 100 individuals and consisted of several branches. Organizations received a link to the study’s measures and were asked to distribute it to their employees. Employees’ participation was voluntary; participants were informed about the research purpose before completing the questionnaire and received an electronic or paper-and-pencil copy of it. Because the sampling procedures were used in combination, the response rate among employees could not be determined. Ethical approval was obtained from the ethics committee of the University of Crete.

Measures

Burnout was measured with the Greek version of the OLBI (Demerouti et al., 2010), which consists of 16 items: Eight items assess exhaustion and the other eight assess disengagement from work. Each item is rated by the degree of agreement regarding occupational burnout, with responses ranging from 1 (strongly agree) to 4 (strongly disagree). Internal consistency reliability is α = .78 for the exhaustion subscale and α = .68 for the disengagement subscale.

Table 1 Descriptives, reliability, and validity indices for the scr-OLBI

Ethical leadership was measured with the self-report 38-item Ethical Leadership at Work questionnaire (ELW; Kalshoven et al., 2011). Participants rated their agreement regarding their direct leader’s ethical behaviors and attitudes, with responses ranging from 1 (strongly disagree) to 5 (strongly agree). The ELW consists of seven dimensions, namely people orientation, fairness, power sharing, concern for sustainability, ethical guidance, role clarification, and integrity (see Table 1).

Table 2 Fit indices for the item properties assessment models in Exhaustion and Disengagement subscales

Organizational tenure was measured by asking participants to state how many years they had held their current position. The survey also collected information about gender, age, and education.

Data Analysis

Data were analyzed with R (R Core Team, 2021) and Stata 16 (StataCorp, 2019). Before the main analysis, dimensionality was assessed. We compared the fit indices of different IRT models to assess dimensionality and item properties for the OLBI. With regard to dimensionality, two factor solutions were compared: a unidimensional model and a 2-factor polytomous model, both estimated with the R package mirt (Chalmers, 2012). The mirt package permits the estimation of multidimensional item response theory parameters for confirmatory models using the maximum likelihood method and begins with the computation of a matrix of quasi-polychoric correlations. For the assessment of the item properties, we estimated the polytomous RRSM using the TAM package (Robitzsch et al., 2020) and the GRM with the mirt package. Differential item functioning analysis was conducted with DIFAS 5.0 (Penfield, 2005). The R syntax for all analyses is publicly available at https://osf.io/3tskm.

Results

Descriptive Statistics and Item Analysis

Means, SDs, and item correlations are publicly available at https://osf.io/9n6gk. All items exhibit small-to-moderate interitem correlations apart from Item 15, which shows nonsignificant correlations with most items. Dimensionality was also assessed. A unidimensional model and a 2-dimensional model were tested, and fit indices were examined. We used five goodness-of-fit indices to determine which model provides the best fit: the Bayesian information criterion (BIC) and the Akaike information criterion (AIC), with value differences greater than 10 indicating a difference in model fit and lower values representing a better fit (Kang et al., 2009); the −2 log-likelihood (−2LL), whose difference between models is referred to a χ2 distribution with degrees of freedom equal to the difference in the number of parameters between the more complex model and the model with fewer parameters; and the root mean square error of approximation (RMSEA) and the standardized root mean square residual (SRMR), with values < .05 indicating acceptable fit (Maydeu-Olivares, 2013). The analysis revealed a significantly better fit for the 2-factor model [AIC = 21,497.43, BIC = 21,847.06, −2LL = 10,669.75, df = 15] in comparison to a unidimensional burnout model [AIC = 21,930.37, BIC = 22,213.56, −2LL = 10,901.18, df = 18]. Yet, examination of the absolute fit indices revealed unacceptable fit to the data for the 2-factor model [RMSEA = .07, 95% CI (.06, .08), SRMR = .09] and the unidimensional model [RMSEA = .12, 95% CI (.09, .13), SRMR = .09], with both RMSEA values above the recommended threshold (Maydeu-Olivares, 2013). Discrepancies in the items’ factor loadings were also evident. Specifically, Item 14 loaded on its noncorresponding factor, and Items 9, 11, and 15 yielded significant cross-loadings.
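The relative-fit comparison above can be retraced with a few lines of arithmetic. The sketch below uses only the AIC and BIC values reported in the text and applies the difference-greater-than-10 rule of thumb cited there; the dictionary layout and the function name are illustrative choices, not part of the published syntax.

```python
# AIC and BIC values as reported in the text for the two competing models.
fit = {
    "two_factor": {"AIC": 21497.43, "BIC": 21847.06},
    "unidimensional": {"AIC": 21930.37, "BIC": 22213.56},
}

def information_criterion_gap(fit, index):
    """Return (preferred model, difference) for one information criterion.

    The preferred model is the one with the lower value of the criterion.
    """
    ranked = sorted(fit, key=lambda name: fit[name][index])
    best, other = ranked[0], ranked[1]
    return best, fit[other][index] - fit[best][index]

for index in ("AIC", "BIC"):
    best, gap = information_criterion_gap(fit, index)
    # A gap above 10 is the rule of thumb cited in the text (Kang et al., 2009).
    print(index, best, round(gap, 2), gap > 10)
```

Both criteria favor the 2-factor model by a margin far larger than 10, which matches the conclusion stated above; the absolute fit indices (RMSEA, SRMR) still have to be judged separately.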

Next, we evaluated the measurement quality of the OLBI. One major advantage of IRT is that the assumptions made can be tested by means of goodness-of-fit measures (see Khorramdel & von Davier, 2016). We used two different models in our analysis: the RRSM (Andrich, 1978) and the GRM (Samejima, 1997). The RRSM assumes that the item discrimination parameter α is equal across all items, while the GRM allows the item discrimination parameter α to differ across items and the distances between response categories to differ across items. Both models were tested separately for each of the two subscales of the OLBI. The analysis revealed a significantly better fit for the less restrictive GRM compared to the more restrictive RRSM; items are best defined by both difficulty and discrimination parameters (see Table 2).

Table 3 shows the distribution of responses and the item parameters for the 16 OLBI items. Discrimination parameter α represents the item’s slope; all items have moderate to high α, with Items 12, 8, and 4 (exhaustion) and Items 7, 3, and 11 (disengagement) exhibiting the highest discrimination parameters. The difficulty parameter β represents the part of the latent continuum θ where the thresholds of item categories are located. Difficulty parameters were found to vary across items. Items’ modal values are located in the first three response options, whereas the highest category is rarely chosen, except for Items 2 and 4, which appear to have an evenly distributed pattern of responses. In general, the positive threshold parameters for most items indicate that the measure discriminates in the higher regions of the latent continuum, with the two subscales having similar patterns of difficulty, apart from Item 13, whose difficulty thresholds are lower (negative values).

Table 3 Parameters and response thresholds for all items

Development of scr-OLBI

In accordance with our first objective, indicators that discriminate highly among burnout respondents are preferred for inclusion in this short measure, since such indicators will accurately differentiate among individuals. The analysis revealed that Item 12 (α = 2.90) and Item 8 (α = 2.09) from the exhaustion subscale and Item 7 (α = 2.04) and Item 3 (α = 1.43) from the disengagement subscale fulfill these criteria. However, further examination of their wording reveals that only Item 7 is positively keyed, while the other items are negatively keyed. Consequently, to avoid potential method factor bias due to item wording (e.g., Demerouti & Bakker, 2008), we decided to exclude Item 7 from the final selection and include Item 11 (α = 1.40) instead, for two reasons. First, Item 11 exhibits the third highest discrimination parameter in the subscale; second, it is negatively keyed and hence theoretically more relevant to burnout.

To further assess the suitability of these four items (12, 8, 3, and 11), we examined the information functions at both item and test levels. Figure 1 shows the item information function (IIF) for all subscale items. The selected items exhibit the highest IIFs in each subscale. Examination of their item characteristic curves also reveals that their response categories are mostly evenly endorsed. Accordingly, Figure 2 shows the test information function (TIF) for the original 8-item and the modified 2-item subscales of the OLBI. Comparison of the TIFs for the full spectrum of burnout and the new shorter version of the construct reveals similar patterns of information and standard error. Interestingly, the SE values are very low for both subscales across most trait values.

Figure 1 Item and test information functions (items shown in numerical order within each subscale).
Figure 2 Test information functions and standard errors for both subscales.

We also examined the factor structure, reliability, and differential item functioning (DIF) of the four items. Fit indices revealed a better fit to the data [AIC = 5,259.93, BIC = 5,330.73, RMSEA = .07, 95% CI (.02, .12), SRMR = .04], with RMSEA remaining above the recommended threshold of < .05 (Maydeu-Olivares, 2013). Internal consistency reliability is α = .74. DIF exists when respondents with the same trait level show differing response probabilities in a given item category because of group membership that is irrelevant to the construct measured; it may lead to systematic measurement error (Kleinman & Teresi, 2016). Since our objective is to assess occupational burnout among male and female employees, we needed to ensure that such measurement bias does not exist. Accordingly, the selected items were analyzed with gender as the grouping factor. DIF is assessed by the weighted and unweighted estimates of the DIF effect variance, the standard error estimators of these variances, and the ratio of each DIF effect variance estimate over its respective standard error estimator (Penfield, 2005). Weighted, unweighted, and standard error estimates exhibit nonsignificant differences for each pair of items, revealing nonsignificant variations between male and female employees.
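For readers less familiar with the internal consistency coefficient reported above, the following sketch shows how Cronbach’s α is computed from item-level responses. The response matrix is toy data invented for illustration; it is not drawn from the study’s dataset, and the function name is a hypothetical choice.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose to per-item columns
    item_vars = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_vars / total_var)

# TOY responses to four 4-point items (six fictitious respondents).
toy = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 3, 4, 4],
    [2, 1, 2, 2],
    [3, 4, 3, 3],
]
print(round(cronbach_alpha(toy), 2))
```

The coefficient rises as the items covary more strongly relative to their individual variances, so a four-item scale can reach acceptable values only if its items are closely related.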

Preliminary Validity Analysis

In accordance with our second objective, we examined the associations between the scr-OLBI, ethical leadership dimensions, and demographic variables (e.g., gender, age, education, and tenure). The correlation results are provided in Table 1. Validation analysis of the scr-OLBI confirmed all hypothesized correlations. In terms of convergent validity, the scr-OLBI was found to correlate negatively with all ethical leadership behaviors. Finally, in terms of divergent validity, the scr-OLBI did not correlate significantly with employees’ age, gender, education, or organizational tenure.

Discussion

The purpose of the current study was to validate a short tool for assessing subclinical burnout (Demerouti & Bakker, 2008). We used IRT to evaluate the original measure’s item characteristics in terms of dimensionality and item properties. The analyses revealed that two items from each of the exhaustion and disengagement subscales adhere to high psychometric standards. The GRM results suggest that these items perform well; they provide considerable, above-average levels of information on exhaustion and disengagement from work. The weighted and unweighted estimates of DIF effect variance, the standard error estimates of these variances, and the ratio of each DIF effect variance estimate over its respective standard error estimator revealed that DIF between male and female employees is negligible. Employees respond to the four items of the short version invariantly, and no systematic bias related to gender is present. Finally, the scr-OLBI did not show a loss of reliability or model fit; the short version maintains similar reliability indices and comparable information to the original measure of burnout.

The scr-OLBI provides a cost-effective instrument for assessing subclinical occupational burnout that at present can be used purely for research purposes. The combination of its psychometric soundness and brevity increases the attractiveness of the scr-OLBI as a valuable tool for organizational burnout studies. With only four items, the scr-OLBI can be used when employee burnout is not the focus but is considered in conjunction with other occupational symptoms. In addition, the scr-OLBI functions equivalently for male and female employees and can be used with both groups in working environments.

Despite its advantages, this study has several limitations. The most important limitation is the lack of evidence on the clinical use of the scr-OLBI. Research data derive solely from healthy individuals; no information was gathered for the clinical diagnosis of burnout syndrome, and no conclusions about the clinical measurement accuracy of the scr-OLBI can be drawn from our results. Consequently, the utility of the scr-OLBI is limited to research. Second, our data pertain exclusively to Greek participants employed in different organizations. Although the sample size is sufficient for the analysis, replication of our findings in specific occupational settings is needed. In a similar vein, cross-cultural studies are considered crucial, especially since the OLBI has been translated into several languages. It is recommended to examine the generalizability of our results to other populations of interest. Moreover, to further strengthen its convergent validity, the scr-OLBI should be related to different occupational behaviors and attitudes. We suggest the use of instruments relevant to well-being and other variables such as stress, health behaviors, and other characteristics, since their examination will improve the validation of the scr-OLBI. We encourage future studies to use the scr-OLBI and investigate its accuracy in diagnosing burnout syndrome in clinical samples.

In conclusion, the Greek version of the scr-OLBI is a brief instrument that appears to be valid and reliable for assessing burnout invariantly in Greek-speaking employees. Given the measure’s ease of administration, the scr-OLBI is suitable for organizations and researchers who typically face time pressures yet wish to study employees’ burnout in heavy industries or multilevel organizations. Future research is needed to consolidate and generalize these findings to different populations and countries.

References

  • Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561–573. https://doi.org/10.1007/BF02293814

  • Campos, J. A. D. B., Carlotto, M. S., & Marôco, J. (2012). Oldenburg Burnout Inventory – Student Version: Cultural adaptation and validation into Portuguese. Psicologia: Reflexão e Crítica, 25, 709–718. https://doi.org/10.1590/S0102-79722012000400010

  • Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29. https://doi.org/10.18637/jss.v048.i06

  • Demerouti, E., & Bakker, A. B. (2008). The Oldenburg Burnout Inventory: A good alternative to measure burnout and engagement. In J. Halbesleben (Ed.), Stress and burnout in health care (pp. 65–78). Nova Sciences. https://doi.org/10.4236/psych.2013.41010

  • Demerouti, E., Bakker, A. B., Vardakou, I., & Kantas, A. (2003). The convergent validity of two burnout instruments: A multitrait-multimethod analysis. European Journal of Psychological Assessment, 19(1), 12–23. https://doi.org/10.1027//1015-5759.19.1.12

  • Demerouti, E., Mostert, K., & Bakker, A. B. (2010). Burnout and work engagement: A thorough investigation of the independency of both constructs. Journal of Occupational Health Psychology, 15(3), 209–222. https://doi.org/10.1037/a0019408

  • Demerouti, E., Verbeke, W. J. M. I., & Bakker, A. B. (2005). Exploring the relationship between a multidimensional and multifaceted burnout concept and self-rated performance. Journal of Management, 31(2), 186–209. https://doi.org/10.1177/0149206304271602

  • Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Erlbaum.

  • Greer, F., & Liu, J. (2016). Creating short forms and screening measures. In K. Schweizer & C. DiStefano (Eds.), Principles and methods of test construction: Standards and recent advances (pp. 272–287). Hogrefe Publishing.

  • Gustavsson, J. P., Hallsten, L., & Rudman, A. (2010). Early career burnout among nurses: Modeling a hypothesized process using an item response approach. International Journal of Nursing Studies, 47(7), 864–875. https://doi.org/10.1016/j.ijnurstu.2009.12.007

  • Halbesleben, J. R. B., & Demerouti, E. (2005). The construct validity of an alternative measure of burnout: Investigating the English translation of the Oldenburg Burnout Inventory. Work & Stress, 19(3), 208–220. https://doi.org/10.1080/02678370500340728

  • Kalshoven, K., Den Hartog, D. N., & De Hoogh, A. H. B. (2011). Ethical leadership at work questionnaire (ELW): Development and validation of a multidimensional measure. The Leadership Quarterly, 22(1), 51–69. https://doi.org/10.1016/j.leaqua.2010.12.007

  • Khorramdel, L., & von Davier, M. (2016). Item response theory as a framework for test construction. In K. Schweizer & C. DiStefano (Eds.), Principles and methods of test construction: Standards and recent advances (pp. 52–80). Hogrefe Publishing.

  • Kleinman, M., & Teresi, J. A. (2016). Differential item functioning magnitude and impact measures from item response theory models. Psychological Test and Assessment Modeling, 58, 79–98.

  • Kruyen, P. M., Emons, W. H. M., & Sijtsma, K. (2012). Test length and decision quality in personnel selection: When is short too short? International Journal of Testing, 12(4), 321–344. 10.1080/15305058.2011.643517

  • Mahadi, N. F., Chin, R. W. A., Chua, Y. Y., Chu, M. N., Wong, M. S., Yusoff, M. S. B., & Lee, Y. Y. (2018). Malay language translation and validation of the Oldenburg Burnout Inventory measuring burnout. Education in Medicine Journal, 10(2), 27–40. 10.21315/eimj2018.10.2.4

  • Maydeu-Olivares, A. (2013). Goodness-of-fit assessment of item response theory models. Measurement: Interdisciplinary Research and Perspectives, 11(3), 71–101. 10.1080/15366367.2013.831680

  • Mészáros, V., Takács, S., Kövi, Z., Smohai, M., Csigás, Z. G., Tanyi, Z., Jakubovits, E., Kovács, D., Szili, I., Ferenczi, A., & Ádám, S. (2020). Dimensionality of burnout – is the Mini Oldenburg Burnout Inventory suitable for measuring separate burnout dimensions? Mentálhigiéné és Pszichoszomatika, 21(3), 323–338. 10.1556/0406.21.2020.015

  • Mitropoulou, E. M. (2022a). OLBI dataset [Data set]. 10.17632/v4n25ncc4n.2

  • Mitropoulou, E. M. (2022b). Supplementary material for the scr-OLBI. https://osf.io/ywph4/?view_only=d3018ff812ab47ce80cd72e5b419c123

  • Mitropoulou, E. M., Tsaousis, I., Xanthopoulou, D., & Petrides, K. V. (2019). Development and psychometric evaluation of the Questionnaire of Ethical Leadership (QueL). European Journal of Psychological Assessment, 36(4), 635–645. 10.1027/1015-5759/a000533

  • Oosterholt, B. G., Maes, J. H., Van der Linden, D., Verbraak, M. J., & Kompier, M. A. (2014). Cognitive performance in both clinical and non-clinical burnout. Stress, 17(5), 400–409. 10.3109/10253890.2014.949668

  • Penfield, R. D. (2005). DIFAS: Differential item functioning analysis system. Applied Psychological Measurement, 29(2), 150–151. 10.1177/0146621603260686

  • R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/

  • Reis, D., Xanthopoulou, D., & Tsaousis, I. (2015). Measuring job and academic burnout with the Oldenburg Burnout Inventory (OLBI): Factorial invariance across samples and countries. Burnout Research, 2(1), 8–18. 10.1016/j.burn.2014.11.001

  • Robitzsch, A., Kiefer, T., & Wu, M. (2020). TAM: Test analysis modules (R package version 3.5-19). https://CRAN.R-project.org/package=TAM

  • Samejima, F. (1997). Graded response model. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 85–100). Springer.

  • Schaufeli, W. B., Bakker, A. B., Hoogduin, K., Schaap, C., & Kladler, A. (2001). On the clinical validity of the Maslach Burnout Inventory and the Burnout Measure. Psychology & Health, 16(5), 565–582. 10.1080/08870440108405527

  • Subburaj, A., & Vijayadurai, J. (2016). Translation, validation and psychometric properties of Tamil version of Oldenburg Burnout Inventory (OLBI). Procedia-Social and Behavioral Sciences, 219, 724–731. 10.1016/j.sbspro.2016.05.067