Open Access, Original Article

Psychometric Quality of the German HEXACO-60 Personality Inventory-Revised

Consistency, Validity, and Measurement Invariance of Self-Report and Observer-Report Forms

Published Online: https://doi.org/10.1027/1015-5759/a000812

Abstract

Striving for more valid personality judgments is desirable as more sound decisions can be expected with increasing measurement accuracy. In this study, the psychometric qualities of the German HEXACO-60 personality inventory are evaluated. Extending previous studies, we examined the third-person observer-report form in addition to the first-person rating format, which allowed us to examine the psychometric quality beyond self-reports, such as cross-rater agreement, rank-order stability, structural validity, and measurement invariance. Data from 3,046 self-raters (61% female; age range: 14–90) and ratings from 2,199 well-informed acquaintances (partners, friends, or relatives of the self-raters) were analyzed. Satisfactory internal consistency, 2-year and 4-year rank-order stabilities, agreement among self- and informant-raters, and consensus among informant-raters were found. Moreover, the six-factor structure was confirmed in structural equation models that incorporated the perspectives of both self-raters and informant-raters. Finally, partial strict measurement invariance was demonstrated across rater perspectives. Previous validation of the German HEXACO-60 self-report form could thus be replicated and expanded by highlighting the psychometric soundness of the third-person observer-report form and the convergent as well as discriminant validity of HEXACO trait measures across rater perspectives. The informant perspective provides valid additional benefits for the assessment of personality traits within the HEXACO framework.

The formation of accurate judgments regarding other people’s personality traits, defined as a person’s typical patterns of thought, emotion, and behavior that are relatively stable over time and consistent across situations and contexts, is an integral part of social interaction and an individual’s social reputation (Funder, 1995). Important decisions depend at least in part on the appraisal one has of others, such as whom to trust, hire, date, marry (Funder, 2012), or whom to vote for (Funder & West, 1993). Striving for more valid judgments is desirable as more sound decisions can be expected with increasing accuracy (Letzring, 2008; Letzring & Human, 2014). Inferences about others’ personalities are also of concern for methodological reasons, as ratings by others constitute fundamental data in various lines of research (Funder, 1995; Watson et al., 2000). The validity of the inferences based on these data hinges on the quality and accuracy of the judgments (Funder, 1993). Thus, accurate judgments of personality bear consequences for both the one being judged and the one making these judgments (Funder, 2012).

Questionnaires are a common method for assessing personality traits. Sound psychometric quality of a personality questionnaire is a prerequisite for accurate and useful personality judgments. Therefore, this validation study extends previous validation attempts on the 60-item version of the HEXACO Personality Inventory-Revised (HEXACO-60), which is a non-commercial personality questionnaire translated into more than 40 languages (Ashton & Lee, 2009; see also https://hexaco.org). In particular, the current study examines the psychometric properties of the German observer-report form of the HEXACO-60 and how informant reports converge with self-ratings for each of the personality traits.

HEXACO-60 Development

The HEXACO-60 is a short form of the 100-item HEXACO Personality Inventory-Revised (HEXACO-100; Lee & Ashton, 2018). It was developed as a more economical instrument that can still be considered a psychometrically sound measure despite its conciseness (Ashton & Lee, 2009). It is based on the HEXACO framework which assumes six personality dimensions as necessary personality trait domains to comprehensively but also economically capture personality differences (Lee & Ashton, 2004). The dimensions (with their respective facets) are the following: Honesty-Humility (Sincerity, Fairness, Greed Avoidance, and Modesty), Emotionality (Fearfulness, Anxiety, Dependence, and Sentimentality), Extraversion (Social Self-Esteem, Social Boldness, Sociability, and Liveliness), Agreeableness (Forgivingness, Gentleness, Flexibility, and Patience), Conscientiousness (Organization, Diligence, Perfectionism, and Prudence), and Openness to Experience (Aesthetic Appreciation, Inquisitiveness, Creativity, and Unconventionality). Items of the 100-item version were evaluated according to their primary and secondary factor loadings. Items with satisfactory loadings were selected so that each of the four trait facets that constitute each of the six broad trait dimensions was assigned two or three items respectively. All items are assessed on a 5-point Likert scale ranging from strongly disagree to strongly agree (Ashton & Lee, 2009).

The HEXACO-60 proved to be a viable alternative to the longer version under constrained administration time. Ashton and Lee (2009) replicated the six-factor structure, which explained about a third of item variance in samples consisting of Canadian college students and community adults. Further, all subscales demonstrated acceptable internal consistency and showed meaningful correlations with a set of related personality constructs. Additionally, self-ratings tended to coincide with observer reports, resulting in self-other correlations comparable to those among the longer HEXACO-100 scales.

Since the initial conception of the HEXACO-60, there have been translations into several languages and cross-cultural validation attempts, at least for the self-report form (e.g., Di Fabio & Saklofske, 2017; Truskauskaitė-Kunevičienė et al., 2012). The psychometric properties of the German HEXACO-60 self-report form have been subject to previous scrutiny as well (Moshagen et al., 2014). In this regard, a six-factor solution was found to be the most appropriate structure for the items, which clustered in accordance with the HEXACO framework. Additionally, the German HEXACO-60 performed satisfactorily with regard to internal consistency, test-retest reliability, and concurrent validity. More specifically, all six scales showed high correlations with the respective HEXACO-100 scales. Furthermore, the six HEXACO domain scores were meaningfully related to Big Five personality traits as measured by the German Neuroticism-Extraversion-Openness Five-Factor Inventory (NEO-FFI; Borkenau & Ostendorf, 2008). Additionally, high test-retest stabilities (i.e., correlations of individual ranks) over a 7-month period were found. In sum, there is convincing evidence for the validity of the German HEXACO-60 self-report form.

The Current Validation Study

The current study examines whether the conclusions of prior research about the psychometric soundness of the German HEXACO-60 self-report form can be replicated and expands previous research by validating the German observer-report form. Therefore, the psychometric properties of the German HEXACO-60 observer-report form were examined in a broad and heterogeneous sample of the general German population. Specifically, we investigated to what extent the HEXACO six-factor structure can be replicated (i.e., factorial validity) with the use of informant reports and across rater perspectives. In addition to the internal consistency and test-retest stabilities of the six personality traits, consensus (i.e., correlative convergence between well-informed acquaintances’ judgments) and self-other agreement (i.e., correlative convergence between self-raters’ and well-informed acquaintances’ judgments) were investigated. Lastly, we tested for measurement invariance across self- and informant-reports.

The assessment of the psychometric properties, validity, and measurement invariance was based on the following specific research questions:

Research Question 1 (RQ1):

To what extent are the German HEXACO-60 self- and observer-report forms internally consistent?

Research Question 2 (RQ2):

What is the 2-year and 4-year rank-order stability of the HEXACO domain scores based on self- and informant reports?

Research Question 3 (RQ3):

To what extent are informants consistent in their ratings of the same targets?

Research Question 4 (RQ4):

To what extent do the self-ratings on the German HEXACO-60 self-report form coincide with ratings from well-informed others based on the observer-report form?

Research Question 5 (RQ5):

Can the factor structure of the HEXACO-60 be replicated in the current sample and with the use of informant ratings?

Research Question 6 (RQ6):

What is the level of measurement invariance (metric, scalar, or strict) between the self- and observer-report forms?

Methods

We report how we determined our sample size, all data exclusions (if any), all data inclusion/exclusion criteria, whether inclusion/exclusion criteria were established prior to data analysis, all measures in the study, and all analyses including all tested models. For all psychometrically relevant inferential tests and parameters, we report exact p-values, effect sizes, and 95% confidence or credible intervals.

Sample

The current study used openly accessible data provided by the Study of Personality Architecture and Dynamics (SPeADy; see also http://speady.de). This German longitudinal research project consists of a twin family study and an age group study. For more information on the substudies, see Kandler et al. (2019) for the twin family study and Wiechers et al. (2023) for the age group study. We aimed to consider all available data of the age group study. The sample consisted of 3,046 participants (60.6% female) who provided answers to the HEXACO-60 in at least one of the three data sampling waves (self-raters). Prior to analyses, 33 cases were excluded because they did not meet the criterion of completing the questionnaire with fewer than 20 missing item ratings. Self-raters were German-speaking and primarily German (96.2%). Their age ranged from 14 to 90 years with a mean of 41.05 years (SD = 18.20). This minimum age was chosen because the HEXACO can be used for measuring developmental trends from the age of at least 14 years (Ashton & Lee, 2016). For n = 883 and n = 885 targets, longitudinal self-report data across waves 1 and 2 and across waves 2 and 3, respectively, were available. N = 751 individuals provided self-report data across all assessment waves. Although the sample cannot be seen as representative of the German population, it is heterogeneous regarding age, gender, family status, educational level, and occupation. See Wiechers et al. (2023) for more detailed information on the sample characteristics and the recruitment strategy and procedure.

Observer-report data were provided by 2,199 well-informed acquaintances (66.5% female, age in years: M = 39.77, SD = 17.25) of the target person. Accordingly, most informants indicated that they knew their target well (n = 579, 26.3%) or very well (n = 1,507, 68.5%). The length of the acquaintance or relationship ranged from less than 1 year to 70 years (M = 20.27, SD = 14.51). Informants were mostly spouses or partners (31.1%), friends (28.7%), or relatives (35.8%). For n = 364 and n = 357 targets, longitudinal informant-report data across waves 1 and 2 and across waves 2 and 3, respectively, were available. For n = 210 individuals, longitudinal informant-report data were available across all assessment waves.

German 60-Item HEXACO Personality Inventory

Each of the six dimensions of the German version of the HEXACO-60 consists of 10 items (Moshagen et al., 2014). Each facet is measured with two to three items. The items in turn are scored on a 5-point scale ranging from strongly disagree to strongly agree (Ashton & Lee, 2009). Both the self-report form and the observer-report form of the German HEXACO-60 were used (see https://hexaco.org for more details on the instructions and concrete scale descriptions).

Analyses

The internal consistency of the domain scales of the HEXACO-60 was estimated with the use of Cronbach’s α and McDonald’s ω. Rank-order stabilities were estimated with hierarchical structural equation models (SEM) that included latent domain and facet variables and observed item scores. Correlations of latent domain variables across measurement occasions indicated rank-order stability. An example model is shown in Figure B1 of the supplement (Wiechers & Kandler, 2023).
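
As an illustration of the first of these estimates, the following minimal sketch computes Cronbach’s α from raw item ratings using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the sum score). The function name `cronbach_alpha` and the small rating matrix are hypothetical and are not taken from the study data or the authors’ scripts.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the unweighted sum score
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point ratings from five respondents on four items
ratings = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [5, 5, 4, 4],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

The sketch assumes complete data; the study itself handled missing values by pairwise deletion for reliability estimates, which this example does not reproduce.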

Informant-ratings’ consensus and self-other agreement (SOA) were also estimated with the use of hierarchical SEM for each HEXACO domain considering measurement error (see Figure B2 of the supplement for an SOA example). The domain covariances indicate SOA and consensus.

Further, the factorial validity of the HEXACO-60 self- and observer-report versions was examined by employing confirmatory factor analyses (CFAs). Root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), and Bentler’s Comparative Fit Index (CFI) were used to evaluate model fit following suggestions by Hu and Bentler (1999) and Hooper et al. (2008): Good model fit is indicated by (1) RMSEA < .06 with its confidence interval’s lower bound approaching zero and an upper bound less than or equal to .08, (2) SRMR < .05, and (3) CFI ≥ .95. For the RMSEA and the SRMR, values less than .08 are considered acceptable.
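
The cutoffs above can be written as a small labeling rule. This is only a sketch: the function name `evaluate_fit` and the example values are hypothetical, the vague criterion that the lower confidence bound should approach zero is not encoded, and because the text states no "acceptable" band for the CFI, values below .95 are simply flagged as falling short of the good-fit cutoff.

```python
def evaluate_fit(rmsea, rmsea_ci_upper, srmr, cfi):
    """Label each fit index with the cutoffs stated in the text."""
    # RMSEA: good if < .06 with upper CI bound <= .08, acceptable if < .08
    if rmsea < 0.06 and rmsea_ci_upper <= 0.08:
        rmsea_label = "good"
    elif rmsea < 0.08:
        rmsea_label = "acceptable"
    else:
        rmsea_label = "poor"

    # SRMR: good if < .05, acceptable if < .08
    if srmr < 0.05:
        srmr_label = "good"
    elif srmr < 0.08:
        srmr_label = "acceptable"
    else:
        srmr_label = "poor"

    # CFI: only a good-fit cutoff (>= .95) is given in the text
    cfi_label = "good" if cfi >= 0.95 else "below good-fit cutoff"

    return {"RMSEA": rmsea_label, "SRMR": srmr_label, "CFI": cfi_label}


# Hypothetical fit statistics, not values from Tables 3 or 4
print(evaluate_fit(rmsea=0.04, rmsea_ci_upper=0.05, srmr=0.03, cfi=0.97))
```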

The measurement invariance (MI) of the models was tested by sequentially introducing more restrictions to the baseline (configural) model. For metric MI, factor loadings across different personality traits and measurement methods were set equal. For scalar MI, intercepts were additionally set equal. Finally, for strict MI, equal residual variances were also assumed. A cut-off of ΔCFI ≤ .01 (Chen, 2007) and of ΔMFI ≤ .02 (Cheung & Rensvold, 2002) between MI levels was chosen as an indication of invariance.
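
The decision rule for each step can be sketched as follows. The function name `invariance_holds` and the fit values are hypothetical. Requiring both criteria to be met at once is an assumption of this sketch, as the text does not state how the two cutoffs were combined. A small tolerance guards against floating-point rounding when a drop sits exactly at the cutoff.

```python
def invariance_holds(cfi_less, cfi_more, mfi_less, mfi_more,
                     max_cfi_drop=0.01, max_mfi_drop=0.02, tol=1e-9):
    """Compare a less restricted model with the next, more restricted one.

    Returns True if the drop in CFI is at most .01 and the drop in MFI
    is at most .02 (both cutoffs as given in the text).
    """
    cfi_drop = cfi_less - cfi_more
    mfi_drop = mfi_less - mfi_more
    return cfi_drop <= max_cfi_drop + tol and mfi_drop <= max_mfi_drop + tol


# Hypothetical configural vs. metric model fit (not values from the tables)
print(invariance_holds(cfi_less=0.960, cfi_more=0.955,
                       mfi_less=0.930, mfi_more=0.920))
```

When a step fails, partial invariance can be explored by freeing individual parameters and applying the same comparison again, which is the logic the Results section follows for the Openness to Experience intercepts.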

Analyses were run with the statistical software R (R Core Team, 2022) and IBM SPSS Statistics (Version 28). Missing values (targets: n = 41; informants: n = 25) were handled by pairwise deletion for estimates of reliability. For SEM analyses, full information maximum likelihood (FIML) estimation procedures were utilized to handle missing values and analyze all available data. All scripts and code are available at OSF: https://doi.org/10.17605/OSF.IO/P8HZS.

Results

Means and standard deviations of targets’ and informants’ HEXACO-60 domain scores are summarized in Table 1 (see also Supplement A for skewness, kurtosis, and domain-score intercorrelations). Composite scores of informant data were calculated for targets that were judged by more than one informant. Analyses were thus based on 24 composites of 3 informants, 451 composites of 2 informants, and 1,225 single-informant ratings, that is, 1,700 cases in total.
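
The case counts can be checked with a few lines of arithmetic. The numbers are taken from the paragraph above; the helper `composite_score` is hypothetical and assumes that a composite is the unweighted mean of the available informant scores, which the text does not specify.

```python
# Number of informants per target -> number of targets (from the text)
composites = {3: 24, 2: 451, 1: 1225}

# Each target contributes one case to the informant analyses
n_cases = sum(composites.values())

# Total number of individual informant ratings behind those cases
n_informant_ratings = sum(k * n for k, n in composites.items())


def composite_score(informant_scores):
    """Assumed composite: unweighted mean across a target's informants."""
    return sum(informant_scores) / len(informant_scores)


print(n_cases, n_informant_ratings)
```

The total number of individual ratings equals the 2,199 informants reported in the Sample section.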

Table 1 Reliability estimates and descriptive statistics for self- and informant-rated HEXACO-60 Scores

Internal Consistency

Cronbach’s α and McDonald’s total ω values for the target- and informant-rated personality traits are presented in Table 1. Cronbach’s α as a lower bound of reliability ranged from .73 to .79 for targets’ self-reports and from .76 to .85 for informants’ reports. The ranges of McDonald’s ω were .76–.84 and .81–.88 for self- and informant-reports, respectively. The consistently higher ω values compared to α values indicated true-score congenericity rather than equivalence across items. Overall, the HEXACO scales showed acceptable to good internal consistency.
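
The relation between the two coefficients can be illustrated with a simplified single-factor sketch. Given standardized loadings λ and residual variances θ (factor variance fixed to 1), ω = (Σλ)² / ((Σλ)² + Σθ), and the population value of α follows from the model-implied variances. With equal loadings the two coincide; with unequal (congeneric) loadings α falls below ω. The function names and loading values are hypothetical, and this unidimensional ω is not necessarily the same estimator as the total ω reported in Table 1.

```python
def omega_single_factor(loadings, residual_vars):
    """Omega for a one-factor model with factor variance fixed to 1."""
    common = sum(loadings) ** 2
    return common / (common + sum(residual_vars))


def alpha_from_model(loadings, residual_vars):
    """Population Cronbach's alpha implied by the same one-factor model."""
    k = len(loadings)
    # Model-implied item variances: loading^2 + residual variance
    item_vars = [l ** 2 + t for l, t in zip(loadings, residual_vars)]
    # Model-implied variance of the sum score
    total_var = sum(loadings) ** 2 + sum(residual_vars)
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical standardized loadings; residual variance = 1 - loading^2
equal = [0.6, 0.6, 0.6, 0.6]
unequal = [0.3, 0.5, 0.7, 0.9]

for lam in (equal, unequal):
    theta = [1 - l ** 2 for l in lam]
    print(round(alpha_from_model(lam, theta), 3),
          round(omega_single_factor(lam, theta), 3))
```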

Rank-Order Stability

Stabilities were calculated between W1 and W2, W2 and W3 (2-year stabilities), and between W1 and W3 (4-year stabilities) based on target scores and also on informant composite scores for cases with more than one informant per target (see Table 2). Standardized covariances (i.e., correlation coefficients between latent variables) of the W1 and W2 self-rated domain scores ranged from .90 to .99. These 2-year self-report stabilities were quite comparable to those of the timespan between W2 and W3, showing the same range. The 4-year self-report stabilities ranged from .88 to .97. The respective informant-report stabilities ranged from .81 to .93 for the first interval, from .83 to .95 for the second interval, and from .76 to .88 for the full timespan. The relatively high SEM-based stability coefficients indicate high rank-order stability for all HEXACO traits after controlling for measurement error and item as well as facet specificity. Uncorrected stabilities assessed with Pearson’s r and Spearman’s ρ correlation coefficients were generally lower (see Tables A3 and A4 in the supplement). The 2-year and 4-year stabilities corrected for attenuation due to unreliability (1 − ω) were on average .98, .98, and .94 for self-reports and .82, .86, and .75 for informant composite scores, respectively. Given that different informants were allowed to rate the targets at different measurement occasions, the rank-order stability values can be treated as substantial across informants’ perspectives.
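
A correction for attenuation of this kind is commonly computed with Spearman’s classical formula, in which the observed correlation is divided by the square root of the product of the two reliabilities. The sketch below assumes that formula with ω as the reliability estimate at each occasion; the text does not spell out the exact computation, and the function name and input values are hypothetical.

```python
from math import sqrt


def disattenuate(r_observed, reliability_1, reliability_2):
    """Spearman's correction: r_true = r_obs / sqrt(rel_1 * rel_2)."""
    return r_observed / sqrt(reliability_1 * reliability_2)


# Hypothetical observed retest correlation and reliabilities at two waves
r_corrected = disattenuate(r_observed=0.75, reliability_1=0.80, reliability_2=0.80)
print(round(r_corrected, 4))
```

Corrected values can exceed 1 when the observed correlation is high relative to the reliabilities, so they are best read as estimates rather than bounded correlations.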

Table 2 Rank-order stabilities based on hierarchical structural equation models

Consensus and Self-Other Agreement

Covariances among self- and informant-rated domains and between the first and second informant, indicating SOA and consensus, respectively, can be found in Figure 1. All correlations were statistically significant at p < .001, with all estimates ≥ .55. Thus, consensus and SOA as indicators of convergent validity can be treated as substantial.

Figure 1 Consensus and self-other agreement based on covariances from SEM models. 95% confidence intervals are shown in parentheses.

Confirmatory Factor Analyses

We analyzed whether a multitrait-multirater model with six trait factors according to the HEXACO framework based on both self- and informant-ratings and rater-specific method factors would result in a good fit to the data from the current sample. We investigated two models: (1) A two-indicator model with one self-report and one informant-composite score per trait for targets with more than one informant rating (see Figure 2), and (2) a three-indicator model with one self-report and two informant ratings per trait, separately (see Figure 3). In the latter model, we assigned the first informant who completed the questionnaire the number 1 and the second the number 2, comparable to previous studies that used similar data (e.g., DeYoung, 2006). A common metric for the latent factors was created by fixing their first (self-report) loading to 1. In the two-indicator model, both indicator loadings were set to 1. A correlation between the informant method factors in the three-indicator model and correlations among all six HEXACO trait factors were allowed. Additionally, two separate hierarchical models that included the item levels were run for self- and informant-reports, respectively (see Figure B3 and Tables B3 and B4 of the supplement).

Figure 2 Multitrait-multirater model with six HEXACO domains and two method factors. Standardized model parameters are shown. Manifest variables are self-ratings (s) and informant-composite-ratings (ic) regarding one of the six HEXACO traits. Residuals are omitted for simplicity. *p < .05, **p < .01, ***p < .001.

Figure 3 Multitrait-multirater model with six HEXACO domains and three method factors. Standardized model parameters are shown. Manifest variables are self-ratings (s) and ratings of the first (i1) and second informant (i2) regarding one of the six HEXACO traits. Residuals are omitted for simplicity. *p < .05, **p < .01, ***p < .001.

Two-Indicator Multitrait-Multirater Model

The fit indices indicated good model fit (see Table 3; Baseline model). Self- and informant scores also showed high loadings on their assigned latent trait domains, indicating substantial trait consistency. Standardized loadings (range: .62 to .82) are shown in Figure 2. Method factor loadings were consistently lower (range: −.17 to .51), indicating low method specificity (see also Supplement B for more details).
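
To make the contrast between trait and method loadings concrete, a standardized loading can be squared to obtain the share of an indicator’s variance attributable to that factor. The sketch below applies this to the endpoints of the ranges reported above. It assumes that the trait and method factors loading on a given indicator are uncorrelated, and the function name `variance_share` is hypothetical.

```python
def variance_share(standardized_loading):
    """Proportion of indicator variance explained by one factor."""
    return standardized_loading ** 2


# Endpoints of the loading ranges reported in the text
trait_range = (0.62, 0.82)
method_range = (-0.17, 0.51)

trait_shares = [round(variance_share(l), 2) for l in trait_range]
method_shares = [round(variance_share(l), 2) for l in method_range]
print(trait_shares, method_shares)
```

Under these assumptions, the trait factors account for roughly 38% to 67% of indicator variance at the range endpoints, compared with about 3% to 26% for the method factors.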

Table 3 Results of measurement invariance testing across self-reports and informant-rater-composite scores

Three-Indicator Multitrait-Multirater Model

This model (with two informants) also fitted the data well, as the fit indices of the Baseline model in Table 3 suggest. Figure 3 shows that standardized loadings for the HEXACO domains ranged from .62 to .82, whereas method factor loadings were consistently lower (range: −.19 to .44), indicating substantial trait consistency and low method specificity (see also Supplement B for more details).

Measurement Invariance (MI)

MI Across Self-Reports and Informant-Composites

The results of measurement invariance model testing are presented in Table 3. Since each HEXACO trait factor in this model had only two indicators with loadings fixed at 1, testing for metric MI was not applicable. There was an acceptable CFI and MFI drop when the loadings of the method factors were set to be equal across self-rater and informant-composite factors. A larger CFI and MFI difference was found when setting intercepts equal across self- and informant-reports. Estimating the Openness to Experience intercepts as unequal across rater perspectives resulted in an acceptable CFI and MFI decrease, suggesting at least partial scalar MI. Similarly, when testing for strict MI, an acceptable CFI drop was only achieved when estimating residual variances of the Conscientiousness indicators as unequal across rater perspectives. Thus, partial strict MI could be supported. Additionally, the supplement includes an MI analysis based on a hierarchical model that includes the item level.

MI Across Self-Reports and Informant-ratings

Setting HEXACO factor loadings equal across rater perspectives (self-report, informant-report 1, informant-report 2) resulted in an acceptable CFI and MFI drop (see Table 4). This implies that change in a latent HEXACO factor variable bears a similar meaning for self- and informant scores. Metric MI could therefore be assumed, given that restricting the factor loadings to be equal across rater perspectives did not result in a meaningfully worse fit. The same was true for equal loadings across method factors. Equal intercepts, however, resulted in a non-acceptable CFI and MFI drop. Freeing the intercepts of self-rated Openness led to acceptable differences, indicating partial scalar MI. Strict MI could not be assumed when setting all residual variances equal across indicators. When freely estimating residual variances of self-rated Agreeableness and Conscientiousness, the decreases in CFI and MFI became acceptable. Thus, partial strict MI could be supported.

Table 4 Results of measurement invariance testing across self-reports and two informant-rater scores

Discussion

The current study aimed to evaluate the psychometric properties of the German HEXACO-60 self- and observer-report forms in a heterogeneous German sample. More specifically, the internal consistency, 2-year and 4-year rank-order stability, consensus and self-other agreement, factor structure, and measurement invariance across rater perspectives were put under scrutiny.

Internal consistency for self-reports was similar to reliability estimates found in prior research (Moshagen et al., 2014). Internal consistency for informant-rated scores was slightly higher, which is unsurprising as more reliable informant-composite scores were used where applicable. The reliability of both, however, was lower when compared to self-rated and informant-rated scores of the 100-item English HEXACO-PI-R, respectively (Ashton et al., 2014). This can be expected from shorter scales compared to scales with more items in total (10 vs. 16 items per domain). In summary, current findings suggest satisfactory internal consistency of the observer-report form of the German HEXACO-60.

SEM-based rank-order stability estimates were generally very high. They tended to be smaller for informant reports but were generally comparable to those of a previous study based on self-report data from New Zealand (Milojev & Sibley, 2014). Rank-order stability estimates based on raw scores from the self- and observer-report forms were, except for the Emotionality domain, lower than previously found for the German self-report form (Moshagen et al., 2014). As the retest period was considerably shorter in that earlier study (7.24 months) than in the current one (2–4 years), this is not particularly surprising. Compared to a study that examined the 2-year rank-order stability of the Honesty-Humility, Agreeableness, and Conscientiousness domains for the HEXACO-100 (Dunlop et al., 2021), we found comparable rank-order stabilities for self-reports, and the 4-year stabilities were not considerably lower. The lower stabilities that were observed for informant-rater scores compared to self-reports could in part be attributable to an exchange of raters over time because it was not required that the same informants participate in each wave. In summary, the HEXACO-60 self-reports and informant reports show solid stabilities over a considerable time span.

Informants showed substantial consensus in their assessments and targets and informants substantially agreed in their ratings in line with previous research (see Lee & Ashton, 2018; Roth & Altmann, 2019). This also indicates the psychometric soundness of the measurement instrument (De Vries et al., 2016).

The good fit of the six-factor structure model that incorporates self- and informant-ratings and the high trait-factor loadings suggest high convergent validity of trait scores and that informant-ratings serve as a meaningful additional perspective for assessing personality traits with the German HEXACO-60. The comparatively low correlations across trait factors, with a maximum of .30, support the discriminant validity of HEXACO trait scores.

The partial strict measurement invariance between the self- and informant-rater perspectives indicates that the first- and third-person report versions are comparable with regard to the scale metric, their scores for a given latent trait value, and residual errors. Only in the case of Openness to Experience do targets and informants tend to disagree in the average level of scores. Furthermore, Conscientiousness and Agreeableness self-reports tended to show higher levels of residual variance, resulting in lower estimates of reliability for self-reports on these traits compared to informant ratings. In all other cases, the German HEXACO-60 was found to be invariant across rater perspectives, at least in our study.

Findings regarding internal consistency and factor structure line up with research on the original HEXACO-60. Ashton and Lee (2009) extracted six factors where items of a given scale loaded primarily on one factor. Other language adaptations generally show similar internal consistency and support the six-factor-structure as well (e.g., Polish: Skimina et al., 2020; Italian: Di Fabio & Saklofske, 2017; Lithuanian: Truskauskaitė-Kunevičienė et al., 2012; see also https://hexaco.org). These findings taken together support the cross-cultural and cross-lingual validity of the HEXACO-60.

In conclusion, as previous research already established, the self-rater version of the German HEXACO-60 was shown to be a psychometrically sound measure. The current study replicated this finding and extended it by highlighting at least comparable psychometric qualities of the informant-rater version. Furthermore, our study confirmed the six-factor structure when also incorporating informant ratings, demonstrating factorial, convergent, and discriminant validity of the HEXACO-60 trait measures.

We thank the whole SPeADy team as well as the study participants for spending part of their lifetime with the research project. In particular, we thank (former) team members Jantje Bollmann, Michael Papendick, Angelika Penner, Elif Yalcin, Annika Overlander, Jana Willemsen, Corinna Eickes, Fynn Plugge, Jana Instinske, Hannah Sarnizei, Paula Wundersee, Rebecca Gruzman, Pauline Hirche, Julia Schneider, Kai Tippelt, and Felix Butt for their important contributions to the current study.

References

  • Ashton, M. C., & Lee, K. (2009). The HEXACO-60: A short measure of the major dimensions of personality. Journal of Personality Assessment, 91(4), 340–345. https://doi.org/10.1080/00223890902935878 First citation in articleCrossrefGoogle Scholar

  • Ashton, M. C., & Lee, K. (2016). Age trends in HEXACO-PI-R self-reports. Journal of Research in Personality, 64, 102–111. https://doi.org/10.1016/j.jrp.2016.08.008 First citation in articleCrossrefGoogle Scholar

  • Ashton, M. C., Lee, K., & De Vries, R. E. (2014). The HEXACO honesty-humility, agreeableness, and emotionality factors: A review of research and theory. Personality and Social Psychology Review, 18(2), 139–152. https://doi.org/10.1177/1088868314523838 First citation in articleCrossrefGoogle Scholar

  • Borkenau, P., & Ostendorf, F. (2008). NEO-Fünf-Faktoren-Inventar (NEO-FFI) nach Costa und McCrae: 2., neu normierte und vollständig überarbeitete Auflage [Neo-Five-Factor-Inventory according to Costa and McCrae: 2nd newly normed and completely revised edition]. Hogrefe. First citation in articleGoogle Scholar

  • Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9(2), 233–255. https://doi.org/10.1207/S15328007SEM0902_5 First citation in articleCrossrefGoogle Scholar

  • Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 14(3), 464–504. https://doi.org/10.1080/10705510701301834 First citation in articleCrossrefGoogle Scholar

  • De Vries, R. E., Realo, A., & Allik, J. (2016). Using personality item characteristics to predict single‐item internal reliability, retest reliability, and self–other agreement. European Journal of Personality, 30(6), 618–636. https://doi.org/10.1002/per.2083 First citation in articleCrossrefGoogle Scholar

  • DeYoung, C. G. (2006). Higher-order factors of the Big Five in a multi-informant sample. Journal of Personality and Social Psychology, 91(6), 1138–1151. https://doi.org/10.1037/0022-3514.91.6.1138 First citation in articleCrossrefGoogle Scholar

  • Di Fabio, A., & Saklofske, D. H. (2017). HEXACO-60: Primo contributo alla validazione della versione italiana.[HEXACO-60: First contribution to the validation of the Italian version] Counseling, 10(3). https://doi.org/10.14605/CS1031707 First citation in articleCrossrefGoogle Scholar

  • Dunlop, P. D., Bharadwaj, A. A., & Parker, S. K. (2021). Two-year stability and change among the honesty-humility, agreeableness, and conscientiousness scales of the HEXACO100 in an Australian cohort, aged 24–29 years. Personality and Individual Differences, 172, Article 110601. https://doi.org/10.1016/j.paid.2020.110601 First citation in articleCrossrefGoogle Scholar

  • Funder, D. C. (1993). Judgments as data for personality and developmental psychology: Error versus accuracy. In D. C. FunderR. D. ParkeC. Tomlinson-KeaseyK. WidamanEds., APA science Vols. Studying lives through time: Personality and development (pp. 121–146). American Psychological Association. https://doi.org/10.1037/10127-022 First citation in articleCrossrefGoogle Scholar

  • Funder, D. C. (1995). On the accuracy of personality judgment: A realistic approach. Psychological Review, 102(4), 652–670. https://doi.org/10.1037/0033-295X.102.4.652 First citation in articleCrossrefGoogle Scholar

  • Funder, D. C. (2012). Accurate personality judgment. Current Directions in Psychological Science, 21(3), 177–182. https://doi.org/10.1177/0963721412445309 First citation in articleCrossrefGoogle Scholar

  • Funder, D. C., & West, S. G. (1993). Consensus, self‐other agreement, and accuracy in personality judgment: An introduction. Journal of Personality, 61(4), 457–476. https://doi.org/10.1111/j.1467-6494.1993.tb00778.x

  • Hooper, D., Coughlan, J., & Mullen, M. R. (2008). Structural equation modelling: Guidelines for determining model fit. Electronic Journal of Business Research Methods, 6(1), 53–60.

  • Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. https://doi.org/10.1080/10705519909540118

  • Kandler, C., Penner, A., Richter, J., & Zapko-Willmes, A. (2019). The Study of Personality Architecture and Dynamics (SPeADy): A longitudinal and extended twin family study. Twin Research and Human Genetics, 22(6), 548–553. https://doi.org/10.1017/thg.2019.62

  • Lee, K., & Ashton, M. C. (2004). Psychometric properties of the HEXACO Personality Inventory. Multivariate Behavioral Research, 39(2), 329–358. https://doi.org/10.1207/s15327906mbr3902_8

  • Lee, K., & Ashton, M. C. (2018). Psychometric properties of the HEXACO-100. Assessment, 25(5), 543–556. https://doi.org/10.1177/1073191116659134

  • Letzring, T. D. (2008). The good judge of personality: Characteristics, behaviors, and observer accuracy. Journal of Research in Personality, 42(4), 914–932. https://doi.org/10.1016/j.jrp.2007.12.003

  • Letzring, T. D., & Human, L. J. (2014). An examination of information quality as a moderator of accurate personality judgment. Journal of Personality, 82(5), 440–451. https://doi.org/10.1111/jopy.12075

  • Milojev, P., & Sibley, C. G. (2014). The stability of adult personality varies across age: Evidence from a two-year longitudinal sample of adult New Zealanders. Journal of Research in Personality, 51, 29–37. https://doi.org/10.1016/j.jrp.2014.04.005

  • Moshagen, M., Hilbig, B. E., & Zettler, I. (2014). Faktorenstruktur, psychometrische Eigenschaften und Messinvarianz der deutschsprachigen Version des 60-item HEXACO Persönlichkeitsinventars [Factor structure, psychometric properties, and measurement invariance of the German-language version of the 60-item HEXACO Personality Inventory]. Diagnostica, 60(2), 86–97. https://doi.org/10.1026/0012-1924/a000112

  • R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/

  • Roth, M., & Altmann, T. (2019). A multi-informant study of the influence of targets’ and perceivers’ social desirability on self-other agreement in ratings of the HEXACO personality dimensions. Journal of Research in Personality, 78, 138–147. https://doi.org/10.1016/j.jrp.2018.11.008

  • Skimina, E., Strus, W., Cieciuch, J., Szarota, P., & Izdebski, P. (2020). Psychometric properties of the Polish versions of the HEXACO-60 and the HEXACO-100 Personality Inventories. Current Issues in Personality Psychology, 8(3), 255–278. https://doi.org/10.5114/cipp.2020.98693

  • Truskauskaitė-Kunevičienė, I., Kaniušonytė, G., Kratavičienė, R., & Kratavičiūtė-Ališauskienė, A. (2012). Psychometric properties of the Lithuanian versions of HEXACO-100 and HEXACO-60. Ugdymo Psichologija, 23, 6–14. https://www.lituanistika.lt/content/48767

  • Watson, D., Hubbard, B., & Wiese, D. (2000). Self–other agreement in personality and affectivity: The role of acquaintanceship, trait visibility, and assumed similarity. Journal of Personality and Social Psychology, 78(3), 546–558. https://doi.org/10.1037/0022-3514.78.3.546

  • Wiechers, Y., & Kandler, C. (2023, September 6). Psychometric quality, validity, and measurement invariance of the self- and observer-report form of the German HEXACO-60 Personality Inventory [Supplemental material; Code]. https://doi.org/10.17605/OSF.IO/P8HZS

  • Wiechers, Y., Zapko-Willmes, A., Richter, J., & Kandler, C. (2023). The longitudinal and multimodal age groups study of personality architecture and dynamics (SPeADy). Personality Science, 4, 1–24. https://doi.org/10.5964/ps.6421