Editorial

50 Facets of a Trait – 50 Ways to Mess Up?

Published online: https://doi.org/10.1027/1015-5759/a000372

In recent years, papers introducing measures of constructs with a lower-order facet structure have made up a large proportion of our submissions. Different notions of facets can be found in the literature, though. The most prominent concept of facets is the idea of hierarchically structured nomological nets. This idea has existed for a very long time. Especially in research on the structure of intelligence, it has been argued that abilities can be clustered and structured hierarchically (see Ackerman, 1996, for an excellent summary). Whereas this view has been successful in the area of mental abilities, with researchers tentatively agreeing on a structure (McGrew, 2009) and empirical evidence supporting the validity of test scores from different levels of the hierarchy (Schmidt & Hunter, 1998; Ziegler, Dietl, Danay, Vogel, & Bühner, 2011), the idea of hierarchies is not yet fully established within the assessment of non-cognitive personality aspects. One of the first proponents of this idea was Eysenck (1947), who suggested that human behavior could be summarized on four levels: specific responses, habits, traits, and types. The lowest level of this framework can be understood as the items in a questionnaire. The next levels, which we will call facets and traits (or domains), have inspired research and controversy over the last decades. A conceptually different notion of facets can be found in circumplex models (Di Blas, 2000; Martínez-Arias, Silva, Díaz-Hidalgo, Ortet, & Moro, 1999). Here, facets are operationalized as composites of two domains. One prominent example of this approach is the Abridged Big Five Circumplex (AB5C), which has inspired much research (Hofstee, De Raad, & Goldberg, 1992). Most of the papers we see conceive of facets as lower-order structures beneath a broader domain, in Eysenck's sense. We will therefore focus on this approach and only make a few side notes on circumplex models where appropriate.

Probably the best-known model of a personality hierarchy is the facetted Five Factor Model (FFM) by Costa and McCrae (1995a). However, despite its widespread use, empirical evidence for the psychometric quality of the proposed model is still lacking in many important respects (see below). This critique does not only apply to the facets of the FFM; other hierarchical measures of personality, for example the HEXACO measures (Lee & Ashton, 2004, 2006) or the IPIP (Goldberg et al., 2006), could be criticized likewise. The lack of evidence for the factorial validity of the assumed hierarchy, as tested with confirmatory factor analysis, is especially striking.

Still, the idea of facets as an important aspect of human personality is becoming more and more popular (Woods & Anderson, 2016). There is one obvious reason for this phenomenon: test-criterion correlations. Even though the bandwidth-fidelity dilemma (Cronbach & Gleser, 1965) has been discussed extensively (Ones & Viswesvaran, 1996), its general problem also applies to the question of facets versus traits. If facets are conceptualized as narrower versions of a general trait, the bandwidth-fidelity debate applies in full force. Importantly, though, it has been shown that personality facets can outperform their underlying traits when predicting a variety of criteria (Paunonen & Ashton, 2001, 2013; Ziegler, Bensch, et al., 2014; Ziegler, Danay, Schölmerich, & Bühner, 2010). However, as convincing as this evidence may seem, there is also evidence contradicting these findings (Salgado, Moscoso, & Berges, 2013).

To summarize, based on early theories of human personality, the idea of multifaceted personality traits has inspired the construction of many personality questionnaires. Empirical evidence supports the utility of these facets in many cases. Still, doubts remain, especially regarding other aspects such as reliability and construct validity (see below). This editorial is meant to provide a different look at the concept of facets. Moreover, it discusses problems that should be addressed when proposing a facetted measure.

The Concept of Facets

As outlined above, the idea of facets located beneath a more general domain is often explained based on Eysenck’s terminology. He proposed that specific responses to stimuli can be clustered into habits, or what we now call facets. Thus, facets in this sense can be understood as collections of specific behaviors with systematic interindividual differences, which occur systematically across situations and time. The next level, the trait or domain level, is meant to represent clusters of such facets (see Figure 1). In other words, the trait score itself should represent the common core built by the shared variance of all facets. Whether a factor score derived from a hierarchical model is a better estimate of this common core than a sum score is another issue. The facets themselves should be described by specific behaviors that can be representative of the common core but do not necessarily need to be located there. It would be possible for a facet to include behaviors that are not located in the common core but still belong to the trait itself. In other words, when interpreting facet and domain scores, it is important to keep in mind that the domain score represents the common core of all facets, whereas the facet scores can contain behaviors that are specific. Moreover, when speaking of the trait, one should specify whether the whole trait is meant or just the trait score. Clearly, based on the ideas just outlined, the trait itself can be more than the trait score implies. Thus, within a nomological net (Cronbach & Meehl, 1955; Ziegler, 2014), it should be defined which behaviors or specific responses form a facet and how different facets can be understood as representing a common core. This is represented in Figures 1 and 2 for the idea of hierarchically structured traits.

Figure 1 A trait hierarchy.
Figure 2 The nomological net of a trait with four facets depicted as a Venn diagram.

If we use this to formalize the individual response to an item according to classical test theory, we arrive at the following equation:

$$x = \lambda_T \cdot T + \lambda_F \cdot F + \varepsilon \quad (1)$$

where x is the observed response to an item, T is the latent trait, F is the facet-specific factor, λ_T and λ_F are the respective loadings, and ε is an error term.

In other words, the answer to an item is influenced by the latent trait, the specific variance of the facet (facet specificity), and an error. This equation mirrors the idea of two sources of systematic variance, positioned at two latent yet related hierarchy levels. In terms of test score interpretations, this means that a multifaceted and hierarchically structured trait score reflects the variance shared by all facets and, therefore, the common core of the trait as explained above. Within each item, this influence is represented by the amount of trait variance. A facet score, however, represents an area in the nomological net which is still part of the trait but can also contain behaviors that are specific to this facet and do not strongly overlap with the other facets within this trait (see Figure 2). Within an item, this systematic source of variance is represented by the variance explained by the facet. Thus, defining the nomological net of hierarchically organized traits requires defining the behaviors making up the common core as well as the specific behaviors (specifics) of a trait represented by facets. Figure 2 exemplifies this idea. Here, a trait and its four facets are depicted as Venn diagrams. The overlap between the facets represents the common core. It can also be seen that each facet contains specific variance that is representative of the trait but not shared with other facets. Moreover, there are areas of the trait not covered by any of the facets, hinting at the idea that the trait is not yet fully explored.
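To make this variance decomposition concrete, the following minimal simulation sketch generates item responses according to Equation 1. All variable names and loading values are illustrative assumptions, not estimates from any study cited here. It shows that items of the same facet correlate through both the trait and the facet specificity, whereas items from different facets of the same trait correlate through the common core only.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000  # simulated respondents

# Latent variables, all standardized: one trait and the specificities of two facets (A, B)
trait = rng.normal(size=n)
facet_spec = rng.normal(size=(n, 2))

# Illustrative loadings; the error variance keeps each item's total variance at ~1
lam_t, lam_f = 0.6, 0.5
err_sd = np.sqrt(1 - lam_t**2 - lam_f**2)

def item(facet_idx):
    """One item response per Equation 1: trait part + facet-specific part + error."""
    return lam_t * trait + lam_f * facet_spec[:, facet_idx] + rng.normal(scale=err_sd, size=n)

a1, a2 = item(0), item(0)  # two items of facet A
b1 = item(1)               # one item of facet B

# Items of the same facet share trait AND facet-specific variance ...
print(np.corrcoef(a1, a2)[0, 1])  # ~ lam_t**2 + lam_f**2 = 0.61
# ... whereas items of different facets share only the common core (trait) variance
print(np.corrcoef(a1, b1)[0, 1])  # ~ lam_t**2 = 0.36
```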

Importantly, this differentiation into two systematic sources of variance has far-reaching implications for the estimation of reliability and for the validation strategy. The challenge during the validation process is to show that the specific variance of a facet is an essential part of the trait’s nomological net and not just additionally measured variance that actually represents a different construct. At the same time, having two sources of systematic variance complicates the interpretation of test-criterion correlations. Again, we need to stress that all of this mostly refers to hierarchically structured traits. Circumplex models, which combine facets in a different way, pose different problems.

Another important aspect we need to mention here is the notion of facets as situation-specific or domain-specific manifestations of a trait. One example is school-subject-specific measures of achievement motivation (Sparfeldt et al., 2015). Here, the idea of a facet is combined with situational perception (Horstmann & Ziegler, 2016; Ziegler & Horstmann, 2015; Ziegler & Ziegler, 2015). In such cases, it is important to clearly state whether it is assumed that there actually is a lower-order facet of a specific trait, like mathematical achievement motivation, or whether the facet is a combination of the trait and situational perception (Rauthmann & Sherman, 2016; Rauthmann, Sherman, & Funder, 2015). The latter case would require extending or adjusting the equation stated above to include situational perception and interactions between situational perception and the trait measured.
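One way such an extension could be written (this concrete form is our illustration; the argument above only states that situational perception and its interaction with the trait would have to be added) is:

$$x = \lambda_T \cdot T + \lambda_S \cdot S + \lambda_{T \times S} \cdot (T \times S) + \varepsilon$$

where S denotes situational perception and T × S the interaction between the trait and situational perception.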

Facets and Reliability Estimates

Despite widespread critique (Cronbach & Shavelson, 2004; McCrae, 2015; McCrae, Kurtz, Yamagata, & Terracciano, 2011; Ziegler, Poropat, & Mell, 2014; Zinbarg, Revelle, Yovel, & Li, 2005), Cronbach’s alpha (Cronbach, 1951) is still the reliability estimate chosen in most papers published here and elsewhere. An important assumption of this reliability estimate is tau-equivalence of the items. In almost all cases, this assumption is violated. It is often argued, though, that alpha is still a lower-bound estimate if the items are at least congeneric. In other words, if the items are at least unidimensional, alpha is a usable estimator. Applying this to facets and traits introduces a problem. We know that the items are affected by two latent entities: the trait and the facet specifics. Even though the assumption of unidimensionality could still be met (Ziegler & Hagemann, 2015) if all items were affected by the two sources in a comparable fashion (Bejar, 1983), considering the loadings often seen for facetted measures, this seems doubtful at best. However, if the assumption of unidimensional items does not hold, Cronbach’s alpha is not a suitable estimate of reliability. In fact, Cronbach (1951) suggested: “Tests divisible into distinct subtests should be so divided before using the formula” (p. 297). This alone would mean that alpha should not be used to estimate the reliability of the trait score interpretation. Here, composite reliabilities should be used (Raykov & Pohl, 2013).

Moreover, the fact that there are two sources of variance is also problematic for the reliability estimate of the facet score interpretation. The formula for Cronbach’s alpha capitalizes on the correlations between items: the larger (and more positive) these are, the larger the reliability estimate. However, the size of the correlations between items belonging to a facet is driven not only by the facet-specific variance but also by the trait variance. Thus, the estimate could be too high. Similar criticism could be brought forward against facets within circumplex models.
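The following sketch illustrates this inflation under the model of Equation 1; the loadings, sample size, and helper function are illustrative assumptions. Alpha for a facet scale tracks the total systematic variance of its items, of which the facet specificity is only one part.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50_000, 4  # simulated respondents, items per facet scale

def cronbach_alpha(x):
    """Alpha from the item covariance matrix: m/(m-1) * (1 - tr(C) / var(sum score))."""
    c = np.cov(x, rowvar=False)
    m = c.shape[0]
    return m / (m - 1) * (1 - np.trace(c) / c.sum())

# Each item mixes trait and facet-specific variance (illustrative loadings)
trait = rng.normal(size=n)
facet = rng.normal(size=n)
lam_t, lam_f = 0.6, 0.4
err_sd = np.sqrt(1 - lam_t**2 - lam_f**2)
items = lam_t * trait[:, None] + lam_f * facet[:, None] + rng.normal(scale=err_sd, size=(n, k))

# Alpha is pushed up by the trait variance the items share with ALL other facets ...
print(f"alpha of the facet scale: {cronbach_alpha(items):.2f}")  # ~ .81
# ... although facet specificity makes up only part of each item's systematic variance
print(f"facet-specific share: {lam_f**2 / (lam_t**2 + lam_f**2):.2f}")  # ~ .31
```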

The easiest way to deal with this would be to stop speaking of an estimate of facet score reliability. Whenever Cronbach’s alpha is used, it would be more correct to speak of the reliability of the facet score plus the trait influence. A more practical way would be to use other estimates such as McDonald’s omega (Ziegler & Brunner, 2016). This, however, could yield reliability estimates rendering facet scores practically useless (Brunner & Süß, 2005). While this might seem devastating at first, it just stresses the importance of obtaining test-retest correlations, which might not only be more appropriate but also seem to be more consequential for the validity of a score interpretation (McCrae, 2015; McCrae et al., 2011).
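A minimal sketch of this decomposition logic, assuming standardized loadings from a bifactor-style model are already available (the values are made up for illustration and match the simulation above): composite reliability can be split into a share due to the common core and a share due to facet specificity, and the latter alone can indeed turn out to be sobering.

```python
import numpy as np

# Standardized loadings, e.g. taken from a previously fitted bifactor-style model;
# the values are illustrative assumptions, not estimates from any published measure.
lam_trait = np.array([0.6, 0.6, 0.6, 0.6])  # loadings on the general trait factor
lam_facet = np.array([0.4, 0.4, 0.4, 0.4])  # loadings on the facet-specific factor
theta = 1 - lam_trait**2 - lam_facet**2     # residual (error) variances

# Variance of the unit-weighted facet sum score implied by this model
var_sum = lam_trait.sum()**2 + lam_facet.sum()**2 + theta.sum()

omega_total = (lam_trait.sum()**2 + lam_facet.sum()**2) / var_sum  # all systematic variance
share_trait = lam_trait.sum()**2 / var_sum  # reliability due to the common core alone
share_facet = lam_facet.sum()**2 / var_sum  # reliability due to facet specificity alone

print(f"omega total: {omega_total:.2f}")  # ~ .81, close to alpha in the sketch above
print(f"trait share: {share_trait:.2f}, facet-specific share: {share_facet:.2f}")  # .56 / .25
```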

Facets and Construct Validity

In order to provide evidence for construct validity, it is customary to show convergent and discriminant correlations for the scores derived from a measure (Campbell & Fiske, 1959; Ziegler, 2014). As expressed by the equation above, items forming a facet scale within a hierarchically organized model should contain two systematic sources of variance, that is, trait variance and facet specificity. Therefore, for scores from facetted measures, construct validity has to be shown for both sources. Convergent validity evidence for trait scores can be a tricky business (Miller, Gaughan, Maples, & Price, 2011; Pace & Brannick, 2010) because of the many different ways traits are being interpreted (Ziegler, Booth, & Bensch, 2013). However, if attention is paid to differences due to test family, such evidence can be obtained relatively easily. The same cannot be said for convergent validity evidence for facet score interpretations.

Since there is no generally agreed-upon facet framework for most traits, finding convergent measures is complicated. In the case of the Big Five, a recent paper by Woods and Anderson (2016) could pave the way for such a framework. In any case, evidence for the convergent validity of facet score interpretations requires a detailed definition of the nomological net of the facet, the trait, and also the other facets within the trait, as outlined above. Ideally, the nomological net for hierarchically structured multifaceted traits should also contain information regarding neighboring or overlapping constructs (and their facets). This seems to be the only way to theoretically justify the selection of convergent facet measures. Admittedly, this is a high bar to clear. Still, the effect could be a welcome slimming cure for such models.

Evidence regarding discriminant validity seems even more important. There are at least two general pitfalls: (1) the facet specificity is not distinct from the specificity of other facets within the same or related traits; (2) the facet specificity is not really part of the trait’s nomological net but captures a different trait altogether.

The first pitfall, that is, facet specificity that is correlated with the specificity of another facet within the same or a related trait, was acknowledged early on. Costa and McCrae (1995b), for example, admitted that the correlations between their FFM scores might be due to a lack of discriminant validity of some of the facets. Thus, the problem has been known for a while, and different solutions have been proposed, most of which rely on factor-analytic methods. It should be noted that distinguishing the facets within one trait should already be regarded as evidence for discriminant validity. Confirmatory approaches such as structural equation modeling seem to be mandatory at the final stage of the validation process. However, if such models only contain facets from one trait, the problem of facet specificity not being unique to one trait but also being present in facets of other traits (pitfall 1) cannot be detected. Thus, it is important to test models containing more than one trait and its facets (e.g., Beauducel & Kersting, 2002).

The second pitfall would mean that the variance captured within the facet is systematic but not part of the nomological net it is supposed to represent. An example can be found in the work by Marsh exploring the structure of self-esteem (Marsh, 1986, 1996). His work showed that a purported facet of self-esteem, negative self-esteem, actually represented differences in verbal ability due to negative item keying. Thus, the facet did not really represent the trait’s nomological net but a totally different construct. Such issues are not always easy to detect. Rigorous theoretical definitions of the facets, their anchoring within the nomological net of the trait, and the distinction from neighboring traits and their facets are an important first step. Based on these definitions, it should be possible to empirically test whether the facet specificity is uniquely associated with the trait in question or simply represents a different trait altogether.

Test-Criterion Correlations as Evidence for Criterion Validity

Many papers presenting evidence for test scores’ criterion validities contain long and elaborate tables with correlations between the scores under scrutiny and (hopefully) theoretically selected criteria. While this may suffice for scores representing distinct traits, it does not work to show that a facet score interpretation within a hierarchically structured multifaceted trait framework can be used to predict behavior. Again, we come back to the equation. The facet score contains variance due to the trait in question and facet specificity. The former variance source, the common core, is also shared by all other facets within the trait. Thus, correlations with criteria can be misleading. This would be the case if the correlations were driven only by the trait variance (common core, see Figure 2) and not by the facet-specific variance. The test-criterion correlations for the facet scores would still differ according to the amount of trait variance reflected in the facet score variance. However, the facet itself would not be needed to explain the test-criterion correlation; only the trait score, as a reflection of the common core, would be needed. Thus, in order to show that a facet score has unique test-criterion correlations, it is necessary to control for the common core, that is, the overlap between the facets. This is usually done using regression analyses (Ziegler, Bensch, et al., 2014; Ziegler et al., 2010) or bifactor modeling (Leue & Beauducel, 2011; Ziegler & Bühner, 2009). Only if a facet score retains a significant and substantial criterion relation in such an analysis would its practical use be advisable. We also want to remark that during this procedure it could be vital to include other traits or their facets to show that it really is the specificity of the facet in question that predicts a criterion and not a third, unwanted source of variance.
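A minimal sketch of this regression logic on simulated data follows; all coefficients, score definitions, and the helper function are illustrative assumptions, not the analyses of the cited studies. The increment in R² when the facet score is added over the trait score indicates a criterion relation beyond the common core.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000

# Illustrative data-generating model: the criterion depends on the common core
# AND on the specificity of one facet
trait = rng.normal(size=n)
facet_spec = rng.normal(size=n)
criterion = 0.4 * trait + 0.3 * facet_spec + rng.normal(scale=0.87, size=n)

trait_score = trait + rng.normal(scale=0.5, size=n)               # proxy for the common core
facet_score = trait + facet_spec + rng.normal(scale=0.5, size=n)  # common core + specificity

def r_squared(predictors, y):
    """R^2 of an OLS regression with intercept."""
    X = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return 1 - (y - X @ beta).var() / y.var()

r2_trait = r_squared(trait_score, criterion)
r2_both = r_squared(np.column_stack([trait_score, facet_score]), criterion)

# The increment shows a criterion relation of the facet beyond the common core
print(f"R2 trait only: {r2_trait:.3f}; R2 trait + facet: {r2_both:.3f}")
```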

An interesting approach that also builds on these ideas has been proposed by Siegling, Petrides, and Martskvishvili (2015) to identify facets that should be deleted from a trait’s nomological net. In this multistep approach, extraneous and redundant facets are identified. Some of the ideas outlined above, that is, facets reflecting traits that are not part of the target trait’s nomological net, also appear there.

A final note here concerns the selection of the right criteria. In order to show that a facet score “works,” criteria need to be selected that reflect the specificities of the facets and not just the common core.

Editorial Guidelines

Clearly, the issues discussed above call for one general conclusion: facetted measures need to be built on strong theoretical assumptions and must not result from merely empirically driven hunts for more scores. We already admitted that this is a high bar to clear. Still, we strongly believe that in the end the benefits will outweigh the trouble. Paraphrasing Newton’s version of Occam’s razor, we could say that “facets are not to be multiplied beyond necessity.”

Another important implication is the differentiation between the trait represented in the nomological net and the scores derived from the measure. We showed above that the trait score reflects the common core and that facet scores reflect a proportion of this core plus some specific variance. The trait in the nomological net, however, comprises all of the behaviors in question. It is therefore important to pay attention to the correct wording when talking about a trait, a trait score, or a facet score.

Finally, we want to suggest some editorial guidelines that papers presenting reliability or validity evidence for a hierarchically organized, multifaceted trait measure should adhere to:

  1. The nomological net of the trait should be defined as outlined above.
  2. Reliability estimates and their interpretation should reflect the existence of more than one source of systematic variance.
  3. The validation strategy should be informed by the nomological net. For construct validation, this means that convergent and discriminant test scores should acknowledge the facet as well as the trait level. Validity evidence regarding structure should use confirmatory methods. Finally, validity evidence regarding test-criterion correlations should use methods that ensure that the specific relation between a facet and a criterion is differentiated from the common core’s overlap with that criterion.

References

  • Ackerman, P. L. (1996). A theory of adult intellectual development: Process, personality, interests, and knowledge. Intelligence, 22, 227–257.

  • Beauducel, A., & Kersting, M. (2002). Fluid and crystallized intelligence and the Berlin Model of Intelligence Structure (BIS). European Journal of Psychological Assessment, 18, 97–112.

  • Bejar, I. I. (1983). Achievement testing: Recent advances. Beverly Hills, CA: Sage.

  • Brunner, M., & Süß, H. M. (2005). Analyzing the reliability of multidimensional measures: An example from intelligence research. Educational and Psychological Measurement, 65, 227–240.

  • Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.

  • Costa, P. T., & McCrae, R. R. (1995a). Domains and facets: Hierarchical personality assessment using the Revised NEO Personality Inventory. Journal of Personality Assessment, 64, 21–50.

  • Costa, P. T., & McCrae, R. R. (1995b). Solid ground in the wetlands of personality: A reply to Block. Psychological Bulletin, 117, 216–220.

  • Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.

  • Cronbach, L. J., & Gleser, G. (1965). The bandwidth-fidelity dilemma. In Psychological tests and personnel decisions (pp. 97–107).

  • Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.

  • Cronbach, L. J., & Shavelson, R. J. (2004). My current thoughts on coefficient alpha and successor procedures. Educational and Psychological Measurement, 64, 391–418.

  • Di Blas, L. (2000). A validation study of the Interpersonal Circumplex Scales in the Italian language. European Journal of Psychological Assessment, 16, 177.

  • Eysenck, H. J. (1947). The dimensions of human personality. London, UK: Routledge & Kegan Paul.

  • Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The international personality item pool and the future of public-domain personality measures. Journal of Research in Personality, 40, 84–96.

  • Hofstee, W. K., De Raad, B., & Goldberg, L. R. (1992). Integration of the Big Five and circumplex approaches to trait structure. Journal of Personality and Social Psychology, 63, 146–163.

  • Horstmann, K. T., & Ziegler, M. (2016). Situational perception: Its theoretical foundation, assessment, and links to personality. In U. Kumar (Ed.), The Wiley handbook of personality assessment (pp. 31–43). Oxford, UK: Wiley.

  • Lee, K., & Ashton, M. C. (2004). Psychometric properties of the HEXACO Personality Inventory. Multivariate Behavioral Research, 39, 329–358.

  • Lee, K., & Ashton, M. C. (2006). Further assessment of the HEXACO Personality Inventory: Two new facet scales and an observer report form. Psychological Assessment, 18, 182–191.

  • Leue, A., & Beauducel, A. (2011). The PANAS structure revisited: On the validity of a bifactor model in community and forensic samples. Psychological Assessment, 23, 215–225.

  • Marsh, H. W. (1986). Negative item bias in ratings scales for preadolescent children: A cognitive-developmental phenomenon. Developmental Psychology, 22, 37–49.

  • Marsh, H. W. (1996). Positive and negative global self-esteem: A substantively meaningful distinction or artifactors? Journal of Personality and Social Psychology, 70, 810–819.

  • Martínez-Arias, R., Silva, F., Díaz-Hidalgo, M. T., Ortet, G., & Moro, M. (1999). The structure of Wiggins’ interpersonal circumplex: Cross-cultural studies. European Journal of Psychological Assessment, 15, 196–205.

  • McCrae, R. R. (2015). A more nuanced view of reliability: Specificity in the trait hierarchy. Personality and Social Psychology Review, 19, 97–112.

  • McCrae, R. R., Kurtz, J. E., Yamagata, S., & Terracciano, A. (2011). Internal consistency, retest reliability, and their implications for personality scale validity. Personality and Social Psychology Review, 15, 28–50.

  • McGrew, K. (2009). CHC theory and the human cognitive abilities project: Standing on the shoulders of the giants of psychometric intelligence research. Intelligence, 37, 1–10.

  • Miller, J. D., Gaughan, E. T., Maples, J., & Price, J. (2011). A comparison of agreeableness scores from the Big Five Inventory and the NEO PI-R: Consequences for the study of narcissism and psychopathy. Assessment, 18, 335–339.

  • Ones, D., & Viswesvaran, C. (1996). Bandwidth-fidelity dilemma in personality measurement for personnel selection. Journal of Organizational Behavior, 17, 609–626.

  • Pace, V. L., & Brannick, M. T. (2010). How similar are personality scales of the “same” construct? A meta-analytic investigation. Personality and Individual Differences, 49, 669–676.

  • Paunonen, S. V., & Ashton, M. C. (2001). Big Five factors and facets and the prediction of behavior. Journal of Personality and Social Psychology, 81, 524–539.

  • Paunonen, S. V., & Ashton, M. C. (2013). On the prediction of academic performance with personality traits: A replication study. Journal of Research in Personality, 47, 778–781.

  • Rauthmann, J. F., & Sherman, R. A. (2016). Measuring the Situational Eight DIAMONDS characteristics of situations. European Journal of Psychological Assessment, 32, 165–174.

  • Rauthmann, J. F., Sherman, R. A., & Funder, D. C. (2015). Principles of situation research: Towards a better understanding of psychological situations. European Journal of Personality, 29, 363–381.

  • Raykov, T., & Pohl, S. (2013). On studying common factor variance in multiple-component measuring instruments. Educational and Psychological Measurement, 73, 191–209.

  • Salgado, J. F., Moscoso, S., & Berges, A. (2013). Conscientiousness, its facets, and the prediction of job performance ratings: Evidence against the narrow measures. International Journal of Selection and Assessment, 21, 74–84.

  • Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274.

  • Siegling, A. B., Petrides, K. V., & Martskvishvili, K. (2015). An examination of a new psychometric method for optimizing multi-faceted assessment instruments in the context of trait emotional intelligence. European Journal of Personality, 29, 42–54.

  • Sparfeldt, J. R., Brunnemann, N., Wirthwein, L., Buch, S. R., Schult, J., & Rost, D. H. (2015). General versus specific achievement goals: A re-examination. Learning and Individual Differences, 43, 170–177.

  • Woods, S. A., & Anderson, N. R. (2016). Toward a periodic table of personality: Mapping personality scales between the five-factor model and the circumplex model. Journal of Applied Psychology, 101, 582–604.

  • Ziegler, M. (2014). Stop and state your intentions! Let’s not forget the ABC of test construction. European Journal of Psychological Assessment, 30, 239–242.

  • Ziegler, M., Bensch, D., Maaß, U., Schult, V., Vogel, M., & Bühner, M. (2014). Big Five facets as predictor of job training performance: The role of specific job demands. Learning and Individual Differences, 29, 1–7.

  • Ziegler, M., Booth, T., & Bensch, D. (2013). Getting entangled in the nomological net. European Journal of Psychological Assessment, 29, 157–161.

  • Ziegler, M., & Brunner, M. (2016). Test standards and psychometric modeling. In A. A. Lipnevich, F. Preckel, & R. Roberts (Eds.), Psychosocial skills and school systems in the 21st century (pp. 29–55). New York, NY: Springer.

  • Ziegler, M., & Bühner, M. (2009). Modeling socially desirable responding and its effects. Educational and Psychological Measurement, 69, 548–565.

  • Ziegler, M., Danay, E., Schölmerich, F., & Bühner, M. (2010). Predicting academic success with the Big 5 rated from different points of view: Self-rated, other rated and faked. European Journal of Personality, 24, 341–355.

  • Ziegler, M., Dietl, E., Danay, E., Vogel, M., & Bühner, M. (2011). Predicting training success with general mental ability, specific ability tests, and (un)structured interviews: A meta-analysis with unique samples. International Journal of Selection and Assessment, 19, 170–182.

  • Ziegler, M., & Hagemann, D. (2015). Testing the unidimensionality of items: Pitfalls and loopholes. European Journal of Psychological Assessment, 31, 231–237.

  • Ziegler, M., & Horstmann, K. (2015). Discovering the second side of the coin: Integrating situational perception into psychological assessment. European Journal of Psychological Assessment, 31, 69–74.

  • Ziegler, M., Poropat, A., & Mell, J. (2014). Does the length of a questionnaire matter? Journal of Individual Differences, 35, 250–261.

  • Ziegler, M., & Ziegler, J. (2015). Better understanding of psychological situations: Opportunities and challenges for psychological assessment. European Journal of Personality, 29, 418–419.

  • Zinbarg, R. E., Revelle, W., Yovel, I., & Li, W. (2005). Cronbach’s α, Revelle’s β, and McDonald’s ωH: Their relations with each other and two alternative conceptualizations of reliability. Psychometrika, 70, 123–133.

Matthias Ziegler, Institut für Psychologie, Humboldt Universität zu Berlin, Rudower Chaussee 18, 12489 Berlin, Germany, Tel. +49 30 2093-9447, Fax +49 30 2093-9361, E-mail
Martin Bäckström, Lund University, Dept. of Psychology, Box 117, 221 00 Lund, Sweden, E-mail