Editorial

Assessing Behavior Difficulties in Students

Published Online: https://doi.org/10.1027/1015-5759/a000468

Is There a Need for a Special Issue on Assessing Behavior Difficulties in Students?

Research on assessment has increased dramatically in recent years. As argued by Greiff (2017), this is true for psychological assessment in general and for clinical and educational assessment in particular. Especially in the field of school psychology, the assessment of behavior difficulties has received considerable research attention (Volpe & Chafouleas, 2011). Behavioral difficulties are related to a wide range of developmental dysfunctions in a variety of domains, such as social skills, self-regulation, executive functions, attention, information processing, motor activities, emotions, and distress (for an overview, see Garner, Kauffman, & Elliott, 2014). Behavioral manifestations range from internalizing behavior (i.e., withdrawn and inhibited symptoms) to externalizing behavior (i.e., undercontrolled and disruptive symptoms). Since any behavior is embedded in a broader context (e.g., Bronfenbrenner's, 1989, ecological theory of human development and socialization), it has to be studied with respect to the relevant contextual factors.

As stated in our call for papers, the topic of behavior difficulties of students has gained a rather prominent position in debates on education and inclusion in recent years. Children and youth with behavioral difficulties present tremendous challenges to both families and schools (Landrum, 2017). In order to intervene as early as possible, it is important to have appropriate instruments for assessing core features of behavior difficulties. However, many available instruments do not take into account recent developments in scale construction (for an overview, see Danner et al., 2016). For instance, about a decade ago, measurement invariance across subgroups, such as students with and without special educational needs, was hardly ever examined. Nowadays, the analysis of measurement invariance, or equivalence in broader terms (e.g., Chen, 2008), is considered a prerequisite for comparing group differences. And yet, analyses of measurement invariance are still not applied routinely, and many publications in this research area do not even refer to this important prerequisite.
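For readers less familiar with the terminology, the usual hierarchy of invariance models can be summarized in multiple-group confirmatory factor analysis notation. The following is a generic sketch, not drawn from any specific paper in this issue: for group g, observed item scores x_g are modeled via intercepts τ_g, loadings Λ_g, latent factors ξ_g, and residuals δ_g, and the invariance levels differ in which parameters are constrained to equality across groups.

```latex
% Multiple-group CFA measurement model for group g:
x_g = \tau_g + \Lambda_g \, \xi_g + \delta_g
% Nested invariance levels (constraints across groups 1, \dots, G):
\begin{aligned}
\text{configural:} &\quad \text{same pattern of free/fixed loadings in each } \Lambda_g \\
\text{metric:}     &\quad \Lambda_1 = \Lambda_2 = \dots = \Lambda_G \\
\text{scalar:}     &\quad \text{metric constraints and } \tau_1 = \tau_2 = \dots = \tau_G \\
\text{strict:}     &\quad \text{scalar constraints and equal residual (co)variances}
\end{aligned}
```

Only from the scalar level onward can observed mean differences be attributed to the latent constructs rather than to item-level artifacts.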

This special issue aims to expand our knowledge on the assessment of behavior difficulties by introducing innovative assessments and discussing challenges that influence the quality of measurements. Further, in our call for papers, we encouraged authors to examine the available instruments in light of recent developments in the assessment of psychological constructs. In the following paragraphs, we provide an overview of the six articles that compose this special issue, all of which underwent a blind review process. Before turning to them, we would like to thank all the anonymous reviewers for their valuable comments, which ensured the high quality of the accepted papers.

The Articles Included in This Special Issue

The first paper addresses the early identification of behavioral and emotional problems in preschoolers. Based on a sample of 1,135 children aged 3–6 years, Rogge, Koglin, and Petermann (2018) first investigated the factorial structure of the German parent and teacher versions of the Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997). Second, they tested whether the screening instrument was measurement invariant across parents and teachers. The findings indicated that the original five-factor structure is disputable. In particular, the reverse-worded items showed significant cross-loadings on the prosocial behavior factor and hence seemed to reflect both the presence of a specific problem and positive (prosocial) content. The results of multiple-group comparisons suggested that strict measurement invariance across teachers and parents is tenable; mean differences in SDQ scores between parents and teachers can thus be meaningfully compared.

In the second paper, Hennig, Schramm, and Linderkamp (2018) assessed the behavioral symptoms of attention-deficit hyperactivity disorder (ADHD) as rated by three different informants. More specifically, the agreement between parent, teacher, and adolescent ratings was compared in a sample of 114 adolescents with ADHD. Cross-informant disagreement on treatment outcomes was investigated in a subsample of 54 adolescents who had undergone a training and coaching intervention. The results showed moderate agreement between informants, with the strongest discrepancy between teacher and adolescent ratings of prosocial behavior. The training was found to be more effective when teacher and adolescent ratings disagreed less. The results indicate that multiple sources should be considered when evaluating treatment effects and that one should be aware of cross-informant disagreement because of the associated risk of diminished treatment effects.

Casale et al. (2018) examined the item and scalar equivalence of the short form of the Integrated Teacher Report Form (ITRF; Volpe & Fabiano, 2013), an abbreviated school-based universal screener. Measurement invariance was analyzed in a sample of K-6 students from the US and Germany (US: N = 390; Germany: N = 965). The findings showed that all invariance models (i.e., configural, metric, scalar) held. This means that (a) both factors of the abbreviated ITRF (academic productivity problems, oppositional behavior) showed the same pattern for US and German students, (b) these factors were measured equally well by eight items per scale in both samples, and (c) US and German teachers used the 4-point Likert scale in a similar way. Although the results indicate equivalence between the samples from the two countries, other studies have found evidence of substantial observer bias in teacher ratings. It would therefore be important to examine measurement invariance across samples of students with and without emotional and behavioral disorders to identify potential differences in teacher ratings between these groups.
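The sequence of nested models that such analyses step through can be illustrated in a few lines of code. The sketch below is purely illustrative: fit_multigroup_cfa is a hypothetical helper standing in for an actual SEM package, and the ΔCFI ≤ .01 rule of thumb is one common, though not uncontested, criterion for retaining an added set of constraints.

```python
# Illustrative sketch only: fit_multigroup_cfa is a hypothetical helper
# standing in for a real SEM package and is assumed to return a fitted
# model object exposing a comparative fit index as `.cfi`.

def invariance_sequence(data, group_col):
    """Fit nested invariance models and flag where model fit degrades.

    A common rule of thumb treats a drop in CFI greater than .01 between
    adjacent models as evidence against the newly added constraints.
    """
    levels = ["configural", "metric", "scalar", "strict"]
    results, previous_cfi = {}, None
    for level in levels:
        model = fit_multigroup_cfa(data, group=group_col, constraints=level)  # hypothetical
        drop = None if previous_cfi is None else previous_cfi - model.cfi
        results[level] = {
            "cfi": model.cfi,
            "cfi_drop": drop,
            "tenable": drop is None or drop <= 0.01,
        }
        previous_cfi = model.cfi
    return results
```

Each level is retained only if the fit of the more constrained model does not deteriorate appreciably relative to the previous one; the level at which the sequence stops determines which cross-group comparisons are defensible.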

The fourth paper, authored by Carvalho, Faria, Conceição, Gaspar de Matos, and Essau (2018), focuses on the assessment of callous-unemotional traits. Callous-unemotional traits, such as limited empathy, lack of guilt, and shallow affect, are considered a subtyping index for conduct disorders. The aim of this study was to examine the factor structure, internal consistency, construct validity, and correlates of the Portuguese version of the Inventory of Callous-Unemotional Traits (ICU; Frick, 2003) in a Portuguese normative sample (N = 1,399) of children and adolescents aged 7–17 years. In general, the findings showed that the ICU is a reliable and valid instrument for assessing callous-unemotional traits among children and adolescents. Contrary to the theoretical three-factor model, a two-factor hierarchical structure best fit the data, with the two dimensions “callous” and “uncaring” loading on one general factor. Furthermore, significant main effects of gender and interaction effects between gender and age were found. In addition, scalar measurement invariance across boys and girls was not established. Taken together, the findings support the further development of the ICU, but its structure and validity need additional scrutiny.

In the fifth contribution, Jansma, Malti, Opdenakker, and van der Werf (2018) examined the reliability and validity of an instrument assessing anticipated emotions in scenario-based moral transgressions in a sample of children aged 6–13 years. The instrument proved to be a reliable one-factor measure of anticipated emotions in the context of moral transgressions. The results indicated no variability across the three hypothetical moral domains of unfairness, victimization, and omission of prosocial duties, although some variability in anticipated emotions was found across age groups. Concurrent and predictive validity were indicated insofar as anticipated negative emotions were positively related to sympathy and prosocial tendencies, whereas no relation to antisocial tendencies was found. The study thus provides preliminary evidence of the reliability and validity of the instrument.

Finally, Achenbach (2018) presents a review and summary of the multi-informant and multicultural advances of the Achenbach System of Empirically Based Assessment (ASEBA; e.g., Achenbach, 2009). The issue of differences between informants is critically reviewed, and assessment tools for multiple perspectives (teachers, parents, students, observers, interviewers, and test administrators) are presented. Cross-informant comparisons of scale scores are demonstrated in a case illustration. A detailed overview of the samples, test-retest reliabilities, and internal consistencies (Cronbach’s α) is given for the ASEBA instruments. In addition, the tools offer multicultural norms, and the author stresses the need for such norms to take societal and cultural differences into account, again using a case illustration to demonstrate their necessity. Finally, the author shows how decisions about interventions can be based on multi-informant and multicultural assessments.
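For reference, the internal consistencies reported throughout this issue rest on a simple formula. The snippet below is a minimal, generic sketch of Cronbach’s α for a respondents-by-items score matrix; it is not code from the ASEBA or any of the papers, and the ratings are fabricated for illustration.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Example with fabricated ratings (5 respondents x 4 items on a 1-5 scale):
ratings = np.array([[3, 4, 3, 4],
                    [2, 2, 3, 2],
                    [4, 4, 4, 5],
                    [1, 2, 1, 2],
                    [3, 3, 4, 3]])
print(round(cronbach_alpha(ratings), 2))  # high alpha: items covary strongly
```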

Conclusions and Perspectives

With this special issue, we aimed to contribute to advancing the measurement of behavior difficulties in students. First, well-established instruments were revisited and refined. Second, a case was made for multi-informant measures, which help to provide a more comprehensive picture of specific behaviors. Third, the relevance of testing for measurement invariance was emphasized.

While the six papers of this special issue cover much ground, many important aspects have not been touched upon. For instance, longitudinal studies are especially important for understanding how behavior problems develop and how they are affected by other variables (e.g., situation-specific factors, classroom composition). However, most of the studies in this special issue were cross-sectional. Furthermore, they predominantly used retrospective measures, which may be biased (retrospection effects; e.g., Stone & Litcher-Kelly, 2006). Moreover, situational factors influencing the development of behavior cannot easily be taken into account by retrospective assessment. We therefore encourage researchers to use instruments that measure behavior more directly and in situ. Advances in this regard would be welcome because behavioral manifestations should be conceived as person–situation interactions (rather than “just” characteristics of people), which is still not routinely addressed in research.

This conception is also mirrored in the ever-increasing interest in person–situation research in psychology (Geiser, Götz, Preckel, & Freund, 2017). This discussion is relevant in the field of behavioral assessment as well because behavior difficulties are not only stable dispositions but also vary or fluctuate across situations. For example, to better understand the behavior of a student who is described as aggressive, it is necessary to assess intensively the situations in which he or she does or does not show problem behavior, as well as to closely analyze the characteristics of these specific situations. In this respect, we see potential in innovative assessment techniques such as Direct Behavior Rating (DBR; e.g., Christ, Nelson, Van Norman, Chafouleas, & Riley-Tillman, 2014) and the Experience Sampling Method (ESM) as intensive longitudinal measurements (e.g., Hamaker & Wichers, 2017). In view of recent technological advances such as tablets, smartphones, and sensors, we are confident that these possibilities can be implemented in the near future. Returning to the example above, instead of administering a common paper-and-pencil questionnaire to assess a student’s general aggressive tendency, one could use smartphone-based sampling to assess the student’s behavioral and emotional reactions in different situations, along with his or her momentary social context.
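To make this concrete, a typical signal-contingent ESM protocol prompts students at quasi-random times within fixed blocks of the school day. The sketch below illustrates only the scheduling logic under assumed parameters (a school day from 8:00 to 15:00, six prompts, a minimum gap between prompts); the function name and defaults are our own invention, and an actual study would run such a schedule inside an ESM app on the student’s device.

```python
import random
from datetime import datetime, timedelta

def esm_schedule(day_start: datetime, day_end: datetime, n_prompts: int,
                 min_gap_minutes: int = 30) -> list[datetime]:
    """Draw one random prompt per equal time block (stratified sampling),
    enforcing a minimum gap so prompts cannot cluster at block borders."""
    block = (day_end - day_start) / n_prompts
    prompts = []
    for i in range(n_prompts):
        lo = day_start + i * block
        if prompts:  # keep at least min_gap_minutes after the last prompt
            lo = max(lo, prompts[-1] + timedelta(minutes=min_gap_minutes))
        hi = day_start + (i + 1) * block
        offset = random.uniform(0, max((hi - lo).total_seconds(), 0))
        prompts.append(lo + timedelta(seconds=offset))
    return prompts

# Example: six prompts during one school day
schedule = esm_schedule(datetime(2018, 9, 3, 8, 0), datetime(2018, 9, 3, 15, 0), 6)
for t in schedule:
    print(t.strftime("%H:%M"))
```

Stratified rather than fully random sampling ensures that the whole school day is covered while keeping individual prompts unpredictable, which is one reason this design is popular in intensive longitudinal studies.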

The advantages of these possibilities for the progress monitoring of behavior are evident. As this topic is gaining importance for evaluating responses to interventions at school, such technological advances are welcome in practice as well. Therefore, when further developing innovative assessment techniques, attention should be paid to their applicability so that methods such as DBR and ESM can become valuable instruments for practice. Nonetheless, it should be kept in mind that one of the fundamental problems of defining and assessing behavioral difficulties remains the question of what is considered “normal” and what deviates from the norm (Landrum, 2017). Much work thus also remains to be done on available assessment tools, such as updating the standardization of well-established instruments for assessing behavioral difficulties in students.

But let us now turn to the six papers of this special issue. Enjoy!

References

  • Achenbach, T. M. (2009). The Achenbach System of Empirically Based Assessment (ASEBA): Development, findings, theory, and applications. Burlington, VT: University of Vermont, Research Center for Children, Youth, and Families.

  • Achenbach, T. M. (2018). Multi-informant and multicultural advances in evidence-based assessment of students’ behavioral/emotional/social difficulties. European Journal of Psychological Assessment, 34, 127–140. https://doi.org/10.1027/1015-5759/a000448

  • Bronfenbrenner, U. (1989). Ecological systems theory. In R. Vasta (Ed.), Six theories of child development: Revised formulations and current issues (Vol. 6, pp. 187–249). Greenwich, CT: JAI Press.

  • Carvalho, M., Faria, M., Conceição, A., Gaspar de Matos, M. & Essau, C. A. (2018). Callous-unemotional traits in children and adolescents: Psychometric properties of the Portuguese version of the Inventory of Callous-Unemotional Traits. European Journal of Psychological Assessment, 34, 87–96. https://doi.org/10.1027/1015-5759/a000449

  • Casale, G., Volpe, R. J., Daniels, B., Hennemann, T., Briesch, A. M. & Grosche, M. (2018). Measurement invariance of a universal behavioral screener across samples from the USA and Germany. European Journal of Psychological Assessment, 34, 113–126. https://doi.org/10.1027/1015-5759/a000447

  • Chen, F. F. (2008). What happens if we compare chopsticks with forks? The impact of making inappropriate comparisons in cross-cultural research. Journal of Personality and Social Psychology, 95, 1005–1018.

  • Christ, T. J., Nelson, P. M., Van Norman, E. R., Chafouleas, S. M. & Riley-Tillman, T. C. (2014). Direct behavior rating: An evaluation of time-series interpretations as consequential validity. School Psychology Quarterly, 29, 157–170. https://doi.org/10.1037/spq0000029

  • Danner, D., Blasius, J., Breyer, B., Eifler, S., Menold, N., Paulhus, D. L., … Ziegler, M. (2016). Current challenges, new developments, and future directions in scale construction. European Journal of Psychological Assessment, 32, 175–180. https://doi.org/10.1027/1015-5759/a000375

  • Frick, P. J. (2003). The Inventory of Callous-Unemotional Traits. Unpublished rating scale, University of New Orleans.

  • Garner, P., Kauffman, J. & Elliott, J. (Eds.). (2014). The Sage handbook of emotional and behavioral difficulties. Los Angeles, CA: Sage.

  • Geiser, C., Götz, T., Preckel, F. & Freund, P. A. (2017). States and traits: Theories, models, and assessment. European Journal of Psychological Assessment, 33, 219–223. https://doi.org/10.1027/1015-5759/a000413

  • Greiff, S. (2017). The field of psychological assessment: Where it stands and where it’s going – A personal analysis of foci, gaps, and implications for EJPA. European Journal of Psychological Assessment, 33, 1–4. https://doi.org/10.1027/1015-5759/a000412

  • Goodman, R. (1997). The Strengths and Difficulties Questionnaire: A research note. Journal of Child Psychology and Psychiatry, 38, 581–586.

  • Hamaker, E. L. & Wichers, M. (2017). No time like the present: Discovering the hidden dynamics in intensive longitudinal data. Current Directions in Psychological Science, 26, 10–15. https://doi.org/10.1177/0963721416666518

  • Hennig, T., Schramm, S. A. & Linderkamp, F. (2018). Cross-informant disagreement on behavioral symptoms in adolescent ADHD and its impact on treatment effects. European Journal of Psychological Assessment, 34, 79–86. https://doi.org/10.1027/1015-5759/a000446

  • Jansma, D., Malti, T., Opdenakker, M.-Ch. & van der Werf, G. (2018). Assessment of anticipated emotions in moral transgressions. European Journal of Psychological Assessment, 34, 97–112. https://doi.org/10.1027/1015-5759/a000467

  • Landrum, T. J. (2017). Emotional and behavioral disorders. In J. M. Kauffman, D. P. Hallahan & P. C. Pullen (Eds.), Handbook of special education (2nd ed.). New York, NY: Routledge.

  • Rogge, J., Koglin, U. & Petermann, F. (2018). Do they rate in the same way? Testing of measurement invariance across parent and teacher SDQ ratings. European Journal of Psychological Assessment, 34, 69–78. https://doi.org/10.1027/1015-5759/a000445

  • Stone, A. A. & Litcher-Kelly, L. (2006). Momentary capture of real-world data. In M. Eid & E. Diener (Eds.), Multimethod measurement in psychology (pp. 61–72). Washington, DC: American Psychological Association.

  • Volpe, R. J. & Chafouleas, S. M. (2011). Assessment of externalizing behavioral deficits. In M. A. Bray & T. J. Kehle (Eds.), The Oxford handbook of school psychology (pp. 284–311). New York, NY: Oxford University Press.

  • Volpe, R. J. & Fabiano, G. A. (2013). Daily behavior report cards: An evidence-based system of assessment and intervention. New York, NY: Guilford Press.

Carmen L. A. Zurbriggen, Faculty of Educational Science, University of Bielefeld, Universitätsstrasse 25, 33501 Bielefeld, Germany
Susanne Schwab, School of Education, University of Wuppertal, Rainer-Gruenter-Straße 21, 42119 Wuppertal, Germany
Anke de Boer, Department of Special Needs Education & Youth Care, University of Groningen, Grote Rozenstraat 38, 9712 TJ Groningen, The Netherlands
Ute Koglin, Department of Special Education and Rehabilitation, Carl v. Ossietzky University Oldenburg, 26111 Oldenburg, Germany