Open Access · Original Article

Measuring Anxiety Symptoms – Psychometric Properties of the Dutch Version of the Lehrer–Woolfolk Anxiety Scale Questionnaire

Published Online: https://doi.org/10.1027/2698-1866/a000063

Abstract

Introduction: The Lehrer–Woolfolk Anxiety Symptom Questionnaire (LWASQ) is a self-report questionnaire, based on the tripartite model of Lang (1971), for measuring treatment progress. However, little is known so far about its psychometric qualities. Two studies were conducted to clarify the factor structure and reliability of the LWASQ. Method: Factor structure and internal consistency of the LWASQ were assessed using a sample of 2,117 patients with anxiety disorders. Test–retest reliability was measured with a three-week interval, using a sample of 49 people from the general population. We aimed to measure construct validity in a third sample of patients with anxiety disorders but were not able to do so because of a small sample size. Results: Factor analysis confirmed the three known factors of the LWASQ, i.e., somatic complaints, cognitive problems, and behavioral complaints. Both internal consistency for all three subscales (ω = .852–.927) and test–retest reliability (r = .88) were found to be good. Conclusion: Psychometric properties of the LWASQ are promising, but further validation is needed to draw more definitive conclusions regarding its usefulness in a clinical population.

Lang’s tripartite model of fear (1971) states that anxiety can be divided into a cognitive, a behavioral, and a somatic response system. These three dimensions are weakly correlated with each other and may change independently of each other over the course of treatment. Treatments for anxiety, and the definitions of anxiety disorders, are often based on all three of these modalities. Therefore, to evaluate the progress of anxiety treatments, it is important to have questionnaires that measure these three modalities of fear, so that progress on each modality can be tracked separately. The Lehrer–Woolfolk Anxiety Symptom Questionnaire (LWASQ; Lehrer & Woolfolk, 1982), a 36-item self-report questionnaire developed to measure general anxiety symptoms, seems suitable for this goal, as it consists of three subscales: somatic complaints, cognitive problems, and behavioral complaints. The somatic subscale measures somatic symptoms of anxiety, the cognitive subscale refers to a tendency to worry and ruminate, and the behavioral subscale measures avoidance of social situations. The three subscales are correlated with each other. Rose and Devine (2022) mention the LWASQ briefly as a self-report questionnaire for generic anxiety that can be used for screening purposes and outcome assessment. The LWASQ is the only questionnaire mentioned by Rose and Devine (2022) that measures all three modalities as distinct constructs. However, little is known about its psychometric properties, as only two studies have reported on the psychometric qualities of this questionnaire (Lehrer & Woolfolk, 1982; Scholing & Emmelkamp, 1992). In the Netherlands, the Dutch version of the LWASQ is used within the largest outpatient mental health care provider in the country, PsyQ, for measuring symptoms of anxiety as part of routine outcome measurement. Given this widespread use of the LWASQ, it is important that the psychometric properties of the Dutch version are studied.

Lehrer and Woolfolk (1982) designed the original questionnaire. They started with two lists of questions derived from existing questionnaires, the Minnesota Multiphasic Personality Inventory (MMPI; Hathaway & McKinley, 1951) and the Spielberger State-Trait Anxiety Inventory (STAI; Spielberger et al., 1970), and from the authors’ own clinical experience. The questions covered anxiety symptoms in the three modalities of somatic complaints, cognitive problems, and behavioral complaints. Each item was rated on a 9-point Likert scale ranging from never to almost always. In their first study, they tested a 60-item list on 451 undergraduate psychology students. In their second study, a 112-item list was tested on a mixed population (289 adult night school students, 70 neurotic patients with anxiety problems, and 67 stress workshop participants) using principal component analysis. This analysis yielded the three expected factors: a somatic complaints factor, a behavioral complaints factor, and a cognitive complaints factor. They created the LWASQ from the 36 items that loaded higher than .5 on the rotated factors of Study 2. Examples of items are “I feel dizzy,” “I try to avoid social gatherings,” and “I picture some future misfortune.” The three factors correlated more strongly with each other than expected (range .47 to .66), which means they are not completely independent of each other. Split-half reliability was between r = .83 and r = .85 for the three factors in Study 1 and between r = .91 and r = .93 in Study 2. Further psychometric properties were examined in a subsequent study: Scholing and Emmelkamp (1992) translated the LWASQ and changed the 9-point Likert scale into a 5-point Likert scale to make it easier for patients to score the items. Using a confirmatory factor analysis (CFA), they confirmed the factor structure of the Groninger Angst Schaal (GAS), the Dutch translation of the LWASQ. Correlations between the three factors ranged from .38 to .62 across the groups of social phobic patients, normal population, and adolescents. They further found that internal consistency ranged from α = .83 to α = .92 for the three subscales in all three groups.

As Scholing and Emmelkamp (1992) state, the construct validity results reported by Lehrer and Woolfolk (1982) are contradictory. Some results suggest that there is no differentiation between the three subscales: for example, the Trait Anxiety Scale of the STAI (Spielberger et al., 1970) and the IPAT Anxiety Inventory (Krug et al., 1976) correlated significantly with all three subscales of the LWASQ. In contrast, the somatic subscale of the Symptom Checklist-90 (SCL-90; Arrindell & Ettema, 2005) and the Eysenck introversion subscale (Eysenck & Eysenck, 1963) showed correlations that corroborate a difference between the three subscales. Scholing and Emmelkamp (1992) also studied convergent and divergent validity with other self-report measures such as the Dutch versions of the SCL-90 (Arrindell & Ettema, 2005), the Fear Questionnaire (FQ; Arrindell & Emmelkamp, 1984), and the Social Cognitions Inventory (SCI; Van Kamp & Klip, 1981). The somatic complaints subscale of the SCL-90 correlated strongly (.75 and .84) with the somatic subscale of the LWASQ; correlations with the other subscales of the LWASQ were at least .27 lower. The behavioral subscale was compared with the FQ social phobia scale and the SCI. As expected, the correlation between the FQ social phobia scale and the behavioral subscale was high (.67). Contrary to their hypothesis, the behavioral subscale also correlated substantially with the SCI (.52). However, the cognitive subscale had a relatively low correlation with the SCI (.42). This could be explained by the fact that the SCI primarily measures cognitive complaints linked to social phobia, while the LWASQ cognitive subscale also covers other cognitive complaints, as seen in generalized anxiety disorder or obsessive-compulsive disorder. This is reflected in the strong correlations of the cognitive subscale with the depression and interpersonal sensitivity subscales of the SCL-90 (.72 and .74). In conclusion, the somatic complaints factor possessed good convergent and divergent validity, whereas the results for the behavioral and cognitive subscales are more ambiguous.

In sum, the reported psychometric properties are inconsistent. The factor structure of the LWASQ and its Dutch translation seems sound but has not been tested in a population with diverse anxiety disorders. Lehrer and Woolfolk (1982) used only a small sample of anxious patients, and it is not clear whether the anxious patients who participated were diagnosed with an anxiety disorder, whereas Scholing and Emmelkamp (1992) studied a normal population (n = 103), an adolescent population (n = 650), and a social phobic sample (n = 108). They found that the LWASQ was able to discriminate between social phobic patients and a normal population but suggested further research in a mixed anxiety sample to assess whether the questionnaire can discriminate between different anxiety disorders. The construct validity, however, seems poor. The results found by Lehrer and Woolfolk (1982) contradict each other, and the measures used do not all assess the same construct that the LWASQ measures. Scholing and Emmelkamp (1992) also found ambiguous results: the somatic complaints factor seems to have good divergent and convergent validity, but the behavioral complaints and cognitive problems factors show good convergent validity and low divergent validity. This could be explained by the fact that Scholing and Emmelkamp (1992) only used patients with a social phobia, while the questionnaire covers aspects of anxiety that are not always part of social phobia. Furthermore, neither study measured test–retest reliability.

The goal of the present study was to evaluate the psychometric properties of the Dutch version of the LWASQ in a mixed anxiety sample. With this in mind, we conducted three substudies. In the first study, we examined the factor structure using a confirmatory factor analysis and the internal consistency using McDonald’s ω; both analyses were performed in a clinical sample of patients with an anxiety disorder. We expected to find the same three factors as in both previous studies and expected them to be moderately correlated. The second study measured the test–retest reliability in a normal population; we expected the test–retest reliability to be good (r > .80). The third study was designed to measure the convergent and divergent validity in a clinical sample of patients with an anxiety disorder. However, given the small sample size, we could not draw any conclusions for this study and decided not to report its outcomes in this paper.

Study 1: Factor Structure and Internal Consistency in a Clinical Sample of Patients With an Anxiety Disorder

Study Design

We used pretreatment routine outcome measurement (ROM) data collected at PsyQ since 2015, the year the LWASQ was introduced into the PsyQ ROM. In the Netherlands, routine outcome measurement is required; therefore, every patient who is referred to PsyQ receives an e-mail with the ROM. The ROM is sent every three months to measure therapy progress. These data are anonymized and used for research if patients gave consent (consent is requested in the first questionnaire). The data used for this study were requested from the data processing department of PsyQ, and before the data were sent to us, a data protection impact assessment was executed: to ensure that no privacy laws would be violated by using these data, a data protection officer reviewed the study design and approved the use of the data. The ROM data were anonymized before they were sent to the researchers, and only the complete LWASQ data at first intake or first treatment appointment at the department of anxiety disorders, together with diagnosis and gender, were provided to the researchers. There were no missing values.

Subjects

The ROM measurements of 2,165 patients were made available to the researchers by the data processing department of PsyQ. For unknown reasons, 48 patients had completed two pretreatment measurements; for these patients, the first administered measurement was included. Of the 2,117 patients who were included, 63.4% were women. Two thirds of the patients had an anxiety disorder (53.3%) or an OCD or OCD-related disorder (13.7%) as main diagnosis. In the remaining one third of the patients, a comorbid anxiety or OCD disorder was present when the LWASQ was administered. In this subgroup of patients with a comorbid anxiety or OCD disorder, the main DSM-5 diagnoses were mood disorders (9.9%), personality disorders (6.2%), PTSD or PTSD-related disorders (4.1%), ADHD (1.6%), eating disorders (0.4%), or other disorders (3.4%). Group characteristics are presented in Table 1.

Table 1 Group characteristics of a clinical sample of patients with an anxiety disorder in Study 1

Analysis and Results

We conducted a CFA using maximum likelihood estimation in JASP (JASP Team, 2023); see Figure 1 for our hypothesized model. We used the following benchmarks: the CFI should be larger than .90 (Hu & Bentler, 1999), the RMSEA point estimate and the upper bound of its 95% confidence interval should be smaller than .05 (Browne & Cudeck, 1992; Jöreskog & Sörbom, 1993), and the SRMR should be smaller than .08 (Hu & Bentler, 1999). As Heene et al. (2011) and Greiff and Heene (2017) point out, the fit criteria described by Hu and Bentler (1999) may not hold within the social sciences, so we also examined global misfit of the model. In addition, we used the indications of Little (2013) for fit index values: < .85 is poor, between .85 and .90 is mediocre, between .90 and .95 is acceptable, between .95 and .99 is very good, and > .99 is outstanding. The three factors of somatic complaints, cognitive problems, and behavioral complaints were based on the studies of Lehrer and Woolfolk (1982) and Scholing and Emmelkamp (1992). We expected a moderate correlation between the three factors. We expected Items 1, 2, 4, 7, 10, 13, 14, 18, 20, 23, 29, 30, 31, 33, 34, and 35 to load on the factor somatic complaints; Items 3, 6, 9, 12, 17, 22, 25, 26, and 28 to load on the factor behavioral complaints; and Items 5, 8, 11, 15, 16, 19, 21, 24, 27, 32, and 36 to load on the factor cognitive problems.

Figure 1 Hypothesized confirmatory factor analysis model.
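To make the hypothesized measurement model concrete, the sketch below specifies it in lavaan-style syntax and fits it with the open-source semopy package. This is only an illustration under assumptions: the analysis reported here was run in JASP, the item column names i1–i36 and the file name are hypothetical, and semopy’s default estimator stands in for the maximum likelihood setup described above.

```python
# Illustrative sketch only: the authors fit this model in JASP. Assumes the
# semopy package and a DataFrame with hypothetical item columns i1 ... i36.
import pandas as pd
from semopy import Model, calc_stats

somatic = [1, 2, 4, 7, 10, 13, 14, 18, 20, 23, 29, 30, 31, 33, 34, 35]
behavioral = [3, 6, 9, 12, 17, 22, 25, 26, 28]
cognitive = [5, 8, 11, 15, 16, 19, 21, 24, 27, 32, 36]

# lavaan-style measurement model: each latent factor is defined by its items;
# the three latent factors are allowed to correlate.
desc = "\n".join([
    "somatic =~ " + " + ".join(f"i{k}" for k in somatic),
    "behavioral =~ " + " + ".join(f"i{k}" for k in behavioral),
    "cognitive =~ " + " + ".join(f"i{k}" for k in cognitive),
])

data = pd.read_csv("lwasq_pretreatment.csv")  # hypothetical file name
model = Model(desc)
model.fit(data)
print(calc_stats(model).T)  # chi-square, CFI, RMSEA, and other fit indices
```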

First, we checked the assumption of normality. All QQ plots showed a normal distribution for all items. The factor model was significant (χ² = 6,286.95; p < .001). The fit of the factor model was good with respect to the RMSEA (.067) and SRMR (.060), but the CFI was too low (.84). We checked the modification indices, as suggested by Heene et al. (2011) and Greiff and Heene (2017). These indicated residual correlations between several items. We examined these correlations in terms of content and, when a correlation was plausible at a clinical level, added it to the model. For example, Item 8 “I cannot get some thought out of my mind” and Item 21 “I cannot get some pictures or images out of my head” showed a residual correlation; examining these items on content, we see considerable overlap. Similarly, Item 1 “My throat gets dry” and Item 2 “I have difficulty in swallowing” showed a residual correlation; if someone has a dry throat, swallowing will plausibly also be difficult. Based on content, we added these correlations. In the end, we added nine residual correlations, namely between Items 8 and 21, 1 and 2, 24 and 27, 23 and 33, 29 and 33, 18 and 20, 4 and 35, 26 and 28, and 11 and 27. After adding these correlations, the CFI was .90, the RMSEA (.054) and SRMR (.052) still showed an acceptable fit, and the model remained significant (χ² = 4,199.725; p < .001). The three factors correlated moderately with each other; the correlations are presented in Table 2. As shown in Table 3, the nonstandardized factor loadings of all items were high on the expected factors, although the modification indices showed that Item 36 (“I have an uneasy feeling”) fit on both the factor cognitive problems and the factor behavioral complaints. Following the previous articles, we chose to retain this item on the cognitive factor in the subsequent studies. These results support the hypothesized structure, but the model with the added residual correlations must be replicated in another sample.
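In the same illustrative lavaan-style notation used above, the nine residual correlations correspond to residual covariance (“~~”) terms appended to the measurement model; the item labels i1–i36 remain hypothetical.

```python
# The nine residual correlations retained after checking content, written as
# lavaan-style residual covariance terms (hypothetical item labels i1 ... i36).
residual_pairs = [(8, 21), (1, 2), (24, 27), (23, 33), (29, 33),
                  (18, 20), (4, 35), (26, 28), (11, 27)]
extra_terms = "\n".join(f"i{a} ~~ i{b}" for a, b in residual_pairs)
print(extra_terms)  # these lines would be appended to the model description
```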

Table 2 Correlation coefficients between the three factors of the Dutch version of the Lehrer–Woolfolk Anxiety Symptom Questionnaire
Table 3 Nonstandardized factor loadings of the Dutch version of the Lehrer–Woolfolk Anxiety Symptom Questionnaire

To examine the internal consistency, McDonald’s ω was calculated for the three factors using JASP (JASP Team, 2023). All three subscales had good reliability: ω = .93 for the somatic complaints subscale, ω = .90 for the behavioral complaints subscale, and ω = .85 for the cognitive problems subscale.
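As background, McDonald’s ω for a single congeneric subscale can be computed from its factor loadings and residual variances as ω = (Σλ)² / [(Σλ)² + Σθ]. The sketch below illustrates this formula with placeholder loadings, not the values reported in Table 3.

```python
# Minimal sketch of McDonald's omega for one subscale:
# omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances).
def mcdonalds_omega(loadings, residual_variances):
    common = sum(loadings) ** 2
    return common / (common + sum(residual_variances))

# Placeholder standardized loadings (not the LWASQ estimates from Table 3).
loadings = [0.72, 0.65, 0.81, 0.58]
residuals = [1 - l ** 2 for l in loadings]  # residual variance of standardized items
print(round(mcdonalds_omega(loadings, residuals), 2))
```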

Study 2: Test–Retest Reliability in a Normal Population

Study Design

To examine the test–retest reliability, 56 participants from the general population, who were not in treatment for psychological problems, were asked to fill in the LWASQ twice, with a time interval of three weeks between the two assessments. This interval was chosen to rule out both memory effects and the risk of changes in life circumstances that could affect responses to the questions. Participants were recruited through snowball sampling: the authors asked relatives to complete the measurements themselves and to ask their own friends and relatives to complete them as well. We aimed for 52 participants based on Shoukri et al. (2004), who advise this sample size for a reliable test–retest estimate with two assessments and a target correlation higher than r = .80.

Subjects

In total, 49 people completed both measurements; seven people did not complete the second measurement. Table 4 shows the descriptive and demographic information. The mean age of the sample was 37.1 years (SD = 16.9). Two thirds (67.3%) of the participants were women. Almost all participants (90.1%) had at least a high school education, and 81.7% were fully employed. At the time of administration, 40% were married, 16.7% were living together, 31.7% were single, and 6.7% were divorced.

Table 4 Group characteristics of the normal population sample in Study 2

Analysis and Results

Test–retest reliability was assessed with Pearson’s correlation coefficient between the two administrations, for the LWASQ total score and for each subscale. Table 5 presents the means, standard deviations, McDonald’s ω, and test–retest correlations for all scales. The internal consistency (McDonald’s ω) is good for all scales at both measurement points. Pearson’s correlation coefficient for the total score was .88; the coefficients for the behavioral complaints, somatic complaints, and cognitive problems subscales were .82, .90, and .75, respectively. Despite these correlations, there is a difference in mean scores between the two measurements. However, the changes per subscale remain within the norm groups that Scholing and Emmelkamp (1992) reported and do not represent clinically significant changes. We interpret these changes as normal fluctuations in a person’s anxiety level. Overall, these results suggest good test–retest reliability.

Table 5 Mean scores and Pearson’s correlation coefficient of test–retest reliability of the Lehrer–Woolfolk Anxiety Symptom Questionnaire
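As an illustration of the computation, the sketch below estimates the test–retest correlations with SciPy; the file name and the column layout (one row per participant, paired columns for the two administrations) are assumptions for illustration only.

```python
# Sketch of the test-retest analysis: Pearson's r between two administrations.
# Hypothetical layout: columns such as total_t1/total_t2, somatic_t1/somatic_t2, ...
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("lwasq_test_retest.csv")  # hypothetical file, one row per participant

for scale in ["total", "somatic", "behavioral", "cognitive"]:
    r, p = pearsonr(df[f"{scale}_t1"], df[f"{scale}_t2"])
    print(f"{scale}: r = {r:.2f} (p = {p:.3f})")
```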

Discussion

In this paper, two studies of the psychometric properties of the LWASQ were described, one in a mixed anxiety population and one in a healthy sample.

First, a CFA was conducted. The results confirmed that the three-factor solution (somatic complaints, behavioral complaints, and cognitive problems), as found by Lehrer and Woolfolk (1982) and Scholing and Emmelkamp (1992), was an acceptable fit. However, one item (Item 36, “I have an uneasy feeling”) fit on more than one factor. Whereas it loaded on all three factors in the Scholing and Emmelkamp (1992) study, this item fit on both the cognitive and the behavioral factor in our findings, indicating that it might represent a general underlying anxiety construct rather than one specific factor. As such, excluding this item might be considered. Because a few modifications were made to the CFA model, the factor structure should be studied in a new sample to confirm these findings. Furthermore, in the first study, internal reliability was determined with McDonald’s ω, and in the second study, test–retest reliability was measured; both were found to be good for all three subscales. The correlations in Study 2 were based on a sample of N = 49. While this is enough for a relatively stable point estimate, it does not provide sufficient power to detect small to medium effects. Therefore, these results need to be replicated in a future study.

Limitations and Recommendations

There are some limitations to these studies. One limitation is the small sample size used in the study of test–retest reliability. Also, inclusion of both a clinical sample and a general population sample in both studies would have enabled the definition of norm scores and clinical ranges for the questionnaire. Finally, the absence of results from the third study, on construct validity, is an important limitation: evidence of construct validity is needed to make statements about the validity and usability of the questionnaire. Future research should therefore focus on measuring construct validity, on defining norm scores for a general anxiety population, and on the question whether the LWASQ is able to differentiate between different anxiety diagnoses.

Conclusions

From our studies, only provisional conclusions can be drawn about the psychometric properties of the Dutch translation of the LWASQ. The internal reliability and test–retest reliability seem promising, and the results of the factor structure seem acceptable. However, these results should be replicated in future studies to draw more definitive conclusions. Finally, more studies into the construct validity are needed before use of the LWASQ in clinical practice can be recommended.

References

  • Arrindell, W. A., & Emmelkamp, P. M. (1984). Phobic dimensions: I. Reliability and generalizability across samples, gender and nations: The fear survey schedule (FSS-III) and the fear questionnaire (FQ). Advances in Behaviour Research and Therapy, 6(4), 207–253. 10.1016/0146-6402(84)90001-8

  • Arrindell, W., & Ettema, J. (2005). Handleiding bij een multidimensionele psychopathologie-indicator. Symptom Checklist SCL-90 [Manual of a multidimensional psychopathology indicator. Symptom Checklist SCL-90]. Ipskamp Drukkers.

  • Browne, M. W., & Cudeck, R. (1992). Alternative ways of assessing model fit. Sociological Methods & Research, 21(2), 230–258. 10.1177/0049124192021002005

  • Eysenck, H. J., & Eysenck, S. B. (1963). Eysenck Personality Inventory [Database record]. APA PsycTests. 10.1037/t02711-000

  • Greiff, S., & Heene, M. (2017). Why psychological assessment needs to start worrying about model fit. European Journal of Psychological Assessment, 33(5), 313–317. 10.1027/1015-5759/a000450

  • Hathaway, S. R., & McKinley, J. C. (1951). Minnesota Multiphasic Personality Inventory: Manual, revised. Psychological Corporation.

  • Heene, M., Hilbert, S., Draxler, C., Ziegler, M., & Bühner, M. (2011). Masking misfit in confirmatory factor analysis by increasing unique variances: A cautionary note on the usefulness of cutoff values of fit indices. Psychological Methods, 16(3), 319–336. 10.1037/a0024917

  • Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. 10.1080/10705519909540118

  • JASP Team. (2023). JASP (Version 0.17.2) [Computer software].

  • Jöreskog, K. G., & Sörbom, D. (1993). LISREL 8: Structural equation modeling with the SIMPLIS command language. Lawrence Erlbaum Associates.

  • Krug, S. E., Scheier, I. H., & Cattell, R. B. (1976). Handbook for the IPAT Anxiety Scale. Institute for Personality and Ability Testing.

  • Lang, P. J. (1971). The application of psychophysiological methods to the study of psychotherapy and behavior modification. In A. E. Bergin & S. L. Garfield (Eds.), Handbook of psychotherapy and behavior change (pp. 75–125). Wiley.

  • Lehrer, P. M., & Woolfolk, R. L. (1982). Self-report assessment of anxiety: Somatic, cognitive, and behavioral modalities. Behavioral Assessment, 4(2), 167–177.

  • Little, T. D. (2013). Longitudinal structural equation modeling. Guilford Press.

  • Rose, M., & Devine, J. (2022). Assessment of patient-reported symptoms of anxiety. Dialogues in Clinical Neuroscience, 16(2), 197–211. 10.31887/DCNS.2014.16.2/mrose

  • Scholing, A., & Emmelkamp, P. M. (1992). Self report assessment of anxiety: A cross validation of the Lehrer Woolfolk Anxiety Symptom Questionnaire in three populations. Behaviour Research and Therapy, 30(5), 521–531. 10.1016/0005-7967(92)90036-g

  • Shoukri, M. M., Asyali, M., & Donner, A. (2004). Sample size requirements for the design of reliability study: Review and new results. Statistical Methods in Medical Research, 13(4), 251–271. 10.1191/0962280204sm365ra

  • Spielberger, C. D., Gorsuch, R. L., & Lushene, R. E. (1970). The State-Trait Anxiety Inventory (Test manual). Consulting Psychologists Press.

  • Van Kamp, I., & Klip, E. (1981). Cognitieve aspecten van subassertief gedrag [Cognitive aspects of subassertive behavior]. Gedragstherapeutisch Bulletin, 14, 45–56.