Detection of Sex Differential Item Functioning in the Cornell Critical Thinking Test
Abstract
Critical thinking (CT) can be described as the conscious process a person undertakes to explore a situation or problem from different perspectives. Accurate measurement of CT skills, especially across subgroups, depends in part on the measurement properties of an instrument being invariant, or at least similar, across those groups. Assessing item-level invariance is therefore a critical component of building a validity argument that scores on the Cornell Critical Thinking Test (CCTT) have similar meanings across groups. We used logistic regression to examine differential item functioning (DIF) by sex in the CCTT Form X. Results suggest that the items function similarly for boys and girls, with only 4 items (5.6%) displaying DIF. This implies that any observed mean differences are not attributable to a lack of measurement invariance, and it supports the validity of inferences drawn when comparing boys and girls on CCTT scores.
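The abstract does not spell out the model, so the following is only a minimal sketch of one common formulation of logistic regression DIF screening, not the authors' exact procedure. It compares a baseline model (item response predicted from a matching score) with a full model that adds group membership and a score-by-group interaction, and tests the difference with a 2-df likelihood-ratio statistic. The simulated data, the 0/1 group coding, the standardized matching score, and all function names (`solve`, `fit_loglik`, `lr_dif_test`, `simulate`) are assumptions introduced for illustration.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_loglik(X, y, iters=50, tol=1e-8):
    """Fit a logistic regression by Newton-Raphson; return the maximized log-likelihood."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                grad[i] += (yi - p) * xi[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * xi[i] * xi[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        ll += yi * eta - math.log1p(math.exp(eta))
    return ll

def lr_dif_test(score, group, item):
    """Joint 2-df likelihood-ratio test for uniform and nonuniform DIF on one item."""
    base = [[1.0, s] for s in score]                             # matching score only
    full = [[1.0, s, g, s * g] for s, g in zip(score, group)]    # + group + interaction
    g2 = max(2.0 * (fit_loglik(full, item) - fit_loglik(base, item)), 0.0)
    p_value = math.exp(-g2 / 2.0)  # chi-square survival function for df = 2
    return g2, p_value

# Simulated data (hypothetical): 1000 examinees, two groups, standardized matching score.
rng = random.Random(0)
n = 1000
group = [i % 2 for i in range(n)]                 # assumed coding: 0 = reference, 1 = focal
score = [rng.gauss(0.0, 1.0) for _ in range(n)]

def simulate(shift):
    """Generate 0/1 responses; `shift` lowers the focal group's log-odds (uniform DIF)."""
    out = []
    for s, g in zip(score, group):
        p = 1.0 / (1.0 + math.exp(-(0.2 + s - shift * g)))
        out.append(1 if rng.random() < p else 0)
    return out

g2_ok, p_ok = lr_dif_test(score, group, simulate(0.0))    # item simulated without DIF
g2_dif, p_dif = lr_dif_test(score, group, simulate(1.0))  # item simulated with uniform DIF
print("no-DIF item:  G2 = %.2f, p = %.4f" % (g2_ok, p_ok))
print("DIF item:     G2 = %.2f, p = %.4g" % (g2_dif, p_dif))
```

In applied work the significance test is usually paired with an effect-size criterion and the matching score is often purified by removing flagged items; both are omitted here to keep the sketch short.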