Detection of Differential Item Functioning
Using Decision Rules Based on the Mantel-Haenszel Procedure and Breslow-Day Tests
Abstract
This study analyzes Differential Item Functioning (DIF) using three combined decision rules and compares the results with those of the variation of the Mantel-Haenszel procedure (vaMH) proposed by Mazor, Clauser, and Hambleton (1994). The first rule combines the Mantel-Haenszel procedure (MH) with the Breslow-Day test of trend in odds ratio heterogeneity (BDT) after applying the Bonferroni adjustment, as proposed by Randall Penfield. The second uses both MH and BDT without the Bonferroni adjustment. The third combines MH with the Breslow-Day test for homogeneity of the odds ratio, also without the Bonferroni adjustment. All three decision rules yielded satisfactory results: they showed similar power, and none detected DIF erroneously. The second rule proved the most powerful in the presence of nonuniform DIF. vaMH showed evidence of superiority only in the presence of uniform DIF with the smallest difference in difficulty parameters.
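The way a combined decision rule joins the two tests can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' program: the function names `mantel_haenszel` and `combined_rule` are hypothetical, the nominal level of 0.05 is assumed (the abstract does not state it), the Bonferroni adjustment is taken to mean testing each of the two statistics at alpha/2, and the Breslow-Day p-value (trend or homogeneity version) is assumed to be computed elsewhere.

```python
import math


def mantel_haenszel(tables):
    """Mantel-Haenszel common odds ratio and chi-square test.

    tables: one (a, b, c, d) tuple per score level, where
      a, b = reference group correct / incorrect
      c, d = focal group correct / incorrect
    Returns (odds_ratio, chi_square, p_value). The chi-square uses the
    usual 0.5 continuity correction and has 1 degree of freedom.
    """
    num = den = 0.0
    sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        num += a * d / n
        den += b * c / n
        n_ref, n_foc = a + b, c + d      # group sizes at this score level
        m1, m0 = a + c, b + d            # total correct / incorrect
        sum_a += a
        sum_e += n_ref * m1 / n          # expected a under no DIF
        sum_v += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))
    odds_ratio = num / den if den > 0 else math.inf
    chi_square = max(0.0, abs(sum_a - sum_e) - 0.5) ** 2 / sum_v
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return odds_ratio, chi_square, p_value


def combined_rule(p_mh, p_bd, alpha=0.05, bonferroni=True):
    """Flag DIF when either test is significant.

    p_bd is the p-value of whichever Breslow-Day test is being combined
    with MH. With bonferroni=True each test is run at alpha / 2
    (assumed reading of the adjustment); otherwise each runs at alpha.
    """
    level = alpha / 2 if bonferroni else alpha
    return p_mh < level or p_bd < level


# Hypothetical counts for two score levels, invented for illustration.
example = [(30, 10, 10, 30), (35, 5, 15, 25)]
odds_ratio, chi_square, p_mh = mantel_haenszel(example)
flag_adjusted = combined_rule(p_mh, p_bd=0.40, bonferroni=True)
flag_unadjusted = combined_rule(p_mh, p_bd=0.40, bonferroni=False)
```

Under these assumptions, the first and second rules differ only in the `bonferroni` argument, while the third would pass the p-value of the homogeneity test instead of the trend test with `bonferroni=False`. Dropping the adjustment raises the per-test significance level, which is consistent with the second rule showing the most power for nonuniform DIF.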
References
(1999). Standards for educational and psychological testing. Washington, DC: American Psychological Association.
(2009). Erroneous detection of nonuniform DIF using the Breslow-Day test in a short test. Quality and Quantity. International Journal of Methodology, 43, 35–44.
(2004). Un estudio acerca del funcionamiento diferencial no uniforme del ítem [A study about nonuniform differential item functioning]. Metodología de las Ciencias del Comportamiento, Volumen Especial, 7–10.
(1980). Statistical methods in cancer research. Volume I: The analysis of case-control studies. Lyon, France: International Agency for Research on Cancer (IARC Scientific Publication No. 32).
(1994). Methods for identifying biased test items. Thousand Oaks, CA: Sage.
(1998). Using statistical procedures to identify differential functioning test items. Educational Measurement: Issues and Practice, 17, 31–44.
(1994). The effects of score group width on the Mantel-Haenszel procedure. Journal of Educational Measurement, 31, 67–78.
(2000). Detección del funcionamiento diferencial de los items no uniforme: Comparación de los métodos Mantel-Haenszel y regresión logística [Detection of nonuniform DIF: Mantel-Haenszel and logistic regression methods]. Psicothema, 12, 220–225.
(1998). Uniform DIF and DIF defined by differences in item response functions. Journal of Educational and Behavioral Statistics, 23, 244–253.
(2004). Differential item functioning detection and effect size: A comparison between logistic regression and Mantel-Haenszel procedures. Educational and Psychological Measurement, 64, 903–915.
(1988). Differential item performance and the Mantel-Haenszel procedure. In Test validity (pp. 129–145). Hillsdale, NJ: Erlbaum.
(1989). Applied logistic regression. New York, NY: Wiley.
(1988). An exploratory study of the applicability of item response theory methods to the Graduate Management Admissions Test (GMAC Occasional Papers). Princeton, NJ: Graduate Management Admissions Council.
(1993). A new procedure for detection of crossing DIF/bias. Paper presented at the annual meeting of the American Educational Research Association, Atlanta.
(1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719–748.
(1994). Identification of nonuniform differential item functioning using variation of the Mantel-Haenszel procedure. Educational and Psychological Measurement, 54, 284–291.
(1982). Contingency table models for assessing item bias. Journal of Educational Statistics, 7, 105–108.
(1989). Item bias and item response theory. International Journal of Educational Research, 13, 127–143.
(1996). Identification of items that show nonuniform DIF. Applied Psychological Measurement, 20, 252–274.
(2003). Applying the Breslow-Day test of trend in odds ratio heterogeneity to the analysis of nonuniform DIF. The Alberta Journal of Educational Research, 49, 231–243.
(2005). Bday: Computational program for the detection of DIF by the Breslow-Day tests, the Mantel-Haenszel procedures, and combined decision rules. Unpublished manuscript.
(1988). The area between two item characteristic curves. Psychometrika, 53, 284–291.
(1993). A comparison of logistic regression and Mantel-Haenszel procedures for detecting differential item functioning. Applied Psychological Measurement, 17, 105–116.
(1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370.
(1998). EZDIF: Detection of uniform and nonuniform differential item functioning with Mantel-Haenszel and Logistic Regression Procedures. Applied Psychological Measurement, 22, 391.
(1997). PARDSIM: Parameter and Response Data Simulation [Software]. St. Paul, MN: Assessment Systems Corporation.
(2007). Three generations of DIF analysis: Considering where it has been, where it is now, and where it is going. Language Assessment Quarterly, 4, 223–233.