Statistics for detecting DIF among multiple groups: a simulation study

When psychological tests are used to compare scores across groups, it is crucial to test for Differential Item Functioning (DIF) to ensure that those scores are comparable. DIF studies have typically focused on two groups. However, in many settings (e.g., cross-cultural research) more than two groups (e.g., cultures) must be compared. We carried out a simulation study that illustrates how generalized Mantel-Haenszel statistics (Fidalgo & Madeira, 2008) and Confirmatory Factor Analysis with latent Mean and Covariance Structure (CFA-MACS) (Sörbom, 1974) can be used to evaluate DIF among multiple groups. Both approaches permit simultaneous evaluation of DIF in several groups through a single significance test. We manipulated the number of groups, type of DIF, sample size, and number of response categories. Some of the largest differences between the two approaches were found when the number of response categories was small. The implications and main advantages of each procedure are discussed.
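
To make the idea of a single significance test across several groups concrete, the following is a minimal sketch of a generalized Mantel-Haenszel statistic in its general-association form. It is an illustration under stated assumptions, not the procedure used in the study: the item is taken to be dichotomous (correct/incorrect), examinees are assumed to be stratified by a matching variable such as total score, and the function names, data layout, and example counts are all hypothetical. Under the null hypothesis of no DIF, the statistic is referred to a chi-square distribution with G - 1 degrees of freedom, where G is the number of groups.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def generalized_mh(strata):
    """Generalized Mantel-Haenszel statistic for one dichotomous item.

    strata: list of tables, one per level of the matching variable.
            Each table is a list of (correct, incorrect) counts, one
            pair per group (hypothetical layout chosen for this sketch).
    Returns (statistic, degrees_of_freedom).
    """
    n_groups = len(strata[0])
    q = n_groups - 1                      # last group is redundant
    diff = [0.0] * q                      # sum of observed - expected
    cov = [[0.0] * q for _ in range(q)]   # sum of covariance matrices
    for table in strata:
        sizes = [c + i for c, i in table]
        total = sum(sizes)
        m1 = sum(c for c, _ in table)     # total correct in stratum
        m0 = total - m1                   # total incorrect in stratum
        if total < 2 or m1 == 0 or m0 == 0:
            continue                      # stratum carries no information
        scale = m1 * m0 / (total ** 2 * (total - 1))
        for g in range(q):
            diff[g] += table[g][0] - sizes[g] * m1 / total
            for h in range(q):
                delta = total if g == h else 0
                cov[g][h] += sizes[g] * (delta - sizes[h]) * scale
    weights = solve_linear(cov, diff)
    stat = sum(d * w for d, w in zip(diff, weights))
    return stat, q


# Hypothetical counts: three groups, two score strata.
example = [
    [(12, 8), (15, 5), (9, 11)],
    [(18, 2), (17, 3), (14, 6)],
]
stat, df = generalized_mh(example)
print(f"generalized MH = {stat:.3f} on {df} df")
```

With two groups this reduces to the familiar Mantel-Haenszel chi-square without continuity correction. Items with more than two response categories, which the study also considers, would need the corresponding extension of the observed, expected, and covariance terms.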