Minimum sample sizes of about 200 to 250 per group are often recommended for differential item functioning (DIF) analyses. However, practical constraints sometimes leave one or both groups of interest with fewer than 200 examinees. This study examines the DIF detection accuracy of the Simultaneous Item Bias Test (SIBTEST), Cochran's Z test, and each of these methods combined with log-linear smoothing across a number of small-sample and ability-distribution combinations. Effects of item parameters and DIF magnitudes are also investigated. Results show that when ability distributions are identical between groups, Type I error for these DIF methods can be adequately controlled at all sample sizes, and their power to detect a large amount of unidirectional DIF can be tolerably high (power > .6) when the sample size is not too small (at least 100 per group). When ability distributions differ, Type I error inflation is greater for easier items and larger sample sizes, and power depends on the direction of DIF. Log-linear smoothing with SIBTEST tends to lower both the Type I error rate and power. The effect of smoothing with Cochran's Z test is less consistent. Implications of the findings are discussed.
All Science Journal Classification (ASJC) codes
- Social Sciences (miscellaneous)
- Psychology (miscellaneous)