Small-Sample DIF Estimation Using SIBTEST, Cochran's Z, and Log-Linear Smoothing

Pui-wa Lei, Hongli Li

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Minimum sample sizes of about 200 to 250 per group are often recommended for differential item functioning (DIF) analyses. However, there are times when sample sizes for one or both groups of interest are smaller than 200 due to practical constraints. This study attempts to examine the performance of Simultaneous Item Bias Test (SIBTEST), Cochran's Z test, and log-linear smoothing with these methods in DIF detection accuracy at a number of small-sample and ability distribution combinations. Effects of item parameters and DIF magnitudes are also investigated. Results show that when ability distributions between groups are identical, Type I error for these DIF methods can be adequately controlled at all sample sizes, and their power to detect a large amount of unidirectional DIF can be tolerably high (power >.6) when sample size is not too small (at least 100 per group). When ability distributions are different, Type I inflation is higher for easier items and larger sample sizes, and power depends on DIF direction. Log-linear smoothing with SIBTEST tends to lower both Type I error rate and power. The effect of smoothing with Cochran's Z test is not as consistent. Implications of the findings are discussed.

Original language: English (US)
Pages (from-to): 397-416
Number of pages: 20
Journal: Applied Psychological Measurement
Volume: 37
Issue number: 5
DOI: 10.1177/0146621613478150
State: Published - Jul 1 2013

All Science Journal Classification (ASJC) codes

  • Social Sciences (miscellaneous)
  • Psychology (miscellaneous)

Cite this

@article{1f321278558348d3819abdb57c84a718,
title = "Small-Sample DIF Estimation Using SIBTEST, Cochran's Z, and Log-Linear Smoothing",
abstract = "Minimum sample sizes of about 200 to 250 per group are often recommended for differential item functioning (DIF) analyses. However, there are times when sample sizes for one or both groups of interest are smaller than 200 due to practical constraints. This study attempts to examine the performance of Simultaneous Item Bias Test (SIBTEST), Cochran's Z test, and log-linear smoothing with these methods in DIF detection accuracy at a number of small-sample and ability distribution combinations. Effects of item parameters and DIF magnitudes are also investigated. Results show that when ability distributions between groups are identical, Type I error for these DIF methods can be adequately controlled at all sample sizes, and their power to detect a large amount of unidirectional DIF can be tolerably high (power >.6) when sample size is not too small (at least 100 per group). When ability distributions are different, Type I inflation is higher for easier items and larger sample sizes, and power depends on DIF direction. Log-linear smoothing with SIBTEST tends to lower both Type I error rate and power. The effect of smoothing with Cochran's Z test is not as consistent. Implications of the findings are discussed.",
author = "Pui-wa Lei and Hongli Li",
year = "2013",
month = "7",
day = "1",
doi = "10.1177/0146621613478150",
language = "English (US)",
volume = "37",
pages = "397--416",
journal = "Applied Psychological Measurement",
issn = "0146-6216",
publisher = "SAGE Publications Inc.",
number = "5",
}

Small-Sample DIF Estimation Using SIBTEST, Cochran's Z, and Log-Linear Smoothing. / Lei, Pui-wa; Li, Hongli.

In: Applied Psychological Measurement, Vol. 37, No. 5, 01.07.2013, p. 397-416.

TY - JOUR

T1 - Small-Sample DIF Estimation Using SIBTEST, Cochran's Z, and Log-Linear Smoothing

AU - Lei, Pui-wa

AU - Li, Hongli

PY - 2013/7/1

Y1 - 2013/7/1

N2 - Minimum sample sizes of about 200 to 250 per group are often recommended for differential item functioning (DIF) analyses. However, there are times when sample sizes for one or both groups of interest are smaller than 200 due to practical constraints. This study attempts to examine the performance of Simultaneous Item Bias Test (SIBTEST), Cochran's Z test, and log-linear smoothing with these methods in DIF detection accuracy at a number of small-sample and ability distribution combinations. Effects of item parameters and DIF magnitudes are also investigated. Results show that when ability distributions between groups are identical, Type I error for these DIF methods can be adequately controlled at all sample sizes, and their power to detect a large amount of unidirectional DIF can be tolerably high (power >.6) when sample size is not too small (at least 100 per group). When ability distributions are different, Type I inflation is higher for easier items and larger sample sizes, and power depends on DIF direction. Log-linear smoothing with SIBTEST tends to lower both Type I error rate and power. The effect of smoothing with Cochran's Z test is not as consistent. Implications of the findings are discussed.

UR - http://www.scopus.com/inward/record.url?scp=84878449206&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84878449206&partnerID=8YFLogxK

U2 - 10.1177/0146621613478150

DO - 10.1177/0146621613478150

M3 - Article

VL - 37

SP - 397

EP - 416

JO - Applied Psychological Measurement

JF - Applied Psychological Measurement

SN - 0146-6216

IS - 5

ER -