Jordan Journal of Applied Science - Humanities Series
Applied Science Private University
2024, Vol 38(1)
e-ISSN: 2708-9126
https://doi.org/10.35192/jjoas-h.v38i1.651
Research Article
Impact of the Type and Percentage of Differential Item Functioning on the
Ability Parameter of Individuals According to Parametric and Nonparametric
Models of IRT
Issra Al-Khatib 1*, Hassan Al-Omari 1.
1The University of Jordan, Amman, Jordan.
ARTICLE INFO
*Corresponding author:
The University of Jordan, Amman, Jordan.
Email: asra32915@gmail.com.
Article history:
Received 10 Feb 2022
Accepted 18 Apr 2022
Published 01 Jan 2024
Abstract
This study aimed to investigate the impact of the type and percentage of Differential Item Functioning (10%, 20%, and 30%) on estimating individuals' abilities according to the three-parameter logistic model and Mokken's non-parametric model of item response theory. The experimental method was used to address the study's questions by applying hypothetical tests, each consisting of 60 dichotomous items generated with the WinGen software, to a sample of 2,000 hypothetical individuals for each experimental condition. The results showed that differences in the ability parameter between the three-parameter logistic model and Mokken's non-parametric model were not statistically significant across all experimental conditions related to the type and percentage of Differential Item Functioning. There were also no differences in the ability parameter attributable to the type of model under any of the experimental conditions. The study recommended using Mokken's non-parametric model when the highest degree of test reliability is sought.
Keywords: Differential Item Functioning, Abilities of Individuals, Three-Parameter Logistic Model, Mokken Model.
Introduction
Tests play a central role in educational and selection decisions (Schmidt & Hunter, 1998), a role that goes back to the early ability scales of Binet. An item shows Differential Item Functioning (DIF) when examinees of equal ability but from different groups do not have the same probability of answering it correctly (Ryan & Chiu, 2001; Stark et al., 2004; Gruijter & Kamp, 2005; Camilli & Shepard, 1994; Reynolds & Lowe, 2009; Warne et al., 2014). Item response theory (IRT) offers both parametric and nonparametric frameworks for modeling item responses and for detecting DIF (Raju & Ellis, 2002; Park, 2010; Hambleton & Rogers, 1986; Swaminathan & Rogers, 1990; Pae, 2004; Raquel, 2019; Karami & Salmani Nodoushan, 2011). DIF is commonly classified as uniform (UDIF) or non-uniform (NUDIF).
Differences between groups in the item response function are the basis for defining DIF (Hanson, 1998).
Nonparametric Item Response Theory Models
Item response theory comprises parametric models, which impose a specific functional form on the item characteristic curve, and nonparametric models, which relax that requirement (Van der Linden & Hambleton, 1997). Mokken's nonparametric models rest on the following assumptions (Sijtsma & Molenaar, 2002):
− Unidimensionality (one-dimensionality): a single latent trait (θ) underlies the responses to all items.
− Local independence: given θ, responses to different items are statistically independent.
− Monotonicity: the probability of a correct response is a non-decreasing function of θ.
− Double monotonicity: in addition, the item response functions do not intersect, so the ordering of the items by difficulty is the same at every level of θ.
Two models follow from these assumptions:
− The Monotone Homogeneous Model assumes unidimensionality, local independence, and monotonicity; under it, persons can be ordered on θ by their total scores (Sijtsma, 1998; Sijtsma & Molenaar, 2002).
− The Double Monotonicity Model adds the non-intersection assumption, so that the items are invariantly ordered across persons (Sijtsma, 1998; Sijtsma & Molenaar, 2002; Sijtsma & Verweij, 1992; Fitzpatrick & Wendy, 2001). A simple empirical check of the monotonicity assumption is sketched below.
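In this restscore check, persons are grouped by their score on the remaining items, and the proportion answering the target item correctly should not decrease from one rest-score group to the next. The Python sketch below implements that idea; the variable names and the randomly generated demonstration data are illustrative and are not the study's data (in practice dedicated software such as TestGraf or the R mokken package performs these checks).

import numpy as np

def monotonicity_violations(X, n_groups=5):
    """Count decreases in proportion-correct across rest-score groups for each item.
    X is a persons-by-items matrix of 0/1 responses (hypothetical input)."""
    X = np.asarray(X)
    n_items = X.shape[1]
    counts = {}
    for i in range(n_items):
        rest = X.sum(axis=1) - X[:, i]                       # rest score excludes item i
        edges = np.quantile(rest, np.linspace(0, 1, n_groups + 1))
        group = np.clip(np.searchsorted(edges, rest, side="right") - 1, 0, n_groups - 1)
        p = [X[group == g, i].mean() for g in range(n_groups) if np.any(group == g)]
        counts[i] = sum(p[k + 1] < p[k] for k in range(len(p) - 1))
    return counts

# Illustrative run on model-consistent random data (a real check would use the study's 60-item matrices).
rng = np.random.default_rng(0)
theta = rng.normal(0, 1, 2000)
prob = 1 / (1 + np.exp(-(theta[:, None] - rng.normal(0, 1, 60))))
X_demo = (rng.random((2000, 60)) < prob).astype(int)
print(monotonicity_violations(X_demo)[0])    # close to 0 when monotonicity holds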
Ability parameters in parametric IRT are estimated mainly by two approaches:
− Maximum Likelihood Estimation Methods: the ability estimate maximizes the likelihood of the observed response pattern; it can take any value on (−∞, ∞), and for all-correct or all-incorrect patterns it drifts toward the boundaries.
− Bayesian Estimation Method: a prior distribution on ability is combined with the likelihood, which avoids boundary estimates for extreme response patterns (Garre & Vermunt, 2006).
A minimal sketch of both approaches under the three-parameter logistic model follows.
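The sketch below computes a maximum likelihood and a Bayesian (expected a posteriori) ability estimate for a single response pattern under the three-parameter logistic model, P(X = 1 | θ) = c + (1 − c) / (1 + exp(−1.7a(θ − b))). The item parameter values and the response pattern are hypothetical, chosen only to show the mechanics.

import numpy as np
from scipy.optimize import minimize_scalar

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))

def theta_mle(x, a, b, c):
    """Maximum likelihood estimate; for all-0 or all-1 patterns it drifts toward the
    bounds of the search interval (conceptually toward -infinity or +infinity)."""
    def neg_loglik(theta):
        p = p3pl(theta, a, b, c)
        return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))
    return minimize_scalar(neg_loglik, bounds=(-6, 6), method="bounded").x

def theta_eap(x, a, b, c, nodes=np.linspace(-4, 4, 81)):
    """Bayesian expected a posteriori estimate with a standard normal prior;
    it remains finite even for extreme response patterns."""
    prior = np.exp(-0.5 * nodes ** 2)
    like = np.array([np.prod(np.where(x == 1, p3pl(t, a, b, c), 1 - p3pl(t, a, b, c))) for t in nodes])
    post = prior * like
    return np.sum(nodes * post) / np.sum(post)

a = np.full(10, 1.2)                      # hypothetical discriminations
b = np.linspace(-2, 2, 10)                # hypothetical difficulties
c = np.full(10, 0.2)                      # hypothetical guessing parameters
x = np.array([1, 1, 1, 1, 1, 1, 0, 1, 0, 0])
print(round(theta_mle(x, a, b, c), 3), round(theta_eap(x, a, b, c), 3))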
Bias and Differential Item Functioning
Item bias and DIF are related but distinct concepts: DIF is a statistical finding, whereas bias additionally requires that the between-group difference be irrelevant to the construct being measured (Dorans & Holland, 1993; Williams, 1997). Most DIF detection procedures compare item characteristic curves (ICCs), or summaries of them, across groups (Gierl et al., 2000; Camilli & Shepard, 1994), including:
− the Mantel-Haenszel procedure, which compares the odds of a correct response across groups within matched score levels (Chung & Huisu, 2004);
− descriptive and inferential procedures for polytomous items (Rebecca et al., 1997);
− the likelihood-ratio test, which compares models with and without group-specific item parameters by means of the G² statistic; its standard form is given below.
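Stated in its standard form (a general formulation, not one quoted from this article), the likelihood-ratio test compares a compact model, in which an item's parameters are constrained to be equal across groups, with an augmented model in which they are free to differ:

G^2 = -2\left[\ln L_{\mathrm{compact}} - \ln L_{\mathrm{augmented}}\right] \sim \chi^2_{df}

where the degrees of freedom equal the number of item parameters freed in the augmented model; large G² values flag the studied item as showing DIF.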
− Uniform Differential Item Functioning: the difference in the probability of a correct response between the reference and focal groups is in the same direction at every level of θ, so the two item characteristic curves do not cross (Chung & Huisu, 2004; Uiterwijk & Vallen, 2005).
− Non-Uniform Differential Item Functioning: the size, or even the direction, of the between-group difference changes across levels of θ, so the item characteristic curves cross (Uiterwijk & Vallen, 2005).
The sketch below illustrates how each type arises from changes in the item parameters.
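To make the distinction concrete, the following Python sketch evaluates a three-parameter logistic item characteristic curve for a reference group and for two hypothetical focal-group versions of the same item: one with only the difficulty shifted (uniform DIF) and one with the discrimination changed (non-uniform DIF). All parameter values are illustrative.

import numpy as np

def p3pl(theta, a, b, c):
    return c + (1 - c) / (1 + np.exp(-1.7 * a * (theta - b)))

theta = np.linspace(-3, 3, 7)
reference   = p3pl(theta, a=1.0, b=0.0, c=0.2)   # reference-group curve
uniform_dif = p3pl(theta, a=1.0, b=0.5, c=0.2)   # focal group: b shifted, curves stay apart
nonunif_dif = p3pl(theta, a=0.4, b=0.0, c=0.2)   # focal group: a changed, curves cross near b

print("theta     :", np.round(theta, 1))
print("reference :", np.round(reference, 2))
print("uniform   :", np.round(uniform_dif, 2))
print("non-unif  :", np.round(nonunif_dif, 2))

With a pure shift in b, the gap between the reference and focal curves has the same sign at every θ; with a change in a, the focal curve lies above the reference curve at low θ and below it at high θ, which is the signature of non-uniform DIF.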
Previous Studies
Haughbrook (2020) explored racial bias in standardized assessments and in teacher reports of student achievement using differential item and test functioning analyses; DIF was examined both in the Woodcock-Johnson III Picture Vocabulary scale (WJPV) and in teacher ratings on the SSRS-T.
AlQuraan and AlKuwaiti (2017) examined differential item functioning in students' ratings of teaching effectiveness across academic disciplines at a Saudi university, evaluating measurement invariance with fit indices including RMSEA, NCP, AIC, SRMR, and CFI.
Other work has quantified differential functioning at the item level with the noncompensatory DIF index (NCDIF). Gabriel (2012) compared parametric and nonparametric procedures for detecting and classifying DIF types, including the IRT-likelihood ratio test, crossing-SIBTEST, and logistic regression.
Testing standards require evidence that items function equivalently across examinee groups (AERA, APA & NCME, 1999), and measurement invariance is a precondition for fair group comparisons (Croudace & Brown, 2012; Van de Vijver & Tanzer, 2004). Comparisons of DIF detection techniques have also shown that the choice of method can affect which items are flagged (Acar & Kelecioglu, 2010; Gabriel, 2012). Accordingly, the study tested its questions at a significance level of α = 0.05 and relied on four software packages: WinGen for generating the simulated data, BILOG-MG for estimating the parameters of the three-parameter logistic model, TestGraf for the nonparametric analysis, and SPSS for the statistical comparisons.
Study Terminology
− Focal Group: the group suspected of being disadvantaged by the item, for which DIF is investigated.
− Reference Group: the group against which the focal group's performance is compared.
− Simulated Data: dichotomous response data generated with the WinGen software (Han & Hambleton, 2007) according to the experimental conditions of the study.
Study Procedures
The data were generated and analyzed in the following steps:
− Item and person parameters were specified and dichotomous response data were generated with WinGen for the reference and focal groups.
− DIF of the required type (uniform or non-uniform) was introduced into 10%, 20%, or 30% of the items by altering their parameters for the focal group.
− The resulting data files were exported from WinGen for calibration and analysis.
− Ability estimates from the two models were compared statistically using SPSS.
A conceptual sketch of this data-generation design is given below.
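The following Python code illustrates the design in outline; the study itself used WinGen, and the parameter distributions and the size of the difficulty shift below are assumptions made for the sketch rather than values taken from the article. It generates dichotomous three-parameter logistic responses for a reference and a focal group of 2,000 simulated examinees on 60 items and injects uniform DIF into a chosen percentage of the items.

import numpy as np

rng = np.random.default_rng(2022)

def simulate_group(theta, a, b, c):
    """Return a persons-by-items 0/1 response matrix under the 3PL model."""
    p = c + (1 - c) / (1 + np.exp(-1.7 * a * (theta[:, None] - b)))
    return (rng.random(p.shape) < p).astype(int)

n_items, n_persons = 60, 2000
dif_pct = 0.20                                   # 20% condition; 0.10 and 0.30 give the other conditions

a = rng.lognormal(0.0, 0.2, n_items)             # hypothetical item discriminations
b = rng.normal(0.0, 1.0, n_items)                # hypothetical item difficulties
c = np.full(n_items, 0.2)                        # hypothetical guessing parameters

theta_ref = rng.normal(0, 1, n_persons)          # reference-group abilities
theta_foc = rng.normal(0, 1, n_persons)          # focal-group abilities

dif_items = rng.choice(n_items, int(dif_pct * n_items), replace=False)
b_foc = b.copy()
b_foc[dif_items] += 0.6                          # uniform DIF: flagged items become harder for the focal group

X_ref = simulate_group(theta_ref, a, b, c)
X_foc = simulate_group(theta_foc, a, b_foc, c)
print(X_ref.shape, X_foc.shape, sorted(dif_items.tolist()))

Non-uniform DIF would be introduced in the same way by additionally altering the a parameters of the flagged items for the focal group.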
The assumption of local independence was examined by screening for locally dependent (LD) item pairs using the Q3 statistic, as sketched below. Six experimental conditions were formed by crossing the type of DIF with its percentage: U6, U12, and U18 denote uniform DIF in 6, 12, and 18 of the 60 items (10%, 20%, and 30%), and N6, N12, and N18 denote non-uniform DIF in the same proportions. For each condition, the item and ability parameters of the three-parameter logistic model were estimated with BILOG-MG, the nonparametric ability estimates were obtained with TestGraf, and hypotheses were tested at α = 0.05.
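Yen's Q3 index, used only to screen for local dependence, is the correlation between the residuals of two items after the model-implied probabilities have been removed. The sketch below assumes that ability and item parameter estimates are already available from the calibration step; the demonstration data are randomly generated and purely illustrative.

import numpy as np

def q3_matrix(X, theta, a, b, c):
    """Items-by-items matrix whose off-diagonal entries are Yen's Q3 values."""
    p = c + (1 - c) / (1 + np.exp(-1.7 * a * (theta[:, None] - b)))
    resid = X - p                               # persons-by-items residuals
    return np.corrcoef(resid, rowvar=False)

# Illustrative run on locally independent simulated data: off-diagonal values stay small.
rng = np.random.default_rng(1)
theta = rng.normal(0, 1, 1000)
a, b, c = np.full(10, 1.0), rng.normal(0, 1, 10), np.full(10, 0.2)
p = c + (1 - c) / (1 + np.exp(-1.7 * a * (theta[:, None] - b)))
X = (rng.random(p.shape) < p).astype(int)
print(np.round(q3_matrix(X, theta, a, b, c)[:3, :3], 2))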
Results
To answer the first question, the ability estimates obtained under the three-parameter logistic model were compared across the experimental conditions using an F-test. The F values were not statistically significant (p > 0.05) at the α = 0.05 level, indicating that the type and percentage of DIF did not produce significant differences in the ability parameter. The same comparison was repeated for the nonparametric (TestGraf) ability estimates, and the F values were likewise non-significant.
To compare the two models directly, t-tests were computed between the ability estimates obtained from the three-parameter logistic model and those obtained from Mokken's nonparametric model under each experimental condition. None of the t values was statistically significant (p > 0.05) at the α = 0.05 level, for either the uniform or the non-uniform DIF conditions and for all three percentages of DIF items. These results indicate invariance of the ability estimates across the two models regardless of the type and percentage of differential item functioning. A minimal sketch of such a comparison follows.
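The model comparison summarized above can be reproduced in outline with a paired t-test on the two sets of ability estimates for the same simulated examinees. The sketch below uses randomly generated stand-ins for the BILOG-MG and TestGraf estimates (the study performed the corresponding tests in SPSS); only the mechanics of the comparison are shown.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
theta_3pl = rng.normal(0, 1, 2000)                    # stand-in for 3PL ability estimates
theta_np = theta_3pl + rng.normal(0, 0.1, 2000)       # stand-in for nonparametric estimates

t_stat, p_value = stats.ttest_rel(theta_3pl, theta_np)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")         # p > 0.05 means no significant difference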
References
Acar, T., & Kelecioglu, H. (2010). Comparison of differential item functioning determination techniques: HGLM, LR and IRT-LR. Educational Sciences: Theory and Practice, 10(2), 639-649.
AERA, APA, & NCME. (1999). The standards for educational and psychological testing. Washington, DC: AERA Publications Sales.
Alquraan, M., & Alkuwaiti. (2017). Differential item functioning in students' rating of teaching effectiveness surveys in higher education according to academic disciplines: Data from a Saudi university. Journal of Educational and Psychological Studies, 11(4), 770-780.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397-479). Reading, MA: Addison-Wesley.
Camilli, G., & Shepard, L. (1994). Methods for identifying biased test items. Sage Publications.
Chung, W., & Huisu, Y. (2004). Effects of average signed area between two item characteristic curves and purification procedures on the DIF detection via the Mantel-Haenszel method. Applied Measurement in Education, 17(2), 113-144.
Croudace, T., & Brown, A. (2012). Measurement invariance and differential item functioning. Short Course in Applied Psychometrics, Peterhouse College, 10-12.
Dorans, N., & Holland, P. (1993). DIF detection and description: Mantel-Haenszel and standardization. Educational Testing Service. (QAT24225) ED287526.
Fitzpatrick, A., & Wendy, M. (2001). The effects of test length and sample size on the reliability and equating of tests composed of constructed response items. Applied Measurement in Education, 14(1), 31-57.
Gabriel, E. L. (2012). Detection and classification of DIF types using parametric and nonparametric methods: A comparison of the IRT-likelihood ratio test, crossing-SIBTEST, and logistic regression procedures [Unpublished doctoral dissertation]. University of South Florida.
Garre, F., & Vermunt, J. (2006). Avoiding boundary estimates in latent class analysis by Bayesian posterior mode estimation. Behaviormetrika, 33(1), 43-59.
Gierl, M. J., Jodoin, M. G., & Ackerman, T. A. (2000). Performance of Mantel-Haenszel, Simultaneous Item Bias Test, and logistic regression when the proportion of DIF items is large [Paper presentation]. Annual Meeting of the American Educational Research Association (AERA), New Orleans, LA.
Gruijter, D., & Kamp, L. (2005). Statistical test theory for education and psychology. Retrieved December 30, 2021, from www.leidenuniv.nl/gruijterdnmde.
Hambleton, K., & Rogers, J. (1986). Technical advances in credentialing examinations. Evaluation & the Health Professions, 9(2), 205-229.
Hambleton, R., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston, MA: Kluwer-Nijhoff Publishing.
Hambleton, R. (1990). Item response theory: Introduction and bibliography. Psicothema, 2(1), 97-107.
Han, K. T., & Hambleton, R. K. (2007). User's manual: WinGen (Center for Educational Assessment Report No. 642). Amherst, MA: University of Massachusetts, School of Education.
Hanson, B. A. (1998). Uniform DIF and DIF defined by differences in item response functions. Journal of Educational and Behavioral Statistics, 23(2), 112-129.
Haughbrook, R. (2020). Exploring racial bias in standardized assessments and teacher-reports of student achievement with differential item and test functioning analyses [Doctoral dissertation]. The Florida State University.
Karami, H., & Salmani Nodoushan, A. (2011). Differential item functioning (DIF): Current problems and future directions. Online Submission, 5(3), 133-142.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.
Pae, I. (2004). Gender effect on reading comprehension with Korean EFL learners. System, 32(2), 265-281.
Park, C. (2010). Examining the relationship between differential item functioning and differential test functioning. Language Testing, 23(4), 475-496.
Raju, N., & Ellis, B. (2002). Differential item and test functioning. Jossey-Bass, 156-188.
Raquel, M. (2019). The Rasch measurement approach to differential item functioning (DIF) analysis in language assessment research. In Quantitative data analysis for language assessment (Vol. 1, pp. 103-131).
Rebecca, Z., Dorothy, T., & John, M. (1997). Descriptive and inferential procedures for assessing differential item functioning in polytomous items. Applied Measurement in Education, 10(4), 321-344.
Reynolds, R., & Lowe, A. (2009). The problem of bias in psychological testing. School Psychology Handbook, 332-374.
Ryan, E., & Chiu, S. (2001). An examination of item context effects, DIF, and gender DIF. Applied Measurement in Education, 14(1), 73-90.
Schmidt, F., & Hunter, J. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262-274.
Sijtsma, K., & Hemker, B. T. (2000). A taxonomy of IRT models for ordering of persons and items using simple sum scores. Journal of Educational and Behavioral Statistics, 25(2), 391-415.
Sijtsma, K., & Molenaar, I. (2002). Introduction to nonparametric item response theory. Sage Publications.
Sijtsma, K. (1998). Methodology review: Nonparametric IRT approaches to the analysis of dichotomous item scores. Applied Psychological Measurement, 22(3), 3-31.
Sijtsma, K., & Verweij, A. C. (1992). Mokken scale analysis: Theoretical considerations and an application to transitivity tasks. Applied Measurement in Education, 5(4), 355-373.
Stark, S., Chernyshenko, O. S., & Drasgow, F. (2004). Examining the effects of differential item functioning and differential test functioning on selection decisions: When are statistically significant effects practically important? Journal of Applied Psychology, 2(6), 117-135.
Swaminathan, H., & Rogers, J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27(4), 361-370.
Uiterwijk, H., & Vallen, T. (2005). Linguistic sources of item bias for second-generation immigrants in Dutch tests. Language Testing, 22(2), 211-234.
Van de Vijver, F., & Tanzer, K. (2004). Bias and equivalence in cross-cultural assessment. European Review of Applied Psychology, 54(2), 119-135.
Van der Linden, W. J., & Hambleton, R. K. (1997). Item response theory: Brief history, common models, and extensions. In Handbook of modern item response theory (pp. 1-28). New York, NY: Springer.
Warne, T., Yoon, M., & Price, J. (2014). Exploring the various interpretations of "test bias". Cultural Diversity and Ethnic Minority Psychology, 20(4), 570-578.
Williams, S. (1997). The "unbiased" anchor: Bridging the gap between DIF and item bias. Applied Measurement in Education, 10(3), 253-267.