Gender Differential Item Functioning (GDIF) Analysis in Iran's University Entrance Exam
DOI: https://doi.org/10.24853/elif.3.1.49-68

Keywords: Gender Differential Item Functioning analysis (GDIF), Bias, Dimensionality, Fairness, Rasch Model

Abstract
A central aspect of validity concerns what a test score actually and potentially represents, and it is closely tied to the sources of invalidity captured by the concepts of fairness, bias, injustice, and inequity. Differential Item Functioning (DIF) analysis examines test items in order to evaluate test fairness and, in turn, the validity of educational tests; if gender plays a major role in how items function, the test is biased. This study examines the validity of a high-stakes test and the role of gender as a source of bias in its language subtests through DIF analysis. The Rasch model was applied to the responses of five thousand examinees, randomly selected from the candidates who took the National University Entrance Exam for Foreign Languages (NUEEFL), the admission test for English-related university programs (English Literature, Teaching, and Translation). The results indicated that the test scores are not free of construct-irrelevant variance, and certain misfitting items were revised following fit-statistics guidelines. Overall, the fairness of the NUEEFL could not be confirmed. The findings are of use to test designers, stakeholders, administrators, and teachers who work with this kind of psychometric test, and they inform suggestions for future administration criteria, bias-free tests, and teaching materials.
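For readers unfamiliar with the procedure, the sketch below summarizes, in standard notation, the quantities on which a Rasch-based gender DIF analysis of this kind typically rests. The flagging conventions mentioned (a DIF contrast of roughly 0.5 logits or more, with |t| ≥ 2) are common Winsteps guidelines and are stated here as illustrative assumptions, not as the exact criteria applied in this study.

Under the dichotomous Rasch model, the probability that examinee n answers item i correctly is

\[
P(X_{ni} = 1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)},
\]

where \theta_n is the examinee's ability and \delta_i is the item's difficulty, both in logits. For gender DIF, \delta_i is calibrated separately on the female (F) and male (M) subsamples while person measures are held constant, and each item's DIF contrast and approximate significance test are

\[
\mathrm{DIF}_i = \delta_i^{F} - \delta_i^{M},
\qquad
t_i = \frac{\delta_i^{F} - \delta_i^{M}}{\sqrt{SE(\delta_i^{F})^{2} + SE(\delta_i^{M})^{2}}}.
\]

Items with a large contrast and a significant t value (after a multiple-comparison correction such as Benjamini and Hochberg's false discovery rate procedure) are candidates for gender bias review.

References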
Alavi, Ali, & Amirian. (2011). Academic Discipline DIF in an English Language Proficiency Test. Journal of English Language Teaching and Learning, 7(5), 39–66. http://noo.rs/KIrXf
Aryadoust, V., Goh, C. C. M., & Kim, L. O. (2011). An investigation of differential item functioning in the MELAB listening test. Language Assessment Quarterly, 8(4), 361–385.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford, UK: Oxford University Press.
Benjamini, Y., & Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1), 289–300. https://doi.org/10.2307/2346101
Boone, W. J., Yale, M. S., & Staver, J. R. (2014). Rasch Analysis in the Human Sciences. Springer Science & Business Media. https://doi.org/10.1007/978-94-007-6857-4
Boyle, J. (1987). Sex differences in listening vocabulary. Language Learning, 37(2), 273-284.
Camilli, G. (2006). Test fairness. In R. L. Brennan (Ed.), Educational Measurement (4th ed., Vol. 4, pp. 221-256). Westport, CT: American Council on Education & Praeger.
Camilli, G., & Penfield, D. A. (1997). Variance estimation for differential test functioning based on the Mantel-Haenszel log-odds ratio. Journal of Educational Measurement, 34, 123–139.
Carlton, S. T., & Harris, A. M. (1992). Characteristics associated with differential item functioning on the Scholastic Aptitude Test: Gender and majority/minority group comparisons. Princeton, NJ: Educational Testing Service.
Cohen, L. (1979). Approximate expressions for parameter estimates in the Rasch model. The British Journal of Mathematical and Statistical Psychology, 32, 113-120.
Cole, N. S. (1997). The ETS gender study: How females and males perform in educational settings. Princeton, NJ: Educational Testing Service.
Dunne, D. W. (2015). Cautions Issued About High-Stakes Tests. Education World. https://www.educationworld.com/a_issues/issues110.shtml
Furr, M. R., & Bacharach, V. R. (2007). Psychometrics: An Introduction. Thousand Oaks, CA: SAGE.
Holland, P. W., & Wainer, H. E. (2012). Differential item functioning. London, UK: Routledge.
Kane, M. T. (2013). Validating the Interpretations and Uses of Test Scores. Journal of Educational Measurement, 50(1), 1–73. https://doi.org/10.1111/jedm.12000
Karami, H. (2010). A differential item functioning analysis of a language proficiency test: an investigation of background knowledge bias. Unpublished MA Thesis, University of Tehran.
Karami, H. (2011). Detecting gender bias in a language proficiency test. International Journal of Language Studies, 5(2), 27-38.
Karami, H. (2015). A closer look at the validity of the University Entrance Exam: Dimensionality and generalizability (Unpublished Ph.D. dissertation, University of Tehran).
Kunnan, A. J. (2010). Test fairness and Toulmin's argument structure. Language Testing, 27(2), 183–189.
Ledesma, R. D., Valero-Mora, P., & Macbeth, G. (2015). The Scree Test and the Number of Factors: a Dynamic Graphics Approach. The Spanish Journal of Psychology, 18, E11. https://doi.org/10.1017/sjp.2015.13
Li, H., & Suen, H. (2013). Detecting native language group differences at the subskills level of reading: A differential skill functioning approach. Language Testing, 30, 273–298. https://doi.org/10.1177/0265532212459031
Lin, J., & Wu, F. (2003). Differential performance by gender in foreign language testing. Paper presented at the annual meeting of the national council on measurement in education (Chicago, IL.).
Linacre, J. M. (1991-2006). A user’s guide to Winsteps® Ministep Rasch-model computer programs. Retrieved January 10, 2007, from http://www.winsteps.com/aftp/winsteps.pdf
Linacre, J. M. (2006). Data variance explained by measures. Rasch Measurement Transactions, 20, 1045–1047.
Linacre, J. M. (2012). A user’s guide to Winsteps [User’s manual and software]. Retrieved from http://www.winsteps.com/winsteps.htm.
Linacre, J. M. (2016a). Winsteps® Rasch measurement computer program User's Guide. Beaverton, OR: Winsteps.com. Retrieved from http://www.winsteps.com/
Linacre, J. M. (2016b). Winsteps® (Version 3.92.1) [Computer Software]. Beaverton, OR: Winsteps.com. Retrieved from http://www.winsteps.com/
Messick, S. J. (Ed.). (2013). Assessment in higher education: Issues of access, quality, student development, and public policy. Routledge, Taylor and Francis Group.
Mirzaei, A., Hashemian, M., & Tanbakooei, N. (2012). Do Different Stakeholders’ Actions Transform or Perpetuate Deleterious High-Stakes Testing Impacts in Iran? The 1st Conference on Language Learning & Teaching: An Interdisciplinary Approach (LLT-IA). https://www.sid.ir/en/Seminar/ViewPaper.aspx?ID=24946
Amirian, S. M. R., Alavi, S. M., & Fidalgo, A. M. (2014). Detecting Gender DIF with an English Proficiency Test in EFL Context. Iranian Journal of Language Testing, 4(1), 187–203.
Pae, T. (2004). Gender effect on reading comprehension with Korean EFL learners. System, 32(2), 265–281.
Pae, H. (2011). Differential item functioning and unidimensionality in the Pearson Test of English Academic. Pearson Education Ltd.
Ramsey, P. A. (1993). Sensitivity review: The ETS experience as a case study. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 367-388). Hillsdale, NJ: Erlbaum.
Raîche, G., Walls, T. A., Magis, D., Riopel, M., & Blais, J.-G. (2012). Non-Graphical Solutions for Cattell’s Scree Test. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 9(1), 23–29. https://doi.org/10.1027/1614-2241/a000051
Rasch Measurement Forum. (2017). Retrieved from http://raschforum.boards.net/.
Rezaee, A. A., & Shabani, E. (2010). Gender differential item functioning analysis of the University of Tehran English Proficiency Test. Pazhuhesh-e Zabanha-ye Khareji, 56, 89–108.
Rezai-Rashti, G., & Moghadam, V. (2011). Women and higher education in Iran: What are the implications for employment and the ‘‘marriage market’’? International Review of Education, 57, 419–441.
Roever, C., & McNamara, T. (2006). Language Testing: The Social Dimension. International Journal of Applied Linguistics, 16(2). https://doi.org/10.1111/j.1473-4192.2006.00117.x
Ryan, K., & Bachman, L. (1992). Differential item functioning on two tests of EFL proficiency. Language Testing, 9(1), 12–29.
Sadeghi, S. (2014). High-stakes Test Preparation Courses: Washback in Accountability Contexts. Journal of Education & Human Development, 3(1), 17–26.
Salehi, M., & Tayebi, A. (2012). Differential item functioning in terms of gender in reading comprehension subtest of a high-stakes test. Iranian Journal of Applied Language Studies, 4(1), 135–168.
Salehi, H., & Yunus, M. M. (2012a). The Washback Effect of the Iranian Universities Entrance Exam: Teachers’ Insights. GEMA Online™ Journal of Language Studies, 12(2), 609–628.
Salehi, H., & Yunus, M. M. (2012b). University Entrance Exam in Iran: A bridge or a dam. Journal of Applied Sciences Research, 8(2), 1005–1008.
Scheuneman, J. D., & Bleistein, C. A. (1989). A Consumer’s Guide to Statistics for Identifying Differential Item Functioning. Applied Measurement in Education, 2(3), 255–275. https://doi.org/10.1207/s15324818ame0203_6
Song, X., & He, L. (2015). The Effect of a National Education Policy on Language Test Performance: A Fairness Perspective. Language Testing in Asia, 5(1), 1–14. https://doi.org/10.1186/s40468-014-0011-z
Spolsky, B., & Bachman, L. F. (1991). Fundamental Considerations in Language Testing. The Modern Language Journal, 75(4). https://doi.org/10.2307/329499
Tahmasbi, S., & Yamini, M. (2012). Teachers’ Interpretations and Power in a High-Stakes Test: A CLA Perspective. English Linguistics Research, 1(2), 53. https://doi.org/10.5430/elr.v1n2p53
Terry, R. M., Genesee, F., & Upshur, J. A. (1998). Classroom-Based Evaluation in Second Language Education. The Modern Language Journal, 82(1). https://doi.org/10.2307/328719
The Glossary of Education Reform. (2014). 11 Ways to Improve School Communications and Community Engagement. https://www.edglossary.org/school-communications/
Wiberg, M. (2007). Measuring and Detecting Differential Item Functioning in Criterion-Referenced Licensing Test: A Theoretic Comparison of Methods (Educational Measurement Technical Report No. 2).
Xi, X. (2010) How do we go about investigating test fairness? Language Testing, 27(2), 147-170.