Assessing artificial intelligence and professors’ calibration in English as a foreign language writing courses at a Costa Rican public university

artificial intelligence
assessment
higher education
second language instruction
writing (composition)

How to Cite

Charpentier-Jiménez, W. (2024). Assessing artificial intelligence and professors’ calibration in English as a foreign language writing courses at a Costa Rican public university. Actualidades Investigativas En Educación, 24(1), 1–25. https://doi.org/10.15517/aie.v24i1.55612

Abstract

This article explores the evaluation of artificial intelligence (AI) in English as a Foreign Language (EFL) writing courses and the importance of calibration in writing evaluations. The role of calibration has received little attention in language contexts, while the role of artificial intelligence has gained increased attention in the last couple of years. This investigation, conducted from August 2022 to March 2023, involved eight TESOL students enrolled in an EFL major at a Costa Rican public university, ten TESOL university professors, and one piece of AI software. It used a quantitative, quasi-experimental design and a language elicitation data collection process. Data were collected by means of a rubric-based writing assessment and analyzed using descriptive statistics. Data analyses indicate that: 1) human-created paragraphs (X̄ = 7.56) and AI writing (X̄ = 7.61) yield similar results when evaluated; 2) some criteria may favor either human creativity or rule-oriented computer writing; and 3) professors’ ratings reveal inconsistencies, particularly when grading human writing. These findings demonstrate that AI matches human writing skills, at least at a basic level. Furthermore, the data show that students may be falling behind in aspects such as grammar, vocabulary, and mechanics. Finally, the analysis indicates that professors’ grading lacks consistency and that a calibration model should be incorporated as part of regular training workshops.
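As an illustration of the kind of descriptive comparison the abstract describes, the following minimal sketch computes an overall mean score and an average between-rater spread for two sets of texts. The scores, the 0-10 scale, the dictionary names, and the `summarize` helper are all hypothetical and are not taken from the study; the standard deviation is used here only as one possible indicator of rater inconsistency.

```python
from statistics import mean, stdev

# Hypothetical rubric scores on an assumed 0-10 scale: one list of rater
# scores per paragraph. These values are invented for illustration and are
# NOT the study's data.
human_scores = {
    "paragraph_1": [7.0, 8.5, 6.0, 9.0],
    "paragraph_2": [6.5, 7.5, 8.0, 5.5],
}
ai_scores = {
    "paragraph_3": [7.5, 7.0, 8.0, 7.5],
    "paragraph_4": [8.0, 7.5, 7.0, 8.0],
}


def summarize(scores_by_text):
    """Return (overall mean, average between-rater standard deviation)."""
    all_scores = [s for scores in scores_by_text.values() for s in scores]
    # A larger spread means raters disagree more about the same text.
    spread = mean(stdev(scores) for scores in scores_by_text.values())
    return round(mean(all_scores), 2), round(spread, 2)


human_summary = summarize(human_scores)
ai_summary = summarize(ai_scores)
print("human (mean, spread):", human_summary)
print("AI (mean, spread):", ai_summary)
```

In a sketch like this, similar means with a noticeably larger spread for one group would point to the pattern the abstract reports: comparable overall scores, but less agreement among raters on one type of writing.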

https://doi.org/10.15517/aie.v24i1.55612

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
