Revolutionising Essay Evaluation: A Cutting-Edge Rubric for AI-Assisted Writing

Hassan Saleh Mahdi (Arab Open University, Saudi Arabia) and Ahmed Alkhateeb (King Faisal University, Saudi Arabia)
DOI: 10.4018/IJCALLT.368226

Abstract

This study aims to develop a robust rubric for evaluating artificial intelligence (AI)–assisted essay writing in English as a Foreign Language (EFL) contexts. Employing a modified Delphi technique, we conducted a comprehensive literature review and administered Likert scale questionnaires. This process yielded nine key evaluation criteria, forming the initial rubric. The rubric was applied to evaluate 33 AI-assisted essays written by students as part of an intensive course assignment. Statistical analysis revealed significant inter-rater reliability and convergent validity coefficients, supporting the adoption and further development of such rubrics across higher education institutions. The developed rubric was subsequently used to evaluate these essays using two AI tools: ChatGPT and Claude. The results indicated that both AI tools evaluated the essays with similar scores, demonstrating consistency in their assessment capabilities.
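The inter-rater reliability reported in the abstract can be illustrated with a minimal sketch. Assuming two raters (or two AI tools such as ChatGPT and Claude) assign scores on a 1–5 band to the same set of essays, a Pearson correlation over the paired scores is one common consistency measure; the scores below are hypothetical, not the study's data:

```python
def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical 1-5 rubric scores from two raters on five essays
rater_a = [4, 3, 5, 2, 4]
rater_b = [4, 4, 5, 2, 3]
print(round(pearson(rater_a, rater_b), 3))  # → 0.808
```

Values near 1.0 indicate that the two raters rank and score the essays consistently, which is the property the rubric is designed to support.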

Introduction

Writing assessment is an important part of language teachers’ duties, yet it often provokes objections from students. Rubrics can reduce these objections by making the scoring criteria explicit. Beyond rubrics, other factors are essential to consistent assessment, including raters' knowledge of the content (Shabani & Panahi, 2020), their linguistic backgrounds, previous experience, and prior training in assessment (Huang, 2009). Writing assessment also serves to deliver feedback, which is crucial for improving students’ motivation (Hyland, 2003).

The arrival of artificial intelligence (AI) tools has significantly altered writing. Integrating AI into the writing process has transformed composition, editing, and proofreading. Alharbi (2023) noted that AI systems aid students at every stage of writing, offering human-like sentence-completion suggestions and text generation. Furthermore, Grimes and Warschauer (2010) pointed out that AI writing tools can provide real-time feedback on various aspects of writing, ultimately enhancing the overall quality of written work.

Assessing AI-generated writing is a developing field in language learning research. Scholars expect the education sector to focus intensively on the effective, ethical, and accountable use of AI in writing (Zou & Huang, 2024). As AI tools advance and become more prevalent, researchers are developing strategies to differentiate between human-written and AI-generated texts. For instance, Elkhatat et al. (2023) examined several AI content detection tools to assess their effectiveness in distinguishing between human- and AI-generated content. Additionally, Moya et al. (2024) conducted a scoping review of current understandings of academic integrity and AI in higher education. Their findings highlighted both ethical concerns and opportunities connected with AI, and the review offered recommendations for addressing the integration of AI in higher education.

To date, rubrics have proven useful as assessment tools, helping teachers score more consistently. However, the expansion of AI tools has introduced both opportunities and challenges in this area. Traditional rubrics suit essays written by students without AI assistance; the use of AI tools in writing requires updated, more rigorous rubrics that can evaluate AI-assisted essays precisely. Therefore, this study aims to develop a comprehensive rubric that guarantees a fair, valid, and consistent evaluation of student writing that may have been produced with the help of AI tools. The proposed rubric will build on the strengths and weaknesses of current rubrics, add elements needed to evaluate essays generated with AI tools, address the unique challenges of AI-generated writing, and incorporate ethical considerations. For example, the rubric will not be biased with respect to race, gender, or any other characteristic. It will also be transparent: users (e.g., language instructors and students) will know the criteria and how scores are assigned.

The aim of this study is to create a rubric that can be used to assess student achievement of learning outcomes in writing essays. The research questions (RQs) are:

  • RQ1. What are the key criteria that should be included in a rubric designed to evaluate AI-assisted writing?

  • RQ2. How does the use of an AI-specific rubric impact the consistency and validity of essay assessments compared to traditional rubrics?

  • RQ3. How do the assessment outcomes differ among various AI writing tools when evaluating student essays?

Literature Review

The development of AI tools and their use by students require policies governing that use, along with educators who are prepared to incorporate the tools into their teaching practice (Power, 2024). However, many instructors are not yet fully equipped for this task. Research indicates that AI tools can significantly boost learner engagement and reduce academic delay, supporting their implementation in diverse educational settings (Ma & Chen, 2024).
