Learner Fit in Scaling Up Automated Writing Evaluation

Elena Cotos, Sarah Huffman
DOI: 10.4018/ijcallt.2013070105


Valid evaluations of automated writing evaluation (AWE) design, development, and implementation should integrate the learners’ perspective in order to ensure the attainment of desired outcomes. This paper explores the learner fit quality of the Research Writing Tutor (RWT), an emerging AWE tool tested with L2 writers at an early stage of its development. Employing a mixed-methods approach, the authors sought to answer questions regarding the nature of learners’ interactional modifications with RWT and their perceptions of appropriateness of its feedback about the communicative effectiveness of research article Introductions discourse. The findings reveal that RWT’s move, step, and sentence-level feedback provides various opportunities for learners to engage with the revision task at a useful level of difficulty and to stimulate interaction appropriate to their individual characteristics. The authors also discuss insights about usefulness, user-friendliness, and trust as important concepts inherent to appropriateness.
Article Preview


Since its beginnings over fifty years ago, Automated Writing Evaluation (AWE) has gained increased popularity as well as considerable technological advancement. Early AWE software, developed to reduce teachers’ workload by automating the scoring of student essays, analyzed the quality of texts by examining language at the surface level (Page, 2003). Modern-day AWE software, such as Criterion (Educational Testing Service), MY Access! (Vantage Learning), and Intelligent Essay Assessor (Pearson Knowledge Technologies), employs natural language processing techniques to enable more complex analyses of writing for performance-specific feedback. These products’ scoring and feedback affordances are promoted as being capable of meeting the needs of L2 learners, writing teachers, and institutional administrators.

Despite the promising potential of AWE, however, its effectiveness has been the subject of strenuous debate. On the one hand, AWE programs are deemed to support process writing approaches valued for multiple drafting and scaffolding feedback (Hyland, 2003; Hyland & Hyland, 2006). On the other hand, AWE effects on the development of writing skills are doubted and even considered harmful (Cheville, 2004). Yet the debate largely feeds on empirically unsupported arguments about whether or not AWE should be used rather than how it should be used to better serve the end users (Chen & Cheng, 2008; Grimes & Warschauer, 2010). It has also been pointed out that the debate over AWE effectiveness overlooks design issues such as the lack of relevant theoretical grounding, heavily form-focused feedback, and unspecified learner needs, which are bound to affect AWE impact on L2 writing if not accounted for as the programs are designed (Cotos, 2012). AWE developers have relied on psychometric evidence of accuracy and reliability, but have disregarded the possible consequences of re-purposing automated scoring technology from its intended summative use to formative assessment, ignoring the need to re-conceptualize AWE design. Along these lines, we believe that AWE technologies should be evaluated from the earliest stages of their development, and that the learners’ perspective on the use of AWE for a given task, in particular, should be a fundamentally significant viewpoint in conceptualizing the design, development, and implementation of such tools in order to enhance their effectiveness.

These issues have been considered in the design of the Research Writing Tutor (RWT), an innovative, genre-specific, web-based tool that analyzes the research article Introduction, Methods, Results, and Discussion/Conclusion sections in terms of discourse units that build the communicative effectiveness of each of these sections. RWT represents a scale-up from an earlier prototype, IADE, a program informed by Interactionist SLA, skill acquisition theory, systemic functional linguistics, and genre analysis (Cotos, 2009). IADE analyzes research article Introductions by classifying texts into rhetorical moves (Swales, 1981, 2004) and generates color-coded feedback on the discourse structure of student texts. It also compares student texts with a corpus of Introductions published in fifty academic domains and provides numeric feedback on how well students’ writing approximates the writing in their field. The approach to IADE’s design and its empirical evaluation (Cotos, 2010) have motivated scaling up to a more fine-grained operational design of RWT, which not only includes improved functionality of features, but also draws from systematic analyses of formative data obtained from test implementations aimed at validating design decisions and informing continuous development of this emerging tool.
