This chapter showcases research initiatives on the validation of English language assessments used in the Canadian public sphere. Canadian-based assessment developers and researchers have created a rich literature documenting language assessment validation practices in different contexts. The authors first explain the purpose of assessment validation and several contemporary frameworks used for this purpose. They then describe assessments across the lifespan, for both English as a first language (L1) and second or additional language (L2). The authors conclude by describing a small-scale qualitative validation study of an English for academic purposes (EAP) listening assessment, highlighting emergent findings related to audio-visual input and test-takers' cognitive processing, as well as construct appropriateness and authenticity.
Introduction
In the Canadian context, language assessment is ubiquitous across the lifespan, taking many forms and serving a variety of purposes. Dozens of language assessments exist for children in primary and pre-primary school alone (Heppner, 2020), and even babies are routinely assessed to monitor potential need for language interventions (Moharir et al., 2014). During compulsory schooling in Canada (ages 6 to 18), language assessments gauge first language (L1) users' academic language skills (i.e., language arts exams in the language of instruction, or LOI), while students using the LOI as a second language (L2) may be assessed for language support needs. Post-secondary institutions assess the language skills of international students who use the LOI as an L2 for purposes of admission and placement. Successful immigration to Canada is contingent on demonstrating language proficiency in English or French, Canada's two official languages. Further, some professional designations in Canada also require profession-specific language assessments for internationally trained applicants (e.g., nursing and optometry). Clearly, language assessment is an integral part of Canadian educational and professional life, especially for people who have migrated to Canada from abroad and have diverse linguistic backgrounds.
According to the Standards for Educational and Psychological Testing (AERA/APA/NCME, 2014), assessment validation is the process by which data are collected and analyzed in light of a specified theoretical framework of language competency, in order to demonstrate the extent to which an assessment's scores represent what they claim to represent. Validation entails the systematic, analytic, and critical process through which documentation and evidence are gathered and crafted into a cohesive argument for the assessment's interpretation and use. If the argument can withstand reasoned critiques, the validation process can be considered successful for the assessment's given purpose (Kane, 2006). Questions that drive the validation process include: What is the quality of the conceptual framework of language competence underpinning the assessment? Do the assessed language constructs clearly align with that framework and with the skills required in the target language domain? Is the assessment's level of difficulty appropriate? Do scores exhibit any patterns of fluctuation or variance unrelated to the target skills? (AERA/APA/NCME, 2014). These are just some of the questions to be answered in order to determine whether a test-taker's assessment scores appropriately represent their real-world language skills.
Validation is a shared responsibility between the assessment developer and the assessment user (the decision-maker), but the developer carries the greater share, providing the evidence and rationale for score interpretations for the assessment's intended use (AERA/APA/NCME, 2014). The score user, on the other hand, is responsible for evaluating this evidence in their unique assessment context. Validation of large-scale language assessments can be a complex process, requiring analysis of assessment content and elicited constructs, the processes assessment-takers use when they engage with assessment items, the consistency of assessment results and their relationship to other variables of interest, and the consequences of score use (AERA/APA/NCME, 2014). Yet, for assessments used in local contexts, a small-scale validation process can be performed on a modest budget and can provide test developers with essential and targeted information about the validity of their assessments.