Concordancing 2.0: On Custom-Made Corpora in the Classroom

Jaroslaw Krajka (Warsaw School of Social Psychology, Poland)
Copyright: © 2009 | Pages: 21
DOI: 10.4018/978-1-60566-190-2.ch022

This chapter contrasts the use of corpora and concordancing in the Web 1.0 era with the opportunities that Web 2.0 stand-alone concordancing software presents to language teachers: such software makes it much easier to access, compile, and consult corpora that are relevant to particular classroom contexts. It is argued that, once trained in basic corpus consultation procedures with demo interfaces, teachers can exercise their autonomy by using texts available locally and globally to compile custom-made collections. The chapter describes the two basic approaches to custom-made concordancing, namely the Web as Corpus and the compilation of ad-hoc collections, together with a summary of sample tools. It is hoped that, given careful selection of relevant sources, more authentic and relevant language data will significantly enhance the learning process and promote teacher autonomy and discovery-based procedures.
Chapter Preview

Background: Opportunities And Drawbacks Of In-Class Corpus Consultation Procedures

Numerous studies have investigated the effectiveness of corpus-based procedures in foreign language instruction. They range from the use of small corpora tailored to students’ needs (Aston, 1997) to the promotion of large-corpus concordancing (Bernardini, 2000; de Schryver, 2002), and cover improving writing performance at lower (Yoon & Hirvela, 2004; Gaskell & Cobb, 2004) and advanced levels (Chambers & O’Sullivan, 2004), grammar presentation (Hadley, 2002), and rule inferencing (St. John, 2001). An extensive body of research exists, quite naturally, in the area of vocabulary acquisition (Cobb, 1997; Cobb, 1998) and the teaching of foreign language reading, assisted not only by concordancers themselves but also by a wider resource-assisted environment comprising, for instance, a concordance, a dictionary, a cloze-builder, hypertext, and a database with an interactive self-quizzing feature (Cobb et al., 2001; Horst et al., 2005). Some studies report on the relation between the effectiveness of corpus-consultation procedures and strategy training (Kennedy & Miceli, 2001; St. John, 2001; Chambers, 2005), indicating the need for a conscious and gradual introduction of the tool in the classroom. The perspective most relevant to the present chapter is the increase in writing proficiency that results from learners compiling their own corpora (Lee & Swales, 2006).

Key Terms in this Chapter

Corpus: A collection of linguistic data, either written texts or a transcription of recorded speech, which can be used as a starting-point of linguistic description or as a means of verifying hypotheses about a language.

The Web as Corpus: A movement in computational linguistics advocating the use of pre-selected or randomly chosen websites or discussion group postings as sources for custom-made corpus collections, usually paired with dedicated concordancing tools.

Corpus Compilation: The process of collecting samples of language according to predefined criteria, such as medium, register, and genre, and putting them together either in a single file or in a file collection to serve as data for concordance queries.

Custom-Made/Do-It-Yourself/Ad-Hoc Corpus: A collection prepared by a particular teacher/translator to address specific needs of a teaching/translating context, compiled by spotting and retrieving relevant texts either on the Web or locally.

Concordancer: A tool, either an online form or an installable piece of software, that lets users formulate queries of varying sophistication and browse a selected or customized corpus for instances of use, producing KWIC (Key Word In Context) output.

British National Corpus: Often referred to as the BNC, this corpus contains about 100 million words of written and spoken language, thus offering an extremely wide representation of British English. The latest edition dates from 2007 and includes extracts from all areas of contemporary British life, from newspapers and periodicals to radio and television programmes.

Concordancing: The procedure of browsing a corpus (either ready-made or custom-made) for occurrences of particular words or phrases, used to assist dictionary lookup, observe language use in particular registers or test hypotheses about collocations.
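
To make the terms above concrete, the following is a minimal sketch of how a custom-made corpus might be compiled and queried for KWIC output. It is an illustration only, not the procedure or tooling described in the chapter: the function names (`compile_corpus`, `kwic`, `format_kwic`), the sample file names, and the sample sentences are all hypothetical, and the tokenizer is a deliberately simple assumption that treats runs of letters and apostrophes as words.

```python
import re
from typing import Dict, List, Tuple

# A compiled corpus: (source name, list of word tokens) per text.
Corpus = List[Tuple[str, List[str]]]
# One concordance line: (source name, left context, keyword, right context).
KwicLine = Tuple[str, str, str, str]


def compile_corpus(texts: Dict[str, str]) -> Corpus:
    """Tokenize each named text (assumption: words are runs of letters/apostrophes)."""
    return [(name, re.findall(r"[A-Za-z']+", body)) for name, body in texts.items()]


def kwic(corpus: Corpus, keyword: str, width: int = 3) -> List[KwicLine]:
    """Collect every case-insensitive match with `width` words of context per side."""
    target = keyword.lower()
    lines: List[KwicLine] = []
    for name, tokens in corpus:
        for i, token in enumerate(tokens):
            if token.lower() == target:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append((name, left, token, right))
    return lines


def format_kwic(lines: List[KwicLine]) -> List[str]:
    """Right-align the left context so the keywords line up in one column."""
    pad = max((len(left) for _, left, _, _ in lines), default=0)
    return [f"{left:>{pad}}  [{kw}]  {right}  ({name})" for name, left, kw, right in lines]


# Hypothetical ad-hoc collection; a teacher would load real local or Web texts here.
sample_texts = {
    "email.txt": "Please make a decision by Friday. We will make arrangements later.",
    "report.txt": "The committee did not make progress on the budget.",
}

corpus = compile_corpus(sample_texts)
hits = kwic(corpus, "make", width=2)
for row in format_kwic(hits):
    print(row)
```

Reading down the aligned keyword column shows the words that follow "make" in this small collection (a decision, arrangements, progress), which is the kind of collocation evidence that concordancing is meant to surface. A real concordancer would add features this sketch omits, such as phrase or wildcard queries and sorting by context.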
