Although corpus-linguistic methods and research have had a considerable impact on language teaching over the last few decades, the corpus is still mostly regarded as a tool in the hands of the teacher rather than the students. This is most probably due to a number of serious problems that arise when students are given unrestricted access to the wealth of corpus-linguistic data. This chapter argues that these problems must be taken seriously and that a fundamental prerequisite for student use of corpora is what I call ‘corpus competence.’ Such corpus competence will not only help students to make use of corpora in the classroom but will also prepare the ground for the use of corpora in noninstitutionalized contexts and as a tool for life-long learning.
Key Terms in this Chapter
Concordancer: A tool for exploiting corpus data. A concordancer searches a given corpus for an individual (usually lexical) item or a set of items, identifies each occurrence of the search expression in the corpus, and presents all instances together with their surrounding context.
Representativeness: No corpus can capture a complete language. Instead, corpora are regarded as samples of a given language. As samples, corpora aim to be representative of a particular language or of parts thereof. The Brown Corpus, for instance, aims to represent written American English published in 1961.
Corpus: A collection of annotated or unannotated texts used for linguistic analysis.
Frequency, Normalized: Since absolute frequencies depend on the size of the corpus, they are usually normalized, that is, divided by the total number of words in the corpus. Normalized frequencies from different corpora are comparable even if the corpora are of different sizes.
Frequency, Absolute: The number of occurrences of a particular item in a corpus.
Genre: A way of categorizing texts that does not require any linguistic knowledge or expertise. Examples are the categorizations found in libraries and categories like ‘spoken,’ ‘written,’ ‘crime fiction,’ and ‘newspaper texts.’
Register: A way of categorizing texts according to the situation in which these texts are used. The concept of ‘register’ tries to capture the intuition that the situation of use has an impact on the way language is used. Illustrative examples are the differences in language use when talking to intimate friends as opposed to talking to superiors at work.
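The definitions of ‘concordancer’ and of absolute and normalized frequency given above can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function names, the naive tokenizer, the bracketed keyword-in-context layout, and the two toy corpora are assumptions made purely for illustration. The `per` parameter is likewise an addition; with its default of 1 the function follows the definition above (absolute frequency divided by corpus size), while a larger value simply rescales the result, for example to occurrences per 1,000 words.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (deliberately naive)."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, term, window=3):
    """Return one keyword-in-context line per occurrence of `term`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == term:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            # The search term is bracketed so it stands out from its context.
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


def absolute_frequency(tokens, term):
    """Number of occurrences of `term` in the corpus."""
    return Counter(tokens)[term]


def normalized_frequency(tokens, term, per=1):
    """Absolute frequency divided by corpus size, optionally scaled by `per`."""
    return absolute_frequency(tokens, term) * per / len(tokens)


# Two toy "corpora" of different sizes (hypothetical example data).
small = tokenize("The cat sat on the mat")
large = tokenize(
    "The dog chased the cat and the bird watched the dog from a tree"
)

for line in concordance(small, "the", window=2):
    print(line)

for name, corpus in (("small", small), ("large", large)):
    print(
        name,
        len(corpus),
        absolute_frequency(corpus, "the"),
        round(normalized_frequency(corpus, "the", per=1000), 1),
    )
```

In this toy data the larger corpus contains more occurrences of ‘the’ in absolute terms, but its normalized frequency is lower, which is why only the normalized values can be meaningfully compared across corpora of different sizes.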