Introduction
Collocations, or recurrent word combinations, have been widely recognized as an important aspect of vocabulary knowledge (Firth, 1957; Lewis, 2008; Nattinger & DeCarrico, 1992; Sinclair, 1991; Nation, 2013). Dictionaries and, more recently, corpus analysis tools (i.e., concordancers and the like) are the two main resources that learners draw on while acquiring such knowledge. Printed dictionaries are the traditional and popular collocation learning resource; examples dedicated to helping learners master this essential knowledge include the BBI Combinatory Dictionary of English (Benson et al., 1984), the Dictionary of Selected Collocations (Hill & Lewis, 1997), and the Oxford Collocation Dictionary (2009). With corpus analysis tools, learners can enter a word and explore which words are most likely to occur before or after it. These tools, whether web-based (e.g., the Collins COBUILD Corpus, WebCorp, WebCollocate, Mark Davies’ Brigham Young Corpora, COCA) or stand-alone (e.g., WordSmith Tools, AntConc), are designed specifically for linguists. They differ in their interfaces and search functions, and they mostly present results as keyword-in-context (KWIC) fragments and incomplete sentences. For collocation retrieval, they facilitate searches for two- or three-word collocations.
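To make the keyword-in-context idea concrete, the following minimal Python sketch shows roughly what such a tool does when a learner enters a word: it lists each occurrence with a few words of surrounding context and counts the words found immediately before or after it. It is an illustration only, not the implementation of any tool named above; the function names (`tokenize`, `kwic`, `neighbours`), the three-sentence sample text, and the simplistic tokenization are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase the text and keep only runs of
    # letters and apostrophes; real corpus tools tokenize far more carefully.
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every occurrence."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def neighbours(tokens, keyword, offset):
    """Count words found `offset` positions from the keyword (-1 before, +1 after)."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        j = i + offset
        if tok == keyword and 0 <= j < len(tokens):
            counts[tokens[j]] += 1
    return counts


# Hypothetical three-sentence "corpus" used only for illustration.
sample = ("Smoking can cause a problem for your lungs. "
          "Delays cause a problem for commuters. "
          "This may cause serious damage.")
tokens = tokenize(sample)

# Print the concordance lines with the keyword aligned in the middle.
for left, key, right in kwic(tokens, "cause"):
    print(f"{left:>25} [{key}] {right}")

# Words that most often follow "cause" in the sample.
print(neighbours(tokens, "cause", +1).most_common(2))
```

Because the context window is cut at a fixed number of words and ignores sentence boundaries, the output lines are fragments, not complete sentences, which is the presentation issue noted above.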
Many researchers and teachers have explored corpus-based tools as a way to facilitate collocation learning, with promising results reported in the literature (Boulton & Cobb, 2017). The tools have been used to help students find correct word combinations (e.g., “cause a problem” vs. “bring a problem”) (Yoon, 2008; Chen, 2011; Daskalovska, 2015; Vyatkina, 2016), understand subtle differences in meaning among verbs that lack direct L1 equivalents, such as synonyms (e.g., construct, build, and establish) and hypernyms (e.g., create and compose) (Chan & Liou, 2005), and identify common word choice errors in student writing (Chambers & O’Sullivan, 2004; Wu et al., 2009). Johns (1991) used the term “data-driven learning” (DDL) to describe this approach, which centers on fostering learners’ skills in becoming a “language researcher”. Despite the great potential presented in the literature, DDL has not been widely accepted by mainstream language educators (Leńko-Szymańska & Boulton, 2015, p. 3). The technical challenges facing both teachers and students go some way toward explaining the reluctance to implement DDL in the classroom. This view is supported by the results of a large-scale survey on the use of corpora in language teaching (Tribble, 2015).
User-friendliness and free access are reported to be two major factors influencing respondents’ willingness to use corpora, while “don’t know how to” and “are not familiar with” are among the reasons given for not using them. DDL researchers have also reported several factors that may hinder corpus use, including the metalinguistic knowledge (e.g., part-of-speech tags) required to formulate queries, unfamiliarity with complex search interfaces and functions, overwhelming results, and difficulties in locating and interpreting target language features in concordances (Yoon & Hirvela, 2004; O'Sullivan & Chambers, 2006; Yeh, Li, & Liou, 2007; Chen, 2011; Rodgers et al., 2011; Boulton, 2012a; Chang, 2014; Geluso & Yamaguchi, 2014; Daskalovska, 2015). For example, Chang (2014) asserts that the differing interfaces and functions of various corpus tools further increase the technical challenge, because learners generally need to learn a new system in order to access a different corpus.