Combinatory Categorial Grammar for Computer-Assisted Language Learning

Simon Delamarre (Telecom Bretagne, France) and Maryvonne Abraham (Telecom Bretagne, France & Université Européenne de Bretagne, France & Laboratoire LaLICC, Paris Sorbonne, France)
DOI: 10.4018/978-1-61350-447-5.ch017


This chapter demonstrates how Applicative and Combinatory Categorial Grammar (ACCG) can be used to design powerful software applications for language teaching. To this end, the authors present some modules from their “pictographic translator,” software that performs a syntactic analysis of natural-language sentences typed directly by the user, and then dynamically displays a series of pictograms illustrating the words and structure of those sentences. After a short presentation of the application and an introduction to ACCG, the chapter examines how this formalism enables the building of several high-level functions in the system, such as disambiguation, structure exhibition, and grammatical correction/validation. The chapter concludes with a short discussion of the potential (and limits) of this architecture with regard to multilingualism.
Chapter Preview


According to certain studies concerning learning and retention, we usually memorize about 90% of what we do, versus only 10% of what we read. However forced this explicit quantification may seem, it is a fact that any learning activity, to be efficient, has to be active. It is in this spirit that we conceived this interactive pictographic translator, whereas the vast majority of current pedagogical software programs rely mainly on tick-the-correct-answer or fill-the-gap exercises. The translator consists of a graphical interface in which the user is invited to build a sentence by typing his own words. When the program detects (using a tokenizer) that a new word has been entered, it computes a syntactic analysis of the sentence built so far, then retrieves (by performing a lemmatization) and displays a pictogram that represents this word (see Figure 1). As long as it remains relevant, we apply the rule of correspondence: one word, one pictogram. Situations in which a single pictogram for several words would be preferable (e.g. washing machine) are handled in a post-processing pass after the tokenization step, which merges tokens that form a larger unit present in the lexicon ((washing,machine) → (washing machine)).
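The token-merging step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (their system builds on a Java API); the function name `merge_multiword_tokens`, the representation of the lexicon as a set of word tuples, and the `max_len` limit are all assumptions made for the example.

```python
from typing import List, Set, Tuple


def merge_multiword_tokens(
    tokens: List[str],
    lexicon: Set[Tuple[str, ...]],
    max_len: int = 3,
) -> List[str]:
    """Merge adjacent tokens that form a multi-word entry of the lexicon.

    `lexicon` is a hypothetical set of lower-cased word tuples, e.g.
    {("washing", "machine")}; `max_len` is an assumed upper bound on the
    number of words in a multi-word entry.
    """
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so that longer entries win.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in lexicon:
                merged.append(" ".join(tokens[i:i + n]))
                i += n
                break
        else:
            # No multi-word entry starts here: keep the single token.
            merged.append(tokens[i])
            i += 1
    return merged


if __name__ == "__main__":
    lexicon = {("washing", "machine")}
    print(merge_multiword_tokens(["the", "washing", "machine", "works"], lexicon))
```

Trying the longest span first is a design choice of this sketch: it ensures that if the lexicon contained both a two-word and a three-word entry starting at the same position, the longer one would receive the single pictogram.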

Figure 1.

The pictographic representation for “Je souris à la petite souris” (I’m smiling at the little mouse). The pictograms appear dynamically as the words are entered.

At the same time, by using several graphical effects, such as colored borders, subtitles, transparency, and small pictograms that indicate grammatical features (plural, tenses...), the program gives the user direct visual control over the sentence he is writing, enabling him to detect problems if there are any (see Figure 2). If the user is unable to correct one or several mistakes by himself, he may ask the program to point them out. Thus, little by little, through this cycle of writing, visual control, and correction, the learner is led to build correct sentences by himself, without needing any outside help. Besides the satisfaction of autonomy, this brings him a better understanding of the structures and mechanisms of language. Furthermore, for pupils who are learning to read, this playful activity of “making pictures appear” may lead them to grasp the expressive power of words.

Figure 2.

Another example, “J’aime écouter le vent souffler dans les arbres” (I like listening to the wind blowing in the trees). One can see here the highlighted errors and the correction function. Here the user failed to write the infinitive verb “écouter” (to listen) correctly (confusion with the past participle “écouté”) and forgot the final s in the plural word “arbres” (trees).

This chapter aims to present the main mechanisms and ideas on which our pictographic translator is based. After giving the reader some elementary knowledge of the formalism used (ACCG) and the theoretical issues involved, we will present some of the practical applications and features that this approach makes it possible to implement for computer-assisted language teaching. We will then explore the question of extending such an architecture to a multilingual context. In particular, we will study the invariants between languages that can be captured by Categorial Grammars on the one hand and by the pictographic representation on the other.

The software program was initially developed for the French language, and already provides operational linguistic coverage. Our latest version also implements experimental extensions to English and Spanish, with reduced vocabulary and grammar. It should be noted that our implementation uses some elements of Michael White’s excellent Java API for categorial grammars, OpenCCG, and also Jason Baldridge’s associated DotCCG format for grammar definition (see
