APECS: An Adaptively Parameterised Model of Associative Learning and Memory
I.P.L. McLaren (University of Exeter, UK)
DOI: 10.4018/978-1-60960-021-1.ch007

Abstract

In this chapter the author gives an overview of the ideas behind Adaptively Parameterised Error Correcting Learning (APECS), as introduced in McLaren (1993). The chapter takes a somewhat historical perspective, tracing the development of this approach from its origins as a solution to the sequential learning problem identified by McCloskey and Cohen (1989) in the context of paired-associate learning, through to its more recent application as a model of human contingency learning.
Chapter Preview

Background: The Sequential Learning Problem

The development of novel connectionist algorithms (Rumelhart, Hinton, and Williams, 1986; Ackley, Hinton, and Sejnowski, 1985) capable of driving learning in multi-layer networks can be seen as one of the major developments in cognitive science in the 1980s. One of these algorithms, Back Propagation (Rumelhart, Hinton, and Williams, 1986), used gradient descent to learn input/output relationships, and was typically instantiated in feed-forward architectures. This otherwise successful approach, however, came up against the sequential learning problem identified by McCloskey and Cohen (1989) and further analysed by Ratcliff (1990). A general statement of this problem is that if a network employing Back Propagation is first taught one set of input/output relations, and then some other mapping is learnt whose input terms are similar to those first used in training, then a near complete loss of performance on the first mapping is observed on test. We can say that the new learning wipes out the old. This is not a necessary characteristic of the feed-forward architecture, because, if training alternates between the two mappings, repeatedly teaching first one and then the other, eventually a solution is reached that captures both sets of input/output relationships. Thus, this "catastrophic interference", when new learning erases old, is only seen if the two mappings are learnt in sequence. This does not mean that this property of the learning algorithm can be ignored, however, as learning (in humans and networks) often takes place within a sequential format (e.g., see Ratcliff, 1990; Hinton and Plaut, 1987; Sejnowski and Rosenberg, 1987).
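The effect described above can be illustrated with a deliberately minimal sketch: a one-layer delta-rule network (a simplification, not the two-layer back-propagation networks discussed in this chapter) trained on two mappings in sequence. The binary encodings of list contexts, syllables, and adjectives below are illustrative assumptions; the point is only the qualitative result, that training on the second mapping degrades performance on the first.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Input units: [list-1 context, list-2 context, syllable A, syllable B].
# Output units: one per adjective (two adjectives per list).
X1 = np.array([[1, 0, 1, 0], [1, 0, 0, 1]], dtype=float)  # list 1 inputs
T1 = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)  # list 1 targets
X2 = np.array([[0, 1, 1, 0], [0, 1, 0, 1]], dtype=float)  # list 2 inputs
T2 = np.array([[0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)  # list 2 targets

W = rng.normal(scale=0.1, size=(4, 4))  # output x input weight matrix

def train(W, X, T, epochs=2000, lr=0.5):
    """Gradient descent on squared error (delta rule with sigmoid units)."""
    for _ in range(epochs):
        for x, t in zip(X, T):
            y = sigmoid(W @ x)
            W += lr * np.outer((t - y) * y * (1 - y), x)
    return W

def mse(W, X, T):
    return np.mean([(sigmoid(W @ x) - t) ** 2 for x, t in zip(X, T)])

W = train(W, X1, T1)
err_list1 = mse(W, X1, T1)       # small: list 1 has been learned
W = train(W, X2, T2)
err_interfered = mse(W, X1, T1)  # much larger: list 2 training has
                                 # overwritten the shared syllable weights
print(err_list1, err_interfered)
```

Because the syllable units are shared between the two lists, the weights that carried list 1's responses are driven towards list 2's responses during the second phase of training, which is the interference the text describes.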

As a simple example of this general type of problem, consider modelling a paired-associate experiment (based on Barnes and Underwood, 1959) in which human subjects are required to learn a list (list 1) of eight nonsense syllable-adjective pairs to a criterion of 100%. That is, after some number of training trials, the subject is able to provide the correct adjectival response to each nonsense syllable stimulus. After learning list 1, the subjects learn list 2, which employs the same nonsense syllables as the first, but pairs new adjectives with them. Training continues until subjects are near perfect on this list (>90%). They are then asked to recall the original list 1 adjectival responses for each nonsense syllable. Performance drops to around 50% for this list, which is taken to be an instance of retroactive interference (control groups suggest that it is not simply the passage of time that is responsible for this decline in performance).

As McCloskey and Cohen (1989) showed, this task can be modelled in a feed-forward two-layer network running Back Propagation. The list 'context' and the nonsense syllables (e.g. dax, teg) are the input, and the adjectives (e.g. regal, sleek) are the output (see Figure 1, which shows both the network in question and the experimental design).

Figure 1.

Top panel: The design of Barnes and Underwood's (1959) experiment reduced to a two-item list for simulation purposes. Bottom panel: A feed-forward architecture running back propagation used to simulate performance on the task outlined in the top panel. This network is an adaptation of one of the networks used by McCloskey and Cohen (1989) and uses two item-pair lists. Each node in the input and output layers stands for some stimulus, list context, or response, signalling its presence or absence via its activity. Learning proceeds by changing the values of the connection strengths between nodes (called 'weights') so that the input nodes can transmit activation, via the hidden units, to the output nodes.
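A network of the general shape shown in Figure 1 can be sketched as a small feed-forward back-propagation model. The encoding below (context units, syllable units, one output unit per adjective) and all parameter values are illustrative assumptions, not the settings McCloskey and Cohen (1989) actually used. Trained with the two lists interleaved, the network finds weights that satisfy both mappings, which is the alternating regime noted earlier that avoids catastrophic interference.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Inputs: [list-1 context, list-2 context, syllable A, syllable B].
# Outputs: one unit per adjective (two per list); patterns are illustrative.
X = np.array([[1, 0, 1, 0], [1, 0, 0, 1],    # list 1 items
              [0, 1, 1, 0], [0, 1, 0, 1]],   # list 2 items
             dtype=float)
T = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
              [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)

n_in, n_hid, n_out = 4, 8, 4
W1 = rng.normal(scale=0.5, size=(n_hid, n_in))   # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(n_out, n_hid))  # hidden -> output weights

lr = 0.5
for _ in range(5000):                   # interleaved presentation of both lists
    for x, t in zip(X, T):
        h = sigmoid(W1 @ x)             # hidden-unit activations
        y = sigmoid(W2 @ h)             # output-unit activations
        dy = (y - t) * y * (1 - y)      # output-layer error signal
        dh = (W2.T @ dy) * h * (1 - h)  # error back-propagated to hidden layer
        W2 -= lr * np.outer(dy, h)
        W1 -= lr * np.outer(dh, x)

def mse(X, T):
    return np.mean([(sigmoid(W2 @ sigmoid(W1 @ x)) - t) ** 2
                    for x, t in zip(X, T)])

err_list1 = mse(X[:2], T[:2])
err_list2 = mse(X[2:], T[2:])
print(err_list1, err_list2)
```

If the same network were instead trained on list 1 to criterion and only then on list 2, as in the sequential regime the text describes, performance on list 1 would collapse, reproducing the retroactive interference pattern in simulation.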
