In their heyday, artificial neural networks promised a radically new approach to cognitive modelling. The connectionist approach spawned a number of influential, and controversial, cognitive models. In this article, we consider the main characteristics of the approach, look at the factors leading to its enthusiastic adoption, and discuss the extent to which it differs from earlier computational models. Connectionist cognitive models have made a significant impact on the study of mind. However, connectionism is no longer in its prime. Possible reasons for the diminution in its popularity will be identified, together with an attempt to identify its likely future. The rise of connectionist models dates from the publication in 1986 of Rumelhart and McClelland's two-volume edited work containing a collection of connectionist models of cognition, each trained by exposure to samples of the required tasks. These volumes set the agenda for connectionist cognitive modellers and offered a methodology that subsequently became the standard. Connectionist cognitive models have since been produced in domains including memory retrieval and category formation; in language, phoneme recognition, word recognition, speech perception, acquired dyslexia, and language acquisition; and in vision, edge detection and object and shape recognition. More than twenty years later the impact of this work is still apparent.
Seidenberg and McClelland’s (1989) model of word pronunciation is a well-known connectionist example. They used backpropagation to train a three-layer network to map an orthographic representation of words and non-words onto both a distributed phonological representation and an orthographic output representation. The model is claimed to provide a good fit to experimental data from human subjects. Humans can make rapid decisions about whether a string of letters is a word or not (in a lexical decision task), and can readily pronounce both words and non-words. The time they take to do both is affected by a number of factors, including the frequency with which words occur in the language and the regularity of their spelling. The trained artificial neural network outputs both a phonological and an orthographic representation of its input. The phonological representation is taken as equivalent to pronouncing the word or non-word. The orthographic representation, and the extent to which it duplicates the original input, is taken to be the equivalent of the lexical decision task.
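The basic machinery of such a model, a three-layer network trained by backpropagation, can be illustrated with a minimal sketch. The patterns below are hypothetical toy data, not the orthographic and phonological codes of the actual model, which used hundreds of units and thousands of words; the layer sizes and learning rate are likewise illustrative assumptions.

```python
# Minimal sketch of a three-layer network trained by backpropagation.
# Toy data only: three one-hot "orthographic" inputs mapped to two-unit
# "phonological" targets (hypothetical; not the real model's coding scheme).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.eye(3)                                        # 3 input patterns
Y = np.array([[1, 0], [0, 1], [1, 1]], dtype=float)  # target outputs

# Three layers of units: 3 input -> 8 hidden -> 2 output.
W1 = rng.normal(0.0, 0.5, size=(3, 8))
W2 = rng.normal(0.0, 0.5, size=(8, 2))

lr = 1.0
losses = []
for epoch in range(2000):
    hidden = sigmoid(X @ W1)        # hidden-layer activations
    output = sigmoid(hidden @ W2)   # output-layer activations
    err = output - Y
    losses.append(float((err ** 2).mean()))
    # Backpropagation: pass the error signal back through the weights.
    d_out = err * output * (1 - output)              # delta at output layer
    d_hid = (d_out @ W2.T) * hidden * (1 - hidden)   # delta at hidden layer
    W2 -= lr * hidden.T @ d_out
    W1 -= lr * X.T @ d_hid

print(f"mean squared error: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Repeated exposure to the training pairs gradually adjusts the two weight matrices until the network reproduces the target patterns, the same error-driven learning regime, writ small, that the word pronunciation model relied on.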
The past tense model (Rumelhart & McClelland, 1986) has also been very influential. The model mirrors several aspects of human learning of verb endings. It was trained on examples of the root form of the verb as input, and of the past-tense form as output. Each input and output was represented as a set of context-sensitive phonological features, coded and decoded by means of a fixed encoder/decoder network. A goal of the model was to simulate the stage-like sequence of past tense learning shown by humans. Young children first correctly learn the past tense of a few verbs, both regular (e.g. looked) and irregular (e.g. went, or came). In stage 2 they often behave as though they have inferred a general rule for creating the past tense (adding -ed to the verb stem). But they often over-generalise this rule, and add -ed to irregular verbs (e.g. comed). There is a gradual transition to the final stage, in which they learn to produce the correct past tense form of both regular and irregular verbs. Thus their performance exhibits a U-shaped function for irregular verbs (initially correct, then often wrong, then correct again).
The model was trained in stages on 506 English verbs. First, it was trained on 10 high-frequency verbs (regular and irregular). Then medium-frequency verbs (mostly regular) were introduced and trained for a number of epochs. A dip in performance on the irregular verbs occurred shortly after the introduction of the medium-frequency verbs, followed by a gradual improvement that resembled the U-shaped curve found in human performance.
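The interference dynamic behind that dip can be sketched with a deliberately simplified toy model (an assumption for illustration, not the model's actual coding scheme): a single logistic unit decides whether a verb takes the regular "-ed" ending, given one feature shared by all verbs plus one identity feature per verb. Training an irregular verb alone, then flooding the vocabulary with regular verbs, drives the shared weight upward and temporarily over-regularises the irregular verb before its own identity weight compensates.

```python
# Toy illustration of the over-regularisation dip (hypothetical setup):
# one shared "verb" feature + 11 verb-identity features feed a logistic unit.
# Verb 0 is irregular (target 0 = no "-ed"); verbs 1-10 are regular (target 1).
import math

N_VERBS, LR = 11, 0.5
w = [0.0] * (1 + N_VERBS)  # weights: [shared] + one per verb

def prob_regular(verb):
    """Probability the unit assigns to the '-ed' ending for this verb."""
    logit = w[0] + w[1 + verb]  # shared feature + verb's identity feature
    return 1.0 / (1.0 + math.exp(-logit))

def train_epoch(verbs, targets):
    # One full-batch gradient step (mean cross-entropy gradient).
    grad = [0.0] * len(w)
    for v, t in zip(verbs, targets):
        delta = (t - prob_regular(v)) / len(verbs)
        grad[0] += delta
        grad[1 + v] += delta
    for i in range(len(w)):
        w[i] += LR * grad[i]

# Stage 1: the irregular verb is learned on its own.
for _ in range(200):
    train_epoch([0], [0.0])
p_after_stage1 = prob_regular(0)  # low: irregular form produced correctly

# Stage 2: ten regular verbs join the training vocabulary.
verbs = list(range(N_VERBS))
targets = [0.0] + [1.0] * 10
history = []
for _ in range(2000):
    train_epoch(verbs, targets)
    history.append(prob_regular(0))

p_peak = max(history)   # over-regularisation: "-ed" wrongly applied
p_final = history[-1]   # recovery: the irregular form is relearned
print(f"P('-ed' | irregular): {p_after_stage1:.3f} "
      f"-> peak {p_peak:.3f} -> {p_final:.3f}")
```

The rise and subsequent fall of the irregular verb's error traces the same qualitative U-shape (correct, then over-regularised, then correct again), although the real model's dip arose from distributed phonological representations rather than a single shared feature.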
Key Terms in this Chapter
Generalisation: Once trained, artificial neural networks are able to generalise beyond the items on which they were trained, producing similar outputs in response to novel inputs that resemble those encountered in training.
Eliminative Connectionism: The eliminative connectionist seeks to provide an account of cognition that eschews symbols and operates at the subsymbolic level. For instance, the concept of “dog” could be captured in a distributed representation as a number of input features (e.g. four-footed, furry, barks, etc.) and would then exist in the net in the form of the weighted links between its neuron-like units.
Connectionism: Connectionism is the term used to describe the application of artificial neural networks to the study of mind. In connectionist accounts, knowledge is represented in the strength of connections between a set of artificial neurons.
Classical Symbol Processing: The classical view of cognition is that it is analogous to symbolic computation in digital computers. Information is represented as strings of symbols, and cognitive processing involves the manipulation of these strings by means of a set of rules. On this view, the details of how such computation is implemented are not considered important.
Chinese Room: In Searle’s thought experiment, we are asked to imagine a man sitting in a room with a number of rule books. A set of symbols is passed into the room. The man processes the symbols according to the rule books, and passes a new set of symbols out of the room. The symbols posted into the room correspond to a question in Chinese, and the symbols he passes out are the answer to that question, in Chinese. However, the man following the rules has no knowledge of Chinese. The example suggests that a computer program could similarly follow rules in order to answer a question without any understanding.
Implementational Connectionism: In this less extreme version of connectionism, the goal is to find a means of implementing classical symbol processing using artificial neural networks, and thereby to account for symbol processing at the level of neurons.
Lexical Decision: The lexical decision task is a measure devised to examine the processes involved in word recognition. A word or pseudoword (a meaningless string of letters conforming to spelling rules) is presented, and the reader is asked to press a button to indicate whether the display was a word or not. The time taken to make the decision is recorded in milliseconds. The measure can provide an indication of various aspects of word processing, for instance how familiar the word is to the reader.