This chapter deals with applications of artificial neural networks to classification and regression problems. Based on theoretical analysis, it demonstrates that in classification problems one should use the cross-entropy error function rather than the usual sum-of-squares error function. Using the gradient descent method to find the minimum of the cross-entropy error function leads to the well-known backpropagation-of-error scheme of gradient calculation, provided that neurons with logistic or softmax output functions are used at the output layer of the neural network. The author believes that understanding the underlying theory presented in this chapter will help researchers in medical informatics to choose more suitable network architectures for medical applications and to carry out network training more effectively.
Medicine involves decision-making, and classification or prediction is an important part of it. However, medical classification or prediction is usually a very complex and hard process, for at least the following reasons:
Much of the data that are relevant to classification or prediction, especially those received from laboratories, are complex or difficult to comprehend and can be interpreted only by experts.
For reliable classification or prediction a large amount of data is frequently needed and some important anomalies in the data may be overlooked.
When monitoring a patient, some events that are dangerous to the patient may be very rare, and it may therefore be difficult to identify them in the continuous stream of data.
Thus computer-assisted support could be of significant help. With the increasing number of clinical databases, it is likely that machine-learning applications will be necessary to detect rare conditions and unexpected outcomes. The necessary methods and algorithms can be found above all in the domains of mathematical statistics and artificial intelligence. Among the methods of artificial intelligence, rule-based expert systems were the first to be used. Today the most important applications utilize algorithms based on neural networks, fuzzy systems, neuro-fuzzy systems or evolutionary algorithms.
Classical statistical methods require certain assumptions about the distribution of data. Neural networks can constitute a good alternative when some of these assumptions cannot be verified. From this point of view, neural networks constitute a special kind of nonparametric statistical method. For this reason the most successful neural network architectures are implemented in standard statistical software packages, for example STATISTICA. The suitability of neural network algorithms for a given decision-making task can thus be easily tested on sample data using one of these software packages.
From an abstract mathematical point of view, medical classification and prediction tasks fall into the scope of either classification or regression problems. Classification, or pattern recognition, can be viewed as a mapping from a set of input variables to an output variable representing the class label. In classification problems the task is to assign new inputs to the labels of classes or categories. In a regression problem we suppose that there exists an underlying continuous mapping y = f(x), and we estimate the unknown value of y using the known value of x.
A typical case of a classification problem in medicine is medical diagnostics. The input data are the patient’s anamnesis, subjective symptoms, observed symptoms and syndromes, measured values (e.g. blood pressure, body temperature, etc.) and the results of laboratory tests. These data are coded by a vector x, the components of which are binary or real numbers. The patients are classified into categories D1, ..., Dm that correspond to their possible diagnoses d1, ..., dm.
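The coding of patient data by a vector x can be illustrated with a minimal Python sketch. The field names, their order and the values below are hypothetical and serve only as an illustration; the chapter does not prescribe a particular coding.

```python
# Hypothetical patient record; all field names and values are illustrative only.
record = {
    "fever": True,         # observed symptom (binary)
    "cough": False,        # subjective symptom (binary)
    "systolic_bp": 135.0,  # measured value, mmHg
    "body_temp": 38.2,     # measured value, degrees Celsius
}

# Assumed fixed ordering of the components of x.
BINARY_FIELDS = ["fever", "cough"]
REAL_FIELDS = ["systolic_bp", "body_temp"]


def encode(rec):
    """Return the input vector x: binary components first, then real-valued ones."""
    binary = [1.0 if rec[name] else 0.0 for name in BINARY_FIELDS]
    real = [float(rec[name]) for name in REAL_FIELDS]
    return binary + real


x = encode(record)
print(x)
```

In practice the real-valued components would usually be rescaled to comparable ranges before training, but that step is omitted here.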
Many successful applications of neural networks in medical diagnostics can be found in the literature (Gant, 2001). The processing and interpretation of electrocardiograms (ECG) with neural networks has been studied intensively, because the evaluation of long-term ECG recordings is a time-consuming procedure and requires automated recognition of events that occur infrequently (Silipo, 1998). In radiology, neural networks have been successfully applied to X-ray analysis in the domains of chest radiography (Chen, 2002), (Coppini, 2003), mammography (Halkiotis, 2007) and computerized tomography (Lindahl, 1997), (Gletsos, 2003), (Suzuki, 2005). The classification of ultrasound images has also been successful (Yan Sun, 2005). Neural networks have also been successfully applied to diagnose epilepsy (Walczak, 2001) and to detect seizures from EEG patterns (Alkan, 2005).
Prediction of the patient’s state can also be stated as a classification problem. On the basis of the examined data, represented by a vector x, patients are assigned to one of several categories P1, ..., Pm that correspond to different future states. For example, five categories with the following meaning can be considered: P1 can mean death, P2 deterioration of the patient’s state, P3 steady state, P4 improvement of the patient’s state and P5 recovery.
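The five-category example can be connected to the cross-entropy error function with a small Python sketch. It assumes 1-of-m (one-hot) target coding and a softmax output layer; the activation values are hypothetical and no network is trained here. For this combination the derivative of the cross-entropy error with respect to the output-layer activations is simply y - t, which is the quantity propagated backwards by backpropagation.

```python
import math

CATEGORIES = ["P1 death", "P2 deterioration", "P3 steady state",
              "P4 improvement", "P5 recovery"]


def one_hot(index, m=len(CATEGORIES)):
    """1-of-m target coding: component `index` is 1, all others are 0."""
    return [1.0 if k == index else 0.0 for k in range(m)]


def softmax(a):
    """Softmax output function (shifted by max(a) for numerical stability)."""
    shift = max(a)
    exps = [math.exp(v - shift) for v in a]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(t, y):
    """Cross-entropy error for one pattern with one-hot target t."""
    return -sum(tk * math.log(yk) for tk, yk in zip(t, y) if tk > 0.0)


# Hypothetical output-layer activations for one patient (not from the chapter).
activations = [0.1, 0.4, 2.0, 0.7, -0.3]
y = softmax(activations)  # estimated class probabilities
t = one_hot(2)            # assumed true category: P3 steady state
loss = cross_entropy(t, y)

# Gradient of the error with respect to the activations: y - t.
delta = [yk - tk for yk, tk in zip(y, t)]

predicted = CATEGORIES[y.index(max(y))]
print(predicted, loss, delta)
```

The outputs y sum to one and can be read as estimated probabilities of the categories P1, ..., P5 for the given patient.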