The Exposition of Fuzzy Decision Trees and Their Application in Biology

Malcolm J. Beynon (Cardiff University, UK) and Kirsty Park (University of Stirling, UK)
DOI: 10.4018/978-1-59904-996-0.ch020

Abstract

This chapter employs the fuzzy decision tree classification technique in a series of biology-based application problems. Because the technique operates in a fuzzy environment, its results, in the form of fuzzy ‘if .. then ..’ decision rules, are readable and therefore interpretable. The two contrasting applications considered concern the age of abalones and the lengths of torpor bouts of hibernating Greater Horseshoe bats. Emphasis is on the visual results presented, including the series of membership functions used to construct the linguistic variables representing the considered attributes and the final fuzzy decision trees constructed. The technical details presented further give readers the opportunity to employ the technique in other biological applications.

Introduction

Fuzzy set theory (FST) stands out as a general methodology that has contributed to the development of already established techniques used throughout many areas of science, including biology and medicine (recent examples include Morato et al., 2006; Mastorocostas and Theocharis, 2007). Since its introduction in Zadeh (1965), the use of FST in such development has brought with it an acknowledgement of the vagueness and ambiguity present during operation. Further, it has made it possible to interpret the structure of, and subsequent results from, data analysis in linguistically oriented terms (Grzegorzewski and Mrówka, 2005). Indeed, artificial intelligence, with respect to FST, is closely associated with the mimicry of human cognition and language (Trillas and Guadarrama, 2005).

The issue of interpretability is particularly relevant in classification-based problems, yet it is often overlooked, because the concomitant analysis is so often oriented towards the resultant classification accuracy rather than interpretability. Indeed, Breiman (2001), in an informative discussion on the cultures of statistical modelling, comes down heavily on the need to accommodate the ability to interpret results in any analysis undertaken. That discussion offers a pertinent illustrative argument, describing a medical doctor who, given experimental data and a choice between accuracy and interpretability, would choose interpretability. This issue of interpretability over accuracy is also pertinent in general biology.

This chapter considers fuzzy decision trees (FDTs), an example of an already established technique that has been developed using FST. The fundamentals of the decision tree technique, within a crisp or fuzzy environment, are concerned with the classification of objects described by a number of attributes, with concomitant decision rules derived in the constructed decision tree. The inherent structure is a consequence of the partitioning algorithm used to discern the classification impact of the attributes. An early FDT reference is attributed to Chang and Pavlidis (1977). In the area of medicine, for example, Podgorelec et al. (2002) offer a good review of decision trees, with a more recent employment of FDTs presented in Armand et al. (2007), which looked into gait deviations. Relative to this, there is a comparative dearth of their employment in a biological setting, one exception being Beynon et al. (2004a).

An important feature of FDTs is the concomitant sets of fuzzy ‘if .. then ..’ decision rules constructed, whose condition and decision parts, using concomitant attributes, can be described in linguistic terms (such as low, medium or high). The FDT approach employed here was presented in Yuan and Shaw (1995) and Wang et al. (2000), and attempts to include the cognitive uncertainties evident in the data values. This FDT approach has been used in Beynon et al. (2004b) and Beynon et al. (2004a), the latter investigating the song flight of the Sedge Warbler and expositing the relationship between characteristics of the birds, such as repertoire size and territory size, and their song flight duration.

Central to the utilisation of FDTs is the fuzzification of the considered data set, through the employment of FST-related membership functions (MFs), which further enable the linguistic representation of the attributes considered (Kecman, 2001), present also in the subsequent decision rules constructed. Alongside the exposition of FDTs in this chapter, the results from two contrasting biology-based applications are considered. The first uses a well-known data set from the UCI data repository and relates to the prediction of the age of abalones (Waugh, 1995); the second is a more contemporary application looking at the torpor bouts of hibernating Greater Horseshoe bats (Park et al., 2000).
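As an illustration of what fuzzification through membership functions can look like, the following minimal Python sketch maps a single crisp attribute value to grades of membership in three linguistic terms (low, medium, high). It is not the chapter's own construction: the piecewise linear shapes, the function names and the breakpoints 0.0, 0.5 and 1.0 are all assumptions chosen only for illustration.

```python
def left_shoulder(x, b, c):
    """Membership is 1 up to b, then falls linearly to 0 at c."""
    if x <= b:
        return 1.0
    if x >= c:
        return 0.0
    return (c - x) / (c - b)


def triangular(x, a, b, c):
    """Membership rises linearly from a to the peak b, then falls to 0 at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def right_shoulder(x, a, b):
    """Membership is 0 up to a, then rises linearly to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def fuzzify(x, lo, mid, hi):
    """Return the grades of membership of x in 'low', 'medium' and 'high'.

    lo, mid and hi are hypothetical breakpoints, not values from the chapter.
    """
    return {
        "low": left_shoulder(x, lo, mid),
        "medium": triangular(x, lo, mid, hi),
        "high": right_shoulder(x, mid, hi),
    }


# Example with assumed breakpoints on a normalised attribute scale.
grades = fuzzify(0.25, lo=0.0, mid=0.5, hi=1.0)
print(grades)
```

With these assumed breakpoints, a value midway between the low and medium peaks belongs partly to both terms, which is the kind of vagueness a crisp partition of the attribute would discard.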
