1. Introduction
The advantage of using a map to visually represent the complex information present in spatial data is that it draws on people’s innate powers of discrimination and interpretation (i.e. the ability to understand colors, patterns and spatial relevance). However, when preparing thematic maps and other materials using geographic information systems, there is a risk of overlooking characteristics of the original data, or of causing misjudgment, if inadequate attention is paid to the representation method. In other words, it is necessary to consider the method of representation, i.e. “What is the best way to represent (map) data?” When interpreting analysis results visually, it is necessary to consider not only the uncertainty in the data, but also the uncertainty that arises in the processes of data processing and visualization (Goodchild, Guoqing, & Shiren, 1992).
When visualizing spatial data for which attribute values are quantitatively defined, there is always a need for classification. More specifically, the general method is to assign all data within a certain range to the same class, and to indicate that class with a single color. Existing geographic information systems are equipped with a number of different methods for automatically performing classification (Umesh, 1988), but the displayed thematic maps vary greatly in appearance depending on the method used. For example, the spatial distributions of “the number of industries” (used as an illustrative example in this research, Figure 5) appear completely different under different methods, to the extent that a viewer would not think that the same data is being used. Existing classification methods should be used selectively to suit the purpose of analysis or the properties of the spatial data to be displayed, but at the early stage of analysis, or for general end-users, it is necessary to have a method that is simpler, is applicable to any type of spatial data, and displays the properties of the data without bias.
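The dependence on the classification method can be illustrated with a minimal sketch. It compares two widely used schemes, equal-interval and quantile classification, on a small skewed sample. The attribute values and the helper names (`equal_interval_breaks`, `quantile_breaks`, `classify`) are hypothetical and are not taken from this research; the sketch also assumes that the values are not all identical.

```python
from bisect import bisect_left
from statistics import quantiles


def equal_interval_breaks(values, k):
    """Upper bounds of k classes that each span the same value range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)] + [hi]


def quantile_breaks(values, k):
    """Upper bounds of k classes that each hold roughly the same number of values."""
    return quantiles(values, n=k, method="inclusive") + [max(values)]


def classify(values, breaks):
    """Assign each value the index of the first class whose upper bound is >= the value."""
    return [bisect_left(breaks, v) for v in values]


# Hypothetical, strongly skewed attribute values (e.g. counts per district).
counts = [1, 2, 2, 3, 3, 4, 5, 8, 40, 100]

equal = classify(counts, equal_interval_breaks(counts, 4))
quant = classify(counts, quantile_breaks(counts, 4))

print(equal)  # [0, 0, 0, 0, 0, 0, 0, 0, 1, 3]
print(quant)  # [0, 0, 0, 1, 1, 2, 2, 3, 3, 3]
```

With equal intervals, eight of the ten values fall into the lowest class and one class stays empty, whereas the quantile scheme spreads the same values over all four classes. Colored on a map, the two results would look like different data sets.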
In principle, the detailed distribution characteristics of the original data can be represented faithfully if the number of classes is increased. However, if the number of classes is too large, the legend becomes cumbersome, complicated and difficult to understand. Conversely, if the range of each class is too large, there is a risk of losing visibility of the spatial distribution characteristics in the original data, such as components that vary in small increments and information relating to peaks and pits. In other words, it is necessary to consider the following two aspects of classification problems (MacEachren, 1994):
For problem (a), an algorithm has been developed for effectively classifying data when the number of classes is not known in advance (Umesh, 1988). Other methods, such as Jenks’ optimization (Jenks, 1967), have been proposed for problem (b). Jenks’ method determines boundary values so as to minimize the average variance within each class. Various other methods have been devised and incorporated into existing geographic information systems. Andrienko, Andrienko, and Savinov (2001) developed a set of classification tools that facilitate looking at data from various viewpoints and thereby investigating different aspects of the data. To balance these requirements in search of an acceptable compromise, they employed interactive tools for classification.
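The idea behind variance-minimizing boundaries can be sketched as follows. This is not Jenks’ original procedure: it is an exact dynamic-programming search over sorted values, and it assumes the objective is the total within-class sum of squared deviations, a common formulation of the criterion. The function names `ssd` and `optimal_breaks` and the sample values are hypothetical.

```python
def ssd(prefix, prefix_sq, i, j):
    """Sum of squared deviations from the mean for the sorted slice xs[i:j]."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def optimal_breaks(values, k):
    """Upper bound of each of k classes minimizing total within-class SSD."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")

    # Prefix sums allow the SSD of any slice to be computed in constant time.
    prefix, prefix_sq = [0.0], [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    inf = float("inf")
    # cost[c][j]: lowest total SSD when xs[:j] is split into c classes.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    # split[c][j]: start index of the last class in that best split.
    split = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0

    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(prefix, prefix_sq, i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    split[c][j] = i

    # Walk back from the last class to recover each class's upper bound.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        bounds.append(xs[j - 1])
        j = split[c][j]
    return bounds[::-1]


# Hypothetical values with three visible clusters.
sample = [1, 2, 3, 10, 11, 12, 50, 51]
print(optimal_breaks(sample, 3))  # [3, 12, 51]
```

The boundaries fall in the gaps between clusters rather than at fixed value intervals or fixed counts. The search is cubic in the number of values for a fixed number of classes, so it is intended only as an illustration on small samples.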