Classification and Space Cluster for Visualizing GeoInformation

Toshihiro Osaragi (Tokyo Institute of Technology, Japan)
Copyright: © 2019 |Pages: 20
DOI: 10.4018/IJDWM.2019010102

When numerical values of spatial data are represented on a map, they must be classified so that they can be understood visually as clearly as possible. Inevitably, some loss of information from the original data occurs in the process of this classification. A great loss of information might lead to a misunderstanding of the nature of the original data. At the same time, when we seek to understand the spatial distribution of attribute values, forming spatial clusters is regarded as an effective means: within each cluster, values can be regarded as statistically equivalent and are distributed continuously in the same patches. In this study, a classification method for organizing spatial data is proposed in which the loss of information is minimized. Also, a spatial clustering method based on Akaike's Information Criterion is proposed. Some numerical examples of their applications are shown using actual spatial data for the Tokyo metropolitan area.

1. Introduction

The advantage of using a map to visually represent the complex information present in spatial data is that it draws on people’s innate powers of discrimination and interpretation (i.e. the ability to understand colors, patterns and spatial relevance). However, when preparing thematic maps and other materials using geographic information systems, there is a risk of overlooking characteristics of the original data, or causing misjudgment, if inadequate attention is paid to the representation method. In other words, it is necessary to consider the method of representation, that is, “What is the best way to represent (map) data?” When analysis results are interpreted visually, it is necessary to consider not only the uncertainty in the data, but also the uncertainty that arises in the processes of data processing and visualization (Goodchild, Guoqing, & Shiren, 1992).

When visualizing spatial data for which attribute values are quantitatively defined, there is always a need for classification. More specifically, the general method is to assign data within a certain range to the same class and indicate that class with a single color. Existing geographic information systems are equipped with a number of different methods for automatically performing classification (Umesh, 1988), but the displayed thematic maps vary greatly in appearance depending on the method used. For example, the spatial distributions of “the number of industries” (used as an illustrative example in this research, Figure 5) appear completely different, to the extent that a viewer would not think that the same data are being used. Existing classification methods should be used selectively to suit the purpose of analysis or the properties of the spatial data to be displayed, but at the early stage of analysis, or for general end-users, it is necessary to have a method that is simpler, applicable to any type of spatial data, and able to display the properties of the data without bias.
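The dependence of a map's appearance on the classification method can be sketched in a few lines. The example below is a minimal illustration, not the method proposed in this article: it compares two commonly available schemes, equal-interval and quantile classification, on a small set of made-up skewed values. The array `counts` and all function names are assumptions introduced here for illustration.

```python
import numpy as np

def equal_interval_breaks(values, k):
    """Upper bounds of k classes of equal width spanning [min, max]."""
    lo, hi = float(np.min(values)), float(np.max(values))
    return np.linspace(lo, hi, k + 1)[1:]

def quantile_breaks(values, k):
    """Upper bounds of k classes holding roughly equal numbers of values."""
    probs = np.linspace(0, 1, k + 1)[1:]
    return np.quantile(values, probs)

def assign_classes(values, breaks):
    """Class index (0..k-1) of each value, given upper class bounds."""
    # side="left" places a value equal to a bound in the class that bound closes
    return np.searchsorted(breaks, values, side="left")

# Hypothetical, strongly skewed attribute values (not data from the article)
counts = np.array([1, 2, 2, 3, 3, 4, 5, 6, 8, 10, 15, 40, 120, 400])

for name, make_breaks in [("equal interval", equal_interval_breaks),
                          ("quantile", quantile_breaks)]:
    breaks = make_breaks(counts, 4)
    classes = assign_classes(counts, breaks)
    # Number of values that fall into each of the 4 classes
    print(name, np.bincount(classes, minlength=4))
```

With skewed values such as these, equal-interval classes place nearly every value in the lowest class, while quantile classes spread the values more evenly, so the two resulting maps would look very different even though the data are identical.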

Basically, the detailed distribution characteristics in the original data can be represented faithfully if the number of classes is increased. However, if the number of classes is too large, the legend becomes cumbersome, complicated and difficult to understand. Conversely, if the range of each class is too large, there is a risk of losing visibility of the spatial distribution characteristics in the original data, such as components that vary in small increments and information relating to peaks and pits. In other words, it is necessary to consider the following two aspects of classification problems (MacEachren, 1994):

  • (a) What is the best number of classes?

  • (b) Where should the class boundary values be set?

For problem (a), an algorithm has been developed for effectively classifying data for which the number of classes is not known in advance (Umesh, 1988). Other methods, such as Jenks’ optimization (Jenks, 1967), have been proposed for problem (b). Jenks’ method determines boundary values so as to minimize the average variance within each class. Various other methods have been devised and incorporated into existing geographic information systems. Andrienko, Andrienko, and Savinov (2001) developed a set of tools for classification that facilitate looking at data from various viewpoints and thereby investigating different aspects of the data. To balance these requirements in search of an acceptable compromise solution, they employed interactive tools for classification.
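A variance-minimizing choice of boundary values for a fixed number of classes can be sketched with dynamic programming. The code below is a minimal sketch under stated assumptions, not Jenks' original procedure or the method proposed in this article: it minimizes the total within-class sum of squared deviations (SSD), a common formulation of the criterion that may differ in detail from the "average variance" wording above. The function name `jenks_breaks` and the sample values are hypothetical.

```python
import numpy as np

def jenks_breaks(values, k):
    """Split sorted values into k contiguous classes that minimize the total
    within-class sum of squared deviations (SSD).
    Returns (list of class upper bounds, total SSD)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")
    # Prefix sums give the SSD of any slice x[i:j] in constant time
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # cost[c][j]: smallest SSD when the first j values form c classes
    cost = np.full((k + 1, n + 1), np.inf)
    cut = np.zeros((k + 1, n + 1), dtype=int)
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):  # last class is x[i:j]
                cand = cost[c - 1][i] + ssd(i, j)
                if cand < cost[c][j]:
                    cost[c][j] = cand
                    cut[c][j] = i
    # Walk back through the stored cut points to recover class upper bounds
    bounds, j = [], n
    for c in range(k, 0, -1):
        bounds.append(float(x[j - 1]))
        j = cut[c][j]
    return bounds[::-1], float(cost[k][n])

# Hypothetical values with three visible groups
sample = [1, 2, 3, 10, 11, 12, 50, 51, 52]
print(jenks_breaks(sample, 3))
```

Running the function for increasing `k` also illustrates the trade-off behind problem (a): the total SSD can only stay the same or decrease as classes are added, so the criterion alone does not identify a best number of classes.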
