# Handling Imprecise Data in Geographic Databases

Cyril de Runz (LIASD, University of Paris 8, France & CRestic, University of Reims, France), Herman Akdag (LIASD, University of Paris 8, France) and Asma Zoghlami (LIASD, University of Paris 8, France)
DOI: 10.4018/978-1-4666-5888-2.ch172

## Background

### Imprecision in Geographic Data

Although GIS are in common use, accounting for imprecision at every stage, from the design of the information system to the exploitation of the data, remains an open issue and is the subject of this article. There is imprecision whenever the exact value of the truth status of a proposition of interest is not established uniquely, i.e., whenever its truth status is equivocal (Smets, 1995).

Indeed, because every dataset is built by observing and modeling reality, it carries some imperfections. Data integration introduces further imperfections. In spatial science, a principal issue is how to deal with boundaries: it is hard to delineate frontiers precisely and accurately (Fisher, 1999).

Therefore, handling spatially imperfect data is essential. There are various methods for dealing with imperfection. The main mathematical theories are probability theory, fuzzy set and possibility theory (Zadeh, 1965), rough set theory, and the theory of evidence.

According to the literature (Fisher, 1999), fuzzy set theory is a good choice for dealing with imprecision.

### Fuzzy Set Theory: Main Principles

Imprecision should be considered in the modeling of the information. Because the sorites paradox shows that probabilities are ill-suited to imprecision, Zadeh (1965) introduced fuzzy set theory. Fuzzy set theory defines the notion of partial, graded membership of a value in a class. A fuzzy set A is characterized by a membership function µA taking values in [0, 1]: each domain value x is assigned a membership degree µA(x) in [0, 1]. Therefore, concepts like young, old, etc. may be easily modeled by fuzzy sets.
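As an illustration of a membership function, the concept "young" can be sketched as a piecewise-linear function of age. The function name `mu_young` and the thresholds 25 and 40 are assumptions chosen for this example only; they do not come from the chapter.

```python
def mu_young(age: float) -> float:
    """Hypothetical membership degree of `age` in the fuzzy set 'young'.

    Assumed shape: fully young up to 25, not young at all from 40,
    and a linear decrease in between.
    """
    if age <= 25:
        return 1.0
    if age >= 40:
        return 0.0
    return (40 - age) / 15


for age in (20, 32.5, 45):
    print(age, mu_young(age))
# 20 -> 1.0, 32.5 -> 0.5, 45 -> 0.0
```

The intermediate value 0.5 at age 32.5 is what a crisp (yes/no) class cannot express: the person is neither fully inside nor fully outside the class.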

An α-cut Aα, for any α > 0, is the set of domain values x whose membership degree is greater than or equal to α (µA(x) ≥ α). By convention, A0 is the set of x such that µA(x) > 0.
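The definition of an α-cut can be sketched on a small discrete domain. The helper `alpha_cut` and the sample membership degrees are hypothetical, introduced only to make the two cases (α > 0 and the 0-cut convention) concrete.

```python
from typing import Dict, Set


def alpha_cut(membership: Dict[int, float], alpha: float) -> Set[int]:
    """Return the alpha-cut of a fuzzy set given as {value: degree}.

    For alpha > 0 the cut keeps values with degree >= alpha.
    For alpha == 0 the convention is a strict inequality (degree > 0).
    """
    if alpha == 0:
        return {x for x, mu in membership.items() if mu > 0}
    return {x for x, mu in membership.items() if mu >= alpha}


# Hypothetical fuzzy set on the domain {0, ..., 5}
A = {0: 0.0, 1: 0.3, 2: 0.7, 3: 1.0, 4: 0.6, 5: 0.0}

print(sorted(alpha_cut(A, 0)))    # [1, 2, 3, 4]
print(sorted(alpha_cut(A, 0.5)))  # [2, 3, 4]
print(sorted(alpha_cut(A, 1.0)))  # [3]
```

The cuts are nested: raising α can only shrink the set.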

A fuzzy set A is connected if, and only if, for all α in [0, 1], Aα is connected. Aα is connected if, for all nonempty sets B and C whose union is Aα, there exists at least one point of B adherent to C or one point of C adherent to B. On R, Aα is connected if, and only if, it is an interval. In other words, a fuzzy set A is connected if, and only if, no Aα, for α in [0, 1], is made up of separate pieces.
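On R, the interval criterion can be approximated numerically. The sketch below assumes the membership function has been sampled on an ordered one-dimensional grid, and treats "Aα is an interval" as "the indices in the cut form one contiguous run". This is a discretized stand-in for the topological definition, and `is_connected` is a hypothetical helper name.

```python
from typing import Sequence


def is_connected(degrees: Sequence[float]) -> bool:
    """Check connectedness of a fuzzy set sampled on an ordered 1-D grid.

    Only the distinct positive degrees need to be tested as alpha levels:
    the 0-cut (degree > 0) coincides with the cut at the smallest
    positive degree, and cuts only change at those levels.
    """
    levels = sorted({d for d in degrees if d > 0})
    for alpha in levels:
        idx = [i for i, d in enumerate(degrees) if d >= alpha]
        # A contiguous run of indices has length last - first + 1.
        if idx[-1] - idx[0] + 1 != len(idx):
            return False
    return True


print(is_connected([0.0, 0.3, 0.7, 1.0, 0.6, 0.0]))  # True
print(is_connected([0.0, 0.8, 0.2, 0.9, 0.0]))       # False
```

In the second example the cut at α = 0.8 keeps the samples at positions 1 and 3 but not position 2, so it splits into two separate pieces.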

Connected α-cuts allow the different values of imprecise data to be stored in the form of a multivalued set. They make it possible to draw the boundaries between a very low confidence membership (the 0-cut), a rather low confidence membership, a moderately low confidence membership, a low confidence membership, etc., which may also be interpreted as a range of values between almost impossible and very possible (Figure 1).

Figure 1. Interpretation examples on connected α-cuts
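One way to picture the storage of connected α-cuts as a multivalued set is a mapping from α levels to nested intervals. This is only a sketch under stated assumptions: the levels 0.0, 0.5 and 1.0, the interval bounds, and the helper `highest_level` are all hypothetical, and the 0-cut is stored as a closed interval for simplicity.

```python
from typing import Dict, Optional, Tuple

# Hypothetical imprecise value stored as nested intervals, one per alpha level.
cuts: Dict[float, Tuple[float, float]] = {
    0.0: (100.0, 400.0),  # very low confidence (the 0-cut)
    0.5: (150.0, 300.0),  # intermediate confidence
    1.0: (200.0, 250.0),  # highest confidence
}


def highest_level(cuts: Dict[float, Tuple[float, float]], x: float) -> Optional[float]:
    """Return the highest stored alpha whose interval contains x.

    Returns None when x lies outside every stored cut.
    """
    best: Optional[float] = None
    for alpha, (lo, hi) in cuts.items():
        if lo <= x <= hi and (best is None or alpha > best):
            best = alpha
    return best


print(highest_level(cuts, 220.0))  # 1.0
print(highest_level(cuts, 160.0))  # 0.5
print(highest_level(cuts, 120.0))  # 0.0
print(highest_level(cuts, 50.0))   # None
```

Because only a few levels are stored, the returned value is a lower bound on the membership degree of x rather than the degree itself.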

## Key Terms in this Chapter

Data Model: A description of the objects represented by a computer system together with their properties and relationships.

Imprecision: There is imprecision whenever the exact value of the truth status of a proposition of interest is not established uniquely.

Geographical Information System: A system built for the capture, the storage, the handling, the analysis and the visualization of geographical data.

Modelling Language: An artificial language that can be used to express information or knowledge or systems in a structure that is defined by a consistent set of rules.

Geographical Data: Data describing an object with a spatial reference on the Earth's surface.

Information System: A system designed to capture, store, manipulate and manage data.

Database: A structured collection of data.
