Spatial data management has been an active area of intensive research for more than two decades. In order to support spatial objects in a database system several important issues must be taken into account such as: spatial data models, indexing mechanisms and efficient query processing. A spatial database system (SDBS) is a database system that offers spatial data types in its data model and query language and supports spatial data types in its implementation, providing at least spatial indexing and efficient spatial query processing (Güting, 1994). The main reason that has caused the active study of spatial database management systems (SDBMS) comes from the needs of the existing applications such as geographical information systems (GIS), computer-aided design (CAD), very large scale integration design (VLSI), multimedia information systems (MIS), data warehousing, multi-criteria decision making, location-based services, etc.
Background In Spatial Queries And Processing
From the query processing point of view, the following three properties characterize the differences between spatial and relational databases (Brinkhoff et al., 1993): (1) unlike relational databases, spatial databases do not have a fixed set of operators that serve as building blocks for query evaluation; (2) spatial databases deal with extremely large volumes of complex objects, which have spatial extensions and cannot be sorted in one dimension; (3) computationally expensive algorithms are required to test the spatial operators, and the assumption that I/O costs dominate CPU costs is no longer valid.
Key Terms in this Chapter
Buffer Query: This spatial query involves two spatial datasets and a distance threshold d. The answer is a set of pairs of spatial objects from the two input datasets that are within distance d from each other.
Spatial Database System (SDBS): A spatial database system is a database system that offers spatial data types in its data model and query language and supports spatial data types in its implementation, providing at least spatial indexing and efficient spatial query processing.
Spatial Query: It is a set of spatial conditions characterized by spatial operators that form the basis for the retrieval of spatial information from a spatial database system.
Iceberg Distance Join: This spatial query involves two spatial datasets, a distance threshold d and a cardinality threshold K (K=1). The answer is a set of pairs of objects from the two input datasets that are within distance d from each other, provided that the first object appears at least K times in the join result.
Spatial Data Types: Spatial data types provide a fundamental abstraction for modeling the structure of geometric entities in space (geometry) as well as their relationships (topology), e.g. points, lines, polygons, regions, etc. A spatial object is an object with at least one attribute of a spatial data type.
Spatial Operators: Spatial operators represent the spatial relationships between spatial objects. The most representative spatial relationships are: (1) Topological relationships, such as adjacent, inside, disjoint, etc. are invariant under topological transformations like translation, scaling, and rotation; (2) Direction relationships, for example, above, below, north_of, southwest_of, etc.; and (3) Metric relationships, e.g. distance < 100.
Top-K Most Influential Site Query: Given a set of sites S, a set of weighted objects O, a spatial region Q and an integer K (K=1), the top-K most influential site query retrieves K sites in Q with the largest influence. Where, the influence of a site s?S is the total weight of objects in O that have s as the nearest site.
Spatial Query Processing: It focuses on extracting information from a large amount of spatial data without actually changing the spatial database. It is different to the concept of query optimization that focuses in finding the best query evaluation plan that minimizes the most relevant performance measure (e.g. CPU, I/O, etc.).
K Closest Pairs Query: This spatial query involves two spatial datasets and a cardinality threshold K (K=1). It discovers the K distinct pairs of objects from the two input datasets that have the K smallest distances between them.
K Nearest Neighbors Join: This spatial query involves two spatial datasets and a cardinality threshold K (K=1). The answer is a set of pairs from the two input datasets that includes, for each of the spatial objects of the first dataset, the pairs formed with each of its K nearest neighbors in the second dataset.
Filter-Refine Paradigm: Algorithms that follow this paradigm are two-step algorithms. Filter step: an approximation of each spatial object is used to produce a set of candidates (and, possibly, a set of actual answers), which is a superset of the answer set consisting of actual answers and false hits. Refinement step: each candidate from the filter step is then examined with respect to its exact geometry in order to produce the answer set by eliminating false hits.