Introduction
Many large research efforts are currently focused on the problem known as Big Data (Boyd & Crawford, 2012; Michael & Miller, 2013). The central issue is how to make effective use of vast amounts of heterogeneous information from a variety of sources (Shekar, et al., 2012). There is currently great emphasis on geophysical data that has a spatial basis or spatial aspects (Overpeck, et al., 2011). Advances in instrumentation and sensors have hugely increased the volume, velocity and variety of remotely sensed data. For example, the imagery data archived at the NASA EOSDIS (Earth Observing System Data and Information System) exceeds 3 PB (petabytes), with 5 TB (terabytes) of new data generated per day. Data mining techniques are critical to utilizing such volumes of data effectively (Vatsavi, et al., 2012). One factor that must be considered in particular is how to deal with the uncertainty inherent in the huge amount of spatial data held in databases.
Data mining, or knowledge discovery (Witten, Frank & Hall 2011; Kantardzic, 2011), generally refers to a variety of techniques that have been developed in the fields of databases, machine learning (Alpaydin 2004) and pattern recognition (Han and Kamber 2006). The intent is to uncover useful patterns and associations in large databases. For complex data such as that found in spatial databases (Shekar & Chawla 2003), the problem of knowledge discovery is more involved (Lu et al., 1993; Miller & Han 2009).
Spatial data has traditionally been the domain of geography, with various forms of maps as the standard representation. With the computerization of maps, geographic information systems (GIS) have come to the fore, with spatial databases storing the underlying point, line and area structures needed to support GIS (Longley et al., 2010). A major difference between data mining in ordinary relational databases (Elmasri & Navathe 2010) and in spatial databases is that the attributes of the neighbors of an object of interest may influence that object and therefore have to be considered as well. The explicit location and extension of spatial objects define implicit relations of spatial neighborhood (such as topological, distance and direction relations), which are used by spatial data mining algorithms (Ester et al., 2000).
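To make the idea of implicit neighborhood relations concrete, the following is a minimal sketch in Python, not a method taken from the cited works. It assumes objects reduced to points in a planar coordinate system; the object names, coordinates, the four-way direction scheme and the helper names `distance`, `direction` and `neighbors` are all hypothetical. Topological relations are omitted because they require line and area geometry.

```python
import math

# Hypothetical point objects: name -> (x, y) in arbitrary planar units.
objects = {
    "school": (0.0, 0.0),
    "park": (3.0, 4.0),
    "river": (0.0, 10.0),
    "mall": (-6.0, 0.0),
}

def distance(a, b):
    """Euclidean distance between two named objects."""
    (ax, ay), (bx, by) = objects[a], objects[b]
    return math.hypot(bx - ax, by - ay)

def direction(a, b):
    """Coarse direction of b as seen from a (north = +y, east = +x)."""
    (ax, ay), (bx, by) = objects[a], objects[b]
    dx, dy = bx - ax, by - ay
    if abs(dx) >= abs(dy):
        return "east" if dx > 0 else "west"
    return "north" if dy > 0 else "south"

def neighbors(a, max_dist):
    """Objects within max_dist of a, mapped to (distance, direction)."""
    return {
        b: (distance(a, b), direction(a, b))
        for b in objects
        if b != a and distance(a, b) <= max_dist
    }

# Relations derived from stored locations only; none are stored explicitly.
result = neighbors("school", 6.0)
print(result)
```

None of these relations is stored as an attribute. Each is computed from the locations alone, which is why a spatial mining algorithm has to derive them before it can relate an object to the attributes of its neighbors.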
Additionally, when we wish to consider vagueness or uncertainty in the spatial data mining process (Burrough & Frank 1996; Zhang & Goodchild 2002), a further level of difficulty is added. In this chapter we describe one of the most common data mining approaches, the discovery of association rules, as applied to spatial data, and we consider uncertainty in the extracted rules as represented by both fuzzy set and rough set techniques.
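As a rough illustration of what uncertainty in an association rule can mean, the sketch below computes support and confidence for a rule over fuzzy memberships. It uses one common formulation (minimum as the fuzzy conjunction, averaged over all records), which is an assumption here and may differ from the definitions used later in the chapter. The attribute names, membership values and function names are hypothetical, and rough set techniques are not illustrated.

```python
# Hypothetical fuzzy membership degrees (0..1) for four spatial objects.
records = [
    {"near_river": 1.0, "high_moisture": 0.75},
    {"near_river": 0.5, "high_moisture": 0.5},
    {"near_river": 0.25, "high_moisture": 0.75},
    {"near_river": 0.0, "high_moisture": 0.25},
]

def fuzzy_support(items):
    """Average, over all records, of the minimum membership across items."""
    return sum(min(r[i] for i in items) for r in records) / len(records)

def fuzzy_confidence(antecedent, consequent):
    """Support of the whole rule divided by support of its antecedent."""
    return fuzzy_support(antecedent + consequent) / fuzzy_support(antecedent)

# Rule: near_river -> high_moisture
support = fuzzy_support(["near_river", "high_moisture"])
confidence = fuzzy_confidence(["near_river"], ["high_moisture"])
print(support, confidence)
```

With crisp memberships of exactly 0 or 1, these definitions reduce to the ordinary support and confidence of a classical association rule.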