Data Hierarchies for Generalization of Imprecise Data

Frederick Petry, Ronald R. Yager
Copyright: © 2023 |Pages: 14
DOI: 10.4018/978-1-7998-9220-5.ch117

Abstract

Issues related to managing imprecise data in areas as diverse as spatial and environmental data, forensic evidence, and economics must be dealt with for effective decision making. To make use of such information, the authors settle on how the various pieces of data can be used to make a decision or take an action. This involves some sort of summarization and generalization of the pieces of data as to what conclusions they can support. To address these issues, the use of fuzzy, interval-valued, and intuitionistic concept hierarchies for generalization can extend previous approaches to deal with the uncertainty of data. The characterization of these hierarchies indicates that set decompositions are needed to represent their uncertainty, and a number of approaches to characterizing such decompositions are needed for the resolution of the evidence. To characterize these decompositions, granularity measures and overlap measures must be developed, and examples of each are discussed. Additionally, information measures can be introduced for use in these evaluations.

Introduction

Issues related to managing imprecise data in areas as diverse as spatial and environmental data, forensic evidence, and economics must be dealt with for effective decision making. In order to make use of such information, we have to settle on how the various pieces of data can be used to make a decision or take an action. This can involve some sort of summarization and generalization of the pieces of data as to what conclusions they can support (Yager, 1991; Kacprzyk, 1999; Dubois & Prade, 2000). A currently emerging issue is the management of uncertain information that arises from multiple sources and takes many forms in the everyday activities and decisions of humans. This can range from data and information obtained by sensors to subjective information from individuals or analysts. Today, ever more massive amounts of multi-source heterogeneous data and information are prevalent, for example in systems that manage Big Data (Miller & Miller, 2013; Richards & Rowe, 1999). However, while effective decision making should be able to make use of all the available, relevant information about such combined uncertainty, assessment of the value of a generalization result is critical. One possible approach for such a generalization process can be found in the use of concept hierarchical generalization (Raschia & Mouaddib, 2002; Yager & Petry, 2006). In previous research, the problem of evidence resolution was studied for crisp concept hierarchies (Petry & Yager, 2008).

One example of where data generalization is needed for decision making is criminal forensics. Federal Bureau of Investigation (FBI) researchers have made use of GIS data for forensic evidence evaluation in criminal cases. Spatial distributions of soils (Pye, 2007), pollens (Brown, Smith & Elmhurst, 2002), and other trace evidence are represented by individual layers, with uncertainty as to the exact spatial areas of such information. These layers are then overlaid and generalized, and the areas are aggregated to focus on possible sites of interest for further investigation of a crime.

For example, in the case of a suspicious death, depending on the environmental conditions, a medical examiner may provide a likely range for the time of death but allow a wider possible time interval. The overlap between these time-of-death intervals and a temporal interval when a potential suspect might have been in the area of the murder could therefore be crucial in an investigation. Also, forensic anthropology is concerned with evaluations of skeletal age-at-death (Hoppa & Vaupel, 2002) and must deal with the uncertainties of missing remains and weathering effects to provide estimates of possible temporal age intervals.
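The time-of-death comparison can be illustrated with a minimal sketch, assuming each estimate is a closed numeric interval (here, hours on a shared clock). The `Interval` class, the `overlap` function, and all the numbers are hypothetical and chosen only for illustration; they are not taken from the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed time interval, in hours on a shared reference clock (assumed unit)."""
    start: float
    end: float


def overlap(a: Interval, b: Interval) -> float:
    """Return the length of the intersection of two intervals, or 0.0 if disjoint."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


# Hypothetical values for illustration only.
likely_death = Interval(22.0, 24.0)     # examiner's likely range
possible_death = Interval(20.0, 26.0)   # wider range the examiner still allows
suspect_present = Interval(19.0, 21.5)  # when the suspect may have been nearby

likely_overlap = overlap(likely_death, suspect_present)
possible_overlap = overlap(possible_death, suspect_present)

print(f"overlap with likely range:   {likely_overlap} h")
print(f"overlap with possible range: {possible_overlap} h")
```

With these made-up numbers, the suspect's interval misses the likely range but intersects the possible range, which is the kind of distinction that a crisp single interval would hide.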

To address these issues, the use of fuzzy, interval-valued, and intuitionistic concept hierarchies for generalization can extend previous approaches to deal with the uncertainty of data. The characterization of these hierarchies indicates that set decompositions are needed to represent their uncertainty, and a number of approaches to characterizing such decompositions are needed for the resolution of the evidence. To characterize these decompositions, granularity measures and overlap measures must be developed, and examples of each are discussed. Additionally, information measures can be introduced for use in these evaluations.

Key Terms in this Chapter

Intuitionistic Fuzzy Set Theory: Intuitionistic fuzzy set theory extends ordinary fuzzy set theory by allowing both positive and negative memberships to be specified.

Rough Sets: Rough set theory is a technique for dealing with uncertainty and for identifying cause-effect relationships. It is based on a partitioning of some domain into equivalence classes and the defining of lower and upper approximation regions based on this partitioning to denote certain and possible inclusion in the rough set.

Decision Making: The process of making choices among alternatives based on sources of information or intelligence.

Forensics: Forensic scientists collect and analyze scientific evidence during the course of an investigation. Interpretation of such data by investigators is used in solving crimes.

Fuzzy Set Theory: The concept of fuzzy sets was introduced by Lotfi Zadeh. In ordinary sets, a data value either belongs or does not belong to the set. Fuzzy set theory, however, allows a gradual assessment of the membership of data values in a set, described by a membership function. Whereas elements can only belong or not belong to a regular set, elements of a fuzzy set can belong to it to a certain degree, with zero indicating not an element, one indicating complete membership, and values between zero and one indicating partial or uncertain membership in the set. Fuzzy set theory has been used in a wide range of applications in which information is incomplete or imprecise.

Partitions: A partition of a set is a grouping of its elements into nonempty subsets, in such a way that every element is included in exactly one subset.

Dempster-Shafer Theory: Dempster-Shafer theory is a well-known approach to modeling uncertainty that provides a representation of non-specific forms of uncertainty. A Dempster-Shafer belief structure consists of non-empty crisp subsets of the data, where a probability is given for each subset. An important difference from ordinary probability is that the probabilities are assigned to subsets rather than to individual elements, so the belief in a set and the belief in its complement do not have to sum to one.

Hesitation: In an intuitionistic fuzzy set, hesitation is the amount of uncertainty that remains indeterminate when the membership and non-membership values of an element sum to less than one.

Concept Hierarchies: A way to organize concepts defined in a knowledge domain. A concept hierarchy can be a collection of objects, events, or other items with common properties arranged in a multilevel structure.
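Several of the key terms above can be made concrete with a short sketch. It assumes the standard textbook definitions: hesitation as one minus the sum of the membership and non-membership degrees, a partition as non-empty, non-overlapping blocks that cover the domain, and rough-set lower and upper approximations built from those blocks. The function names and the sample data are hypothetical and are not taken from the chapter.

```python
def hesitation(mu: float, nu: float) -> float:
    """Hesitation of an intuitionistic fuzzy element: 1 - membership - non-membership."""
    if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0):
        raise ValueError("need 0 <= mu, nu <= 1 and mu + nu <= 1")
    return 1.0 - mu - nu


def is_partition(blocks, universe) -> bool:
    """True if the blocks are non-empty, pairwise disjoint, and cover the universe."""
    blocks = [set(b) for b in blocks]
    if any(not b for b in blocks):
        return False
    covered = set().union(*blocks)
    # Equal total size and union size means no element appears in two blocks.
    return covered == set(universe) and sum(len(b) for b in blocks) == len(covered)


def rough_approximations(blocks, target):
    """Return (lower, upper) approximations of target with respect to the blocks."""
    target = set(target)
    lower, upper = set(), set()
    for block in map(set, blocks):
        if block <= target:   # block certainly inside the target
            lower |= block
        if block & target:    # block possibly inside the target
            upper |= block
    return lower, upper


# Hypothetical data for illustration only.
universe = {"a", "b", "c", "d", "e"}
blocks = [{"a", "b"}, {"c"}, {"d", "e"}]
lower, upper = rough_approximations(blocks, {"a", "b", "d"})

print(is_partition(blocks, universe))
print(sorted(lower), sorted(upper))
print(hesitation(0.5, 0.25))
```

In this example, the lower approximation holds the elements that certainly belong to the target set and the upper approximation holds those that possibly belong, which matches the "certain and possible inclusion" wording of the rough set definition.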
