It is well known that in decision making under uncertainty, while we are guided by a general (and abstract) theory of probability and of statistical inference, each specific type of observed data requires its own analysis. Thus, while textbook techniques treat precisely observed data in multivariate analysis, there are many open research problems when data are censored (e.g., in medical or bio-statistics), missing, or partially observed (e.g., in bioinformatics). Data can be imprecise due to various reasons, for example, due to fuzziness of linguistic data. Imprecise observed data are usually called coarse data. In this chapter, we consider coarse data which are both random and fuzzy. Fuzziness is a form of imprecision often encountered in perception-based information. In order to develop statistical reference procedures based on such data, we need to model random fuzzy data as bona fide random elements, that is, we need to place random fuzzy data completely within the rigorous theory of probability. This chapter presents the most general framework for random fuzzy data, namely the framework of random fuzzy sets. We also describe several applications of this framework.
From Multivariate Statistical Analysis To Random Sets
What is a random set? An intuitive meaning. What is a random set? Crudely speaking, a random number means that we have different numbers with different probabilities; a random vector means that we have different vectors with different probabilities; and similarly, a random set means that we have different sets with different probabilities.
How can we describe this intuitive idea in precise terms? To provide such a formalization, let us recall how probabilities and random vectors are usually defined.
How probabilities are usually defined. To describe probabilities, in general, we must have a set of possible situations , and we must be able to describe the probability P of different properties of such situations. In mathematical terms, a property can be characterized by the set of all the situations which satisfy this property. Thus, we must assign to sets , the probability value .
According to the intuitive meaning of probability (e.g., as frequency), if we have two disjoint sets and , then we must have . Similarly, if we have countably many mutually disjoint sets , we must have .
A mapping which satisfies this property is called -additive.
It is known that even in the simplest situations, for example, when we randomly select a number from the interval , it is not possible to have a -additive function which would be defined on all subsets of . Thus, we must restrict ourselves to a class of subsets of . Since subsets represent properties, a restriction on subsets means restriction on properties. If we allow two properties and , then we should also be able to consider their logical combinations , , and – which in set terms correspond to union, intersection, and complement. Similarly, if we have a sequence of properties , then we should also allow properties and which correspond to countable union and intersection. Thus, the desired family should be closed under (countable) union, (countable) intersection, and complement. Such a family is called a -algebra.